Data leak with goleveldb backend? #374

@mmindenhall

The application I'm working on runs on IoT gateways with limited resources (~100MB storage free). Therefore, I submitted #373 to be able to proactively reclaim disk space after deleting "expired" documents from indexes.

I wrote a test where I do the following:

  1. Create and initialize a new index, take snapshot of size of index folder
  2. Do the following 5 times:
     1. Add 1000 documents to the index
     2. Take snapshot of size of index folder
     3. Delete all documents from the index
     4. Call the new Compact method I added in #373 (Add compact method to goleveldb store)
     5. Take snapshot of size of index folder

Here are the sizes of the index folder:

Size at start: 52K

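| Iteration | After adding 1K docs | After delete / compact |
|-----------|----------------------|------------------------|
| 1         | 6.7M                 | 876K                   |
| 2         | 7.4M                 | 1.6M                   |
| 3         | 8.1M                 | 2.2M                   |
| 4         | 8.7M                 | 2.9M                   |
| 5         | 9.4M                 | 3.5M                   |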
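
To make the trend in the "After delete / compact" column easier to see, here is a small sketch that computes the growth between consecutive iterations. The helper name `deltas` is hypothetical, and the inputs are the rounded figures from the table, converted to MB on the assumption that 1M = 1024K, so the results are approximate.

```go
package main

import "fmt"

// deltas returns the differences between consecutive samples.
// It returns nil when there are fewer than two samples.
func deltas(samples []float64) []float64 {
	if len(samples) < 2 {
		return nil
	}
	out := make([]float64, 0, len(samples)-1)
	for i := 1; i < len(samples); i++ {
		out = append(out, samples[i]-samples[i-1])
	}
	return out
}

func main() {
	// Sizes after delete/compact, in MB, from the table above.
	// 876K is converted assuming 1M = 1024K; all values are rounded.
	afterCompact := []float64{876.0 / 1024.0, 1.6, 2.2, 2.9, 3.5}

	for i, d := range deltas(afterCompact) {
		fmt.Printf("iteration %d -> %d: +%.2fM\n", i+1, i+2, d)
	}
}
```

If deleting and compacting returned the index to its starting state, these differences would be close to zero.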

Just to be sure the Compact method was actually doing something, I commented out that line and ran again (create 1000, delete 1000, no compact). Without the call to Compact, deleting the 1000 documents actually increases the size of the index:

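| Iteration | After adding 1K docs | After delete only |
|-----------|----------------------|-------------------|
| 1         | 6.6M                 | 8.5M              |
| 2         | 7.4M                 | 9.3M              |
| 3         | 8.0M                 | 9.9M              |
| 4         | 8.6M                 | 11M               |
| 5         | 9.4M                 | 11M               |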

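Is this a bug? Even with the call to Compact, the index folder ends each iteration roughly 0.6M to 0.7M larger than the previous one, although every document has been deleted. I'm wondering whether some set of keys that gets created when a document is indexed is not getting deleted when the document is removed.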
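
One way to check this would be to dump the keys left in the store after deleting everything and group them by kind. The sketch below assumes the index encodes a row type in the first byte of each key. That is an assumption about the storage layout, and the keys in `main` are made up for illustration. The function `countByPrefix` is a hypothetical helper, and reading the real keys out of the store is not shown.

```go
package main

import (
	"fmt"
	"sort"
)

// countByPrefix tallies keys by their first byte. Empty keys are skipped.
func countByPrefix(keys [][]byte) map[byte]int {
	counts := make(map[byte]int)
	for _, k := range keys {
		if len(k) == 0 {
			continue
		}
		counts[k[0]]++
	}
	return counts
}

func main() {
	// Made-up keys standing in for whatever remains in the store
	// after all documents have been deleted.
	leftover := [][]byte{
		[]byte("d\x00\x00alpha"),
		[]byte("d\x00\x00beta"),
		[]byte("f\x00\x00"),
	}

	counts := countByPrefix(leftover)

	// Sort the prefixes so the output order is stable.
	prefixes := make([]int, 0, len(counts))
	for p := range counts {
		prefixes = append(prefixes, int(p))
	}
	sort.Ints(prefixes)

	for _, p := range prefixes {
		fmt.Printf("prefix %q: %d keys\n", byte(p), counts[byte(p)])
	}
}
```

Comparing the counts after iteration 1 and iteration 5 should show which kind of key, if any, keeps accumulating.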
