Deleting a snapshot does not remove files #5281

Closed
agember opened this issue Dec 19, 2019 · 12 comments

Comments

@agember
Contributor

agember commented Dec 19, 2019

When removing a snapshot (using bf_delete_snapshot with pybatfish), the files associated with the snapshot remain in the containers directory. Reclaiming the disk space by removing the files would be valuable when lots of snapshots are created (e.g., to iterate over failure scenarios) and/or the snapshots are large.

@AlexLardschneider

Same issue here. Currently sitting at around 1.5 GB storage used, even though I used bf_delete_snapshot().

@ratulm
Member

ratulm commented Oct 24, 2020

@arifogel - is this something that has become easy/safe to do based on your recent changes, or is there a workaround other than deleting the containers folder entirely?

@arifogel
Member

Currently, Batfish will garbage collect stuff that is at least 10 minutes older than the oldest remaining snapshot.
GC is triggered on snapshot deletion, network deletion, and snapshot upload.
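
To make the rule above concrete, here is a minimal sketch of the eligibility check as described in this comment. It is an illustrative model only, not Batfish's actual code: the function name `collectable`, the constant `GC_MARGIN`, and the choice to collect nothing when no snapshot remains are all assumptions.

```python
from datetime import datetime, timedelta

# Assumption: the 10-minute margin is taken from the comment above. All names
# here are hypothetical; this is not Batfish's actual implementation.
GC_MARGIN = timedelta(minutes=10)


def collectable(artifact_times, live_snapshot_times, margin=GC_MARGIN):
    """Return the artifact timestamps that the described rule would collect."""
    if not live_snapshot_times:
        # The thread does not say what happens when no snapshot remains,
        # so this sketch conservatively collects nothing.
        return []
    cutoff = min(live_snapshot_times) - margin
    return [t for t in artifact_times if t <= cutoff]


now = datetime(2020, 10, 24, 12, 0)
oldest_live = now - timedelta(minutes=5)
artifacts = [
    now - timedelta(minutes=30),
    now - timedelta(minutes=12),
    now - timedelta(minutes=1),
]
print(collectable(artifacts, [oldest_live, now]))
```

Under this model only the 30-minute-old artifact is eligible: the 12-minute-old one is less than 10 minutes older than the oldest remaining snapshot, so it stays until a later trigger.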

@arifogel
Member

I should point out that this is not a documented feature, and we make no guarantees about this behavior.

@AlexLardschneider

So what would be the proper way to delete old networks/snapshots?

Even though I am using bf_init_snapshot(overwrite=True) I still end up with dozens of files in the container folder (especially inside the networks folder).

Using bf_delete_snapshot() does not seem to make any difference.

@arifogel
Member

@AlexLardschneider are you seeing behavior contrary to what I posted earlier?

> Currently, Batfish will garbage collect stuff that is at least 10 minutes older than the oldest remaining snapshot.
> GC is triggered on snapshot deletion, network deletion, and snapshot upload.

If so, what files are remaining? Can you please provide instructions to reproduce?

@AlexLardschneider

According to the documentation, bf_init_snapshot(overwrite=True) overwrites any snapshot with the same name, so there should only ever be one active snapshot. That snapshot would then also be the oldest remaining snapshot, since it is the only one.

After running batfish a couple of times (and waiting 10 minutes), I am left with the following files (output of tree -L 2):

data/containers
├── answers
│   ├── 02c776641e09ba86bdbcfd4ecafd4a61
│   ├── 0c8baa2a7ac44cdde5af480aad2fedf8
│   ├── 101ff9d5b4b0f28e900889301855ce71
│   ├── 10c7220e256fcef5488dda0e581ddf90
│   ├── 13e9ff711170e63d7f4740d73d52bd1d
│   ├── 181d093b55c409ab6834f68f99a0b274
│   ├── 27d998781492095fe38346a6c4239734
│   ├── 359cf2fdbce53797bc9ff644f84d1aa6
│   ├── 369c515ea204cbbc04ad77be29c7c97f
│   ├── 3b3bede7cf4d21eba317b0e4d8bde060
│   ├── 3ed3e408994e81dc66d1507f6675b21c
│   ├── 4445a46c6ff038bfb57d98c17ed885ce
│   ├── 4745f350d0b4aaad9b4e9078647f2d26
│   ├── 47b7ee520920d69bca1a007596110729
│   ├── 47c04830092e5fb31d20e96289106156
│   ├── 52a6c7323be194ddabb6114e55d81baa
│   ├── 5815c57e0180c8db736ec3723447cc0c
│   ├── 5f22a7beb2a81801ebd3a06ec00d71c3
│   ├── 63eded85f0579b32dac25410c9800029
│   ├── 64a42c988df64276546ebb89dc9de211
│   ├── 66ea880930f88580332a663c4532c8c0
│   ├── 68abd1d039010628c5320ccb4a01195f
│   ├── 6f6cb2066246a383aa68a42ba58873cf
│   ├── 79b67560152595ab1baafb3c8838008f
│   ├── 7d4a7271c06b55fae80ab47bcb36770d
│   ├── 81fcf7232ac6e728f5a7a045c629e664
│   ├── 842d6912274fcfbdab70df8f35504295
│   ├── 84880acbe23fd9f57b51d2ddace62233
│   ├── 8786d46b14d09ab41af1394b7a6a34ec
│   ├── 8a53d496200b0de5ed8e44c52494ee0c
│   ├── 8aaab53972932e13797f6bdb6b53c949
│   ├── 91e75bd50836e5cd00f1ce0152174f82
│   ├── 959cd23214bf64b35bb68383d0b2049e
│   ├── 9abbe3cd07fa1129c53f3402809e6975
│   ├── 9ad7d5afb47e2f0a135bd125ef84c83f
│   ├── a52fc52d77576d48f7445a9ae9724b9a
│   ├── aaec8076f495c9bbd20ee3f8c168d6c4
│   ├── aca3c8cdf565534d2eaae99d77ed17d8
│   ├── af5f74f879eb46840911b8eca89fcc54
│   ├── b7131e406bcc6a6834b40440b90cdb6d
│   ├── b80063fd63e194737e2704d045ca6db9
│   ├── b8539d84799ef17886d8a6fcb44fc1ee
│   ├── c7ac60b5cc9ba7b85f52cfcb86c85c0b
│   ├── c822e6376951280cd64e8eb1d5a8371e
│   ├── f3ec51202717aed3890a56bcdcf2b655
│   ├── f4d8073a58ac155f987750cd6704dd25
│   ├── f69a8f868ebf6138494f5a270e98531f
│   ├── f6eb0ba497615ada9ed2843d9e0bca35
│   └── fe02eff6f30aa4d420b53cebd8abda3d
├── ids
│   └── b3JnLmJhdGZpc2guaWRlbnRpZmllcnMuTmV0d29ya0lk
├── networks
│   ├── 0853d50d-fb19-4cde-9e91-bb943ddb2b44
│   ├── 12147342-8fc6-4ee8-b8e2-5cbcf38f035b
│   ├── 12ca1394-5d02-40af-9cff-f62cbbb06865
│   ├── 33312aac-3f26-4cb5-b7e7-845bcd4bcfe1
│   ├── 600881d9-46ca-4126-8d3a-4f8a985fe4cc
│   ├── 637d8c35-8872-4cda-a770-f8ceac9d0107
│   ├── 779f5f38-704e-4473-b6b7-188d04ad658c
│   ├── b5d7b5e9-d972-4f49-b1c2-be9962664dea
│   ├── cb433b31-48a8-4d39-b0d7-37cb952e8e4b
│   ├── ceffe445-b1c4-40b4-88bd-49c3aa6346ad
│   ├── db462550-e9d4-4e96-a89c-e90917a1a6cf
│   ├── eb20c9dc-a8b9-454e-a566-19928dd3eaf6
│   └── ee8936c6-cf9f-4a91-9330-414146269c28
└── node_roles
    ├── 0df3dc4e3326999af1c899e8037494b2.json
    ├── 40049c19c1bd0f82580ba5168b0d2414.json
    ├── 4512ca3c13cc57ad8f91f18da31e17a7.json
    ├── 55cbe86d64d43280d05a23e84a6ba6d3.json
    ├── 653ae0c898d9b7c80670d149e25ade5c.json
    ├── 6dec06424440c3bff2174b9bb2977a5d.json
    ├── 9b05f69caefa9ddfd86a876586e24240.json
    ├── 9e945c7ed17f05da38e3931652785b84.json
    ├── bf58c5f7ded13d46779fd10242b8534e.json
    ├── e08ba64ca471599490ded2f14c6b6721.json
    └── f5369e07abc01024af872421a440c10c.json

At the moment of writing this, the containers folder is taking up around 1.2 GB.
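
For anyone wanting to reproduce the measurement, a small sketch like the following reports how much space each top-level folder (answers, networks, and so on) occupies. The helper names `dir_size_bytes` and `usage_by_subdir` are made up for illustration, and the `data/containers` path is simply the one from the tree output above.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def usage_by_subdir(containers_dir):
    """Map each top-level entry (answers, networks, ...) to its size in bytes."""
    usage = {}
    for entry in sorted(os.listdir(containers_dir)):
        full = os.path.join(containers_dir, entry)
        if os.path.isdir(full):
            usage[entry] = dir_size_bytes(full)
        else:
            usage[entry] = os.path.getsize(full)
    return usage


# "data/containers" is the path from the tree output above; adjust as needed.
if os.path.isdir("data/containers"):
    for name, size in usage_by_subdir("data/containers").items():
        print(f"{name}: {size / 1e6:.1f} MB")
```

Running it before and after a delete makes it easier to see which folder, if any, actually shrinks.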

Steps to reproduce:

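from pybatfish.client.commands import bf_init_snapshot, bf_delete_snapshot
from pybatfish.question.question import load_questions

bf_init_snapshot('snapshot_dir', name='NAME', overwrite=True)
load_questions()
bf_delete_snapshot(name='NAME')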

@arifogel
Member

> After running batfish a couple of times (and waiting 10 minutes)

Sorry, are you running, then waiting ten minutes, then checking?
Or are you running, waiting at least 10 minutes, then overwriting?

What happens if you don't use overwrite, but init a second snapshot at least 10 minutes after the first one finishes, and then delete the old one?

@AlexLardschneider

> Sorry, are you running, then waiting ten minutes, then checking?
> Or are you running, waiting at least 10 minutes, then overwriting?

Tried both, without success.

> What happens if you don't use overwrite, but init a second snapshot at least 10 minutes after the first one finishes, and then delete the old one?

Could you please clarify what you mean? Init a new snapshot with another name, or with the same one?
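
The thread does not answer this question. One possible reading of the suggested experiment, assuming the second snapshot gets a different name, can be sketched as follows. The function `run_gc_experiment` and the snapshot names `gc_test_old` and `gc_test_new` are hypothetical; the init and delete functions are passed in so the sequence can be checked without a running Batfish service.

```python
import time


def run_gc_experiment(init_snapshot, delete_snapshot, snapshot_dir,
                      wait_seconds=10 * 60, sleep=time.sleep):
    """Init one snapshot, wait, init a second under a new name, delete the first."""
    init_snapshot(snapshot_dir, name="gc_test_old")
    # Let the first snapshot age past the 10-minute margin mentioned earlier.
    sleep(wait_seconds)
    # Assumption: a *different* name, so nothing is overwritten.
    init_snapshot(snapshot_dir, name="gc_test_new")
    # Snapshot deletion is one of the GC triggers listed in this thread.
    delete_snapshot("gc_test_old")


def main():
    # Uses the legacy pybatfish functions named in this thread and needs a
    # running Batfish service; call main() manually to try it.
    from pybatfish.client.commands import bf_init_snapshot, bf_delete_snapshot
    run_gc_experiment(bf_init_snapshot, bf_delete_snapshot, "snapshot_dir")
```

Checking the size of the containers folder before and after `main()` would show whether files tied to the old snapshot were reclaimed.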

@dhalperi
Member

The current model is that an actively-used Batfish installation will eventually clean up deleted data. But the garbage is collected lazily and may require new operations (initializing a new network, initializing a new snapshot, etc). If the service is used only periodically, there are no guarantees about latency on collection.

The code was introduced by @arifogel in #6081 and some follow-up PRs.

PRs welcome to improve behavior!

@dhalperi
Member

dhalperi commented Nov 4, 2020

Update: also, only data older than the oldest snapshot that exists in any network will be deleted.

PRs welcome to improve behavior!
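
The cross-network condition in this update is easy to overlook, so here is a toy model of it. This is only a sketch of the rule as stated, not Batfish's code: `gc_cutoff`, the network names, and the timestamps are invented for illustration, and the 10-minute margin is left out for brevity.

```python
from datetime import datetime


def gc_cutoff(snapshots_by_network):
    """Return the creation time of the oldest snapshot in any network, or None."""
    times = [
        created
        for snapshots in snapshots_by_network.values()
        for created in snapshots.values()
    ]
    return min(times) if times else None


# Hypothetical state: one active network plus a forgotten one with an old snapshot.
with_old = {
    "lab": {"NAME": datetime(2020, 11, 4, 9, 0)},
    "forgotten": {"old": datetime(2020, 6, 1, 8, 0)},
}
without_old = {"lab": {"NAME": datetime(2020, 11, 4, 9, 0)}}

garbage_time = datetime(2020, 11, 3, 17, 0)  # data left by a deleted snapshot

for label, state in (("with forgotten network", with_old),
                     ("after deleting it", without_old)):
    print(label, garbage_time < gc_cutoff(state))
```

In this model the leftover data only becomes collectable once the old snapshot in the unrelated network is gone, which is one way leftover files could persist even when a single snapshot is repeatedly overwritten.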

@ratulm
Member

ratulm commented Aug 25, 2021

#7237 and #7263 should fix this issue. Answers/snapshot/network data will now be deleted independent of the age of other snapshots.

@ratulm ratulm closed this as completed Aug 25, 2021

5 participants