subdir for temp files #959

Closed
koen84 opened this issue Aug 21, 2020 · 11 comments · Fixed by #965

@koen84 commented Aug 21, 2020

I'm running turbo-geth on a 1 TB drive; with 197 GiB in temp files it got filled to the brim and got stuck. It would be great if temp-file disk usage took the available space into account.

Regardless, it would be better if temp files got their own subdir (within the data folder), so it's easier to mount on a different disk, considering their large size.
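
For illustration, a minimal Go sketch of the layout being asked for here: a dedicated temp subdir inside the datadir, so that the subdir alone can be mounted on another disk. The `etl-temp` name and both helper functions are hypothetical, not turbo-geth's actual code.

```go
package etl

import (
	"os"
	"path/filepath"
)

// TempDir creates (if needed) and returns a dedicated subdirectory of the
// datadir for temp files, so it can be a separate mount point.
func TempDir(datadir string) (string, error) {
	dir := filepath.Join(datadir, "etl-temp") // hypothetical subdir name
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return "", err
	}
	return dir, nil
}

// NewTempFile creates a sort-buffer file inside that subdir instead of
// directly next to the database file.
func NewTempFile(datadir string) (*os.File, error) {
	dir, err := TempDir(datadir)
	if err != nil {
		return nil, err
	}
	return os.CreateTemp(dir, "tg-sync-sortable-buf-*")
}
```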

@mandrigin (Collaborator)

That's a bit weird; I'm running a node on a 1 TB drive too and have enough space left (about 200 GB).

But I agree with your point about temp files.

@koen84 (Author) commented Aug 22, 2020

My main server has 656 GB of data + 197 GB of temp files (+ OS / swap) = a full drive.
My secondary only has 13 GB of temp files.

Both managed to reach a full sync and maintain it, though for some reason the production server ran its disk full. (The other one is less powerful yet has more disk space.)

BTW, is there any reason the data is one giant file rather than the typical arsenal of smaller files? This makes it incompatible with COW filesystems (BTRFS, ZFS, etc.).

@AskAlexSharov (Collaborator) commented Aug 22, 2020

The reason for the gigantic DB file is transactions. But I believe we can split it into at least two files: the blockchain itself (blocks, ~120 GB) and everything else. That certainly won't happen in the near future; we need to be super careful with such a change.

@AskAlexSharov (Collaborator)

About BTRFS: it's an interesting but hard question. LMDB is a B+ tree and BTRFS is a B-tree, and a tree on top of a tree must be either redundant or weird. But I don't believe that BTRFS and ZFS can't handle 1 TB files. Or do you mean their cool features stop being cool?

@AskAlexSharov (Collaborator)

Temporary files are 128 MB each.

@mandrigin (Collaborator)

Temp files aren't persisted, though. They are created during some stages of sync (their size depends on how much data you are syncing) and then they should be removed.
So if you sync from genesis to the current HEAD block, you might end up with a couple hundred GB of temp files during some phases, when we generate indexes; that's mostly because we generate indexes for the whole chain at that point.
After you catch up and the sync goes from, say, block 10,001,000 to 10,002,000, we only need to generate indexes for 1,000 blocks, so obviously the temp files will be much smaller.

Temp files are used in a couple of stages and during DB migrations, not only for indexes, but the idea is the same.
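
For context, these buffer files come from a standard external-sort pattern: entries accumulate in memory, and once the buffer hits a size limit it is sorted and flushed to a temp-file chunk; the sorted chunks are merge-read later. A minimal sketch of that general pattern follows; the types, names, and threshold are illustrative assumptions, not turbo-geth's actual ETL code.

```go
package etl

import (
	"bufio"
	"os"
	"sort"
)

// FlushThreshold is an illustrative in-memory limit; once the accumulated
// entries exceed it, the caller flushes the buffer via FlushSortedChunk.
const FlushThreshold = 128 << 20 // ~128 MiB

// FlushSortedChunk sorts the in-memory buffer and writes it out as one
// temp-file chunk. The caller merge-reads all chunks afterwards and is
// responsible for deleting them.
func FlushSortedChunk(dir string, buf []string) (string, error) {
	sort.Strings(buf)
	f, err := os.CreateTemp(dir, "sortable-buf-*")
	if err != nil {
		return "", err
	}
	defer f.Close()
	w := bufio.NewWriter(f)
	for _, entry := range buf {
		if _, err := w.WriteString(entry + "\n"); err != nil {
			return "", err
		}
	}
	if err := w.Flush(); err != nil {
		return "", err
	}
	return f.Name(), nil // chunk file to merge (and delete) later
}
```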

@mandrigin (Collaborator)

But sure, it definitely makes sense to put the temp files in a subdir; I'll take a look at that.

@koen84 (Author) commented Aug 23, 2020

@AskAlexSharov 1 TB files on a COW filesystem like BTRFS would surely be brutal for performance, seeing they'd get many small writes? My main reason for wanting to explore this avenue is snapshotting, and send/receive thereof, so that in case of issues I can easily revert.

@mandrigin the amount of temp files is currently still growing. I had successfully completed a full sync before, so it has only been catching up since. (Until it got stuck on the full disk, which I've since managed to resume from.)

My chaindata is 657 GiB.
The tg-sync-sortable-buf<9 digits> files amount to 272 GiB, each 283-285 MiB in size. And while I see the log mention that some are being removed, the total is still growing; it seems only newly created files get cleaned up, while the old ones all remain. Did it leave behind files it forgot about? What's the effect of stopping turbo-geth, removing all these files, and starting it again?
I've filed this last part as #969, since it might be an issue unrelated to the subdir question.

@AlexeyAkhunov (Contributor)

Thanks for your report! The old temp files are not cleaned up automatically, so at the moment they need to be removed manually. If you stop turbo-geth and remove the files, it will not have any adverse effect.
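
A hedged sketch of what an automated version of that manual cleanup could look like: on startup, remove leftover buffer files by the name prefix reported above. The function is hypothetical and assumes the node is not running against the same directory.

```go
package etl

import (
	"os"
	"path/filepath"
	"strings"
)

// CleanupStaleBuffers removes sort-buffer files left over from a previous
// run. Per the comment above, deleting them while turbo-geth is stopped
// has no adverse effect.
func CleanupStaleBuffers(dir string) error {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return err
	}
	for _, e := range entries {
		if e.IsDir() || !strings.HasPrefix(e.Name(), "tg-sync-sortable-buf") {
			continue
		}
		if err := os.Remove(filepath.Join(dir, e.Name())); err != nil {
			return err
		}
	}
	return nil
}
```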

@AskAlexSharov (Collaborator)

I'm sure that 1 TB of ordinary files and 1 TB of mmap'ed files are not the same kind of terabyte, because only the OS reads and writes the latter, not the application, and I'm sure the OS integrates with BTRFS for this case. But yes, we definitely need to verify that snapshotting works well.

@AskAlexSharov (Collaborator)

What I really mean: for incremental snapshotting, BTRFS doesn't just need small files but knowledge of what changed, and the OS knows which pages of an mmap'ed file are new and when they were updated.
