How to check if block offsets have been calculated? #10

Closed
hyanwong opened this issue Oct 8, 2020 · 3 comments

@hyanwong

hyanwong commented Oct 8, 2020

Sorry to keep on opening issues! Is there any way to check whether an IndexedBzip2File has had block offsets calculated yet or not? If it takes 8 hours to calculate them for my huge file, it's worth checking before running e.g. a seek(), and outputting a warning that this might take some time.

@mxmlnkn
Owner

mxmlnkn commented Oct 9, 2020

:) I'm glad someone uses it and cares enough to give feedback. This project initially started only as a way to keep the C++ bz2 code base required by ratarmount separate from the Python-only tar mount script. That's why the indexed_bzip2 interface is still fairly limited.

I see no harm in adding such a method. Will do.

@hyanwong
Author

hyanwong commented Oct 9, 2020

Thanks. Incidentally, your code has sped up my workflow so that running my script (to parse Wikidata/Wikipedia files for https://www.onezoom.org) now takes under an hour with 64 processes, whereas it used to take 2 days. That's a huge difference, and makes my life so much easier. So I'm very grateful!

@mxmlnkn
Owner

mxmlnkn commented Oct 10, 2020

I added a block_offsets_completed method and also an available_block_offsets method, which returns only the offsets that have already been calculated up to that point.
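
For readers who want the warning described in the original question, here is a minimal sketch of how such a check might be wrapped. It assumes the reader object exposes a no-argument method returning a bool, named as in the comment above (`block_offsets_completed`); the released library may spell it slightly differently (for example `block_offsets_complete`), so the helper tries both. `offsets_ready` and `FakeReader` are hypothetical names used only for illustration, and `FakeReader` merely stands in for an `IndexedBzip2File` so the snippet runs without a real archive.

```python
import warnings

# Assumption: the completeness check is a no-argument method returning a bool.
# Both spellings are tried because the exact name may differ between versions.
CANDIDATE_NAMES = ("block_offsets_completed", "block_offsets_complete")


def offsets_ready(reader):
    """Return True if the reader reports that all block offsets are known.

    Emits a warning and returns False when the offsets are incomplete, or when
    the reader offers no way to find out.
    """
    for name in CANDIDATE_NAMES:
        check = getattr(reader, name, None)
        if callable(check):
            if check():
                return True
            warnings.warn(
                "Block offsets are not fully calculated yet; "
                "seeking may take a long time.",
                stacklevel=2,
            )
            return False
    warnings.warn(
        "Cannot tell whether block offsets have been calculated.",
        stacklevel=2,
    )
    return False


class FakeReader:
    """Hypothetical stand-in for an IndexedBzip2File, for demonstration only."""

    def __init__(self, complete):
        self._complete = complete

    def block_offsets_completed(self):
        return self._complete


if __name__ == "__main__":
    reader = FakeReader(complete=False)
    if not offsets_ready(reader):
        print("Expect a delay before the first seek() returns.")
```

With a real file, the same helper would be called on the opened reader just before the first `seek()`.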

mxmlnkn closed this as completed Oct 10, 2020