include offset in output of "warcio index ..." #4
I was thinking of adding it as a special case field named
Yes, you're right. I did actually refactor it at one point, but it was ugly, and I realized it was unnecessary for the main semantics, which is
Not sure what's happening with the tests here?
For reference, here's what it looked like as an Iterator, implementing next directly:
Well, I guess it wasn't so bad, but it was harder to follow :)
This reverts commit ab31e07.
Better now? :) I noticed the iterator thing while writing code to extract a single record by offset, which involved using next() directly. That's cool that you also tried implementing it the other way. What about just renaming the class to ArchiveIterable? Anyhow, it doesn't matter too much; I'll leave it to your discretion.
I realize that the other fields are configurable and that offset is different because it's not a warc record header. I think it's reasonable to always include the offset in the output though.
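To make the proposal concrete, here is a rough sketch of an indexer that always emits the offset ahead of the configurable fields. It does not use warcio: the record format (one JSON object of headers per line), the function name `index_lines`, and the output key `"offset"` are all assumptions made up for this illustration.

```python
import io
import json


def index_lines(stream, fields):
    """Yield one JSON index line per record of a toy format.

    Toy format (an assumption, not WARC): each record is a single line
    holding a JSON object of headers. The byte offset of the record is
    always emitted; the header fields in `fields` are emitted only when
    the record has them.
    """
    while True:
        offset = stream.tell()      # position where this record starts
        line = stream.readline()
        if not line:                # end of stream
            break
        headers = json.loads(line)
        entry = {"offset": offset}  # offset is not a header, so it is special-cased
        for name in fields:
            if name in headers:
                entry[name] = headers[name]
        yield json.dumps(entry)


if __name__ == "__main__":
    data = (b'{"type": "warcinfo"}\n'
            b'{"type": "response", "uri": "http://example.com/"}\n')
    for index_line in index_lines(io.BytesIO(data), ["type", "uri"]):
        print(index_line)
```

The point of the sketch is only that the offset comes from the reader's position rather than from a record header, which is why it has to be handled separately from the configurable fields.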
By the way, strictly speaking, ArchiveIterator isn't actually an iterator: you have to call iter(ArchiveIterator(...)) to get the iterator. It looks like it would be a pain to refactor, though.
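For anyone following along, the distinction is the standard Python one between an iterable (defines `__iter__`) and an iterator (also defines `__next__`). A small sketch with two hypothetical stand-in classes, `RecordIterable` and `RecordIterator`, neither of which is warcio code:

```python
from collections.abc import Iterable, Iterator


class RecordIterable:
    """Hypothetical stand-in: defines only __iter__, so it is iterable
    but is not itself an iterator."""

    def __init__(self, records):
        self._records = records

    def __iter__(self):
        # Generator method: every call returns a fresh iterator.
        for record in self._records:
            yield record


class RecordIterator:
    """Hypothetical variant that implements __next__ directly."""

    def __init__(self, records):
        self._records = list(records)
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._records):
            raise StopIteration
        record = self._records[self._pos]
        self._pos += 1
        return record


def first_record(source):
    """Return the first record; works for both kinds of object."""
    return next(iter(source))


if __name__ == "__main__":
    iterable = RecordIterable(["rec-a", "rec-b"])
    print(isinstance(iterable, Iterator))            # False
    print(first_record(iterable))                    # rec-a
    print(next(RecordIterator(["rec-a", "rec-b"])))  # rec-a
```

Calling `next()` on the first class raises `TypeError`, which is the situation described above; wrapping it in `iter()` (as `first_record` does) works for either design, so callers that only use `for` loops never notice the difference.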