Parser Update #55
adding issues link
# get the function out of the lookup_table that matches 'tag'
#func = new_book.lookup_table[tag]
# call the function on the child element
#func(child)
delete my example code that you commented out
statics/fixes is_bag strips comments, etc
forgot to set pgcat data (type elements)
This fixes the bag issue in two ways. First, I was passing the 'root' element instead of the iterated item. Second, I added a static function, leaf_element, which returns the outermost 'first' element, so be careful if your branch splits along the way and that matters to you.
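A minimal sketch of what a leaf_element helper like the one described above might look like, assuming the parser works on `xml.etree.ElementTree` elements (the PR does not say which XML library it uses). It is written as a plain function here rather than a static method, and the sample XML is made up for illustration.

```python
import xml.etree.ElementTree as ET


def leaf_element(element):
    """Follow the first child at each level until a childless element is reached.

    Hypothetical reconstruction: only the first branch is ever followed,
    so any sibling branches are ignored.
    """
    while len(element) > 0:  # len() of an Element is its number of children
        element = element[0]
    return element


# Made-up sample: the tree splits at <Description>, and only the first branch is taken.
sample = (
    "<subject><Description>"
    "<value>Fiction</value><memberOf>LCSH</memberOf>"
    "</Description></subject>"
)
root = ET.fromstring(sample)
leaf = leaf_element(root)
print(leaf.tag, leaf.text)
```

Note that `memberOf` is never visited in this sample, which is the caveat the commit message warns about.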
oops, wrong button. traversal issue also fixed.
This implements __setitem__ and __getitem__ for an Ebook item, while also dumping all the set_ methods. It looks a little kludgier, but it makes it much more flexible
Ok, so this parser is |
headed to different workstation
This should now be able to drop-in replace RDFparse.py. It generates the pickle differently, so here be dragons.
Updated it to fit into GITenberg.py so it can just drop right in. In theory.
`__getitem__` now returns None for items the object doesn't have
This does not yet set mdate/filename for a book, but it comes close.
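A rough sketch of the item-style interface these commits describe, assuming the Ebook fields are kept in a plain dictionary. The class body, the `_fields` attribute, and the sample keys are hypothetical and are not taken from the PR's code.

```python
class Ebook:
    """Hypothetical stand-in for the parser's Ebook item."""

    def __init__(self):
        # Assumed storage: one dict instead of a separate set_ method per field.
        self._fields = {}

    def __setitem__(self, key, value):
        # book["title"] = "..." replaces a dedicated set_title() style method.
        self._fields[key] = value

    def __getitem__(self, key):
        # Fields the book doesn't have yield None rather than raising KeyError.
        return self._fields.get(key)


book = Ebook()
book["title"] = "Treasure Island"
print(book["title"])
print(book["publisher"])
```

Returning None for missing fields means callers can test `if book["publisher"]:` without wrapping every lookup in a try/except, at the cost of not distinguishing "absent" from "explicitly set to None".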
Added in book culling, so this should now operate as a true drop-in replacement of rdfparse.py
Unsure how necessary this is now, but it's in the original, and is EASILY removed.
Add navigational support to help orient newcomers
Fix link to web site in contributing template
Add more links
Closing due to obsolescence.
This is a near-complete rewrite of the parser that adds proper multi-element support while also being substantially easier to work with and more Pythonic.
It does have run-time overhead: parse(catalog) consumes ~1 GB of RAM on a 250 MB RDF file, but a reasonable computer should process this quickly and release the memory quickly.
This should hopefully resolve Issue #20 while it's at it.