replace calls to find_all with custom traversal #64

Merged 1 commit on Mar 21, 2016


kylewm
Collaborator

@kylewm kylewm commented Mar 10, 2016

bs4's find and find_all seem to be unduly expensive operations.
Just replacing them with naive get_children and get_descendants
calls cuts the total parse time in half for lots of documents.
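
For readers unfamiliar with the idea, here is a minimal sketch of what a "naive" traversal in place of find_all might look like. The helper bodies below are assumptions written for illustration (only the names get_children and get_descendants come from the description above), and has_class_prefix and the sample HTML are hypothetical; the code actually merged in this PR may differ.

```python
from bs4 import BeautifulSoup, Tag


def get_children(node):
    """Yield the direct children of `node` that are elements (skips text and comments)."""
    for child in node.contents:
        if isinstance(child, Tag):
            yield child


def get_descendants(node):
    """Yield every element below `node`, depth-first, in document order."""
    for child in get_children(node):
        yield child
        yield from get_descendants(child)


def has_class_prefix(el, prefix):
    """Hypothetical predicate: does any class on `el` start with `prefix`?"""
    return any(c.startswith(prefix) for c in el.get("class", []))


html = """
<div class="h-entry">
  <p class="p-name">Hello</p>
  <div><a class="u-url" href="/post">link</a></div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# A plain loop with an explicit predicate, instead of soup.find_all(...).
roots = [el for el in get_descendants(soup) if has_class_prefix(el, "h-")]
props = [
    el
    for el in get_descendants(roots[0])
    if has_class_prefix(el, "p-") or has_class_prefix(el, "u-")
]
```

The point of the sketch is that the traversal does only an isinstance check per node and leaves all matching to the caller, whereas find_all accepts many kinds of filters and has to dispatch on them for every element.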

Parsing kylewm.com went from around 130ms to 66ms
(epeus.blogspot.com and chocolateandvodka.com -- backcompat
tests -- are about the same), and the very large page at
rhiaro.co.uk/travel went from 780ms to 420ms.
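
A rough way to take and compare timings like these is sketched below. It is an assumption about method, not a record of how the numbers above were measured; mean_ms and reduction are hypothetical helper names, and the callable passed to mean_ms would be whatever parse call is under test.

```python
import time


def mean_ms(fn, repeat=20):
    """Return the mean wall-clock time of fn() in milliseconds over `repeat` runs."""
    start = time.perf_counter()
    for _ in range(repeat):
        fn()
    return (time.perf_counter() - start) * 1000 / repeat


def reduction(before_ms, after_ms):
    """Fraction of the original time that was saved."""
    return (before_ms - after_ms) / before_ms


# Applied to the figures quoted above:
small_page = reduction(130, 66)   # about 0.49, i.e. roughly 49% less time
large_page = reduction(780, 420)  # about 0.46, i.e. roughly 46% less time
```

Both figures are consistent with the "cuts the total parse time in half" summary, with the large page slightly under half.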

Ref #63

@kylewm
Collaborator Author

kylewm commented Mar 12, 2016

This seems to have reduced average parse times on Bridgy from ~300ms to ~150ms.

@snarfed
Member

snarfed commented Mar 21, 2016

ping @tommorris? can't wait to use this (released properly) in bridgy. i'd also nominate/second adding @kylewm as a committer to the repo if you're taking suggestions. :P

@kylewm
Collaborator Author

kylewm commented Mar 21, 2016

@snarfed oh lol, I already am.

kylewm pushed a commit that referenced this pull request Mar 21, 2016
replace calls to find_all with custom traversal
@kylewm kylewm merged commit 921ef2c into microformats:master Mar 21, 2016
@kylewm
Collaborator Author

kylewm commented Mar 21, 2016

released version 1.0.4 https://pypi.python.org/pypi/mf2py/1.0.4

snarfed added a commit to snarfed/bridgy that referenced this pull request Mar 21, 2016