Explore rewriting parser with libxml #70

Closed
odrobnik opened this Issue Sep 9, 2011 · 4 comments

Collaborator

odrobnik commented Sep 9, 2011

Somebody should explore whether parsing performance could be improved by using libxml to parse the document into an XML tree first and then recursively walking the tree to build the attributed string.

An example of a wrapper around libxml can be found here: https://github.com/zootreeves/Objective-C-HMTL-Parser
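To make the two-step idea concrete, here is a minimal sketch of "build a tree first, then walk it recursively". It is only an illustration: it uses Python's standard-library `html.parser` as a stand-in for libxml, and the names `Node`, `TreeBuilder`, `TRAITS`, `build_runs`, and `parse_to_runs` are hypothetical, not part of DTCoreText or libxml. The "attributed string" is approximated as a list of `(text, traits)` runs.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class Node:
    """One element in the tree; children are Node objects or plain strings."""

    def __init__(self, tag, attrs=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.children = []


class TreeBuilder(HTMLParser):
    """Step 1: turn the parser's flat event stream into a tree."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray closers.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1].children.append(data)


# Hypothetical mapping from tag name to a style trait.
TRAITS = {"b": "bold", "strong": "bold", "i": "italic", "em": "italic"}


def build_runs(node, active=frozenset(), runs=None):
    """Step 2: walk the tree recursively, emitting (text, traits) runs."""
    if runs is None:
        runs = []
    for child in node.children:
        if isinstance(child, str):
            runs.append((child, active))
        else:
            trait = TRAITS.get(child.tag)
            inherited = active | {trait} if trait else active
            build_runs(child, inherited, runs)
    return runs


def parse_to_runs(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return build_runs(builder.root)
```

Calling `parse_to_runs("Hello <b>bold <i>both</i></b> end")` yields four runs, with the nested text carrying both the bold and italic traits. Whether this structure is actually faster than a single-pass scanner is exactly the open question of this issue; the sketch only shows the shape of the approach.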

Contributor

dhoerl commented Sep 14, 2011

Another option is htmlcxx, a C++ HTML parser that I now use on the iPhone in concert with your project: https://github.com/dhoerl/htmlcxx. It builds lightweight recursive objects that generally point back into the primary document, and it seems quite fast.
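One way to read "lightweight objects that point back to the primary document" is that each node stores only offsets into the original source instead of copying text. The sketch below illustrates that idea under loud assumptions: it is not htmlcxx's API or implementation, it assumes well-formed input, and `Span`, `TAG_RE`, and `scan` are hypothetical names.

```python
import re

# Matches an opening or closing tag; attributes are skipped, not parsed.
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>")


class Span:
    """A node that stores only offsets into the shared source string."""

    __slots__ = ("tag", "start", "end", "children")

    def __init__(self, tag, start):
        self.tag = tag
        self.start = start  # offset just after the opening tag
        self.end = start    # offset of the closing tag, set when it is seen
        self.children = []

    def text(self, source):
        # Slice lazily, only when the content is actually needed.
        return source[self.start:self.end]


def scan(source):
    """Build a tree of Span nodes; assumes tags are properly nested."""
    root = Span("root", 0)
    root.end = len(source)
    stack = [root]
    for m in TAG_RE.finditer(source):
        closing, tag = m.group(1), m.group(2).lower()
        if not closing:
            node = Span(tag, m.end())
            stack[-1].children.append(node)
            stack.append(node)
        elif len(stack) > 1 and stack[-1].tag == tag:
            stack.pop().end = m.start()
    return root
```

For `source = "<p>Hi <b>there</b>!</p>"`, the `b` node holds just the offsets of `there` and slices the source on demand via `text(source)`. Keeping nodes this small avoids per-node string copies, which is one plausible reason such a parser would feel fast.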

Collaborator

odrobnik commented Jan 21, 2012

Click here to lend your support to "Migrate DTCoreText to libxml2" and make a donation at www.pledgie.com!

Collaborator

odrobnik commented Jan 23, 2012

libxml2 implementation now in libxml2 branch. Looking very good so far. Needs tuning and cleanup, but it's a great start.

odrobnik closed this Jan 23, 2012

Contributor

dhoerl commented Jan 23, 2012

Sounds great!
