OOM error by feeding a huge xml to Nokogiri::XML::Reader in JRuby #524

Closed
The pure-Java Nokogiri::XML::Reader can't read a huge XML file
because it always loads the whole file into memory.

See (in Japanese): https://twitter.com/#!/eban/status/98629501499613184

To solve this problem,
I've changed Nokogiri::XML::Reader to use StAX.
It uses less memory and seems to perform better.
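
For readers unfamiliar with StAX, here is a minimal sketch of the pull-parsing style it offers. It is not the code from this patch; the class name `StaxPullSketch` and the method `countElements` are made up for illustration. It only shows that `javax.xml.stream.XMLStreamReader` hands back one event at a time, so the document never has to be held in memory as a whole.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import java.io.Reader;

// Hypothetical helper, not part of Nokogiri or of this patch.
final class StaxPullSketch {
    // Counts start tags whose local name matches, pulling one event at a time.
    static int countElements(Reader input, String localName) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(input);
            int count = 0;
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT
                            && localName.equals(reader.getLocalName())) {
                        count++;
                    }
                }
            } finally {
                reader.close(); // releases parser resources; does not close the Reader
            }
            return count;
        } catch (XMLStreamException e) {
            // Wrapped so callers do not need to handle a checked exception.
            throw new IllegalStateException(e);
        }
    }
}
```

Passing a `java.io.FileReader` (or any other streaming `Reader`) instead of an in-memory string is what keeps memory use flat for large inputs.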

However, it doesn't support Nokogiri::XML::Reader#inner_xml and Nokogiri::XML::Reader#outer_xml.
I couldn't figure out how to support these with the StAX API.
Does anybody have an idea for this?
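
One possible direction, offered only as an untested-against-Nokogiri sketch: StAX's event API (`XMLEventReader` / `XMLEventWriter`) lets the events of a single element be copied into a writer, which re-serializes just that subtree. The class `OuterXmlSketch` and method `outerXml` below are hypothetical names, and the result is a re-serialization rather than the original bytes (for example, `<c/>` may come back as `<c></c>`), so it may not match libxml2's `outer_xml` output exactly.

```java
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
import java.io.Reader;
import java.io.StringWriter;

// Hypothetical helper, not part of Nokogiri or of this patch.
final class OuterXmlSketch {
    // Returns the re-serialized first element named localName, or null if absent.
    static String outerXml(Reader input, String localName) {
        try {
            XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(input);
            StringWriter out = new StringWriter();
            XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
            int depth = 0; // 0 means we are still outside the target element
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                boolean isTarget = event.isStartElement()
                        && localName.equals(event.asStartElement().getName().getLocalPart());
                if (depth == 0 && !isTarget) {
                    continue; // skip everything before the target element
                }
                writer.add(event); // copy the event into the subtree buffer
                if (event.isStartElement()) {
                    depth++;
                } else if (event.isEndElement() && --depth == 0) {
                    writer.flush();
                    writer.close();
                    return out.toString(); // target element fully copied
                }
            }
            return null;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Only the requested subtree is buffered, so memory use is bounded by the size of that element rather than the whole document. An `inner_xml` equivalent could skip the first and last copied events in the same loop.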

Owner

yokolet commented Oct 4, 2011

Thanks for the patch. However, it adds 10 failures and 8 errors to "rake test", so I can't merge it as it is.

Using the javax.xml.stream API looks like a good idea. I'll rethink the XmlReader implementation.

Would you file an issue for the problem the original reporter described?

We have the same issue. It appears that the Nokogiri::XML::Reader pull parser is not functional on JRuby. Any reasonably large file results in Java::JavaLang::OutOfMemoryError. Would love to see this patched.

Member

jvshahid commented Nov 12, 2013

There has been no activity on this PR for 2 years and there are no tests, which makes it impossible to tell whether the issue this PR is trying to fix is fixed on master. I'll go ahead and close it. If you feel this should be revisited, please reopen the PR after you add a test and rebase on master.

Thanks

@jvshahid jvshahid closed this Nov 12, 2013