Implement LIBXML_PARSEHUGE #294

Closed
robmcvey wants to merge 5 commits

robmcvey commented Mar 9, 2018

Parsing files larger than 10 MB results in "XML error: No memory" regardless of PHP's memory_limit setting. Using libxml's LIBXML_PARSEHUGE option and reading the XML data from a stream resource in chunks inside a loop fixes this (tested on a 150 MB RDF file).
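
The chunked reading described above can be sketched roughly as follows. This is a minimal illustration and not the code in this pull request: the function name `countElementsChunked` and the element-counting handler are hypothetical, and the sketch covers only the chunking part, since this excerpt does not show how the PR applies LIBXML_PARSEHUGE.

```php
<?php
// Hypothetical helper (not EasyRdf code): count start tags in an XML file by
// feeding the parser fixed-size chunks instead of one large string.
function countElementsChunked(string $path, int $chunkSize = 1048576): int
{
    $count = 0;
    $parser = xml_parser_create();
    xml_set_element_handler(
        $parser,
        // Start-tag handler: only counts elements.
        function ($parser, $name, $attrs) use (&$count) {
            $count++;
        },
        // End-tag handler: nothing to do in this sketch.
        function ($parser, $name) {
        }
    );

    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    try {
        while (!feof($handle)) {
            $chunk = fread($handle, $chunkSize);
            if ($chunk === false) {
                throw new RuntimeException("Cannot read from $path");
            }
            // The third argument marks the last chunk so the parser can
            // report errors that are only detectable at end of input.
            if (xml_parse($parser, $chunk, feof($handle)) !== 1) {
                $message = xml_error_string(xml_get_error_code($parser));
                throw new RuntimeException('XML error: ' . $message);
            }
        }
    } finally {
        fclose($handle);
    }

    return $count;
}
```

Only one chunk is held in memory at a time, so peak memory use depends on the chunk size rather than on the size of the file.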

@@ -159,6 +159,13 @@ public function testParseFile()
$this->assertSame(null, $name->getDatatype());
}

public function testParseLargeFile()

Daniel-KM commented Oct 30, 2018

Can you remove the tabs from this whole function? They make the Travis check fail.

Suggested change
public function testParseLargeFile()
public function testParseLargeFile()
{
$graph = new Graph();
$count = $graph->parseFile(fixturePath('stw.rdf'));
$this->assertSame(109340, $count);
}


$resource = fopen('data://text/plain,' . $data, 'r');

while ($data = fread($resource, 1024 * 1024)) {

You said you tested it on a 150 MB XML file, but is it a single-line file or a multiline one? If it is a single line (or contains lines longer than 1024 × 1024 characters), does it still work?
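
One way to probe this question is to parse a single-line document with chunk sizes that cut through the middle of tags and compare the results. The sketch below is an assumption-laden illustration, not a test from this pull request: `countStartTags` and the sample document are made up, and it says nothing about the 150 MB file mentioned above or about chunk edges that split a multibyte character.

```php
<?php
// Hypothetical check (not part of the PR): parse an in-memory XML string in
// fixed-size byte chunks and return the number of start tags seen.
function countStartTags(string $xml, int $chunkSize): int
{
    if ($xml === '' || $chunkSize < 1) {
        throw new InvalidArgumentException('Need non-empty XML and a positive chunk size');
    }

    $count = 0;
    $parser = xml_parser_create();
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$count) {
            $count++;
        },
        function ($parser, $name) {
        }
    );

    $length = strlen($xml);
    for ($offset = 0; $offset < $length; $offset += $chunkSize) {
        // Chunks are cut by byte offset only; line breaks play no role.
        $chunk = substr($xml, $offset, $chunkSize);
        $isFinal = ($offset + $chunkSize) >= $length;
        if (xml_parse($parser, $chunk, $isFinal) !== 1) {
            $message = xml_error_string(xml_get_error_code($parser));
            throw new RuntimeException('XML error: ' . $message);
        }
    }

    return $count;
}

// A document with no newlines at all; with a 7-byte chunk size most chunk
// edges fall inside a tag or inside text content.
$oneLine = '<root>' . str_repeat('<item id="1">value</item>', 50) . '</root>';

$small = countStartTags($oneLine, 7);
$large = countStartTags($oneLine, 1024 * 1024);
echo $small === $large ? "same result\n" : "results differ\n";
```

Because `xml_parse` is fed raw byte ranges and keeps its state between calls, the chunk size here is independent of how the document is split into lines.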

k00ni (Contributor) commented May 7, 2020

What is the status here?

k00ni (Contributor) commented Jul 6, 2020

This pull request should be merged. The failing test is due to a timeout; please restart the test suite.

Ref: #320

k00ni added a commit to k00ni/easyrdf that referenced this pull request Jul 6, 2020
These changes were contributed by @zozlak on my fork:
https://github.com/sweetyrdf/easyrdf/pull/16

There was a code change in the meantime compared to
easyrdf/easyrdf::master. It was merged from easyrdf#294 and introduced the `fopen` call
when loading an XML file.

Co-authored-by: Mateusz Żółtak <zozlak@zozlak.org>
Co-authored-by: Konrad Abicht <hi@inspirito.de>

njh (Collaborator) commented Aug 27, 2020

Thanks for your contribution @robmcvey.

I have decided to split the fixes out a bit and I am replacing this PR with #352 and #357.

Sorry it didn't get merged.

njh closed this Aug 27, 2020