Skip to content

Prevent eternal loop by ensuring that no libxml errors with level [LIBXML_ERR_FATAL] have occurred#44

Closed
DaanBiesterbos wants to merge 10 commits intosabre-io:masterfrom
DaanBiesterbos:master
Closed

Prevent eternal loop by ensuring that no libxml errors with level [LIBXML_ERR_FATAL] have occurred#44
DaanBiesterbos wants to merge 10 commits intosabre-io:masterfrom
DaanBiesterbos:master

Conversation

@DaanBiesterbos
Copy link

Hi,
I was facing some instability issues regarding invalid XML documents. I would expect an exception but instead the Reader will (in some cases) attempt to keep parsing the same node eternally. I fixed this by adding a validation step. The XMLReader::isValid may not be backward compatible so I made the validation step optional. Altough I did not validate this thouroughly it seems like this method will only work when the something like a XSD, DTD etc is in place. Either way. The feature is optional, so this change should not be problematic.

Update:
The validation did not fix all scenario's. The new solution does. Basically we now check if libxml has fatal errors on each iteration. I did not implement this as a optional feature since it ensures that your application will not crash when invalid XML is provided.

@Hywan Hywan added the bug label May 29, 2015
@Hywan
Copy link
Member

Hywan commented May 29, 2015

Hello :-),

First, can you give us an example of your XML in order to reproduce your issue please?
Second, do you have any idea of the impact on performances?
Third, is it possible to validate the whole document once instead of each node?

@evert
Copy link
Member

evert commented May 29, 2015

Damn... lost my work in progress comment.

@Hywan One of the key concepts of the XMLReader/XMLWriter is that they can operate on streams. So if validation is possible for the whole document, I don't think it would be desired. I'm also not super concerned about speed if it's an optional feature.

@DaanBiesterbos

A few points:

  • @Hywan is right that this change needs a test. A unittest that checks both the valid and invalid case.
  • You can automatically fix all CS-related issues by running : ./bin/sabre-cs-fixer fix .
  • The docblocks should be expanded a bit to explain exactly what the feature does, and how to use it.
  • Don't return $this from the setter.

@evert
Copy link
Member

evert commented May 29, 2015

Lastly... is it possible to emit a more useful exception? It would be great if we can tell the user why their document isn't valid... i assume that XMLReader has a way to access this information?

@evert evert added enhancement and removed bug labels May 29, 2015
@evert evert self-assigned this May 29, 2015
@DaanBiesterbos
Copy link
Author

@Hywan @evert Hi guys. I will prepare something to reproduce the problem and improve the commit as described above. As evert pointed out the isValid method will always only check the current node. But there is a wa to check if the XML is valid according to the schema. (XMLReader::setSchema) This does seem to work for the entire tree. Nodes that violate the schema are detected but syntax errors seem to be ignored completely. I tried to invoke $reader->setParserProperty(XMLReader::VALIDATE, true) as well but in my case that didn't change anyhing so I got rid of it.

One more thing...
I executed the the ./bin/sabre-cs-fixer fix command. But a LogicException was thrown in ./vendor/symfony/finder/Symfony/Component/Finder/Finder.php on line 698. Message: "You must call one of in() or append() methods before iterating over a Finder."
$config->getDir() returns null in "/home/daan/Projecten/SabreXmlReader/sabre-xml/vendor/fabpot/php-cs-fixer/Symfony/CS/Fixer.php" on line 137.

It seems like adding '$out->setDir(DIR . '/../lib');' to "/home/daan/Projecten/SabreXmlReader/sabre-xml/vendor/sabre/cs/bin/php-cs-fixer-config.php" solved the problem locally. I can now execute the command. Not sure what is causing this though. I tried downgrading to symfony 2.6.7, but that did not solve it.

@evert
Copy link
Member

evert commented May 29, 2015

Did you do sabre-cs-fixer fix . ? the last . is important =)

@DaanBiesterbos
Copy link
Author

@evert Ah. You are right. Missed the dot. :-)

@DaanBiesterbos DaanBiesterbos changed the title Add optional validation method to ensure that each XML node is valid Prevent eternal loop by ensuring that no libxml errors with level [LIBXML_ERR_FATAL] have occurred May 30, 2015
@DaanBiesterbos
Copy link
Author

To reproduce the issue you can run the unit test and comment the solution :) All scenario's that should fail because of missing tags or malformed XML will lead to an eternal loop.

@DaanBiesterbos
Copy link
Author

@evert Hi Evert. How do you feel about the provided solution? Were you able to reproduce the problem?

@evert
Copy link
Member

evert commented Jun 8, 2015

Hi Daan,

The reason I have not really gotten to it yet, is because this is a relatively large diff, for a relatively small change. There's a lot of unrelated changes in the source, and things could probably get simplified a bit.

Will make a few comments in-line.

@evert
Copy link
Member

evert commented Jun 8, 2015

The biggest thing I would want to ask is.. how small can you make this unittest? Right now there's a ton of assertions, and a lot of other stuff. Could you create a very minimal schema?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated change, revert this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants