XPath – text() should use normalization #21

Open
FilipJirsak opened this issue Jan 9, 2017 · 2 comments

@FilipJirsak
Contributor

In an input document, the text of a single element can be split into multiple text nodes (text() then returns multiple nodes). XPath matching then doesn't work: it matches the text against the first text node only. Text nodes should be normalized before XPath matching.
Input text is split into multiple text nodes when, for example, it contains entities such as &lt; or &gt;.
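
To illustrate how entities can split text, here is a minimal sketch using the JDK's StAX reader. This is an assumption for illustration only: the thread does not say which parser the project uses, and the class name EntitySplitSketch and the method textChunks are hypothetical. The number of chunks reported without coalescing depends on the parser implementation, so the sketch only relies on the chunks joining back into the full text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of the project discussed in this issue.
class EntitySplitSketch {

    /** Returns the text of every CHARACTERS event as a separate chunk. */
    static List<String> textChunks(String xml, boolean coalescing) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // IS_COALESCING asks the parser to merge adjacent character data.
        factory.setProperty(XMLInputFactory.IS_COALESCING, coalescing);
        List<String> chunks = new ArrayList<>();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    chunks.add(reader.getText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
        return chunks;
    }

    public static void main(String[] args) {
        String xml = "<element>a &lt; b</element>";
        // May print more than one chunk, depending on the parser.
        System.out.println(textChunks(xml, false));
        // With coalescing, the text arrives as a single chunk.
        System.out.println(textChunks(xml, true));
    }
}
```

Code that compares against only the first chunk sees a truncated value, which is the failure described above.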

@FilipJirsak FilipJirsak added the bug label Jan 9, 2017
@FilipJirsak FilipJirsak self-assigned this Jan 9, 2017
@FilipJirsak FilipJirsak added this to the 2.1.0 milestone Jun 17, 2017
@reluxa

reluxa commented Oct 30, 2018

I guess this belongs here: I was also trying to extract the text() of an element from an XML document. The element looked like the following:

<element>TOOXYZ:sometext /TOOXYZ:otherText</element>

The above XML fragment occurs many times in the XML document I was trying to process. Interestingly, the last occurrence was not parsed correctly: text() returned two nodes, "TOOXYZ:sometext /TOO" and "XYZ:otherText".

@FilipJirsak
Contributor Author

@reluxa This is correct, there can be multiple adjacent text nodes. You can call document.normalize() to merge adjacent text nodes.
This issue is about XPath matching: normalization should probably be done automatically before XPath matching, because XPath expects normalized documents.
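
As a rough sketch of the workaround, the following uses the standard JDK DOM and XPath APIs to build an element with two adjacent text nodes (the values from the report above), normalize it, and then read the text with XPath. The class and method names are hypothetical, and the project's own document type may behave differently from the JDK's.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of the project discussed in this issue.
class NormalizeSketch {

    /** Builds <element> with two adjacent text nodes, mimicking a split text run. */
    static Document buildSplitDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element element = doc.createElement("element");
            doc.appendChild(element);
            element.appendChild(doc.createTextNode("TOOXYZ:sometext /TOO"));
            element.appendChild(doc.createTextNode("XYZ:otherText"));
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create a DOM builder", e);
        }
    }

    /** Number of child nodes directly under the root element. */
    static int childCount(Document doc) {
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    /** String value of the first node selected by /element/text(). */
    static String firstText(Document doc) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate("/element/text()", doc);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        Document doc = buildSplitDocument();
        System.out.println(childCount(doc)); // 2: two adjacent text nodes
        doc.normalize();                     // merges adjacent text nodes
        System.out.println(childCount(doc)); // 1: a single text node
        System.out.println(firstText(doc));  // TOOXYZ:sometext /TOOXYZ:otherText
    }
}
```

The XPath 1.0 data model states that a text node never has an adjacent text-node sibling, which is why normalizing before matching is the safer order.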
