[DomCrawler] Fixed filterXPath() chaining #10207
Well, to fix the issue fully, it is not the only place which should be changed.
I've found another reference to the magic "_root" node (in parents()), which I've removed. @stof Is that what you mean? If not, could you please give me a hint of the other place(s) you're referring to? Thanks!
@robbertkl there are also such cases in the handling of forms in
@stof Thanks, I see what you mean now. I've added changes in
👍
@robbertkl could you rebase your work? It conflicts with the merge of #10205.
Done!
Thanks for fixing this bug @robbertkl.
This PR was merged into the 2.3 branch.

Discussion
----------

[DomCrawler] Fixed filterXPath() chaining

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | debatable (see below)
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | #10206
| License       | MIT
| Doc PR        |

As @stof mentions in #10206, each node in the Crawler can belong to a different \DOMDocument. Therefore, I've made each node do its own XPath, relative to itself, and add all the results to a new Crawler. This way, all resulting nodes are still part of their original \DOMDocument and thus can reach all of their parent nodes.

No current tests break on this change. I've added a new test for this case, by checking if the number of parents is correct after obtaining a node through chaining of `filterXPath()`.

Now for BC: I can think of a number of cases where this change would give a different result. However, it's debatable/unclear if:
- the old behavior was a bug in the first place (which would validate this change), or
- the old behavior was intended (which would make this a BC breaking change)

As an example, consider the following HTML:
```html
<div name="a"><div name="b"><div name="c"></div></div></div>
```

What would happen if we run this:
```php
echo $crawler->filterXPath('//div')->filterXPath('div')->filterXPath('div')->attr('name');
```

Aside from breaking reachability of the parent nodes by chaining, with the original code it would echo 'a'.
With this patch it would echo 'c', which, to me, makes more sense.

Commits
-------

43a7716 [DomCrawler] Fixed filterXPath() chaining
This commit has broken a functional test in one of my applications. I am working on putting together a simple example to reproduce.
Reverted, as it causes regressions like the one mentioned in #10260.
* 2.3:
  Revert "bug #10207 [DomCrawler] Fixed filterXPath() chaining (robbertkl)"
  Bypass sigchild detection if phpinfo is not available

  Conflicts:
  src/Symfony/Component/DomCrawler/Crawler.php

* 2.4:
  Revert "bug #10207 [DomCrawler] Fixed filterXPath() chaining (robbertkl)"
  Bypass sigchild detection if phpinfo is not available
…nt DOM nodes (stof, robbertkl)

This PR was merged into the 2.3 branch.

Discussion
----------

[DomCrawler] Fixed filterXPath() chaining loosing the parent DOM nodes

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | #10206
| License       | MIT
| Doc PR        | n/a

This is a fixed version of #10207, preserving the BC for XPath queries. It is the rebased version of #10935, targeting 2.3.

The example given in #10260 when reporting the regression in the previous attempt is covered by the new tests added in the first commit of the PR. I also added many tests ensuring that the behavior is the same as in the current implementation.

Commits
-------

80438c2 Fixed the XPath filtering to have the same behavior than Symfony 2.4
711ac32 [DomCrawler] Fixed filterXPath() chaining
8f706c9 [DomCrawler] Added more tests for the XPath filtering
As @stof mentions in #10206, each node in the Crawler can belong to a different \DOMDocument. Therefore, I've made each node do its own XPath, relative to itself, and add all the results to a new Crawler. This way, all resulting nodes are still part of their original \DOMDocument and thus can reach all of their parent nodes.
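The per-node evaluation described above can be sketched with plain `DOMXPath`, outside of the Crawler. This is only an illustration under stated assumptions: `filterXPathPerNode()` and `countAncestorElements()` are hypothetical helpers written for this example, not the code from the patch, and the document is loaded with `loadXML()` so that no `html`/`body` wrappers are added.

```php
<?php
// Hypothetical helper (not Symfony's code): evaluate $xpath relative to each
// node, inside that node's own document, and collect all matches.
function filterXPathPerNode(array $nodes, string $xpath): array
{
    $matches = [];
    foreach ($nodes as $node) {
        // A DOMDocument is its own owner; every other node points at its document.
        $document = $node instanceof DOMDocument ? $node : $node->ownerDocument;
        $domxpath = new DOMXPath($document);
        // The second argument makes $node the context node of the query.
        foreach ($domxpath->query($xpath, $node) as $match) {
            $matches[] = $match;
        }
    }

    return $matches;
}

// Hypothetical helper: count the element ancestors of a node.
function countAncestorElements(DOMNode $node): int
{
    $count = 0;
    while (($node = $node->parentNode) instanceof DOMElement) {
        $count++;
    }

    return $count;
}

$document = new DOMDocument();
$document->loadXML('<div name="a"><div name="b"><div name="c"></div></div></div>');

$all = filterXPathPerNode([$document], '//div');
$inner = filterXPathPerNode($all, 'div[@name="c"]');

// The innermost div was never copied into another document, so both of its
// parent divs are still reachable.
echo countAncestorElements($inner[0]), "\n"; // 2
```

Because no node is imported into a fresh document, the matched node keeps its original `ownerDocument` and its full ancestor chain.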
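No current tests break on this change. I've added a new test for this case, by checking if the number of parents is correct after obtaining a node through chaining of `filterXPath()`.
Now for BC: I can think of a number of cases where this change would give a different result. However, it's debatable/unclear if:
- the old behavior was a bug in the first place (which would validate this change), or
- the old behavior was intended (which would make this a BC breaking change)
As an example, consider the following HTML: `<div name="a"><div name="b"><div name="c"></div></div></div>`
What would happen if we run this: `echo $crawler->filterXPath('//div')->filterXPath('div')->filterXPath('div')->attr('name');`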
Aside from breaking reachability of the parent nodes by chaining, with the original code it would echo 'a'.
With this patch it would echo 'c', which, to me, makes more sense.
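To see why the patched behavior ends up at 'c', the chain can be traced step by step with plain `DOMXPath`. The sketch below is an assumption-laden stand-in, not the Crawler itself: `filterEach()` and `names()` are hypothetical helpers, and the markup is loaded with `loadXML()` so that only the three `div` elements exist. It does not try to reproduce the old behavior.

```php
<?php
// Hypothetical stand-in for the patched filterXPath(): every node runs the
// query relative to itself, within its own document.
function filterEach(array $nodes, string $xpath): array
{
    $matches = [];
    foreach ($nodes as $node) {
        // ownerDocument is null for a DOMDocument, so fall back to the node itself.
        $domxpath = new DOMXPath($node->ownerDocument ?? $node);
        foreach ($domxpath->query($xpath, $node) as $match) {
            $matches[] = $match;
        }
    }

    return $matches;
}

// Hypothetical helper: list the "name" attribute of each matched element.
function names(array $nodes): array
{
    $result = [];
    foreach ($nodes as $node) {
        $result[] = $node->getAttribute('name');
    }

    return $result;
}

$document = new DOMDocument();
$document->loadXML('<div name="a"><div name="b"><div name="c"></div></div></div>');

$step1 = filterEach([$document], '//div'); // every div in the document
$step2 = filterEach($step1, 'div');        // child divs of each match
$step3 = filterEach($step2, 'div');        // child divs of those

echo implode(',', names($step1)), "\n"; // a,b,c
echo implode(',', names($step2)), "\n"; // b,c
echo implode(',', names($step3)), "\n"; // c
```

Each relative `div` step moves one level down from the previous matches, which is why the third step leaves only the innermost element.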