[DomCrawler] Fixed filterXPath() chaining #10207

Merged: 1 commit, Feb 5, 2014

robbertkl
Contributor

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | debatable (see below)
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | #10206
| License       | MIT
| Doc PR        |

As @stof mentions in #10206, each node in the Crawler can belong to a different \DOMDocument. Therefore, I've made each node do its own XPath, relative to itself, and add all the results to a new Crawler. This way, all resulting nodes are still part of their original \DOMDocument and thus can reach all of their parent nodes.
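
To make the per-node idea concrete, here is a minimal sketch using PHP's DOM extension directly rather than the Crawler class. `filterPerNode()` is a hypothetical helper name, not the code from this patch, and it assumes every input node is an element (so `ownerDocument` is never null).

```php
<?php
// Hypothetical helper, not the actual Crawler code: run $xpath once per node,
// using that node as the context and its own document for the DOMXPath object.
function filterPerNode(array $nodes, string $xpath): array
{
    $matches = [];
    foreach ($nodes as $node) {
        $domxpath = new DOMXPath($node->ownerDocument);
        foreach ($domxpath->query($xpath, $node) as $match) {
            $matches[] = $match;
        }
    }

    return $matches;
}

// Two separate documents, as can happen inside a single Crawler.
$first = new DOMDocument();
$first->loadXML('<ul><li>one</li><li>two</li></ul>');
$second = new DOMDocument();
$second->loadXML('<ul><li>three</li></ul>');

$items = filterPerNode([$first->documentElement, $second->documentElement], 'li');

foreach ($items as $item) {
    // Each match still belongs to the document it came from.
    $origin = $item->ownerDocument->isSameNode($first) ? 'first' : 'second';
    echo $item->textContent, ' (', $origin, ' document)', PHP_EOL;
}
```

Because nothing is copied, each result keeps its original parent chain.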

No current tests break on this change. I've added a new test for this case, by checking if the number of parents is correct after obtaining a node through chaining of filterXPath().
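
The parent-count check can be illustrated outside the test suite. The sketch below is not the added test. It assumes the pre-patch code copied matched nodes under a synthetic `_root` element in a fresh document (the magic "_root" node that comes up later in this thread), and `countAncestors()` is a hypothetical stand-in for counting the result of `parents()`.

```php
<?php
// Count element ancestors by walking parentNode upwards.
function countAncestors(DOMNode $node): int
{
    $count = 0;
    while (($node = $node->parentNode) instanceof DOMElement) {
        $count++;
    }

    return $count;
}

$document = new DOMDocument();
$document->loadXML('<html><body><div><p>text</p></div></body></html>');
$paragraph = $document->getElementsByTagName('p')->item(0);

// Assumed pre-patch approach: deep-copy the node under "_root" in a new document.
$copyDocument = new DOMDocument('1.0', 'UTF-8');
$root = $copyDocument->appendChild($copyDocument->createElement('_root'));
$copy = $root->appendChild($copyDocument->importNode($paragraph, true));

echo countAncestors($paragraph), PHP_EOL; // 3: div, body, html
echo countAncestors($copy), PHP_EOL;      // 1: only _root
```

The copy has lost its real ancestors, which is the situation the new test guards against.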

Now for BC: I can think of a number of cases where this change would give a different result. However, it's debatable/unclear if:

  • the old behavior was a bug in the first place (which would validate this change), or
  • the old behavior was intended (which would make this a BC breaking change)

As an example, consider the following HTML:

<div name="a"><div name="b"><div name="c"></div></div></div>

What would happen if we run this:

echo $crawler->filterXPath('//div')->filterXPath('div')->filterXPath('div')->attr('name');

With the original code, apart from chaining making the parent nodes unreachable, this would echo 'a'.
With this patch it would echo 'c', which, to me, makes more sense.
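
The difference can be reproduced with the plain DOM extension. This is a sketch under assumptions, not the Crawler implementation: `filterByCopy()` imitates what I understand the original behavior to be (copy the nodes under a `_root` element, then query the new document), and `filterPerNode()` imitates the patched behavior. Both names are hypothetical. It relies on `DOMXPath::query()` evaluating relative expressions from the root element when no context node is given.

```php
<?php
// Assumed original behavior: copy nodes under "_root" and query the copy.
function filterByCopy(array $nodes, string $xpath): array
{
    $document = new DOMDocument('1.0', 'UTF-8');
    $root = $document->appendChild($document->createElement('_root'));
    foreach ($nodes as $node) {
        $root->appendChild($document->importNode($node, true));
    }

    // No context node: a relative expression such as "div" starts at "_root".
    return iterator_to_array((new DOMXPath($document))->query($xpath));
}

// Patched behavior as described above: each node is its own XPath context.
function filterPerNode(array $nodes, string $xpath): array
{
    $matches = [];
    foreach ($nodes as $node) {
        $domxpath = new DOMXPath($node->ownerDocument);
        foreach ($domxpath->query($xpath, $node) as $match) {
            $matches[] = $match;
        }
    }

    return $matches;
}

$document = new DOMDocument();
$document->loadXML('<div name="a"><div name="b"><div name="c"></div></div></div>');
$start = [$document->documentElement];

$old = filterByCopy(filterByCopy(filterByCopy($start, '//div'), 'div'), 'div');
$new = filterPerNode(filterPerNode(filterPerNode($start, '//div'), 'div'), 'div');

echo $old[0]->getAttribute('name'), PHP_EOL; // a
echo $new[0]->getAttribute('name'), PHP_EOL; // c
```

In the copy-based version, `div` matches the copied nodes themselves (they are the children of `_root`), so every step returns a, b and c again. In the per-node version each step descends one level, leaving only c.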

@stof
Member

stof commented Feb 5, 2014

Well, to fix the issue fully, this is not the only place that needs to change.

@robbertkl
Contributor Author

I've found another reference to the magic "_root" node (in parents()), which I've removed.

@stof Is that what you mean? If not, could you please give me a hint of the other place(s) you're referring to? Thanks!

@stof
Member

stof commented Feb 5, 2014

@robbertkl there are also such cases in the handling of forms in Form::initialize

@robbertkl
Contributor Author

@stof Thanks, I see what you mean now.

I've added changes in Form::initialize() and in Field\FormField::__construct() as well. Tests still pass.

@stof
Member

stof commented Feb 5, 2014

👍

@stof
Member

stof commented Feb 5, 2014

@robbertkl could you rebase your work? It conflicts with the merge of #10205.

@robbertkl
Contributor Author

Done!

@fabpot
Member

fabpot commented Feb 5, 2014

Thanks for fixing this bug @robbertkl.

fabpot added a commit that referenced this pull request Feb 5, 2014
This PR was merged into the 2.3 branch.

Commits
-------

43a7716 [DomCrawler] Fixed filterXPath() chaining
@fabpot fabpot merged commit 43a7716 into symfony:2.3 Feb 5, 2014
@robbertkl robbertkl deleted the ticket_10206 branch February 5, 2014 18:12
@tommygnr
Contributor

This commit has broken a functional test in one of my applications. I am working on putting together a simple example to reproduce it.

fabpot added a commit that referenced this pull request Feb 18, 2014
…kl)"

This reverts commit c11c588, reversing
changes made to e453c45.
@fabpot
Member

fabpot commented Feb 18, 2014

Reverted, as it causes a regression, as mentioned in #10260.

@fabpot fabpot mentioned this pull request Feb 18, 2014
fabpot added a commit that referenced this pull request Feb 18, 2014
* 2.3:
  Revert "bug #10207 [DomCrawler] Fixed filterXPath() chaining (robbertkl)"
  Bypass sigchild detection if phpinfo is not available

Conflicts:
	src/Symfony/Component/DomCrawler/Crawler.php
fabpot added a commit that referenced this pull request Feb 18, 2014
* 2.4:
  Revert "bug #10207 [DomCrawler] Fixed filterXPath() chaining (robbertkl)"
  Bypass sigchild detection if phpinfo is not available
fabpot added a commit that referenced this pull request May 21, 2014
…nt DOM nodes (stof, robbertkl)

This PR was merged into the 2.3 branch.

Discussion
----------

[DomCrawler] Fixed filterXPath() chaining losing the parent DOM nodes

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | #10206
| License       | MIT
| Doc PR        | n/a

This is a fixed version of #10207, preserving BC for XPath queries. It is the rebased version of #10935, targeting 2.3.

The example given in #10260 when reporting the regression in the previous attempt is covered by the new tests added in the first commit of the PR.
I also added many tests ensuring that the behavior is the same as in the current implementation.

Commits
-------

80438c2 Fixed the XPath filtering to have the same behavior than Symfony 2.4
711ac32 [DomCrawler] Fixed filterXPath() chaining
8f706c9 [DomCrawler] Added more tests for the XPath filtering