Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DomCrawler] Improve Crawler HTML5 parser need detection #30892

Merged
merged 1 commit into from Apr 6, 2019

Conversation

@tgalopin
Copy link
Member

commented Apr 6, 2019

Q A
Branch? master
Bug fix? kind of
New feature? no
BC breaks? no
Deprecations? no>
Tests pass? yes
Fixed tickets -
License MIT
Doc PR -

Live from #eu-fossa

Follow up of #29306

This PR introduces a better detection mechanism to choose when to parse using the HTML5 parser or not, and fix a subcrawler parsing issue as well.

@stof I'd be super interested by your review :) !

@tgalopin tgalopin force-pushed the tgalopin:improve-html5-crawler-detection branch from 57c9bee to 96b5f69 Apr 6, 2019

@javiereguiluz

This comment has been minimized.

Copy link
Member

commented Apr 6, 2019

@tgalopin will you add a config option in FrameworkBundle to enable/disable the HTML5 parser? Thanks.

Show resolved Hide resolved src/Symfony/Component/DomCrawler/Crawler.php Outdated
@tgalopin

This comment has been minimized.

Copy link
Member Author

commented Apr 6, 2019

@javiereguiluz the usage of the HTML5 parser is when 1/ the library is available 2/ the doctype is HTML5. I think it's better than having a configuration option (and was recommended by Nicolas)

Show resolved Hide resolved src/Symfony/Component/DomCrawler/Crawler.php Outdated

wouterj added a commit to symfony/symfony-docs that referenced this pull request Apr 6, 2019

minor #11316 Minor improvement in the DomCrawler + HTML5 explanation …
…(javiereguiluz)

This PR was squashed before being merged into the master branch (closes #11316).

Discussion
----------

Minor improvement in the DomCrawler + HTML5 explanation

Minor improvement according to @tgalopin comments in symfony/symfony#30892.

Commits
-------

1c7eb79 Minor improvement in the DomCrawler + HTML5 explanation

@tgalopin tgalopin force-pushed the tgalopin:improve-html5-crawler-detection branch from 96b5f69 to 92cd9ec Apr 6, 2019

@tgalopin tgalopin force-pushed the tgalopin:improve-html5-crawler-detection branch from 92cd9ec to 9bbdab6 Apr 6, 2019

@tgalopin

This comment has been minimized.

Copy link
Member Author

commented Apr 6, 2019

Updated

@stof

stof approved these changes Apr 6, 2019

@fabpot

fabpot approved these changes Apr 6, 2019

@fabpot

This comment has been minimized.

Copy link
Member

commented Apr 6, 2019

Thank you @tgalopin.

@fabpot fabpot merged commit 9bbdab6 into symfony:master Apr 6, 2019

1 of 3 checks passed

continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
fabbot.io Your code looks good.
Details

fabpot added a commit that referenced this pull request Apr 6, 2019

feature #30892 [DomCrawler] Improve Crawler HTML5 parser need detecti…
…on (tgalopin)

This PR was merged into the 4.3-dev branch.

Discussion
----------

[DomCrawler] Improve Crawler HTML5 parser need detection

| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | kind of
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no>
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

Live from #eu-fossa

Follow up of #29306

This PR introduces a better detection mechanism to choose when to parse using the HTML5 parser or not, and fix a subcrawler parsing issue as well.

@stof I'd be super interested by your review :) !

Commits
-------

9bbdab6 [DomCrawler] Improve Crawler HTML5 parser need detection

@tgalopin tgalopin deleted the tgalopin:improve-html5-crawler-detection branch Apr 6, 2019

@nicolas-grekas nicolas-grekas added this to the next milestone Apr 7, 2019

@nicolas-grekas nicolas-grekas removed this from the next milestone Apr 30, 2019

@nicolas-grekas nicolas-grekas added this to the 4.3 milestone Apr 30, 2019

@fabpot fabpot referenced this pull request May 9, 2019

Merged

Release v4.3.0-BETA1 #31435

nicolas-grekas added a commit that referenced this pull request May 9, 2019

minor #31257 [DomCrawler] fix HTML5 parser integration (nicolas-grekas)
This PR was merged into the 4.3 branch.

Discussion
----------

[DomCrawler] fix HTML5 parser integration

| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

Spotted while reviewing #30892
The current logic is context-dependent: by changing the order of calls, you can get different behaviors.

Commits
-------

ba83bda [DomCrawler] fix HTML5 parser integration
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.