-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Properly yield results from html5lib
parsing
#10846
Conversation
ad0f754
to
d7ebb68
Compare
Thanks for finding the issue! I was looking suspiciously at that line of code but I can't easily test changes to pip inside my corporate environment. That's a subtle bug, I can see why it was missed. You think you're returning a generator but because |
Yea! I missed this thing in the original PR. That there were no tests for this |
The earlier variant _returned_ an iterable object from a generator. This did not properly handle the fallback, resulting in the html5lib code path not being executed.
d7ebb68
to
80609e8
Compare
Merging this without additional reviews, since this seems pretty safe. Holler if there's any concerns! |
Bumps [pip](https://github.com/pypa/pip) from 21.3.1 to 22.0.2. <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/pypa/pip/blob/main/NEWS.rst">pip's changelog</a>.</em></p> <blockquote> <h1>22.0.2 (2022-01-30)</h1> <h2>Deprecations and Removals</h2> <ul> <li>Instead of failing on index pages that use non-compliant HTML 5, print a deprecation warning and fall back to <code>html5lib</code>-based parsing for now. This simplifies the migration for non-compliant index pages, by letting such indexes function with a warning. (<code>[#10847](pypa/pip#10847) <https://github.com/pypa/pip/issues/10847></code>_)</li> </ul> <h1>22.0.1 (2022-01-30)</h1> <h2>Bug Fixes</h2> <ul> <li>Accept lowercase <code><!doctype html></code> on index pages. (<code>[#10844](pypa/pip#10844) <https://github.com/pypa/pip/issues/10844></code>_)</li> <li>Properly handle links parsed by html5lib, when using <code>--use-deprecated=html5lib</code>. (<code>[#10846](pypa/pip#10846) <https://github.com/pypa/pip/issues/10846></code>_)</li> </ul> <h1>22.0 (2022-01-29)</h1> <h2>Process</h2> <ul> <li>Completely replace :pypi:<code>tox</code> in our development workflow, with :pypi:<code>nox</code>.</li> </ul> <h2>Deprecations and Removals</h2> <ul> <li> <p>Deprecate alternative progress bar styles, leaving only <code>on</code> and <code>off</code> as available choices. (<code>[#10462](pypa/pip#10462) <https://github.com/pypa/pip/issues/10462></code>_)</p> </li> <li> <p>Drop support for Python 3.6. (<code>[#10641](pypa/pip#10641) <https://github.com/pypa/pip/issues/10641></code>_)</p> </li> <li> <p>Disable location mismatch warnings on Python versions prior to 3.10.</p> <p>These warnings were helping identify potential issues as part of the sysconfig -> distutils transition, and we no longer need to rely on reports from older Python versions for information on the transition. (<code>[#10840](pypa/pip#10840) <https://github.com/pypa/pip/issues/10840></code>_)</p> </li> </ul> <h2>Features</h2> <ul> <li> <p>Changed <code>PackageFinder</code> to parse HTML documents using the stdlib :class:<code>html.parser.HTMLParser</code> class instead of the <code>html5lib</code> package.</p> <p>For now, the deprecated <code>html5lib</code> code remains and can be used with the <code>--use-deprecated=html5lib</code> command line option. However, it will be removed in a future pip release. (<code>[#10291](pypa/pip#10291) <https://github.com/pypa/pip/issues/10291></code>_)</p> </li> <li> <p>Utilise <code>rich</code> for presenting pip's default download progress bar. (<code>[#10462](pypa/pip#10462) <https://github.com/pypa/pip/issues/10462></code>_)</p> </li> <li> <p>Present a better error message when an invalid wheel file is encountered, providing more context where the invalid wheel file is. (<code>[#10535](pypa/pip#10535) <https://github.com/pypa/pip/issues/10535></code>_)</p> </li> <li> <p>Documents the <code>--require-virtualenv</code> flag for <code>pip install</code>. (<code>[#10588](pypa/pip#10588) <https://github.com/pypa/pip/issues/10588></code>_)</p> </li> <li> <p><code>pip install <tab></code> autocompletes paths. (<code>[#10646](pypa/pip#10646) <https://github.com/pypa/pip/issues/10646></code>_)</p> </li> <li> <p>Allow Python distributors to opt-out from or opt-in to the <code>sysconfig</code> installation scheme backend by setting <code>sysconfig._PIP_USE_SYSCONFIG</code> to <code>True</code> or <code>False</code>. (<code>[#10647](pypa/pip#10647) <https://github.com/pypa/pip/issues/10647></code>_)</p> </li> <li> <p>Make it possible to deselect tests requiring cryptography package on systems where it cannot be installed. (<code>[#10686](pypa/pip#10686) <https://github.com/pypa/pip/issues/10686></code>_)</p> </li> <li> <p>Start using Rich for presenting error messages in a consistent format. (<code>[#10703](pypa/pip#10703) <https://github.com/pypa/pip/issues/10703></code>_)</p> </li> <li> <p>Improve presentation of errors from subprocesses. (<code>[#10705](pypa/pip#10705) <https://github.com/pypa/pip/issues/10705></code>_)</p> </li> </ul> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/pypa/pip/commit/c721f03190ef888711e8e9efe6ca8345ec6464f3"><code>c721f03</code></a> Bump for release</li> <li><a href="https://github.com/pypa/pip/commit/844b799c9cf68629cc3c45b75c6e3b7f41086d49"><code>844b799</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/10847">#10847</a> from pradyunsg/better-html5lib-fallback</li> <li><a href="https://github.com/pypa/pip/commit/a78845ab3387adfe590e2a0ae044a6d3b20ada55"><code>a78845a</code></a> Pacify functional tests that don't start with <code>\<!doctype html></code></li> <li><a href="https://github.com/pypa/pip/commit/c3a42f0679d06bfdb3475801618273be5bbce1e8"><code>c3a42f0</code></a> 📰</li> <li><a href="https://github.com/pypa/pip/commit/c01b0b2729ee096c73416059de825e2f2f01bed9"><code>c01b0b2</code></a> Gracefully fallback to html5lib for parsing non-compliant index pages</li> <li><a href="https://github.com/pypa/pip/commit/cc35c930b2d7096babb267ddce9382c30c58c7e1"><code>cc35c93</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/10850">#10850</a> from pradyunsg/release/22.0.1</li> <li><a href="https://github.com/pypa/pip/commit/1b6ef5d0b387d84b1909799d1e18d0b4431038c3"><code>1b6ef5d</code></a> Bump for development</li> <li><a href="https://github.com/pypa/pip/commit/c73ac8d6bcf4f64041cafeccd2125cca052abed9"><code>c73ac8d</code></a> Bump for release</li> <li><a href="https://github.com/pypa/pip/commit/9a9c1def6e3bba1f2a860874c8c73b5c55f7f43c"><code>9a9c1de</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/10846">#10846</a> from pradyunsg/fix-html5lib-fallback</li> <li><a href="https://github.com/pypa/pip/commit/80609e8c20a8db26c97037f252b29307ab44b0e2"><code>80609e8</code></a> Properly yield results from <code>html5lib</code> parsing</li> <li>Additional commits viewable in <a href="https://github.com/pypa/pip/compare/21.3.1...22.0.2">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=pip&package-manager=pip&previous-version=21.3.1&new-version=22.0.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details>
Bumps [pip](https://github.com/pypa/pip) from 21.3.1 to 22.0.3. <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/pypa/pip/blob/main/NEWS.rst">pip's changelog</a>.</em></p> <blockquote> <h1>22.0.3 (2022-02-03)</h1> <h2>Features</h2> <ul> <li>Print the exception via <code>rich.traceback</code>, when running with <code>--debug</code>. (<code>[#10791](pypa/pip#10791) <https://github.com/pypa/pip/issues/10791></code>_)</li> </ul> <h2>Bug Fixes</h2> <ul> <li> <p>Only calculate topological installation order, for packages that are going to be installed/upgraded.</p> <p>This fixes an <code>AssertionError</code> that occured when determining installation order, for a very specific combination of upgrading-already-installed-package + change of dependencies + fetching some packages from a package index. This combination was especially common in Read the Docs' builds. (<code>[#10851](pypa/pip#10851) <https://github.com/pypa/pip/issues/10851></code>_)</p> </li> <li> <p>Use <code>html.parser</code> by default, instead of falling back to <code>html5lib</code> when <code>--use-deprecated=html5lib</code> is not passed. (<code>[#10869](pypa/pip#10869) <https://github.com/pypa/pip/issues/10869></code>_)</p> </li> </ul> <h2>Improved Documentation</h2> <ul> <li>Clarify that using per-requirement overrides disables the usage of wheels. (<code>[#9674](pypa/pip#9674) <https://github.com/pypa/pip/issues/9674></code>_)</li> </ul> <h1>22.0.2 (2022-01-30)</h1> <h2>Deprecations and Removals</h2> <ul> <li>Instead of failing on index pages that use non-compliant HTML 5, print a deprecation warning and fall back to <code>html5lib</code>-based parsing for now. This simplifies the migration for non-compliant index pages, by letting such indexes function with a warning. (<code>[#10847](pypa/pip#10847) <https://github.com/pypa/pip/issues/10847></code>_)</li> </ul> <h1>22.0.1 (2022-01-30)</h1> <h2>Bug Fixes</h2> <ul> <li>Accept lowercase <code><!doctype html></code> on index pages. (<code>[#10844](pypa/pip#10844) <https://github.com/pypa/pip/issues/10844></code>_)</li> <li>Properly handle links parsed by html5lib, when using <code>--use-deprecated=html5lib</code>. (<code>[#10846](pypa/pip#10846) <https://github.com/pypa/pip/issues/10846></code>_)</li> </ul> <h1>22.0 (2022-01-29)</h1> <h2>Process</h2> <ul> <li>Completely replace :pypi:<code>tox</code> in our development workflow, with :pypi:<code>nox</code>.</li> </ul> <p>Deprecations and Removals</p> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/pypa/pip/commit/44018de50cafba25445a225c1a1986d6312e1ef3"><code>44018de</code></a> Bump for release</li> <li><a href="https://github.com/pypa/pip/commit/65f096c432d60d5f0214793becd592e1c1c3b624"><code>65f096c</code></a> Update AUTHORS.txt</li> <li><a href="https://github.com/pypa/pip/commit/7d50964bcb1b25f9fe2c49fe447ab58aad2b4247"><code>7d50964</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/10876">#10876</a> from mbacchi/vcs_support_typo</li> <li><a href="https://github.com/pypa/pip/commit/ff8dbb458a59905c5462d339a63536257aad497a"><code>ff8dbb4</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/10867">#10867</a> from mauritsvanrees/maurits-topoligical-weights-req...</li> <li><a href="https://github.com/pypa/pip/commit/b3f5cad73241e25a25ce7d50eb9175dbafcfd8db"><code>b3f5cad</code></a> Update news/10851.bugfix.rst</li> <li><a href="https://github.com/pypa/pip/commit/cf4655f474cb8a04fa6b274ee0edaf774546a79b"><code>cf4655f</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/10869">#10869</a> from pradyunsg/put-html5lib-behind-flag</li> <li><a href="https://github.com/pypa/pip/commit/3608b42ef0ab39a2d50335356644f8f3464f651a"><code>3608b42</code></a> Fix minor typo in vcs support doc</li> <li><a href="https://github.com/pypa/pip/commit/6c92a33b6e22f099edac8f4df594ffe6a18eb6e2"><code>6c92a33</code></a> Place the link as "context" instead of "Link:"</li> <li><a href="https://github.com/pypa/pip/commit/7a3b0f1ae1cc59ae6566694e47887728a7976ab9"><code>7a3b0f1</code></a> 📰</li> <li><a href="https://github.com/pypa/pip/commit/d7fed8fe9382c4f4442d7aa6216f41c8ed6f1ea3"><code>d7fed8f</code></a> Use rich.traceback with debug mode (<a href="https://github-redirect.dependabot.com/pypa/pip/issues/10832">#10832</a>)</li> <li>Additional commits viewable in <a href="https://github.com/pypa/pip/compare/21.3.1...22.0.3">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=pip&package-manager=pip&previous-version=21.3.1&new-version=22.0.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details>
Closes #10845