chore(deps): update dependency lxml to v6 #13460

Open: renovate-bot wants to merge 1 commit into main

renovate-bot

This PR contains the following updates:

Package: lxml (source, changelog)
Change: ==5.2.1 -> ==6.0.0

Release Notes

lxml/lxml (lxml)

v6.0.0

Features added

  • GH#463: lxml.html.diff is faster and provides structurally better diffs.
    Original patch by Steven Fernandez.

  • GH#405: The factories Element and ElementTree can now be used in type hints.

  • GH#448: Parsing from memoryview and other buffers is supported to allow zero-copy parsing.

  • GH#437: lxml.html.builder was missing several HTML5 tag names.
    Patch by Nick Tarleton.

  • GH#458: CDATA can now be written into the incremental xmlfile() writer.
    Original patch by Lane Shaw.

  • A new parser option decompress=False was added that controls the automatic
    input decompression when using libxml2 2.15.0 or later. Disabling this option
    by default will effectively prevent decompression bombs when handling untrusted
    input. Code that depends on automatic decompression must enable this option.
    Note that libxml2 2.15.0 was not released yet, so this option currently has no
    effect but can already be used.

  • The set of compile time / runtime supported libxml2 feature names is available as
    etree.LIBXML_COMPILED_FEATURES and etree.LIBXML_FEATURES.
    This currently includes
    catalog, ftp, html, http, iconv, icu,
    lzma, regexp, schematron, xmlschema, xpath, zlib.
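
For reviewers checking how these additions affect calling code, here is a minimal sketch of probing the new feature sets and passing the new decompress option. The helper names libxml2_feature_available and make_parser are hypothetical. The getattr and TypeError fallbacks rest on the assumption that lxml releases before 6.0 lack etree.LIBXML_FEATURES and reject the unknown keyword.

```python
from lxml import etree


def libxml2_feature_available(name):
    """Return True/False for a libxml2 feature, or None if lxml cannot tell.

    etree.LIBXML_FEATURES is new in lxml 6.0; the None result covers older
    releases, where the attribute is assumed to be missing.
    """
    features = getattr(etree, "LIBXML_FEATURES", None)
    if features is None:
        return None
    return name in features


def make_parser(allow_decompression=False):
    """Build an XMLParser, passing decompress= where lxml accepts it.

    Per the release notes, the option has no effect before libxml2 2.15.0.
    The TypeError fallback assumes lxml < 6.0 rejects the unknown keyword.
    """
    try:
        return etree.XMLParser(decompress=allow_decompression)
    except TypeError:
        return etree.XMLParser()


if __name__ == "__main__":
    print("zlib available:", libxml2_feature_available("zlib"))
    root = etree.fromstring(b"<a><b/></a>", make_parser())
    print(root.tag)
```

Code that genuinely needs compressed input would call make_parser(allow_decompression=True) and should first confirm that "zlib" is reported as available.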

Bugs fixed

  • GH#353: Predicates in .find*() could mishandle tag indices if a default namespace is provided.
    Original patch by Luise K.

  • GH#272: The head and body properties of lxml.html elements failed if no such element
    was found. They now return None instead.
    Original patch by FVolral.

  • Tag names provided by code (API, not data) that are longer than INT_MAX
    could be truncated or mishandled in other ways.

  • .text_content() on lxml.html elements accidentally returned a "smart string"
    without additional information. It now returns a plain string.

  • LP#2109931: When building lxml with coverage reporting, it now disables the sys.monitoring
    support due to the lack of support in https://github.com/nedbat/coveragepy/issues/1790
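
Two of the fixes above change return values that calling code may rely on. The sketch below shows one way to write code that behaves the same before and after the upgrade. The helper names find_body and plain_text are hypothetical, and the IndexError fallback is an assumption about how the "failed" case surfaced in older releases.

```python
import lxml.html


def find_body(element):
    """Return the <body> element of the tree, or None if there is none."""
    try:
        # lxml >= 6.0 returns None when no <body> is found (GH#272).
        return element.body
    except IndexError:
        # Assumption: older releases raised IndexError in this case.
        return None


def plain_text(element):
    """Return the text content as an exact str, independent of lxml version."""
    # str() turns a "smart string" subclass into a plain str and leaves
    # a plain str unchanged.
    return str(element.text_content())


if __name__ == "__main__":
    doc = lxml.html.fromstring("<div>Hello <b>world</b></div>")
    print(plain_text(doc))
    print(find_body(lxml.html.Element("div")))
```

Callers that previously caught the failure from .head or .body should now also handle a None result.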

Other changes

  • Support for Python < 3.8 was removed.

  • Parsing directly from zlib (or lzma) compressed data is now considered an optional
    feature in lxml. It may get removed from libxml2 at some point for security reasons
    (compression bombs) and is therefore no longer guaranteed to be available in lxml.

    As of this release, zlib support is still normally available in the binary wheels
    but may get disabled or removed in later (x.y.0) releases. To test the availability,
    use "zlib" in etree.LIBXML_FEATURES.

  • The Schematron class is deprecated and will become non-functional in a future lxml version.
    The feature will soon be removed from libxml2 and stop being available.

  • GH#438: Wheels include the arm7l target.

  • GH#465: Windows wheels include the arm64 target.
    Patch by Finn Womack.

  • Binary wheels use the library versions libxml2 2.14.4 and libxslt 1.1.43.
    Note that this disables direct HTTP and FTP support for parsing from URLs.
    Use Python URL request tools instead (which usually also support HTTPS).
    To test the availability, use "http" in etree.LIBXML_FEATURES.

  • Windows binary wheels use the library versions libxml2 2.11.9, libxslt 1.1.39 and libiconv 1.17.
    They are now based on VS-2022.

  • Built using Cython 3.1.2.

  • The debug methods MemDebug.dump() and MemDebug.show() were removed completely.
    libxml2 2.13.0 discarded this feature.
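
Since the binary wheels no longer parse directly from HTTP or FTP URLs, code that passed a URL to the parser needs to fetch the bytes itself. A minimal sketch using the standard library follows. The function names parse_xml_from_url and libxml2_has_http are hypothetical, and the timeout default is an arbitrary choice.

```python
import urllib.request

from lxml import etree


def libxml2_has_http():
    """Report whether libxml2 itself can still fetch HTTP URLs.

    Falls back to an empty tuple on lxml releases that do not provide
    etree.LIBXML_FEATURES (assumed to be anything before 6.0).
    """
    return "http" in getattr(etree, "LIBXML_FEATURES", ())


def parse_xml_from_url(url, timeout=10.0):
    """Fetch a document with urllib and parse the bytes with lxml.

    urlopen() accepts any scheme it has a handler for, including file:,
    so validate the URL first when it comes from untrusted input.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()
    # Parsing from bytes lets the parser honour the document's own
    # encoding declaration.
    return etree.fromstring(data)
```

A call such as parse_xml_from_url("https://example.com/feed.xml") (a placeholder URL) returns the root element, which also covers HTTPS, something libxml2's built-in fetcher did not handle.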

v5.4.0

Bugs fixed

  • LP#2107279: Binary wheels use libxml2 2.13.8 and libxslt 1.1.43 to resolve several CVEs.
    (Binary wheels for Windows continue to use a patched libxml2 2.11.9 and libxslt 1.1.39.)
    Issue found by Anatoly Katyushin.

v5.3.2

This release resolves CVE-2025-24928 as described in
https://gitlab.gnome.org/GNOME/libxml2/-/issues/847

Bugs fixed

  • Binary wheels use libxml2 2.12.10 and libxslt 1.1.42.

  • Binary wheels for Windows use a patched libxml2 2.11.9 and libxslt 1.1.39.

v5.3.1

Bugs fixed

  • GH#440: Some tests were adapted for libxml2 2.14.0.
    Patch by Nick Wellnhofer.

  • LP#2097175: DTD(external_id="…") erroneously required a byte string as ID value.

  • GH#450: iterparse() internally triggered the DeprecationWarning added in lxml 5.3.0 when parsing HTML.

Other changes

  • GH#442: Binary wheels for macOS no longer use the linker flag -flat_namespace.

v5.3.0

Features added

  • GH#421: Nested CDATA sections are no longer rejected but split on output
    to represent ]]> correctly.
    Patch by Gertjan Klein.

Bugs fixed

  • LP#2060160: Attribute values serialised differently in xmlfile.element() and xmlfile.write().

  • LP#2058177: The ISO-Schematron implementation could fail on unknown prefixes.
    Patch by David Lakin.

Other changes

  • LP#2067707: The strip_cdata option in HTMLParser() turned out to be useless and is now deprecated.

  • Binary wheels use the library versions libxml2 2.12.9 and libxslt 1.1.42.

  • Windows binary wheels use the library versions libxml2 2.11.8 and libxslt 1.1.39.

  • Built with Cython 3.0.11.

v5.2.2

Bugs fixed

  • GH#417: The test_feed_parser test could fail if lxml_html_clean was not installed.
    It is now skipped in that case.

  • LP#2059910: The minimum CPU architecture for the Linux x86 binary wheels was set back to
    "core2", without SSE 4.2.

  • If libxml2 uses iconv, the compile time version is available as etree.ICONV_COMPILED_VERSION.
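
When verifying an upgrade like this one, it can help to log which library versions are actually in use. A small sketch follows. The function name lxml_build_info is hypothetical, and the getattr fallback assumes that etree.ICONV_COMPILED_VERSION may be missing when iconv is not used or lxml predates 5.2.2.

```python
from lxml import etree


def lxml_build_info():
    """Collect the version tuples that lxml exposes about itself."""
    return {
        "lxml": etree.LXML_VERSION,
        "libxml2 (runtime)": etree.LIBXML_VERSION,
        "libxml2 (compiled)": etree.LIBXML_COMPILED_VERSION,
        "libxslt (runtime)": etree.LIBXSLT_VERSION,
        # Added in lxml 5.2.2; None if the attribute is not available.
        "iconv (compiled)": getattr(etree, "ICONV_COMPILED_VERSION", None),
    }


if __name__ == "__main__":
    for name, version in lxml_build_info().items():
        print(f"{name}: {version}")
```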


Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Never, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

renovate-bot requested review from a team as code owners June 26, 2025 21:38
product-auto-label bot added the samples label Jun 26, 2025
trusted-contributions-gcf bot added the kokoro:force-run and owlbot:run labels Jun 26, 2025
product-auto-label bot added the api: cloudmedia label Jun 26, 2025
@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @renovate-bot, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request updates the lxml Python dependency from version 5.2.1 to 6.0.0. This is a major version upgrade for lxml, introducing several new features, bug fixes, and important breaking changes related to Python version compatibility, security defaults, and network parsing capabilities.

Highlights

  • Dependency Update: This pull request upgrades the lxml library from version 5.2.1 to 6.0.0.
  • Python Version Compatibility: lxml v6.0.0 removes support for Python versions older than 3.8.
  • Security Default Change: A new parser option, decompress, is introduced with a default of False to prevent decompression bombs. Per the release notes, it only takes effect with libxml2 2.15.0 or later, which was not yet released, so it currently has no effect. Code relying on automatic decompression will need to enable it explicitly.
  • Network Parsing Changes: Direct HTTP and FTP support for parsing from URLs has been disabled in the binary wheels, requiring the use of Python's standard URL request tools for such operations.
  • Deprecations: The Schematron class is deprecated and will become non-functional in a future version of lxml.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

kokoro-team removed the kokoro:force-run label Jun 26, 2025
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request updates the lxml dependency from version 5.2.1 to 6.0.0 in the requirements.txt file. No other changes were made.

Labels
api: cloudmedia (Issues related to the Media Livestream API.)
owlbot:run (Add this label to trigger the Owlbot post processor.)
samples (Issues that are directly related to samples.)
3 participants