chore(deps): update dependency lxml to v6 #13460
Summary of Changes
Hello @renovate-bot, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request updates the `lxml` Python dependency from version 5.2.1 to 6.0.0. This is a major version upgrade for `lxml`, introducing several new features, bug fixes, and important breaking changes related to Python version compatibility, security defaults, and network parsing capabilities.
Highlights
- Dependency Update: This pull request upgrades the `lxml` library from version 5.2.1 to 6.0.0.
- Python Version Compatibility: `lxml` v6.0.0 removes support for Python versions older than 3.8.
- Security Default Change: A new `decompress=False` parser option is introduced and is now the default, preventing decompression bombs. Code relying on automatic decompression will need to explicitly enable it.
- Network Parsing Changes: Direct HTTP and FTP support for parsing from URLs has been disabled in the binary wheels, requiring the use of Python's standard URL request tools for such operations.
- Deprecations: The `Schematron` class is deprecated and will be removed in future versions of `lxml`.
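For code that relies on automatic decompression, the explicit opt-in mentioned under Security Default Change might look roughly like the sketch below. It assumes that `etree.XMLParser` only accepts the `decompress` keyword from lxml 6.0 onward, so it checks `etree.LXML_VERSION` before passing it. The names `make_parser` and `allow_decompression` are hypothetical and used for illustration only.

```python
from lxml import etree


def make_parser(allow_decompression=False):
    """Build an XMLParser, passing `decompress` only where lxml knows it.

    Hypothetical helper: the version check is an assumption based on the
    release notes, which introduce the option in lxml 6.0.
    """
    kwargs = {}
    if etree.LXML_VERSION >= (6, 0):
        # Off by default in lxml 6; enable only for trusted input.
        kwargs["decompress"] = allow_decompression
    return etree.XMLParser(**kwargs)


# Parse a small in-memory document with the default (safe) settings.
root = etree.fromstring(b"<a><b/></a>", make_parser())
print(root.tag)
```

Keeping the default at `False` and enabling decompression only at call sites that handle trusted input keeps the opt-in easy to spot in review.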
Code Review
The pull request updates the lxml dependency from version 5.2.1 to 6.0.0 in the requirements.txt file. No other changes were made.
This PR contains the following updates:
lxml ==5.2.1 -> ==6.0.0
Release Notes
lxml/lxml (lxml)
v6.0.0
Features added
- GH#463: `lxml.html.diff` is faster and provides structurally better diffs. Original patch by Steven Fernandez.
- GH#405: The factories `Element` and `ElementTree` can now be used in type hints.
- GH#448: Parsing from `memoryview` and other buffers is supported to allow zero-copy parsing.
- GH#437: `lxml.html.builder` was missing several HTML5 tag names. Patch by Nick Tarleton.
- GH#458: `CDATA` can now be written into the incremental `xmlfile()` writer. Original patch by Lane Shaw.
- A new parser option `decompress=False` was added that controls the automatic input decompression when using libxml2 2.15.0 or later. Disabling this option by default will effectively prevent decompression bombs when handling untrusted input. Code that depends on automatic decompression must enable this option. Note that libxml2 2.15.0 was not released yet, so this option currently has no effect but can already be used.
- The set of compile time / runtime supported libxml2 feature names is available as `etree.LIBXML_COMPILED_FEATURES` and `etree.LIBXML_FEATURES`. This currently includes `catalog`, `ftp`, `html`, `http`, `iconv`, `icu`, `lzma`, `regexp`, `schematron`, `xmlschema`, `xpath`, `zlib`.
Bugs fixed
- GH#353: Predicates in `.find*()` could mishandle tag indices if a default namespace is provided. Original patch by Luise K.
- GH#272: The `head` and `body` properties of `lxml.html` elements failed if no such element was found. They now return `None` instead. Original patch by FVolral.
- Tag names provided by code (API, not data) that are longer than `INT_MAX` could be truncated or mishandled in other ways.
- `.text_content()` on `lxml.html` elements accidentally returned a "smart string" without additional information. It now returns a plain string.
- LP#2109931: When building lxml with coverage reporting, it now disables the `sys.monitoring` support due to the lack of support in https://github.com/nedbat/coveragepy/issues/179090
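Code that has to run against both lxml 5.x and 6.x may want to paper over the GH#272 and `.text_content()` changes. The sketch below is one way to do that. It assumes that the pre-6.0 failure for a missing `body` surfaces as an `IndexError`; `find_body` and `plain_text` are hypothetical helper names, not part of lxml.

```python
from lxml import html


def find_body(element):
    """Return the <body> of the element's tree, or None if there is none.

    lxml 6.0 returns None by itself; the except branch covers older
    versions, assuming they raise IndexError for a missing <body>.
    """
    try:
        return element.body
    except IndexError:
        return None


def plain_text(element):
    """Return the text content as a plain str on any lxml version."""
    return str(element.text_content())


doc = html.document_fromstring("<html><body><p>hi</p></body></html>")
print(find_body(doc).tag, plain_text(doc))

# A standalone element has no surrounding <body>.
print(find_body(html.Element("div")))
```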
Other changes
- Support for Python < 3.8 was removed.
- Parsing directly from zlib (or lzma) compressed data is now considered an optional feature in lxml. It may get removed from libxml2 at some point for security reasons (compression bombs) and is therefore no longer guaranteed to be available in lxml. As of this release, zlib support is still normally available in the binary wheels but may get disabled or removed in later (x.y.0) releases. To test the availability, use `"zlib" in etree.LIBXML_FEATURES`.
- The `Schematron` class is deprecated and will become non-functional in a future lxml version. The feature will soon be removed from libxml2 and stop being available.
- GH#438: Wheels include the `arm7l` target.
- GH#465: Windows wheels include the `arm64` target. Patch by Finn Womack.
- Binary wheels use the library versions libxml2 2.14.4 and libxslt 1.1.43. Note that this disables direct HTTP and FTP support for parsing from URLs. Use Python URL request tools instead (which usually also support HTTPS). To test the availability, use `"http" in etree.LIBXML_FEATURES`.
- Windows binary wheels use the library versions libxml2 2.11.9, libxslt 1.1.39 and libiconv 1.17. They are now based on VS-2022.
- Built using Cython 3.1.2.
- The debug methods `MemDebug.dump()` and `MemDebug.show()` were removed completely. libxml2 2.13.0 discarded this feature.
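The feature checks and the "use Python URL request tools instead" advice above could be combined along the following lines. This is a minimal sketch, not lxml's recommended recipe: `libxml_features` and `parse_xml_from_url` are hypothetical helpers, and the `getattr` fallback assumes that `etree.LIBXML_FEATURES` simply does not exist before lxml 6.0.

```python
import urllib.request

from lxml import etree


def libxml_features():
    """Return the runtime libxml2 feature names, or an empty set on lxml < 6.0."""
    return set(getattr(etree, "LIBXML_FEATURES", ()))


def parse_xml_from_url(url, timeout=10.0):
    """Fetch the document with urllib and hand the bytes to lxml.

    This avoids libxml2's own HTTP/FTP support. Callers should validate
    the URL scheme themselves when the URL comes from untrusted input.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()
    return etree.fromstring(data)


features = libxml_features()
print("http" in features, "zlib" in features)

# A data: URL keeps the example runnable without network access.
root = parse_xml_from_url("data:text/xml,%3Croot%3E%3Cchild%2F%3E%3C%2Froot%3E")
print(root.tag)
```

Fetching the bytes first also makes it straightforward to add size limits or content-type checks before the parser ever sees the input.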
v5.4.0
Bugs fixed
(Binary wheels for Windows continue to use a patched libxml2 2.11.9 and libxslt 1.1.39.)
Issue found by Anatoly Katyushin.
v5.3.2
This release resolves CVE-2025-24928 as described in
https://gitlab.gnome.org/GNOME/libxml2/-/issues/847
Bugs fixed
Binary wheels use libxml2 2.12.10 and libxslt 1.1.42.
Binary wheels for Windows use a patched libxml2 2.11.9 and libxslt 1.1.39.
v5.3.1
Bugs fixed
- GH#440: Some tests were adapted for libxml2 2.14.0. Patch by Nick Wellnhofer.
- LP#2097175: `DTD(external_id="…")` erroneously required a byte string as ID value.
- GH#450: `iterparse()` internally triggered the `DeprecationWarning` added in lxml 5.3.0 when parsing HTML.
Other changes
- `-flat_namespace`.
v5.3.0
Features added
- `CDATA` sections are no longer rejected but split on output to represent `]]>` correctly. Patch by Gertjan Klein.
Bugs fixed
- LP#2060160: Attribute values serialised differently in `xmlfile.element()` and `xmlfile.write()`.
- LP#2058177: The ISO-Schematron implementation could fail on unknown prefixes. Patch by David Lakin.
Other changes
- LP#2067707: The `strip_cdata` option in `HTMLParser()` turned out to be useless and is now deprecated.
- Binary wheels use the library versions libxml2 2.12.9 and libxslt 1.1.42.
- Windows binary wheels use the library versions libxml2 2.11.8 and libxslt 1.1.39.
- Built with Cython 3.0.11.
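To illustrate what "split on output to represent `]]>` correctly" in the v5.3.0 feature note can mean, the sketch below shows the general XML technique of breaking `]]>` across two adjacent CDATA sections. It is a standard-library illustration with a hypothetical `wrap_cdata` helper, and it is not taken from lxml's implementation, which may differ in detail.

```python
import xml.etree.ElementTree as ET

CDATA_START = "<![CDATA["
CDATA_END = "]]>"


def wrap_cdata(text):
    """Wrap text in CDATA, splitting any "]]>" across two sections.

    Each "]]>" becomes "]]" at the end of one section and ">" at the
    start of the next, so the terminator never appears inside a section.
    """
    escaped = text.replace(CDATA_END, "]]" + CDATA_END + CDATA_START + ">")
    return CDATA_START + escaped + CDATA_END


wrapped = wrap_cdata("a]]>b")
print(wrapped)

# A parser joins the adjacent sections back into the original text.
element = ET.fromstring("<r>" + wrapped + "</r>")
print(element.text)
```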
v5.2.2
Bugs fixed
- GH#417: The `test_feed_parser` test could fail if `lxml_html_clean` was not installed. It is now skipped in that case.
- LP#2059910: The minimum CPU architecture for the Linux x86 binary wheels was set back to "core2", without SSE 4.2.
- If libxml2 uses iconv, the compile time version is available as `etree.ICONV_COMPILED_VERSION`.
Configuration
📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Never, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR was generated by Mend Renovate. View the repository job log.