CVE-2020-29128: XXE in petl < 1.68
Summary
petl is a Python library that provides functions for extraction, transformation, and loading (ETL) of data.
petl before 1.68, in some configurations, allows resolution of entities in XML input.
An attacker who can submit XML input to an application using petl can read arbitrary files on the file system with the privileges of the user running the application.
Affected Applications
Applications that:
- accept user-supplied XML input that is processed using petl < 1.68
- configure lxml as the underlying XML processing library used by petl
Impact
Information Disclosure
Mitigation
- Update to petl >= 1.68
Analysis
The fromxml function in the petl.io.xml module converts an XML document to a tabular structure using an XML parsing library. petl supports using Python's built-in xml library or lxml for parsing XML. lxml is the recommended option.
In petl < 1.68, the fromxml function creates an lxml parser with default settings. By default, lxml is configured to resolve local entities.
Example application that would be vulnerable using petl < 1.68 with lxml:
from petl.io.xml import fromxml
petl_table = fromxml('input.xml', 'tr', 'td')
To disclose the contents of /etc/passwd on the application's host, an attacker could supply a crafted XML file like this:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT table ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd" >]>
<table>
<tr>
<td>a</td><td>&xxe;</td>
</tr>
</table>
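The payload above depends on the external entity declared in its DTD. As a minimal sketch of how an application might screen for such input before handing it to fromxml, the following uses the standard library's expat bindings to reject any document that declares an external entity. The names ExternalEntityForbidden and reject_external_entities are hypothetical and are not part of petl or lxml; this is an illustration under those assumptions, not the fix shipped in petl 1.68.

```python
import xml.parsers.expat


class ExternalEntityForbidden(ValueError):
    """Raised when XML input declares an external entity (hypothetical name)."""


def reject_external_entities(xml_bytes):
    """Parse xml_bytes with expat and raise if an external entity is declared.

    Hypothetical helper for illustration; not part of petl or lxml.
    Returns None when the document declares no external entities.
    """
    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation_name):
        # Internal entities carry a literal value; external entities
        # carry a system identifier (a URI) instead.
        if system_id is not None:
            raise ExternalEntityForbidden(
                "external entity %r points at %r" % (name, system_id)
            )

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    # Second argument True marks this as the final chunk of input.
    parser.Parse(xml_bytes, True)
```

An application could call reject_external_entities on the uploaded bytes before writing them to the file that is passed to fromxml. expat only reports the declaration to the handler; it does not fetch the referenced file itself. The check is deliberately strict, so documents that legitimately rely on external entities would also be refused.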
References
- petl-developers/petl#527
- https://petl.readthedocs.io/en/stable/changes.html
- https://owasp.org/www-community/vulnerabilities/XML_External_Entity_(XXE)_Processing
Disclosure Timeline
- Oct. 2, 2020: Notified vendor
- Oct. 5, 2020: petl 1.68 released with mitigation
- Nov. 27, 2020: Public disclosure