Disable entities while parsing XML #3

Merged
WillemJan merged 1 commit into KBNLresearch:master from fix-xxe on Aug 25, 2021

Conversation

jorgectf
Contributor

If an attacker supplies a URL that returns malicious XML content, the parser may resolve the entities declared in it (an XXE attack). This can leak internal information, such as local files, and/or cause a denial of service.

Entrypoint:

multiNER/ner.py

Line 657 in c044094

url = request.args.get('url')

Getting into ocr_to_dict:

multiNER/ner.py

Line 676 in c044094

parsed_text = ocr_to_dict(url)

User-controlled URL request:

multiNER/ner.py

Line 770 in c044094

req = requests.get(url, timeout=TIMEOUT)

Vulnerable parser declaration:

multiNER/ner.py

Lines 782 to 784 in c044094

parser = lxml.etree.XMLParser(ns_clean=False,
                              recover=True,
                              encoding='utf-8')

Sink:

multiNER/ner.py

Line 786 in c044094

xml = lxml.etree.fromstring(text.encode(), parser=parser)
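For illustration, here is a minimal sketch of what disabling entity resolution on this parser could look like. It assumes the fix amounts to passing `resolve_entities=False` to `lxml.etree.XMLParser`; the exact diff of the merged commit is not reproduced here, and the helper name `parse_untrusted_xml` and the sample document are hypothetical.

```python
import lxml.etree


def parse_untrusted_xml(text):
    """Parse XML from an untrusted source without expanding entities.

    Hypothetical helper: mirrors the parser options used in ner.py and
    adds resolve_entities=False (assumed to be the substance of the fix).
    """
    parser = lxml.etree.XMLParser(ns_clean=False,
                                  recover=True,
                                  encoding='utf-8',
                                  resolve_entities=False)
    return lxml.etree.fromstring(text.encode(), parser=parser)


# Harmless sample: an internal entity stands in for attacker-controlled content.
SAMPLE = '<!DOCTYPE r [<!ENTITY x "expanded-value">]><r>&x;</r>'

root = parse_untrusted_xml(SAMPLE)

# With entity resolution disabled, the entity's replacement text is not
# substituted into the element's text.
print(root.tag, repr(root.text))
```

With this option the entity reference is kept as an unexpanded node instead of being replaced by its value, so a `SYSTEM` entity pointing at a local file would not be read into the parsed text.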

More information:

@WillemJan WillemJan merged commit 7dd92e2 into KBNLresearch:master Aug 25, 2021
@jorgectf jorgectf deleted the fix-xxe branch August 26, 2021 21:09