Improve XML parsing with defusedxml #4367
base: beta
51631dc to c77a361
c77a361 to d837aa9
from defusedxml.ElementTree import fromstring, tostring
from defusedxml.lxml import tostring
Hi @JacquelineMorrissette, could you double-check this PR with a few things in mind?
- fromstring and tostring are awfully generic-sounding names. I think the usual convention of … import ElementTree as ET, and then using things like ET.fromstring, is good to avoid ambiguity.
- Watch out for shadowing one import with another, as in this case where the second tostring import completely replaces the first.
- Why are we still using xml.etree.ElementTree and lxml versus defusedxml.ElementTree and defusedxml.lxml? Are there certain things we need that the defusedxml alternatives don't provide?
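The first two points of this review can be illustrated with a small sketch. To keep it runnable without third-party packages, it assumes two standard-library modules that both export a function named parse (xml.etree.ElementTree and xml.dom.minidom) as stand-ins for defusedxml.ElementTree and defusedxml.lxml; the rebinding behaviour is the same for any pair of modules.

```python
# Two "from ... import name" statements for the same name: the second
# silently rebinds it, so the first import is no longer reachable.
from xml.etree.ElementTree import parse
from xml.dom.minidom import parse  # replaces the binding above

import xml.dom.minidom
import xml.etree.ElementTree as ET

# The bare name now refers to the minidom function only.
shadowed = parse is xml.dom.minidom.parse
still_etree = parse is ET.parse

# Importing the module under an alias keeps every call unambiguous.
root = ET.fromstring("<form><field>value</field></form>")
serialized = ET.tostring(root, encoding="unicode")

print(shadowed, still_etree)  # True False
print(root.tag, serialized)
```

With the alias form, a reader can tell at each call site which library handles the XML, which matters when a safe parser and an unsafe one expose identical function names.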
Edit: whoops, I just saw this in defusedxml's readme:
defusedxml.lxml
DEPRECATED The module is deprecated and will be removed in a future release.
Given that, please don't use defusedxml.lxml anywhere. Try to see how our uses of lxml could be replaced with ElementTree, and don't hesitate to ask questions :)
- Will do, I'll give it another pass.
- The reason I kept some uses of xml.etree.ElementTree (and others) is that the defusedxml library isn't a complete replacement for these modules; it only replaces specific functions such as tostring and fromstring. For example, defusedxml doesn't have an equivalent to from xml.etree.ElementTree import ElementTree.
The documentation here lists what it replaces, in case it helps: https://pypi.org/project/defusedxml/#defusedxml-elementtree
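One way to read the split described in this reply is sketched below: parsing of untrusted input goes through defusedxml.ElementTree, while tree construction and serialization stay on the standard library. The helper names parse_untrusted and build_response are hypothetical and not taken from this PR, and the ImportError fallback exists only so the sketch runs where defusedxml is not installed.

```python
import xml.etree.ElementTree as ET  # building and serializing trees

try:
    # Hardened parsing entry points (fromstring, parse, iterparse).
    import defusedxml.ElementTree as SafeET
except ImportError:
    # Assumption for illustration only: fall back to the stdlib so the
    # sketch still runs. This fallback is NOT hardened.
    SafeET = ET


def parse_untrusted(xml_text):
    """Parse XML from an untrusted source (hypothetical helper)."""
    return SafeET.fromstring(xml_text)


def build_response(status):
    """Build XML we generate ourselves with the stdlib (hypothetical helper)."""
    root = ET.Element("response")
    ET.SubElement(root, "status").text = status
    return ET.tostring(root, encoding="unicode")


submission = parse_untrusted("<submission><id>1</id></submission>")
print(submission.find("id").text)
print(build_response("ok"))
```

Keeping the two modules under distinct aliases also avoids the shadowing problem raised earlier in the review.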
d0b0d13 to fd5246d
Checklist
Description
Switch to defusedxml for XML parsing
Other notes
The defusedxml library doesn't replace everything in the native XML library, so imports have only been changed where necessary
Related issues
kobotoolbox/kobocat#869