-
-
Notifications
You must be signed in to change notification settings - Fork 30.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Element.findall(path, dict) doesn't insert null namespace #74670
Comments
The findall method for ElementTree.Element handles namespace prefixes by tokenising the path and inserting the full namespace in braces based on entries in a dictionary. Unfortunately, this does not work for a namespace without a prefix, so if you have files containing namespaces with and without prefixes, you still need to manually add the namespace url for the unprefixed path. The function xpath_tokenizer checks to see if tokens contain a colon and only adds in the namespace url in that instance. This could be changed to add the url if their is a colon, or if there is not, and the empty string key is present in the namespaces dictionary. |
I agree that this is a recurring irritant. |
Agreed that this should be added. I think the key should be None, though, not the empty string. I attached a quick patch for lxml's corresponding file. It's mostly the same for ET. |
Patch replaced by pull request. |
I've merged the PR. It matches the implementation that has been released in lxml almost two years ago. |
I am not sure but this commit seems to have broken Azure CI on master for Windows. The build checks were green when the PR while merging. I can see the ValueError added in e9927e1 raised in the below CI log and it's occurring consistently |
Interesting. Thanks for investigating this. It looks like the script "appxmanifest.py" uses an empty string as prefix for a lookup: File "D:\a\1\s\PC\layout\support\appxmanifest.py", line 407, in get_appxmanifest I don't know where that script comes from, but it suggests that strictly rejecting this kind of invalid library usage might not be the right thing to do for now. It seems to be a case where users pass an arbitrary prefix-namespace dict into .find() that they happen to have lying around, expecting unused mappings to be ignored. I pushed a PR that removes the exception for now. |
The relevant line is at cpython/PC/layout/support/appxmanifest.py Line 407 in cd46655
|
The script seems to generally assume that "" is a good representation for "no prefix", i.e. the default namespace, although that is IMHO more correctly represented as None. It's not very likely that this is the only script out there that makes that assumption. In fact, this might not be an entirely stupid assumption, given that None doesn't sort together with strings, for example. And sorting prefixes is not an unusual thing to do. That makes it a case of "practicality beats purity", I guess … I'll change the implementation to allow empty strings. |
I don't recall where I got the empty string from, but it's certainly what I've used there for a while. Maybe it's required in register_namespace() to set the default namespace? |
I submitted a PR that changes the API back to an empty string. While lxml uses None here, an all-strings mapping is simply more convenient. I will start supporting both in lxml from the next release. Comments welcome. |
New changeset e8113f5 by Stefan Behnel in branch 'master': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: