
Support defaultdict for namespace mapping #211

Merged
merged 1 commit into from Mar 28, 2019

Conversation

nathanalderson
Contributor

In my use case, I want to ignore all namespaces. I hoped I could do this by using a defaultdict which returned None for all keys, but two minor issues in the code (see comments inline) prevented this from working as you would expect.

With this change, the following works:

>>> import collections
>>> import xmltodict
>>> xml = """
... <root xmlns="http://defaultns.com/"
...       xmlns:a="http://a.com/"
...       xmlns:b="http://b.com/">
...   <x>1</x>
...   <a:y>2</a:y>
...   <b:z>3</b:z>
... </root>
... """
>>> namespaces = collections.defaultdict(lambda: None)
>>> xmltodict.parse(xml, process_namespaces=True, namespaces=namespaces) == {
...     'root': {
...         'x': '1',
...         'y': '2',
...         'z': '3',
...     },
... }
True

@@ -70,13 +70,16 @@ def __init__(self,
         self.force_list = force_list

     def _build_name(self, full_name):
-        if not self.namespaces:
+        if self.namespaces is None:
Contributor Author


Although a defaultdict essentially has infinite items, until an item is actually inserted (either explicitly or on first access), it has zero length and evaluates to False.
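
To illustrate the point above, here is a minimal sketch using only the standard library. The variable names are illustrative and are not taken from xmltodict itself.

```python
import collections

# A defaultdict with a factory still starts out empty.
namespaces = collections.defaultdict(lambda: None)

# The old check (`not self.namespaces`) treats this mapping as absent,
# because an empty mapping has length 0 and is therefore falsy.
empty_is_falsy = not namespaces

# The new check (`self.namespaces is None`) only fires when no mapping was given.
is_none = namespaces is None

# Reading a missing key through [] calls the factory and stores the result,
# so the mapping is no longer empty afterwards.
_ = namespaces["http://a.com/"]
nonempty_after_access = bool(namespaces)
```

With the truthiness check, the early return would be taken before any key is ever looked up, so the default factory would never get a chance to run.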

+        try:
+            short_namespace = self.namespaces[namespace]
+        except KeyError:
+            short_namespace = namespace
Contributor Author


From the documentation on defaultdict:

Note that __missing__() is not called for any operations besides __getitem__(). This means that get() will, like normal dictionaries, return None as a default rather than using default_factory.

While a bit more verbose, this change uses __getitem__() instead of get(), so the defaultdict behaves as expected.
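
A small standalone sketch of the difference, assuming a fallback of "keep the namespace URI" as in the diff above. The names `uri`, `via_get`, and `via_getitem` are illustrative only.

```python
import collections

namespaces = collections.defaultdict(lambda: None)
uri = "http://a.com/"

# get() bypasses __missing__, so the explicit fallback wins and the URI is kept.
via_get = namespaces.get(uri, uri)

# Subscription goes through __getitem__, which consults default_factory.
try:
    via_getitem = namespaces[uri]
except KeyError:
    via_getitem = uri

# A plain dict has no default_factory, so the KeyError branch still
# provides the same fallback that get() used to provide.
plain = {}
try:
    plain_result = plain[uri]
except KeyError:
    plain_result = uri
```

Here `via_get` is still the full URI, while `via_getitem` is `None`, which is what lets the namespace be dropped. For ordinary dicts the behaviour should be unchanged, since the `except KeyError` branch mirrors the old `get()` default.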

@martinblech
Owner

Great contribution, thanks a lot!

martinblech merged commit 6a0bb9c into martinblech:master on Mar 28, 2019