dzip (zip but for dictionaries) #664

Closed
khoda81 opened this issue Dec 27, 2022 · 2 comments
Labels
deferred: We're not inclined to fix this issue, but please link to it if the same thing comes up later

Comments

khoda81 commented Dec 27, 2022

Description

Zips multiple mappings into an iterable of (k, (v1, v2, ..., vn)) pairs, one for each key that is present in every mapping.

References

My use case:
DictEncoder is a torch module that stores a dict of modules and encodes a data sample of type dict[str, Tensor] by applying each module to the corresponding value in the given dict. Basically, every method was a dzip over (self, sample), so I wanted to abstract this away.

Examples

Here is an example implementation:

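import functools

def dzip(*mappings):
    # Keep only the keys shared by every mapping (set intersection of key views).
    keys = functools.reduce(
        lambda a, b: a & b,
        (mapping.keys() for mapping in mappings),
    )

    # Lazily yield each shared key with its values, in the order the mappings were given.
    for k in keys:
        yield k, tuple(mapping[k] for mapping in mappings)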
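
To make the use case above concrete, here is a minimal sketch that does not depend on torch. The `DictEncoder` class below is a hypothetical plain-Python stand-in (ordinary callables instead of torch modules, and an assumed `encode` method name), not the actual module described in this issue. `dzip` is repeated so the block runs on its own.

```python
import functools


def dzip(*mappings):
    # Same generator as in the example above, repeated so this block is self-contained.
    keys = functools.reduce(lambda a, b: a & b, (m.keys() for m in mappings))
    for k in keys:
        yield k, tuple(m[k] for m in mappings)


class DictEncoder:
    """Hypothetical stand-in: stores one callable encoder per key."""

    def __init__(self, encoders):
        self.encoders = dict(encoders)

    def encode(self, sample):
        # Pair each encoder with the sample value stored under the same key.
        return {
            key: encoder(value)
            for key, (encoder, value) in dzip(self.encoders, sample)
        }


encoder = DictEncoder({"text": str.upper, "count": lambda n: n * 2})
encoded = encoder.encode({"text": "abc", "count": 21, "extra": None})

# "extra" has no matching encoder, so it is dropped by the key intersection.
assert encoded == {"text": "ABC", "count": 42}
```

Because the shared keys come from a set intersection, the iteration order of `dzip` is not guaranteed, which is why the check compares dicts rather than lists.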

bbayles commented Dec 28, 2022

Thanks for the suggestion. I'm probably not going to add this one, since it's pretty close to the existing map_reduce function.

from collections import ChainMap
from more_itertools import map_reduce

dzip_results = map_reduce(ChainMap(*mappings).items(), keyfunc=lambda x: x[0])

bbayles added the deferred label Dec 28, 2022

khoda81 commented Dec 29, 2022

@bbayles Thanks for the reply. Your implementation does not produce the expected result: dzip is supposed to return all values for a key in a tuple, but ChainMap only returns the value from the first mapping that contains the key:

from collections import ChainMap
from more_itertools import map_reduce

a = { "a": 1, "b": 3 }
b = { "a": 2, "b": 2 }

map_reduce(ChainMap(a, b).items(), keyfunc=lambda x: x[0])
# defaultdict(None, {'a': [('a', 1)], 'b': [('b', 3)]})

dict(ChainMap(a, b).items())
# {'a': 1, 'b': 3}

dict(dzip(a, b))
# {'a': (1, 2), 'b': (3, 2)}

It would be possible to implement dzip with map_reduce, but it would probably look awkward and would allocate the whole mapping up front, which is exactly what I wanted to avoid by writing dzip as a generator.
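
For comparison, here is a rough sketch of what such an eager, grouping-based variant might look like. The name `dzip_eager` is hypothetical, and the grouping is written with `collections.defaultdict` so the block runs without extra dependencies; the comment notes how the same step could be expressed with `map_reduce`.

```python
from collections import defaultdict
from itertools import chain


def dzip_eager(*mappings):
    # Flatten every mapping into one stream of (key, value) pairs.
    pairs = chain.from_iterable(m.items() for m in mappings)

    # Group values by key. With more_itertools this step could be written as
    # map_reduce(pairs, keyfunc=lambda kv: kv[0], valuefunc=lambda kv: kv[1]).
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)

    # A key is in every mapping only if it collected one value per mapping.
    return {k: tuple(v) for k, v in grouped.items() if len(v) == len(mappings)}


a = {"a": 1, "b": 3}
b = {"a": 2, "b": 2, "c": 0}
result = dzip_eager(a, b)

# "c" appears in only one mapping, so it is filtered out.
assert result == {"a": (1, 2), "b": (3, 2)}
```

This version has to hold every value from every mapping in memory before it can return anything, whereas the generator only builds one tuple at a time.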

bbayles closed this as completed Aug 31, 2023