Making conditional_data_trans better, and generally; recursively applying wrappers #10

Open
thorwhalen opened this issue Jun 11, 2022 · 2 comments
Labels
enhancement New feature or request

Comments

@thorwhalen
Member

thorwhalen commented Jun 11, 2022

This theme comes up regularly.

When we wrap a store, only the first level is wrapped. If the store is nested, one often expects the wrapper to apply to the nested levels as well. It doesn't, and it shouldn't by default, since that would be undesirable in the general case.
Therefore we need to make it easy for the user to control this aspect when they need it.

Too abstract? Here's an example with add_path_access (code for that function here).

add_path_access lets you access nested values through a path (a tuple of keys):

>>> s = add_path_access({'a': {'b': {'c': 42}}})
>>> s['a', 'b', 'c']
42

But the wrapping doesn't carry on to values, so if you ask for s['a'], and then want to get ['b', 'c'] from it, it won't work:

>>> s['a']['b', 'c']
Traceback (most recent call last):
  ...
KeyError: ('b', 'c')

That's because add_path_access applies to the "top level" only.
Once you ask for the 'a' key, it gives you that value, but that value is a "pure dict", not one wrapped with add_path_access like s is. Of course we could do this:

>>> add_path_access(s['a'])['b', 'c']
42

The reason why we don't do this automatically is that it may not always be desirable.
If one wanted to though, one could use wrap_kvs(obj_of_data=...) to wrap
specific values with add_path_access.

For example, if you wanted to wrap all mappings recursively, you could:

>>> from typing import Mapping
>>> from dol.trans import wrap_kvs
>>> def add_path_access_if_mapping(v):
...     if isinstance(v, Mapping):
...         return wrap_kvs(
...             add_path_access(v),
...             obj_of_data=add_path_access_if_mapping
...         )
...     return v
>>> s = add_path_access_if_mapping({'a': {'b': {'c': 42}}})
>>> s['a', 'b', 'c']
42
>>> # But now this works:
>>> s['a']['b', 'c']
42
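
For readers without dol at hand, here is a minimal pure-Python sketch of the same idea. `PathAccess` and `recursive_path_access` are hypothetical stand-ins written for illustration, not dol's implementation of add_path_access or wrap_kvs. The sketch only aims to show why top-level wrapping stops at the first level, and how a value transformation that calls itself makes the wrapping recursive.

```python
from collections.abc import Mapping


class PathAccess(Mapping):
    """Hypothetical stand-in for add_path_access: tuple keys are treated as paths."""

    def __init__(self, store, value_trans=lambda v: v):
        self.store = store
        self.value_trans = value_trans  # applied to every value on its way out

    def __getitem__(self, k):
        if isinstance(k, tuple):
            v = self.store
            for key in k:  # follow the path one key at a time
                v = v[key]
        else:
            v = self.store[k]
        return self.value_trans(v)

    def __iter__(self):
        return iter(self.store)

    def __len__(self):
        return len(self.store)


def recursive_path_access(v):
    """Wrap v if it is a mapping, and arrange for its values to be wrapped too."""
    if isinstance(v, Mapping):
        return PathAccess(v, value_trans=recursive_path_access)
    return v


d = {'a': {'b': {'c': 42}}}
top_level_only = PathAccess(d)               # top_level_only['a'] is a plain dict
fully_wrapped = recursive_path_access(d)     # fully_wrapped['a'] is a PathAccess
```

With `top_level_only`, asking for `['a']['b', 'c']` raises a KeyError because the inner value is a plain dict; with `fully_wrapped`, both `['a', 'b', 'c']` and `['a']['b', 'c']` resolve.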

conditional_data_trans

This function:

def conditional_data_trans(store, *, condition, data_trans): ...

implements the pattern above. We can now do this:

>>> from typing import Mapping
>>> from dol.util import instance_checker
>>> add_path_access_if_mapping = conditional_data_trans(
...     condition=instance_checker(Mapping), data_trans=add_path_access
... )
>>> s = add_path_access_if_mapping({'a': {'b': {'c': 42}}})
>>> s['a', 'b', 'c']
42
>>> # But now this works:
>>> s['a']['b', 'c']
42

Tasks

wrapper does not work with types

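More precisely: when wrapping a type (rather than an instance or using the factory form), the transformation reaches the nested values but is not applied at the first level: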

from dol.trans import conditional_data_trans
from dol import instance_checker
from typing import Mapping
from dol import add_path_access
from dol import Pipe

# f = Pipe(add_path_access, conditional_data_trans(condition=instance_checker(Mapping), data_trans=add_path_access))

d = {'a': {'b': {'c': 42}}}

s = conditional_data_trans(d, condition=instance_checker(Mapping), data_trans=add_path_access)
assert s['a']['b', 'c'] == 42 == d['a']['b']['c']
assert s['a', 'b', 'c'] == 42 == d['a']['b']['c']

f = conditional_data_trans(condition=instance_checker(Mapping), data_trans=add_path_access)
s = f(d)
assert s['a']['b', 'c'] == 42 == d['a']['b']['c']
assert s['a', 'b', 'c'] == 42 == d['a']['b']['c']


S = conditional_data_trans(dict, condition=instance_checker(Mapping), data_trans=add_path_access)
s = S(d)
assert s['a']['b', 'c'] == 42 == d['a']['b']['c']
# assert s['a', 'b', 'c'] == 42 == d['a']['b']['c']  # doesn't work.

Generalize condition

Is checking the value general enough?

We could use the postget argument of wrap_kvs to condition on both value and key.
But that gets awkward here: both the condition and the trans would then have to be packed into data_trans.
We might want a condition: Callable[[Key, Value], bool].

We might also want to handle a condition: Callable[[Path, Key, Value], bool] for more (full?) control -- say if we wanted to condition on the nested depth.
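
A rough sketch of what a path-aware condition could look like, assuming a read-only wrapper that remembers the keys that led to it. `PathConditionedStore` and `mappings_up_to_depth` are hypothetical names introduced for illustration; nothing here is existing dol API. The condition receives `(path, key, value)` and decides whether the nested value gets wrapped again, which is enough to condition on nesting depth.

```python
from collections.abc import Mapping
from typing import Any, Callable, Tuple

Path = Tuple[Any, ...]  # keys leading from the root to the current store


class PathConditionedStore(Mapping):
    """Hypothetical wrapper: re-wraps a value when condition(path, key, value) holds."""

    def __init__(
        self,
        store: Mapping,
        condition: Callable[[Path, Any, Any], bool],
        path: Path = (),
    ):
        self.store = store
        self.condition = condition
        self.path = tuple(path)

    def __getitem__(self, key):
        value = self.store[key]
        if self.condition(self.path, key, value):
            # The child inherits the condition and knows its own path
            return type(self)(value, self.condition, self.path + (key,))
        return value

    def __iter__(self):
        return iter(self.store)

    def __len__(self):
        return len(self.store)


def mappings_up_to_depth(max_depth):
    """Condition factory: wrap nested mappings, but only down to max_depth levels."""

    def condition(path, key, value):
        return isinstance(value, Mapping) and len(path) < max_depth

    return condition


d = {'a': {'b': {'c': 42}}}
s = PathConditionedStore(d, mappings_up_to_depth(1))
# s['a'] is wrapped (depth 0), while s['a']['b'] comes back as a plain dict
```

A key-only or value-only condition is then just a special case that ignores the arguments it doesn't need.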

And what about according to the path itself? Perhaps we'd like to


@thorwhalen
Member Author

thorwhalen commented Mar 7, 2023

Related:

A simple flatten_dict function:

from collections.abc import MutableMapping

def flatten_dict(d: MutableMapping, parent_key: str = '', sep: str = '.') -> MutableMapping:
    items = []
    for k, v in d.items():
        new_key = parent_key + sep + k if parent_key else k
        if isinstance(v, MutableMapping):
            items.extend(flatten_dict(v, new_key, sep=sep).items())
        else:
            items.append((new_key, v))
    return dict(items)

Found this in the "flatten a dict in 4 different ways" article, which we should read before tackling this issue.
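
The flatten_dict function above joins string keys with a separator. A variant that keeps each key as a tuple of the original keys may line up better with the path keys used earlier in this issue, and it doesn't require the keys to be strings. The following is only a sketch; `flatten_to_paths` is a hypothetical name, not a function from dol or from the article.

```python
from collections.abc import Mapping


def flatten_to_paths(d, parent_path=()):
    """Yield (path, value) pairs, where path is the tuple of keys leading to a leaf."""
    for k, v in d.items():
        path = parent_path + (k,)
        if isinstance(v, Mapping):
            # Descend into nested mappings, carrying the path so far
            yield from flatten_to_paths(v, path)
        else:
            yield path, v


flat = dict(flatten_to_paths({'a': {'b': {'c': 42}}, 'x': 1}))
# flat == {('a', 'b', 'c'): 42, ('x',): 1}
```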

Other stuff to look at as well:

This was referenced Mar 7, 2023
@thorwhalen
Member Author

Recursive stores

Consider the following various use cases:

  • In Files you can set a maximum number of levels, but perhaps we'd want more control over which folders to stop at (perhaps conditioned on the folder path, or the number of files in it, etc.)
  • DirReader lets you navigate folders gradually, level by level, but doesn't give you an "if file, then get value; if folder, then DirReader(folder) recursively" behavior.
  • {root: Files(root) for root, _, _ in os.walk(rootdir)}
  • FilesOfZip, but when a key in the zip is a zip itself, call FilesOfZip on it before returning (like postget, except that the function is applied to k, not to v)
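
As an illustration of the "if file, then get value; if folder, then recurse" behavior from the second bullet, here is a small sketch built only on pathlib. `RecursiveDirReader` and its `descend` parameter are hypothetical and are not dol's DirReader or Files. The `descend` predicate is one possible way to control where recursion stops, conditioned on the folder path.

```python
from collections.abc import Mapping
from pathlib import Path


class RecursiveDirReader(Mapping):
    """Hypothetical sketch: keys are entry names; values are file bytes or sub-readers."""

    def __init__(self, rootdir, descend=lambda folder: True):
        self.rootdir = Path(rootdir)
        self.descend = descend  # called with a folder Path; False means "stop here"

    def __iter__(self):
        return (p.name for p in sorted(self.rootdir.iterdir()))

    def __len__(self):
        return sum(1 for _ in self.rootdir.iterdir())

    def __getitem__(self, name):
        path = self.rootdir / name
        if path.is_file():
            return path.read_bytes()  # file: return its contents
        if path.is_dir():
            if self.descend(path):
                # folder: return a reader of the same kind, with the same rule
                return type(self)(path, self.descend)
            return path  # not descending: hand back the folder path itself
        raise KeyError(name)
```

For example, `RecursiveDirReader(rootdir, descend=lambda folder: folder.name != 'cache')` would read everything except the contents of folders named `cache` (a made-up folder name used only for this illustration).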

Or further, consider the following:

[Image: tree of folders and files, described below]

Say boxes represent folders, ellipses represent files, and e.name and e.date are contents of the file e, which encodes some kind of mapping.

Say we wanted a store that, when given a, lists b, c and d, and that, when the value of d is requested, returns a similar store on d. Also, when the value for e is requested, it should interpret the mapping, returning a particular store on e.
Essentially, we want the behavior of each level of stores to be customizable. In fact, the condition is not on the level, but more generally on the path (including key) and possibly values of the tree.

Method

One way to do this is with kv_walk, or some similar logic.

We can have the stores pass on the "DNA" of the rules to any children mappings that might need it.
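
One possible reading of "passing on the DNA", sketched with an in-memory tree standing in for folders and files. Everything here is an assumption made for illustration: `RuleStore`, the `(condition, trans)` rule format, the placement of e under d, and the `name=...` encoding of the file e are all hypothetical, and kv_walk itself is not used. Each rule's condition sees the full path and the value; the first matching rule transforms the value, and the rule set is handed down to child stores.

```python
from collections.abc import Mapping


class RuleStore(Mapping):
    """Hypothetical store whose rules are inherited by the child stores it creates."""

    def __init__(self, store, rules=(), path=()):
        self.store = store
        self.rules = tuple(rules)  # sequence of (condition, trans) pairs
        self.path = tuple(path)

    def __getitem__(self, key):
        value = self.store[key]
        path = self.path + (key,)
        for condition, trans in self.rules:
            if condition(path, value):
                # trans receives the rules so it can pass them on to children
                return trans(path, value, self.rules)
        return value

    def __iter__(self):
        return iter(self.store)

    def __len__(self):
        return len(self.store)


def is_folder(path, value):
    return isinstance(value, Mapping)


def as_child_store(path, value, rules):
    return RuleStore(value, rules, path)


def is_encoded_mapping(path, value):
    # Assumption: the file named 'e' holds "field=value" lines
    return path[-1] == 'e' and isinstance(value, str)


def parse_fields(path, value, rules):
    return dict(line.split('=', 1) for line in value.splitlines())


rules = [(is_folder, as_child_store), (is_encoded_mapping, parse_fields)]
tree = {'a': {'b': 'file b', 'c': 'file c', 'd': {'e': 'name=foo\ndate=2023-03-07'}}}
root = RuleStore(tree, rules)
```

Here `root['a']` lists b, c and d, `root['a']['d']` is another `RuleStore` carrying the same rules, and `root['a']['d']['e']` comes back as a parsed mapping rather than the raw string.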
