# Dictionary Functionals

Provides functionals for dictionary-related tasks. In most cases, named tuples are also supported.

I'm not too sure about the jargon here: I'm using _functional_ to mean a function that returns a function. Typically, a functional's parameters determine the behavior of the function that's returned.
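
To make the term concrete, here is a minimal sketch of a functional in the sense used above. The names `multiply_by`, `double` and `triple` are illustrative only and are not part of this notebook.

```python
from typing import Callable


def multiply_by(factor: float) -> Callable[[float], float]:
    """Functional: `factor` fixes how the returned function behaves."""
    def _func(x: float) -> float:
        return x * factor
    return _func


# Each call to the functional yields a new, specialised function.
double = multiply_by(2)
triple = multiply_by(3)
print(double(5), triple(5))
```

The functionals in this notebook follow the same shape: the outer call configures the manipulation, and the inner `_func` receives the dictionary.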

In [2]:
"""Dictionary functionals contains functions that return dictionary manipulating functions.

Functionals take zero or more parameters that specify how to manipulate a dictionary. Their returned 
functions take one or more dictionaries and return a value.

Functionals
-----------
has_keys:
    Tests whether a dictionary has all and/or only the keys in a list.

filter_values:
    Tests whether values in a dictionary satisfy a predicate.

extract_keys:
    Return a dictionary including only certain keys.

drop_keys:
    Return a dictionary excluding certain keys.

map_values:
    Apply a mapping to specified keys, returning the transformed dictionary.

flatten_dict:
    Return a dictionary that inherits the key-value pairs of any dictionary-valued key.

"""


In [52]:
from typing import List, Callable, Dict, Any
from functools import singledispatch, lru_cache
from collections import namedtuple

In [15]:
FunctionalMap = Dict[Any, Callable]

In [16]:
def has_keys(keys: List, every: bool = True, only: bool = True) -> Callable[[dict], bool]:
    """Return a predicate that tests whether a dictionary has all and/or only the given keys.

    every: bool
        Requires all keys to be in the dictionary.

    only: bool
        Requires all dictionary keys to be in keys.

    If both every and only are False, the predicate returns True if and only if the
    dictionary is not empty.

    """
    def _func(dict_):
        has_every = all(k in dict_ for k in keys)  # each requested key is present
        has_only = all(k in keys for k in dict_)   # no unrequested key is present
        if every and only:
            return has_every and has_only
        if every:
            return has_every
        if only:
            return has_only
        return bool(dict_)
    return _func


In [67]:
@lru_cache(maxsize=32)
def nt_builder(name, *args):
    """Build a named tuple class called name with fields args, caching the result."""
    return namedtuple(name, args)

In [63]:
repr(nt_builder('test', *['a', 'b'])._make((1, 2)))

'test(a=1, b=2)'
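
The notebook does not say why `nt_builder` is cached. One plausible reason (an assumption on my part) is that every direct `namedtuple` call creates a brand-new class, whereas a cached builder hands back the same class for repeated arguments. The sketch below redefines `nt_builder` so that it runs on its own; the variable names are illustrative.

```python
from collections import namedtuple
from functools import lru_cache


@lru_cache(maxsize=32)
def nt_builder(name, *args):
    """Build a named tuple class, reusing a cached class for repeated arguments."""
    return namedtuple(name, args)


first = nt_builder('test', 'a', 'b')
second = nt_builder('test', 'a', 'b')
uncached = namedtuple('test', ('a', 'b'))

# Identical arguments hit the cache, so the very same class object comes back.
assert first is second
# A direct namedtuple call always creates a fresh class.
assert first is not uncached
# Instances of the two classes still compare equal, because both are plain tuples.
assert first(1, 2) == uncached(1, 2)
```

Because `lru_cache` hashes its arguments, the field names have to be passed as separate strings (hence the `*['a', 'b']` unpacking) rather than as a list.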

In [29]:
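def filter_values(mapping: FunctionalMap) -> Callable[[Any], bool]:
    """Return a function that is True if every predicate in mapping holds for its key's value."""
    @singledispatch
    def _func(dict_: dict) -> bool:
        # Only keys named in mapping are tested; other keys are ignored.
        return all(mapping[k](dict_[k]) for k in mapping)

    @_func.register
    def _(dict_: tuple) -> bool:
        # Named tuples are tuples, so they dispatch here; fields are read by name.
        return all(mapping[k](getattr(dict_, k)) for k in mapping)
    return _func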

test_nt = namedtuple('test_nt', ['a', 'b'])
d = {'a': lambda x: x > 2}
assert not filter_values(d)({'a': 1})
assert filter_values(d)({'a': 3})
assert filter_values(d)(test_nt(3, 1))



In [None]:
list_of_dicts = [
    {'a': 1, 'b': 2},
    {'c': 2, 'b': 3},
    {'a': 2, 'b': 3, 'c': 0}
]

has_keys_test = list(filter(has_keys(['b'], only=False), list_of_dicts))
print(has_keys_test)
assert len(has_keys_test) == 3
assert all([('b' in dict_) for dict_ in has_keys_test])
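
The four combinations of `every` and `only` described in the `has_keys` docstring map onto set comparisons. The following is a sketch of that reading, not the notebook's implementation: `has_keys_sets` is a hypothetical name, and it assumes every element of `keys` is hashable.

```python
from typing import Callable, Iterable


def has_keys_sets(keys: Iterable, every: bool = True, only: bool = True) -> Callable[[dict], bool]:
    """Hypothetical set-based variant of has_keys."""
    wanted = set(keys)

    def _func(dict_: dict) -> bool:
        present = set(dict_)
        if every and only:
            return present == wanted   # exactly the requested keys
        if every:
            return wanted <= present   # all requested keys, extras allowed
        if only:
            return present <= wanted   # no keys outside the requested ones
        return bool(dict_)             # neither constraint: just non-empty
    return _func


d = {'a': 1, 'b': 2}
assert has_keys_sets(['a', 'b'])(d)
assert not has_keys_sets(['b'])(d)
assert has_keys_sets(['b'], only=False)(d)
assert has_keys_sets(['a', 'b', 'c'], every=False)(d)
assert not has_keys_sets([], every=False, only=False)({})
```

Note that with the defaults (`every=True, only=True`) a dictionary passes only if its keys match the list exactly, so a "contains this key" filter needs `only=False`.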

In [52]:
def extract_keys(keys: List[str]) -> Callable[[dict], dict]:
    """Return a function that keeps only the keys listed in the keys parameter."""
    @singledispatch
    def _func(dict_: dict) -> dict:
        return {k: v for k, v in dict_.items() if k in keys}

    @_func.register
    def _(dict_: tuple):
        # Named tuples expose their field names through the _fields attribute.
        fields = [field for field in dict_._fields if field in keys]
        new_namedtuple = nt_builder('extracted', *fields)
        return new_namedtuple._make(getattr(dict_, f) for f in fields)

    return _func

def drop_keys(keys: List[str]) -> Callable[[dict], dict]:
    """Return a function that removes the keys listed in the keys parameter."""
    @singledispatch
    def _func(dict_: dict) -> dict:
        return {k: v for k, v in dict_.items() if k not in keys}

    @_func.register
    def _(dict_: tuple):
        # Named tuples expose their field names through the _fields attribute.
        fields = [field for field in dict_._fields if field not in keys]
        new_namedtuple = nt_builder('dropped', *fields)
        return new_namedtuple._make(getattr(dict_, f) for f in fields)

    return _func


In [None]:
def map_values(mapping: FunctionalMap) -> Callable[[dict], dict]:
    """Return a function that applies mapping[k] to the value of each key k in mapping."""
    def _func(dict_: dict) -> dict:
        # Keys without an entry in mapping keep their original value.
        return {k: mapping[k](v) if k in mapping else v for k, v in dict_.items()}
    return _func


{'a': 1, 'b': 2}
[{'a': 1, 'b': 2}, {'c': 2, 'b': 3}, {'a': 2, 'b': 3, 'c': 0}]


[{'a': 6, 'b': 2}, {'c': 2, 'b': 3}, {'a': 12, 'b': 3, 'c': 0}]

In [None]:
def flatten_dict():
    """Return a function that flattens a dictionary by one level: each dict-valued key
    is replaced by the key-value pairs of its value.
    """
    def _func(dict_):
        out = {k: v for k, v in dict_.items() if not isinstance(v, dict)}
        for value in dict_.values():
            if isinstance(value, dict):
                out.update(value)
        return out
    return _func
    
assert flatten_dict()({"a": 1, 'd': {"b":1, 'c': 1}}) == {'a': 1, 'b': 1, 'c': 1}

In [27]:
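import math

def sequential_func(*functions):
    """Apply functions in order, passing each result on to the next function."""
    def _func(*args):
        for function in functions:
            try:
                # Unpack a tuple result into positional arguments.
                args = function(*args)
            except TypeError:
                # Otherwise pass the previous result as a single argument.
                args = function(args)
        return args
    return _func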

squaring = lambda x, y: (x**2, y**2)
summing = lambda x, y: x + y
rooting = lambda x: math.sqrt(x)

assert sequential_func(rooting, rooting)(16) == 2
assert sequential_func(squaring, summing, rooting)(3, 4) == 5

In [28]:
%timeit sequential_func(squaring, summing, rooting)(3, 4)

2.09 µs ± 26.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [40]:
%timeit rooting(summing(*squaring(3,4)))

838 ns ± 3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [None]:

mapping = {'a': lambda x: x * 6}
print(map_values(mapping)(list_of_dicts[0]))
assert map_values(mapping)(list_of_dicts[0]) == {'a': 6, 'b': 2}
assert len(list(map(map_values(mapping), list_of_dicts))) == 3


{'a': 6, 'b': 2}


In [None]:
from functools import reduce 

reduce(lambda x, y: x * y, [1, 2, 4, 5])

40

## "Aggregation" reduction

One of the standard functional operations is _reduce_. In the context of a list, reduce applies a function to the first pair of elements, then applies the function to that result and the next element, and so on until the list is consumed. In the context of applying a function to a list of dictionaries, reduction might apply to one key across every dictionary in the list, or it might apply across the keys within each dictionary. This is akin to applying a function to columns or rows in a multidimensional array.
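
The column and row readings of reduction can be sketched as follows. The `rows` data and the names `column_totals` and `row_totals` are made up for illustration, and the sketch assumes every dictionary has the same keys.

```python
from functools import reduce
from operator import add

rows = [
    {'a': 1, 'b': 2},
    {'a': 3, 'b': 4},
    {'a': 5, 'b': 6},
]

# "Column" reduction: fold one key across every dictionary in the list.
column_totals = {key: reduce(add, (row[key] for row in rows)) for key in rows[0]}

# "Row" reduction: fold the values inside each dictionary separately.
row_totals = [reduce(add, row.values()) for row in rows]

assert column_totals == {'a': 9, 'b': 12}
assert row_totals == [3, 7, 11]
```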

If we want to follow a standard functional approach, we would apply reduce to columns. Note that reduce does not require a commutative function (i.e. func(a, b) is not necessarily equal to func(b, a)), so the order of the dictionaries in the list can affect the result.
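
A small check of the case where func(a, b) differs from func(b, a), using subtraction as a deliberately order-sensitive function. The list `values` is illustrative.

```python
from functools import reduce
from operator import add, sub

values = [10, 3, 2]

# reduce folds from the left: (10 - 3) - 2
assert reduce(sub, values) == 5
# Reversing the input changes the result: (2 - 3) - 10
assert reduce(sub, list(reversed(values))) == -11
# Addition gives the same answer in either order.
assert reduce(add, values) == reduce(add, list(reversed(values))) == 15
```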

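So, you're basically looking at a mapping of column_name: reduce_func. Reducing a list of dictionaries with a two-dictionary function such as sum_dicts expands to sum_dicts(... sum_dicts(sum_dicts(dict1, dict2), dict3) ...).
```
def sum_dicts(dict1, dict2):
    # Add the two dictionaries key by key (dict2 must contain every key of dict1).
    return {k: dict1[k] + dict2[k] for k in dict1}

# More general: mapping gives one two-argument reducing function per key.
def reduce_dict(mapping):
    def _func(dict1, dict2):
        return {k: mapping[k](dict1[k], dict2[k]) for k in mapping}
    return _func
```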

In [None]:
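# def transpose_dict(keys):
#     # Is this a good idea?
#     def _func(list_of_dicts):


def aggregate_dict(mapping):
    """Return a reducer that combines two dictionaries key by key using mapping."""
    def _func(dict1, dict2):
        # Each reducer receives the values of its key from both dictionaries.
        return {k: mapping[k](dict1[k], dict2[k]) for k in mapping}
    return _func

sum_b = aggregate_dict({'b': lambda x, y: x + y})
# Fold the reducer over the list instead of passing the list inside a set literal.
assert reduce(sum_b, list_of_dicts) == {'b': 8}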