# User Guide

The user guide walks you through the main concepts and possible pitfalls of the library. The imports in the next code block will cover most of the examples.

In [75]:
from sigmaepsilon.deepdict import DeepDict
from pprint import pprint

## How to create a DeepDict?

You can simply create a DeepDict the same way you would create an ordinary ``dict``: 

In [76]:
data = DeepDict(a=1, b=DeepDict(c=2))
data["d"] = 2

pprint(data)

DeepDict({'a': 1, 'b': DeepDict({'c': 2}), 'd': 2})


You can create the same dictionary like this:

In [77]:
data = DeepDict(a=1, d=2)
data["b", "c"] = 2

pprint(data)

DeepDict({'a': 1, 'd': 2, 'b': DeepDict({'c': 2})})


The only difference here is the order of the keys. This is no surprise, since you provided the values in a different order. We can easily make up for this:

In [78]:
data = DeepDict(a=1)
data["b", "c"] = 2
data["d"] = 2

pprint(data)

DeepDict({'a': 1, 'b': DeepDict({'c': 2}), 'd': 2})


Now this is truly the same as the first one.

### Wrapping a regular ``dict`` instance

The ``wrap`` class method converts a regular dictionary, including its nested dictionaries, into a DeepDict:

In [79]:
d = {
    "a" : {"aa" : 1},
    "b" : 2,
    "c" : {"cc" : {"ccc" : 3}}, 
}

DeepDict.wrap(d)["c", "cc", "ccc"]

3

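Note that it is also possible to pass the dictionary directly to the constructor (remember, DeepDicts are dictionaries), but the nested dictionaries are then left as plain ``dict`` instances, so array-like indexing fails: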

In [80]:
d = {
    "a" : {"aa" : 1},
    "b" : 2,
    "c" : {"cc" : {"ccc" : 3}}, 
}
try:
    DeepDict(d)["c", "cc", "ccc"]
except KeyError as e:
    print(e)

('cc', 'ccc')


Instead, you need to index the result like a regular nested dictionary:

In [81]:
DeepDict(d)["c"]["cc"]["ccc"]

3

## Array-like indexing

As you have probably noticed, DeepDict instances support array-like indexing, so the following expressions all work and return the same value:

```python
d["a"]["b"]["c"]
d["a", "b", "c"]
d["a"]["b", "c"]
d[("a", "b", "c")]
d[["a", "b", "c"]]
```
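
To make the equivalence concrete, here is a minimal sketch of how tuple and list keys could be resolved by descending one level per element. `NestedDict` and `wrap` are hypothetical stand-ins written only for this illustration. They are not the library's implementation of `DeepDict`.

```python
class NestedDict(dict):
    """Hypothetical stand-in for illustration; not the library's implementation."""

    def __getitem__(self, key):
        # A tuple or list is treated as a path: descend one level per element.
        if isinstance(key, (tuple, list)):
            node = self
            for k in key:
                node = node[k]
            return node
        return super().__getitem__(key)


def wrap(obj):
    """Recursively convert plain dicts into NestedDict instances."""
    if isinstance(obj, dict):
        return NestedDict({k: wrap(v) for k, v in obj.items()})
    return obj


d = wrap({"a": {"b": {"c": 1}}})
results = [
    d["a"]["b"]["c"],
    d["a", "b", "c"],
    d["a"]["b", "c"],
    d[("a", "b", "c")],
    d[["a", "b", "c"]],
]
print(results)  # [1, 1, 1, 1, 1]
```

In this sketch the mixed form `d["a"]["b", "c"]` only works because the nested levels were converted too, which mirrors why wrapping matters in the previous section.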

## Iterating over a DeepDict

One of the key motives behind the DeepDict class is the ability to add layout to an ordinary dictionary without changing the meaning or the accessibility of the data. To illustrate what we mean, consider a dictionary like this:

In [82]:
data = {
    "key 1" : "value 1",
    "key 2" : "value 2",
    "key 3" : "value 3",
    "key 4" : "value 4",
}

If we iterate over the items of this dictionary we get back the items as expected.

In [83]:
[(k,v) for k, v in data.items()]

[('key 1', 'value 1'),
 ('key 2', 'value 2'),
 ('key 3', 'value 3'),
 ('key 4', 'value 4')]

This is all good, but sometimes we want to add some structure to our data, and this is exactly what the DeepDict class enables. For instance, we can group the values like this:

In [84]:
data = {
    "group 1" : {
        "subgroup 1" : {"key 1" : "value 1"},
        "subgroup 2" : {"key 2" : "value 2"},
    },
    "group 2" : {
        "subgroup 1" : {"key 3" : "value 3"},
        "subgroup 2" : {"key 4" : "value 4"},
    }
}
data = DeepDict.wrap(data)

In [85]:
[(k, v) for k, v in data.items(deep=True)]

[('key 1', 'value 1'),
 ('key 2', 'value 2'),
 ('key 3', 'value 3'),
 ('key 4', 'value 4')]

We can see that the results are the same, but our data is much more organized. All we had to do was add the keyword argument ``deep=True`` to let the iteration descend into sub-dictionaries. If you don't provide this argument, or you call the method with ``deep=False``, the instance works like a regular ``dict`` instance and only the top level is visited.

In [86]:
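[(k, v) for k, v in data.items()]

[('group 1',
  DeepDict({'subgroup 1': DeepDict({'key 1': 'value 1'}), 'subgroup 2': DeepDict({'key 2': 'value 2'})})),
 ('group 2',
  DeepDict({'subgroup 1': DeepDict({'key 3': 'value 3'}), 'subgroup 2': DeepDict({'key 4': 'value 4'})}))]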

You can also pass the argument ``return_address=True`` if you want to get the address of a value relative to the root.

In [87]:
for *addr, key in data.keys(deep=True, return_address=True):
    print(f"Address: {addr}, Key: {key}")

Address: ['group 1', 'subgroup 1'], Key: key 1
Address: ['group 1', 'subgroup 2'], Key: key 2
Address: ['group 2', 'subgroup 1'], Key: key 3
Address: ['group 2', 'subgroup 2'], Key: key 4
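
The deep iteration described above can be pictured as a recursive walk that remembers the path taken so far. The following is a minimal sketch on plain dictionaries. `iter_keys` and its `_prefix` parameter are hypothetical names chosen for this illustration, and the sketch makes no claim about how the library implements `keys`.

```python
def iter_keys(d, deep=False, return_address=False, _prefix=()):
    """Yield keys of ``d``; hypothetical helper, not part of the library."""
    for key, value in d.items():
        if deep and isinstance(value, dict):
            # Descend and remember the path taken so far.
            yield from iter_keys(value, deep, return_address, _prefix + (key,))
        elif return_address:
            # Full path from the outermost dictionary down to this key.
            yield _prefix + (key,)
        else:
            yield key


nested = {
    "group 1": {
        "subgroup 1": {"key 1": "value 1"},
        "subgroup 2": {"key 2": "value 2"},
    },
    "group 2": {
        "subgroup 1": {"key 3": "value 3"},
        "subgroup 2": {"key 4": "value 4"},
    },
}

shallow = list(iter_keys(nested))
deep_keys = list(iter_keys(nested, deep=True))
addresses = list(iter_keys(nested, deep=True, return_address=True))

print(shallow)       # ['group 1', 'group 2']
print(deep_keys)     # ['key 1', 'key 2', 'key 3', 'key 4']
print(addresses[0])  # ('group 1', 'subgroup 1', 'key 1')
```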


### Iterating over sub-dictionaries

The `containers` method iterates over the nested dictionaries themselves rather than over their values:

In [88]:
for c in data.containers():
    pprint(c)

DeepDict({'subgroup 1': DeepDict({'key 1': 'value 1'}), 'subgroup 2': DeepDict({'key 2': 'value 2'})})
DeepDict({'key 1': 'value 1'})
DeepDict({'key 2': 'value 2'})
DeepDict({'subgroup 1': DeepDict({'key 3': 'value 3'}), 'subgroup 2': DeepDict({'key 4': 'value 4'})})
DeepDict({'key 3': 'value 3'})
DeepDict({'key 4': 'value 4'})


You may have noticed that `data` itself was not printed. You can call `containers` with the argument `inclusive=True`, in which case the outermost container is also included:

In [89]:
for c in data.containers(inclusive=True):
    pprint(c)

DeepDict({'group 1': DeepDict({'subgroup 1': DeepDict({'key 1': 'value 1'}), 'subgroup 2': DeepDict({'key 2': 'value 2'})}), 'group 2': DeepDict({'subgroup 1': DeepDict({'key 3': 'value 3'}), 'subgroup 2': DeepDict({'key 4': 'value 4'})})})
DeepDict({'subgroup 1': DeepDict({'key 1': 'value 1'}), 'subgroup 2': DeepDict({'key 2': 'value 2'})})
DeepDict({'key 1': 'value 1'})
DeepDict({'key 2': 'value 2'})
DeepDict({'subgroup 1': DeepDict({'key 3': 'value 3'}), 'subgroup 2': DeepDict({'key 4': 'value 4'})})
DeepDict({'key 3': 'value 3'})
DeepDict({'key 4': 'value 4'})


If you only want to get the containers that have no subdictionaries, you can do this:

In [90]:
list(filter(lambda d: d.is_leaf(), data.containers(inclusive=True)))

[DeepDict({'key 1': 'value 1'}),
 DeepDict({'key 2': 'value 2'}),
 DeepDict({'key 3': 'value 3'}),
 DeepDict({'key 4': 'value 4'})]

The `containers` method also accepts the argument `deep`, but it is `True` by default.
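
As a rough mental model, the traversal order seen in the outputs above matches a depth-first walk that yields each nested dictionary before descending into it. The sketch below reproduces that order on plain dictionaries. `iter_containers` and `is_leaf` are hypothetical helpers, and the reading of `deep=False` as "direct children only" is an assumption, not something the text above confirms.

```python
def iter_containers(d, inclusive=False, deep=True):
    """Yield nested dicts depth-first; hypothetical helper, not the library's."""
    if inclusive:
        yield d
    for value in d.values():
        if isinstance(value, dict):
            yield value
            if deep:
                # Assumption: deep=False stops at the direct children.
                yield from iter_containers(value, deep=True)


def is_leaf(d):
    """A container is a leaf if none of its values is a dictionary."""
    return not any(isinstance(v, dict) for v in d.values())


nested = {
    "group 1": {
        "subgroup 1": {"key 1": "value 1"},
        "subgroup 2": {"key 2": "value 2"},
    },
    "group 2": {
        "subgroup 1": {"key 3": "value 3"},
        "subgroup 2": {"key 4": "value 4"},
    },
}

all_nested = list(iter_containers(nested))
with_root = list(iter_containers(nested, inclusive=True))
top_level = list(iter_containers(nested, deep=False))
leaves = [c for c in with_root if is_leaf(c)]

print(len(all_nested), len(with_root), len(top_level), len(leaves))  # 6 7 2 4
```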

## Parent-child relationships and membership testing

When you create a DeepDict with many levels, each sub-dictionary knows its parent and the key under which it is stored in that parent.

In [91]:
data = DeepDict()
data['a', 'b', 'c', 'e'] = 1
data['a', 'b', 'c'].parent is data['a', 'b'], data['a', 'b', 'c'].key

(True, 'c')

This information makes membership tests among DeepDict instances possible:

In [92]:
data['a', 'b', 'c'] in data['a', 'b']

True

## Locking the layout

You have seen previously that a DeepDict instance can be created like this:

In [93]:
data = DeepDict()
data['a', 'b', 'c', 'e'] = 1

This raises a question: can a DeepDict instance raise a `KeyError` at all? It depends. By default, it can't. Whenever a key is missing, a deeper level is created immediately. When you type `data['a'] = 1`, a DeepDict is first assigned to `data` under the key 'a', and then it is overwritten by the value 1. However, you can freeze the layout of a DeepDict once you have finished building your dataset.

In [94]:
data.lock()
data.locked

True

Now adding a missing key raises a `KeyError`.

In [95]:
try:
    data["b"] = 1
except KeyError as e:
    print(e)

"Missing key 'b' and the object is locked!"


Of course, you can unlock the layout of the instance whenever you want.

In [96]:
data.unlock()
data.locked

False

Now you can add your new data:

In [97]:
data["b"] = 1

Locking your `DeepDict` is essential in some situations. Without it, a mistyped key silently creates a new entry instead of raising an error, so there is no way to tell whether something went wrong. Typos are a real threat here.
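
The typo problem can be reproduced with a small stand-in class. The sketch below uses the standard `__missing__` hook of `dict` to create missing levels on access and a `locked` flag to turn that behaviour off. `AutoDict` is hypothetical and, unlike the real class, it locks only the level on which the flag is set.

```python
class AutoDict(dict):
    """Hypothetical stand-in for illustration; not the library's implementation."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.locked = False

    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent.
        if self.locked:
            raise KeyError(f"Missing key '{key}' and the object is locked!")
        child = AutoDict()
        self[key] = child
        return child

    def __setitem__(self, key, value):
        # A locked instance accepts new values only for existing keys.
        if self.locked and key not in self:
            raise KeyError(f"Missing key '{key}' and the object is locked!")
        super().__setitem__(key, value)


config = AutoDict()
config["solver"]["tolerance"] = 1e-6

# Unlocked: the typo "sovler" silently creates a new branch.
config["sovler"]["tolerance"] = 1e-3
typo_created = "sovler" in config
del config["sovler"]

# Locked: the same typo is reported immediately.
config.locked = True
try:
    config["sovler"]["tolerance"] = 1e-3
    typo_caught = False
except KeyError:
    typo_caught = True

print(typo_created, typo_caught)  # True True
```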

## Layout information

Every container inside a DeepDict has a parent. The only container that has no parent is the outermost container itself (here 'data').

In [98]:
data = DeepDict()
data['a', 'b', 'c', 'e'] = 1
data['a', 'b', 'c'].parent.key

'b'

As you might have already guessed, nested containers also know how they are stored in their parent via attributes like `key` and `address`.

In [99]:
data['a', 'b', 'c'].parent.key, data['a', 'b', 'c'].parent.address

('b', ['a', 'b'])

The nested containers also keep a reference to the outermost container (or none of these):

In [100]:
data['a', 'b', 'c'].root

DeepDict({'a': DeepDict({'b': DeepDict({'c': DeepDict({'e': 1})})})})

You can easily check whether a container is a root or a leaf:

In [101]:
data['a', 'b', 'c'].is_root(), data['a', 'b', 'c'].is_leaf()

(False, True)
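
One way to picture this bookkeeping is a dictionary subclass that records its parent and key whenever it is assigned into another instance, and derives the address and the root by walking upwards. The `Node` class below is a hypothetical sketch of that idea under these assumptions. It is not the library's code.

```python
class Node(dict):
    """Hypothetical stand-in for illustration; not the library's implementation."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.parent = None
        self.key = None

    def __setitem__(self, key, value):
        # Record the relationship when a Node is stored inside another Node.
        if isinstance(value, Node):
            value.parent, value.key = self, key
        super().__setitem__(key, value)

    @property
    def address(self):
        # Collect keys while walking up to the root, then reverse them.
        path, node = [], self
        while node.parent is not None:
            path.append(node.key)
            node = node.parent
        return path[::-1]

    @property
    def root(self):
        node = self
        while node.parent is not None:
            node = node.parent
        return node

    def is_root(self):
        return self.parent is None

    def is_leaf(self):
        return not any(isinstance(v, Node) for v in self.values())


data = Node()
data["a"] = Node()
data["a"]["b"] = Node()
data["a"]["b"]["c"] = Node()
data["a"]["b"]["c"]["e"] = 1

c = data["a"]["b"]["c"]
print(c.parent.key, c.parent.address)  # b ['a', 'b']
print(c.root is data, c.is_root(), c.is_leaf())  # True False True
```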

## Differences between ``dict`` and ``DeepDict``

In most cases a DeepDict works identically to a regular dictionary. One difference is how it provides access to deep levels.

Let's say we create a dictionary like this:

In [102]:
{(1, 2): 'A'}

{(1, 2): 'A'}

Since tuples are immutable, you can use them as keys in a dictionary. If you do the same with a DeepDict, the result is going to be different:

In [103]:
d = DeepDict()
d[(1, 2)] = "A"
d

DeepDict({1: DeepDict({2: 'A'})})

As you can see, in the second case, the value 'A' is in a nested dictionary with key 2, which itself is in a dictionary with key 1. The reason for this is that the previous cell is identical to the following one.

In [104]:
d = DeepDict()
d[1, 2] = "A"
d

DeepDict({1: DeepDict({2: 'A'})})

Keeping the array-like indexing mechanism was considered more important, so this is a deliberate design decision. The good news is that, at the end of the day, the behaviour is the same (at least in this case):

In [105]:
{(1, 2): 'A'}[(1, 2)]

'A'

In [106]:
DeepDict.wrap({(1, 2): 'A'})[(1, 2)]

'A'

In [107]:
(1, 2) in d

True

If you need a tuple to act as a single key rather than as an address, wrap it in `Key`. Note that membership tests must then use the wrapped key as well:

In [108]:
from sigmaepsilon.deepdict import Key

d = DeepDict()
d[Key((1, 2))] = "A"
d

DeepDict({(1, 2): 'A'})

In [109]:
d[Key((1, 2))]

'A'

In [110]:
(1, 2) in d, Key((1, 2)) in d

(False, True)
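
The distinction between "tuple as address" and "tuple as a single key" can be mimicked with a small marker class. In the sketch below, `AtomicKey` and `PathDict` are hypothetical names that only imitate the behaviour shown in the cells above. They are not the library's `Key` or `DeepDict`, whose internals may differ.

```python
class AtomicKey:
    """Hypothetical marker: use the wrapped value as one key, not as a path."""

    def __init__(self, value):
        self.value = value


class PathDict(dict):
    """Hypothetical stand-in for illustration; not the library's implementation."""

    def __setitem__(self, key, value):
        if isinstance(key, AtomicKey):
            # Store the wrapped tuple itself as a single key.
            super().__setitem__(key.value, value)
        elif isinstance(key, tuple):
            # Treat the tuple as a path, creating levels as needed.
            *path, last = key
            node = self
            for k in path:
                node = node.setdefault(k, PathDict())
            node[last] = value
        else:
            super().__setitem__(key, value)

    def __contains__(self, key):
        if isinstance(key, AtomicKey):
            return super().__contains__(key.value)
        if isinstance(key, tuple):
            # Follow the path and fail as soon as one level is missing.
            node = self
            for k in key:
                if not isinstance(node, dict) or not dict.__contains__(node, k):
                    return False
                node = dict.__getitem__(node, k)
            return True
        return super().__contains__(key)


nested = PathDict()
nested[(1, 2)] = "A"           # same as nested[1, 2] = "A"

flat = PathDict()
flat[AtomicKey((1, 2))] = "A"  # the tuple is kept as one key

print(nested)  # {1: {2: 'A'}}
print(flat)    # {(1, 2): 'A'}
print((1, 2) in nested, (1, 2) in flat, AtomicKey((1, 2)) in flat)  # True False True
```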

## Printing

It is possible to print a `DeepDict` or a regular `dict` instance as a tree using the `asciitree` package. Install it with

```console
$ pip install asciitree
```

and use the `asciiprint` function from `sigmaepsilon.deepdict`:

In [111]:
from sigmaepsilon.deepdict import asciiprint

d = {
    "a" : {"aa" : 1},
    "b" : 2,
    "c" : {"cc" : {"ccc" : 3}}, 
}

data = DeepDict.wrap(d)
data.name = "Data"

asciiprint(data)

Data
 +-- a
 +-- c
     +-- cc


For more comprehensive and detailed information about the `asciitree` library, please refer to the [official documentation](https://pythonhosted.org/asciitree/#).

## Customizing the behaviour upon joining or leaving a parent

A subclass can override the hooks `__before_join_parent__`, `__after_join_parent__`, `__before_leave_parent__` and `__after_leave_parent__` to run custom code when an instance is assigned to a parent or removed from it:

In [112]:
class CustomDict(DeepDict):
    def __before_join_parent__(self, parent, key):
        print(f"'{self.name}' is about to join team '{parent.name}' with role '{key}'")

    def __after_join_parent__(self, parent, key):
        print(f"'{self.name}' joined team '{parent.name}' with role '{key}'")
        super().__after_join_parent__(parent, key)
        
    def __before_leave_parent__(self):
        parent, key = self.parent, self.key
        print(f"'{self.name}' is about to leave team '{parent.name}' with role '{key}'")

    def __after_leave_parent__(self):
        parent, key = self.parent, self.key
        print(f"'{self.name}' has left team '{parent.name}' with role '{key}'")
        super().__after_leave_parent__()


team = DeepDict()
team.name = "Velocity Raptors"

member = CustomDict()
member.name = "Rebeca"

team["data engineer"] = member

'Rebeca' is about to join team 'Velocity Raptors' with role 'data engineer'
'Rebeca' joined team 'Velocity Raptors' with role 'data engineer'


In [113]:
del team["data engineer"]

'Rebeca' is about to leave team 'Velocity Raptors' with role 'data engineer'
'Rebeca' has left team 'Velocity Raptors' with role 'data engineer'


In [114]:
member in team, member.is_root()

(False, True)