#### Base Non-binary node functionality


Optional: create and activate a (conda) environment, then install the package:

```bash
    conda create -y -n conda_nbnode python=3.8
    conda activate conda_nbnode
    git clone https://github.com/ggrlab/nbnode
    cd nbnode
    pip install --upgrade pip
    pip install . 
```


The core functionality of the package is non-binary trees, i.e. trees whose nodes may have more than two children. The example used below is
a tree with a root node ``a`` and three children ``a0``, ``a1`` and ``a2``. ``a1`` is the only child that has a child of its own, ``a1a``.

```
    a
    ├── a0
    ├── a1
    │   └── a1a
    └── a2
```


A basic non-binary node (``NBNode``) consists of four important attributes:

    - ``name`` The name of the node. This is the only mandatory attribute.
    - ``parent`` The parent node of this node.
    - ``decision_name`` The name of the value leading to this node. 
    - ``decision_value`` The value leading to this node.

    
The name of a node only needs to be unique among the children of its parent node.
Together, ``decision_name`` and ``decision_value`` describe the decision leading to this node. Note that
``decision_name`` must be a string (or a list of strings when a node decides on multiple values, see below), but ``decision_value`` can be anything, including strings, integers, floats, etc.

To build the tree above, we can use the following code:


In [45]:
from nbnode.nbnode import NBNode
simple_tree = NBNode("a")
NBNode("a0", parent=simple_tree, decision_value=-1, decision_name="m1")
a1 = NBNode("a1", parent=simple_tree, decision_value=1, decision_name="m1")
NBNode("a2", parent=simple_tree, decision_value="another", decision_name="m3")
NBNode("a1a", parent=a1, decision_value="test", decision_name="m2")

NBNode('/a/a1/a1a', counter=0, decision_name='m2', decision_value='test')

We can check if the previous tree was built correctly: 

In [46]:
simple_tree.pretty_print()

a (counter:0)
├── a0 (counter:0)
├── a1 (counter:0)
│   └── a1a (counter:0)
└── a2 (counter:0)


And we can show additional information about each node of the tree:

In [53]:
simple_tree.pretty_print("__long__")

a (counter:0, decision_name:None, decision_value:None)
├── a0 (counter:0, decision_name:m1, decision_value:-1)
├── a1 (counter:0, decision_name:m1, decision_value:1)
│   └── a1a (counter:0, decision_name:m2, decision_value:test)
└── a2 (counter:0, decision_name:m3, decision_value:another)


In [None]:
# Alternatively, we prepared the tree already for you:
import nbnode.nbnode_trees as nbtree
simple_tree = nbtree.tree_simple()
simple_tree.pretty_print("__long__")

Finally, we use the tree to predict the end node of a new data point.
The data point is supplied as two lists, ``values`` and ``names``, where ``names[i]`` is the name of the feature in ``values[i]``.

In [48]:
single_prediction = simple_tree.predict(
        values=[1, "test", 2], names=["m1", "m2", "m3"]
    )
print(single_prediction)

NBNode('/a/a1/a1a', counter=0, decision_name='m2', decision_value='test')


This returns the ``NBNode`` object identified by the values.
``predict`` additionally accepts the following data types:

In [49]:
print("\nDictionary")
value_dict = {"m1": 1, "m2": "test", "m3": 2}
print(value_dict)
pred_dict = simple_tree.predict(values=value_dict)
print("Prediction: ")
print(pred_dict)



Dictionary
{'m1': 1, 'm2': 'test', 'm3': 2}
Prediction: 
NBNode('/a/a1/a1a', counter=0, decision_name='m2', decision_value='test')


In [50]:
print("\nPandas DataFrame")
import pandas as pd
value_df = pd.DataFrame.from_dict([value_dict])
print(value_df)
print("\nPrediction: ")
pred_df = simple_tree.predict(values=value_df)
print(pred_df)


Pandas DataFrame
   m1    m2  m3
0   1  test   2

Prediction: 
0    (((NBNode('/a/a1/a1a', counter=0, decision_nam...
dtype: object


In [51]:
print("\nNumpy array: Only for numerical values")
import numpy as np
values_np = np.array([[-1, 0, 0]])
print(values_np)
pred_np = simple_tree.predict(values=values_np,  names=["m1", "m2", "m3"])
print(pred_np)



Numpy array: Only for numerical values
[[-1  0  0]]
0    (((NBNode('/a/a0', counter=0, decision_name='m...
dtype: object
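
To make the prediction step above more concrete, here is a minimal sketch of how descending a non-binary tree by matching ``decision_name``/``decision_value`` pairs could work. ``ToyNode`` and ``toy_predict`` are hypothetical stand-ins written only for this illustration; they are not part of ``nbnode`` and do not claim to mirror its actual implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ToyNode:
    """Hypothetical stand-in for NBNode; not the nbnode implementation."""

    name: str
    decision_name: Optional[str] = None
    decision_value: Any = None
    children: List["ToyNode"] = field(default_factory=list)

    def add(self, child: "ToyNode") -> "ToyNode":
        """Attach a child and return it, so it can be extended further."""
        self.children.append(child)
        return child


def toy_predict(root: ToyNode, values: Dict[str, Any]) -> ToyNode:
    """Descend from root, following the first child whose decision matches."""
    node = root
    while node.children:
        matches = [
            child
            for child in node.children
            if values.get(child.decision_name) == child.decision_value
        ]
        if not matches:
            raise ValueError(f"No child of {node.name!r} matches {values}")
        node = matches[0]
    return node


# Rebuild the example tree from above with the toy class.
root = ToyNode("a")
root.add(ToyNode("a0", "m1", -1))
a1 = root.add(ToyNode("a1", "m1", 1))
a1.add(ToyNode("a1a", "m2", "test"))
root.add(ToyNode("a2", "m3", "another"))

leaf = toy_predict(root, {"m1": 1, "m2": "test", "m3": 2})
print(leaf.name)  # a1a
```

The sketch requires exact matches and stops at the first node without children, which is enough to reproduce the ``/a/a1/a1a`` and ``/a/a0`` results shown above.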


# NBNode basic methods

``NBNode`` implements a number of basic methods:

In [66]:
from nbnode.nbnode import NBNode
import nbnode.nbnode_trees as nbtree
simple_tree = nbtree.tree_simple()

# Print the tree
simple_tree.pretty_print("__long__")
# Print specific attributes of the tree as list
simple_tree.pretty_print(["counter"])
simple_tree.pretty_print(["decision_name", "decision_value"])
simple_tree.__dict__


# Access nodes
# Access a child of any (here root) node
simple_tree.children
a1 = simple_tree.children[1]
print(a1)

# You can also access nodes by their _full_ name
# The full name is the path from the root to the node, not the decision name or the node name
# You can retrieve the full name of a node by
print(a1.get_name_full())
# Mind the "/" ("root") at the beginning of the path
a1_by_name = simple_tree["/a/a1"]
print(a1_by_name)

# We can compare nodes! Here we have the exact same node, so it is identical. 
assert a1_by_name == a1


a (counter:0, decision_name:None, decision_value:None)
├── a0 (counter:0, decision_name:m1, decision_value:-1)
├── a1 (counter:0, decision_name:m1, decision_value:1)
│   └── a1a (counter:0, decision_name:m2, decision_value:test)
└── a2 (counter:0, decision_name:m3, decision_value:another)
a (counter:0)
├── a0 (counter:0)
├── a1 (counter:0)
│   └── a1a (counter:0)
└── a2 (counter:0)
a (decision_name:None, decision_value:None)
├── a0 (decision_name:m1, decision_value:-1)
├── a1 (decision_name:m1, decision_value:1)
│   └── a1a (decision_name:m2, decision_value:test)
└── a2 (decision_name:m3, decision_value:another)
NBNode('/a/a1', counter=0, decision_name='m1', decision_value=1)
/a/a1
NBNode('/a/a1', counter=0, decision_name='m1', decision_value=1)
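
As a rough illustration of the full-name lookup used above, the following sketch builds a path by walking up through the parents and resolves a path by walking down through the children. ``PathNode``, ``full_name`` and ``find`` are hypothetical names invented for this example; the real ``NBNode`` may implement this differently.

```python
from typing import Dict, List, Optional


class PathNode:
    """Hypothetical stand-in for NBNode, only for illustrating path lookup."""

    def __init__(self, name: str, parent: Optional["PathNode"] = None):
        self.name = name
        self.parent = parent
        # Children are keyed by name, so names must be unique among siblings.
        self.children: Dict[str, "PathNode"] = {}
        if parent is not None:
            parent.children[name] = self

    def full_name(self) -> str:
        """Join the names from the root down to this node with '/'."""
        parts: List[str] = []
        node: Optional["PathNode"] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/" + "/".join(reversed(parts))

    def find(self, path: str) -> "PathNode":
        """Resolve a full name such as '/a/a1', starting at this root node."""
        first, *rest = path.strip("/").split("/")
        if first != self.name:
            raise KeyError(path)
        node = self
        for part in rest:
            node = node.children[part]
        return node


root = PathNode("a")
PathNode("a0", parent=root)
a1 = PathNode("a1", parent=root)
PathNode("a1a", parent=a1)

print(a1.full_name())            # /a/a1
print(root.find("/a/a1") is a1)  # True
```

Keying the children by name also shows why a node name only has to be unique among its siblings: the full path disambiguates everything else.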


# Decision cutoffs  

``NBNode`` can also be used to split and then decide on continuous features. 

In [74]:
continuous_tree = NBNode("a")
NBNode("a0", parent=continuous_tree, decision_value=1, decision_name="m1", decision_cutoff=0.5)
NBNode("a1", parent=continuous_tree, decision_value=-1, decision_name="m1", decision_cutoff=0.5)
continuous_tree.pretty_print("__long__")

a (counter:0, decision_name:None, decision_value:None)
├── a0 (counter:0, decision_name:m1, decision_value:1)
└── a1 (counter:0, decision_name:m1, decision_value:-1)


The above ``continuous_tree`` contains two child nodes, both of which decide on ``m1``, with a ``decision_value`` of either 1 or -1. In addition, both have a ``decision_cutoff`` of 0.5.
Until now, ``NBNode`` required an **exact** match of the decision value. With ``decision_cutoff``, the value of the feature named by ``decision_name`` is first thresholded at the cutoff, which yields:

```python
    True if >= 0.5
    False if < 0.5
```

In [75]:
print(continuous_tree.predict(values=[0.6], names=["m1"]))
print(continuous_tree.predict(values=[0.4], names=["m1"]))

print(continuous_tree.predict(values=[1], names=["m1"]))
print(continuous_tree.predict(values=[-1], names=["m1"]))

print(continuous_tree.predict(values=[10], names=["m1"]))
print(continuous_tree.predict(values=[-10], names=["m1"]))

NBNode('/a/a0', counter=0, decision_name='m1', decision_value=1)
NBNode('/a/a1', counter=0, decision_name='m1', decision_value=-1)
NBNode('/a/a0', counter=0, decision_name='m1', decision_value=1)
NBNode('/a/a1', counter=0, decision_name='m1', decision_value=-1)
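
The cutoff step can be sketched in a few lines. The outputs above suggest that the thresholded value is compared against a ``decision_value`` of 1 or -1, so this sketch assumes exactly that encoding (1 at or above the cutoff, -1 below it). ``apply_cutoff``, ``toy_children`` and ``toy_choose`` are hypothetical helpers for illustration, not functions of ``nbnode``.

```python
from typing import Any, Optional


def apply_cutoff(value: Any, cutoff: Optional[float]) -> Any:
    """Binarize value at cutoff; the 1 / -1 encoding is an assumption."""
    if cutoff is None:
        return value  # no cutoff: the raw value must match exactly
    return 1 if value >= cutoff else -1


# (node name, decision_value, decision_cutoff), mirroring continuous_tree
toy_children = [("a0", 1, 0.5), ("a1", -1, 0.5)]


def toy_choose(value: float) -> str:
    """Return the name of the first child whose decision matches value."""
    for name, decision_value, cutoff in toy_children:
        if apply_cutoff(value, cutoff) == decision_value:
            return name
    raise ValueError(f"no child matches {value}")


for v in (0.6, 0.4, 1, -1):
    print(v, "->", toy_choose(v))
# 0.6 -> a0
# 0.4 -> a1
# 1 -> a0
# -1 -> a1
```

Under this assumption, a value exactly at the cutoff (0.5) falls on the ``a0`` side, since the comparison is ``>=``.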


# Multiple decision values

Some nodes need more than one value to decide on the end node. With ``NBNode``, a node can decide on any number of features.

In [85]:
from nbnode.nbnode import NBNode

mytree = NBNode("a")
# a0 =
NBNode("a0", parent=mytree, decision_value=-1, decision_name="m1")
a1 = NBNode("a1", parent=mytree, decision_value=1, decision_name="m1")
# a2 =
NBNode("a2", parent=mytree, decision_value="another", decision_name="m3")
# a1a =
NBNode("a1a", parent=a1, decision_value="test", decision_name="m2")
NBNode(
    "a3",
    parent=mytree,
    decision_value=["test", 1],
    decision_name=["m2", "m4"],
    decision_cutoff=[None, 0],
)

mytree.pretty_print("__long__")

print("\n\nPredictions")
print(mytree.predict(values=[None, "test", None, 3], names=["m1", "m2", "m3", "m4"]))
try: 
    print(mytree.predict(
        values=[None, "NOT_test", None, 3], names=["m1", "m2", "m3", "m4"]
        ))
except ValueError:
    print("ValueError: Could not find a fitting endnode for the data you gave. You also did not allow for part predictions.")

a (counter:0, decision_name:None, decision_value:None)
├── a0 (counter:0, decision_name:m1, decision_value:-1)
├── a1 (counter:0, decision_name:m1, decision_value:1)
│   └── a1a (counter:0, decision_name:m2, decision_value:test)
├── a2 (counter:0, decision_name:m3, decision_value:another)
└── a3 (counter:0, decision_name:['m2', 'm4'], decision_value:['test', 1])


Predictions
NBNode('/a/a3', counter=0, decision_name=['m2', 'm4'], decision_value=['test', 1])
ValueError: Could not find a fitting endnode for the data you gave. You also did not allow for part predictions.
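
The matching rule for a node with several decision names can be sketched as follows: the node fits only if every named feature matches its corresponding decision value, after applying the per-feature cutoff where one is given. ``toy_matches`` is a hypothetical helper written for this illustration, and the 1 / -1 cutoff encoding is an assumption inferred from the outputs shown earlier, not a documented ``nbnode`` guarantee.

```python
from typing import Any, Dict, List, Optional


def toy_matches(
    values: Dict[str, Any],
    decision_names: List[str],
    decision_values: List[Any],
    decision_cutoffs: List[Optional[float]],
) -> bool:
    """Return True only if every named feature matches its decision value."""
    for name, expected, cutoff in zip(
        decision_names, decision_values, decision_cutoffs
    ):
        observed = values.get(name)
        if observed is None:
            return False  # a missing feature cannot match
        if cutoff is not None:
            # Assumed encoding: 1 at or above the cutoff, -1 below it.
            observed = 1 if observed >= cutoff else -1
        if observed != expected:
            return False
    return True


# The decision of node "a3" from the example above.
a3 = dict(
    decision_names=["m2", "m4"],
    decision_values=["test", 1],
    decision_cutoffs=[None, 0],
)

ok = toy_matches({"m1": None, "m2": "test", "m3": None, "m4": 3}, **a3)
bad = toy_matches({"m1": None, "m2": "NOT_test", "m3": None, "m4": 3}, **a3)
print(ok)   # True
print(bad)  # False
```

In the first call ``m2`` matches exactly and ``m4 = 3`` lies above its cutoff of 0, so both conditions hold; in the second call ``m2`` fails, which corresponds to the ``ValueError`` case above.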
