<a href="https://colab.research.google.com/github/francescoS01/Bayesian-Network/blob/main/bayesian_network.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>



### **1. Bayesian node class**
**1.2 value_generate:** <br>
This method sets the node's current value (`self.current_value`) by sampling from the node's conditional probability table (CPT), given the current values of its parents. The CPT is stored as a list of dictionaries (detailed later). For example, for a node $X$ with two parents $Y$ and $Z$ whose current values are $v_y$ and $v_z$, the method draws a value $v_i$ for $X$ with probability $P(X=v_i \mid Y=v_y, Z=v_z)$.
<br>
The method works in three phases. Using the same example, node $X$ takes values $v_1, \ldots, v_n$ and has two parents $Y$ and $Z$ with current values $v_y$ and $v_z$:

- Build a dictionary mapping each parent name to its current value: $\left\{ Y: v_y, Z: v_z \right\}$.
- Search the CPT for the entries (one for each possible value of $X$) that contain $Y: v_y$ and $Z: v_z$, and read from each one the value of $X$ and its probability given those parent values. Store the result in a dictionary whose keys are the possible values of $X$ and whose values are the associated probabilities: $\left\{ v_1 : \text{prob}_1, \ldots, v_n : \text{prob}_n \right\}$. (Note: $\sum_i \text{prob}_i = 1$.)
- Generate a uniform random number between 0 and 1. The interval from 0 to 1 is split into consecutive sub-intervals, one per value $v_i$, each of length $\text{prob}_i$. Node $X$ takes the value whose sub-interval contains the random number, so values with a higher probability are selected more often.
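
The third phase can be sketched in isolation as follows. This is a minimal illustration, not the notebook's method: the helper name `draw_value` and the example distribution are assumptions made for this sketch.

```python
import random

def draw_value(distribution, rng=random):
    """Pick one key of `distribution` with probability equal to its value."""
    threshold = rng.random()  # uniform in [0, 1)
    cumulative = 0.0
    last_value = None
    for value, prob in distribution.items():
        cumulative += prob
        last_value = value
        if threshold < cumulative:
            return value
    # Rounding can leave the cumulative sum slightly below 1,
    # so fall back to the last value instead of returning nothing.
    return last_value

# Hypothetical distribution for node X given fixed parent values.
example = {"v1": 0.2, "v2": 0.5, "v3": 0.3}
rng = random.Random(0)  # seeded generator, for reproducibility
counts = {value: 0 for value in example}
for _ in range(10_000):
    counts[draw_value(example, rng)] += 1
print(counts)
```

Over many draws, the relative frequency of each value should approach its probability.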






In [38]:
import random
from typing import List

class BayesianNode:
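    def __init__(self, node_name, possible_value, cpt, current_value=None, children=None, parents=None):
        self.node_name = node_name
        self.possible_value = possible_value
        # Avoid mutable default arguments: a default list would be shared by every node.
        self.children = children if children is not None else []
        self.parents = parents if parents is not None else []
        self.cpt = cpt
        self.current_value = current_value
        self.children_update()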

    def children_update(self):
        for parent in self.parents:
            parent.set_children(self)

    def set_children(self, new_child: 'BayesianNode'):
        self.children.append(new_child)

    def set_current_state(self, value): 
        self.current_value = value

    def get_name(self):
        return self.node_name

    def get_parents(self):
        return self.parents
    
    def get_children(self): 
        return self.children
     
    def get_current_value(self): 
        return self.current_value
    
    def get_cpt(self): 
        return self.cpt 
    
    def value_generate(self):
        # Map each parent's name to its current value.
        parent_value_dict = {}
        for parent in self.parents:
            parent_value_dict[parent.get_name()] = parent.get_current_value()

        # Collect P(self = value | parent values) for every possible value of this node.
        probability_distribution = {}
        for value in self.possible_value:
            target = parent_value_dict.copy()
            target[self.node_name] = value
            for row in self.cpt:
                # Compare the CPT row, without its 'prob' entry, against the target assignment.
                assignment = {k: v for k, v in row.items() if k != 'prob'}
                if assignment == target:
                    probability_distribution[value] = row['prob']

        # Draw a value: walk the cumulative distribution until it exceeds the random number.
        random_number = random.random()  # uniform in [0, 1)
        accumulate = 0
        key = None
        for key, prob in probability_distribution.items():
            accumulate += prob
            if accumulate > random_number:
                self.current_value = key
                return key
        # Rounding can leave the cumulative sum just below 1: fall back to the last value.
        self.current_value = key
        return key

### **2. Network class**
The `Network` class has a single attribute, `net_nodes`, a list of `BayesianNode` objects.<br>
`topological_sort`: sorts the list of nodes (`self.net_nodes`) in topological order, so that every node comes after all of its parents. Sampling relies on this ordering. <br><br>
**2.1 sampling_create** <br>
Performs a topological sort of the nodes and then calls `value_generate` (defined in the `BayesianNode` class) on each node in that order, so every parent already has a value when its children are sampled. The generated values are collected into a dictionary mapping node names to values, which is returned.
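
The ordering step can be sketched on a plain adjacency dictionary using Kahn's algorithm. The function name `topological_order`, the toy graph, and the cycle check are assumptions made for illustration and are not part of the `Network` class.

```python
from collections import deque

def topological_order(children):
    """Return node names so that every parent precedes its children."""
    # Count the parents (incoming edges) of each node.
    indegree = {name: 0 for name in children}
    for kids in children.values():
        for kid in kids:
            indegree[kid] += 1

    # Start from the nodes that have no parents.
    ready = deque(name for name, count in indegree.items() if count == 0)
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for kid in children[name]:
            indegree[kid] -= 1
            if indegree[kid] == 0:
                ready.append(kid)

    # If some node was never released, the graph has a cycle.
    if len(order) != len(children):
        raise ValueError("graph contains a cycle")
    return order

# Toy graph (hypothetical): health -> stress -> {mood, recovery}
graph = {
    "stress": ["mood", "recovery"],
    "health": ["stress"],
    "mood": [],
    "recovery": [],
}
order = topological_order(graph)
print(order)  # ['health', 'stress', 'mood', 'recovery']
```

The explicit cycle check is a design choice of this sketch: without it, nodes that belong to a cycle are silently dropped from the result.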

In [39]:
class Network:
    def __init__(self, nodes:List[BayesianNode]):
        self.net_nodes = nodes

    def add_node(self, node: BayesianNode):
        if not isinstance(node, BayesianNode):
            raise TypeError("Input must be of type 'BayesianNode'")
        else:
            self.net_nodes.append(node)

    def topological_sort(self):
        visited = []
        parents_count = {node: 0 for node in self.net_nodes}
        for node in self.net_nodes:
            for child in node.children:
                parents_count[child] += 1
        no_parents = [node for node in self.net_nodes if parents_count[node] == 0]
        while no_parents:
            node = no_parents.pop(0)
            visited.append(node)
            for child in node.children:
                parents_count[child] -= 1
                if parents_count[child] == 0:
                    no_parents.append(child)
        self.net_nodes = visited 
        return visited
    
    def sampling_create(self): 
        nodes = self.topological_sort()
        sampling = {}
        for node in nodes: 
            node_name = node.get_name()
            node_value = node.value_generate()
            sampling[node_name] = node_value
            node.set_current_state(node_value)
        return sampling

### **3. Node creation**
In this section, node objects are created with all the necessary instance variables, such as node name, parents, etc. <br><br>
**3.1 CPT**<br>
The conditional probability table (CPT) of each node is a list of dictionaries. It contains one dictionary for every combination of a value of the node with values of its parents, so a node with $k$ possible values whose parents have $m_1, \ldots, m_j$ possible values has $k \cdot m_1 \cdots m_j$ entries. Taking the usual example of node $X$ with two parents $Y$ and $Z$, a generic entry in the CPT of $X$ is $\left\{X:v_x, Y:v_y, Z:v_z, \text{'prob'}:p \right\}$, where $p = P(X=v_x \mid Y=v_y, Z=v_z)$ is the probability that $X$ takes the value $v_x$ given that $Y$ and $Z$ take the values $v_y$ and $v_z$. For each fixed pair $(v_y, v_z)$, the values of $p$ over all $v_x$ must sum to 1.
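
Since every parent configuration must define a full distribution over the node's values, a CPT in this format can be checked mechanically. The sketch below is one possible check; the helper `check_cpt` and the example table are hypothetical and not part of the notebook.

```python
import math
from collections import defaultdict

def check_cpt(node_name, cpt):
    """Return the parent configurations whose probabilities do not sum to 1."""
    totals = defaultdict(float)
    for row in cpt:
        # Every key except the node's own name and 'prob'
        # identifies the parent configuration of this row.
        parent_config = tuple(sorted(
            (key, value) for key, value in row.items()
            if key not in (node_name, "prob")
        ))
        totals[parent_config] += row["prob"]
    return {
        config: total for config, total in totals.items()
        if not math.isclose(total, 1.0, abs_tol=1e-9)
    }

# Hypothetical CPT for a node X with a single parent Y.
cpt_x = [
    {"X": "a", "Y": "on", "prob": 0.7},
    {"X": "b", "Y": "on", "prob": 0.3},
    {"X": "a", "Y": "off", "prob": 0.5},
    {"X": "b", "Y": "off", "prob": 0.4},  # deliberately wrong: 0.5 + 0.4 != 1
]
bad = check_cpt("X", cpt_x)
print(bad)  # only the configuration Y = 'off' is reported
```

An empty result means every parent configuration is properly normalized.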

<br><br>
**3.2 Bayesian network structure**<br>
![Img](https://drive.google.com/uc?export=view&id=1Ocse1AFURWeeybF5vT8wmxm8i-DiC9m3)



In [40]:
# NOTE: each CPT row lists the node's own variable first, and the rows for the node's first state come first

# NODE 1: Nutrition node
nutr_possible_value = ["good", "not good"]
nutr_name ="nutrition"
parents = []
children = []
current_value = None
cpt = [                                                                                                 
    {nutr_name: 'good', 'prob':0.5},
    {nutr_name: 'not good', 'prob':0.5}]
nutr_node = BayesianNode(nutr_name, nutr_possible_value , cpt, current_value, children, parents)

# NODE 2: physical exercise node
pysicalex_possible_value = ['good', 'not good']
pysicalex_name = 'pysical exercise'
parents = []
children = []
current_value = None
cpt = [                                                                                                 
    {pysicalex_name: 'good', 'prob':0.6},
    {pysicalex_name: 'not good', 'prob':0.4}]
pysicalex_node = BayesianNode(pysicalex_name, pysicalex_possible_value , cpt, current_value, children, parents)

# NODE 3: health node
healt_possibile_value = ['good', 'not good']
healt_name = 'healt'
parents = [nutr_node, pysicalex_node]
children = []
current_value = None
nutr_name = nutr_node.get_name()
pysicalex_name = pysicalex_node.get_name()
cpt = [                                                                                                 
    {healt_name: 'good', nutr_name: 'good', pysicalex_name: 'good', 'prob':0.8},
    {healt_name: 'good', nutr_name: 'good', pysicalex_name: 'not good', 'prob':0.7},
    {healt_name: 'good', nutr_name: 'not good', pysicalex_name: 'good', 'prob':0.6}, 
    {healt_name: 'good', nutr_name: 'not good', pysicalex_name: 'not good', 'prob':0.3},
    {healt_name: 'not good', nutr_name: 'good', pysicalex_name: 'good', 'prob':0.2},
    {healt_name: 'not good', nutr_name: 'good', pysicalex_name: 'not good', 'prob':0.3},
    {healt_name: 'not good', nutr_name: 'not good', pysicalex_name: 'good', 'prob':0.4},
    {healt_name: 'not good', nutr_name: 'not good', pysicalex_name: 'not good', 'prob':0.7}]
healt_node = BayesianNode(healt_name, healt_possibile_value, cpt, current_value, children, parents)

# NODE 4: stress node
stress_possibile_value = ['high', 'not high']
stress_name = 'stress'
parents = [healt_node]
children = []
current_value = None
healt_name = healt_node.get_name()
cpt = [                                                                                                 
    {stress_name: 'high', healt_name: 'good', 'prob':0.4},
    {stress_name: 'high', healt_name: 'not good', 'prob':0.8},
    {stress_name: 'not high', healt_name: 'good', 'prob':0.6}, 
    {stress_name: 'not high', healt_name: 'not good', 'prob':0.2}]
stress_node = BayesianNode(stress_name, stress_possibile_value, cpt, current_value, children, parents)

# NODE 5: recovery node
recovery_possibile_value = ['good', 'not good']
recovery_name = 'recovery'
parents = [stress_node]
children = []
current_value = None
stress_name = stress_node.get_name()
cpt = [                                                                                                 
    {recovery_name: 'good', stress_name: 'high', 'prob':0.2},
    {recovery_name: 'good', stress_name: 'not high', 'prob':0.7},
    {recovery_name: 'not good', stress_name: 'high', 'prob':0.8}, 
    {recovery_name: 'not good', stress_name: 'not high', 'prob':0.3}]
recovery_node = BayesianNode(recovery_name, recovery_possibile_value, cpt, current_value, children, parents)

# NODE 6: mood node
mood_possibile_value = ['good', 'not good']
mood_name = 'mood'
parents = [stress_node]
children = []
current_value = None
stress_name = stress_node.get_name()
cpt = [                                                                                                 
    {mood_name: 'good', stress_name: 'high', 'prob':0.2},
    {mood_name: 'good', stress_name: 'not high', 'prob':0.7},
    {mood_name: 'not good', stress_name: 'high', 'prob':0.8}, 
    {mood_name: 'not good', stress_name: 'not high', 'prob':0.3}]
mood_node = BayesianNode(mood_name, mood_possibile_value, cpt, current_value, children, parents)

# NODE 7: energy node
energy_possibile_value = ['high', 'not high']
energy_name = 'energy'
parents = [recovery_node, nutr_node]
children = []
current_value = None
recovery_name = recovery_node.get_name()
nutr_name = nutr_node.get_name()
cpt = [                                                                                                 
    {energy_name: 'high', nutr_name: 'good', recovery_name: 'good', 'prob':0.9},
    {energy_name: 'high', nutr_name: 'good', recovery_name: 'not good', 'prob':0.5},
    {energy_name: 'high', nutr_name: 'not good', recovery_name: 'good', 'prob':0.6}, 
    {energy_name: 'high', nutr_name: 'not good', recovery_name: 'not good', 'prob':0.1},
    {energy_name: 'not high', nutr_name: 'good', recovery_name: 'good', 'prob':0.1},
    {energy_name: 'not high', nutr_name: 'good', recovery_name: 'not good', 'prob':0.5},
    {energy_name: 'not high', nutr_name: 'not good', recovery_name: 'good', 'prob':0.4},
    {energy_name: 'not high', nutr_name: 'not good', recovery_name: 'not good', 'prob':0.9}]
energy_node = BayesianNode(energy_name, energy_possibile_value, cpt, current_value, children, parents)

# NODE 8: productivity node
prod_possibile_value = ['high', 'not high']
prod_name = 'productivity'
parents = [energy_node, mood_node]
children = []
current_value = None
energy_name = energy_node.get_name()
mood_name = mood_node.get_name()
cpt = [                                                                                                 
    {prod_name: 'high', energy_name: 'high', mood_name: 'good', 'prob':0.8},
    {prod_name: 'high', energy_name: 'high', mood_name: 'not good', 'prob':0.6},
    {prod_name: 'high', energy_name: 'not high', mood_name: 'good', 'prob':0.6}, 
    {prod_name: 'high', energy_name: 'not high', mood_name: 'not good', 'prob':0.2},
    {prod_name: 'not high', energy_name: 'high', mood_name: 'good', 'prob':0.2},
    {prod_name: 'not high', energy_name: 'high', mood_name: 'not good', 'prob':0.4},
    {prod_name: 'not high', energy_name: 'not high', mood_name: 'good', 'prob':0.4},
    {prod_name: 'not high', energy_name: 'not high', mood_name: 'not good', 'prob':0.8}]
productivity_node = BayesianNode(prod_name, prod_possibile_value, cpt, current_value, children, parents)

# NODE 9: wellness node
wellness_possibile_value = ['high', 'not high']
wellness_name = 'wellness'
parents = [pysicalex_node, mood_node]
children = []
current_value = None
pysicalex_name = pysicalex_node.get_name()
mood_name = mood_node.get_name()
cpt = [                                                                                                 
    {wellness_name: 'high', pysicalex_name: 'good', mood_name: 'good', 'prob':0.8},
    {wellness_name: 'high', pysicalex_name: 'good', mood_name: 'not good', 'prob':0.6},
    {wellness_name: 'high', pysicalex_name: 'not good', mood_name: 'good', 'prob':0.6}, 
    {wellness_name: 'high', pysicalex_name: 'not good', mood_name: 'not good', 'prob':0.2},
    {wellness_name: 'not high', pysicalex_name: 'good', mood_name: 'good', 'prob':0.2},
    {wellness_name: 'not high', pysicalex_name: 'good', mood_name: 'not good', 'prob':0.4},
    {wellness_name: 'not high', pysicalex_name: 'not good', mood_name: 'good', 'prob':0.4},
    {wellness_name: 'not high', pysicalex_name: 'not good', mood_name: 'not good', 'prob':0.8}]
wellness_node = BayesianNode(wellness_name, wellness_possibile_value, cpt, current_value, children, parents)

# NODE 10: fitness node
fitness_possibile_value = ['good', 'not good']
fitness_name = 'fitness'
parents = [pysicalex_node]
children = []
current_value = None
pysicalex_name = pysicalex_node.get_name()
cpt = [                                                                                                 
    {fitness_name: 'good', pysicalex_name: 'good', 'prob':0.7},
    {fitness_name: 'good', pysicalex_name: 'not good', 'prob':0.3},
    {fitness_name: 'not good', pysicalex_name: 'good', 'prob':0.3}, 
    {fitness_name: 'not good', pysicalex_name: 'not good', 'prob':0.7}]
fitness_node = BayesianNode(fitness_name, fitness_possibile_value, cpt, current_value, children, parents)


### **4. Sampling creation**
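
Samples in the dictionary format returned by `sampling_create` can be used to estimate marginal probabilities by counting. The sketch below assumes a hypothetical two-node chain `A -> B` as a stand-in for the real network, so that it runs on its own. The names `estimate_marginal` and `toy_sample` are illustrative only.

```python
import random
from collections import Counter

def estimate_marginal(samples, variable):
    """Relative frequency of each value of `variable` across a list of sample dicts."""
    counts = Counter(sample[variable] for sample in samples)
    total = len(samples)
    return {value: count / total for value, count in counts.items()}

def toy_sample(rng):
    """Hypothetical stand-in for net.sampling_create(): sample A first, then B given A."""
    a = "good" if rng.random() < 0.6 else "not good"
    p_b_high = 0.7 if a == "good" else 0.3
    b = "high" if rng.random() < p_b_high else "not high"
    return {"A": a, "B": b}

rng = random.Random(42)  # seeded generator, for reproducibility
samples = [toy_sample(rng) for _ in range(20_000)]
estimate = estimate_marginal(samples, "B")
# Exact value for comparison: P(B = high) = 0.6 * 0.7 + 0.4 * 0.3 = 0.54
print(estimate)
```

The estimate gets closer to the exact marginal as the number of samples grows.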

In [41]:
# ------------------------- NET TEST --------------------------------
net = Network([mood_node, nutr_node, healt_node, energy_node, pysicalex_node, stress_node, recovery_node, productivity_node , wellness_node, fitness_node])
for _ in range(10):
    x = net.sampling_create()
    print(x)

{'nutrition': 'not good', 'pysical exercise': 'not good', 'healt': 'not good', 'fitness': 'good', 'stress': 'high', 'recovery': 'not good', 'mood': 'not good', 'energy': 'not high', 'wellness': 'not high', 'productivity': 'not high'}
{'nutrition': 'not good', 'pysical exercise': 'not good', 'healt': 'not good', 'fitness': 'good', 'stress': 'high', 'recovery': 'not good', 'mood': 'not good', 'energy': 'not high', 'wellness': 'not high', 'productivity': 'not high'}
{'nutrition': 'good', 'pysical exercise': 'good', 'healt': 'good', 'fitness': 'good', 'stress': 'not high', 'recovery': 'good', 'mood': 'good', 'energy': 'high', 'wellness': 'high', 'productivity': 'not high'}
{'nutrition': 'not good', 'pysical exercise': 'not good', 'healt': 'good', 'fitness': 'good', 'stress': 'high', 'recovery': 'good', 'mood': 'not good', 'energy': 'high', 'wellness': 'not high', 'productivity': 'high'}
{'nutrition': 'not good', 'pysical exercise': 'good', 'healt': 'good', 'fitness': 'not good', 'stress': 