---

**APRP: 4. Parameter and Structure Learning**

---


**Contents:**

--------
1. Short context
2. Parameter Learning in a Bayes Net
3. Structure Learning in a Bayes Net
4. Challenge




### 1. Short context

So far we have focused on how Bayesian networks efficiently encode a probability distribution over a set of variables.
This assignment is about obtaining a Bayesian network from a set of sample data. First we focus on the problem of **parameter learning** for a given network structure, and then on **learning the structure** itself.




### 2. Parameter Learning in a Bayes Net
## 2.1. Introductory example


In [3]:
!pip install pgmpy
#Based on https://pgmpy.org/detailed_notebooks/10.%20Learning%20Bayesian%20Networks%20from%20Data.html
import numpy as np
import pandas as pd
from pgmpy.models import BayesianModel
from pgmpy.estimators import BayesianEstimator
from pgmpy.estimators import MaximumLikelihoodEstimator

# Training data: 14 observations of three categorical variables
data = pd.DataFrame(data={'fruit': ["banana", "apple", "banana", "apple", "banana","apple", "banana",
                                    "apple", "apple", "apple", "banana", "banana", "apple", "banana",],
                          'tasty': ["yes", "no", "yes", "yes", "yes", "yes", "yes",
                                    "yes", "yes", "yes", "yes", "no", "no", "no"],
                          'size': ["large", "large", "large", "small", "large", "large", "large",
                                    "small", "large", "large", "large", "large", "small", "small"]})
# Model structure: fruit -> tasty <- size
model = BayesianModel([('fruit', 'tasty'), ('size', 'tasty')])


# Use the maximum likelihood estimator (MLE) to obtain the CPD tables
model.fit(data, estimator=MaximumLikelihoodEstimator)
for cpd in model.get_cpds():
    print(cpd)

# Use a Bayesian estimator (with a BDeu prior) to obtain the CPD tables
model.fit(data, estimator=BayesianEstimator, prior_type="BDeu")
for cpd in model.get_cpds():
    print(cpd)


+---------------+-----+
| fruit(apple)  | 0.5 |
+---------------+-----+
| fruit(banana) | 0.5 |
+---------------+-----+
+------------+--------------+-----+---------------+
| fruit      | fruit(apple) | ... | fruit(banana) |
+------------+--------------+-----+---------------+
| size       | size(large)  | ... | size(small)   |
+------------+--------------+-----+---------------+
| tasty(no)  | 0.25         | ... | 1.0           |
+------------+--------------+-----+---------------+
| tasty(yes) | 0.75         | ... | 0.0           |
+------------+--------------+-----+---------------+
+-------------+----------+
| size(large) | 0.714286



## 2.2. Questions
Based on the previous code, answer the following questions:

1. For MLE, why are small bananas not tasty?
2. Why do the CPD tables for the MLE and Bayesian estimators differ?
3. What is the purpose of the *prior_type* parameter?
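
One way to explore these questions is to recompute the estimates by hand. The sketch below uses only pandas on the same fruit data: the MLE is obtained as relative frequencies, and a smoothed estimate is obtained by adding a uniform pseudo-count to every cell before normalising. The equivalent sample size of 5 and the way it is spread evenly over all cells are assumptions made for illustration; they are not confirmed to match pgmpy's internals exactly.

```python
import pandas as pd

# Same fruit data as in section 2.1.
data = pd.DataFrame({
    "fruit": ["banana", "apple", "banana", "apple", "banana", "apple", "banana",
              "apple", "apple", "apple", "banana", "banana", "apple", "banana"],
    "tasty": ["yes", "no", "yes", "yes", "yes", "yes", "yes",
              "yes", "yes", "yes", "yes", "no", "no", "no"],
    "size": ["large", "large", "large", "small", "large", "large", "large",
             "small", "large", "large", "large", "large", "small", "small"],
})

# Count table: rows are states of 'tasty', columns are (fruit, size) pairs.
counts = pd.crosstab(data["tasty"], [data["fruit"], data["size"]])

# MLE: normalise each column so that it sums to 1 (relative frequencies).
mle = counts / counts.sum(axis=0)

# Smoothed estimate (assumption): equivalent sample size 5, shared equally
# by all cells of the table, added to the observed counts.
ess = 5
alpha = ess / counts.size
smoothed = (counts + alpha) / (counts + alpha).sum(axis=0)

print(counts)
print(mle.round(3))
print(smoothed.round(3))
```

Comparing the `counts` table with the two probability tables shows how strongly each estimate depends on the number of observations behind a given column.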

### 3. Structure Learning
## 3.1. Scoring function. Example 1

In [4]:
from pgmpy.estimators import BDeuScore, K2Score, BicScore

bdeu = BDeuScore(data, equivalent_sample_size=5)
k2 = K2Score(data)
bic = BicScore(data)

model1 = BayesianModel([('fruit', 'tasty'), ('size', 'tasty')]) # fruit -> tasty <- size
model2 = BayesianModel([('tasty', 'fruit'), ('tasty', 'size')]) # fruit <- tasty -> size

print(bdeu.score(model1))
print(k2.score(model1))
print(bic.score(model1))

print(bdeu.score(model2))
print(k2.score(model2))
print(bic.score(model2))




-30.12792467904587
-30.3772093643128
-32.859257093436106
-29.99714276768256
-30.551081866620226
-32.45409104969264
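
To make the BIC values above less of a black box, here is a minimal sketch of a BIC score computed by hand with pandas and NumPy. It assumes the usual decomposition into one term per node: the maximised log-likelihood of the node given its parents, minus a penalty of 0.5 · log(N) per free parameter. The helper names `local_bic` and `bic` are hypothetical, and the number of parent configurations is taken as the product of the parents' observed cardinalities.

```python
import numpy as np
import pandas as pd

# Same fruit data as in section 2.1.
data = pd.DataFrame({
    "fruit": ["banana", "apple", "banana", "apple", "banana", "apple", "banana",
              "apple", "apple", "apple", "banana", "banana", "apple", "banana"],
    "tasty": ["yes", "no", "yes", "yes", "yes", "yes", "yes",
              "yes", "yes", "yes", "yes", "no", "no", "no"],
    "size": ["large", "large", "large", "small", "large", "large", "large",
             "small", "large", "large", "large", "large", "small", "small"],
})


def local_bic(df, node, parents):
    """BIC contribution of one node given its parents (natural logarithm)."""
    n = len(df)
    if parents:
        # Counts of every observed (parent configuration, node state) pair.
        counts = df.groupby(parents + [node]).size()
        # Total count of the matching parent configuration, aligned with counts.
        parent_totals = counts.groupby(level=parents).transform("sum")
        n_parent_configs = int(np.prod([df[p].nunique() for p in parents]))
    else:
        counts = df[node].value_counts()
        parent_totals = n
        n_parent_configs = 1
    # Log-likelihood at the MLE parameters: sum of count * log(relative frequency).
    log_likelihood = float((counts * np.log(counts / parent_totals)).sum())
    n_free_params = n_parent_configs * (df[node].nunique() - 1)
    return log_likelihood - 0.5 * np.log(n) * n_free_params


def bic(df, parents_of):
    """Sum the local scores; parents_of maps each node to its list of parents."""
    return sum(local_bic(df, node, parents) for node, parents in parents_of.items())


model1 = {"fruit": [], "size": [], "tasty": ["fruit", "size"]}  # fruit -> tasty <- size
model2 = {"tasty": [], "fruit": ["tasty"], "size": ["tasty"]}   # fruit <- tasty -> size

score1 = bic(data, model1)
score2 = bic(data, model2)
print(score1)
print(score2)
```

Printing the individual `local_bic` terms shows how the likelihood gain of a larger parent set is traded off against the penalty for its extra parameters.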


## 3.2. Scoring function. Example 2

In [5]:
# create random data sample with 3 variables, where Z is dependent on X, Y:
data = pd.DataFrame(np.random.randint(0, 4, size=(5000, 2)), columns=list('XY'))
data['Z'] = data['X'] + data['Y']

bdeu = BDeuScore(data, equivalent_sample_size=5)
k2 = K2Score(data)
bic = BicScore(data)

model1 = BayesianModel([('X', 'Z'), ('Y', 'Z')])  # X -> Z <- Y
model2 = BayesianModel([('X', 'Z'), ('X', 'Y')])  # Y <- X -> Z


print(bdeu.score(model1))
print(k2.score(model1))
print(bic.score(model1))

print(bdeu.score(model2))
print(k2.score(model2))
print(bic.score(model2))

-13940.064774913622
-14330.964883079389
-14296.100426036432
-20911.9573808926
-20938.776897222077
-20955.99232141435
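
The large gap between the two models can be examined through the empirical conditional entropy of Z under each candidate parent set. The sketch below is an illustration under stated assumptions: it regenerates data of the same shape with a fixed seed (so the numbers will not match the run above), and `conditional_entropy` is a hypothetical helper, not a pgmpy function.

```python
import numpy as np
import pandas as pd

# Regenerate data of the same form as above, with a fixed seed for repeatability.
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.integers(0, 4, size=(5000, 2)), columns=list("XY"))
data["Z"] = data["X"] + data["Y"]


def conditional_entropy(df, node, parents):
    """Empirical H(node | parents) in nats, estimated from observed counts."""
    joint = df.groupby(parents + [node]).size()
    parent_totals = joint.groupby(level=parents).transform("sum")
    return float(-(joint / len(df) * np.log(joint / parent_totals)).sum())


h_z_given_xy = conditional_entropy(data, "Z", ["X", "Y"])  # parents as in model1
h_z_given_x = conditional_entropy(data, "Z", ["X"])        # parents as in model2

print(h_z_given_xy)
print(h_z_given_x)
```

Multiplying a conditional entropy by the number of samples gives the negative log-likelihood contribution of Z, which is the dominant part of each score at this sample size.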




## 3.3. Putting it all together

Example with HillClimbSearch and BicScore.

In [None]:
from pgmpy.estimators import HillClimbSearch

# create some data with dependencies
data = pd.DataFrame(np.random.randint(0, 5, size=(2500, 8)), columns=list('ABCDEFGH'))
data['A'] += data['B'] + data['C']
data['H'] = data['G'] - data['A']

hc = HillClimbSearch(data)
best_model = hc.estimate(scoring_method=BicScore(data))
print(best_model.edges())



[('A', 'C'), ('A', 'B'), ('B', 'C'), ('G', 'A'), ('H', 'A'), ('H', 'G')]
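
A simple way to inspect this result is to compare the learned edges with the dependencies used to build the data, ignoring edge direction. The sketch below is plain Python. The `generating_edges` list is an assumption read off the data-construction code (A is built from B and C, H from G and the updated A), and `skeleton` is a hypothetical helper.

```python
# Edges printed by the hill-climbing run above.
learned_edges = [("A", "C"), ("A", "B"), ("B", "C"),
                 ("G", "A"), ("H", "A"), ("H", "G")]

# Assumption: dependencies implied by how the data was generated.
generating_edges = [("B", "A"), ("C", "A"), ("A", "H"), ("G", "H")]


def skeleton(edges):
    """Drop edge directions: each edge becomes an unordered pair of nodes."""
    return {frozenset(edge) for edge in edges}


def as_sorted_pairs(pairs):
    """Turn a set of unordered pairs into a sorted list of tuples for printing."""
    return sorted(tuple(sorted(pair)) for pair in pairs)


learned = skeleton(learned_edges)
expected = skeleton(generating_edges)

recovered = as_sorted_pairs(learned & expected)  # adjacencies found by the search
extra = as_sorted_pairs(learned - expected)      # learned but not in the generator
missing = as_sorted_pairs(expected - learned)    # in the generator but not learned

print("recovered:", recovered)
print("extra:", extra)
print("missing:", missing)
```

Comparing skeletons rather than directed edges is a deliberate choice here: score-based search cannot always distinguish structures that differ only in the direction of some edges.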


## 3.4. Questions
Based on the previous code, answer the following questions:

1. Analyse the results from 3.1 and 3.2.
2. Analyse the structure of the obtained network.


### 4. Challenge


1. Consider the file cancer.bif, which contains the cancer disease BN used previously.
2. Load the network and use the method *inference.likelihood_weighted_sample* to sample a minimum of 5000 examples.
3. Calculate the CPDs from the samples and compare them with the original ones.
4. Apply different strategies (scores and parameters) to determine a BN structure. Compare it with the original.






