<a href="https://colab.research.google.com/github/mahault/eduplate/blob/main/Eduplate_toy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
!pip install pgmpy

Collecting pgmpy
  Downloading pgmpy-0.1.25-py3-none-any.whl (2.0 MB)

1. Define the Bayesian Network
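Based on the provided graph, the variables are listed below, with the node name used in the code given in parentheses:

* Active Ingredient (`ActiveIng`)
* Food (`Food`)
* Quantity (`Quantity`)
* Effectiveness (`Effectiveness`)
* Types of Effects (`TypesOfEffects`)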

We will use the pgmpy library to define the Bayesian Network.
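
As a worked equation, here is the joint distribution that this network encodes. It assumes the edge structure declared in the model definition: Active Ingredient, Food, and Quantity each point to Effectiveness, and Effectiveness points to Types of Effects. The single-letter symbols are shorthand introduced here, not names from the notebook.

```latex
% A = ActiveIng, F = Food, Q = Quantity, E = Effectiveness, T = TypesOfEffects
P(A, F, Q, E, T) = P(A)\, P(F)\, P(Q)\, P(E \mid A, F, Q)\, P(T \mid E)
```

Each factor on the right corresponds to one conditional probability table (CPD) that the model needs: three root priors and two conditional tables.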



In [5]:
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

# Define the structure of the Bayesian Network
model = BayesianNetwork([('ActiveIng', 'Effectiveness'),
                         ('Food', 'Effectiveness'),
                         ('Quantity', 'Effectiveness'),
                         ('Effectiveness', 'TypesOfEffects')])

# Define the CPDs (using dummy probabilities for illustration)
cpd_active_ing = TabularCPD(variable='ActiveIng', variable_card=2, values=[[0.7], [0.3]])
cpd_food = TabularCPD(variable='Food', variable_card=2, values=[[0.6], [0.4]])
cpd_quantity = TabularCPD(variable='Quantity', variable_card=2, values=[[0.5], [0.5]])

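# Columns follow the evidence order below, with the last variable (Quantity)
# varying fastest: (A=0,F=0,Q=0), (A=0,F=0,Q=1), (A=0,F=1,Q=0), ..., (A=1,F=1,Q=1).
# Row 0 is P(Effectiveness=0 | ...), row 1 is P(Effectiveness=1 | ...).
cpd_effectiveness = TabularCPD(variable='Effectiveness', variable_card=2,
                               values=[[0.95, 0.8, 0.8, 0.6, 0.8, 0.6, 0.6, 0.2],
                                       [0.05, 0.2, 0.2, 0.4, 0.2, 0.4, 0.4, 0.8]],
                               evidence=['ActiveIng', 'Food', 'Quantity'],
                               evidence_card=[2, 2, 2])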

cpd_types_of_effects = TabularCPD(variable='TypesOfEffects', variable_card=2,
                                  values=[[0.9, 0.8],  # P(TypesOfEffects=0 | Effectiveness=0) and P(TypesOfEffects=0 | Effectiveness=1)
                                          [0.1, 0.2]], # P(TypesOfEffects=1 | Effectiveness=0) and P(TypesOfEffects=1 | Effectiveness=1)
                                  evidence=['Effectiveness'],
                                  evidence_card=[2])

# Add the CPDs to the model
model.add_cpds(cpd_active_ing, cpd_food, cpd_quantity, cpd_effectiveness, cpd_types_of_effects)

# Check if the model is valid
assert model.check_model()

# Define the inference object
inference = VariableElimination(model)
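
The eight columns of the Effectiveness CPD are easy to misread, so here is a minimal sketch that spells out which parent configuration each column belongs to. It assumes the column layout described in the pgmpy `TabularCPD` documentation, where the last evidence variable varies fastest. It uses only the standard library and copies the values from the cell above; the variable names `evidence`, `effectiveness_values`, and `columns` are chosen for this illustration.

```python
from itertools import product

# Evidence order and CPD values copied from the model definition above.
evidence = ["ActiveIng", "Food", "Quantity"]
effectiveness_values = [
    [0.95, 0.8, 0.8, 0.6, 0.8, 0.6, 0.6, 0.2],  # P(Effectiveness=0 | ...)
    [0.05, 0.2, 0.2, 0.4, 0.2, 0.4, 0.4, 0.8],  # P(Effectiveness=1 | ...)
]

# Assumed layout: the last evidence variable varies fastest across columns.
columns = list(product([0, 1], repeat=len(evidence)))

for index, states in enumerate(columns):
    p0 = effectiveness_values[0][index]
    p1 = effectiveness_values[1][index]
    # Each column is a conditional distribution, so it must sum to 1.
    assert abs(p0 + p1 - 1.0) < 1e-9
    label = ", ".join(f"{name}={state}" for name, state in zip(evidence, states))
    print(f"column {index}: {label} -> P(E=0)={p0}, P(E=1)={p1}")
```

Under this layout, the configuration ActiveIng=1, Food=0, Quantity=1 is column 5.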

2. Data Preprocessing

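Extract the relevant facts from the transcript and encode them as evidence for the network. For simplicity, assume the extraction step has already produced a dictionary that maps each observed variable to a state index.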

In [6]:
# Example observations extracted from the transcript
observations = {'ActiveIng': 1, 'Food': 0, 'Quantity': 1}
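
One way the extracted facts could be turned into such a dictionary is sketched below. The state labels (`"absent"`, `"present"`, and so on), the `STATE_INDEX` table, and the `facts_to_observations` helper are all hypothetical: the notebook only uses the indices 0 and 1 and does not say what they mean.

```python
# Hypothetical label-to-index maps; the state meanings are assumptions,
# since the notebook only uses the indices 0 and 1.
STATE_INDEX = {
    "ActiveIng": {"absent": 0, "present": 1},
    "Food": {"without_food": 0, "with_food": 1},
    "Quantity": {"low": 0, "high": 1},
}


def facts_to_observations(facts):
    """Convert {variable: label} facts into {variable: state index}."""
    result = {}
    for variable, label in facts.items():
        if variable not in STATE_INDEX:
            raise KeyError(f"Unknown variable: {variable}")
        if label not in STATE_INDEX[variable]:
            raise ValueError(f"Unknown label {label!r} for {variable}")
        result[variable] = STATE_INDEX[variable][label]
    return result


# Hypothetical facts pulled from a transcript.
extracted_facts = {"ActiveIng": "present", "Food": "without_food", "Quantity": "high"}
mapped = facts_to_observations(extracted_facts)
print(mapped)
```

Rejecting unknown variables and labels early keeps a typo in the extraction step from silently becoming wrong evidence.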


3. Perform Inference and Generate Explanations

Query the network for the posterior distributions of Effectiveness and TypesOfEffects given the observations, then turn the results into plain-language explanations.

In [7]:
# Perform inference to find the probability of Effectiveness
prob_effectiveness = inference.query(variables=['Effectiveness'], evidence=observations)
print(f"Probability of Effectiveness: {prob_effectiveness}")

# Perform inference to find the probability of Types of Effects
prob_types_of_effects = inference.query(variables=['TypesOfEffects'], evidence=observations)
print(f"Probability of Types of Effects: {prob_types_of_effects}")

# Generate explanations
# Both queries condition on the observations; Effectiveness is not observed
# in the second query, it is summed out.
explanation_effectiveness = f"Given the Active Ingredient, Food, and Quantity, the probability of Effectiveness being high is {prob_effectiveness.values[1]:.2f}."
explanation_types_of_effects = f"Given the Active Ingredient, Food, and Quantity, the probability of having significant Types of Effects is {prob_types_of_effects.values[1]:.2f}."

print(f"Explanation for Effectiveness: {explanation_effectiveness}")
print(f"Explanation for Types of Effects: {explanation_types_of_effects}")


Probability of Effectiveness: +------------------+----------------------+
| Effectiveness    |   phi(Effectiveness) |
+==================+======================+
| Effectiveness(0) |               0.6000 |
+------------------+----------------------+
| Effectiveness(1) |               0.4000 |
+------------------+----------------------+
Probability of Types of Effects: +-------------------+-----------------------+
| TypesOfEffects    |   phi(TypesOfEffects) |
+===================+=======================+
| TypesOfEffects(0) |                0.8600 |
+-------------------+-----------------------+
| TypesOfEffects(1) |                0.1400 |
+-------------------+-----------------------+
Explanation for Effectiveness: Given the Active Ingredient, Food, and Quantity, the probability of Effectiveness being high is 0.40.
Explanation for Types of Effects: Given the Active Ingredient, Food, and Quantity, the probability of having significant Types of Effects is 0.14.
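
The two posteriors above can be checked by hand, without pgmpy. The sketch below is a plain-Python cross-check, not how pgmpy computes the result internally. It assumes the CPD column layout from the pgmpy documentation (last evidence variable varies fastest); the names `p_eff`, `p_types`, `obs`, and `col` are introduced here for illustration. Because all three parents of Effectiveness are observed, its posterior is a single CPD column, and TypesOfEffects follows by summing over Effectiveness.

```python
# CPD values copied from the model definition; columns are assumed to follow
# pgmpy's layout, with the last evidence variable (Quantity) varying fastest.
p_eff = [
    [0.95, 0.8, 0.8, 0.6, 0.8, 0.6, 0.6, 0.2],  # P(Effectiveness=0 | A, F, Q)
    [0.05, 0.2, 0.2, 0.4, 0.2, 0.4, 0.4, 0.8],  # P(Effectiveness=1 | A, F, Q)
]
p_types = [
    [0.9, 0.8],  # P(TypesOfEffects=0 | Effectiveness=0), (... | Effectiveness=1)
    [0.1, 0.2],  # P(TypesOfEffects=1 | Effectiveness=0), (... | Effectiveness=1)
]

obs = {"ActiveIng": 1, "Food": 0, "Quantity": 1}

# Column index for binary evidence in the order ActiveIng, Food, Quantity.
col = obs["ActiveIng"] * 4 + obs["Food"] * 2 + obs["Quantity"]

# All parents of Effectiveness are observed, so its posterior is one CPD column.
post_eff = [p_eff[e][col] for e in (0, 1)]

# TypesOfEffects depends on the evidence only through Effectiveness: sum it out.
post_types = [sum(p_types[t][e] * post_eff[e] for e in (0, 1)) for t in (0, 1)]

print([round(p, 4) for p in post_eff])
print([round(p, 4) for p in post_types])
```

The rounded values agree with the tables printed by `inference.query`: 0.6 and 0.4 for Effectiveness, 0.86 and 0.14 for TypesOfEffects.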
