# Create node Conditional Probability Tables (CPTs) from the classification tree splits

The classification tree generates counts for each conjunction of splits. Together, these counts determine the full joint distribution over the model's uncertain variables.

JMA 24 March 2025


In [1]:
# Imports from the Python standard library
import math, re, os, sys
from pathlib import Path
import itertools            # to flatten lists

# Import array and dataframe packages
import numpy as np
from numpy.random import default_rng
# import numpy.linalg as la
import pandas as pd
import torch

In [None]:
# TODO: Parse the classification tree output to obtain the counts.

## Use CART's leaf node counts to estimate the variables' joint probability distribution

The [peach, lemon] counts in the four leaf nodes, once normalized, estimate the joint probability of the discrete variables defined by the splitting thresholds that CART generated.
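As an illustration of how such a count tensor could be tabulated, here is a minimal sketch. It assumes the training records are available as a DataFrame whose features have already been thresholded at the tree's splits. The toy records and the column names `ignition`, `carb`, and `y` are hypothetical and are not the data behind the counts used below.

```python
import pandas as pd
import torch

# Hypothetical toy records: one row per training sample, with each feature
# already reduced to 0/1 by the tree's splitting threshold.
records = pd.DataFrame({
    "ignition": [0, 0, 1, 1, 1, 0],
    "carb":     [0, 1, 0, 0, 1, 0],
    "y":        [0, 1, 1, 1, 0, 0],
})

names = ["ignition", "carb", "y"]
# Enumerate every (ignition, carb, y) combination so empty cells get a zero count
full_index = pd.MultiIndex.from_product([[0, 1]] * 3, names=names)
counts = records.groupby(names).size().reindex(full_index, fill_value=0)

# The index is in row-major order, so a reshape gives dims (ignition, carb, y)
toy_cnts = torch.tensor(counts.to_numpy().reshape(2, 2, 2))
print(toy_cnts)
```

Reindexing against the full product of levels keeps the tensor shape fixed even when some combination never occurs in the data.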

In [None]:
# dims: (ignition, carb, y).  
# y == state variable
joint_cnts = torch.tensor([[[304, 6], [2, 103]], [ [7, 431],[142,5]]])
# Normalize the counts by the number of training samples total
joint_p = joint_cnts / joint_cnts.sum()
'Joint  p; ', joint_p 

('Joint  p; ',
 tensor([[[0.3040, 0.0060],
          [0.0020, 0.1030]],
 
         [[0.0070, 0.4310],
          [0.1420, 0.0050]]]))

In [None]:
# The implicit prior -- the state empirical distribution. 
Py = joint_p.sum(axis=(0,1))
'Raw prior, y: ', Py

('Raw prior, y: ', tensor([0.4550, 0.5450]))

### Apply a "likelihood message" to the prior

The prior on the state variable (lemon, peach) can be adjusted when learning the classification tree so that it matches the actual prior belief, regardless of the class imbalance in the training set.

Computationally, this is the Bayes network analog of sending a likelihood message to the prior distribution of the joint probability. Equivalently, this can be done by setting the prior through the `parms` argument of `rpart()`.

In [None]:
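# Adjust the prior: scale state 0 so that its prior probability becomes 0.2
adjustment0 = 0.2 / Py[0]
# Scale state 1 so that the adjusted prior still sums to 1
adjustment1 = (1- Py[0]*adjustment0)/Py[1]
# Per-state multipliers (the "likelihood message")
adjustment = torch.tensor([adjustment0.item(), adjustment1.item()])
'New prior: ',adjustment * Py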

('New prior: ', tensor([0.2000, 0.8000]))

In [6]:
# Apply the adjustment to the joint
new_joint_p = joint_p * adjustment.expand(2,2,2)
# Check the new prior
new_joint_p.sum(axis=(0,1))

tensor([0.2000, 0.8000])
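The prior adjustment generalizes to any target prior. The following is a minimal sketch rather than part of the original analysis: the helper name `reweight_prior` is hypothetical, and it assumes the state variable is the last tensor dimension and that every state has nonzero probability in the original joint.

```python
import torch

def reweight_prior(joint: torch.Tensor, target_prior: torch.Tensor) -> torch.Tensor:
    """Rescale a joint so its marginal over the last dimension equals target_prior.

    Hypothetical helper: assumes the state variable y is the last dimension.
    """
    # Current marginal over y: sum out every dimension except the last
    current_prior = joint.sum(dim=tuple(range(joint.dim() - 1)))
    # Likelihood message: one multiplier per state, broadcast over the last dimension
    message = target_prior / current_prior
    return joint * message

# Counts from the four leaf nodes, dims: (ignition, carb, y)
joint_cnts = torch.tensor([[[304, 6], [2, 103]], [[7, 431], [142, 5]]])
joint_p = joint_cnts / joint_cnts.sum()

adjusted = reweight_prior(joint_p, torch.tensor([0.2, 0.8]))
print(adjusted.sum(dim=(0, 1)))  # marginal over y after the adjustment
```

Because the multipliers are target-to-current ratios, the adjusted joint still sums to 1 whenever the target prior does.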

### Conditioning the joint on the features yields the posterior consistent with the prior:

$P(y \mid ig, carb)$

The "features" are the independent variables, "ignition_test" and "carburator_test".

In [7]:
# Condition to get p(y | g, c)
# Note, this is just the conditional probabilities at the node leaves. 
y_norm = new_joint_p.sum(2)
Py_given_gc = new_joint_p / y_norm.expand(2,2,2).permute(1,2,0)
Py_given_gc

tensor([[[0.9382, 0.0618],
         [0.0058, 0.9942]],

        [[0.0048, 0.9952],
         [0.8948, 0.1052]]])
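The same conditioning can be written without `expand` and `permute`. This is a sketch of an alternative, assuming the same counts and the same target prior of [0.2, 0.8]; the name `Py_given_gc_alt` is introduced here only for illustration.

```python
import torch

# Counts from the four leaf nodes, dims: (ignition, carb, y)
joint_cnts = torch.tensor([[[304, 6], [2, 103]], [[7, 431], [142, 5]]])
joint_p = joint_cnts / joint_cnts.sum()

# Re-weight so the prior on y is [0.2, 0.8]
Py = joint_p.sum(dim=(0, 1))
new_joint_p = joint_p * (torch.tensor([0.2, 0.8]) / Py)

# keepdim=True leaves a size-1 y axis, so the division broadcasts over y
Py_given_gc_alt = new_joint_p / new_joint_p.sum(dim=2, keepdim=True)

# Each (ignition, carb) cell should now be a distribution over y
print(Py_given_gc_alt.sum(dim=2))
```

Keeping the summed axis as size 1 lets broadcasting line up the normalizer with the joint, which avoids having to track the axis order by hand.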

### Complete factoring the joint

Condition to obtain the CPT for "ignition_test":

$P(ig \mid carb)$

In [8]:
# Condition to get P(g | c)
# Sum out y: remaining dimensions are (g, c)
Pgc = new_joint_p.sum(2)
# Normalize over g (dim 0) so that each column, one per value of c, sums to 1
c_norm = Pgc.sum(0)
Pg_given_c = Pgc / c_norm
Pg_given_c

tensor([[0.1830, 0.6855],
        [0.8170, 0.3145]])

## The full factorization of the three model variables is

$P(y, ig, carb) = P(y \mid ig, carb)\, P(ig \mid carb)\, P(carb)$

The factors are the CPTs for the nodes in the influence diagram. 

Here is the final term in the factorization:

In [9]:
# Find the preposterior (marginal) on c by summing out g and y
Pc = new_joint_p.sum(axis=(0,2))
Pc

tensor([0.7782, 0.2218])
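A useful sanity check on any such factorization is that the product of the three factors reproduces the joint. The sketch below recomputes everything from the leaf counts so that it stands alone. The variable names are chosen for this illustration, and the target prior of [0.2, 0.8] is the one assumed earlier.

```python
import torch

# Counts from the four leaf nodes, dims: (ignition, carb, y)
joint_cnts = torch.tensor([[[304, 6], [2, 103]], [[7, 431], [142, 5]]])
joint_p = joint_cnts / joint_cnts.sum()

# Re-weight so the prior on y is [0.2, 0.8]
Py = joint_p.sum(dim=(0, 1))
joint = joint_p * (torch.tensor([0.2, 0.8]) / Py)

# The three factors of P(y, ig, carb)
P_c = joint.sum(dim=(0, 2))                # P(carb), shape (c,)
P_gc = joint.sum(dim=2)                    # P(ig, carb), shape (g, c)
P_g_given_c = P_gc / P_c                   # P(ig | carb); divides column c by P(carb = c)
P_y_given_gc = joint / P_gc.unsqueeze(-1)  # P(y | ig, carb), shape (g, c, y)

# Multiply the factors back together: (g, c, y) * (g, c, 1) * (1, c, 1)
rebuilt = P_y_given_gc * P_g_given_c.unsqueeze(-1) * P_c.view(1, -1, 1)
print(torch.allclose(rebuilt, joint))
```

If the conditional for "ignition_test" were normalized over the wrong axis, the rebuilt tensor would not match the joint, so this check catches axis mix-ups.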