# PGM Advanced Topics

This section describes details and advanced uses of PGMs (and compiled PGMs).

## Overview

The primary module for PGMs is `ck.pgm`, which supports the in-memory creation of probabilistic graphical models.

A probabilistic graphical model (PGM) defines a joint probability distribution over
a set of random variables. Specifically, a `PGM` object represents a factor graph with
discrete random variables.

A random variable is represented by a `RandomVariable` object. Each random variable has a
fixed, finite number of states. Many algorithms will assume at least two states.
Every `RandomVariable` object belongs to exactly one `PGM` object. A random variable
has a name (for human convenience) and its states are indexed by integers, counting
from zero. The states of a random variable may optionally be given names (for human
convenience).

A PGM also has factors, represented by `Factor` objects. Each `Factor` of a `PGM` connects
a set of `RandomVariable` objects of the PGM. In general, the order of the random variables
of a factor is functionally irrelevant, but it matters in practice when operating on `Factor`
objects, so the random variables of a factor form a list. The "shape"
of a factor is the list of the numbers of states of the factor's random variables (co-indexed
with the list of random variables of the factor).

If a `PGM` object is representing a Bayesian network, then each factor represents a conditional
probability table (CPT) and the first random variable of each factor is taken to be the child
random variable, with the remaining random variables being the parents.

Every `Factor` has associated with it a potential function, represented by a `PotentialFunction` object.
A potential function maps each combination of states of the factor's random variables to a value (of type float).
A combination of states of random variables is represented as a `Key`. A `Key` is essentially
a list of state indexes, co-indexed with the factor's random variables.
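
To make shape and keys concrete, the following sketch enumerates every key of a factor over three binary random variables. It uses only the Python standard library; the names are illustrative and nothing here is part of the `ck` API.

```python
from itertools import product

# Hypothetical factor over (cancer, pollution, smoker), each with two states.
rv_names = ['cancer', 'pollution', 'smoker']
shape = (2, 2, 2)  # number of states per random variable, co-indexed with rv_names

# A key holds one state index per random variable of the factor.
keys = list(product(*(range(n) for n in shape)))

print(len(keys))  # 8
print(keys[0], keys[-1])  # (0, 0, 0) (1, 1, 1)
```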

A potential function is a map from all possible keys (according to the potential function's
shape) to a float value. Each potential function has zero or more "parameters" which may be
adjusted to change the potential function's mapping. The parameters of a potential function
are indexed sequentially from zero.

In a simple case, there is a parameter for each key, and each parameter may be independently
set by a user of the potential function. I.e., setting a parameter of a potential function
sets the value of the potential function for that key. However, the relationship between
parameters and keys may be more complicated, and sometimes parameters are not adjustable
by the user. How parameters work for a particular potential function depends on its
concrete subclass of the abstract `PotentialFunction` class.

Each parameter of a potential function is always associated with one or more keys. The value of the
parameter is the value of the potential function for its associated keys. Conversely, each
key of a potential function is associated with zero or one parameters. That is, it is possible
that a potential function maps multiple keys to the same parameter.

If a key of a potential function is associated with a parameter, then the value of
the potential function for that key is the value of the parameter.

If a key of a potential function is not associated with a parameter, then the value of
the potential function for that key is zero. Furthermore, the key is referred to as
"guaranteed-zero", meaning that no change in the parameter values of the potential function
will change the value for that key away from zero.
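
The relationship between keys, parameters and guaranteed-zero keys can be sketched with a plain dictionary. This is an illustrative model only, not the internal representation used by any `PotentialFunction` class; `param_values`, `key_to_param` and `value` are hypothetical names.

```python
# Hypothetical potential function of shape (2, 2) with two parameters.
param_values = [0.25, 0.75]

# Each key maps to at most one parameter index; several keys may share one.
# Key (1, 0) is absent, so it has no parameter and is guaranteed-zero.
key_to_param = {
    (0, 0): 0,
    (0, 1): 1,
    (1, 1): 1,  # shares parameter 1 with key (0, 1)
}


def value(key):
    """Return the potential function value for a key."""
    param_idx = key_to_param.get(key)
    return 0.0 if param_idx is None else param_values[param_idx]


print(value((0, 1)), value((1, 1)), value((1, 0)))  # 0.75 0.75 0.0

# Changing a parameter changes every key that shares it,
# but never moves a guaranteed-zero key away from zero.
param_values[1] = 0.5
print(value((0, 1)), value((1, 1)), value((1, 0)))  # 0.5 0.5 0.0
```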

`RandomVariable` objects are immutable and hashable. States of random variables are
also immutable and hashable.

A `Factor` object is associated with random variables at construction time and that
association cannot change for the lifetime of the object. However, the `PotentialFunction`
associated with a `Factor` may be updated.

Factors may share a `PotentialFunction` object, so long as the factors have the same shape.

`PotentialFunction` objects cannot change their shape, but may be otherwise mutable and
are generally not hashable. A particular class of `PotentialFunction` may allow updating
of: (1) its number of parameters, (2) its parameter values, and (3) its relationship between
parameters and keys.

There are many kinds of potential function.

A `DensePotentialFunction` has exactly one parameter for each possible key (no "guaranteed-zero" keys)
and there are no shared parameters. Internally, a `DensePotentialFunction` is an array that stores
the potential function value for each possible key. I.e., it stores the function value for each possible
combination of states of the random variables.

There is a special class of potential function called a `ZeroPotentialFunction` which
(like `DensePotentialFunction`) has a parameter for each possible key (and thus no
key is "guaranteed-zero"). However, the value of each parameter is set to zero and there
is no mechanism to update these parameter values. A `ZeroPotentialFunction` is the default
potential function for a factor. It gets used as a light-weight placeholder until replaced
by some other potential function. In particular a `ZeroPotentialFunction` behaves exactly
like a `DensePotentialFunction` except parameter values cannot be updated away from the
initial value of zero, which is useful for some machine learning methods.

A `SparsePotentialFunction` only has parameters for explicitly mentioned keys. That is,
if a value for a given key is zero, then it has no parameter associated with the key and is
"guaranteed-zero". However, the value for any key may be set to any float value and the
parameters will be automatically adjusted as needed. Setting the value for a key to zero
disassociates the key from its parameter and thus makes that key "guaranteed-zero".

A `CompactPotentialFunction` is sparse, and keys that share the same value
are represented by a single parameter.
There is one parameter for each unique, non-zero value.
The user may set the value for any key and parameters will
be automatically reconfigured as needed. Setting the value for
a key to zero disassociates the key from its parameter and
thus makes that key "guaranteed-zero".
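
The idea of one parameter per unique non-zero value can be sketched by grouping keys by value. This is a standalone illustration with hypothetical data; it does not show how `CompactPotentialFunction` is implemented.

```python
from collections import defaultdict

# Hypothetical key -> value table for a factor of shape (2, 2).
values = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.5, (1, 1): 2.0}

# One parameter per unique non-zero value; zero-valued keys get no parameter.
keys_by_value = defaultdict(list)
for key, val in values.items():
    if val != 0:
        keys_by_value[val].append(key)

params = sorted(keys_by_value)  # parameter values, indexed from zero
print(params)  # [0.5, 2.0]
print(keys_by_value[0.5])  # [(0, 0), (1, 0)]
```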

A `ClausePotentialFunction` represents a clause from a CNF formula.
For example, a clause over variables X, Y, Z is a disjunction of the form 'X=x or Y=y or Z=z'.
A clause potential function is guaranteed zero for a key where the clause is false,
i.e., when 'X != x and Y != y and Z != z'.
For keys where the clause is true, the value of the potential function
is given by the only parameter of the potential function. That parameter
is called the clause 'weight' and is notionally 1.
The weight of a clause is permitted to be zero, but that is _not_ equivalent to
guaranteed-zero.
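
As a rough illustration, the sketch below evaluates a clause over three binary variables and lists the keys where it is false. The clause, the `weight` value and the helper `clause_value` are hypothetical and independent of the `ck` API.

```python
from itertools import product

# Hypothetical clause 'X=1 or Y=0 or Z=1' over three binary random variables.
clause_states = (1, 0, 1)
weight = 1.0  # the single parameter of the potential function


def clause_value(key):
    """Weight where at least one literal holds; zero where the clause is false."""
    satisfied = any(state == target for state, target in zip(key, clause_states))
    return weight if satisfied else 0.0


false_keys = [key for key in product(range(2), repeat=3) if clause_value(key) == 0.0]
print(false_keys)  # [(0, 1, 0)]
```

With binary variables there is exactly one key that falsifies the clause: the one where every variable takes the state the clause does not mention.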

A `CPTPotentialFunction` implements a sparse Conditional Probability Table (CPT).
The first random variable in the signature is the child, and the remaining random
variables are parents. For each instantiation of the parent random variables there is a
conditional probability distribution (CPD) over the states of the child random variable.
If a CPD is not provided for a parent instantiation, then that parent instantiation
is taken to have probability zero (i.e., all values of the CPD are "guaranteed-zero").
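
A sparse CPT can be pictured as a mapping from parent instantiations to CPDs, where a missing entry means zero everywhere. The following is a minimal sketch under that assumption; `cpds` and `cpt_value` are hypothetical names, not `ck` functions.

```python
# Hypothetical sparse CPT for a binary child with two binary parents.
# Only some parent instantiations have a CPD.
cpds = {
    (0, 0): (0.03, 0.97),
    (1, 0): (0.05, 0.95),
}


def cpt_value(child_state, parent_key):
    """Return the CPT value, or zero if no CPD was provided for the parents."""
    cpd = cpds.get(parent_key)
    return 0.0 if cpd is None else cpd[child_state]


print(cpt_value(0, (1, 0)))  # 0.05
print(cpt_value(0, (1, 1)))  # 0.0
```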

Each `RandomVariable` has an index (`idx`) which is a sequence number, starting from zero,
indicating when that `RandomVariable` was added to its `PGM`. Random variables cannot
be removed from a PGM once added, so for some random variable, `rv`, it is always true
that `rv.pgm.rvs[rv.idx] is rv`.

A `RandomVariable` can be treated as a sequence of `Indicator` objects that are
co-indexed with the states of the random variable. Each `Indicator` object represents
a particular random variable being in a particular state.
`Indicator` objects are immutable and hashable. They only record the random variable's
index and the state's index. Thus, if a PGM is copied, then the indicators of the
copy are functionally interchangeable with the indicators of the original.
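
Why indicators from a copied PGM are interchangeable can be seen with a small stand-in class. `SketchIndicator` below is hypothetical; it only mirrors the two fields shown in the indicator `repr` later in this section and is not the `ck` class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchIndicator:
    """Hypothetical stand-in for an indicator: just two indexes."""
    rv_idx: int
    state_idx: int


from_original = SketchIndicator(rv_idx=0, state_idx=1)
from_copy = SketchIndicator(rv_idx=0, state_idx=1)

# Equality and hashing depend only on the two indexes,
# not on which PGM object the indicator came from.
print(from_original == from_copy)  # True
print(len({from_original, from_copy}))  # 1
```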

Each `Factor` has an index (`idx`) which is a sequence number, starting from zero,
indicating when that Factor was added to its PGM. Factors cannot
be removed from a PGM once added, so for some factor, `f`, it is always true
that `f.pgm.factors[f.idx] is f`.


## PGM name ##

In CK it is possible to give a PGM a name.

In [1]:
from ck.pgm import PGM

pgm = PGM('cancer')

print(pgm.name)

cancer


## State names ##

The states of random variables can also be given names, either as a tuple or list. State names can be a mix of types: `int`, `str`, `bool`, `float`, or `None`.
In fact, the default names are integers, 0, 1, ..., $n - 1$, for $n$ number of states.

In [2]:
pollution = pgm.new_rv('pollution', ('low', 'high'))
smoker = pgm.new_rv('smoker', ('true', 'false'))
cancer = pgm.new_rv('cancer', ('true', 'false'))
xray = pgm.new_rv('xray', ('positive', 'negative'))
dyspnoea = pgm.new_rv('dyspnoea', ('true', 'false'))


Remember that a random variable behaves like a list of indicators...

In [3]:
len(pollution)

2

In [4]:
list(pollution)

[Indicator(rv_idx=0, state_idx=0), Indicator(rv_idx=0, state_idx=1)]

It is possible to directly access the indicators of a random variable...

In [5]:
pollution.indicators

(Indicator(rv_idx=0, state_idx=0), Indicator(rv_idx=0, state_idx=1))

We also have access to the state names of a random variable.

In [6]:
pollution.states

('low', 'high')

It is possible to access a random variable's indicators by state index or state name...

In [7]:
pollution[0]  # access by state index using square brackets

Indicator(rv_idx=0, state_idx=0)

In [8]:
pollution('low')  # access by state name using round brackets

Indicator(rv_idx=0, state_idx=0)

State names are also used when pretty-printing indicators.

In [9]:
pgm.indicator_str(cancer('true'), smoker('false'))

'cancer=true, smoker=false'

## Random variable index and offset ##

Every random variable has an index, which is its location in the PGM array of random variables.

In [10]:
print(pollution.idx, smoker.idx, cancer.idx, xray.idx, dyspnoea.idx)

0 1 2 3 4


The index of a random variable says where it appears in its PGM's `rvs` array.

In [11]:
print([str(rv) for rv in pgm.rvs])

['pollution', 'smoker', 'cancer', 'xray', 'dyspnoea']


The `offset` of a random variable is the sum of the lengths of all random variables that have a lower index. This is useful when the indicators of a PGM are laid out in random variable order: the indicators of a random variable `rv` then occupy positions `rv.offset` to `rv.offset + len(rv) - 1`, inclusive.

In [12]:
print(pollution.offset, smoker.offset, cancer.offset, xray.offset, dyspnoea.offset)

0 2 4 6 8
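
The offsets above can be reproduced from the state counts alone. This sketch uses only the standard library and assumes, as in this example, that every random variable has two states.

```python
from itertools import accumulate

# Number of states of each random variable, in index order.
num_states = [2, 2, 2, 2, 2]

# The offset of each random variable is the total number of states before it.
offsets = [0] + list(accumulate(num_states))[:-1]
print(offsets)  # [0, 2, 4, 6, 8]

# Indicator positions for the random variable at index 2 ('cancer').
idx = 2
print(list(range(offsets[idx], offsets[idx] + num_states[idx])))  # [4, 5]
```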


## Advanced use of WMCProgram ##

Let's add some factors to the PGM and compile it to a `WMCProgram`.

In [13]:
pgm.new_factor(pollution).set_cpt().set_cpd((), (0.9, 0.1))
pgm.new_factor(smoker).set_cpt().set_cpd((), (0.3, 0.7))
pgm.new_factor(cancer, pollution, smoker).set_cpt().set(
    ((0, 0), (0.03,  0.97)),
    ((1, 0), (0.05,  0.95)),
    ((0, 1), (0.001, 0.999)),
    ((1, 1), (0.02,  0.98)),
)
pgm.new_factor(xray, cancer).set_cpt().set(
    (0, (0.9, 0.1)),
    (1, (0.2, 0.8)),
)
pgm.new_factor(dyspnoea, cancer).set_cpt().set(
    (0, (0.65, 0.35)),
    (1, (0.3,  0.7)),
)


<ck.pgm.CPTPotentialFunction at 0x1e96ebc64b0>

In [14]:
from ck.pgm_circuit.wmc_program import WMCProgram
from ck.pgm_compiler import DEFAULT_PGM_COMPILER as pgm_compiler

wmc = WMCProgram(pgm_compiler(pgm))

State names can make probability queries more intuitive.

In [15]:
wmc.probability(cancer('true'), condition=smoker('false'))

0.0029000000000000002
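
This value can be checked by hand from the CPTs entered above. The sketch below does the arithmetic in plain Python, without calling `ck`; it relies on pollution and smoker both being parentless in this network, so conditioning on smoker does not change the distribution of pollution.

```python
# P(pollution), states ('low', 'high').
p_pollution = (0.9, 0.1)

# P(cancer='true' | pollution, smoker='false'),
# taken from the CPT rows whose smoker index is 1.
p_cancer_true = {0: 0.001, 1: 0.02}

# Marginalise pollution out.
pr = sum(p_pollution[p] * p_cancer_true[p] for p in (0, 1))
print(round(pr, 10))  # 0.0029
```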

A `WMCProgram` uses a `ProbabilityMixin` to provide many additional queries based on probabilities.

For example, consider a marginal distribution, which is returned as a numpy array...

In [16]:
wmc.marginal_distribution(cancer)

array([0.01163, 0.98837])

Let's make that more pretty...

In [17]:
for state, pr in zip(cancer.states, wmc.marginal_distribution(cancer)):
    print(f'P({cancer}={state}) is {pr}')

P(cancer=true) is 0.01163
P(cancer=false) is 0.98837


MAP calculations are also possible using functionality from `ProbabilityMixin`.

In [18]:
pr, states = wmc.map(cancer, xray, condition=smoker('true'))
pgm.indicator_str(cancer[states[0]], xray[states[1]]) + f' with probability {pr}'

'cancer=false, xray=negative with probability 0.7744'
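
For a model this small, the MAP result can be confirmed by brute force. The following sketch scores every (cancer, xray) combination given `smoker='true'` using the CPT values entered earlier; it is a cross-check in plain Python, not a description of how `wmc.map` works internally.

```python
from itertools import product

# P(cancer | smoker='true') with pollution marginalised out; states ('true', 'false').
p_cancer = (0.9 * 0.03 + 0.1 * 0.05, 0.9 * 0.97 + 0.1 * 0.95)

# P(xray | cancer), indexed as p_xray[cancer_state][xray_state].
p_xray = ((0.9, 0.1), (0.2, 0.8))

# Score every (cancer, xray) combination and keep the most probable one.
best = max(
    product(range(2), range(2)),
    key=lambda states: p_cancer[states[0]] * p_xray[states[0]][states[1]],
)
best_pr = p_cancer[best[0]] * p_xray[best[0]][best[1]]
print(best, round(best_pr, 10))  # (1, 1) 0.7744
```

State indexes `(1, 1)` correspond to `cancer=false, xray=negative`.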

Many other probabilistic calculations are possible.

In [19]:
print('correlation =', wmc.correlation(cancer[0], smoker[0]))
print('total_correlation =', wmc.total_correlation(cancer, smoker))
print('entropy =', wmc.entropy(cancer))
print('conditional entropy =', wmc.conditional_entropy(cancer, smoker))
print('joint entropy =', wmc.joint_entropy(cancer, smoker))
print('mutual information =', wmc.mutual_information(cancer, smoker))
print('covariant normalised mutual information =', wmc.covariant_normalised_mutual_information(cancer, smoker))
print('uncertainty =', wmc.uncertainty(cancer, smoker))
print('symmetric uncertainty =', wmc.symmetric_uncertainty(cancer, smoker))
print('information quality ratio =', wmc.iqr(cancer, smoker))



correlation = 0.12438070141828558
total_correlation = 0.11027571817587367
entropy = 0.09141503487673329
conditional entropy = 0.08133417625362895
joint entropy = 0.9626250754843215
mutual information = 0.010080858623104302
covariant normalised mutual information = 0.03551641048843064
uncertainty = 0.011438741319017601
symmetric uncertainty = 0.020727453734215563
information quality ratio = 0.010472258493819478
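
Some of these values can be cross-checked from the joint distribution of `cancer` and `smoker`. The sketch below assumes entropies are measured in bits (base-2 logarithms), which is consistent with the numbers printed above; it does not use `ck`.

```python
from math import log2

# P(smoker) and P(cancer='true' | smoker), with pollution marginalised out.
p_smoker = (0.3, 0.7)
p_cancer_true = (0.032, 0.0029)

# Joint distribution, indexed joint[cancer_state][smoker_state].
joint = [
    [p_smoker[s] * p_cancer_true[s] for s in (0, 1)],
    [p_smoker[s] * (1 - p_cancer_true[s]) for s in (0, 1)],
]


def entropy(probabilities):
    """Shannon entropy in bits, skipping zero probabilities."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


h_cancer = entropy([sum(row) for row in joint])
h_smoker = entropy(p_smoker)
h_joint = entropy([p for row in joint for p in row])

print(round(h_cancer, 6))  # entropy: 0.091415
print(round(h_joint - h_smoker, 6))  # conditional entropy: 0.081334
print(round(h_cancer + h_smoker - h_joint, 6))  # mutual information: 0.010081
```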


There are two underlying methods used for many probabilistic queries.

The first is `wmc` which provides the weight of worlds matching given indicators.

In [20]:
wmc.wmc(cancer[0], smoker[0])

0.0096

The second is `z` which returns the summed weight of all possible worlds.

In this case `z` is 1, but that is not always the case for a PGM.

In [21]:
wmc.z

1.0
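
Conceptually, the weight of a world is the product of all factor values for that world, `z` sums the weights of all worlds, and `wmc` sums only the worlds that match the given indicators. The sketch below enumerates all 32 worlds of the example network by brute force. It is a conceptual illustration in plain Python, not how a compiled `WMCProgram` evaluates these quantities.

```python
from itertools import product

# Random variable order: pollution, smoker, cancer, xray, dyspnoea (all binary).
# Each factor: (indexes of its random variables, function from their states to a value).
cancer_cpt = {(0, 0): (0.03, 0.97), (1, 0): (0.05, 0.95),
              (0, 1): (0.001, 0.999), (1, 1): (0.02, 0.98)}
factors = [
    ((0,), lambda p: (0.9, 0.1)[p]),
    ((1,), lambda s: (0.3, 0.7)[s]),
    ((2, 0, 1), lambda c, p, s: cancer_cpt[(p, s)][c]),
    ((3, 2), lambda x, c: ((0.9, 0.1), (0.2, 0.8))[c][x]),
    ((4, 2), lambda d, c: ((0.65, 0.35), (0.3, 0.7))[c][d]),
]


def weight(world):
    """Product of all factor values for one world (one state per random variable)."""
    result = 1.0
    for rv_idxs, fn in factors:
        result *= fn(*(world[i] for i in rv_idxs))
    return result


worlds = list(product(range(2), repeat=5))
z = sum(weight(w) for w in worlds)
# Worlds matching cancer[0] and smoker[0].
wmc_value = sum(weight(w) for w in worlds if w[2] == 0 and w[1] == 0)

print(round(z, 10), round(wmc_value, 10))  # 1.0 0.0096
```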

## Extra PGM methods ##


A PGM (and its related objects) also has other useful methods.

Here are the factors of the PGM...

In [22]:
pgm.factors

(<ck.pgm.Factor at 0x1e96ebc5d30>,
 <ck.pgm.Factor at 0x1e94febac30>,
 <ck.pgm.Factor at 0x1e96ebc6d80>,
 <ck.pgm.Factor at 0x1e96ebc6e10>,
 <ck.pgm.Factor at 0x1e96ebc6300>)

In [23]:
for factor in pgm.factors:
    print(factor)

('pollution')
('smoker')
('cancer', 'pollution', 'smoker')
('xray', 'cancer')
('dyspnoea', 'cancer')


We can iterate over all the factors connected to a random variable...

In [24]:
for factor in pollution.factors():
    print(factor)

('pollution')
('cancer', 'pollution', 'smoker')


We can get the Markov blanket of a random variable, which is the set of random variables directly connected to it by a factor.

In [25]:
pollution.markov_blanket()

{<ck.pgm.RandomVariable at 0x1e94feba030>,
 <ck.pgm.RandomVariable at 0x1e94febb020>}

In [26]:
for rv in pollution.markov_blanket():
    print(rv)

smoker
cancer
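
Under the definition given above, a Markov blanket can be derived from the factor scopes alone. The sketch below does this with plain tuples of names; `factor_rvs` and this standalone `markov_blanket` function are illustrative and separate from the `RandomVariable.markov_blanket` method.

```python
# Random variables of each factor, as added to the PGM above.
factor_rvs = [
    ('pollution',),
    ('smoker',),
    ('cancer', 'pollution', 'smoker'),
    ('xray', 'cancer'),
    ('dyspnoea', 'cancer'),
]


def markov_blanket(rv):
    """Random variables that share at least one factor with rv, excluding rv itself."""
    blanket = set()
    for rvs in factor_rvs:
        if rv in rvs:
            blanket.update(rvs)
    blanket.discard(rv)
    return blanket


print(sorted(markov_blanket('pollution')))  # ['cancer', 'smoker']
print(sorted(markov_blanket('cancer')))  # ['dyspnoea', 'pollution', 'smoker', 'xray']
```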
