Implementation of a 'function' container class containing a numpy array and knowledge about its variables.
It was developed to simplify calculus with multi-dimensional tensors/matrices. For example, when modelling discrete probability distributions of multiple variables in terms of multi-dimensional arrays, then multiplication and addition requires to specify the axes that correspond to the same dimensions. When using the class func
defined in pr_func.py
we only have to specify the variables for each distribution, and then do the calculations naively by using the standard operators.
Here is an example of how to calculate the conditional probability p(x|y,z)
from given distributions p(z|x,y)
and p(x,y)
, i.e. p(x|y,z) = p(z|x,y)*p(x,y)/p(y,z)
, where p(y,z) = sum_x p(z|x,y)*p(x,y)
.
The numpy variant using the powerful einsum
method would be:
X,Y,Z = 10,15,20 # fixing the possible number of values for each of the random variables
pxy = np.ones((X,Y))/(X*Y) # uniform distributions for simplicty
pz_xy = np.ones((X,Y,Z))/Z # normalizing a non-uniform distribution would require another sum/einsum
px_yz = np.einsum('ijk,ij,jk->ijk', pz_xy, pxy, 1.0/np.einsum('ijk,ij->jk',pz_xy, pxy)
And here is the pr_func variant:
pr.set_dims([('x',10),('y',15),('z',20)]) # setting up the dimensions
pxy = pr.func(vars=['x','y'], val='unif').normalize() # an instance of `func` depending on x and y
pz_xy = pr.func(vars=['x','y','z'], val='unif').normalize(['z']) # ... on x,y,z, and normalizing
px_yz = pz_xy*pxy/pr.sum(pz_xy*pxy,['x']) # simple multiplication, division, and sums
As we can see, the setup requires to define the dimensions, but then the distributions can be multiplied like numbers, since they are instances of the pr_func
class knowing which dimensions correspond to each other. Note that, since we implemented a normalize()
method, the last row of the pr_func
variant could have been even simpler:
px_yz = (pz_xy*pxy).normalize(['x'])
Examples of how to use pr_func
for efficiently implementing Blahut-Arimoto type algorithms can be found here.
import pr_func as pr
pr.set_dims([('x',10),('y',15),('z',5)]) # setting up the dimensions
F = pr.func('f(x,y)', val='rnd') # an instance depending on x and y with random values
G = pr.func('f(z)', val=np.array([1,1,2])) # an instance depending on z with given values
H = 5*pr.func(vars=['x','z'], val='unif') # an instance depending on x and z with the same value for each entry
F*G, F/G, 2*H, F+H, 1+G-F, ... # results of basic operations are also func instances
pr.sum(F) # summation over all variables of F
pr.sum(['z'],G*H) # summation over z
pr.sum(F+H,['z']) # summation over all variables of F+H except z
F.normalize() # normalization with respect to all variables
(F+G).normalize(['y','z']) # normalization with respect to y and z
F.val # numpy array, here: F.val = np.random.rand(10,15)
F.vars # list of the variables, here: F.vars = ['x','y']
F.r # positions of the vars in dims, here: F.r = [0,1]
f = F.eval('x',4) # f(y) = F(x=4,y)