
# functionalmf: Bayesian factorization of functional matrices

This package implements Bayesian tensor filtering (BTF), a method for factorizing matrices in which each entry is an entire curve or function rather than a scalar. Install with:

```
pip install functionalmf
```

Please open an issue if you have any trouble!

## Factorizing a functional matrix with `functionalmf`

The general problem that `functionalmf` solves is that you have data in a 3- or 4-dimensional array:

```
import numpy as np

data = ... # get your data as a numpy 3- or 4-tensor
print('Data shape is {}'.format(data.shape))
# Output: Data shape is (10, 11, 12, 3)
# Means 10 rows, 11 functional columns, 12 points to evaluate each function, 3 repeated draws at each point
```

And you want to factorize the matrix under the following assumptions:

- dim 1: rows with latent attributes that remain fixed
- dim 2: columns with latent attributes that change mostly-smoothly in dim 3
- dim 4: replicates. If you only have a 3-tensor, just make this dim size 1 (see the sketch below) or the code will do it for you implicitly.
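
For example, a minimal sketch of adding an explicit singleton replicate dimension to a 3-tensor (the variable name `data_3d` is illustrative, not part of the package):

```
# data_3d is assumed to have shape (nrows, ncols, ndepth); adding a trailing
# axis makes it a 4-tensor with a single replicate per observation point.
data = data_3d[..., None]   # shape becomes (nrows, ncols, ndepth, 1)
```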

Missing data: If you have missing data, it should be passed in as `np.nan` values in the data array.
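
For instance, a sketch of marking entries as missing (the indices below are arbitrary, just for illustration):

```
# Mark unobserved entries with NaN; they are treated as missing data.
Y_missing = data.astype(float)
Y_missing[0, 3] = np.nan      # an entire missing curve (all depths, all replicates)
Y_missing[2, 5, 7] = np.nan   # a single missing evaluation point (all replicates)
```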

The specific class you should use depends on the likelihood function for your observations.

### Gaussian observations

If your observations are real-valued and the errors can reasonably be assumed to follow a normal distribution:

```
from functionalmf.factor import GaussianBayesianTensorFiltering

def init_model(nembeds=3, tf_order=2, lam2=0.1, sigma2=0.5, nu2=1):
    # Setup the model
    return GaussianBayesianTensorFiltering(nrows, ncols, ndepth,
                                           nembeds=nembeds, tf_order=tf_order,
                                           lam2_init=lam2, nu2_init=nu2)

model = init_model()
```
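
Here `nrows`, `ncols`, and `ndepth` are the first three dimensions of your data tensor; a sketch, assuming the `data` array from above:

```
# First three axes of the (rows, cols, depth, replicates) tensor
nrows, ncols, ndepth = data.shape[0], data.shape[1], data.shape[2]
```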

You can then run the Gibbs sampler on the model with:

`results = model.run_gibbs(Y_missing, nburn=nburn, nthin=nthin, nsamples=nsamples, print_freq=50, verbose=True)`

And you can get the sampler results along with the inferred means:

```
Ws = results['W'] # posterior samples of row embeddings
Vs = results['V'] # posterior samples of functional column embeddings

# Get the Bayes estimate
Mu_hat = np.einsum('znk,zmtk->znmt', Ws, Vs) # dot product of all row x (column,depth) embeddings
Mu_hat_mean = Mu_hat.mean(axis=0) # average over all posterior samples
```
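
Because `Mu_hat` holds one reconstruction per posterior sample, you can also compute pointwise uncertainty bands; a sketch (the 90% level here is arbitrary):

```
# Pointwise 90% credible interval over the posterior samples
Mu_hat_lower = np.percentile(Mu_hat, 5, axis=0)
Mu_hat_upper = np.percentile(Mu_hat, 95, axis=0)
```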

See `examples/gaussian_tensor_filtering.py` for a full example.

### Binomial, Bernoulli, or Negative Binomial observations

If your observations are binomial (or binary or negative binomial), you can use the Binomial sampler:

```
from functionalmf.factor import BinomialBayesianTensorFiltering

def init_model(nembeds=3, tf_order=2, lam2=0.1, sigma2=0.5):
    # Setup the model
    return BinomialBayesianTensorFiltering(nrows, ncols, ndepth,
                                           nembeds=nembeds, tf_order=tf_order,
                                           lam2_init=lam2)

model = init_model()
```

The rest follows the Gaussian example above, EXCEPT you need to pass the resulting means through the inverse logit transform:

```
from functionalmf.utils import ilogit

Mu_hat = ilogit(np.einsum('znk,zmtk->znmt', Ws, Vs))
Mu_hat_mean = Mu_hat.mean(axis=0)
```
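
For reference, the inverse logit is the standard sigmoid, so `Mu_hat` here contains probabilities in (0, 1); a minimal sketch of an equivalent transform:

```
# Equivalent inverse-logit (sigmoid) transform, shown only for clarity
def inverse_logit(x):
    return 1.0 / (1.0 + np.exp(-x))
```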

See `examples/binomial_tensor_filtering.py` for a full example.

### Black box observations (Poisson example)

If you have a likelihood that is not in one of the conjugate or conditionally-conjugate categories above, you can use the non-conjugate sampler. This example focuses on a Poisson likelihood, common with count data, where the latent rate is constrained to be positive everywhere:

```
from scipy.stats import poisson
from functionalmf.factor import ConstrainedNonconjugateBayesianTensorFiltering

def rowcol_loglikelihood(Y, WV, row=None, col=None):
    if row is not None:
        Y = Y[row]
    if col is not None:
        Y = Y[:,col]
    if len(Y.shape) > len(WV.shape):
        WV = WV[...,None]
    import warnings
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=RuntimeWarning)
        z = np.nansum(poisson.logpmf(Y, WV))
    return z

def init_model(tf_order=0, lam2=0.1, sigma2=0.5):
    # Constraints requiring positive means
    C_zero = np.concatenate([np.eye(ndepth), np.zeros((ndepth,1))], axis=1)

    # Setup the model
    return ConstrainedNonconjugateBayesianTensorFiltering(nrows, ncols, ndepth,
                                                          rowcol_loglikelihood,
                                                          C_zero,
                                                          nembeds=nembeds, tf_order=tf_order,
                                                          lam2_true=lam2)
```

The code requires you to provide a log-likelihood function (`rowcol_loglikelihood`) that can optionally accept a `row` or `col` specifier. If either is specified, the corresponding `WV` matrix is limited to that specific row or column.
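
As an illustration of the same interface, here is a minimal sketch of a different black-box likelihood, assuming Gaussian noise with a fixed variance; the function name and the `sigma2` value are hypothetical and not part of the package:

```
from scipy.stats import norm

def gaussian_rowcol_loglikelihood(Y, WV, row=None, col=None, sigma2=0.5):
    # Restrict to the requested row/column, mirroring rowcol_loglikelihood above
    if row is not None:
        Y = Y[row]
    if col is not None:
        Y = Y[:,col]
    if len(Y.shape) > len(WV.shape):
        WV = WV[...,None]
    # Sum log-densities, ignoring missing (NaN) observations
    return np.nansum(norm.logpdf(Y, loc=WV, scale=np.sqrt(sigma2)))
```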

The rest of the code works the same as the Gaussian sampler case above. In the results plot, the blue bars show the NMF-based EP approximation that centers the sampler and the orange line is the fit.

See `examples/poisson_tensor_filtering.py` for the complete example code.

## Dose-Response Modeling

Dealing with dose-response modeling in multi-drug, multi-sample studies requires handling experimental technical error. An example dose-response model is not part of the `functionalmf` package, but is implemented on top of it. See `doseresponse/` for a complete example of dose-response modeling.

## Generalized analytic slice sampling

Included in the BTF code at `functionalmf/gass.py` is a standalone tool for sampling from posteriors with truncated multivariate normal priors. The sampler supports linear constraints and arbitrary likelihoods. The script includes a runnable example of a truncated, monotone GP; in the example plot, orange bands represent 90% Bayesian credible intervals.

# Installation issues

Known issues that come up when installing and using the library are listed below.

- You may have trouble with `sksparse` on macOS when using conda. A solution is to install SuiteSparse: `conda install -c conda-forge suitesparse`

# Citing this code

If you use this code, please cite the following paper (available at https://arxiv.org/abs/1906.04072):

```
Bayesian Tensor Filtering: Smooth, Locally-Adaptive Factorization of Functional Matrices
W. Tansey, C. Tosh, D.M. Blei
arXiv preprint arXiv:1906.04072
```

Bibtex entry:

```
@article{tansey:etal:2019:btf,
  title={Bayesian Tensor Filtering: Smooth, Locally-Adaptive Factorization of Functional Matrices},
  author={Tansey, Wesley and Tosh, Christopher and Blei, David M.},
  journal={arXiv preprint arXiv:1906.04072},
  year={2019}
}
```