StefanUlbrich/citrate

Welcome to the Citrate workspace

This workspace includes the crates that I developed to learn Rust and to support my research. It is named after the salts of citric acid. The design and development process, along with my experience, will be documented on my personal blog at lemonfold.io.

The Potpourri Crate

A crate for models that can be trained by Expectation Maximization, with a focus on Mixture Models. Its generic, flexible design makes extensive use of composition and is agnostic of external dependencies (i.e., the numerical library chosen); users only need to implement a trait with a few methods.
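To illustrate the idea of a trait-based, backend-agnostic EM loop, here is a minimal sketch using only the standard library. The names `EmModel`, `fit`, and `Gmm1d` are hypothetical and chosen for this example; they are not the crate's actual API, which may be structured differently.

```rust
use std::f64::consts::PI;

/// Hypothetical trait (not the crate's API): anything trainable by EM.
trait EmModel {
    /// Quantities computed in the E-step (e.g. responsibilities).
    type Expectation;
    /// E-step: returns the expectations and the average log-likelihood.
    fn expect(&self, data: &[f64]) -> (Self::Expectation, f64);
    /// M-step: re-estimates the parameters from the expectations.
    fn maximize(&mut self, data: &[f64], expectation: &Self::Expectation);
}

/// Generic EM driver: alternates E- and M-steps until the average
/// log-likelihood changes by less than `tol` or `max_iter` is reached.
fn fit<M: EmModel>(model: &mut M, data: &[f64], max_iter: usize, tol: f64) -> f64 {
    let mut previous = f64::NEG_INFINITY;
    for _ in 0..max_iter {
        let (expectation, llh) = model.expect(data);
        if (llh - previous).abs() < tol {
            return llh;
        }
        model.maximize(data, &expectation);
        previous = llh;
    }
    previous
}

/// One-dimensional Gaussian mixture, used here only as a toy model.
struct Gmm1d {
    weights: Vec<f64>,
    means: Vec<f64>,
    variances: Vec<f64>,
}

fn gaussian(x: f64, mean: f64, var: f64) -> f64 {
    (-(x - mean).powi(2) / (2.0 * var)).exp() / (2.0 * PI * var).sqrt()
}

impl EmModel for Gmm1d {
    /// responsibilities[n][k]: posterior of component k for sample n.
    type Expectation = Vec<Vec<f64>>;

    fn expect(&self, data: &[f64]) -> (Self::Expectation, f64) {
        let mut llh = 0.0;
        let resp: Vec<Vec<f64>> = data
            .iter()
            .map(|&x| {
                let joint: Vec<f64> = (0..self.weights.len())
                    .map(|k| self.weights[k] * gaussian(x, self.means[k], self.variances[k]))
                    .collect();
                let evidence: f64 = joint.iter().sum();
                llh += evidence.ln();
                joint.iter().map(|p| p / evidence).collect::<Vec<f64>>()
            })
            .collect();
        (resp, llh / data.len() as f64)
    }

    fn maximize(&mut self, data: &[f64], resp: &Self::Expectation) {
        let n = data.len() as f64;
        for k in 0..self.weights.len() {
            let nk: f64 = resp.iter().map(|r| r[k]).sum();
            let mean = data.iter().zip(resp).map(|(x, r)| r[k] * x).sum::<f64>() / nk;
            let var = data
                .iter()
                .zip(resp)
                .map(|(x, r)| r[k] * (x - mean).powi(2))
                .sum::<f64>()
                / nk;
            self.weights[k] = nk / n;
            self.means[k] = mean;
            // Floor the variance to avoid a collapsing component.
            self.variances[k] = var.max(1e-6);
        }
    }
}

fn main() {
    let data = [-2.1, -1.9, -2.0, -2.2, -1.8, 1.9, 2.1, 2.0, 2.2, 1.8];
    let mut model = Gmm1d {
        weights: vec![0.5, 0.5],
        means: vec![-1.0, 1.0],
        variances: vec![1.0, 1.0],
    };
    let llh = fit(&mut model, &data, 100, 1e-9);
    println!("means = {:?}, average log-likelihood = {llh:.3}", model.means);
}
```

In this sketch, the driver `fit` only knows about the trait, so a different model (or a different numerical backend behind the same trait) could be swapped in without changing the loop.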

The crate is still very much a work in progress; currently, only clustering with Gaussian Mixture Models is in a rudimentary working state. Future development may include classification and regression with Mixture Models, support for more densities (Poisson, negative binomial, von Mises, ...), SOM, $k$-means, HMM, or Kalman filters.

To run the algorithm itself, you currently need to call a unit test:

RUST_BACKTRACE=full cargo test -F ndarray --package potpourri --lib -- mixture::tests::em_step --exact --nocapture

Currently, models are implemented only for the ndarray backend, but support for clusters and GPGPU is planned.

The Cerebral Crate

A crate for self-organizing neural networks written in Rust. The crate is still a work in progress and not yet suited for use in other projects.

This is an experiment in writing a Python extension for numerical computation / machine learning in Rust. The aim of this project is to create an implementation of the self-organizing maps algorithm with support for parallelization and, eventually, GPGPU. The algorithm was chosen for its efficiency and simplicity (both in terms of implementation and comprehensibility). Plus, it's my favorite and I'm doing active research with it.
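For readers unfamiliar with self-organizing maps, the following is a minimal sketch of the classic online update on a one-dimensional chain of neurons, using only the standard library. The type `Som` and its methods are hypothetical names for this illustration and do not reflect the crate's API; the linear decay schedule in `main` is likewise just an assumption.

```rust
/// Squared Euclidean distance between two vectors of equal length.
fn squared_distance(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(p, q)| (p - q).powi(2)).sum()
}

/// Hypothetical toy map: a 1-D chain of neurons, each with a weight vector.
struct Som {
    weights: Vec<Vec<f64>>,
}

impl Som {
    /// Index of the best-matching unit (BMU): the neuron closest to `x`.
    fn best_matching_unit(&self, x: &[f64]) -> usize {
        (0..self.weights.len())
            .min_by(|&i, &j| {
                squared_distance(&self.weights[i], x)
                    .total_cmp(&squared_distance(&self.weights[j], x))
            })
            .expect("the map must contain at least one neuron")
    }

    /// One online update: pull every neuron toward `x`, weighted by a
    /// Gaussian neighborhood centered on the BMU's position in the chain.
    fn adapt(&mut self, x: &[f64], rate: f64, radius: f64) {
        let bmu = self.best_matching_unit(x) as f64;
        for (i, w) in self.weights.iter_mut().enumerate() {
            let h = (-(i as f64 - bmu).powi(2) / (2.0 * radius * radius)).exp();
            for (wi, xi) in w.iter_mut().zip(x) {
                *wi += rate * h * (xi - *wi);
            }
        }
    }
}

fn main() {
    // Five neurons, initialized close together near the center of the data.
    let mut som = Som {
        weights: (0..5).map(|i| vec![0.4 + 0.05 * i as f64; 2]).collect(),
    };
    let samples = [[0.0, 0.0], [0.25, 0.25], [0.5, 0.5], [0.75, 0.75], [1.0, 1.0]];
    let epochs = 50;
    for epoch in 0..epochs {
        // Learning rate and neighborhood radius shrink over time.
        let progress = epoch as f64 / epochs as f64;
        let rate = 0.5 * (1.0 - progress);
        let radius = 2.0 * (1.0 - progress) + 0.5;
        for x in &samples {
            som.adapt(x, rate, radius);
        }
    }
    println!("{:?}", som.weights);
}
```

Each neuron's update is independent of the others once the BMU is known, which is one reason the algorithm lends itself to parallelization.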

Feel free to use it as a basis for your own projects (MIT licensed).

Development

For the monorepo, there is a top-level Python project defined using Poetry. This project is mainly for dependency locking, integration testing and the overall project's documentation. The Python module built with Maturin is in a nested folder, and its project file is also created with Poetry. This is necessary because the top-level project can only add local packages that are either created with Poetry or contain a setup.py file.

While this setup supports a monorepo (and should support integration testing on the imported local packages), there is another caveat: building the Python extension with pip creates a temporary directory, and the local Rust dependencies are not copied into it. Newer versions of pip build within the source tree, so this limitation can be avoided easily. Use the following commands to set up the environment.

⚠️ We want to avoid creating a virtual environment for the nested packages. Work in a top-level shell instead.

cd self-organization
pyenv shell 3.10.2
poetry env use 3.10.2
poetry run pip install -U pip # only required until pip>=22.0 becomes the default
poetry install
# to debug / develop the extension
poetry shell
cd pysom
maturin develop

To install the virtual environment as a kernel for Jupyter:

python -m ipykernel install --user --name py310_selforganization --display-name "Python3.10 (self-organization)"

About

Flexible self-organizing maps written in Rust with Python bindings
