# What is probabilistic programming? And why?

A probabilistic programming system is a system for specifying stochastic
generative models and reasoning about them. Or, as
[Fabiana Clemente puts it](https://towardsdatascience.com/intro-to-probabilistic-programming-b47c4e926ec5):

> Probabilistic programming is about doing statistics using the tools of
> computer science.

The kinds of problems that we attempt to solve with probabilistic programming are
ones where the inputs and outputs of programs are not just numbers, but whole
probability distributions over the space of possible values of those numbers. Here,
probability distributions are first-class citizens.

Conceptually, probabilistic programming is a way of automating operations in
[probabilistic graphical models](https://en.wikipedia.org/wiki/Graphical_model),
similar to how automatic differentiation frameworks (like PyTorch or
TensorFlow) automate forward and backward propagation. A probabilistic
programming language (PPL) provides tools that make it easy to build a
programmatic representation of your probabilistic model, even if you've never
thought of it specifically in terms of a *graphical* model (e.g. if you only
have your model as a set of equations). Once that's accomplished, you can use
the PPL's built-in functions to perform typical operations on the model, such
as sampling, conditioning, and computing probabilities. This means that you can
do various nifty things, like Bayesian inference, uncertainty quantification,
and experiment design.

One important benefit of modern toolkits is that you (usually) get
backpropagation of gradients for free. This is extremely useful if you want to
apply gradient-based methods for inference, especially if your model is complex
and uses more obscure distributions than a mixture of Gaussians. We will see
over the course of the hackfest just how quick and easy it is to turn an
arbitrary set of equations into a working probabilistic model that you can then
run ML algorithms on.


## Why `pyro`?

There are [too many PPLs](https://en.wikipedia.org/wiki/Probabilistic_programming)!
But pyro is the one that is probably closest to gaining critical mass.
It is not the simplest one, but:

1. It _does_ integrate high-quality versions of hip neural network methods with the
   classic PPL methods. This is typically a good predictor of a framework's
   longevity, so hopefully the skills you develop in this hackfest won't become
   obsolete too quickly.
2. It is built on PyTorch, which many of us here already use.
3. Finally, we have more experience inside the FSP with pyro than with
   any other PPL.

## Other PPLs

* [Stan](https://mc-stan.org/) (R/Python/CLI/…)
* [Turing.jl](https://turing.ml/stable/) (Julia)
* [Gen](https://www.gen.dev/) (Julia)
* [NumPyro](http://num.pyro.ai/en/stable/) (JAX)
* [Edward2](https://github.com/google/edward2) (TensorFlow/JAX)
* Why not write your own? Everyone else seems to.


## Additional Reading

* Jan-Willem van de Meent, Brooks Paige, Hongseok Yang, Frank Wood, [An Introduction to Probabilistic Programming](https://arxiv.org/abs/1809.10756)
* [Bayes for Hackers](https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers)
* [Rob Salomone’s course](https://robsalomone.com/course-deep-probabilistic-models/) is great for showing off some super modern techniques, like flows and amortized inference.
* [Statistical Rethinking | Richard McElreath](https://xcelab.net/rm/statistical-rethinking/) has gone viral as an introduction to some of this stuff.
  It is [available on O’Reilly](https://learning.oreilly.com/library/view/statistical-rethinking-2nd/9780429639142/) (free for CSIRO people).
  There are
  [PyMC3](https://github.com/gbosquechacon/statrethink_course_in_pymc3)
  and [NumPyro](https://github.com/asuagar/statrethink-course-in-numpyro/) versions of the course.
* Kevin Murphy, [Probabilistic Machine Learning: An Introduction](https://probml.github.io/pml-book/book1.html)
* Wei Ji Ma, Konrad Paul Kording, Daniel Goldreich, [Bayesian models of perception and action](https://www.bayesianmodeling.com)
* Noah D. Goodman, Joshua B. Tenenbaum et al., [Probabilistic Models of Cognition - 2nd Edition](http://probmods.org/)
* Noah D. Goodman and Andreas Stuhlmüller, [The Design and Implementation of Probabilistic Programming Languages](http://dippl.org/)
* [Bayesian statistics for _Nature_ readers](https://www.nature.com/articles/s43586-020-00001-2)



