
Question about Utility of Discrete Sampling Relative to Stan and Other Probabilistic Programming Packages #547

Closed
rswarnick1 opened this issue Sep 17, 2018 · 1 comment

@rswarnick1

Hello,

We're deciding between Stan and Turing for a potential implementation of Hidden Markov Models, and as a tool for continuing development of broader Bayesian models in the future.

We have a Stan implementation of HMMs following the outline described here, and the Turing implementation from the wiki. For now we are ignoring the normal emission distribution in the Stan implementation, since we are more interested in the accumulators and log probabilities the Stan implementation requires than in the properties of the model.

Our question concerns the syntactic advantage of coding in Turing relative to Stan. We were under the impression that, thanks to the discrete sampling Turing offers, we could avoid the auxiliary variables introduced in the Stan HMM implementation (the accumulator variables) and simply specify the discrete conditionals in the form of a Markov model, as in lines 44 and 45 of HMM.model.jl, with Dirichlet priors as in lines 10 and 14. The Turing model is obviously more concise, but at the cost of more computing time.
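
For concreteness, here is roughly the kind of discrete formulation we mean (a minimal sketch with illustrative names and Normal emissions, not the exact wiki code):

```julia
using Turing

# Minimal sketch of the discrete-sampling HMM: K hidden states, data y.
# The names (T, m, s) and the Normal emissions are illustrative assumptions.
@model function hmm_discrete(y, K)
    N = length(y)

    # Dirichlet prior on each row of the transition matrix.
    T = Vector{Vector}(undef, K)
    # State-specific emission means.
    m = Vector(undef, K)
    for i in 1:K
        T[i] ~ Dirichlet(ones(K))
        m[i] ~ Normal(0, 10)
    end

    # Latent states sampled directly as discrete variables.
    s = tzeros(Int, N)
    s[1] ~ Categorical(fill(1 / K, K))
    y[1] ~ Normal(m[s[1]], 0.1)
    for t in 2:N
        s[t] ~ Categorical(T[s[t-1]])
        y[t] ~ Normal(m[s[t]], 0.1)
    end
end

# The discrete s needs a particle-style sampler; the continuous parameters
# can still use HMC, combined within Gibbs:
# chain = sample(hmm_discrete(y, 2), Gibbs(HMC(0.01, 5, :T, :m), PG(20, :s)), 1000)
```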

To clarify, is this simply a property of HMMs in general when dealing with probabilistic programming packages? And could one, for example, introduce the Stan implementation's auxiliary variables into Turing to obtain speed closer to Stan's? Our goal is ease of implementation rather than speed, but if Turing makes it possible to avoid some of the discrete variable specifications and instead use an HMM formulation more analogous to Stan's, with higher performance, that would also be helpful.

@yebai
Member

yebai commented Sep 17, 2018

Hi @rswarnick1, thanks for the message.

There is no fundamental difference between Stan and Turing. You can write HMMs in Turing similarly to the Stan version (i.e. marginalising out the latent states using the forward algorithm) and then perform inference using the HMC sampler. Turing provides sampling options for discrete variables, but it also works fine on models with only continuous variables (e.g. marginalised HMMs or LDA).
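
A rough sketch of what the marginalised version can look like in Turing (a sketch, not the wiki code: `logsumexp` comes from StatsFuns, and `Turing.@addlogprob!` is the current API for adding a manual log-likelihood term; emission parameters are illustrative):

```julia
using Turing
using StatsFuns: logsumexp

# Marginalised HMM: the latent states are summed out with the forward
# recursion, so only continuous parameters remain and HMC/NUTS applies.
@model function hmm_marginalised(y, K)
    N = length(y)
    T = Vector{Vector}(undef, K)
    m = Vector(undef, K)
    for i in 1:K
        T[i] ~ Dirichlet(ones(K))
        m[i] ~ Normal(0, 10)
    end

    # Forward recursion in log space, with a uniform initial distribution:
    # logα[k] = log p(y_1, ..., y_t, s_t = k).
    logα = [log(1 / K) + logpdf(Normal(m[k], 0.1), y[1]) for k in 1:K]
    for t in 2:N
        logα = [logsumexp(logα .+ log.([T[j][k] for j in 1:K])) +
                logpdf(Normal(m[k], 0.1), y[t]) for k in 1:K]
    end

    # Add the marginal log-likelihood log p(y_{1:N}) to the joint density.
    Turing.@addlogprob! logsumexp(logα)
end

# With the discrete states gone, gradient-based samplers apply directly:
# chain = sample(hmm_marginalised(y, 2), NUTS(), 1000)
```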

The efficiency gap between Turing and Stan is narrowing -- we recently switched to reverse-mode AD. Ultimately, we hope Turing's speed will catch up with Stan's. That said, Turing aims at ML problems, while Stan focuses more on statistics problems.

@yebai yebai closed this as completed Mar 22, 2019