
Question about Utility of Discrete Sampling Relative to Stan and Other Probabilistic Programming Packages #547

Closed
rswarnick1 opened this issue Sep 17, 2018 · 1 comment

@rswarnick1

Hello,

We're deciding between Stan and Turing for a potential implementation of Hidden Markov Models, and as a tool for continuing development of broader Bayesian models in the future.

We have a Stan implementation of HMMs following the outline described here, and the Turing implementation from the wiki. For now we are ignoring the normal emission distribution in the Stan implementation, since we are more interested in the accumulators and log probabilities the Stan implementation requires than in the properties of the model.

Our question concerns the syntactic advantage of coding in Turing relative to Stan. We were under the impression that, thanks to the discrete sampling Turing offers, we could avoid the auxiliary variables introduced in the Stan HMM implementation (the accumulator variables) and simply specify the discrete conditionals in the form of a Markov model, as in lines 44 and 45 of HMM.model.jl, with Dirichlet priors as in lines 10 and 14. The Turing model is obviously more concise, but at the cost of more computing time.
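
For concreteness, here is roughly the kind of discrete formulation we mean (a minimal sketch with illustrative names and Normal emissions, not the exact wiki code):

```julia
using Turing

# Minimal sketch of the discrete-sampling HMM: K hidden states, data y.
# The names (T, m, s) and the Normal emissions are illustrative assumptions.
@model function hmm_discrete(y, K)
    N = length(y)

    # Dirichlet prior on each row of the transition matrix.
    T = Vector{Vector}(undef, K)
    # State-specific emission means.
    m = Vector(undef, K)
    for i in 1:K
        T[i] ~ Dirichlet(ones(K))
        m[i] ~ Normal(0, 10)
    end

    # Latent states sampled directly as discrete variables.
    s = tzeros(Int, N)
    s[1] ~ Categorical(fill(1 / K, K))
    y[1] ~ Normal(m[s[1]], 0.1)
    for t in 2:N
        s[t] ~ Categorical(T[s[t-1]])
        y[t] ~ Normal(m[s[t]], 0.1)
    end
end

# The discrete s needs a particle-style sampler; the continuous parameters
# can still use HMC, combined within Gibbs:
# chain = sample(hmm_discrete(y, 2), Gibbs(HMC(0.01, 5, :T, :m), PG(20, :s)), 1000)
```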

To clarify, is this simply a property of HMMs in general when dealing with probabilistic programming packages? And could one, for example, introduce the Stan implementation's auxiliary variables into Turing to obtain speed closer to Stan's? Our goal is ease of implementation rather than speed, but if Turing makes it possible to avoid some of the discrete variable specifications and instead use an HMM formulation more analogous to Stan's, with higher performance, that would also be helpful.

@yebai
Member

yebai commented Sep 17, 2018

Hi @rswarnick1, thanks for the message.

There is no fundamental difference between Stan and Turing. You can write HMMs in Turing similarly to the Stan version (i.e. marginalising out the latent states using the forward algorithm) and then perform inference using the HMC sampler. Turing provides sampling options for discrete variables, but it also works fine on models with only continuous variables (e.g. marginalised HMMs or LDA).
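
A rough sketch of what the marginalised version can look like in Turing (a sketch, not the wiki code: `logsumexp` comes from StatsFuns, and `Turing.@addlogprob!` is the current API for adding a manual log-likelihood term; emission parameters are illustrative):

```julia
using Turing
using StatsFuns: logsumexp

# Marginalised HMM: the latent states are summed out with the forward
# recursion, so only continuous parameters remain and HMC/NUTS applies.
@model function hmm_marginalised(y, K)
    N = length(y)
    T = Vector{Vector}(undef, K)
    m = Vector(undef, K)
    for i in 1:K
        T[i] ~ Dirichlet(ones(K))
        m[i] ~ Normal(0, 10)
    end

    # Forward recursion in log space, with a uniform initial distribution:
    # logα[k] = log p(y_1, ..., y_t, s_t = k).
    logα = [log(1 / K) + logpdf(Normal(m[k], 0.1), y[1]) for k in 1:K]
    for t in 2:N
        logα = [logsumexp(logα .+ log.([T[j][k] for j in 1:K])) +
                logpdf(Normal(m[k], 0.1), y[t]) for k in 1:K]
    end

    # Add the marginal log-likelihood log p(y_{1:N}) to the joint density.
    Turing.@addlogprob! logsumexp(logα)
end

# With the discrete states gone, gradient-based samplers apply directly:
# chain = sample(hmm_marginalised(y, 2), NUTS(), 1000)
```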

The efficiency gap between Turing and Stan is narrowing -- we recently switched to reverse-mode AD. Ultimately, we hope Turing's speed will catch up with Stan's. That said, Turing aims at ML problems, while Stan focuses more on statistics problems.

@yebai yebai closed this as completed Mar 22, 2019