
Too much macro dependence? #820

Closed
ChrisRackauckas opened this issue Jun 21, 2019 · 4 comments

Comments

@ChrisRackauckas
Collaborator

I like Turing.jl, I really do. But one thing that constantly seems to get in the way is the @model macro. DSLs are usually constructed to constrain the possible inputs so that a compilation process can be written against a simplified form. For example, Stan wrote its own derivatives for every term in its AD, so it's constrained to an interface where you can only use the terms it defined. JuMP needs to know which terms are linear, quadratic, integer, etc. in order to specialize, so it has a DSL that specifically captures that information.

Using DSLs isn't that great in many circumstances because, well, you might not have a full programming language available to you. But even more importantly, in many cases you have to write into a language, and thus a script needs to be built at compile time. Again, with Stan you build a whole program as a string.

Turing.jl sits in an odd location in the design space because it has a DSL-based interface... but it doesn't need to. Turing.jl is built on things like ForwardDiff.jl and Tracker.jl, which are language-wide AD systems, so in theory any Julia code could work. In practice, what is definable in the macro is what is allowed. @model is very good at allowing arbitrary Julia functions, but it still has the issue that, as a macro, it is evaluated at compile time. So like Stan, if you want to programmatically create models, you have to interpolate into a compile-time script and then run the compiler. This goes unnoticed in a lot of Turing.jl usage because many users are writing models in the global scope, in which case there's implicitly an eval happening after each command.

However, this gets tricky when defining a Turing model inside a function because functions run at runtime, not at compile time, so the macro is expanded before the function's values are known. For example, let's look at the DiffEqBayes.jl integration with the ODE solvers. Say we had a list of variable names, syms, that we want to appear in the output. In theory we could do

syms = [:a,:b]
function (syms, priors, ....)
    @model bif(x) = begin
        for i in 1:length(priors)
            syms[i] ~ priors[i]
        end
        ...
but that would make a bunch of variables named syms, not variables named after the symbol values in syms[i], so this is distinctly different from writing a ~ priors[1]. If you want the output to have named chains like [:a,:b], you could construct an expression for the @model, interpolate

$(syms[i]) ~ priors[i]

and then eval the expression, but I think it's clear this shows that the model doesn't truly need to be a macro.
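The mechanics can be seen with plain Julia metaprogramming, no Turing.jl required (a minimal sketch of my own, not Turing's internals): the macro receives the unevaluated expression, so the left-hand side it sees is literally `syms[i]`, while interpolation at expression-construction time splices the runtime symbol in.

```julia
# What the macro sees: the unevaluated expression. The lhs name is
# literally `syms`, regardless of what syms contains at runtime.
ex = :( syms[i] ~ priors[i] )
@assert ex.args[1] == :~
@assert ex.args[2].args[1] == :syms   # the macro only ever sees the name `syms`

# Interpolating when *building* the expression splices the runtime value in,
# which is why the eval workaround produces a variable actually named `a`:
syms = [:a, :b]
ex2 = :( $(syms[1]) ~ priors[1] )
@assert ex2.args[2] == :a             # lhs is now the symbol :a itself
# eval'ing an @model expression containing ex2 would then run the compiler
# on the spliced code — the hidden eval step described above.
```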

At its core, the issue is that a model isn't actually a macro, because it doesn't actually need compile-time information: the model is really a function, and the macro is just a nice way to construct it. The simplest solution of course is to document the internals, as in https://turing.ml/docs/advanced/, but I find it interesting that there are no test cases or tutorials showing how to use them, given that this is what you'd need to do if you don't want the whole structure defined at compile time.
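The "macro is just a nice way to construct a function" point can be illustrated with a toy macro (a hypothetical illustration, not Turing's actual implementation): anything the macro emits could equally be built at runtime without it.

```julia
# Toy illustration (not Turing code): a macro that merely expands to a
# function definition. The macro adds convenience, not capability.
macro makemodel(name, ex)
    esc(:( $name = () -> $ex ))
end

@makemodel(m1, 1 + 2)     # macro-constructed "model"
m2 = () -> 1 + 2          # the same thing, written directly at runtime

@assert m1() == m2() == 3
```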

With #819 and by reconstructing the output chain (https://github.com/TuringLang/MCMCChains.jl#parameter-names) with new names, I can probably hack my way to the output I want without resorting to eval, using only what's documented/tested/expected to be used, but I think this is something to think about in future tutorials.

@mohamed82008
Member

mohamed82008 commented Jun 21, 2019

Two quick comments before I go to sleep:

  1. Most of the changes in [WIP] Heterogeneous vector priors #819 were changes in normal functions, not the macro.
  2. Having the variable symbols be runtime information provided by the user may be possible, but I will need to think about it some more.

@yebai
Member

yebai commented Oct 3, 2019

@ChrisRackauckas We'll try to reduce macro dependency in v0.8, where we'll redesign the DSL implementation.

@mohamed82008
Member

mohamed82008 commented Nov 19, 2019

@ChrisRackauckas I am working now on #965 which uses much less macro magic and enables the use of user-defined variable names using the syntax:

x ~ NamedDist(dist, name)

where name can be a string or, even better for performance, a Turing.VarName, constructed for example as Turing.VarName{:y}("") or Turing.VarName{:y}("[1]"). In the case of a vector of variables, the Vector{<:VarName} can be built before defining the model and used inside the @model body to enable type-stable sampling.
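To see why putting the symbol in a type parameter helps type stability, here is a toy stand-in for VarName (my own sketch under that assumption, not Turing's actual definition): the symbol lives in the type, so the compiler can specialize on it rather than inspecting a runtime string.

```julia
# Hypothetical stand-in for Turing.VarName, to illustrate the design:
# the symbol is a type parameter, so it is known at compile time.
struct ToyVarName{sym}
    indexing::String
end
getsym(::ToyVarName{sym}) where {sym} = sym   # resolved by dispatch, not lookup

# Built before defining the model, then used inside the @model body:
vns = [ToyVarName{:a}(""), ToyVarName{:b}("[1]")]
@assert getsym(vns[1]) == :a
@assert vns[2].indexing == "[1]"
```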

@mohamed82008
Member

I believe this issue was solved by #965. The variable names are now not baked into the model and NamedDist can be used to define custom random variable names. The model macro was also significantly simplified in #965. Please re-open if you feel this issue wasn't properly addressed in #965.
