use comprehensions instead of For(1:N)? #14

Closed · StefanKarpinski opened this issue Sep 11, 2018 · 17 comments · Fixed by #177

@StefanKarpinski

Would it perhaps be more natural to use comprehensions to express multidimensional sampling? In other words, instead of writing

x ~ For(1:N) do n 
    Cauchy(0,100)
end

you would write

x ~ [Cauchy(0,100) for n=1:N]

Is there something about this construct that I'm missing?

@hessammehr

I would be tempted to use broadcasting for this, with the advantage of conveying parameter values and shape information at the same time.

x ~ Cauchy.(zeros(N), 100)
y ~ Normal.(x, (1.0:5.0)') # N × 5 matrix of normal variables
z ~ Poisson.(y)

@StefanKarpinski (Author)

From Slack:

@cscherrer: Thanks @Stefan! I had considered this, but went with For for a couple of reasons:

  • rhs of ~ needs to be a distribution, and it seemed a bit safer to make this explicit
  • The github README is a bit outdated. I added iid as a combinator for cases where samples should be considered as such, since this makes it easier to reason about. For is then for cases where there's (conditional) independence, but the distributions depend on the index.

@StefanKarpinski: given that it’s in a @model DSL block though, couldn’t you make comprehension syntax just a way of writing the distribution that For currently represents? or do you need this to exist outside of the macro DSL as well?

@cscherrer: Sure, @model could translate this. Only concern here is that the way a model is shown would be different than what is input. I would think the internal representation should be something reasonably regular, rather than having lots of exceptions

Regarding show output: why not just translate comprehension syntax to something like For internally and then display it the same way? Although I can see that in that case you might just want to use the actual internal representation as the input/output syntax.

@hessammehr: broadcast syntax does seem very nice for this.

Is there some reason that an array of distributions can't be the actual representation of a multidimensional conditionally independent distribution? That would make x ~ [Cauchy(0,100) for n=1:N] and y ~ Normal.(x, (1.0:5.0)') literal syntaxes that don't even need translation. In other words, define observing/sampling from an array of distributions as observing/sampling from the multidimensional distribution formed by its constituent parts. Unless it has some other meaning...
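
For illustration, here is a minimal sketch of that reading, assuming Distributions.jl (the overloads are hypothetical, and rand on a plain array already means "pick a random element", which is exactly the clash raised further down):

using Distributions
import Distributions: logpdf

# Hypothetical overloads: treat an array of distributions as the product
# distribution of its (conditionally independent) components.
# Caution: Base already uses rand(::AbstractArray) to pick a random element,
# so this redefinition is the special-casing problem discussed below.
Base.rand(ds::AbstractArray{<:Distribution}) = map(rand, ds)
logpdf(ds::AbstractArray{<:Distribution}, xs::AbstractArray) = sum(logpdf.(ds, xs))

ds = [Cauchy(0, 100) for n in 1:10]
x  = rand(ds)      # 10-element Vector{Float64}
logpdf(ds, x)      # joint log-density under independence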

@StefanKarpinski (Author)

I guess the answer may be that For needs to remember what f is; it can't just apply f and then forget it.

@hessammehr

@StefanKarpinski My question as well, since in a way broadcast already works nicely with a lot of things:

using Distributions              # for LogNormal and logpdf

θ = (zeros(5), abs.(rand(3))')   # mean vector and (positive) scale row
d = LogNormal.(θ...)             # 5×3 matrix of LogNormal distributions
rand.(d)                         # 5×3 matrix of samples
logpdf.(d, ans)                  # element-wise log-densities (`ans` as in the REPL)

@cscherrer (Owner)

@StefanKarpinski Thank you for opening this issue. Great to see others getting involved.

@hessammehr Really good point about broadcasting. Yes, that should work as-is.

As it stands, anything on the right of ~ has to be a distribution. I think it should stay this way in the internal representation. I had originally played with making Vector{Distribution} a Distribution, but it just seemed simpler to have its own construct.

In general, I could imagine having D{Distribution} for lots of D data structures. This would make it easier to build distributions and assemble them after the fact. But we would need to be really careful about dependencies in this case.

This is especially the case when we get to composability: If we have

m1 = @model begin
    mu ~ someDist
    sigma ~ anotherDist
end

(no arguments or observes), then this will be usable from another model as

m2 = @model begin
    a ~ m1
    y ~ Normal(a.mu, a.sigma)
end

@cscherrer (Owner)

Oh wait, I just realized the broadcasting trick hits the same problem. Here's an example:

julia> Normal.(zeros(3))
3-element Array{Normal{Float64},1}:
 Normal{Float64}(μ=0.0, σ=1.0)
 Normal{Float64}(μ=0.0, σ=1.0)
 Normal{Float64}(μ=0.0, σ=1.0)

julia> rand(Normal.(zeros(3)))
Normal{Float64}(μ=0.0, σ=1.0)

Since rand already has a method on vectors this would need to be coded as some kind of special case.

Seems like it could make more sense to have another function for the product measure. This is easy enough to do; maybe it could be called independent or just indep.
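
For illustration, a rough sketch of what such an indep combinator could look like (hypothetical implementation, not Soss code; a wrapper type avoids touching rand for plain arrays):

using Distributions

# Hypothetical `indep` combinator: wrap an array of distributions as a single
# product measure with its own rand/logpdf.
struct Indep{T<:AbstractArray{<:Distribution}}
    dists::T
end

indep(dists) = Indep(dists)

Base.rand(d::Indep) = map(rand, d.dists)
Distributions.logpdf(d::Indep, x) = sum(logpdf.(d.dists, x))

d = indep(Normal.(zeros(3)))
x = rand(d)        # 3-element Vector{Float64}
logpdf(d, x)       # sum of the component log-densities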

@hessammehr

@cscherrer I see. How about something like this where broadcasting is used everywhere?

x = Normal.(zeros(3))
rand.(x) # gives an Array{Float64,1}

# for scalar sampling
x = Normal.(0)
rand.(x) # gives a Float64

@cscherrer (Owner)

@hessammehr Interesting, so the idea is to have sampling and inference be in terms of rand. instead of rand?

That could be convenient. I think the biggest concern is to be sure we have consistent semantics. As it is, x ~ dist means dist is a distribution type and x is a value drawn from that distribution. Allowing things like x ~ Normal.(zeros(3)) is intuitive, but if we do this we need to be very careful there aren't semantic inconsistencies we're not anticipating.

Is there a performance advantage to broadcasting vs the existing For, or is it just notational convenience? I wonder if something like x .~ dists could make sense. This could be taken as x having the same shape as dists, with components sampled independently.

@hessammehr

@cscherrer I agree that x .~ Normal.(zeros(3)) would make more sense.

I would say intuitive, concise notation and playing nicely with the rest of the Julia idioms/ecosystem make broadcasting an attractive option. I'll get back to you re: performance.

@StefanKarpinski (Author)

Writing x .~ Normal.(zeros(3)) is a pretty nice alternative. I guess it means that x is a vector of 3 values sampled from the three normal distributions on the right? That avoids the issue of treating arrays of distributions as multidimensional distributions quite nicely.

@cscherrer (Owner)

I guess it means that x is a vector of 3 values sampled from the three normal distributions on the right?

Yep, that's the idea. I think with this, the biggest thing to be careful of is the potential for hidden dependence.

@cscherrer (Owner)

@StefanKarpinski I was just looking into implementing this and ran into some trouble. I'm guessing it's due to the nonstandard use of ~.

julia> :(x ~ Normal())
:(x ~ Normal())

julia> :(x .~ Normal())
ERROR: syntax: missing comma or ) in argument list

Another potential concern is confusion between .~ and \dotsim, which I'm currently using as an "observe". Any ideas?

@StefanKarpinski (Author)

Right, it seems like .~ doesn't parse as an infix operator. You could open an issue on the Julia repo and request that the parser be updated to support it. That won't be usable until 1.1 or 1.2 though.

@kskyten commented Nov 8, 2019

.~ is now a valid infix operator. This would make a nice improvement to the syntax.

@cscherrer (Owner)

I think we could handle this by translating

x .~ dists

into

x ~ For(size(dists)) do j
    dists[j]
end

A remaining question is, what if dists is very complex? As it stands, we'll be computing it twice. It would be great to add a CSE pass, but it could be simpler to just immediately translate this instead into something like

foo = dists
x ~ For(size(foo)) do j
    foo[j]
end

Both of these have the downside that the user will see the expanded code. ~ statements in Soss are stored as a Dict{Symbol, Expr}, and reconstructed whenever we need to print the model. Going back and forth between these representations will be annoying. A third alternative is to change the representation to Dict{Symbol, Tuple{Bool, Expr}}, where the Bool indicates whether there's broadcasting.

As I write this, this last idea seems the most promising. But it will take a few core edits.
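
A rough sketch of that third option (hypothetical names and structure, not Soss's actual internals): keep a Bool next to each right-hand side and reconstruct ~ versus .~ only when printing.

# Hypothetical representation: the Bool marks whether the statement was
# written with .~, so printing can round-trip the user's syntax.
dists = Dict{Symbol, Tuple{Bool, Expr}}(
    :x => (true,  :(Normal.(zeros(3)))),   # from `x .~ Normal.(zeros(3))`
    :y => (false, :(Normal(mu, sigma))),   # from `y ~ Normal(mu, sigma)`
)

function show_statement(name::Symbol, entry::Tuple{Bool, Expr})
    broadcasted, rhs = entry
    op = broadcasted ? ".~" : "~"
    string(name, " ", op, " ", rhs)
end

show_statement(:x, dists[:x])   # "x .~ Normal.(zeros(3))"
show_statement(:y, dists[:y])   # "y ~ Normal(mu, sigma)"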

@kskyten commented Nov 9, 2019

How about mimicking how Julia handles this?
For example,

x .~ Dists

gets translated into something like

Broadcast(~, x, Dists)
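
For a concrete picture of the lowering being mimicked (using a placeholder operator, since ~ has no binary method to call at run time):

# ⊙ stands in for ~: Julia lowers a dotted infix call to a broadcast over the
# underlying binary operator, which is the translation suggested above.
⊙(x, d) = (x, d)                 # hypothetical placeholder "sampling" operation
[1, 2] .⊙ [:a, :b]               # same as broadcast(⊙, [1, 2], [:a, :b])
                                 # → [(1, :a), (2, :b)]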

@cscherrer (Owner)

Soss needs the body of a model to be of the form

begin
    line_1
    ⋮
    line_n
end

Each line is syntactically translated into a Statement. This is an abstract type, with subtypes Assign and Sample. For example,

x ~ Normal(μ,σ)

becomes

Sample(:x, :(Normal(μ,σ)))

Next, all of the Samples are brought together to build a named tuple mapping each Symbol to its Expr. This becomes the dists field for a Model.

Because all of this is entirely syntactic, translating into another form only helps when it's done on the right side of ~ or =. Otherwise we need another way to represent this information.
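
A stripped-down sketch of that kind of pass, reusing the type names above (hypothetical code, not Soss's source):

# Each model line becomes a Statement: `lhs ~ rhs` → Sample, `lhs = rhs` → Assign.
abstract type Statement end

struct Assign <: Statement
    name::Symbol
    expr::Union{Symbol, Expr}
end

struct Sample <: Statement
    name::Symbol
    expr::Union{Symbol, Expr}
end

function statement(line::Expr)
    if line.head == :call && line.args[1] == :~
        Sample(line.args[2], line.args[3])
    elseif line.head == :(=)
        Assign(line.args[1], line.args[2])
    else
        error("unrecognized model line: $line")
    end
end

statement(:(x ~ Normal(μ, σ)))   # Sample(:x, :(Normal(μ, σ)))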

cscherrer linked a pull request Aug 31, 2020 that will close this issue