Commit 82cf9f7

Merge pull request #67 from probcomp/20190203-marcoct-names

20190203 marcoct names

marcoct committed Feb 8, 2019
2 parents 2ade766 + 9470877
Showing 111 changed files with 3,273 additions and 4,046 deletions.
2 changes: 1 addition & 1 deletion docs/make.jl
@@ -13,7 +13,7 @@ makedocs(
"Probability Distributions" => "ref/distributions.md",
"Built-in Modeling Language" => "ref/modeling.md",
"Generative Function Combinators" => "ref/combinators.md",
"Assignments" => "ref/assignments.md",
"Choice Maps" => "ref/choice_maps.md",
"Selections" => "ref/selections.md",
"Optimizing Trainable Parameters" => "ref/parameter_optimization.md",
"Inference Library" => "ref/inference.md",
16 changes: 8 additions & 8 deletions docs/src/getting_started.md
@@ -19,16 +19,16 @@ Let's write a short Gen program that does Bayesian linear regression: given a se

There are three main components to a typical Gen program.

-First, we define a _generative model_: a Julia function, extended with some extra syntax, that, conceptually, simulates a fake dataset. The model below samples `slope` and `intercept` parameters, and then for each of the x-coordinates that it accepts as input, samples a corresponding y-coordinate. We name the random choices we make with `@addr`, so we can refer to them in our inference program.
+First, we define a _generative model_: a Julia function, extended with some extra syntax, that, conceptually, simulates a fake dataset. The model below samples `slope` and `intercept` parameters, and then for each of the x-coordinates that it accepts as input, samples a corresponding y-coordinate. We name the random choices we make with `@trace`, so we can refer to them in our inference program.

```julia
using Gen

@gen function my_model(xs::Vector{Float64})
-    slope = @addr(normal(0, 2), :slope)
-    intercept = @addr(normal(0, 10), :intercept)
+    slope = @trace(normal(0, 2), :slope)
+    intercept = @trace(normal(0, 10), :intercept)
    for (i, x) in enumerate(xs)
-        @addr(normal(slope * x + intercept, 1), "y-$i")
+        @trace(normal(slope * x + intercept, 1), "y-$i")
    end
end
```
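
To sanity-check the model on its own, it can be run forward with no constraints. A minimal sketch using the renamed API from this commit (`generate`, `choicemap`, `get_choices`); the x-coordinates are illustrative:

```julia
# Run the model forward under empty constraints and inspect the sampled choices.
(trace, _) = generate(my_model, ([0.0, 1.0, 2.0],), choicemap())
choices = get_choices(trace)
println(choices[:slope], " ", choices[:intercept], " ", choices["y-1"])
```
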
@@ -42,14 +42,14 @@ The inference program below takes in a data set, and runs an iterative MCMC algo
function my_inference_program(xs::Vector{Float64}, ys::Vector{Float64}, num_iters::Int)
    # Create a set of constraints fixing the
    # y coordinates to the observed y values
-    constraints = DynamicAssignment()
+    constraints = choicemap()
    for (i, y) in enumerate(ys)
        constraints["y-$i"] = y
    end

    # Run the model, constrained by `constraints`,
    # to get an initial execution trace
-    (trace, _) = initialize(my_model, (xs,), constraints)
+    (trace, _) = generate(my_model, (xs,), constraints)

    # Iteratively update the slope then the intercept,
    # using Gen's metropolis_hastings operator.
@@ -60,8 +60,8 @@ function my_inference_program(xs::Vector{Float64}, ys::Vector{Float64}, num_iter

    # From the final trace, read out the slope and
    # the intercept.
-    assmt = get_assmt(trace)
-    return (assmt[:slope], assmt[:intercept])
+    choices = get_choices(trace)
+    return (choices[:slope], choices[:intercept])
end
```
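
As a usage sketch, the whole program can then be run on synthetic data (the true line, noise level, and iteration count below are illustrative):

```julia
xs = collect(range(-5.0, stop=5.0, length=20))
ys = 2.0 .* xs .- 1.0 .+ randn(20)   # noisy points near y = 2x - 1

(slope, intercept) = my_inference_program(xs, ys, 100)
println("inferred slope = $slope, intercept = $intercept")
```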

38 changes: 19 additions & 19 deletions docs/src/ref/assignments.md → docs/src/ref/choice_maps.md
@@ -1,52 +1,52 @@
-# Assignments
+# Choice Maps

Maps from the addresses of random choices to their values are stored in associative tree-structured data structures that have the following abstract type:
```@docs
-Assignment
+ChoiceMap
```

-Assignments are constructed by users to express observations and/or constraints on the traces of generative functions.
-Assignments are also returned by certain Gen inference methods, and are used internally by various Gen inference methods.
+Choice maps are constructed by users to express observations and/or constraints on the traces of generative functions.
+Choice maps are also returned by certain Gen inference methods, and are used internally by various Gen inference methods.

-Assignments provide the following methods:
+Choice maps provide the following methods:
```@docs
has_value
get_value
-get_subassmt
+get_submap
get_values_shallow
-get_subassmts_shallow
+get_submaps_shallow
to_array
from_array
address_set
```
Note that none of these methods mutate the assignment.
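
As a sketch of the non-mutating accessors (the nested map below is illustrative, built with the `choicemap` constructor documented further down):

```julia
choices = choicemap((:x, true), (:y => 1 => :z, -6.3))

has_value(choices, :x)                # true
get_value(choices, :x)                # true
sub = get_submap(choices, :y)         # choice map containing 1 => :z
collect(get_values_shallow(choices))  # shallow (address, value) pairs
```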

-Assignments also provide `Base.isempty`, which tests of there are no random
+Choice maps also provide `Base.isempty`, which tests whether there are no random
choices in the assignment, and `Base.merge`, which takes two assignments, and
returns a new assignment containing all random choices in either assignment.
It is an error if the assignments both have values at the same address, or if
one assignment has a value at an address that is the prefix of the address of a
value in the other assignment.
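
For example, a sketch of merging two disjoint choice maps (addresses illustrative):

```julia
a = choicemap((:x, true), (:y => 1 => :z, -6.3))
b = choicemap(("foo", 1.25))
merged = merge(a, b)   # contains :x, "foo", and :y => 1 => :z
@assert !isempty(merged)
```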


-## Dynamic Assignment
+## Dynamic Choice Map

-One concrete assignment type is `DynamicAssignment`, which is mutable.
-Users construct `DynamicAssignments` and populate them for use as observations or constraints, e.g.:
+One concrete choice map type is `DynamicChoiceMap`, which is mutable.
+Users construct `DynamicChoiceMaps` and populate them for use as observations or constraints, e.g.:
```julia
-assmt = DynamicAssignment()
-assmt[:x] = true
-assmt["foo"] = 1.25
-assmt[:y => 1 => :z] = -6.3
+choices = choicemap()
+choices[:x] = true
+choices["foo"] = 1.25
+choices[:y => 1 => :z] = -6.3
```

-There is also a constructor for `DynamicAssignment` that takes initial (address, value) pairs:
+There is also a constructor for `DynamicChoiceMap` that takes initial (address, value) pairs:
```julia
-assmt = DynamicAssignment((:x, true), ("foo", 1.25), (:y => 1 => :z, -6.3))
+choices = choicemap((:x, true), ("foo", 1.25), (:y => 1 => :z, -6.3))
```

```@docs
-DynamicAssignment
+choicemap
set_value!
-set_subassmt!
+set_submap!
```
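
A sketch relating these setters to the indexing syntax above, assuming the `(map, address, value)` argument order:

```julia
choices = choicemap()
set_value!(choices, :x, true)                         # same as choices[:x] = true
set_submap!(choices, :y => 1, choicemap((:z, -6.3)))  # graft a sub-map at :y => 1
```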
8 changes: 4 additions & 4 deletions docs/src/ref/combinators.md
@@ -21,7 +21,7 @@ In the schematic below, the kernel is denoted ``\mathcal{G}_{\mathrm{k}}``.
For example, consider the following generative function, which makes one random choice at address `:z`:
```julia
@gen function foo(x1::Float64, x2::Float64)
-    y = @addr(normal(x1 + x2, 1.0), :z)
+    y = @trace(normal(x1 + x2, 1.0), :z)
    return y
end
```
@@ -31,7 +31,7 @@ bar = Map(foo)
```
We can then obtain a trace of `bar`:
```julia
-(trace, _) = initialize(bar, ([0.0, 0.5], [0.5, 1.0]))
+(trace, _) = generate(bar, ([0.0, 0.5], [0.5, 1.0]))
```
This causes `foo` to be invoked twice, once with arguments `(0.0, 0.5)` in address namespace `1` and once with arguments `(0.5, 1.0)` in address namespace `2`.
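
As a sketch, those namespaced choices can be read back out of the trace with the renamed accessor (values are whatever was sampled):

```julia
choices = get_choices(trace)
z1 = choices[1 => :z]   # the :z choice of the first application
z2 = choices[2 => :z]   # the :z choice of the second application
```
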
If the resulting trace has random choices:
@@ -95,7 +95,7 @@ The initial state is denoted ``y_0``, the number of applications is ``n``, and t
For example, consider the following kernel, with state type `Bool`, which makes one random choice at address `:z`:
```julia
@gen function foo(t::Int, y_prev::Bool, z1::Float64, z2::Float64)
-    y = @addr(bernoulli(y_prev ? z1 : z2), :y)
+    y = @trace(bernoulli(y_prev ? z1 : z2), :y)
    return y
end
```
@@ -105,7 +105,7 @@ bar = Map(foo)
```
We can then obtain a trace of `bar`:
```julia
-(trace, _) = initialize(bar, (5, false, 0.05, 0.95))
+(trace, _) = generate(bar, (5, false, 0.05, 0.95))
```
This causes `foo` to be invoked five times.
The resulting trace may contain the following random choices:
68 changes: 34 additions & 34 deletions docs/src/ref/gfi.md
@@ -12,7 +12,7 @@ There are various kinds of generative functions, which are represented by concre
For example, the [Built-in Modeling Language](@ref) allows generative functions to be constructed using Julia function definition syntax:
```julia
@gen function foo(a, b)
-    if @addr(bernoulli(0.5), :z)
+    if @trace(bernoulli(0.5), :z)
        return a + b + 1
    else
        return a + b
    end
end
```
@@ -34,18 +34,18 @@ We represent the randomness used during an execution of a generative function as
In this section, we assume that random choices are discrete to simplify notation.
We say that two random choice maps ``t`` and ``s`` **agree** if they assign the same value for any address that is in both of their domains.

-Generative functions may also use **non-addressed randomness**, which is not included in the map ``t``.
-However, the state of non-addressed random choices *is* maintained by the trace internally.
-We denote non-addressed randomness by ``r``.
-Non-addressed randomness is useful for example, when calling black box Julia code that implements a randomized algorithm.
+Generative functions may also use **untraced randomness**, which is not included in the map ``t``.
+However, the state of untraced random choices *is* maintained by the trace internally.
+We denote untraced randomness by ``r``.
+Untraced randomness is useful, for example, when calling black-box Julia code that implements a randomized algorithm.

The observable behavior of every generative function is defined by the following mathematical objects:

### 1. Input type
The set of valid argument tuples to the function, denoted ``X``.

### 2. Probability distribution family
-A family of probability distributions ``p(t, r; x)`` on maps ``t`` from random choice addresses to their values, and non-addressed randomness ``r``, indexed by arguments ``x``, for all ``x \in X``.
+A family of probability distributions ``p(t, r; x)`` on maps ``t`` from random choice addresses to their values, and untraced randomness ``r``, indexed by arguments ``x``, for all ``x \in X``.
Note that the distribution must be normalized:
```math
\sum_{t, r} p(t, r; x) = 1 \;\; \mbox{for all} \;\; x \in X
```

@@ -55,14 +55,14 @@ We use ``p(t; x)`` to denote the marginal distribution on the map ``t``:
```math
p(t; x) := \sum_{r} p(t, r; x)
```
-And we denote the conditional distribution on non-addressed randomness ``r``, given the map ``t``, as:
+And we denote the conditional distribution on untraced randomness ``r``, given the map ``t``, as:
```math
p(r; x, t) := p(t, r; x) / p(t; x)
```

### 3. Return value function
A (deterministic) function ``f`` that maps the tuple ``(x, t)`` of the arguments and the random choice map to the return value of the function (which we denote by ``y``).
-Note that the return value cannot depend on the non-addressed randomness.
+Note that the return value cannot depend on the untraced randomness.

### 4. Internal proposal distribution family
A family of probability distributions ``q(t; x, u)`` on maps ``t`` from random choice addresses to their values, indexed by tuples ``(x, u)`` where ``u`` is a map from random choice addresses to values, and where ``x`` are the arguments to the function.
@@ -76,7 +76,7 @@ p(t; x) > 0 \mbox{ if and only if } q(t; x, u) > 0 \mbox{ for all } u \mbox{ whe
```math
q(t; x, u) > 0 \mbox{ implies that } u \mbox{ and } t \mbox{ agree }.
```
-There is also a family of probability distributions ``q(r; x, t)`` on non-addressed randomness, that satisfies:
+There is also a family of probability distributions ``q(r; x, t)`` on untraced randomness, that satisfies:
```math
q(r; x, t) > 0 \mbox{ if and only if } p(r; x, t) > 0
```
@@ -107,7 +107,7 @@ get_retval

The map ``t`` from addresses of random choices to their values:
```@docs
-get_assmt
+get_choices
```

The log probability that the random choices took the values they did:
@@ -128,13 +128,13 @@ There are several methods that take a trace of a generative function as input an
We will illustrate these methods using the following generative function:
```julia
@gen function foo()
-    val = @addr(bernoulli(0.3), :a)
-    if @addr(bernoulli(0.4), :b)
-        val = @addr(bernoulli(0.6), :c) && val
+    val = @trace(bernoulli(0.3), :a)
+    if @trace(bernoulli(0.4), :b)
+        val = @trace(bernoulli(0.6), :c) && val
    else
-        val = @addr(bernoulli(0.1), :d) && val
+        val = @trace(bernoulli(0.1), :d) && val
    end
-    val = @addr(bernoulli(0.7), :e) && val
+    val = @trace(bernoulli(0.7), :e) && val
    return val
end
```
@@ -151,22 +151,22 @@ Suppose we have a trace (`trace`) with initial choices:

```
├── :a : false
├── :b : true
├── :c : false
└── :e : true
```
Note that address `:d` is not present, because the branch in which `:d` is sampled was not taken (random choice `:b` had value `true`).

-### Force Update
+### Update
```@docs
-force_update
+update
```
-Suppose we run [`force_update`](@ref) on the example `trace`, with the following constraints:
+Suppose we run [`update`](@ref) on the example `trace`, with the following constraints:
```
├── :b : false
└── :d : true
```
```julia
-constraints = DynamicAssignment((:b, false), (:d, true))
-(new_trace, w, discard, _) = force_update((), noargdiff, trace, constraints)
+constraints = choicemap((:b, false), (:d, true))
+(new_trace, w, _, discard) = update(trace, (), noargdiff, constraints)
```
-Then `get_assmt(new_trace)` will be:
+Then `get_choices(new_trace)` will be:
```
├── :a : false
├── :b : false
├── :d : true
└── :e : true
```

@@ -192,20 +192,20 @@

```math
p(t; x) = 0.7 \times 0.4 \times 0.4 \times 0.7 = 0.0784\\
p(t'; x') = 0.7 \times 0.6 \times 0.1 \times 0.7 = 0.0294\\
w = \log p(t'; x')/p(t; x) = \log 0.0294/0.0784 = \log 0.375
```
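
As a quick numeric check of that weight in runnable form (a sketch; the factors follow the `bernoulli` densities in `foo` and the choices listed above):

```julia
p_old = 0.7 * 0.4 * 0.4 * 0.7   # p(t; x):   :a false, :b true,  :c false, :e true = 0.0784
p_new = 0.7 * 0.6 * 0.1 * 0.7   # p(t'; x'): :a false, :b false, :d true,  :e true = 0.0294
w = log(p_new / p_old)          # log(0.375) ≈ -0.981
```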

-### Free Update
+### Regenerate
```@docs
-free_update
+regenerate
```
-Suppose we run [`free_update`](@ref) on the example `trace`, with selection `:a` and `:b`:
+Suppose we run [`regenerate`](@ref) on the example `trace`, with selection `:a` and `:b`:
```julia
-(new_trace, w, _) = free_update((), noargdiff, trace, select(:a, :b))
+(new_trace, w, _) = regenerate(trace, (), noargdiff, select(:a, :b))
```
Then, a new value for `:a` will be sampled from `bernoulli(0.3)`, and a new value for `:b` will be sampled from `bernoulli(0.4)`.
If the new value for `:b` is `true`, then the previous value for `:c` (`false`) will be retained.
If the new value for `:b` is `false`, then a new value for `:d` will be sampled from `bernoulli(0.1)`.
The previous value for `:e` will always be retained.
Suppose the new value for `:a` is `true`, and the new value for `:b` is `true`.
-Then `get_assmt(new_trace)` will be:
+Then `get_choices(new_trace)` will be:
```
├── :a : true
├── :b : true
├── :c : false
└── :e : true
```

@@ -219,9 +219,9 @@ Then `get_choices(new_trace)` will be:
The weight (`w`) is ``\log 1 = 0``, because the selected choices are proposed from their priors and the retained choices contribute the same probability to both traces.
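
In practice, selection-based moves like this are usually made through [`metropolis_hastings`](@ref), which wraps [`regenerate`](@ref) in an accept/reject step; a sketch (the tuple return here is an assumption):

```julia
(trace, accepted) = metropolis_hastings(trace, select(:a, :b))
```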


-### Extend
+### Extend update
```@docs
-extend
+extend_update
```

### Argdiffs
@@ -270,8 +270,8 @@ This static property of the generative function is reported by `accepts_output_g
```@docs
has_argument_grads
accepts_output_grad
-backprop_params
-backprop_trace
+accumulate_param_gradients!
+choice_gradients
get_params
```

@@ -320,19 +320,19 @@ Then, the argument tuple passed to e.g. [`initialize`](@ref) will have two eleme
NOTE: Be careful to distinguish between arguments to the generative function itself, and arguments to the constructor of the generative function.
For example, if you have a generative function type that is parametrized by, for example, modeling DSL code, this DSL code would be a parameter of the generative function constructor.

-### Decide what the addressed random choices (if any) will be
+### Decide what the traced random choices (if any) will be
Remember that each random choice is assigned a unique address in (possibly) hierarchical address space.
You are free to design this address space as you wish, although you should document it for users of your generative function type.

### Implement the methods of the interface

- At minimum, you need to implement all methods under the [`Traces`](@ref) heading (e.g. [`initialize`](@ref), ...); a skeletal starting point is sketched after this list.

-- To support [`metropolis_hastings`](@ref) or local optimization, or local iterative adjustments to traces, be sure to implement the [`force_update`](@ref) and [`free_update](@ref) methods.
+- To support [`metropolis_hastings`](@ref), local optimization, or other local iterative adjustments to traces, be sure to implement the [`update`](@ref) and [`regenerate`](@ref) methods.

-- To support gradients of the log probability density with respect to the arguments and/or random choices made by the function, implement the [`backprop_trace`](@ref) method.
+- To support gradients of the log probability density with respect to the arguments and/or random choices made by the function, implement the [`choice_gradients`](@ref) method.

-- Generative functions can also have trainable parameters (e.g. neural network weights). To support these, implement the [`backprop_params`](@ref) method.
+- Generative functions can also have trainable parameters (e.g. neural network weights). To support these, implement the [`accumulate_param_gradients!`](@ref) method.

- To support use of your generative function in custom proposals (instead of just generative models), implement [`assess`](@ref) and [`propose`](@ref) methods.
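
As a minimal sketch of the shell such an implementation might start from (all names here are hypothetical; the type parameters follow the return-type/trace-type convention of `GenerativeFunction`):

```julia
using Gen

# Hypothetical trace type holding whatever state the methods need.
struct MyTrace
    args::Tuple
    retval::Bool
end

# A generative function type is a subtype of GenerativeFunction{T, U},
# where T is the return type and U is the trace type.
struct MyGenFn <: Gen.GenerativeFunction{Bool, MyTrace} end

# The interface methods (initialize, update, regenerate, ...) would then
# be implemented for MyGenFn and MyTrace.
```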
