In [37]:
import sys
import os
sys.path.insert(0, os.path.abspath('..'))

import random
from src import *

Let's define a coin that returns 1 with probability `p` and 0 otherwise. With `p = 0.5` the coin is fair: it outputs 0 half the time and 1 the other half.

In [38]:
def coin(p: float) -> float:
    return 1 if random.random() < p else 0

for _ in range(5):
    print(coin(0.5))

0
0
0
1
0


Now we introduce the notion of a `Sampler`, which is an implementation of the `IntractableReal` class. Samplers are useful when we can run a program but have limited knowledge of its distribution over outputs.

In [39]:
sampler = Sampler(lambda: coin(0.5))
print(sampler)

Sampler(<function <lambda> at 0x1043ea700>)


Our `Sampler(f).estimate()` implements the simplest possible unbiased estimator of the program's mean: it runs `f()` once and returns that single output as the estimate.

In [40]:
for _ in range(5):
    print(sampler.estimate())

1
1
0
0
1


To estimate $Var(f)$, our sampler defines its own internal sampler, which takes 2 samples from our lambda, $f$.

This looks like: `Sampler(lambda: 0.5 * (self.f() - self.f())**2)`

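Note that `self.f()` has only two possible values: 0 and 1.

So $f() - f() \in \{-1, 0, 1\}$, meaning $(f() - f())^2 \in \{0, 1\}$.

For a fair coin, the two samples agree with probability 0.5 (difference 0) and disagree with probability 0.5 (difference $-1$ or $1$). A single variance estimate is therefore 0 half the time and 0.5 the other half, which averages to 0.25, the true variance of a fair coin. Let's sample and see!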

In [41]:
for _ in range(10):
    print(sampler.variance().estimate())

0.0
0.0
0.0
0.0
0.5
0.0
0.0
0.0
0.0
0.0
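
The pairwise estimator can also be checked without the library. For two independent draws $X$ and $Y$ from the same distribution, $E[(X - Y)^2] = 2\,Var(X)$, so $0.5\,(X - Y)^2$ is an unbiased single-draw estimate of the variance. The sketch below is standalone: `pairwise_variance_sample` is a hypothetical helper introduced for illustration, not part of `src`, and the seed and sample count are arbitrary choices.

```python
import random

def coin(p: float) -> int:
    # Same coin as above: 1 with probability p, else 0.
    return 1 if random.random() < p else 0

def pairwise_variance_sample(f) -> float:
    # Hypothetical helper (not from src): one unbiased draw of Var(f),
    # built from two independent calls to f.
    a, b = f(), f()
    return 0.5 * (a - b) ** 2

random.seed(0)  # arbitrary seed, only to make the run repeatable
n = 100_000
draws = [pairwise_variance_sample(lambda: coin(0.5)) for _ in range(n)]
mean_of_draws = sum(draws) / n
print(mean_of_draws)  # should land near the true variance, 0.25
```

Each individual draw is either 0.0 or 0.5, as in the output above; only their average approaches 0.25.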


We can use the `Dist(d, n)` implementation of the `IntractableReal` class to estimate the mean and variance of an underlying distribution, d, over many samples, n.

Say we don't know our coin is fair. We can estimate the true mean and variance of the distribution over coin flips by taking many samples using our `Sampler()`.

In [42]:
dist = Dist(sampler, 1000)
print(dist.estimate(), dist.variance().estimate())

0.511 0.2425


This works: both estimates are close to the true mean of 0.5 and the true variance of 0.25 for a fair coin. We can do the same to discover unfair coins.

In [43]:
sampler = Sampler(lambda: coin(0.8))
dist = Dist(sampler, 100)
print(dist.estimate(), dist.variance().estimate())

0.79 0.165
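
As a sanity check on both coin experiments, a coin that returns 1 with probability $p$ has mean $p$ and variance $p(1 - p)$. The short sketch below evaluates these closed forms; `bernoulli_moments` is a hypothetical name introduced here and is not part of `src`.

```python
def bernoulli_moments(p: float) -> tuple[float, float]:
    # Exact mean and variance of a coin returning 1 with probability p.
    # Hypothetical helper for comparison with the sampled estimates.
    return p, p * (1 - p)

for p in (0.5, 0.8):
    mean, var = bernoulli_moments(p)
    print(f"p={p}: mean={mean:.4f} variance={var:.4f}")
# p=0.5: mean=0.5000 variance=0.2500
# p=0.8: mean=0.8000 variance=0.1600
```

The sampled values 0.79 and 0.165 for the unfair coin sit close to the exact 0.8 and 0.16, which is about what 100 samples can be expected to resolve.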


Because we are just sampling from our lambda, it can be arbitrarily complex! Run this a few times; with enough samples, the estimates stay close to each other from run to run.

In [44]:
def weird_tricoin(p: float) -> float:
    if coin(p) == 0:
        return 0
    else:
        if coin(1-p) == 0:
            return 1
        else:
            return 100

sampler = Sampler(lambda: weird_tricoin(0.5))
dist = Dist(sampler, 10000)
print(dist.estimate(), dist.variance().estimate())


25.0741 1850.52645
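
For a program this small, the exact moments can be read off its branches and compared with the sampled values. The sketch below enumerates the three outcomes of `weird_tricoin(p)` with their probabilities; `tricoin_distribution` and `exact_moments` are hypothetical helpers written for this check, not part of `src`.

```python
def tricoin_distribution(p: float) -> dict[int, float]:
    # Outcome -> probability, following the branches of weird_tricoin:
    #   first coin(p) is 0              -> 0   with probability 1 - p
    #   first is 1, coin(1 - p) is 0    -> 1   with probability p * p
    #   first is 1, coin(1 - p) is 1    -> 100 with probability p * (1 - p)
    return {0: 1 - p, 1: p * p, 100: p * (1 - p)}

def exact_moments(dist: dict[int, float]) -> tuple[float, float]:
    # Mean and variance of a finite discrete distribution.
    mean = sum(x * q for x, q in dist.items())
    second_moment = sum(x * x * q for x, q in dist.items())
    return mean, second_moment - mean ** 2

mean, var = exact_moments(tricoin_distribution(0.5))
print(mean, var)  # 25.25 1862.6875
```

The sampled mean of 25.0741 and variance of 1850.52645 are both within about one percent of these exact values.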


Say you've done the math, and you know that the expected value of this program is 25.25. You can pass this to the sampler via `known_mean`. The variance is then estimated by sampling against that mean with a refined variance sampler: `return Sampler(lambda: (self.f() - self.known_mean)**2)`.

In [45]:
sampler_with_known_mean = Sampler(lambda: weird_tricoin(0.5), known_mean=25.25)
# print(sampler_with_known_mean.estimate())

dist = Dist(sampler_with_known_mean, 10000)
print(dist.estimate(), dist.variance().estimate())

25.25 1832.6311
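
The known-mean variance sampler can be sketched without the library as well. When the mean $\mu$ is known exactly, $(f() - \mu)^2$ is an unbiased single-draw estimate of the variance and needs only one call to `f` per draw instead of two. The code below is a standalone illustration under that assumption; it does not reproduce how `Sampler` or `Dist` are implemented in `src`, and the seed and sample count are arbitrary.

```python
import random

def coin(p: float) -> int:
    return 1 if random.random() < p else 0

def weird_tricoin(p: float) -> int:
    # Same program as above: 0, 1, or 100 depending on two coin flips.
    if coin(p) == 0:
        return 0
    return 1 if coin(1 - p) == 0 else 100

KNOWN_MEAN = 25.25  # exact mean of weird_tricoin(0.5), as stated above

random.seed(2)  # arbitrary seed, only to make the run repeatable
n = 100_000
# One call to the program per draw, squared distance from the known mean.
draws = [(weird_tricoin(0.5) - KNOWN_MEAN) ** 2 for _ in range(n)]
var_est = sum(draws) / n
print(var_est)  # should land near the exact variance, 1862.6875
```

Note that the reported mean of exactly 25.25 in the output above is the supplied `known_mean`, not a sampled quantity; only the variance is being estimated.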
