In [1]:
# 1) Random sampling (torch.randn, torch.rand, torch.distributions module)
# 2) Demonstration of central limit theorem through simulation
# 3) Autodiff and computational graph construction (grad fields for tensors, backward method (and how it discards the implicitly created computational graph), optimizers, loss functions)
# 4) Application of autodiff to a very simple problem (like fitting a normal and/or a beta distribution through maximum likelihood)

In [2]:
import torch
import numpy as np

# Part 1: Generating some data
To do this lab you're going to need some data. Usually I wouldn't suggest doing this, but why don't we just make up our data? Using PyTorch's __[built-in distribution sampling functions](https://pytorch.org/docs/stable/distributions.html)__, create one dataset sampled from a normal distribution and another sampled from a Bernoulli distribution.

## Sampling from a normal distribution
First, draw a sample from a normal distribution using the parameters below. Store your data in the variable named `X_norm`, and print out the sample mean and standard deviation.

In [3]:
trueMu = 1.491    # mean
trueSig = 0.1876  # standard deviation
nNormal = 10
#### your code here ####
normalDist = torch.distributions.normal.Normal(loc=trueMu, scale=trueSig)
# sample() already returns a tensor; the deprecated torch.autograd.Variable wrapper is not needed
X_norm = normalDist.sample(sample_shape=[nNormal])
# np.std uses the population formula (ddof=0), whereas torch.std divides by n - 1 by default
print("Inaccurate sample mean: {}, sample std: {}".format(X_norm.mean(), np.std(X_norm.numpy())))

Inaccurate sample mean: 1.5139328241348267, sample std: 0.1719292849302292


How close are the sample estimates of the mean and standard deviation to the true values? How can you improve the sample estimation of these values? What is the theorem that backs up your answer? Please describe how you might do that below and then implement your solution. Continue to store your data in the variable `X_norm`.

Note: we care more about your explanation showing that you understand what you're doing than about how close you can get your estimates.
### 1) Your answer here

In [4]:
#### your code here ####
nNormal = 10000  # a larger sample brings the estimates closer to the true parameters
X_norm = normalDist.sample(sample_shape=[nNormal])
print("More accurate sample mean: {}, sample std: {}".format(X_norm.mean(), np.std(X_norm.numpy())))

More accurate sample mean: 1.4918864965438843, sample std: 0.18744029104709625
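
One way to see why the larger sample helps is to repeat the experiment many times and look at how much the sample mean itself varies. The sketch below is only an illustration: it restates the lab's parameters, and the names `sample_mean_spread` and `n_repeats` are made up for this example. It compares the observed spread of the sample mean with the `sigma / sqrt(n)` standard error that the central limit theorem predicts.

```python
import math
import torch

torch.manual_seed(0)  # fixed seed so the sketch is reproducible

true_mu, true_sig = 1.491, 0.1876  # same parameters as the lab
dist = torch.distributions.Normal(loc=true_mu, scale=true_sig)

def sample_mean_spread(n, n_repeats=2000):
    """Draw n_repeats datasets of size n and return the std of their sample means."""
    samples = dist.sample((n_repeats, n))  # shape: (n_repeats, n)
    means = samples.mean(dim=1)            # one sample mean per dataset
    return means.std().item()

for n in (10, 100, 1000):
    observed = sample_mean_spread(n)
    predicted = true_sig / math.sqrt(n)    # standard error predicted by theory
    print(f"n={n:5d}  observed spread={observed:.4f}  predicted={predicted:.4f}")
```

The observed spread should shrink roughly tenfold for every hundredfold increase in `n`, which is why going from 10 to 10000 samples tightens the estimates so much.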


## Binary data generation
Follow the same steps as above, but this time we want binary data instead of normally distributed data. What distribution can you use to sample binary data? Please ensure that the sample estimate of the probability of seeing a value of `1` is close to `trueP`.

In [5]:
trueP = 0.23
nBinary = 10000
#### your code here ####
# torch.tensor([trueP]) has shape [1], so X_bin ends up with shape [nBinary, 1]
bern = torch.distributions.bernoulli.Bernoulli(torch.tensor([trueP]))
X_bin = bern.sample(sample_shape=[nBinary])
print("Sample estimate P(1): {}".format(X_bin.mean()))

Sample estimate P(1): 0.23499999940395355


# Part 2: Maximum Likelihood Estimation

## MLE: Normal
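
The training loops in this part lean on two autograd behaviors: `backward()` adds into `.grad` rather than overwriting it, and it frees the computational graph once it has run. Here is a minimal sketch of both, using a toy tensor `w` that is not part of the lab.

```python
import torch

w = torch.tensor([2.0], requires_grad=True)

# Each forward pass builds a fresh graph; backward() accumulates into w.grad.
(w ** 2).sum().backward()
first_grad = w.grad.clone()         # d(w^2)/dw = 2w = 4
(w ** 2).sum().backward()
accumulated_grad = w.grad.clone()   # 4 + 4 = 8, because gradients add up

w.grad.zero_()                      # reset before the next step, as the loops do

# A second backward() through the same graph fails once its buffers are freed.
loss = (w ** 2).sum()
loss.backward()
try:
    loss.backward()
    second_call_failed = False
except RuntimeError:
    second_call_failed = True

print(first_grad, accumulated_grad, second_call_failed)
```

This is why the loops rebuild the loss on every iteration and zero the gradients after each update. Passing `retain_graph=True` to `backward()` keeps the graph alive if you really need a second pass.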

In [12]:
# requires_grad=True makes autograd track these tensors; torch.autograd.Variable is deprecated
learnedMu = torch.rand(1, requires_grad=True)
learnedSigma = torch.rand(1, requires_grad=True)
print(learnedMu)
print(learnedSigma)


learningRate = 0.000001
n = X_norm.shape[0]
for myIter in range(1000):
    # Negative log-likelihood of the normal model; minimizing it maximizes the likelihood
    lossFunc = ((n/2)*torch.log(2*np.pi*(learnedSigma**2)) + (1/(2*(learnedSigma**2)))*torch.sum((X_norm - learnedMu)**2))
    lossFunc.backward()
    if myIter % 100 == 0:
        # The value labelled "log likelihood" is the negative log-likelihood computed above
        print("log likelihood: {}, learned mu = {}, learned sigma {}".format(lossFunc.detach().numpy(),
                                                                          learnedMu.detach().numpy(),
                                                                          learnedSigma.detach().numpy()))
    # Plain gradient-descent step, done under no_grad so autograd does not track it
    with torch.no_grad():
        learnedMu -= learnedMu.grad*learningRate
        learnedSigma -= learnedSigma.grad*learningRate
    # backward() accumulates into .grad, so reset it before the next iteration
    learnedMu.grad.zero_()
    learnedSigma.grad.zero_()

tensor([0.5881], requires_grad=True)
tensor([0.7432], requires_grad=True)
log likelihood: [13933.847], learned mu = [0.588112], learned sigma [0.7431544]
log likelihood: [-2553.5645], learned mu = [1.49186], learned sigma [0.1874403]
log likelihood: [-2553.5645], learned mu = [1.4918864], learned sigma [0.18744029]
log likelihood: [-2553.5645], learned mu = [1.4918864], learned sigma [0.18744029]
log likelihood: [-2553.5645], learned mu = [1.4918864], learned sigma [0.18744029]
log likelihood: [-2553.5645], learned mu = [1.4918864], learned sigma [0.18744029]
log likelihood: [-2553.5645], learned mu = [1.4918864], learned sigma [0.18744029]
log likelihood: [-2553.5645], learned mu = [1.4918864], learned sigma [0.18744029]
log likelihood: [-2553.5645], learned mu = [1.4918864], learned sigma [0.18744029]
log likelihood: [-2553.5645], learned mu = [1.4918864], learned sigma [0.18744029]
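
The learned values land on the sample mean and sample standard deviation printed earlier, which is what the closed-form MLE for a normal distribution predicts. As a cross-check, here is a sketch of the same fit written with `torch.optim.SGD` and `Normal.log_prob` instead of manual updates. It is one possible variant, not the lab's required solution: the data `x` is a stand-in for `X_norm`, and optimizing `log_sigma` (a name chosen for this example) is an added assumption that keeps sigma positive.

```python
import torch

torch.manual_seed(0)

# Stand-in data; in the lab this would be X_norm.
x = torch.distributions.Normal(1.491, 0.1876).sample((10000,))

mu = torch.zeros(1, requires_grad=True)
log_sigma = torch.zeros(1, requires_grad=True)  # sigma = exp(log_sigma) is always positive

optimizer = torch.optim.SGD([mu, log_sigma], lr=0.02)

for step in range(2000):
    optimizer.zero_grad()                  # clear gradients left over from the previous step
    dist = torch.distributions.Normal(mu, log_sigma.exp())
    nll = -dist.log_prob(x).mean()         # mean negative log-likelihood
    nll.backward()                         # fills mu.grad and log_sigma.grad
    optimizer.step()                       # gradient-descent update

# Closed-form MLE for comparison: sample mean and the biased (divide by n) std.
mu_closed = x.mean()
sigma_closed = ((x - mu_closed) ** 2).mean().sqrt()
print(mu.item(), mu_closed.item())
print(log_sigma.exp().item(), sigma_closed.item())
```

Averaging the loss rather than summing it keeps the gradient scale independent of the dataset size, so the learning rate does not need to be as tiny as in the manual loop.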


## MLE: Bernoulli

In [13]:
# Random starting guess for P(1); requires_grad=True lets autograd compute its gradient
learnedP = torch.rand(1, requires_grad=True)

In [14]:
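learningRate = 0.00001
for myIter in range(1000):
    # Negative log-likelihood: each sample contributes log(p) if it is 1 and log(1 - p) if it is 0
    lossFunc = -torch.sum(torch.log(X_bin*learnedP + (1-X_bin)*(1-learnedP)))
    lossFunc.backward()
    if myIter % 100 == 0:
        # The value labelled "log likelihood" is the negative log-likelihood computed above
        print("log likelihood: {}, learned prob = {}, update = {}".format(lossFunc.detach().numpy(),
                                                                          learnedP.detach().numpy(),
                                                                          learnedP.grad.numpy()))
    # Plain gradient-descent step, done under no_grad so autograd does not track it
    with torch.no_grad():
        learnedP -= learnedP.grad*learningRate
    # backward() accumulates into .grad, so reset it before the next iteration
    learnedP.grad.zero_()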

log likelihood: 15082.716796875, learned prob = [0.85383964], update = [49587.496]
log likelihood: 5452.47607421875, learned prob = [0.23499978], update = [0.]
log likelihood: 5452.47607421875, learned prob = [0.23499978], update = [0.]
log likelihood: 5452.47607421875, learned prob = [0.23499978], update = [0.]
log likelihood: 5452.47607421875, learned prob = [0.23499978], update = [0.]
log likelihood: 5452.47607421875, learned prob = [0.23499978], update = [0.]
log likelihood: 5452.47607421875, learned prob = [0.23499978], update = [0.]
log likelihood: 5452.47607421875, learned prob = [0.23499978], update = [0.]
log likelihood: 5452.47607421875, learned prob = [0.23499978], update = [0.]
log likelihood: 5452.47607421875, learned prob = [0.23499978], update = [0.]
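
The learned probability matches the sample estimate of P(1) from Part 1, as expected: the Bernoulli MLE is just the fraction of ones. A possible variant, sketched below, optimizes a logit instead of the probability so the parameter can never leave (0, 1), and uses the built-in loss `binary_cross_entropy_with_logits`. The data `x` is a stand-in for `X_bin`, and the names `logit`, `p_learned` and `p_closed` are made up for this example.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in data; in the lab this would be X_bin.
x = torch.distributions.Bernoulli(probs=0.23).sample((10000,))

logit = torch.zeros(1, requires_grad=True)  # p = sigmoid(logit) always lies in (0, 1)
optimizer = torch.optim.SGD([logit], lr=1.0)

for step in range(500):
    optimizer.zero_grad()
    # Mean negative log-likelihood of the Bernoulli data, computed from the logit.
    loss = F.binary_cross_entropy_with_logits(logit.expand_as(x), x)
    loss.backward()
    optimizer.step()

p_learned = torch.sigmoid(logit).item()
p_closed = x.mean().item()  # closed-form MLE: the fraction of ones
print(p_learned, p_closed)
```

Working with the logit avoids the `log(0)` or negative-probability problems that a direct update on `learnedP` could hit with a larger learning rate.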
