
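We want to estimate $E[\hat V_{10}^{2}]$ via ordinary Monte Carlo (OMC), where $\hat V_{10} = \frac{1}{10}\sum_{i=1}^{10} I(Z_{i} < -2)$ and $Z_{i} \sim N(0,1)$. First, import the numpy module.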

In [0]:
import numpy as np

In [0]:
def omc_v10_sq(n):
  v_vec = np.zeros(n) # One entry per trial
  for j in range(n):
    indic_sum = 0 # Initialize the sum of the indicator
    for i in range(10):
      z = np.random.normal(0,1) # Generate Z_i
      if z < -2:
        indic_sum = indic_sum + 1 # Increment the sum if the criterion is met
    v10 = indic_sum/10 # After summing, compute v-hat-10
    v_vec[j] = v10**2 # We want the expected value of v-hat-10 squared,
                      # so the entry in the vector is v10**2.
  return v_vec.mean()

In [11]:
out_1 = omc_v10_sq(1000)
out_2 = omc_v10_sq(2000)
out_3 = omc_v10_sq(100000)
print("OMC approximation for 1000 trials is: " + str(out_1))
print("OMC approximation for 2000 trials is: " + str(out_2))
print("OMC approximation for 100000 trials is: " + str(out_3))

OMC approximation for 1000 trials is: 0.0028200000000000005
OMC approximation for 2000 trials is: 0.0026550000000000007
OMC approximation for 100000 trials is: 0.0027788000000000005
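
As a cross-check on the runs above, $E[\hat V_{10}^{2}]$ has a closed form: $10\hat V_{10}$ counts how many of the 10 draws fall below $-2$, so it is Binomial$(10, p)$ with $p = \Phi(-2)$, which gives $E[\hat V_{10}^{2}] = p(1-p)/10 + p^{2}$. A minimal sketch of that calculation follows; the helper names `normal_cdf` and `exact_second_moment` are illustrative and not part of the original notebook.

```python
import math

def normal_cdf(x):
    """Standard normal CDF, written with erfc so the lower tail stays accurate."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def exact_second_moment(n=10, threshold=-2.0):
    # n * V_hat_n is Binomial(n, p) with p = P(Z < threshold).
    p = normal_cdf(threshold)
    # E[V^2] = Var(V) + E[V]^2 = p(1 - p)/n + p^2
    return p * (1.0 - p) / n + p ** 2

print("Closed-form E[V_hat_10^2]:", exact_second_moment())
```

Comparing this number with `omc_v10_sq(n)` for increasing `n` gives a quick check that the simulation is converging to the right target.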


From these runs, the OMC approximation of $E[\hat V_{10} ^{2}]$ is roughly $0.0027$ to $0.0028$, with the largest run giving about $0.00278$. Now, run the importance sampling method.

In [0]:
def v10_is(n,b):
  # Take n samples from N(-b, 1), where N() represents the Normal distribution.
  indic_sum = 0 # Running sum of the weighted indicator
  for i in range(n):
    x = np.random.normal(-b,1) # Generate a sample from N(-b, 1)
    if x < -2: # Add the weight if x_i < -2 (if not, the summation term is 0)
      # Weight as used in this notebook; note that the N(0,1)/N(-b,1)
      # density ratio is exp(b*x + b**2/2), without the factor of 10.
      indic_sum = indic_sum + np.exp(10 * x * b)
  v10 = np.exp(0.5 * (b**2)) * (1/n) * indic_sum # Average over the n samples
  return v10

Given this importance sampling, compute $E[\hat V_{10}^{2}]$ as the average of a large number of trials.

In [0]:
def is_average(m,n,b):
  v10_is_vec = np.zeros(m)
  for i in range(m):
    v10_out = v10_is(n,b)
    v10_is_vec[i] = v10_out**2
  return v10_is_vec.mean()

In [12]:
is_out_1 = is_average(1000,10,2)
is_out_2 = is_average(2000,10,2)
is_out_3 = is_average(10000,10,2)

print("Importance Sampling output for m = 1000 is " + str(is_out_1))
print("Importance Sampling output for m = 2000 is " + str(is_out_2))
print("Importance Sampling output for m = 10000 is " + str(is_out_3))

Importance Sampling output for m = 1000 is 1.2369776244356614e-36
Importance Sampling output for m = 2000 is 1.4662160938366953e-36
Importance Sampling output for m = 10000 is 1.3201695680616566e-36


According to these, the IS output is about $10^{-36}$, which is effectively 0. The OMC output was also small, but only on the order of $10^{-3}$. The IS value can be traced back to the underlying algorithm.

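The sum of the indicator terms is nonzero only if at least one $x_{i} < -2$. With $b = 2$, each such term equals $e^{10bx_{i}} = e^{20x_{i}}$. As $e^{x}$ is an increasing function and $x_{i} < -2$, each term is less than $e^{-40} \approx 4.25 \times 10^{-18}$. Even if all 10 samples took this value, the $\frac{1}{10}$ factor in the definition used for $\hat V_{10}$ would cancel the count (by way of $10 \cdot e^{-40} \cdot \frac{1}{10}$), leaving $e^{\frac{1}{2}b^{2}}e^{-40} = e^{2}e^{-40} = e^{-38} \approx 3.14 \times 10^{-17}$ as an upper bound on $\hat V_{10}$, and hence about $10^{-33}$ as an upper bound on $\hat V_{10}^{2}$. So an output near $10^{-36}$ is consistent with the code as written.

It is not consistent with the OMC result, though, and the OMC is not the one that is off. Since $10\hat V_{10}$ is Binomial$(10, p)$ with $p = \Phi(-2) \approx 0.0228$, the exact value is $E[\hat V_{10}^{2}] = p(1-p)/10 + p^{2} \approx 0.00274$, which matches the OMC runs. Any unbiased estimator of $p$ has second moment at least $p^{2} \approx 5.2 \times 10^{-4}$, so a value near $10^{-36}$ means the IS estimator above is not unbiased. The likely culprit is my weight: the density ratio of $N(0,1)$ to $N(-b,1)$ is $e^{bx + \frac{1}{2}b^{2}}$, so the exponent should be $bx_{i}$ rather than $10bx_{i}$.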
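
For comparison, here is a minimal sketch of the estimator with the likelihood-ratio weight $e^{bx + \frac{1}{2}b^{2}}$, which is the ratio of the $N(0,1)$ density to the $N(-b,1)$ density. It is an assumption about what the estimator is meant to be, not a record of what was run above. The names `v_hat_is` and `second_moment_is`, the `seed` argument, and the use of `np.random.default_rng` are choices made for this sketch.

```python
import numpy as np

def v_hat_is(n, b, rng):
    # Draw n samples from the shifted proposal N(-b, 1).
    x = rng.normal(-b, 1.0, size=n)
    # Likelihood ratio phi(x) / phi(x + b) = exp(b*x + b^2/2).
    weights = np.exp(b * x + 0.5 * b ** 2)
    # Average the weighted indicator of the event {x < -2}.
    return np.mean((x < -2.0) * weights)

def second_moment_is(m, n, b, seed=0):
    # Average of V_hat^2 over m independent replications.
    rng = np.random.default_rng(seed)
    draws = np.array([v_hat_is(n, b, rng) for _ in range(m)])
    return np.mean(draws ** 2)

print("IS second moment, m = 2000, n = 10, b = 2:", second_moment_is(2000, 10, 2.0))
```

With this weight the estimator is unbiased for $\Phi(-2)$, so its second moment cannot drop below $\Phi(-2)^{2}$. A second moment smaller than the OMC one then indicates a smaller variance, not a disagreement between the two methods.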

As for choosing $b$ to make the importance sampling most efficient, I'm not entirely sure.
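
One possible way to approach the efficiency question, assuming the estimator uses the likelihood-ratio weight $e^{bx + \frac{1}{2}b^{2}}$ (which is not what `v10_is` currently computes): a single weighted draw then has second moment $e^{b^{2}}\Phi(-2-b)$, so $E[\hat V_{n}^{2}] = v^{2} + \left(e^{b^{2}}\Phi(-2-b) - v^{2}\right)/n$ with $v = \Phi(-2)$. Because the estimator is unbiased under that assumption, minimizing this over $b$ is the same as minimizing the variance. The sketch below scans a grid of $b$ values; the grid range and the names `normal_cdf` and `is_second_moment` are illustrative.

```python
import math

def normal_cdf(x):
    """Standard normal CDF, written with erfc so the lower tail stays accurate."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def is_second_moment(b, n=10, threshold=-2.0):
    # Target probability v = P(Z < threshold).
    v = normal_cdf(threshold)
    # Second moment of one weighted draw under N(-b, 1):
    # E[I(X < c) * w(X)^2] = exp(b^2) * Phi(c - b).
    single = math.exp(b ** 2) * normal_cdf(threshold - b)
    # E[V_hat^2] = v^2 + Var(single draw) / n
    return v ** 2 + (single - v ** 2) / n

# Scan b on a grid from 0 to 4 in steps of 0.01 (an arbitrary illustrative range).
grid = [k / 100 for k in range(0, 401)]
best_b = min(grid, key=is_second_moment)
print("b minimizing E[V_hat_10^2] on the grid:", best_b)
print("second moment at that b:", is_second_moment(best_b))
```

Setting `b = 0` recovers the OMC case, which makes it easy to compare the two second moments directly.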