# Solution 1

We have importance sampling estimator as $$\hat{\theta_g}= \frac{1}{N} \sum_{t=1}^{N}\frac{h(Z_t)\pi(Z_t)}{g(Z_t)}$$
where, function to be estimated is $h(x)=x$, importance density is $g(x)=\frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}$ and target density is $\pi(x) = \frac{1}{\pi(1+x^2)}$.
Now, after substituting the values and a bit of rearrangement, we can see that $Var_g(\hat{\theta_g})$ is finite iff $Var_g(\frac{h(Z_1)\pi(Z_1)}{g(Z_1)})=Var_g(\frac{xe^{\frac{x^2}{2}}}{1+x^2})=Var_g(s(x))$ is finite.
Now, $E_g(s)=\int_{-\infty}^{\infty}\frac{x}{1+x^2}dx = 0$, and $E_g(s^2)= \int_{-\infty}^{\infty} \frac{x^2e^{\frac{x^2}{2}}}{(1+x^2)^2} \to \infty$. So, $Var_g(s)=\infty$ and as a result the estimator does not have a finite variance. 
 

# Solution 2

(a) If we assume that the error in the approximate expression for variance in the lecture to be constant upto scaling then even if the simple importance estimator has finite variance, the weighted importance estimator may not have finite variance. The reason is that $Var_g(\hat{\theta_g})$ is finite iff $\frac{E_g(w^2)}{(E_g(w))^2}$ is finite. But if $g$ is uniform distibution on an interval and $w$ is like the function described [here](https://math.stackexchange.com/a/4230717) scaled to the same interval, then the ratio will diverge.

(b) In importance sampling, proposals are never rejected, but in accept-reject, we may need many iterations to get an acceptable proposal. So, if we use importace sampling we may land near the result quickly.

# Solution to Problem 3

By running the code for different values of $N$, we can see that the variance is increasing with $N$ for all three cases. This suggests that weighted importance sampling estimator does not have a finite variance in all three cases. The variance is decreasing with incresing value of $v$. 

In [10]:
using Distributions
using Random
Random.seed!(1)

TaskLocalRNG()

In [11]:
function sampleY(v,n)
    global i=0
    global Y=[]
    global v=5
    d=TDist(v)
    while(i<n)
        global i=i+1
        y=rand(d,1)
        append!(Y,y)
    end
return Y
end

sampleY (generic function with 1 method)

In [12]:
function PI(Y,x,v,n)
    global m=exp(-x*x/2)
    global i1=0
    while(i1<n)
        global i1=i1+1
        global m=m*(1+(Y[i1]-x)^2/v)^(-(v+1)/2)
    end
    return m
end

PI (generic function with 1 method)

In [13]:
function normalcdf(x)
    a=exp((-x*x/2))/((2*pi)^(0.5))
    return a
end

normalcdf (generic function with 1 method)

In [14]:
function firstMoment(n,v,num)
    Y=sampleY(v,n)
    global i3=0
    global top=0
    global bot=0
    global flag=0
    while (flag==0)
#         println(i3)
        x=randn()
        g=normalcdf(x)
        p=PI(Y,x,v,n)
#         println("p:",p,"g:",g)
        top=top+(x*p)/g
        bot=bot+p/g
        i3=i3+1
        if(i3>=num)
            global flag=1
            break
        end
    end
    return top/bot
end

firstMoment (generic function with 1 method)

In [15]:
function secondMoment(n,v,num)
    Y=sampleY(v,n)
    global i2=0
    global top2=0
    global bot2=0
    while (i2<num)
        i2=i2+1
        x=randn()
        g=normalcdf(x)
        p=PI(Y,x,v,n)
        top2=top2+x*x*p/g
        bot2=bot2+p/g
    end
    return top2/bot2
end

secondMoment (generic function with 1 method)

In [16]:
println("For v=5")
expectation=firstMoment(50,5,10000)
println("Expectation:",expectation)
println("Variance:",expectation^2-secondMoment(50,5,10000))

For v=5
Expectation:0.3195498376470879
Variance:0.048214074599329894


In [17]:
println("For v=1")
expectation=firstMoment(50,1,10000)
println("Expectation:",expectation)
println("Variance:",expectation^2-secondMoment(50,1,10000))

For v=1
Expectation:0.04382631588330899
Variance:-0.18224589091390006


In [18]:
println("For v=2")
expectation=firstMoment(50,2,10000)
println("Expectation:",expectation)
println("Variance:",(expectation)^2-secondMoment(50,2,10000))

For v=2
Expectation:0.20644879379356426
Variance:0.00898394357852978


# Solution to problem 4

We have $Y_1,\dots Y_n | \lambda \sim Poisson({\lambda})$ and $\lambda \sim Gamma(\alpha, \beta)$, let $(y_1, \dots y_n)$ be a sample drawn from $Y = (Y_1, \dots Y_n)$ and let $y = \sum_{i=1}^{n} y_i$. Then, 
$$p(\lambda | Y) \propto f(Y|\lambda) p(\lambda) \propto \lambda^{\alpha - 1} e^{-\beta \lambda}\prod_{i=1}^{n} \frac{\lambda^{y_i} e^{- \lambda}}{y_i!} \propto \lambda^{\alpha + y- 1} e^{-(\beta + n)\lambda} \sim Gamma(\alpha + y , \beta + n) \text{  (Upto constant of proportionality)}$$
So, we conclude that $p(\lambda | Y) \sim Gamma(\alpha + y , \beta + n)$.
