Given: $N \sim Poisson(\lambda)$ and $X_1, \dots, X_n \sim \vec{\pi}$

$X_k(t)$ is continous time MC with $X_k(0) = X_k$
$N_t(a) = $\{k:X_k(t) = a\}$

i.e. $N_t$ is the number of visits to state $a$ in time $t$.

$\sum_a\pi(a)Q_{ab}=0$ for each $b$ with the constraint $\sum_a\pi(a)=1$


$\sum_a\pi(a)Q_{ab}=0$ $\implies$ $\vec{\pi}^TQ=0$ $\implies$ 

$$
\begin{align*}
\vec{\pi}^TQ&=0\\
\Longleftrightarrow \vec{\pi}^TQ^n&=0\ \  \forall n \geq 1\\ 
\Longleftrightarrow \sum_{n\geq 1}\vec{\pi}\frac{t^n}{n!}Q^n &=0 \ \  \forall t \geq 0\\
\Longleftrightarrow \vec{\pi}\sum_{n\geq 0}\frac{t^n}{n!}Q^n &=\vec{\pi}\\
\Longleftrightarrow \vec{\pi}P &=\vec{\pi}\\
\Longleftrightarrow \vec{\pi}\  \text{is a stationary distribution}
\end{align*}
$$

Now, $P(X_k(t)=a)=\pi(a)$ and $N_t(a) = \{k:X_k(t) = a\}$ $\implies$ $N_t(a)|N \sim Binom(N, \pi(a))$ and 
$N \sim Poisson(\lambda)$ then $N_t \sim Poisson(\lambda \pi)$ [We did this in class TODO]

## Problem 2

In [1]:
%matplotlib inline
from __future__ import division
import pandas as pd
import matplotlib
import itertools
matplotlib.rcParams['figure.figsize'] = (16,12)
import matplotlib.pyplot as plt
import numpy as np
np.random.seed(1)

def propose(S):
    r = np.random.choice(len(S), 2)
    rs = np.sort(r)
    j,k=rs[0],rs[1]
    y=np.copy(S)
    y[j:k+1] = y[j:k+1][::-1]
    return y

def count_cycles(S):
    sample_length = len(S)
    n_cycles = 0
    index = 0
    length_travelled = 0
    visited = []
    while length_travelled < sample_length:
        if S[index] == index and index < sample_length :
            index+=1
            n_cycles+=1
            length_travelled+=1
        else:
            visited.append(index)
            index = S[index]
            length_travelled+=1
            if index not in visited:
                n_cycles+=1
    return n_cycles

In [2]:
N = [2,3,4, 100]
alpha = 3

In [3]:
assert count_cycles([0,1]) == 2
assert count_cycles([0,2,1]) == 2
assert count_cycles([1,0]) == 1

In [4]:
N_iterations = 1000

def theoretical(S, alpha, denom):
    n_cycles = count_cycles(S)
    return n_cycles**alpha/denom


def run(n):
    oldS = np.arange(n)
    old_n_cycles = count_cycles(oldS)
    count_dict = {}
    denom = sum([count_cycles(x)**alpha  for x in itertools.permutations(range(n))])
    for i in range(N_iterations):
        proposedS = propose(oldS)
        new_n_cycles = count_cycles(proposedS)
        pi_ab = new_n_cycles**alpha/(old_n_cycles**alpha)
        q = min(1,pi_ab)
        if q>= np.random.uniform():
            oldS = proposedS
            old_n_cycles = new_n_cycles
        tkey = ','.join([str(x+1) for x in oldS.tolist()])
        key="["+tkey+"]"
        if key not in count_dict:
            count_dict[key] = [0,0,0]
            count_dict[key][1] = theoretical(oldS,alpha,denom)
            count_dict[key][2] = old_n_cycles
        count_dict[key][0]+=1
    df = pd.DataFrame(count_dict)
    df=df.transpose()
    df.columns=[r'Simulated $\pi(s)$', 'Theoretical', 'c(s)']
    df[r'Simulated $\pi(s)$'] = df[r'Simulated $\pi(s)$']/N_iterations
    df['Percentage Error'] = 100*(df[r'Simulated $\pi(s)$']/df['Theoretical']-1)
    df.index.name='State'
    return df


## n=2

In [5]:
df  = run(N[0])
df

Unnamed: 0_level_0,Simulated $\pi(s)$,Theoretical,c(s),Percentage Error
State,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
"[1,2]",0.91,0.888889,2,2.375
"[2,1]",0.09,0.111111,1,-19.0


## n=3


In [6]:
df = run(N[1])
df

Unnamed: 0_level_0,Simulated $\pi(s)$,Theoretical,c(s),Percentage Error
State,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
"[1,2,3]",0.579,0.509434,3,13.655556
"[1,3,2]",0.154,0.150943,2,2.025
"[2,1,3]",0.009,0.018868,1,-52.3
"[2,3,1]",0.132,0.150943,2,-12.55
"[3,1,2]",0.106,0.150943,2,-29.775
"[3,2,1]",0.02,0.018868,1,6.0


## n=4

In [7]:
count_dict = run(N[2])
count_dict

Unnamed: 0_level_0,Simulated $\pi(s)$,Theoretical,c(s),Percentage Error
State,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
"[1,2,3,4]",0.117,0.169761,4,-31.079687
"[1,2,4,3]",0.059,0.071618,3,-17.618519
"[1,3,2,4]",0.022,0.02122,2,3.675
"[1,3,4,2]",0.06,0.071618,3,-16.222222
"[1,4,2,3]",0.085,0.071618,3,18.685185
"[1,4,3,2]",0.013,0.02122,2,-38.7375
"[2,1,3,4]",0.002,0.002653,1,-24.6
"[2,1,4,3]",0.005,0.002653,1,88.5
"[2,3,1,4]",0.013,0.02122,2,-38.7375
"[2,3,4,1]",0.077,0.071618,3,7.514815


## N=100

In [None]:
df  = run(N[3])
#df
expectation = sum(df[r'Simulated $\pi(s)$']*df['c(s)'])
expectation2 = sum(df[r'Simulated $\pi(s)$']*df['c(s)']*df['c(s)'])


In [None]:
print expectation, expectation2

In [None]:
print np.mean(df['c(s)'])

$\sum_{s \in S_a}\pi(s)c(s)=E[c(s)]$

and similarly, 

$\sum_{s \in S_a}\pi(s)c^2(s)=E[c^2(s)]=Var(c(s))+E^2[c(s)]$

In [None]:
cycles = df['c(s)']
plt.hist(cycles, normed=True)

## Problem 3

In [None]:
N = 1000
chrom_length = 3*(10**9)
transposon_length = 3*1000
mu = 0.05
t_positions = []

x_initial = np.random.uniform(0,chrom_length,size=1)
for i in range(N):
    temp_mu = np.random.uniform(0,1,1)
    if temp_mu <=0.05:
        ## Tranpose
        ## Ignore overlap with existing region
        y,z = np.random.uniform(0, chrom_length,size=2)
        