# Test Samplers by Hopping on a Graph

Let's test samplers by checking whether they run the same model the same way. In the model, a particle hops around a graph from node to node, and the time each hop takes is drawn from a continuous univariate distribution. We let the particle hop around and then ask, at the end, how much time it spent on each node.
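
To make the setup concrete, here is a minimal sketch of a particle hopping on a toy graph and accumulating dwell time per node. It is not Fleck's implementation: the names `TOY_EDGES` and `toy_occupancy` are hypothetical, and the exponential hop times are an assumption chosen only to keep the example short.

```julia
using Random

# Hypothetical toy graph: TOY_EDGES[node] lists (destination, rate) pairs.
# Exponential hop times are an assumption made for brevity.
const TOY_EDGES = Dict(
    1 => [(2, 1.0), (3, 0.5)],
    2 => [(1, 2.0)],
    3 => [(1, 1.0), (2, 1.0)],
)

function toy_occupancy(edges, node_cnt, end_time, rng)
    occupancy = zeros(Float64, node_cnt)
    node = 1
    clock = 0.0
    while clock < end_time
        # Draw a candidate time for every hop out of this node; the earliest wins.
        best_dt, best_dest = Inf, node
        for (dest, rate) in edges[node]
            dt = randexp(rng) / rate
            if dt < best_dt
                best_dt, best_dest = dt, dest
            end
        end
        # Credit the dwell time to the current node, truncated at end_time.
        dwell = min(best_dt, end_time - clock)
        occupancy[node] += dwell
        clock += dwell
        node = best_dest
    end
    return occupancy
end

occ = toy_occupancy(TOY_EDGES, 3, 1000.0, Xoshiro(1))
```

The entries of `occ` sum to the total simulated time, which is the same shape of result the rest of this notebook compares across samplers.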

In [1]:
using Pkg
Pkg.activate(".")
using Fleck
using LinearAlgebra
using Revise

[32m[1m  Activating[22m[39m project at `~/dev/Fleck/examples`


In [10]:
include("../test/graph_occupancy.jl")

sample_run_graph_occupancy (generic function with 1 method)

Make the model. The model is generated from a random number generator: the generator decides which nodes are connected (so the particle can hop from one to the other) and which distribution sets the timing of each hop.

In [3]:
Threads.nthreads()

24

In [3]:
model_rng = Xoshiro(3498217)  # Random number generator.
singleton_groc = GraphOccupancy(model_rng)
length(singleton_groc)

8

In [4]:
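function generate_samples(singleton_groc, singleton_sampler, trial_cnt)
    # One column of per-node occupancy for each trial.
    occupancy = zeros(Float64, (length(singleton_groc), trial_cnt))
    # NOTE: this one RNG is shared by every thread below. Xoshiro is not safe
    # for concurrent use, so runs with more than one thread are not reproducible.
    rng = Xoshiro(24370233)
    Threads.@threads for trial_idx in 1:trial_cnt
        # Each trial gets its own copy of the model and of the sampler.
        groc = deepcopy(singleton_groc)
        sampler = deepcopy(singleton_sampler)
        occ, _ = run_graph_occupancy(groc, 1e6, sampler, rng)
        occupancy[:, trial_idx] .= occ
    end
    occupancy
end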

generate_samples (generic function with 1 method)
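
`generate_samples` passes one `Xoshiro` to every thread. One way to make threaded trials reproducible is to give each trial its own generator. The sketch below shows only that pattern: `toy_trial` and `threaded_trials` are hypothetical stand-ins for the real simulation, and seeding with `base_seed + trial_idx` is an assumption, not something Fleck prescribes.

```julia
using Random

# Hypothetical stand-in for one simulation run: returns three draws.
toy_trial(rng) = rand(rng, 3)

function threaded_trials(trial_cnt, base_seed)
    results = zeros(Float64, (3, trial_cnt))
    Threads.@threads for trial_idx in 1:trial_cnt
        # Each trial owns its RNG, so no generator state is shared across threads.
        rng = Xoshiro(base_seed + trial_idx)
        results[:, trial_idx] .= toy_trial(rng)
    end
    return results
end

a = threaded_trials(8, 24370233)
b = threaded_trials(8, 24370233)
```

Because each column depends only on its own seed, `a` and `b` are identical regardless of how many threads run or in what order they are scheduled.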

In [11]:
sampler = FirstReaction{keyspace(GraphOccupancy)}()
trial_cnt = 100
occupancy = generate_samples(singleton_groc, sampler, trial_cnt)

8×100 Matrix{Float64}:
      2.1761e5      2.17855e5     2.18164e5  …     2.17873e5       2.18516e5
 169254.0           1.69344e5     1.6892e5         1.68403e5  168563.0
      0.0           0.0           0.0              0.0             0.0
   2073.26       2157.37       2204.28          2096.51         2195.99
      1.82777e5     1.82987e5     1.82702e5        1.82911e5       1.82838e5
      1.83892e5     1.83931e5     1.83952e5  …     1.84469e5       1.84459e5
      1.00826e5     1.00247e5     1.00445e5        1.00442e5       1.00003e5
      1.43567e5     1.43478e5     1.43613e5        1.43805e5       1.43423e5

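That zero dwell time is pesky. It means there is a component of the graph that isn't connected to the rest, which is fine, but it will throw off the stats, so the functions below drop any node whose occupancy is zero in every trial. We're going to see how likely each draw is using the likelihood-ratio form of the multinomial test, https://en.wikipedia.org/wiki/Multinomial_test.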
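
For reference, the statistic that the next cell computes for each trial can be written as follows. The symbols are notation introduced here, not taken from the code: $x_{i,t}$ is `occupancy[node_idx, trial_idx]`, $p_{i,t}$ is `relative_occupancy`, $\pi_i$ is `mle`, and $k$ is the number of nodes kept.

```latex
G_t = -2 \sum_{i=1}^{k} x_{i,t} \, \ln\!\frac{\pi_i}{p_{i,t}},
\qquad
p_{i,t} = \frac{x_{i,t}}{\sum_{j=1}^{k} x_{j,t}}
```

For true multinomial counts this quantity is asymptotically chi-squared with $k - 1$ degrees of freedom under the null hypothesis. Here the $x_{i,t}$ are dwell times rather than counts, so that calibration is probably only a rough guide, and the values are best read relative to one another.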

In [12]:
function occupancy_likelihood_ratio(occupancy, mle)
    # Sometimes a graph node is disconnected, so remove those.
    # `mle` must already have the same rows removed (see maximum_likelihood).
    keep_row = [sum(occupancy[check_idx,:]) > 0 for check_idx in 1:size(occupancy, 1)]
    occupancy = occupancy[keep_row, :]
    category_cnt = size(occupancy, 1)
    trial_cnt = size(occupancy, 2)
    # Despite the name, this holds -2 * log(likelihood ratio) for each trial.
    odds_ratio = zeros(Float64, trial_cnt)
    for trial_idx in 1:trial_cnt
        # Observed fraction of time spent on each node in this trial.
        relative_occupancy = occupancy[:, trial_idx] / sum(occupancy[:, trial_idx])
        log_odds = 0.0
        for node_idx in 1:category_cnt
            # A kept node with zero occupancy in this trial would give NaN here.
            log_odds += occupancy[node_idx, trial_idx] * log(mle[node_idx] / relative_occupancy[node_idx])
        end
        odds_ratio[trial_idx] = -2 * log_odds
    end
    return odds_ratio
end

function maximum_likelihood(occupancy)
    # Sometimes a graph node is disconnected, so remove those.
    keep_row = [sum(occupancy[check_idx,:]) > 0 for check_idx in 1:size(occupancy, 1)]
    occupancy = occupancy[keep_row, :]
    trial_cnt = size(occupancy, 2)
    # Mean occupancy per node across trials, normalized to fractions that sum to one.
    expected_occupancy = sum(occupancy; dims=2) / trial_cnt
    return expected_occupancy / sum(expected_occupancy)
end

maximum_likelihood (generic function with 1 method)

In [13]:
mle = maximum_likelihood(occupancy)
odds_ratios = occupancy_likelihood_ratio(occupancy, mle)
sort!(odds_ratios)

100-element Vector{Float64}:
  0.47170872317769863
  1.0477788100678254
  1.1182296978115573
  1.5920294628498368
  1.6185251975797996
  1.6484488613929784
  1.7943514080777163
  1.8962455154481006
  1.9043175580173113
  2.095203853303474
  ⋮
 16.035144818671824
 16.525796267986777
 18.74605687470853
 19.547747469967817
 19.68452057640343
 58.28862117703375
 60.447789636371
 78.37708666751291
 87.13441909130256

In [14]:
perfect = zeros(length(mle), 2)
perfect[:,1] .= mle * 1e6
perfect[:,2] .= mle * 1e6
occupancy_likelihood_ratio(perfect, mle)

2-element Vector{Float64}:
 1.7625206483190706e-10
 1.7625206483190706e-10

In [15]:
sampler = FirstToFire{keyspace(GraphOccupancy)}()
trial_cnt = 10
first_to_fire = generate_samples(singleton_groc, sampler, trial_cnt)

8×10 Matrix{Float64}:
    2.17429e5  217853.0           2.17895e5  …     2.17308e5     2.18072e5
    1.67507e5       1.67498e5     1.67293e5        1.67087e5     1.66945e5
    0.0             0.0           0.0              0.0           0.0
 2065.57         2261.02       2028.96          2148.86       2003.96
    1.80384e5       1.8076e5      1.80787e5        1.81309e5     1.80446e5
    1.84179e5       1.84153e5     1.84484e5  …     1.84355e5     1.84647e5
    1.03073e5       1.02267e5     1.02518e5        1.02757e5     1.02866e5
    1.45362e5       1.45207e5     1.44994e5        1.45035e5     1.4502e5

In [16]:
ftf_odds_ratio = occupancy_likelihood_ratio(first_to_fire, mle)
sort!(ftf_odds_ratio)

10-element Vector{Float64}:
  88.69783794716477
  96.59095092402504
 110.82891948863653
 112.12921731210463
 114.20145667326688
 134.20945872844277
 151.03119268754745
 151.3439105802463
 156.68062839833692
 164.0930592848017
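
A quick way to compare the two sorted lists is to count how many FirstToFire statistics exceed the largest FirstReaction statistic. The snippet below copies the values from the outputs above so that it runs on its own; the names `fr_max`, `ftf_shown`, and `exceed_cnt` are introduced here for illustration.

```julia
# Largest value in the sorted FirstReaction output above.
fr_max = 87.13441909130256

# All ten values from the sorted FirstToFire output above.
ftf_shown = [
    88.69783794716477, 96.59095092402504, 110.82891948863653,
    112.12921731210463, 114.20145667326688, 134.20945872844277,
    151.03119268754745, 151.3439105802463, 156.68062839833692,
    164.0930592848017,
]

# Number of FirstToFire statistics above every FirstReaction statistic.
exceed_cnt = count(>(fr_max), ftf_shown)
```

Keep in mind that `mle` was estimated from the FirstReaction runs themselves, so those runs are scored against their own mean, while the FirstToFire runs are scored against someone else's. The comparison is therefore not symmetric.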

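Let's look in more depth at a single run with each sampler. Note that `rng` is not reseeded between the two runs, so the second run continues from the RNG state the first one left behind and the two runs see different random streams.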

In [18]:
rng = Xoshiro(24370233)
groc = deepcopy(singleton_groc)
sampler_ftf = FirstToFire{keyspace(GraphOccupancy)}()
occupancy_ftf, observations_ftf = run_graph_occupancy(groc, 1e6, sampler_ftf, rng)
groc = deepcopy(singleton_groc)
sampler_fr = FirstReaction{keyspace(GraphOccupancy)}()
occupancy_fr, observations_fr = run_graph_occupancy(groc, 1e6, sampler_fr, rng)


([220023.25027902523, 168278.8198621476, 0.0, 2116.2548418180836, 179826.17919817506, 185069.21983681395, 100614.32042814745, 144071.62714050163], Dict{Tuple{Int64, Int64}, TransitionObserver}((6, 8) => TransitionObserver(2.730404958128929e-6, 4.030808224109933, 40965.46750952257, 92777), (4, 5) => TransitionObserver(6.938353180885315e-7, 9.118537008413114, 2116.2548418180836, 2037), (2, 5) => TransitionObserver(2.8100330382585526e-6, 5.646939960774034, 110854.93270651967, 182929), (6, 2) => TransitionObserver(0.46914584457408637, 4.612327880342491, 48236.69895598239, 55943), (2, 6) => TransitionObserver(0.001520805701147765, 5.80447075527627, 48999.1299832613, 53231), (7, 8) => TransitionObserver(6.8508670665323734e-6, 4.15953955904115, 34625.75952006418, 104299), (7, 2) => TransitionObserver(0.6103188297711313, 1.984100949135609, 16.414609279221622, 12), (5, 6) => TransitionObserver(3.760406980291009e-5, 2.6237527617486194, 5661.050004555574, 18037), (8, 5) => TransitionObserver(0.69

In [19]:
occupancy_fr

8-element Vector{Float64}:
 220023.25027902523
 168278.8198621476
      0.0
   2116.2548418180836
 179826.17919817506
 185069.21983681395
 100614.32042814745
 144071.62714050163

In [20]:
occupancy_ftf

8-element Vector{Float64}:
 219959.52702019995
 167643.61244652607
      0.0
   2059.5499834459224
 179385.47956963518
 186268.99871330644
 100255.7374212472
 144427.06985781007
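
Since the two runs end with slightly different totals, it can help to compare them as fractions of time per node. This sketch copies the two vectors printed above; the names `occ_fr`, `occ_ftf`, `frac_diff`, and `worst_node` are introduced here for illustration.

```julia
# Values copied from the `occupancy_fr` and `occupancy_ftf` outputs above.
occ_fr = [220023.25027902523, 168278.8198621476, 0.0, 2116.2548418180836,
          179826.17919817506, 185069.21983681395, 100614.32042814745,
          144071.62714050163]
occ_ftf = [219959.52702019995, 167643.61244652607, 0.0, 2059.5499834459224,
           179385.47956963518, 186268.99871330644, 100255.7374212472,
           144427.06985781007]

# Fraction of total time spent on each node in each run.
frac_fr = occ_fr ./ sum(occ_fr)
frac_ftf = occ_ftf ./ sum(occ_ftf)

# Signed difference in occupancy fraction, and the node where it is largest.
frac_diff = frac_ftf .- frac_fr
worst_node = argmax(abs.(frac_diff))
```

This is one run per sampler with different random streams, so a difference at `worst_node` is a place to look rather than evidence of a discrepancy on its own.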

In [33]:
sort(collect(observations_fr), by = a -> a[1][1])

24-element Vector{Pair{Tuple{Int64, Int64}, TransitionObserver}}:
 (1, 5) => TransitionObserver(0.0034435114357620478, 3.188356563121488, 136111.35077488038, 190236)
 (1, 6) => TransitionObserver(0.11658225604332983, 2.8390970316249877, 83911.89950414485, 162011)
 (2, 5) => TransitionObserver(2.8100330382585526e-6, 5.646939960774034, 110854.93270651967, 182929)
 (2, 6) => TransitionObserver(0.001520805701147765, 5.80447075527627, 48999.1299832613, 53231)
 (2, 7) => TransitionObserver(0.00013385957572609186, 5.138986614882015, 8424.757172366622, 8968)
 (4, 5) => TransitionObserver(6.938353180885315e-7, 9.118537008413114, 2116.2548418180836, 2037)
 (5, 6) => TransitionObserver(3.760406980291009e-5, 2.6237527617486194, 5661.050004555574, 18037)
 (5, 1) => TransitionObserver(2.6095221983268857e-7, 2.6274962739553303, 50600.711254788985, 166528)
 (5, 7) => TransitionObserver(3.372860373929143e-5, 2.9786079837940633, 58404.629655935336, 133491)
 (5, 8) => TransitionObserver(4.774401895701885

In [34]:
sort(collect(observations_ftf), by = a -> a[1][1])

24-element Vector{Pair{Tuple{Int64, Int64}, TransitionObserver}}:
 (1, 5) => TransitionObserver(0.0024783797562122345, 3.3716992783010937, 135682.83927891665, 189022)
 (1, 6) => TransitionObserver(0.11658026571967639, 2.703791961350362, 84276.6877412833, 162980)
 (2, 5) => TransitionObserver(5.939131369814277e-6, 6.501587403821759, 110327.07119184342, 182297)
 (2, 6) => TransitionObserver(0.0020213420502841473, 5.993916473235004, 48789.66476348899, 53340)
 (2, 7) => TransitionObserver(9.694581967778504e-5, 5.528218869701959, 8526.876491193676, 8888)
 (4, 5) => TransitionObserver(0.0003514352720230818, 7.847983775194734, 2059.5499834459224, 2073)
 (5, 6) => TransitionObserver(3.406329778954387e-5, 3.2362204383825883, 5619.15471858897, 17883)
 (5, 1) => TransitionObserver(3.976165316998959e-6, 2.8505185908288695, 50462.39980393472, 166136)
 (5, 7) => TransitionObserver(6.881411536596715e-5, 2.9454364667180926, 58290.51969598826, 132783)
 (5, 8) => TransitionObserver(1.314561814069748e-5,