# Test cross validation scalability

Let's compare cross-validation timings on 1 vs 8 cores, using compressed genotype matrices (`SnpLinAlg`) as well as dense `Float64` matrices. We test multithreading with `Threads.@threads` and distributed computing with `pmap`.
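
To make the two strategies concrete, here is a minimal sketch of the same loop over sparsity levels run serially, with `pmap`, and with `Threads.@threads`. The function `toy_fold_error` is a hypothetical stand-in for fitting one model on one fold. It is not how `cv_iht` is implemented internally, only an illustration of the two parallel patterns.

```julia
using Distributed   # provides pmap; runs on the master process if no workers were added
using Statistics    # provides mean

# Hypothetical stand-in for "fit sparsity level k on one fold and return its error".
# With real workers this definition would need to be wrapped in @everywhere.
toy_fold_error(k, fold) = sum(abs2, sin.((1:1000) .* (k + fold))) / 1000

ks, folds = 1:20, 1:5

# Serial reference: average the fold errors for each k
serial = [mean(toy_fold_error(k, f) for f in folds) for k in ks]

# Distributed: each k is handed to whichever process is free
distributed = pmap(k -> mean(toy_fold_error(k, f) for f in folds), ks)

# Multithreaded: every iteration writes only to its own slot, so there is no data race
threaded = zeros(length(ks))
Threads.@threads for i in eachindex(ks)
    threaded[i] = mean(toy_fold_error(ks[i], f) for f in folds)
end
```

Because the toy workload is deterministic, all three vectors should agree exactly, which is a useful sanity check before timing anything.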

In [1]:
using Distributed
addprocs(8)

@everywhere begin
    using Revise
    using MendelIHT
    using SnpArrays
    using Random
    using GLM
    using DelimitedFiles
    using Test
    using Distributions
    using LinearAlgebra
    using CSV
    using DataFrames
    using StatsBase
    BLAS.set_num_threads(1) # remember to set BLAS threads to 1 !!!
end
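
The BLAS setting above can be verified with a short check. This sketch assumes the reason for the setting is to keep each Julia worker or thread from starting its own multithreaded BLAS pool (oversubscription); the source only says to remember it. `BLAS.get_num_threads` requires Julia 1.6 or later.

```julia
using LinearAlgebra

# Assumption: one BLAS thread per Julia worker/thread avoids oversubscribing the cores
BLAS.set_num_threads(1)

# Query the setting back to confirm it took effect
blas_threads = BLAS.get_num_threads()
println("Julia threads: ", Threads.nthreads(), ", BLAS threads: ", blas_threads)
```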

In [1]:
using Revise
using MendelIHT
using SnpArrays
using Random
using GLM
using DelimitedFiles
using Test
using Distributions
using LinearAlgebra
using CSV
using DataFrames
using StatsBase
BLAS.set_num_threads(1) # remember to set BLAS threads to 1 !!!

Threads.nthreads()

┌ Info: Precompiling MendelIHT [921c7187-1484-5754-b919-5d3ed9ac03c4]
└ @ Base loading.jl:1317


8

# Univariate response with SnpLinAlg

In [2]:
n = 1000  # number of samples
p = 10000 # number of SNPs
k = 10    # number of causal SNPs per trait
d = Normal
l = canonicallink(d())

# set random seed for reproducibility
Random.seed!(2021)

# simulate `.bed` file with no missing data
x = simulate_random_snparray(undef, n, p)
xla = SnpLinAlg{Float64}(x, model=ADDITIVE_MODEL, center=true, scale=true) 

# intercept is the only nongenetic covariate
z = ones(n)
intercept = 1.0

# simulate response y, true model b, and the correct non-0 positions of b
Y, true_b, correct_position = simulate_random_response(xla, k, d, l, Zu=z*intercept);

## 1 core/thread

+ Locally: 59.8 sec
+ On Hoffman: 
    + 90.6 sec on n6677
    + 35.7 sec on n6078

In [4]:
Random.seed!(2020)
@time mses_new = cv_iht(Y, xla, z, parallel=false);

Cross validating...100%|████████████████████████████████| Time: 0:00:59




Crossvalidation Results:
	k	MSE
	1	1327.1531280179597
	2	848.8788999226794
	3	639.948579602539
	4	491.3093147436451
	5	414.7989357365103
	6	307.6314769250727
	7	266.95674197558765
	8	242.05761692092082
	9	236.2320679759727
	10	243.58842257833055
	11	249.85686051763722
	12	247.99599154286292
	13	252.9161674378406
	14	257.6985723761604
	15	264.6579499175102
	16	267.6568409756378
	17	266.171116137713
	18	269.7531680328134
	19	274.6758161402144
	20	279.1288207835988

Best k = 9

 59.736318 seconds (22.59 M allocations: 523.071 MiB, 0.09% gc time)


## pmap

+ Locally: 11.48 sec (8 cores, ~5.2x speedup)
+ On Hoffman (the speedups below appear to be relative to the 90.6 sec single-core run on n6677):
    + 15.25 sec (4 workers, all on different nodes, ~6x speedup?)
    + 9.04 sec (8 workers, all on different nodes, ~10x speedup?)
    + 5.1 sec (16 workers, some nodes shared, ~17x speedup?)
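
The speedup figures can be recomputed from the timings listed in this notebook. The sketch below assumes the Hoffman speedups use the 90.6 sec single-core run on n6677 as the baseline, which is a guess based on the numbers matching; the helper names `speedup` and `efficiency` are introduced here for illustration.

```julia
# Speedup is serial time over parallel time; efficiency divides that by the worker count
speedup(t1, tp) = t1 / tp
efficiency(t1, tp, workers) = speedup(t1, tp) / workers

# Local run: 59.8 sec on 1 core vs 11.48 sec with 8 workers
local_speedup = speedup(59.8, 11.48)
local_eff = efficiency(59.8, 11.48, 8)

# Hoffman runs, assuming the 90.6 sec timing is the baseline
hoffman = [(workers = 4, t = 15.25), (workers = 8, t = 9.04), (workers = 16, t = 5.1)]
hoffman_speedups = [speedup(90.6, r.t) for r in hoffman]
hoffman_effs = [efficiency(90.6, r.t, r.workers) for r in hoffman]

println("local speedup: ", round(local_speedup, digits = 2))
println("Hoffman speedups: ", round.(hoffman_speedups, digits = 2))
```

An efficiency above 1 in the Hoffman runs would suggest the single-core baseline was measured on a slower node than the parallel runs, which is consistent with the question marks in the list.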

In [16]:
# 8 cores
Random.seed!(2020)
@time mses_new = cv_iht(Y, xla, z);

Cross validating...100%|████████████████████████████████| Time: 0:00:11




Crossvalidation Results:
	k	MSE
	1	1327.1531280179597
	2	848.8788999226794
	3	639.948579602539
	4	491.3093147436451
	5	414.7989357365103
	6	307.6314769250727
	7	266.95674197558765
	8	242.05761692092082
	9	236.2320679759727
	10	243.58842257833055
	11	249.85686051763722
	12	247.99599154286292
	13	252.9161674378406
	14	257.6985723761604
	15	264.6579499175102
	16	267.6568409756378
	17	266.171116137713
	18	269.7531680328134
	19	274.6758161402144
	20	279.1288207835988

Best k = 9

 11.485343 seconds (36.60 k allocations: 8.148 MiB)


## Threads.@threads

In [14]:
# 8 threads
Random.seed!(2020)
@time mses_new = cv_iht(Y, xla, z);

Cross validating...100%|████████████████████████████████| Time: 0:00:10




Crossvalidation Results:
	k	MSE
	1	1998.0714191613408
	2	883.8723436324113
	3	664.2897408340615
	4	491.8268743939946
	5	465.2640757566476
	6	311.96343380091486
	7	262.4858716067314
	8	242.7636653935437
	9	233.3698430342089
	10	239.31422302501267
	11	232.89591020251848
	12	256.3754238244326
	13	227.23792887289247
	14	244.11022983334283
	15	275.51573756073617
	16	258.0385953411366
	17	278.1060977328677
	18	266.2898046454383
	19	235.5155246294285
	20	265.2229016022691

Best k = 13

 10.341796 seconds (22.58 M allocations: 521.592 MiB, 0.46% gc time)


In [13]:
# 8 threads
Random.seed!(2020)
@time mses_new = cv_iht(Y, xla, z);

Cross validating... 91%|█████████████████████████████▏  |  ETA: 0:00:01



Crossvalidation Results:
	k	MSE
	1	1957.0010924928874
	2	841.3728442900828
	3	640.7772999825921
	4	521.4699961650224
	5	385.3943750988469
	6	314.3837284299232
	7	276.91330390092514
	8	241.83331623908384
	9	234.95777850020556
	10	245.04689943140693
	11	245.7176731433606
	12	254.0889561772159
	13	273.30628055957334
	14	249.54506726839523
	15	264.49675309777
	16	304.00998195647355
	17	264.0688781826786
	18	247.92218393683152
	19	265.58127116801506
	20	257.80632839024406

Best k = 9

  9.880719 seconds (23.05 M allocations: 547.388 MiB, 1.28% gc time, 0.70% compilation time)


Cross validating...100%|████████████████████████████████| Time: 0:00:09


20-element Vector{Float64}:
 1957.0010924928874
  841.3728442900828
  640.7772999825921
  521.4699961650224
  385.3943750988469
  314.3837284299232
  276.91330390092514
  241.83331623908384
  234.95777850020556
  245.04689943140693
  245.7176731433606
  254.0889561772159
  273.30628055957334
  249.54506726839523
  264.49675309777
  304.00998195647355
  264.0688781826786
  247.92218393683152
  265.58127116801506
  257.80632839024406

## @sync ... @spawn 

The results fluctuate from run to run even with a fixed seed. The cause is not yet clear.
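
One possible cause, offered only as a guess and not confirmed by anything in this notebook: if a step inside a fold draws random numbers, the stream each concurrently scheduled task sees may depend on scheduling and on the Julia version, so seeding once before the call would not pin the result. A minimal sketch of one way to rule this out is to give each fold its own seeded generator. The function `noisy_fold_result` is hypothetical and stands in for any fold-level work that uses randomness.

```julia
using Random

# Hypothetical fold-level work that needs random numbers
function noisy_fold_result(fold; seed = 2020)
    rng = MersenneTwister(seed + fold)   # private RNG, independent of thread scheduling
    return sum(randn(rng, 100))
end

# Spawn one task per fold; each task writes only to its own slot of `out`
function run_folds(nfolds)
    out = zeros(nfolds)
    @sync for fold in 1:nfolds
        Threads.@spawn out[fold] = noisy_fold_result(fold)
    end
    return out
end

first_run  = run_folds(8)
second_run = run_folds(8)
println("identical across runs: ", first_run == second_run)
```

If results still differ after the randomness is pinned this way, the next thing to check would be shared mutable buffers that several tasks write to at once.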

In [28]:
# 4 threads
Random.seed!(2020)
@time mses_new = cv_iht(Y, xla, z, d=d(), l=l);

Cross validating...100%|████████████████████████████████| Time: 0:00:01




Crossvalidation Results:
	k	MSE
	1	816.1038468662973
	2	635.8501800222248
	3	448.39853780432867
	4	360.577811830275
	5	509.8201630591882
	6	311.4943544718625
	7	249.59172395241683
	8	235.86097409763892
	9	289.7590331303986
	10	231.90438372164522
	11	311.74353930739557
	12	556.4400188227318
	13	283.93522585750975
	14	450.89762842187645
	15	253.82942007096423
	16	349.3043595351862
	17	444.38111610875933
	18	493.34013792664865
	19	364.096839461036
	20	277.8629182058546

Best k = 10

  1.434346 seconds (22.97 M allocations: 527.980 MiB, 5.34% gc time)


# Univariate response with dense Float64s

In [16]:
n = 5000  # number of samples
p = 10000 # number of SNPs
k = 10    # number of causal SNPs per trait
d = Normal
l = canonicallink(d())

# set random seed for reproducibility
Random.seed!(2021)

# simulate a dense Float64 design matrix with standard normal entries
x = randn(n, p)

# intercept is the only nongenetic covariate
z = ones(n)
intercept = 1.0

# simulate response y, true model b, and the correct non-0 positions of b
y, true_b, correct_position = simulate_random_response(x, k, d, l, Zu=z*intercept);

In [17]:
# 1 core
Random.seed!(2020)
@time mses_new = cv_iht(y, x, z, d=d(), l=l, parallel=false);

Cross validating...100%|████████████████████████████████| Time: 0:00:18




Crossvalidation Results:
	k	MSE
	1	7620.678703424767
	2	4843.2090739422965
	3	3823.4696319526106
	4	2882.8736571364025
	5	2057.0297220866105
	6	1711.3860665095106
	7	1296.1935231845227
	8	1112.6263164938034
	9	984.7885109603233
	10	967.4921113496174
	11	966.0060438329298
	12	968.2783508468474
	13	971.361243918521
	14	973.4022898469375
	15	978.1775331064498
	16	983.9153752806435
	17	983.0212506386123
	18	986.9111942083202
	19	988.6770001959334
	20	991.9368900330937

Best k = 11

 18.232142 seconds (74.80 M allocations: 1.347 GiB, 2.04% gc time)


In [18]:
# 4 cores
Random.seed!(2020)
@time mses_new = cv_iht(y, x, z, d=d(), l=l, parallel=true);

Cross validating...100%|████████████████████████████████| Time: 0:00:22




Crossvalidation Results:
	k	MSE
	1	7620.678703424767
	2	4843.2090739422965
	3	3823.4696319526106
	4	2882.8736571364025
	5	2057.0297220866105
	6	1711.3860665095106
	7	1296.1935231845227
	8	1112.6263164938034
	9	984.7885109603233
	10	967.4921113496174
	11	966.0060438329298
	12	968.2783508468474
	13	971.361243918521
	14	973.4022898469375
	15	978.1775331064498
	16	983.9153752806435
	17	983.0212506386123
	18	986.9111942083202
	19	988.6770001959334
	20	991.9368900330937

Best k = 11

 22.953429 seconds (47.78 k allocations: 10.283 MiB)


# Multivariate response with SnpLinAlg

In [2]:
n = 1000  # number of samples
p = 10000 # number of SNPs
k = 10    # number of causal SNPs per trait
r = 2     # number of traits

# set random seed for reproducibility
Random.seed!(2021)

# simulate `.bed` file with no missing data
x = simulate_random_snparray(undef, n, p)
xla = SnpLinAlg{Float64}(x, model=ADDITIVE_MODEL, center=true, scale=true) 

# intercept is the only nongenetic covariate
z = ones(n)
intercepts = [10 1.0]

# simulate response y, true model b, and the correct non-0 positions of b
Y, true_Σ, true_b, correct_position = simulate_random_response(xla, k, r, Zu=z*intercepts, overlap=2);

In [4]:
# 4 cores
Random.seed!(2020)
Yt = Matrix(Y')
Zt = Matrix(z')
@time mses = cv_iht(Yt, Transpose(xla), Zt, path=1:20, parallel=true);

Cross validating...100%|████████████████████████████████| Time: 0:00:05




Crossvalidation Results:
	k	MSE
	1	2888.7160633107273
	2	2626.9760744337036
	3	2063.2785491746927
	4	1800.517101408314
	5	1554.7247612809686
	6	1277.3237598020085
	7	1154.9320629872832
	8	1098.591169773047
	9	1019.468933898557
	10	1030.1004609637873
	11	1023.5283415733189
	12	1008.6540951957405
	13	1014.8064649926949
	14	1017.1679560075426
	15	1022.6733448618851
	16	1023.701959348514
	17	1035.6435253352697
	18	1035.1363483912255
	19	1043.6594351282336
	20	1034.336967697158

Best k = 12

  5.997148 seconds (39.35 k allocations: 8.949 MiB)


# Multivariate response with dense Float64s

In [7]:
n = 1000  # number of samples
p = 10000 # number of SNPs
k = 10    # number of causal SNPs
r = 2     # number of traits

# set random seed for reproducibility
Random.seed!(2021)

# simulate a dense Float64 design matrix with standard normal entries
x = randn(n, p)

# intercept is the only nongenetic covariate
z = ones(n, 1)
intercepts = [10.0 1.0] # each trait has a different intercept

# simulate response y, true model b, and the correct non-0 positions of b
Y, true_Σ, true_b, correct_position = simulate_random_response(x, k, r, Zu=z*intercepts, overlap=2);

In [8]:
# 1 core
Random.seed!(2020)
Yt = Matrix(Y')
Zt = Matrix(z')
@time mses = cv_iht(Yt, Transpose(x), Zt, path=1:20, parallel=false);

Cross validating...100%|████████████████████████████████| Time: 0:02:03




Crossvalidation Results:
	k	MSE
	1	2629.5253388334654
	2	2445.3327091797446
	3	1699.7479496810877
	4	1581.049584753384
	5	1345.7979096211036
	6	1002.831521953042
	7	938.1884998991206
	8	731.4755412240933
	9	723.5331961068099
	10	826.2287168502997
	11	605.2207412646168
	12	608.5209443463392
	13	609.4258432524741
	14	608.3389786061807
	15	607.6231927917379
	16	610.1537812223704
	17	610.4358299991871
	18	612.6346959329958
	19	609.484474645297
	20	611.8138514213124

Best k = 11

123.398046 seconds (8.41 M allocations: 1.157 GiB, 0.16% gc time)


In [9]:
# 4 cores
Random.seed!(2020)
Yt = Matrix(Y')
Zt = Matrix(z')
@time mses = cv_iht(Yt, Transpose(x), Zt, path=1:20, parallel=true);

Cross validating...100%|████████████████████████████████| Time: 0:00:41




Crossvalidation Results:
	k	MSE
	1	2629.5253388334654
	2	2445.3327091797446
	3	1699.7479496810877
	4	1581.049584753384
	5	1345.7979096211036
	6	1002.831521953042
	7	938.1884998991206
	8	731.4755412240933
	9	723.5331961068099
	10	826.2287168502997
	11	605.2207412646168
	12	608.5209443463392
	13	609.4258432524741
	14	608.3389786061807
	15	607.6231927917379
	16	610.1537812223704
	17	610.4358299991871
	18	612.6346959329958
	19	609.484474645297
	20	611.8138514213124

Best k = 11

 41.739760 seconds (463.52 k allocations: 26.606 MiB)
