This tutorial generates a simulated data set and uses EBtimecourse to find change points in it.

In [1]:
rm(list=ls())
setwd("~/Documents/Research/Microarray_time_course/manuscript/EBtimecourse")
library(tensorflow)
use_python("/usr/local/bin/python3")
source("EBtimecourse.R")

We use a Normal Normal-Gamma model to generate a data matrix of 1000 genes by 8 time points. The numbers of replicates at the 8 time points are 3, 3, 3, 4, 2, 3, 3 and 3 respectively, so the matrix has 24 columns in total. In this simulation we set P=0.2, which means that 20% of the genes have change points; these genes are placed in the first 20% of the rows of the data matrix.

The parameter matrix stores the parameters used to generate the data. mu1 and mu2 are the latent means, and sigma1 and sigma2 are the corresponding standard deviations (they are passed to rnorm as sd). n1, n2 and n3 are the lengths, in time points, of the three homogeneous sequences. For example, n1=2, n2=1, n3=5 means that the two change points are after the 2nd and 3rd time points. A gene with a single change point has n3=0, and a gene with no change point has n1=8 and n2=n3=0. The first and third sequences are both generated from (mu1, sigma1), and the second from (mu2, sigma2).
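Written out, the generative model implied by the simulation code in the next cell is roughly the following. The subscripts g (gene), k (parameter set) and j (column) are notation introduced here for illustration and do not come from EBtimecourse itself:

$$
\begin{aligned}
\lambda_{gk} &\sim \mathrm{Gamma}(\alpha_0,\ \mathrm{rate}=\beta_0), \\
\mu_{gk} \mid \lambda_{gk} &\sim \mathcal{N}\!\left(\mu_0,\ \frac{1}{\kappa_0 \lambda_{gk}}\right), \\
\sigma_{gk} &= \lambda_{gk}^{-1/2}, \\
x_{gj} \mid \mu_{gk}, \sigma_{gk} &\sim \mathcal{N}\!\left(\mu_{gk},\ \sigma_{gk}^{2}\right), \qquad k \in \{1, 2\}.
\end{aligned}
$$

Columns in the first and third sequences use k=1, columns in the second sequence use k=2, and the simulation sets mu0=0, kappa0=0.1, alpha0=1 and beta0=10.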

In [2]:
set.seed(0)

P=0.2 # proportion of genes to have change points
N=1000;

nDE=P*N; nEE=N-nDE;
cp_real_seq_index=1:nDE;
timePoint=8; # number of time points
replicate=c(3,3,3,4,2,3,3,3); # number of replicates at each time point: here the 4th time point has 4 replicates, the 5th has 2, and the others have 3
Ti=sum(replicate); 

# Enumerate all candidate segmentations (n1, n2, n3) of the time course.
# Rows 1..(timePoint-1): one change point after time point n1 (n3 = 0).
changePointTable = data.frame(matrix(NA, nrow=(timePoint-1)+(timePoint-1)*(timePoint-2)/2, ncol=3), stringsAsFactors=FALSE)
colnames(changePointTable) = c("n1", "n2", "n3")
changePointTable[1:(timePoint-1),"n1"]=1:(timePoint-1)
changePointTable[1:(timePoint-1),"n2"]=(timePoint-1):1
changePointTable[1:(timePoint-1),"n3"]=0
# Remaining rows: two change points, after time points V1 and V2 (V1 < V2).
combT = as.data.frame(t(combn(timePoint-1, 2)))
combT$V2 = combT$V2 - combT$V1 # turn the second position into the length n2
changePointTable[timePoint:nrow(changePointTable),c("n1","n2")]=combT
changePointTable[timePoint:nrow(changePointTable),"n3"]=timePoint-rowSums(changePointTable[timePoint:nrow(changePointTable),c("n1", "n2")])
combNumber = nrow(changePointTable) # 7 + 21 = 28 segmentations for 8 time points

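# Draw a random segmentation for each gene with change points.
ss=sample(1:nrow(changePointTable), P*N, replace=TRUE)
n1=changePointTable[ss,"n1"]
n2=changePointTable[ss,"n2"]
n3=changePointTable[ss,"n3"]

# Genes without change points form a single sequence covering all time points.
n1=c(n1, rep(timePoint,nEE)); n2=c(n2, rep(0, nEE)); n3=c(n3, rep(0, nEE));
# Normal-Gamma hyperparameters.
mu0=0; kappa0=0.1; alpha0=1; beta0=10;
# Two (precision, mean) pairs per gene: lambda ~ Gamma(alpha0, rate=beta0), mu | lambda ~ N(mu0, 1/(kappa0*lambda)).
lambda = rgamma(N*2, shape=alpha0, rate=beta0)
mu = rnorm(N*2,mean=mu0,sd=1/sqrt(kappa0*lambda))
lambda = matrix(lambda, ncol=2)
mu = matrix(mu, ncol=2)
sigma = 1/sqrt(lambda) # standard deviation implied by the precision
parameter=data.frame(mu, sigma, n1=n1, n2=n2, n3=n3)
colnames(parameter)=c("mu1", "mu2", "sigma1", "sigma2", "n1", "n2", "n3")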

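# Genes with change points: sequences 1 and 3 use (mu1, sigma1), sequence 2 uses (mu2, sigma2).
gene.de=t(apply(parameter[1:nDE,],1, function(x)
  c(rnorm(sum(replicate[1:x["n1"]]), mean=x["mu1"], sd=x["sigma1"]),
    rnorm(sum(replicate[(x["n1"]+1):(x["n1"]+x["n2"])]), mean=x["mu2"], sd=x["sigma2"]),
    rnorm(sum(replicate[-c(1:(x["n1"]+x["n2"]))]), mean=x["mu1"], sd=x["sigma1"]))))
# Genes without change points: all Ti columns share (mu1, sigma1).
gene.ee=t(apply(parameter[(nDE+1):N,],1, function(x) rnorm(Ti, mean=x["mu1"], sd=x["sigma1"])))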

gene.exp=rbind(gene.de, gene.ee)
                
dim(gene.exp)
head(gene.exp)
head(parameter)

,1,2,3,4,5,6,7,8,9,10,⋯,15,16,17,18,19,20,21,22,23,24
1,-2.578914,-5.008225,3.8832,-5.900468,-3.035626,-1.901751,-9.531408,20.1844812,0.01100666,-10.559859,⋯,-5.397981,-4.951023,-9.2426663,-0.7181269,-10.601577,-1.298571,-0.8839284,-2.587046,-6.018523,0.6024899
2,-5.789879,-3.571147,-7.502044,-3.457898,-6.023532,-6.615295,-4.051317,-4.3336006,-6.16983272,-8.977959,⋯,-6.991738,-9.431366,-6.1973077,-12.3601702,-6.640681,-13.833225,-9.8541556,-3.856033,-3.630031,-4.2155801
3,-4.632496,-13.537417,-15.071868,-21.70524,-16.063009,-2.344622,-15.015141,2.7187531,-26.34601794,-14.912734,⋯,-18.370618,-18.4442,-10.0784418,-30.3183154,-15.474552,-19.63526,-11.0438885,-27.191105,-23.837181,-20.6881641
4,-18.896538,-29.197519,-23.401813,-22.985557,-27.997242,-26.35214,-26.357198,-24.6143701,-21.85819628,-29.893234,⋯,-33.281545,-26.938101,-23.136341,-25.0826744,-25.214864,-26.29024,-33.0118526,19.929165,35.736492,15.9762683
5,1.166935,-0.998462,3.931977,5.562003,2.250273,-10.031706,7.537726,-0.2054674,-0.6226839,8.242396,⋯,-1.782336,-3.143083,0.6042293,6.4091607,2.073749,-1.671993,-0.1972193,1.783717,-14.122523,-5.0991716
6,4.441938,4.94488,3.510511,4.086975,1.578461,4.511351,-10.394249,-9.205141,-9.64401096,-7.35838,⋯,-6.771395,-5.443706,-7.351025,-7.622454,-9.564551,-6.633108,-7.158806,-6.772567,-8.247895,-13.1447463


,mu1,mu2,sigma1,sigma2,n1,n2,n3
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
1,-4.136114,19.65798,3.51677,14.896818,2,1,5
2,-6.020321,-7.394323,2.042558,2.319634,4,3,1
3,-12.229347,-22.23366,8.124266,7.104597,4,4,0
4,-24.981949,20.903863,4.393284,7.633993,7,1,0
5,-1.578031,-0.970269,4.386907,4.936935,1,7,0
6,3.679111,-6.892291,1.526077,2.502622,2,6,0
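To make the meaning of n1, n2 and n3 concrete, the sketch below maps gene 1 of the parameter table (n1=2, n2=1, n3=5) onto the 24 data columns. The variable names are hypothetical helpers chosen for this illustration; they are not part of EBtimecourse.

```r
# Replicate counts per time point, copied from the simulation above.
replicate_counts <- c(3, 3, 3, 4, 2, 3, 3, 3)

# Segment lengths (in time points) for gene 1 of the parameter table.
n1 <- 2; n2 <- 1; n3 <- 5

# Label each time point with its segment (1, 2 or 3) ...
segment_of_timepoint <- rep(1:3, times = c(n1, n2, n3))
# ... then repeat each label once per replicate to label all 24 columns.
segment_of_column <- rep(segment_of_timepoint, times = replicate_counts)

# Number of data columns that fall in each segment.
columns_per_segment <- as.vector(table(segment_of_column))
print(columns_per_segment)   # 6 3 15

# Last column of the first two segments, i.e. where the mean changes.
change_after_column <- cumsum(columns_per_segment)[1:2]
print(change_after_column)   # 6 9
```

For this gene, columns 1 to 6 and 10 to 24 are drawn from (mu1, sigma1) and columns 7 to 9 from (mu2, sigma2), which is consistent with the jump visible in columns 7 to 9 of the first row of gene.exp.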


Run EBtimecourse on the simulated data, targeting an FDR of 0.1.

In [3]:
ptm <- proc.time()
result = EBtimecourse(exp.dat = gene.exp, timepoint = timePoint, replicate = replicate, FDR=0.1, verbose=T)
print(proc.time() - ptm)

1 111092.35892902

Max delta ll: 10.2679987124429

21 110889.939023462

Max delta ll: 10.2546845477191

41 110693.374784004

Max delta ll: 9.9678675904288

61 110502.363213789

Max delta ll: 9.67734603759891

81 110316.236914733

Max delta ll: 9.41665477906645

101 110134.364810428

Max delta ll: 9.18988914598594

121 109956.229694207

Max delta ll: 8.99166919388517

141 109781.413430146

Max delta ll: 8.81652620060777

161 109609.575242231

Max delta ll: 8.66008341038832

181 109440.433557867

Max delta ll: 8.51900059831678

201 109273.752937846

Max delta ll: 8.39069245487917

221 109109.334166388

Max delta ll: 8.27313663615496

241 108947.007165886

Max delta ll: 8.16472328729287

261 108786.625373212

Max delta ll: 8.06415906225448

281 108628.061569733

Max delta ll: 7.97038151547895

301 108471.204499672

Max delta ll: 7.88252452189045

321 108315.95632193

Max delta ll: 7.79985768275219

341 108162.230507096

Max delta ll: 7.72177190732327

361 108009.950200618

Max delta ll: 7

3021 93175.9485983926

Max delta ll: 4.80300189078844

3041 93080.0668830578

Max delta ll: 4.79694484024367

3061 92984.3049246479

Max delta ll: 4.79093405989988

3081 92888.6617249449

Max delta ll: 4.78497198240075

3101 92793.1362314564

Max delta ll: 4.77906120024272

3121 92697.7273329524

Max delta ll: 4.7732045626035

3141 92602.433854139

Max delta ll: 4.76740509900264

3161 92507.2545513956

Max delta ll: 4.76166617912531

3181 92412.1881081056

Max delta ll: 4.75599131139461

3201 92317.233130533

Max delta ll: 4.75038426760875

3221 92222.3881444841

Max delta ll: 4.74484901045798

3241 92127.6515930762

Max delta ll: 4.73938958946383

3261 92033.021834636

Max delta ll: 4.73401024879422

3281 91938.4971434564

Max delta ll: 4.7287152109202

3301 91844.0757132731

Max delta ll: 4.72350862066378

3321 91749.7556621751

Max delta ll: 4.71839436329901

3341 91655.535041624

Max delta ll: 4.71337604966538

3361 91561.4118494893

Max delta ll: 4.70845669615665

3381 91467.38404

6021 79618.3532848663

Max delta ll: 4.5319830020162

6041 79527.5901784113

Max delta ll: 4.54382813283883

6061 79436.5842837095

Max delta ll: 4.55623498353816

6081 79345.3242943264

Max delta ll: 4.56921012327075

6101 79253.7987920618

Max delta ll: 4.58275824294833

6121 79161.9962851495

Max delta ll: 4.59688210114837

6141 79069.9052463149

Max delta ll: 4.61158261116361

6161 78977.514147533

Max delta ll: 4.62685906278784

6181 78884.811488976

Max delta ll: 4.64270947533078

6201 78791.7858202662

Max delta ll: 4.65913099955651

6221 78698.4257536673

Max delta ll: 4.67612029325392

6241 78604.7199714915

Max delta ll: 4.69367372378474

6261 78510.657232171

Max delta ll: 4.71178728551604

6281 78416.2263820318

Max delta ll: 4.73045612080023

6301 78321.416381504

Max delta ll: 4.74967356743582

6321 78226.2163556987

Max delta ll: 4.76942969523952

6341 78130.6156798305

Max delta ll: 4.78970931122603

6361 78034.6041100872

Max delta ll: 4.81048943717906

6381 77938.1719

Max delta ll: 0.017114599730121

8981 74658.4456790483

Max delta ll: 0.0157926440006122

9001 74658.1657253728

Max delta ll: 0.014554356559529

9021 74657.9082048009

Max delta ll: 0.0133961026731413

9041 74657.671623008

Max delta ll: 0.012314268911723

9061 74657.4545568766

Max delta ll: 0.01130527483474

9081 74657.2556535172

Max delta ll: 0.0103655841958243

9101 74657.0736290938

Max delta ll: 0.0094917148235254

9121 74656.9072674666

Max delta ll: 0.00868024816736579

9141 74656.755418671

Max delta ll: 0.00792783813085407

9161 74656.6169972482

Max delta ll: 0.00723121863848064

9181 74656.490980445

Max delta ll: 0.00658721048966981

9201 74656.3764062988

Max delta ll: 0.00599272792169359

9221 74656.2723716263

Max delta ll: 0.00544478347001132

9241 74656.1780299304

Max delta ll: 0.00494049247936346

9261 74656.0925892434

Max delta ll: 0.00447707668354269

9281 74656.0153099196

Max delta ll: 0.0040518673195038

9301 74655.9455023951

Max delta ll: 0.003662306960904

Max delta ll: 7.62024283176288e-05

11761 74655.2467066561

Max delta ll: 7.65585427870974e-05

11781 74655.2451717666

Max delta ll: 7.69123435020447e-05

11801 74655.2436298278

Max delta ll: 7.72636267356575e-05

11821 74655.2420808897

Max delta ll: 7.76123633841053e-05

11841 74655.2405250042

Max delta ll: 7.79584515839815e-05

11861 74655.2389622249

Max delta ll: 7.83018185757101e-05

11881 74655.2373926074

Max delta ll: 7.86424207035452e-05

11901 74655.2358162089

Max delta ll: 7.89800396887586e-05

11921 74655.2342330884

Max delta ll: 7.9314733739011e-05

11941 74655.2326433071

Max delta ll: 7.96463136794046e-05

11961 74655.2310469277

Max delta ll: 7.99746776465327e-05

11981 74655.2294440149

Max delta ll: 8.02997528808191e-05

12001 74655.2278346355

Max delta ll: 8.06214957265183e-05

12021 74655.2262188579

Max delta ll: 8.09397897683084e-05

12041 74655.2245967526

Max delta ll: 8.12544167274609e-05

12061 74655.2229683921

Max delta ll: 8.15653911558911e-05

12081

14501 74655.0180892465

Max delta ll: 6.5845757490024e-05

14521 74655.016791593

Max delta ll: 6.51943410048261e-05

14541 74655.0155071375

Max delta ll: 6.45372201688588e-05

14561 74655.0142359911

Max delta ll: 6.38743804302067e-05

14581 74655.0129782609

Max delta ll: 6.32060400675982e-05

14601 74655.0117340497

Max delta ll: 6.25325046712533e-05

14621 74655.0105034566

Max delta ll: 6.18538761045784e-05

14641 74655.0092865763

Max delta ll: 6.11704599577934e-05

14661 74655.0080834995

Max delta ll: 6.04823871981353e-05

14681 74655.0068943125

Max delta ll: 5.97898906562477e-05

14701 74655.0057190972

Max delta ll: 5.90932613704354e-05

14721 74655.0045579312

Max delta ll: 5.83925721002743e-05

14741 74655.0034108879

Max delta ll: 5.76881284359843e-05

14761 74655.0022780357

Max delta ll: 5.69802214158699e-05

14781 74655.0011594388

Max delta ll: 5.62689383514225e-05

14801 74655.0000551567

Max delta ll: 5.5554584832862e-05

14821 74654.9989652444

Max delta ll: 5.483

Max delta ll: 9.12070390768349e-07

17261 74654.9521160574

Max delta ll: 8.54226527735591e-07

17281 74654.952100566

Max delta ll: 7.9943856690079e-07

17301 74654.952086085

Max delta ll: 7.47517333365977e-07

17321 74654.9520725589

Max delta ll: 6.98506482876837e-07

17341 74654.9520599348

Max delta ll: 6.52158632874489e-07

17361 74654.952048162

Max delta ll: 6.08517439104617e-07

17381 74654.9520371919

Max delta ll: 5.67175447940826e-07

17401 74654.9520269784

Max delta ll: 5.28234522789717e-07

17421 74654.952017477

Max delta ll: 4.91607352159917e-07

17441 74654.9520086455

Max delta ll: 4.57162968814373e-07

17461 74654.9520004436

Max delta ll: 4.24726749770343e-07

17481 74654.951992833

Max delta ll: 3.94269591197371e-07

17501 74654.9519857771

Max delta ll: 3.65689629688859e-07

17521 74654.9519793955

Max delta ll: 4.80329617857933e-07

17541 74654.9519732099

Max delta ll: 4.09170752391219e-07

17561 74654.9519675992

Max delta ll: 3.02912667393684e-07

17581 7465

[1] "Converge params:"
$P
[1] 0.8041635

$mu0
[1] -0.2297049

$kappa0
[1] 0.1008914

$alpha0
[1] 0.9621107

$beta0
[1] 9.723649

   user  system elapsed 
248.306  60.615  99.060 
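One quick sanity check is to compare the converged hyperparameters with the values used in the simulation. The sketch below hard-codes the numbers printed above. It assumes that the reported $P is the proportion of genes without change points (so the matching simulation value would be 1 - 0.2 = 0.8); this reading is inferred from the magnitude of the estimate and is not confirmed by the output itself.

```r
# Values used to simulate the data (P converted to the no-change proportion,
# which is an assumption about how EBtimecourse defines P).
truth <- c(P = 1 - 0.2, mu0 = 0, kappa0 = 0.1, alpha0 = 1, beta0 = 10)

# Converged values copied from the printed output.
estimate <- c(P = 0.8041635, mu0 = -0.2297049, kappa0 = 0.1008914,
              alpha0 = 0.9621107, beta0 = 9.723649)

# Side-by-side table with the absolute difference for each hyperparameter.
comparison <- data.frame(truth = truth, estimate = estimate,
                         abs_diff = abs(estimate - truth))
print(comparison)
```

Under that assumption, every estimate lies within 0.3 of the simulation value in absolute terms, with mu0 and beta0 showing the largest gaps.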


Summarize the sensitivity and FDR for Q1 (which genes have change points) and Q2 (whether the change points are placed at the correct positions).

In [4]:
# Q1: which genes have change points?
Q1_sensitivity = sum(result$cp.index %in% cp_real_seq_index)/nDE
Q1_FDR = sum(!(result$cp.index %in% cp_real_seq_index))/length(result$cp.index)
# Q2: are the change points placed at the correct positions?
cp_position_correct_number = 0
cp_position_not_correct_number = 0
for(i in seq_len(nrow(result$cp.position))) {
  id = result$cp.position$SeqID[i]
  # n1 and n2 together fix both change-point positions (n3 is then determined)
  if(parameter[id,"n1"]==result$cp.position[i, "n1"] && parameter[id,"n2"]==result$cp.position[i, "n2"])
    cp_position_correct_number = cp_position_correct_number+1
  else
    cp_position_not_correct_number = cp_position_not_correct_number+1
}
Q2_sensitivity=cp_position_correct_number/nDE
Q2_FDR=cp_position_not_correct_number/nrow(result$cp.position)

print(Q1_sensitivity)
print(Q1_FDR)
print(Q2_sensitivity)
print(Q2_FDR)

[1] 0.795
[1] 0.0755814
[1] 0.725
[1] 0.07643312
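The two Q1 summaries follow the usual definitions: sensitivity is the fraction of true change-point genes that were called, and FDR is the fraction of called genes that are not true change-point genes. A toy illustration with made-up index vectors (not taken from the run above) shows the arithmetic:

```r
# Hypothetical ground truth: genes 1..5 have change points.
true_idx <- 1:5
# Hypothetical calls: gene 5 is missed and gene 9 is a false positive.
called_idx <- c(1, 2, 3, 4, 9)

# Sensitivity: true positives divided by the number of true change-point genes.
sensitivity <- sum(called_idx %in% true_idx) / length(true_idx)
# FDR: false positives divided by the number of calls.
fdr <- sum(!(called_idx %in% true_idx)) / length(called_idx)

print(sensitivity)  # 0.8
print(fdr)          # 0.2
```

Q2 applies the same two ratios, except that a call only counts as correct when its n1 and n2 both match the simulated values.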
