The current project state includes two available packages written along with the MSc Dissertation and all source code required to generate the results presented in the submitted Dissertation.
The python muqfatc
package was written for analysing micro Quantiative Fitness Analysis images obtained by automated microscopy in order to obtain isogenic single lineage time course.
This package is now on PyPI and can be installed using pip install muqfatc
.
The detstochgrowth
package was written for subsequent time course analyses. Provided functions are for generating single lineage growth curves, generating population inferences from single lineage data and more. This package was used to analyze both the in-house muQFA data and the published data kindly provided to us by [Levy et al. 2012] (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3348152/). Synthetic data was also analyzed for exploratory purposes. These data are included in the Analyses/Synthetic directory.
- Time course observations are saved as tiff images and are sorted into folders labeled by pin numbers (RxxCzz) specifying the rows and columns from the top left-hand corner of an agar plate on which yeast cells were grown.
- The image analysis tool
muqfatc
iterates through the pin directories analyzing single lineages observed in each time course. - Single blobs are captured by shape detection using a Canny edge map detection algorithm and optical dilation and erosion. Colonies are checked for circularity.
FINALPHOT=20 # Time course length
fullpath = os.path.abspath(syspath) # Working directory
# Generating a grey background of median pixels of all first images
DX,DY =20,20 # Background size
bk=muqfatc.makeBackground(myfolders,DX,DY)
# Generating time course estimates (as output matrices saved in text files)
# and time course images (saved in respective output folders).
muqfatc.lineagetimecourse(myfolders,FINALPHOT,bk,DX,DY,fullpath,apply_filt=True,
save_pics=True,log=False,show_im=False)
- In order to be able to compare the population simulations done using single lineage data to true population observations pin population growth can also be captured using
muqfatc
.
# Getting time and area for pin growth estimates for 90 observations
area,time=muqfatc.pintimecourse(myfolders,90,fullpath)
Two frequentist, deterministic approaches are implemented.
rates=c()
for (i in 1:dim(area)[1]){
k=detstocgrowth::LM_growthrate(area[i,],times[i,])$rate
rates=c(rates,k)
}
rates=c()
for (i in 1:dim(area)[1]){
k=detstocgrowth::EXP_growthrate(area[i,],times[i])$rate
rates=c(rates,k)
}
A bayesian approach using rjags
with various model specifications is available. A few examples are shown below.
gc=250:255 # Inference for growth curves 250:255
area=area[gc,]
times=times[gc,]
data=data[gc,]
iterations=1000000
thin=1000
num_chains=4
model_choice=2
bayesinf=detstocgrowth::BayesDet(area,times,model_choice,iterations,thin,num_chains)
model_choice=1
bayesinf=detstochgrowth::BayesDet(area,times,model_choice,iterations,thin,num_chains)
model_choice=3
bayesinf=detstochgrowth::BayesDet(area,times,model_choice,iterations,thin,num_chains)
Both the exponential and the logistic model have a additional model specification with a B value. The B values sets a distinction between dividing and non-dividing growth curves such that N(t)=BN0exp(r*t) for the exponential model. B=1 means the lineage is growing / cells are dividing. B=0 means no growth. This was used to test as a possible option for the stochastic model. (Not discussed in the Dissertation).
A Bayesian stochastic/deterministic hybrid model was also tested for parameter inference. NB: the current implementation is not suitable for a high-throughput level but can be used on individual growth curves which contain some noise and are relatively fast-growing.
# Apply calibration and format data
cells=as.numeric(area)/area[,1]
data=data.frame(c=cells,t=as.numeric(times))
rownames(data)=data$t
data$t=NULL
# Specify parameters for Inference
guessK=100000
guessr=0.3
params=c(K=guessK,r=guessr)
pmin=c(K=0.5*guessK,r=0.01)
pmax=c(K=2*guessK,r=1)
switchN=100 # switch from stochastic to deterministic model
noiseSD=20
iterations=1000000
thin=10000
tune=0.02
# Inference using the Hybrid Model
detstochgrowth::mcmc_chain=BayesHybrid(params,pmin,pmax,iterations,tune,thin,noiseSD,switchN,data)
smfsb::mcmcSummary(mcmc_chain,show=TRUE,plot=TRUE)
# Plotting the MCMC Output
detstochgrowth::PlotPosteriorProb(pmin,pmax,mcmc_chain)
# Plotting the Posterior Predictives
detstocgrowth::PlotPosteriorPredictive(data,mcmc_chain,switchN,guessK)
Population growth is simulated from single lineage data associated with a strain as follows:
strain_names=unique(data$genotype)
pickstrain="HIS3"
strain=detstocgrowth::subset_strain(data,area,times,pickstrain)
strain_rates=rates[which(data$genotype==pickstrain)]
iterations=100
yl=10^22
gr=seq(0,0.5,0.005)
N=1000
t=seq(1,48)
gr=seq(0,0.5,0.001)
simpopdat=detstocgrowth::pop_sim_plot(strain,strain_rates,gr,yl,iterations,t)
Additional features which can be used to analyse the population simulations include:
# Generating a colour map which is used to color population simulations
detstochgrowth::yellowredmap(gr)
# Producing fishplots
N=10000
sample_rates=sample(strain_rates,N,replace=TRUE)
time=seq(1,48)
detstochgrowth::pop_comoposition(sample_rates,time,gr)
# Calculating percentile lineage contributions
t=seq(1,48,1)
iter=1000
StartPops=c(50,100,500,1000,5000,10000)
#StartPops=c(50,100,500)
for (N in StartPops){
newinfo=detstochgrowth::pop_composition_dat(t,iter,N,strain_rates)
if (N==StartPops[1]){
info=newinfo
} else{
info=data.frame(info,newinfo)
}
}
# Estimating lag phase durations
N=seq(100,10000,100)
time=seq(0,48,1)
iterations=100
strainlag=detstocgrowth::lagduration(strain=0,strain_rates,t,N,iterations)
which assess lag duration with increasing inoculation size, lineage growth increasing inoculation sizes and single lineage percentile composition of population over time.
# Plotting the Pin Population Estimates
PinDat=detstochgrowth::pin_pop_plot(strain,strainarea,sraintimes)
sortedfolders=sortedfolders[18] #Looking only at pin 18
obs=30 # Time course length
conv=92.5 # Conversion factor; area/conv=cells
it=100 #Number of simulations
Comp=detstocgrowth::PlotPinObsSim(area,times,sortedfolders,obs,conv,it,leg=FALSE)
- There is an additional feature in the
muqfatc
package which colour-codes the colonies on the plate according the growth rate and residual size. Code for this is included in the Analyses/ImageAnalysis directory but is not addressed in the Dissertation. No significant patters were found.
Discrete stochastic kinetics for individual microbial microcolony growth curves.
Some slides about microQFA.
Some slides about discrete stochastic modelling of microcolony growth curves.
Data obtained from [Levy et al. 2012] (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3348152/) and [Ziv et al. 2013] (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3840306/).