Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
91 changed files
with
13,130 additions
and
5 deletions.
There are no files selected for viewing
Binary file not shown.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,165 @@ | ||
# Benchmarks | ||
|
||
This folder contains the benchmark code used in **BioSimulator.jl: Stochastic simulation in Julia**. | ||
It compares BioSimulator.jl, [StochPy](http://stochpy.sourceforge.net/), [StochKit2](https://sourceforge.net/projects/stochkit/), and [Gillespie.jl](https://github.com/sdwfrost/Gillespie.jl) across a few examples. These benchmarks are not exhaustive. | ||
|
||
# What are the benchmarks? | ||
|
||
These scripts measure the performance of Gillespie's Direct method (SSA) across each software tool. | ||
In addition, the BioSimulator.jl benchmarks include additional results for the algorithms currently implemented (SSA, FRM, NRM, OTL, and SAL) using full simulation output, fixed-interval output, and multi-threading. | ||
There are five basic models included in this suite (see [model definitions](#Model-definitons)). | ||
|
||
The *task* that is benchmarked is | ||
|
||
``` | ||
run-ssa model t_final n_saves n_trial | ||
``` | ||
|
||
where `run-ssa` is the command that runs SSA for a given software package. | ||
Here `t_final` specifies a time span for an individual realization of the stochastic process, `n_saves` is the number of time points saved for a single realization, and `n_trial` is the number of realizations to simulate. | ||
The time to complete this task is sampled several times to obtain reasonable estimates of summary statistics. | ||
See the section on [benchmark parameters](#Benchmark-parameters) for additional details. | ||
|
||
**The benchmark task we have defined reflects how quickly a given tool generates multiple time trajectories of a given stochastic process using SSA.** | ||
"Good performance" depends on how well the following interact: | ||
|
||
- internal model representation/construction | ||
- algorithm implementation | ||
- simulation engine(s) | ||
- individual output generation/storage | ||
- ensemble output generation/storage | ||
|
||
With the exception of `StochPy`, performance is measured using [BenchmarkTools.jl](https://github.com/JuliaCI/BenchmarkTools.jl). | ||
|
||
# Minimal requirements | ||
|
||
| Software | Version | Dependencies | | ||
|-----------------|---------|----------------------------| | ||
| BioSimulator.jl | 0.4 | Julia v0.6.* | | ||
| StochPy | 2.3 | Python 2.6+ or 3.4+, NumPy | | ||
| StochKit2 | 2.0.13 | | | ||
| Gillespie.jl | 0.0.2 | | | ||
|
||
**Note**: The StochPy scripts use `python3` by default. | ||
|
||
# Running the benchmarks | ||
|
||
## Setup | ||
|
||
The main script is contained in `examples.sh`. | ||
It requires `julia`, `python3`, and `ssa` (for StochKit2) to be visible in your `PATH`. | ||
|
||
## Individual scripts | ||
|
||
BioSimulator.jl, StochPy, and Gillespie.jl each have an associated script for running a benchmark from the command line. | ||
Running BioSimulator.jl with the multithreading option requires setting the `JULIA_NUM_THREADS` environment variable (see `examples.sh` for an example). | ||
It uses the following scripts to run the suite: | ||
|
||
#### BioSimulator.jl | ||
|
||
Usage: `julia benchmarks_biosimulator.jl [model1] [model2] ...` | ||
|
||
#### StochPy | ||
|
||
Usage: `python3 stochpy_bench.py [model] [t_final] [n_saves] [seed] [n_trial]` | ||
|
||
#### StochKit | ||
|
||
Usage: `julia benchmarks_stochkit.jl [model1] [model2] ...` | ||
|
||
#### Gilespie.jl | ||
|
||
Usage: `julia benchmarks_gillespie.jl [model1] [model2] ...` | ||
|
||
## Model definitions | ||
|
||
Directories: | ||
- `autoreg` | ||
- `dimer-decay` | ||
- `kendall` | ||
- `mmek` | ||
- `yeast` | ||
|
||
Files (within each model directory): | ||
|
||
- `biosimulator.jl` | ||
- `gillespie.jl` | ||
- `stochkit2.xml` | ||
- `stochpy.psc` | ||
|
||
## Benchmark parameters | ||
|
||
With the exception of the StochPy script, all the benchmark scripts source benchmark parameters from `benchmark_parameters.jl`. | ||
This file simply stores the arguments passed to each tool's simulation command. | ||
Parameters are stored in a `ParamSet` which is a 5-tuple with the following structure: | ||
|
||
- `t_final`: Each simulation command realizes the simulated stochastic process up to this value. | ||
- `n_saves`: The number of time points to save for each realization. Used only for fixed-interval output. | ||
- `n_trial`: The number of realizations to simulate. | ||
- `n_sample`: The number of times to sample a simulation *task*, as defined above ([What are the benchmarks?](What-are-the-benchmarks?)) | ||
- `t_limit`: A time limit for individual benchmarks. A benchmark task will terminate if this limit is exceeded, even if the number of samples is less than `n_sample`. | ||
|
||
In addition, a `SEED` value is used to seed each benchmark task. | ||
The `MODELS` variable stores the name of each model included in this benchmark suite. | ||
The `parameters` dictionary is used to associate `MODELS[i]` to a `ParamSet`. | ||
|
||
## Results | ||
|
||
Files: | ||
|
||
- `biosimulator-serial.json` | ||
- `biosimulator-parallel.json` | ||
- `stochkit.json` | ||
- `gillespie.json` | ||
- `stochpy/*.txt` | ||
|
||
The BenchmarkTools.jl output is saved in JSON files. | ||
These files, along with the output from StochPy, are summarized using | ||
|
||
``` | ||
# for JSON output | ||
julia analysis.jl <filename>.json >> <another-filename>.txt | ||
# for StochPy output | ||
julia analysis.jl ./stochpy/<model>.txt >> summary-stochpy.txt | ||
``` | ||
The summary files are structured as follows: | ||
|
||
``` | ||
----- <model-name-here> ----- | ||
trials: <value> # no. times the task was measured | ||
mean: <estimate> ms # mean based on <value> | ||
std: <estimate> ms # standard deviation based on <value> | ||
median: <estimate> ms # median based on <value> | ||
Q1: <estimate> ms # first quartile based on <value> | ||
Q3: <estimate> ms # third quartile based on <value> | ||
``` | ||
|
||
# Important caveats for consideration | ||
|
||
The tools selected here vary in output saving strategies, algorithm selection, and model representation. | ||
These notes should inform the interpretation of the benchmark results. | ||
Bold type is used to emphasize a point that is likely to have a large impact on performance. | ||
Please feel free to file an issue if an important point is missing. | ||
|
||
### BioSimulator.jl | ||
|
||
- Uses fixed-interval output. | ||
- Generates a dependency graph even if it is not used. | ||
|
||
### StochPy | ||
|
||
- Saves state after each simulation step. Fixed-interval output is available but it uses StochKit2 internally. | ||
- Saves additional information about reaction propensities. | ||
|
||
### StochKit2 | ||
|
||
- Uses fixed-interval output. | ||
- Uses parallelism by default. | ||
- Selects a variant of SSA based on model information. In our benchmarks, `odm_ssa_small` is used. | ||
- **The `ssa` command is invoked from within Julia.** The benchmark results use a small adjustment to account for any overhead, which varies depending on the machine. | ||
|
||
### Gillespie.jl | ||
|
||
- Saves state after each simulation step. | ||
- Does not generate trajectories in an ensemble like the other tools. **We implement a simple wrapper that uses a `for` loop to mimic this feature.** The output of each run is effectively ignored. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,64 @@ | ||
using BenchmarkTools, StatsBase | ||
|
||
##### function definitions ##### | ||
|
||
# for BenchmarkTools.jl results | ||
function summarize(::Val{true}, fname) | ||
# retrieve the Trial objects for the test suite | ||
results = BenchmarkTools.load(fname)[1] | ||
|
||
for (b_info, trial) in leaves(results) | ||
# convert results from ns to ms | ||
times = [ t * 1e-6 for t in trial.times ] | ||
|
||
b_mean = mean(times) | ||
b_std = std(times) | ||
b_median = median(times) | ||
b_q1 = quantile(times, 0.25) | ||
b_q3 = quantile(times, 0.75) | ||
|
||
# benchmark info string | ||
str = [ "$(s) " for s in b_info ] | ||
|
||
println("----- $(str...)-----") | ||
println("trials: $(length(times))") | ||
println(" mean: $(b_mean) ms") | ||
println(" std: $(b_std) ms") | ||
println("median: $(b_median) ms") | ||
println(" Q1: $(b_q1) ms") | ||
println(" Q3: $(b_q3) ms") | ||
println() | ||
end | ||
end | ||
|
||
function summarize(::Val{false}, fname) | ||
# retrieve data from Python benchmarks | ||
results = map(x -> parse(Float64, x), readlines(fname)) | ||
|
||
# convert results from s to ms | ||
times = [ t * 1e3 for t in results ] | ||
|
||
b_mean = mean(times) | ||
b_std = std(times) | ||
b_median = median(times) | ||
b_q1 = quantile(times, 0.25) | ||
b_q3 = quantile(times, 0.75) | ||
|
||
println("----- $(fname) ------") | ||
println("trials: $(length(results))") | ||
println(" mean: $(b_mean) ms") | ||
println(" std: $(b_std) ms") | ||
println("median: $(b_median) ms") | ||
println(" Q1: $(b_q1) ms") | ||
println(" Q3: $(b_q3) ms") | ||
println() | ||
end | ||
|
||
##### command-line tool ##### | ||
|
||
fname = ARGS[1] | ||
ftype = split(fname, ".")[end] | ||
|
||
is_json = ftype == "json" | ||
|
||
summarize(Val(is_json), fname) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
""" | ||
Prokaryotic auto-regulation network | ||
""" | ||
function autoreg(; | ||
gene = 10, P2_gene = 0, RNA = 0, P = 0, P2 = 0, | ||
k1 :: Float64 = 0.01, | ||
k2 :: Float64 = 10.0, | ||
k3 :: Float64 = 1.0, | ||
k3r :: Float64 = 1.0, | ||
k4 :: Float64 = 1.0, | ||
k4r :: Float64 = 10.0, | ||
k5 :: Float64 = 0.1, | ||
k6 :: Float64 = 0.01) | ||
m = Network("auto-regulation") | ||
|
||
m <= Species("gene", gene) | ||
m <= Species("P2_gene", P2_gene) | ||
m <= Species("RNA", RNA) | ||
m <= Species("P", P) | ||
m <= Species("P2", P2) | ||
|
||
m <= Reaction("transcription", k1, "gene --> gene + RNA") | ||
m <= Reaction("translation", k2, "RNA --> RNA + P") | ||
m <= Reaction("dimerization", k3, "P + P --> P2") | ||
m <= Reaction("dissociation", k3r, "P2 --> P + P") | ||
m <= Reaction("repression binding", k4, "gene + P2 --> P2_gene") | ||
m <= Reaction("reverse repression binding", k4r, "P2_gene --> gene + P2") | ||
m <= Reaction("RNA degradation", k5, "RNA --> 0") | ||
m <= Reaction("protein degradation", k6, "P --> 0") | ||
|
||
return m | ||
end | ||
|
||
model = autoreg() |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
function F_autoreg(x, parms) | ||
gene, P2_gene, RNA, P, P2 = x | ||
k1, k1r, k2, k3, k4, k4r, k5, k6 = parms | ||
|
||
repression_binding = k1 * gene * P2 | ||
reverse_repression = k1r * P2_gene | ||
transcription = k2 * gene | ||
translation = k3 * RNA | ||
dimerization = k4 * P * (P - 1) / 2 | ||
dissociation = k4r * P2 | ||
RNA_degradation = k5 * RNA | ||
protein_degradation = k6 * P | ||
|
||
return [repression_binding, reverse_repression, transcription, translation, dimerization, dissociation, RNA_degradation, protein_degradation] | ||
end | ||
|
||
function gillespie_autoreg(; | ||
gene = 10, P2_gene = 0, RNA = 0, P = 0, P2 = 0, | ||
k1 = 1.0, k1r = 10.0, k2 = 0.01, k3 = 10.0, k4 = 1.0, k4r = 1.0, k5 = 0.1, k6 = 0.01) | ||
# initial conditions | ||
x0 = [gene, P2_gene, RNA, P, P2] | ||
# stoichiometries | ||
nu = [ | ||
[-1 1 0 0 -1]; # gene + P2 --> P2_gene | ||
[ 1 -1 0 0 1]; # P2_gene --> gene + P2 | ||
[ 0 0 1 0 0]; # gene --> gene + RNA | ||
[ 0 0 0 1 0]; # RNA --> RNA + P | ||
[ 0 0 0 -2 1]; # P + P --> P2 | ||
[ 0 0 0 2 -1]; # P2 --> P + P | ||
[ 0 0 -1 0 0]; # RNA --> 0 | ||
[ 0 0 0 -1 0] # P --> 0 | ||
] | ||
# parameter vector | ||
parms = [k1, k1r, k2, k3, k4, k4r, k5, k6] | ||
return x0, nu, parms | ||
end | ||
|
||
x0, nu, parms = gillespie_autoreg() | ||
F = F_autoreg |
Oops, something went wrong.