separate quick start document to keep doc indexes tidy
exaexa committed Jun 6, 2022
1 parent 9f22692 commit 152addf
Showing 3 changed files with 123 additions and 103 deletions.
109 changes: 13 additions & 96 deletions README.md
@@ -11,6 +11,9 @@
[docs-img-dev]: https://img.shields.io/badge/docs-latest-0af.svg
[docs-url-dev]: https://lcsb-biocore.github.io/COBREXA.jl/dev/

[docs-url-quickstart]: https://lcsb-biocore.github.io/COBREXA.jl/stable/quickstart/
[docs-url-examples]: https://lcsb-biocore.github.io/COBREXA.jl/stable/examples/

[docker-url]: https://hub.docker.com/r/lcsbbiocore/cobrexa.jl
[docker-img]: https://img.shields.io/docker/image-size/lcsbbiocore/cobrexa.jl

@@ -66,11 +69,11 @@ installation-related difficulties. Of course, [the Julia
channel](https://discourse.julialang.org/) is another fast and easy way to find
answers to Julia specific questions.

### Quick start guide
### Quick start

[COBREXA.jl documentation](https://lcsb-biocore.github.io/COBREXA.jl/stable/)
[COBREXA.jl documentation][docs-url-stable]
is available online (also for
[development version](https://lcsb-biocore.github.io/COBREXA.jl/dev/)
[development version][docs-url-dev]
of the package).

<!--quickstart_begin-->
@@ -129,100 +132,13 @@ Dict{String,Float64} with 95 entries:
"R_TALA" => 1.49698
⋮ => ⋮
```

#### Model variant processing

The main feature of COBREXA.jl is the ability to easily specify and process
many analyses in parallel. To demonstrate, let's see how the organism would perform if
some reactions were disabled independently:

```julia
# convert to a model type that is efficient to modify
m = convert(StandardModel, model)

# find the model objective value if oxygen or carbon dioxide transports are disabled
screen(m, # the base model
variants=[ # this specifies how to generate the desired model variants
[], # one with no modifications, i.e. the base case
[with_changed_bound("R_O2t", lower=0.0, upper=0.0)], # disable oxygen
[with_changed_bound("R_CO2t", lower=0.0, upper=0.0)], # disable CO2
[with_changed_bound("R_O2t", lower=0.0, upper=0.0),
with_changed_bound("R_CO2t", lower=0.0, upper=0.0)], # disable both
],
# this specifies what to do with the model variants (received as the argument `x`)
analysis = x ->
flux_balance_analysis_dict(x, Tulip.Optimizer)["R_BIOMASS_Ecoli_core_w_GAM"],
)
```
You should receive a result showing that missing oxygen transport makes the
biomass production much harder:
```julia
4-element Vector{Float64}:
0.8739215022674809
0.21166294973372796
0.46166961413944896
0.21114065173865457
```

Most importantly, such analyses can be easily specified by automatically
generating long lists of modifications to be applied to the model, and
parallelized.

Knocking out each reaction in the model is efficiently accomplished:

```julia
# load the task distribution package, add several worker nodes, and load
# COBREXA and the solver on the nodes
using Distributed
addprocs(4)
@everywhere using COBREXA, Tulip

# get a list of the workers
worker_list = workers()

# run the processing in parallel for many model variants
res = screen(m,
variants=[
# create one variant for each reaction in the model, with that reaction knocked out
[with_changed_bound(reaction_id, lower=0.0, upper=0.0)]
for reaction_id in reactions(m)
],
analysis = model -> begin
# we need to check if the optimizer even found a feasible solution,
# which may not be the case if we knock out important reactions
sol = flux_balance_analysis_dict(model, Tulip.Optimizer)
isnothing(sol) ? nothing : sol["R_BIOMASS_Ecoli_core_w_GAM"]
end,
# run the screening in parallel on all workers in the list
workers = worker_list,
)
```

As a result, you should get a long list of biomass production values, one for
each reaction knockout. Let's decorate it with reaction names:
```julia
Dict(reactions(m) .=> res)
```
...which should output an easily accessible dictionary with all the objective
values named, giving a quick overview of which reactions are critical for the
model organism to create biomass:
```julia
Dict{String, Union{Nothing, Float64}} with 95 entries:
"R_ACALD" => 0.873922
"R_PTAr" => 0.873922
"R_ALCD2x" => 0.873922
"R_PDH" => 0.796696
"R_PYK" => 0.864926
"R_CO2t" => 0.46167
"R_EX_nh4_e" => 1.44677e-15
"R_MALt2_2" => 0.873922
"R_CS" => 2.44779e-14
"R_PGM" => 1.04221e-15
"R_TKT1" => 0.864759
⋮ => ⋮
```
<!--quickstart_end-->

The main feature of COBREXA.jl is the ability to easily specify and process a
huge number of analyses in parallel. You can have a look at a
[longer guide that describes the parallelization and screening functionality][docs-url-quickstart],
or dive into the [example analysis workflows][docs-url-examples].

### Testing the installation

If you run a non-standard platform (e.g. a customized operating system), or if
@@ -272,7 +188,8 @@ The development was supported by European Union's Horizon 2020 Programme under
PerMedCoE project ([permedcoe.eu](https://permedcoe.eu/)) agreement no. 951773.
<!--acknowledgements_end-->

If you use COBREXA.jl and want to refer to it in your work, use the following citation format (also available as BibTeX in [cobrexa.bib](cobrexa.bib)):
If you use COBREXA.jl and want to refer to it in your work, use the following
citation format (also available as BibTeX in [cobrexa.bib](cobrexa.bib)):

> Miroslav Kratochvíl, Laurent Heirendt, St Elmo Wilken, Taneli Pusa, Sylvain Arreckx, Alberto Noronha, Marvin van Aalst, Venkata P Satagopam, Oliver Ebenhöh, Reinhard Schneider, Christophe Trefois, Wei Gu, *COBREXA.jl: constraint-based reconstruction and exascale analysis*, Bioinformatics, Volume 38, Issue 4, 15 February 2022, Pages 1171–1172, https://doi.org/10.1093/bioinformatics/btab782
21 changes: 14 additions & 7 deletions docs/make.jl
@@ -27,22 +27,28 @@ for notebook in notebooks
Literate.notebook(notebook, notebooks_outdir)
end

# generate index.md from .template and the quickstart in README.md
readme = open(f -> read(f, String), joinpath(@__DIR__, "..", "README.md"))
# extract shared documentation parts from README.md
readme_md = open(f -> read(f, String), joinpath(@__DIR__, "..", "README.md"))
quickstart =
match(r"<!--quickstart_begin-->\n([^\0]*)<!--quickstart_end-->", readme).captures[1]
match(r"<!--quickstart_begin-->\n([^\0]*)<!--quickstart_end-->", readme_md).captures[1]
acks = match(
r"<!--acknowledgements_begin-->\n([^\0]*)<!--acknowledgements_end-->",
readme,
readme_md,
).captures[1]
ack_logos =
match(r"<!--ack_logos_begin-->\n([^\0]*)<!--ack_logos_end-->", readme).captures[1]
match(r"<!--ack_logos_begin-->\n([^\0]*)<!--ack_logos_end-->", readme_md).captures[1]

# insert the shared documentation parts into index and quickstart templates
#TODO use direct filename read/write
index_md = open(f -> read(f, String), joinpath(@__DIR__, "src", "index.md.template"))
index_md = replace(index_md, "<!--insert_quickstart-->\n" => quickstart)
index_md = replace(index_md, "<!--insert_acknowledgements-->\n" => acks)
index_md = replace(index_md, "<!--insert_ack_logos-->\n" => ack_logos)
open(f -> write(f, index_md), joinpath(@__DIR__, "src", "index.md"), "w")

quickstart_md = open(f -> read(f, String), joinpath(@__DIR__, "src", "quickstart.md.template"))
quickstart_md = replace(quickstart_md, "<!--insert_quickstart-->\n" => quickstart)
open(f -> write(f, quickstart_md), joinpath(@__DIR__, "src", "quickstart.md"), "w")

# copy the contribution guide
cp(
joinpath(@__DIR__, "..", ".github", "CONTRIBUTING.md"),
@@ -78,6 +84,7 @@ makedocs(
linkcheck = !("skiplinks" in ARGS),
pages = [
"Home" => "index.md",
"Quick start" => "quickstart.md",
"User guide" => [
"Quickstart tutorials" =>
vcat("All tutorials" => "tutorials.md", find_mds("tutorials")),
@@ -86,7 +93,7 @@ makedocs(
"Examples and notebooks" =>
vcat("All notebooks" => "notebooks.md", find_mds("notebooks")),
],
"Types and functions" => vcat("Contents" => "functions.md", find_mds("functions")),
"Function reference" => vcat("Contents" => "functions.md", find_mds("functions")),
"How to contribute" => "howToContribute.md",
],
)
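
The marker-based extraction and template substitution used in `docs/make.jl` can be illustrated in isolation. The following is a minimal sketch, not the project's build script: the README text and the template string are made-up stand-ins, and only `match` and `replace` from Julia's Base are used.

```julia
# Hypothetical README content, standing in for the real README.md
readme_md = """
# Title
<!--quickstart_begin-->
Some quick start text.
<!--quickstart_end-->
Footer
"""

# Capture everything between the two markers. `[^\0]*` matches any character
# except NUL, so the capture may span several lines.
m = match(r"<!--quickstart_begin-->\n([^\0]*)<!--quickstart_end-->", readme_md)

# `match` returns `nothing` when the markers are missing, so check before
# touching `.captures`
m === nothing && error("quickstart markers not found")
quickstart = m.captures[1]

# Hypothetical template, standing in for quickstart.md.template
template = "# Quick start guide\n\n<!--insert_quickstart-->\nMore text.\n"

# Substitute the placeholder line with the extracted section
page = replace(template, "<!--insert_quickstart-->\n" => quickstart)
print(page)
```

The explicit `nothing` check is an addition for illustration. Without it, a README that lacks the markers fails with a less descriptive error when `.captures` is accessed.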
96 changes: 96 additions & 0 deletions docs/src/quickstart.md.template
@@ -0,0 +1,96 @@

# Quick start guide

<!--insert_quickstart-->

#### Model variant processing

The main feature of COBREXA.jl is the ability to easily specify and process
many analyses in parallel. To demonstrate, let's see how the organism would perform if
some reactions were disabled independently:

```julia
# convert to a model type that is efficient to modify
m = convert(StandardModel, model)

# find the model objective value if oxygen or carbon dioxide transports are disabled
screen(m, # the base model
variants=[ # this specifies how to generate the desired model variants
[], # one with no modifications, i.e. the base case
[with_changed_bound("R_O2t", lower=0.0, upper=0.0)], # disable oxygen
[with_changed_bound("R_CO2t", lower=0.0, upper=0.0)], # disable CO2
[with_changed_bound("R_O2t", lower=0.0, upper=0.0),
with_changed_bound("R_CO2t", lower=0.0, upper=0.0)], # disable both
],
# this specifies what to do with the model variants (received as the argument `x`)
analysis = x ->
flux_balance_analysis_dict(x, Tulip.Optimizer)["R_BIOMASS_Ecoli_core_w_GAM"],
)
```
You should receive a result showing that missing oxygen transport makes the
biomass production much harder:
```julia
4-element Vector{Float64}:
0.8739215022674809
0.21166294973372796
0.46166961413944896
0.21114065173865457
```
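
To make the comparison easier to read, the objective values can be expressed relative to the base case. This is a small sketch in plain Julia that reuses the four numbers printed above. The labels are illustrative names for the four variants, in the order they were passed to `screen`.

```julia
# illustrative labels, in the same order as the variants above
labels = ["base case", "no O2 transport", "no CO2 transport", "no O2 and CO2 transport"]

# objective values copied from the output above
growth = [0.8739215022674809, 0.21166294973372796, 0.46166961413944896, 0.21114065173865457]

# express each variant's growth as a fraction of the unmodified model's growth
relative = growth ./ growth[1]

for (label, r) in zip(labels, relative)
    println(rpad(label, 26), round(100 * r; digits = 1), " %")
end
```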

Most importantly, such analyses can be easily specified by automatically
generating long lists of modifications to be applied to the model, and
parallelized.

Knocking out each reaction in the model is efficiently accomplished:

```julia
# load the task distribution package, add several worker nodes, and load
# COBREXA and the solver on the nodes
using Distributed
addprocs(4)
@everywhere using COBREXA, Tulip

# get a list of the workers
worker_list = workers()

# run the processing in parallel for many model variants
res = screen(m,
variants=[
# create one variant for each reaction in the model, with that reaction knocked out
[with_changed_bound(reaction_id, lower=0.0, upper=0.0)]
for reaction_id in reactions(m)
],
analysis = model -> begin
# we need to check if the optimizer even found a feasible solution,
# which may not be the case if we knock out important reactions
sol = flux_balance_analysis_dict(model, Tulip.Optimizer)
isnothing(sol) ? nothing : sol["R_BIOMASS_Ecoli_core_w_GAM"]
end,
# run the screening in parallel on all workers in the list
workers = worker_list,
)
```

As a result, you should get a long list of biomass production values, one for
each reaction knockout. Let's decorate it with reaction names:
```julia
Dict(reactions(m) .=> res)
```
...which should output an easily accessible dictionary with all the objective
values named, giving a quick overview of which reactions are critical for the
model organism to create biomass:
```julia
Dict{String, Union{Nothing, Float64}} with 95 entries:
"R_ACALD" => 0.873922
"R_PTAr" => 0.873922
"R_ALCD2x" => 0.873922
"R_PDH" => 0.796696
"R_PYK" => 0.864926
"R_CO2t" => 0.46167
"R_EX_nh4_e" => 1.44677e-15
"R_MALt2_2" => 0.873922
"R_CS" => 2.44779e-14
"R_PGM" => 1.04221e-15
"R_TKT1" => 0.864759
⋮ => ⋮
```
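
A dictionary like this can be filtered to list the reactions whose knockout stops growth. The sketch below uses only plain Julia and a few entries copied from the output above. The cutoff `threshold` is an assumed value chosen for illustration, not a COBREXA.jl default.

```julia
# A few entries copied from the dictionary above; in practice, use the full
# result of `Dict(reactions(m) .=> res)`
knockouts = Dict{String,Union{Nothing,Float64}}(
    "R_ACALD" => 0.873922,
    "R_PDH" => 0.796696,
    "R_CO2t" => 0.46167,
    "R_EX_nh4_e" => 1.44677e-15,
    "R_CS" => 2.44779e-14,
    "R_PGM" => 1.04221e-15,
)

# assumed cutoff below which growth is treated as zero (solver noise)
threshold = 1e-6

# a knockout counts as critical if the solver found no solution (`nothing`)
# or if the biomass production is numerically zero
critical = sort([
    rid for (rid, growth) in knockouts if isnothing(growth) || growth < threshold
])

println(critical)
```

Treating `nothing` as critical is a design choice for this sketch. An infeasible knockout and a zero-growth knockout may need to be told apart in a real analysis.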
