
MOVING TO: SEE README. Because sometimes you just want to simulate single prokaryotic biological living whole cell models starting from DNA to minute detail to understand how it works and predict simple experimental observations.

Awesome whole cell simulation

1. The idea

I was thinking maybe something along these lines:

  • precalculate the interactions between the molecules (simulated?!) to speed things up

  • then discretize the cell spatially. TODO: is this necessary? If yes, maybe a 2D slice type of model, on GPU? Model molecule concentrations over time in each slot.

  • then modify the genome and determine what proteins come out. Protein folding simulations?!

  • precalculate the interactions of the new proteins, and loop


  • can you get enough data out of a single cell? How long until it doubles? Do the cells interact with one another, and/or change the extracellular medium noticeably? Having single cells would be simpler and more precise.

    Another option is to make one huge cell in a cell-free environment:
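To make the "concentration with time in each slot" idea from the list above a bit more concrete, here is a minimal sketch in plain Python. It assumes the simplest possible model: a single molecule species spreading by diffusion on a 2D grid with closed walls. The function name `diffuse_step` and the `rate` parameter are made up for illustration, and a real model would also need reactions between species.

```python
def diffuse_step(grid, rate):
    """Return a new grid after one explicit diffusion step.

    grid: list of rows, one concentration value per spatial slot.
    rate: fraction exchanged with each of the 4 neighbours per step
          (hypothetical parameter; keep it <= 0.25 so values stay non-negative).
    Walls are closed, so the total amount of the molecule is conserved.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Flow is proportional to the concentration difference.
                    new[r][c] += rate * (grid[nr][nc] - grid[r][c])
    return new


# Start with everything in the central slot and let it spread out.
grid = [[0.0] * 5 for _ in range(5)]
grid[2][2] = 100.0
for _ in range(10):
    grid = diffuse_step(grid, 0.1)
```

Each slot only looks at its direct neighbours, which is the kind of update that maps well onto a GPU, but whether this level of spatial detail is needed at all is still the open TODO above.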

2. Cell free

Since cells are themselves too complicated, maybe we can start with simplified DNA systems?

Maybe we can make insulin more efficiently by selecting only the parts of the cell we care about!


  • Compartmentalization of an all-E. coli Cell-Free Expression System for the Construction of a Minimal Cell. 2016. Filippo Caschera, Vincent Noireaux

  • Engine out of the chassis: Cell-free protein synthesis and its uses. 2013. Gabriel Rosenblum, Barry S. Cooperman.

  • Rapid and Scalable Preparation of Bacterial Lysates for Cell-Free Gene Expression. 2017. Andriy Didovyk, Taishi Tonooka, Lev Tsimring, and Jeff Hasty.

  • Developing cell-free biology for industrial applications. 2006. Jim Swartz.

  • Portable, On-Demand Biomolecular Manufacturing. 2016. Keith Pardee, Shimyn Slomovic, Peter Q. Nguyen, Christopher N. Boddy, Neel S. Joshi, James J. Collins.

2.1. Minimal biology

I like this.

3. Projects with source code

Shut up and give me the fine source.

4. Videos

Shut up and show me a visualisation.

5. Research areas

5.2. Gene editing

Ah, it would be even more awesome if we could hack up the cells and see them do stuff.

Only in the second half of the 2010s did it become possible to edit genes, but coding the entire DNA from scratch is still too expensive.

Previously, you would have to:

and then kill ones that didn’t get the gene, which is less reliable.

5.2.1. Exciting gene edited systems

6. Papers

I guess this is what researchers do instead of blog posts. Go figure!

  • The principles of whole-cell modeling. Jonathan R Karr, Koichi Takahashi and Akira Funahashi

  • The Future of Whole-Cell Modeling. Derek N. Macklin, Nicholas A. Ruggero, and Markus W. Covert

  • Paper-Based Synthetic Gene Networks. Keith Pardee, Alexander A. Green, Tom Ferrante, D. Ewen Cameron, Ajay DaleyKeyser, Peng Yin, and James J. Collins

  • Paper as a novel material platform for devices. Jason P. Rolland and Devin A. Mourey

  • A Whole-Cell Computational Model Predicts Phenotype from Genotype. Jonathan R. Karr, Jayodita C. Sanghvi, Derek N. Macklin, Miriam V. Gutschow, Markus Covert. Notes: Mycoplasma genitalium. Model apparently at:

7. Lectures

10. Big projects, institutes and companies

11. Experimental


Manipulate individual cells:

DNA sequencing:

11.1. Single cell stuff

Protein measurement:


11.2. Mass spectrometry

Potentially measure the quantities of every substance in the cell?

12. Cell behaviour

Random list of interesting cell behaviour that we have to model and might verify, in particular what kind of external environment they expect to encounter:

15. Courses and conferences

17. Data sources

Questions that beg for a database answer:

17.1. How to download the genomes?

Why is it so freaking hard to get the FASTA with a plain wget link? OMG it is so bad.

The best way so far is to get an accession number of the form NC_001416.1 and then:

wget -O NC_001416.1.fasta ''
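The URL is missing from the wget command above in this copy, and it is left as is. As one possibility, and this is an assumption rather than necessarily the endpoint originally used, NCBI's E-utilities `efetch` service can return a nucleotide record as FASTA given an accession number. The sketch below only builds such a URL; the helper name `efetch_fasta_url` is made up for illustration.

```python
from urllib.parse import urlencode

# NCBI E-utilities efetch endpoint (assumed here; may differ from the
# URL that the original wget command used).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession):
    """Build an efetch URL that asks for a nucleotide record as FASTA text."""
    params = {
        "db": "nuccore",      # nucleotide database
        "id": accession,      # e.g. NC_001416.1
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(params)


url = efetch_fasta_url("NC_001416.1")
print(url)
```

The printed URL could then be passed to `wget -O NC_001416.1.fasta`, quoted so that the shell does not interpret the `&` characters.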


  • where is that API documented?

  • how to download zipped data?

  • data sources?

  • how is population genetic variation accounted for?

  • what do the NNNN mean? Unknown? They are present in the human genome.

  • what are the "unlocalized genomic scaffold" regions?

17.2. Cool genomes

17.2.1. Lambda phage


17.3. Data formats

17.3.1. FASTA

Just raw sequence + origin / id metadata.
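A minimal sketch of how little there is to the format: a header line starting with `>` followed by one or more sequence lines. The function name `parse_fasta` and the example records are made up for illustration, and the sketch ignores rarer cases such as comment lines.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequences may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Made-up example data, not a real genome.
example = """>seq1 made-up example record
ACGTAC
GTNN
>seq2
TTGA
"""
records = list(parse_fasta(example))
```

For a downloaded file, something like `parse_fasta(open("NC_001416.1.fasta").read())` should work for genomes that fit in memory.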

17.3.2. FASTQ

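FASTA + per-base quality scores for the base calls, encoded as ASCII characters. The encoding is not fully standardized: both Phred+33 and Phred+64 offsets exist.

Widely output by sequencing machines as of the 2010s.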
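Here is a rough sketch of reading FASTQ records and turning the ASCII quality characters into numbers. It assumes the common layout of exactly four lines per record, and because the score encoding varies, the offset is a parameter (33 is an assumed default, matching the Phred+33 convention). The name `parse_fastq` and the example read are made up for illustration.

```python
def parse_fastq(text, offset=33):
    """Yield (name, sequence, scores) from 4-line-per-record FASTQ text.

    offset: ASCII offset of the quality encoding (assumed 33 by default;
            some older files use 64).
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("expected 4 lines per record")
    for i in range(0, len(lines), 4):
        name, seq, plus, qual = lines[i:i + 4]
        if not name.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed record starting at line %d" % (i + 1))
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        # Each quality character encodes one integer score.
        yield name[1:], seq, [ord(ch) - offset for ch in qual]


# Made-up example read, not real sequencer output.
example = "@read1\nACGT\n+\nII5!\n"
reads = list(parse_fastq(example))
```

With the default offset, `I` decodes to 40, `5` to 20 and `!` to 0, so the last base of this made-up read is the least trustworthy one.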

17.3.3. SBML

Systems Biology Markup Language: an XML-based file format for models?!

18. Low entry barrier

DIY, off-topic, "you don't need to be a PhD" type of resources for people like me.

19. Molecular dynamics

21. Techniques

21.1. DNA synthesis



Summary of all enzymatic DNA synthesis companies at SynBioBeta 2019: (archive)

22. Action

22.1. Bowtie2

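git clone
cd bowtie2
git checkout f5d794d7588a5ce4a7e735c42667be5abe0cdaf2
# Assumption: bowtie2 is built in place and BT2_HOME is not already set.
make
export BT2_HOME="$(pwd)"
mkdir tmp
cd tmp
# Build an index from the reference genome.
"${BT2_HOME}/bowtie2-build" "${BT2_HOME}/example/reference/lambda_virus.fa" lambda_virus
# Align the unpaired reads against the index and write the result as SAM.
"${BT2_HOME}/bowtie2" -x lambda_virus -U "${BT2_HOME}/example/reads/reads_1.fq" -S eg1.sam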

What happened:

  • example/reference/lambda_virus.fa is the input FASTA file with the reference genome

  • reads_1.fq is a FASTQ file with the reads and the base call quality.

    The bowtie2 manual says that these were just generated from the reference genome input, and are not real read data.

    This program can also generate such fake data from reference genomes:

  • eg1.sam is the output, which says where each read is most likely to go. It is documented at:
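To actually look at where the reads went, the SAM output can be read as tab-separated text. The sketch below assumes only the 11 mandatory columns of the SAM format and skips header lines, which start with `@`. The names `parse_sam_line` and `read_alignments` and the example lines are made up for illustration; this is not a full SAM parser.

```python
# Names of the 11 mandatory SAM columns, in order.
SAM_FIELDS = ("qname", "flag", "rname", "pos", "mapq", "cigar",
              "rnext", "pnext", "tlen", "seq", "qual")


def parse_sam_line(line):
    """Parse one alignment line into a dict of the mandatory fields."""
    cols = line.rstrip("\n").split("\t")
    rec = dict(zip(SAM_FIELDS, cols[:11]))
    for key in ("flag", "pos", "mapq", "pnext", "tlen"):
        rec[key] = int(rec[key])
    # Two commonly used bits of the FLAG field.
    rec["unmapped"] = bool(rec["flag"] & 0x4)
    rec["reverse"] = bool(rec["flag"] & 0x10)
    return rec


def read_alignments(lines):
    """Yield parsed alignments, skipping '@' header lines and blank lines."""
    for line in lines:
        if line.startswith("@") or not line.strip():
            continue
        yield parse_sam_line(line)


# Made-up example lines, not taken from a real eg1.sam.
example = [
    "@HD\tVN:1.0\n",
    "\t".join(["read1", "16", "ref1", "101", "42", "4M",
               "*", "0", "0", "ACGT", "IIII"]) + "\n",
]
alignments = list(read_alignments(example))
```

For the file produced above, `read_alignments(open("eg1.sam"))` should yield one dict per read, with `rname` and `pos` giving the reference name and the 1-based leftmost position of the alignment.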

TODO: how to:

23. Bibliography

