Scripts of Wippel, Tao, et al., Host adaptation and invasiveness of commensals in the Lotus and Arabidopsis root microbiota.

Correspondence to Ruben Garrido-Oter

These scripts are made available to facilitate the reproducibility of our research. If you reuse all or part of this code, please acknowledge it in your code comments and cite our paper. Raw data and intermediate results necessary to run these scripts can also be downloaded here.

Accession numbers

Raw 16S rRNA amplicon reads were deposited in the European Nucleotide Archive (ENA) under the accession number PRJEB37695. Similarly, sequencing reads and genome assemblies of the Lj-SPHERE core collection were uploaded to the same database with the accession number PRJEB37696.

Scripts used for processing data, creating the figures and performing the statistical analysis reported in the manuscript.

Analysis of culture-independent and IRL 16S rRNA amplicon data

Raw data from the natural community greenhouse experiments with Lotus and Arabidopsis, along with intermediate results (ASV table, metadata, etc.), can be found here.

Bash pipeline to pre-process raw amplicon sequencing data from the greenhouse experiments and obtain an ASV table.

LjAt_NC.R, LjAt_NC_load_data.R, and LjAt_NC_diversity.R: R scripts to process the natural community ASV table and perform downstream analysis of alpha- and beta-diversity as well as rank-abundance plots and enrichment tests (Fig. 1a-b).
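As a rough illustration of what an alpha-diversity calculation involves, the sketch below computes the Shannon index for a single sample from a column of ASV counts. It is not taken from the R scripts listed above: the function name `shannon_index` is hypothetical, and the choice of the Shannon index (rather than whichever indices the scripts actually report) is an assumption made for the example.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository.
# Reads one ASV count per line on stdin (a single sample) and prints
# the Shannon index H = -sum(p_i * ln(p_i)) to four decimal places.
shannon_index() {
    awk '
        # Keep only positive counts; zero counts contribute nothing to H.
        $1 > 0 { n[NR] = $1; total += $1 }
        END {
            if (total == 0) { print "NA"; exit }
            for (i in n) {
                p = n[i] / total
                h -= p * log(p)   # log() is the natural logarithm in awk
            }
            printf "%.4f\n", h
        }
    '
}
```

For example, `cut -f2 counts.tsv | shannon_index` would work on a hypothetical two-column table holding one sample's counts in the second column.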

Raw sequencing data and intermediate results from the At-IRL and Lj-IRL, as well as the corresponding natural community inocula, can be downloaded here.

Bash pipeline to pre-process raw amplicon sequencing data and obtain a combined OTU table from all IRL data.

IRLs.R, IRLs_load_data.R, and IRLs_recovery_rates.R: R scripts used to cross-reference data from shallow sequencing of IRL plates (CFUs) and culture-independent profiling of start inocula, and to estimate species recovery rates (Fig. 1c-f).

plotting_parameters.R and plotting_functions.R: R scripts containing plotting parameters such as colors and ggplot2 themes, as well as auxiliary functions for generating the figures reported in the paper (some details may differ from the published version).

Whole-genome assembly, quality control, and annotation of the Lj-SPHERE core culture collection

Raw sequencing data (FASTQ files) from the core Lj-SPHERE culture collection as well as assemblies (FNA files), nucleotide ORFs (FFN), amino acid ORFs (FAA), GFF files, KEGG annotations (KO), reference 16S rRNA sequences, AMPHORA marker gene alignments, and metadata (used for Fig. S4) can be downloaded in bulk here.

Bash script containing auxiliary functions for whole-genome assembly.

Script used to assemble the genomes using SOAPdenovo and A5. It can be run in parallel using either the custom script below or GNU Parallel.

Custom script to run bash functions in parallel on a multi-core machine.
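One common way to run a bash function over many genomes on a multi-core machine is to export the function and dispatch it with `xargs -P`. The sketch below shows that pattern only; it is not the repository's custom script. `assemble_genome` is a hypothetical stand-in that merely echoes its argument, and `run_parallel` is an illustrative name.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for a per-genome assembly function; the real
# assembly functions in the repository are not reproduced here.
assemble_genome() {
    local id="$1"
    echo "assembled ${id}"
}

# Run a bash function once per line of stdin, with at most n_jobs
# processes running at the same time (default: 4).
run_parallel() {
    local func="$1" n_jobs="${2:-4}"
    # Export the function so the child bash processes can see it.
    export -f "$func"
    # Each input line becomes "$1" of a fresh bash process.
    xargs -P "$n_jobs" -I{} bash -c "$func"' "$1"' _ {}
}
```

A call such as `printf 'g1\ng2\ng3\n' | run_parallel assemble_genome 2` processes three hypothetical genome IDs two at a time. Output order is not guaranteed, because jobs finish independently.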

assembly_stats.R: R script used to generate assembly statistics as well as GC and k-mer spectral projections. The output of this script consists of clean assemblies (contigs shorter than 1,000 bp are removed) and a PDF report, which was used to manually screen for likely contaminated assemblies.
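The length filter described above can be sketched in a few lines of bash and awk. This is an illustration of the idea rather than the code in assembly_stats.R: the function name `filter_contigs` and its interface are assumptions, and only the 1,000 bp default comes from the description.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository.
# Usage: filter_contigs assembly.fna [min_len]
# Prints to stdout only the contigs whose sequence is at least
# min_len bp long (default: 1000). Wrapped sequence lines are joined,
# so each retained contig is written as a header plus one sequence line.
filter_contigs() {
    local fasta="$1" min_len="${2:-1000}"
    awk -v min="$min_len" '
        # Print the record collected so far if it is long enough.
        function emit() {
            if (header != "" && length(seq) >= min) {
                print header
                print seq
            }
        }
        /^>/ { emit(); header = $0; seq = ""; next }
        { seq = seq $0 }
        END { emit() }
    ' "$fasta"
}
```

For instance, `filter_contigs raw_assembly.fna > clean_assembly.fna` (both file names hypothetical) would drop every contig shorter than 1,000 bp.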

functional_MDS.R: R script to perform dimensionality reduction of genome functional profiles on the genomes from the Lj- and At-SPHERE collections.

Reference-based analysis of 16S rRNA amplicon data from SynCom experiments

Raw sequencing data from all SynCom experiments (FASTQ files and mapping files including barcodes, biological and technical metadata, etc.) along with intermediate results (ASV tables, etc.) can be downloaded here.

The experiments reported in the paper correspond to independent Illumina MiSeq sequencing runs as follows:

| Sequencing run | Experiment ID |
| --- | --- |
| AtLj_001 | SynCom experiment C |
| AtLj_002 | SynCom experiment B |
| AtLj_003 | SynCom experiment G |
| AtLj_004 | SynCom experiment L |
| AtLj_005 | SynCom experiment H |
| AtLj_006 | SynCom experiment D |
| AtLj_007 | SynCom experiment F |
| AtLj_008 | SynCom experiment J |
| AtLj_009 | SynCom experiment A |
| AtLj_010 | SynCom experiment M |
| MDA10 | Millifluidics experiment (I) |

LjAt_SC_load_data.R: R script used to load data and metadata from the SynCom competition experiments.

LjAt_SC_diversity.R and cpcoa.func.R: R scripts employed for beta-diversity analysis, dimensionality reduction, and permutation analyses of variance (Figs. 2b-d, 4b-c, S5a-c, S9, S10a, S10c, S11a, S12a).

LjAt_SC_host_preference.R: R script used for plotting host preference boxplots from competition experiments (Figs. 2e-g, S5c-e, S10b, S10d, S11b, S12b).

LjAt_SC_invasion_load_data.R: R script used to load data and metadata from the invasion and persistence experiment.

LjAt_SC_invasion.R: R script used for plotting host preference boxplots from the invasion and persistence experiment (Fig. 4d-e).

hpi.R and hpi_invasion.R: R scripts used to calculate host preference and invasiveness indices (Fig. 5).

LjAt_MDA_diversity.R: R script used for beta-diversity analysis, dimensionality reduction, and permutation analyses of variance of in vitro millifluidics experiments (Fig. S12a).

For any questions regarding these scripts, please contact

Ruben Garrido-Oter

