Accompanying data and source code for the manuscript https://doi.org/10.1101/2022.09.11.507163
The different subdirectories contain:
00-algorithm
: source code for our network generation algorithm, and a raw MySQL dump of the database used to perform the study
01-pathways-per-genus
: source code for the pathway inference algorithm
02-network-generation
: commands and auxiliary data used for network generation and annotation
02-network-generation/01-per-environment-networks.sh
: commands used to generate the individual per-environment networks
02-network-generation/01-output
: output of running the commands in 02-network-generation/01-per-environment-networks.sh. For each environment, this includes a presence-absence matrix of genera in samples, and the resulting networks in the gpickle and xml formats
02-network-generation/02-merge-and-annotate.sh
: commands used to combine the individual environmental networks into a multi-environment network, and perform phylogenetic and functional annotation
02-network-generation/02-output
: output of running the commands in 02-network-generation/02-merge-and-annotate.sh
02-network-generation/02-output/merged6.2.unlooped.avgFunPhyl.pathways.xml
: final combined and annotated network
02-network-generation/02-output/merged6.2.networkTable.csv
: network table containing annotations for each node in the network
02-network-generation/02-output/consensusNet.sif
: consensus network (edge support > 70) in the SIF format
02-network-generation/goodMetaCyc
: accompanying data
02-network-generation/goodMetaCyc/oct2020.combined_noPWY0-1324.tsv
: fraction of genomes from each genus containing any given pathway
02-network-generation/goodMetaCyc/phylodist_clean.table.tsv
: phylogenetic distances between genera
- Other accessory files linking MetaCyc pathway IDs to pathway names and broader functional categories
03-analysis
: R code used to analyze the results, and the resulting figures
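For a quick look at the consensus network without loading the full annotated XML, the SIF file can be parsed with a few lines of plain Python. The sketch below is not part of the repository: the function names read_sif and degree_by_node are hypothetical, and it assumes the file follows the generic SIF layout (source, interaction type, one or more targets per line, tab- or whitespace-delimited). What the interaction column holds in consensusNet.sif is not stated here, so the code carries it through unchanged.

```python
from collections import defaultdict


def read_sif(lines):
    """Parse SIF lines into (source, interaction, target) triples.

    Assumes the generic SIF layout: `source interaction target [target ...]`.
    Lines with fewer than three fields (isolated nodes) yield no edges.
    """
    edges = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        # If a line contains any tab, tabs are the delimiter;
        # otherwise fall back to splitting on whitespace.
        if "\t" in line:
            fields = [f.strip() for f in line.split("\t")]
        else:
            fields = line.split()
        if len(fields) < 3:
            continue
        source, interaction = fields[0], fields[1]
        for target in fields[2:]:
            edges.append((source, interaction, target))
    return edges


def degree_by_node(edges):
    """Count how many edges touch each node (hypothetical helper)."""
    degree = defaultdict(int)
    for source, _, target in edges:
        degree[source] += 1
        degree[target] += 1
    return dict(degree)
```

To use it on the real file, open 02-network-generation/02-output/consensusNet.sif and pass the file handle to read_sif. The gpickle networks are better read with the networkx version pinned in the conda environment, since the gpickle readers were removed in networkx 3.0.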
The following conda environment should provide the libraries required to run the different steps of the analysis:
conda create -c conda-forge -c r -n microbialNetworks networkx==1.11 lxml pandas scipy rpy2 mysqlclient cython r r-ade4 r-ggplot2 r-reshape2 r-purrr r-gplots r-dendextend r-svglite r-stringr
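After creating and activating the environment, a short check like the one below can confirm that the Python-side libraries are importable before running the pipeline. This is only a sketch under stated assumptions: it assumes a Python 3 interpreter, the helper name missing_modules is hypothetical, and the mapping from conda package names to import names (for example mysqlclient importing as MySQLdb) is filled in from general knowledge rather than from the repository. The R packages are not checked.

```python
import importlib.util

# Conda package name -> Python import name (assumed mapping, Python packages only).
REQUIRED = {
    "networkx": "networkx",
    "lxml": "lxml",
    "pandas": "pandas",
    "scipy": "scipy",
    "rpy2": "rpy2",
    "mysqlclient": "MySQLdb",
    "cython": "Cython",
}


def missing_modules(required=REQUIRED):
    """Return the sorted package names whose import name cannot be found."""
    return sorted(
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_modules()
    print("missing packages:", ", ".join(missing) if missing else "none")
```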