Quick Start

Michael Ryan edited this page Jan 17, 2017 · 8 revisions


Running a NAPA analysis takes two steps: (1) network construction and (2) network analysis. The quick start examples below use data files that can be downloaded from the Sample Data page. Note: Before running the examples, follow the instructions on the Install page to install the NAPA package in your local Python environment.

Multiple Sequence Alignment Network Construction

To run the quick start example of multiple sequence alignment network construction, download all of the files listed on the Sample Data page under the "(Multiple sequence) alignment based network reconstruction" header and put them in a 'data' folder inside the working directory you have chosen for NAPA analysis. Network construction from a multiple sequence alignment is performed with the following command. Note: you will likely need to give the full path to the configuration file on your computer rather than the bare file name.

python -m napa.run_napa -r build -c aln.config.yaml

The run_napa module reads all of the NAPA processing parameters from a YAML configuration file (the -c option above). The sample aln.config.yaml includes comments that identify required and optional settings and describe each parameter. For the most part, you can leave these unchanged to run the sample multiple sequence alignment network build. You will, however, need to modify working_dir, data_dir, and results_dir to reflect where you put the sample data files on your computer and where you would like the results written. Note: the working directory may need to be a fully qualified path.
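As an illustration of the directory changes described above, the following sketch rewrites the three settings in a copy of the configuration text. It assumes each setting appears as a top-level "key: value" line, which may not match the layout of the real sample file, and the helper name set_config_dirs is hypothetical rather than part of NAPA. Only working_dir is converted to an absolute path; the other two values are written as given. Any trailing comment on a rewritten line is dropped.

```python
import os
import re


def set_config_dirs(config_text, working_dir, data_dir, results_dir):
    """Return config_text with the three directory settings replaced.

    Assumption: each setting is a top-level "key: value" line. Check the
    sample file before relying on this; it is not a YAML parser.
    """
    new_values = {
        # The working directory may need to be fully qualified.
        "working_dir": os.path.abspath(working_dir),
        "data_dir": data_dir,
        "results_dir": results_dir,
    }
    out_lines = []
    for line in config_text.splitlines():
        match = re.match(r"^(\w+):", line)
        if match and match.group(1) in new_values:
            key = match.group(1)
            line = f"{key}: {new_values[key]}"
        # Comments and all other settings pass through untouched.
        out_lines.append(line)
    return "\n".join(out_lines) + "\n"
```

To apply it, read aln.config.yaml into a string, pass it through the function, and write the result back out. Editing the file by hand works just as well.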


Network Analysis

Network analysis takes the network constructed above and can run various analysis methods to identify mutation combinations associated with the evolution of the adaptation of interest. This analysis is performed by the net_analysis module.

Network analysis reads its parameters from the same YAML configuration file that was used for network construction, so the sample aln.config.yaml also drives analysis of the sample multiple sequence alignment network. To run the analysis:

python -m napa.run_napa -r analysis -c aln.config.yaml
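The build and analysis commands can also be chained from a short Python script. The sketch below uses only the -r and -c options shown on this page and passes the configuration file as an absolute path. The helper names napa_command and run_build_then_analysis are hypothetical, and the script assumes that the interpreter running it is the one in which NAPA is installed.

```python
import os
import subprocess
import sys


def napa_command(mode, config_path):
    """Build the argument list for one run_napa invocation."""
    if mode not in ("build", "analysis"):
        raise ValueError(f"unexpected mode: {mode!r}")
    return [
        sys.executable, "-m", "napa.run_napa",
        "-r", mode,
        # Absolute path, so the command works from any directory.
        "-c", os.path.abspath(config_path),
    ]


def run_build_then_analysis(config_path):
    """Run network construction, then analysis, stopping on the first failure."""
    for mode in ("build", "analysis"):
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(napa_command(mode, config_path), check=True)
```

Calling run_build_then_analysis("aln.config.yaml") runs both steps in order. Because analysis uses the network produced by the build step, the script stops if the build fails.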

You do not need to change any YAML settings to run the analysis on the sample network, but you may want to try different options for calculate_centralities, cent_rank_type, path_len, and cent_type to see the various forms of analysis available.


Phylogenetic Tree Network Construction

The alternative network construction method uses an ensemble of phylogenetic trees to construct a directed network. To run this example, download the files listed on the Sample Data page under both the "Phylogeny based network reconstruction" heading and the "(Multiple sequence) alignment based network reconstruction" heading, and place them in a 'data' folder inside the working directory you have chosen for NAPA analysis. You will also need to unzip the trees.zip file, which creates run1 and run2 directories in the data folder.
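If you prefer to script the unzip step, a minimal sketch using Python's standard zipfile module follows. It extracts trees.zip inside the data folder and then checks that the run1 and run2 directories mentioned above exist. The function name extract_trees is hypothetical and not part of NAPA.

```python
import zipfile
from pathlib import Path


def extract_trees(data_dir, archive_name="trees.zip"):
    """Unzip the tree archive in data_dir and return the run1/run2 paths."""
    data_dir = Path(data_dir)
    with zipfile.ZipFile(data_dir / archive_name) as archive:
        archive.extractall(data_dir)
    # The archive is expected to unpack into run1 and run2.
    run_dirs = [data_dir / "run1", data_dir / "run2"]
    missing = [str(path) for path in run_dirs if not path.is_dir()]
    if missing:
        raise FileNotFoundError(f"not found after unzip: {missing}")
    return run_dirs
```

For example, extract_trees("data") unpacks data/trees.zip in place when run from the working directory.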

Network construction from phylogenetic trees and a multiple sequence alignment is performed with the following command. Note: you will likely need to give the full path to the configuration file on your computer rather than the bare file name.

python -m napa.run_napa -r build -c phylo.dir.config.yaml

The run_napa module reads all of the analysis parameters from a YAML configuration file (the -c option above). The sample phylo.dir.config.yaml includes comments that identify required and optional parameters and describe each one. For the most part, you can leave these unchanged to run the sample phylogenetic tree network build. You will, however, need to modify working_dir, data_dir, and results_dir to reflect where you put the sample data files on your computer and where you would like the results written. Note: the working directory may need to be a fully qualified path.

Additionally, if you would like to try undirected network construction, use the phylo.undir.config.yaml sample configuration file.

Directed Network Analysis


The network files constructed above can then be used to perform analysis using the same .yaml configuration file:

python -m napa.run_napa -r analysis -c phylo.dir.config.yaml

You may want to try different values for the network analysis parameters in the YAML file to explore the other types of analysis that can be performed.