Merge pull request #128 from McTavishLab/documentation-updates
Documentation updates, yay!
LunaSare committed Jul 9, 2020
2 parents 5a3118a + a2790b4 commit 56e09f0
Showing 29 changed files with 63,949 additions and 297 deletions.
Binary file added docs/img/TNRS1.png
Binary file added docs/img/TNRS2.png
Binary file added docs/img/TNRS3.png
1,231 changes: 1,231 additions & 0 deletions docs/img/schematic.svg
61,952 changes: 61,952 additions & 0 deletions docs/img/synthtreeleg.svg
15 changes: 8 additions & 7 deletions docs/mds/DataExploration.md
@@ -1,12 +1,13 @@
### Tree comparison arguments
## Tree comparison with Robinson-Foulds

### Reroot or relabel tree

from physcraper import treetaxon
podarc = treetaxon.generate_TreeTax_from_run('example/docs/pg_55')
podarc.write_labelled(label='^ot:ottTaxonName', norepeats=False, path='test_podarcis/repeats.tre')
*In construction*

## Relabeling the trees

from physcraper import treetaxon
pg55 = treetaxon.generate_TreeTax_from_run('example/docs/pg_55')
pg55.write_labelled(label='^ot:ottTaxonName', norepeats=False, path='test_podarcis/repeats.tre')

## Rerooting the trees

##Example with Data Dryad chiroptera gene trees???
*In construction*
4 changes: 2 additions & 2 deletions docs/mds/FindTrees.md
@@ -1,7 +1,7 @@
# Searching for studies
## Searching for studies

To search for trees on OpenTree with your taxon of interest, you can use
find_trees.py
find_trees.py


usage: find_trees.py [-h] [-t TAXON_NAME] [-ott OTT_ID] [-tb] [-o OUTPUT]
25 changes: 13 additions & 12 deletions docs/mds/INSTALL.md
Original file line number Diff line number Diff line change
@@ -1,34 +1,35 @@
## Downloading `Physcraper`
## Downloading Physcraper

First step is to clone the repo into your computer:
First step is to clone the Physcraper repo to your computer:

```
git clone git@github.com:McTavishLab/physcraper.git
git clone https://github.com/McTavishLab/physcraper.git
```
Then move to the newly created physcraper directory (cd phscraper) and choose a type

or download the repo from https://github.com/McTavishLab/physcraper.git

Then move to the newly created physcraper directory (cd physcraper) and choose a type
of installation, using conda or a python virtual environment.

### Option 1: Install `Physcraper` using conda
### Option 1: Install Physcraper using conda

First, install anaconda
First, install [anaconda](https://www.anaconda.com/products/individual)

Then, create a conda environment

```
conda env create -f cond_env.yml
conda activate physcraper_env
# This next step is temporary until opentree changes are uploaded to pypi
pip install -e git+https://github.com/OpenTreeOfLife/python-opentree@get-tree#egg=opentree
pip install -e .
```

You're done with the installation with conda!

### Option 2: Install `Physcraper` using a python virtual environment
### Option 2: Install Physcraper using a python virtual environment

First, create a python virtual environment

Remeber you need to be in the pyscraper folder, once there do:
Remember that you need to be in the physcraper folder. Once there, run:

```
virtualenv -p python3 venv-physcraper
@@ -44,7 +45,7 @@ source venv-physcraper/bin/activate
You will stay in the virtual environment even if you change directories and `physcraper` should run from anywhere, while the virtual environment is activated.


**Note** that you will have to activate the virtual environment every time you want to run `physcraper` ;)
**Note** that you will have to activate the virtual environment every time you want to run `physcraper`


Finally, install `physcraper` inside the virtual environment:
51 changes: 0 additions & 51 deletions docs/mds/LocalDB-luna.md

This file was deleted.

72 changes: 44 additions & 28 deletions docs/mds/LocalDB.md
Original file line number Diff line number Diff line change
@@ -1,52 +1,68 @@
Blast Utilities
## Local Databases

## Create an NCBI API key

Generating an NCBI API key will speed up downloading full sequences following blast searches.
See (NCBI API keys for details)[https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/]

You can add your api key to your config using

Entrez.api_key = <apikey>

or as a flag in your physcraper_run script --api_key
The BLAST tool can be run using local databases, which can be downloaded and updated from the National Center for Biotechnology Information ([NCBI](https://www.ncbi.nlm.nih.gov/)).

### Installing BLAST command line tools

To blast locally you will need to install blast command line tools first.
Find general instructions at
https://www.ncbi.nlm.nih.gov/books/NBK279671/
https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/

### To Update or download blast DB:
This is not necessary, but will make blast searches faster.

e.g. installing BLAST command line tools on **linux**:

### Install blast command line tools:
Full instructions from ncbi at [manual](https://www.ncbi.nlm.nih.gov/books/NBK279671/) and [installation](https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/)

On Linux :

```
wget https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ncbi-blast-2.10.0+-x64-linux.tar.gz
tar -xzvf ncbi-blast-2.10.0+-x64-linux.tar.gz
```

The executables will be in the `bin` folder of the extracted `ncbi-blast-2.10.0+` directory.

The binaries are in /bin, and you should add them to your path
Installing BLAST command line tools on **macOS** is easy with the installer. Note, however, that the BLAST executables will be installed in `/usr/local/ncbi/blast`, and you will have to add its `bin` folder to your path to run them, by adding `export PATH=$PATH:"/usr/local/ncbi/blast/bin"` to your `.bash_profile`.

If your terminal uses zsh instead of bash, make sure the `.bash_profile` is loaded there too, for example by sourcing it from your `.zshrc`.
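
The PATH step described above can be checked with a short shell sketch. It assumes the macOS install location `/usr/local/ncbi/blast/bin`; `check_tool` is a hypothetical helper written for this illustration, not part of BLAST or Physcraper.

```shell
# Hypothetical helper: report whether a command is reachable via PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Assumed install location; adjust if your BLAST executables live elsewhere.
BLAST_BIN="/usr/local/ncbi/blast/bin"

# Append the directory only if it is not already on PATH.
case ":$PATH:" in
    *":$BLAST_BIN:"*) ;;
    *) PATH="$PATH:$BLAST_BIN"; export PATH ;;
esac

check_tool blastn || echo "blastn is not on PATH yet; check the install location"
```

Run in a terminal, this only changes the current session. The `export` line still has to go into your shell profile to persist.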

### Download

update_blastdb nt
cat *.tar.gz | tar -xvzf - -i
update_blastdb taxdb
gunzip -cd taxdb.tar.gz | (tar xvf - )
### Downloading the NCBI database

If you want to download the NCBI blast database and taxonomy for faster local searches,
note that the download can take several hours, depending on your internet connection.

This is what you should do:

```
mkdir local_blast_db # create the folder to save the database
cd local_blast_db # move to the newly created folder
update_blastdb nt # download the NCBI nucleotide databases
# update_blastdb.pl nt # in MAC
cat *.tar.gz | tar -xvzf - --ignore-zeros # unzip the nucleotide databases
update_blastdb taxdb # download the NCBI taxonomy database
# update_blastdb.pl taxdb # in MAC
gunzip -cd taxdb.tar.gz | (tar xvf - ) # unzip the taxonomy database
```

### Download taxonomy databases from ncbi, place them in the 'physcraper/taxonomy directory'
#### Downloading the nodes and names into the physcraper/taxonomy directory

cd taxonomy
```
cd physcraper/taxonomy
wget 'ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz'
gunzip -f -cd taxdump.tar.gz | (tar xvf - names.dmp nodes.dmp)
```
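
After the extraction step, it can help to confirm that both taxonomy files actually landed where expected. The following is a minimal sketch; `require_files` is a hypothetical helper, and the `physcraper/taxonomy` path is assumed from the `cd` step above.

```shell
# Hypothetical helper: succeed only if every named file exists and is non-empty.
require_files() {
    dir="$1"; shift
    status=0
    for f in "$@"; do
        if [ -s "$dir/$f" ]; then
            echo "ok: $dir/$f"
        else
            echo "missing or empty: $dir/$f"
            status=1
        fi
    done
    return "$status"
}

# Assumed location; run this from the directory that contains physcraper/.
require_files physcraper/taxonomy names.dmp nodes.dmp \
    || echo "re-run the taxdump download and extraction"
```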



## Setting up an AWS blast db
### Setting up an AWS blast db

To run blast searches without NCBI's required time delays, you can set up your own server on AWS (for $).
See instructions at [AWS marketplace NCBI blast](https://aws.amazon.com/marketplace/pp/NCBI-NCBI-BLAST/B00N44P7L6)

### Create an NCBI API key

Generating an NCBI API key will speed up downloading full sequences following blast searches.
See [NCBI API keys](https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/) for details.

You can add your api key to your config using

Entrez.api_key = <apikey>

or as a flag in your physcraper_run script --api_key
