Merge pull request #128 from McTavishLab/documentation-updates

Documentation updates, yay!
McTavishLab · Jul 9, 2020 · 56e09f0
2 parents 5a3118a + a2790b4
commit 56e09f0
Showing 29 changed files with 63,949 additions and 297 deletions.
diff --git a/docs/img/TNRS1.png b/docs/img/TNRS1.png
diff --git a/docs/img/TNRS2.png b/docs/img/TNRS2.png
diff --git a/docs/img/TNRS3.png b/docs/img/TNRS3.png
diff --git a/docs/img/schematic.svg b/docs/img/schematic.svg
diff --git a/docs/img/synthtreeleg.svg b/docs/img/synthtreeleg.svg
diff --git a/docs/mds/DataExploration.md b/docs/mds/DataExploration.md
@@ -1,12 +1,13 @@
-### Tree comparison arguments
+## Tree comparison with Robinson-Foulds
 
-### Reroot or relabel tree
-
-    from physcraper import treetaxon
-    podarc = treetaxon.generate_TreeTax_from_run('example/docs/pg_55')
-    podarc.write_labelled(label='^ot:ottTaxonName', norepeats=False, path='test_podarcis/repeats.tre')
+*Under construction*
 
+## Relabeling the trees
 
+    from physcraper import treetaxon
+    pg55 = treetaxon.generate_TreeTax_from_run('example/docs/pg_55')
+    pg55.write_labelled(label='^ot:ottTaxonName', norepeats=False, path='test_podarcis/repeats.tre')
 
+## Rerooting the trees
 
-##Example with Data Dryad chiroptera gene trees???
+*Under construction*
diff --git a/docs/mds/FindTrees.md b/docs/mds/FindTrees.md
@@ -1,7 +1,7 @@
-# Searching for studies
+## Searching for studies
 
 To search for trees on OpenTree with your taxon of interest, you can use
-find_trees.py  
+find_trees.py
 
 
 usage: find_trees.py [-h] [-t TAXON_NAME] [-ott OTT_ID] [-tb] [-o OUTPUT]

diff --git a/docs/mds/INSTALL.md b/docs/mds/INSTALL.md
@@ -1,34 +1,35 @@
-## Downloading `Physcraper`
+## Downloading Physcraper
 
-First step is to clone the repo into your computer:
+The first step is to clone the Physcraper repo to your computer:
 
 ```
-git clone git@github.com:McTavishLab/physcraper.git
+git clone https://github.com/McTavishLab/physcraper.git
 ```
-Then move to the newly created physcraper directory (cd phscraper) and choose a type
+
+or download the repo from https://github.com/McTavishLab/physcraper.git
+
+Then move to the newly created physcraper directory (cd physcraper) and choose a type
 of installation, using conda or a python virtual environment.
 
-### Option 1: Install `Physcraper` using conda
+### Option 1: Install Physcraper using conda
 
-First, install anaconda
+First, install [anaconda](https://www.anaconda.com/products/individual)
 
 Then, create a conda environment
 
 ```
    conda env create -f cond_env.yml
    conda activate physcraper_env
-   # This next step is temprary until opentree changes are uploaded to pypi
-   pip install -e git+https://github.com/OpenTreeOfLife/python-opentree@get-tree#egg=opentree
-
+   pip install -e .
 ```
 
 You're done with the installation with conda!
 
-### Option 2: Install `Physcraper` using a python virtual environment
+### Option 2: Install Physcraper using a python virtual environment
 
 First, create a python virtual environment
 
-Remeber you need to be in the pyscraper folder, once there do:
+Remember that you need to be in the physcraper folder; once there, run:
 
 ```
 virtualenv -p python3 venv-physcraper
@@ -44,7 +45,7 @@ source venv-physcraper/bin/activate
 You will stay in the virtual environment even if you change directories and `physcraper` should run from anywhere, while the virtual environment is activated.
 
 
-**Note** that you will have to activate the virtual environment every time you want to run `physcraper` ;)
+**Note** that you will have to activate the virtual environment every time you want to run `physcraper`.
 
 
 Finally, install `physcraper` inside the virtual environment:

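
Because the virtual environment has to be activated in every new shell, it can help to confirm from Python that it is actually active before running `physcraper`. The check below is a minimal sketch using only the standard library; the function name `in_virtual_environment` is made up for this example and is not part of Physcraper. It only detects `venv`/`virtualenv` environments, not conda environments.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and recent virtualenv make sys.prefix differ from sys.base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if in_virtual_environment():
    print("Virtual environment active:", sys.prefix)
else:
    print("No virtual environment active; run: source venv-physcraper/bin/activate")
```
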
diff --git a/docs/mds/LocalDB-luna.md b/docs/mds/LocalDB-luna.md
diff --git a/docs/mds/LocalDB.md b/docs/mds/LocalDB.md
@@ -1,52 +1,68 @@
-Blast Utilities
+## Local Databases
 
-## Create an NCBI API key
-
-Generating an NCBI API key will speed up downloading full sequences following blast searches.
-See (NCBI API keys for details)[https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/]
-
-You can add your api key to your config using
-
-    Entrez.api_key = <apikey>
-
-or as a flag in your physcraper_run script --api_key 
+The BLAST tool can be run using local databases, which can be downloaded and updated from the National Center for Biotechnology Information ([NCBI](https://www.ncbi.nlm.nih.gov/)).
 
+### Installing BLAST command line tools
 
+To blast locally you will need to install blast command line tools first.
+Find general instructions at
+https://www.ncbi.nlm.nih.gov/books/NBK279671/
+https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/
 
-### To Update or download blast DB:
-This is not necessary, but will make blast searches faster.
 
+For example, to install the BLAST command line tools on **Linux**:
 
-### Install blast command line tools:
-Full instructions from ncbi at [manual](https://www.ncbi.nlm.nih.gov/books/NBK279671/) and [installation](https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/)
-
-On Linux :
-
+```
     wget https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ncbi-blast-2.10.0+-x64-linux.tar.gz
     tar -xzvf ncbi-blast-2.10.0+-x64-linux.tar.gz
+```
+
+The executables are unpacked into the `bin` folder of the extracted `ncbi-blast-2.10.0+` directory; add that folder to your PATH to run them from anywhere.
 
-The binaries are in /bin, and you should add them to your path
+Installing the BLAST command line tools on **macOS** is easy with the installer. Note, however, that the BLAST executables will be installed in `/usr/local/ncbi/blast`, and that you will have to add this location to your path in order to run them, by adding `export PATH=$PATH:"/usr/local/ncbi/blast/bin"` to your .bash_profile.
 
+If your terminal uses zsh instead of bash, make sure the same `export` line is loaded there too, for example by adding it to your .zshrc or by sourcing your .bash_profile from it.
 
-### Download
 
-    update_blastdb nt
-    cat *.tar.gz | tar -xvzf - -i
-    update_blastdb taxdb
-    gunzip -cd taxdb.tar.gz | (tar xvf - )
+### Downloading the NCBI database
 
+If you want to download the NCBI blast database and taxonomy for faster local searches,
+note that the download can take several hours, depending on your internet connection.
 
+This is what you should do:
 
+```
+    mkdir local_blast_db  # create the folder to save the database
+    cd local_blast_db  # move to the newly created folder
+    update_blastdb nt  # download the NCBI nucleotide databases
+    # update_blastdb.pl nt  # in MAC
+    cat *.tar.gz | tar -xvzf - --ignore-zeros  # unzip the nucleotide databases
+    update_blastdb taxdb  # download the NCBI taxonomy database
+    # update_blastdb.pl taxdb  # in MAC
+    gunzip -cd taxdb.tar.gz | (tar xvf - )  # unzip the taxonomy database
+```
 
-### Download taxonomy databases from ncbi, place them in the 'physcraper/taxonomy directory'
+#### Downloading the nodes and names into the physcraper/taxonomy directory
 
-    cd taxonomy
+```
+    cd physcraper/taxonomy
     wget 'ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz'
     gunzip -f -cd taxdump.tar.gz | (tar xvf - names.dmp nodes.dmp)
+```
 
 
-
-## Setting up an AWS blast db
+### Setting up an AWS blast db
 
 To run blast searches without NCBI's required time delays, you can set up your own server on AWS (for $).
 See instructions at [AWS marketplace NCBI blast](https://aws.amazon.com/marketplace/pp/NCBI-NCBI-BLAST/B00N44P7L6)
+
+### Create an NCBI API key
+
+Generating an NCBI API key will speed up downloading full sequences following blast searches.
+See [NCBI API keys](https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/) for details.
+
+You can add your api key to your config using
+
+    Entrez.api_key = <apikey>
+
+or pass it to your physcraper_run script with the `--api_key` flag.
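
As a rough illustration of what the API key does, and not of how Physcraper itself handles it: the key is sent as the `api_key` query parameter on NCBI E-utilities requests. The sketch below only builds such a URL with the Python standard library and makes no network request. The `NCBI_API_KEY` environment variable, the `build_efetch_url` helper, and the placeholder accession are assumptions made for this example.

```python
import os
from urllib.parse import urlencode

# Public E-utilities efetch endpoint.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, api_key=None):
    """Return an efetch URL for one nucleotide record in FASTA format.

    Hypothetical helper for illustration; not part of Physcraper.
    """
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    }
    if api_key:
        # The key travels as a plain query parameter.
        params["api_key"] = api_key
    return EFETCH_URL + "?" + urlencode(params)


# NCBI_API_KEY is an assumed variable name; export it in your shell first.
key = os.environ.get("NCBI_API_KEY")
print(build_efetch_url("EXAMPLE_ACCESSION", api_key=key))
```

Reading the key from the environment keeps it out of scripts and config files that might be committed to a repository.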