Merge pull request #130 from McTavishLab/pyopensci
Pyopensci
snacktavish committed Jul 15, 2020
2 parents 3fc2144 + 2526d1a commit 017fe19
Showing 164 changed files with 27,716 additions and 2,023 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -27,3 +27,6 @@ physcraper_example_minimal/
scrape_ot_350_compact/
physcraper_example_ot_350/
taxonomy/full_seqs/
*.fas
taxonomy/*.dmp
taxonomy/*.tar*
36 changes: 11 additions & 25 deletions .travis.yml
@@ -8,46 +8,32 @@ before_install:
- sudo apt-get update

install:
### install blast+
# this fails often with connection errors
#- sudo apt-get install ncbi-blast+
- sudo apt-get install muscle


#### install papara
#- wget 'https://sco.h-its.org/exelixis/resource/download/software/papara_nt-2.5-static_x86_64.tar.gz'
#- gunzip -cd papara_nt-2.5-static_x86_64.tar.gz | (tar xvf - )
#- mv papara_static_x86_64 papara
#- export PATH="$PATH:$(pwd)"
#- papara

##### to use RAXML we need conda
- wget http://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh -O miniconda.sh
- chmod +x miniconda.sh
- bash miniconda.sh -b -p $HOME/miniconda
- export PATH="$HOME/miniconda/bin:$PATH"
- conda config --set always_yes yes --set changeps1 no
- conda update -q conda
- conda create -q -n test-environment python=$TRAVIS_PYTHON_VERSION
- source activate test-environment
- conda install -c bioconda raxml
- conda env create -f cond_env.yml
- source activate physcraper_env
- pip install -r requirements.txt
- pip install -e .

- export PYTHONPATH=$PYTHONPATH:$(pwd)
# install requirements for testing
- pip install codecov
- pip install pytest pytest-cov




# command to run tests

- wget 'https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz'
- gunzip -cd taxdump.tar.gz | (tar xvf - names.dmp nodes.dmp)
- mv *.dmp taxonomy/


- export PYTHONPATH=$PYTHONPATH:$(pwd)
# install requirements of physcraper
- pip install -r requirements.txt
- python setup.py install
- pip install codecov
- pip install pytest pytest-cov


script:
#- py.test tests/ --setup-only
#- sh tests/run_tests.sh
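
The `.travis.yml` hunk above downloads `taxdump.tar.gz` and unpacks only `names.dmp` and `nodes.dmp` into `taxonomy/`. As a rough illustration of that selective-extraction step, here is a minimal Python sketch using the standard `tarfile` module. The function name `extract_members` is hypothetical and is not part of Physcraper; the sketch assumes the archive is already on disk.

```python
import tarfile
from pathlib import Path


def extract_members(archive_path, wanted, dest_dir):
    """Copy only the named members of a .tar.gz archive into dest_dir.

    Hypothetical helper for illustration; not part of Physcraper.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    with tarfile.open(archive_path, mode="r:gz") as tar:
        for name in wanted:
            # extractfile raises KeyError if the member is absent and
            # returns None if the member is not a regular file.
            source = tar.extractfile(name)
            if source is None:
                continue
            target = dest / Path(name).name
            target.write_bytes(source.read())
            written.append(target)
    return written
```

Calling `extract_members("taxdump.tar.gz", ["names.dmp", "nodes.dmp"], "taxonomy")` would mirror the `gunzip | tar` and `mv` lines in the CI script, leaving the other archive members untouched.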
46 changes: 46 additions & 0 deletions README.Rmd
@@ -0,0 +1,46 @@
---
output: github_document
---

<!-- README.md is generated from README.Rmd; please edit the .Rmd file -->

```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
results = "asis",
echo = TRUE,
comment = "#>"
)
IS_README <- TRUE
```

<img align="left" width="250" src="https://raw.githubusercontent.com/McTavishLab/physcraper/main/docs/physcraper.svg">

# Physcraper

[![Build Status](https://travis-ci.org/McTavishLab/physcraper.svg?branch=main)](https://travis-ci.org/McTavishLab/physcraper)[![Documentation](https://readthedocs.org/projects/physcraper/badge/?version=latest&style=flat)](https://physcraper.readthedocs.io/en/latest/)[![codecov](https://codecov.io/gh/McTavishLab/physcraper/branch/main/graph/badge.svg)](https://codecov.io/gh/McTavishLab/physcraper)


<p></p>

<p></p>


## Automated gene tree updating!


```{r child="docs/mds/intro.md"}
```

This is the code repository.
For an introduction to the tool, installation and function usage instructions, tutorials and examples, and tools for developers,
please refer to Physcraper's [documentation website](https://physcraper.readthedocs.io/en/latest/) for more details!


:hamster: :palm_tree: :frog: :ear_of_rice: :panda_face: :tulip: :octopus: :blossom: :whale: :mushroom: :ant: :cactus: :fish: :maple_leaf: :water_buffalo: 🦠 :shell: :bug: :octocat:


```{r child="docs/mds/intro-dendropy.md"}
```
40 changes: 27 additions & 13 deletions README.md
@@ -1,27 +1,41 @@

<!-- README.md is generated from README.Rmd; please edit the .Rmd file -->

<img align="left" width="250" src="https://raw.githubusercontent.com/McTavishLab/physcraper/main/docs/physcraper.svg">

# Physcraper

[![Build Status](https://travis-ci.org/McTavishLab/physcraper.svg?branch=main)](https://travis-ci.org/McTavishLab/physcraper)[![Documentation](https://readthedocs.org/projects/physcraper/badge/?version=latest&style=flat)](https://physcraper.readthedocs.io/en/latest/)[![codecov](https://codecov.io/gh/McTavishLab/physcraper/branch/main/graph/badge.svg)](https://codecov.io/gh/McTavishLab/physcraper)

[![Build
Status](https://travis-ci.org/McTavishLab/physcraper.svg?branch=main)](https://travis-ci.org/McTavishLab/physcraper)[![Documentation](https://readthedocs.org/projects/physcraper/badge/?version=latest&style=flat)](https://physcraper.readthedocs.io/en/latest/)[![codecov](https://codecov.io/gh/McTavishLab/physcraper/branch/main/graph/badge.svg)](https://codecov.io/gh/McTavishLab/physcraper)

<p></p>
<p>

<p></p>
</p>

## Automated gene tree updating!
<p>

Use a tree (from the literature, a synthetic tree from Open Tree of Life, or your own tree) and a single locus alignment to find and add homologous sequences to (hopefully) improve and advance phylogenetic inference in a group.
</p>

## Automated gene tree updating\!

The tool is under current development in the McTavish Lab.
Please post an issue at https://github.com/McTavishLab/physcraper/issues or contact ejmctavish@ucmerced.edu if you need any help or have feedback.
Use a tree and a single locus alignment to find and add homologous
sequences to improve and advance phylogenetic inference in a group.

This is the code repository, please refer to Physcraper's [documentation website](https://physcraper.readthedocs.io/en/latest/) for more details on how to install it and run!
The tool is under active development in the [McTavish
Lab](https://mctavishlab.github.io/). Please post a GitHub issue
[here](https://github.com/McTavishLab/physcraper/issues) or contact
<ejmctavish@ucmerced.edu> if you need any help or have feedback.

This is the code repository. For an introduction to the tool,
installation and function usage instructions, tutorials and examples,
and tools for developers, please refer to Physcraper’s [documentation
website](https://physcraper.readthedocs.io/en/latest/) for more
details\!

:hamster: :palm_tree: :frog: :ear_of_rice: :panda_face: :tulip: :octopus: :blossom: :whale: :mushroom: :ant: :cactus: :fish: :maple_leaf: :water_buffalo: 🦠 :shell: :bug: :octocat:
:hamster: :palm\_tree: :frog: :ear\_of\_rice: :panda\_face: :tulip:
:octopus: :blossom: :whale: :mushroom: :ant: :cactus: :fish:
:maple\_leaf: :water\_buffalo: 🦠 :shell: :bug: :octocat:

Physcraper relies on:
Dendropy
Sukumaran, J and MT Holder. 2010. DendroPy: a Python library for phylogenetic computing. Bioinformatics 26: 1569-1571.
Physcraper relies on Dendropy: *Sukumaran, J and MT Holder. 2010.
DendroPy: a Python library for phylogenetic computing. Bioinformatics
26: 1569-1571*.
2 changes: 1 addition & 1 deletion bin/find_trees.py
@@ -19,7 +19,7 @@
sys.exit(1)


assert(args.taxon_name or args.ottid), "A taxon name or an OTT id are required for search."
assert(args.taxon_name or args.ott_id), "A taxon name or an OTT id are required for search."

if args.taxon_name:
try:
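
The one-line fix in `bin/find_trees.py` renames `args.ottid` to `args.ott_id`. With `argparse`, the attribute name on the parsed namespace is derived from the long option string, so a mismatch raises `AttributeError` at run time. The sketch below illustrates this; the option strings `--taxon_name` and `--ott_id` are assumptions inferred from the attribute names in the hunk, since the parser definition is not shown.

```python
import argparse

# Assumed option names; the diff shows only the attribute names.
parser = argparse.ArgumentParser()
parser.add_argument("-t", "--taxon_name")
parser.add_argument("-o", "--ott_id")

# Pass an explicit argument list so the example does not read sys.argv.
args = parser.parse_args(["--ott_id", "1234"])

has_new_name = hasattr(args, "ott_id")  # attribute derived from "--ott_id"
has_old_name = hasattr(args, "ottid")   # no such attribute exists
```

Under these assumptions `args.ott_id` holds the string `"1234"`, while `args.ottid` would fail before the assertion message was ever reached.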
41 changes: 22 additions & 19 deletions bin/physcraper_run.py
@@ -50,7 +50,7 @@

parser.add_argument("-tx","--taxonomy", help="path to taxonomy")

parser.add_argument("-v","--verbose", action="store_true", help="OpenTree study id")
parser.add_argument("-v","--verbose", action="store_true", help="verbose")



@@ -108,28 +108,28 @@
conf.set_local()

if args.eval:
conf.e_value_thresh = args.eval
conf.e_value_thresh = float(args.eval)

if args.hitlist_len:
conf.hitlist_size = args.hitlist_len
conf.hitlist_size = int(args.hitlist_len)

if args.trim_perc:
conf.trim_perc = args.trim_perc
conf.trim_perc = float(args.trim_perc)

if args.relative_length_max:
conf.maxlen = args.relative_length_max
conf.maxlen = float(args.relative_length_max)

if args.relative_length_min:
conf.minlen = args.relative_length_min
conf.minlen = float(args.relative_length_min)

if args.species_number:
conf.spp_threshold = int(args.species_number)

if args.num_threads:
conf.num_threads = args.num_threads
conf.num_threads = int(args.num_threads)

if args.delay:
conf.delay = args.delay
conf.delay = int(args.delay)

if args.email:
conf.email = args.email
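
The hunk above wraps each command-line value in `float(...)` or `int(...)` because `argparse` returns strings unless told otherwise. An alternative, sketched here as an illustration rather than as what the commit does, is to declare the type on the argument itself. The option strings are assumptions that mirror the attribute names `args.eval` and `args.hitlist_len`; `--untyped` is purely hypothetical.

```python
import argparse

parser = argparse.ArgumentParser()
# type= makes argparse convert and validate the value at parse time.
parser.add_argument("--eval", type=float)
parser.add_argument("--hitlist_len", type=int)
# Without type=, the value stays a string.
parser.add_argument("--untyped")

args = parser.parse_args(
    ["--eval", "1e-5", "--hitlist_len", "20", "--untyped", "0.75"]
)
```

One consequence to keep in mind: once values are typed, a check such as `if args.eval:` treats `0.0` as false, so `if args.eval is not None:` is the safer test when zero is a legitimate setting.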
@@ -185,23 +185,26 @@
else:
sys.stdout.write("Using alignment file found at {}.\n".format(alnfile))

search_ott_id = None
if args.search_taxon:
ids = physcraper.IdDicts(conf)
if args.search_taxon.startswith('ott'):
search_ott_id = args.search_taxon.split(':')[1]
elif args.search_taxon.startswith('ncbi'):
ncbi_id = inst(args.search_taxon.split(':')[1])
search_ott_id = ids.ncbi_ott[ncbi_id]
else:
sys.stderr.write("search taxon id must be in format ott:123 or ncbi:123\n")

if study_id:
if args.search_taxon:
ids = physcraper.IdDicts(conf)
if args.search_taxon.startswith('ott'):
ott_id = args.search_taxon.split(':')[1]
elif args.search_taxon.startswith('ncbi'):
ncbi_id = inst(args.search_taxon.split(':')[1])
ott_id = ids.ncbi_ott[ncbi_id]
else:
sys.stderr.write("search taxon id must be in format ott:123 or ncbi:123\n")
if search_ott_id:
data_obj = generate_ATT_from_phylesystem(study_id =study_id,
tree_id = tree_id,
alnfile = alnfile,
aln_schema = aln_schema,
workdir = workdir,
configfile = conf,
search_taxon = ott_id)
search_taxon = search_ott_id)
scraper = physcraper.PhyscraperScrape(data_obj, ids)
else:
scraper = scraper_from_opentree(study_id =study_id,
@@ -234,7 +237,7 @@
treefile = treefile,
otu_json = otu_dict,
tree_schema = args.tree_schema,
search_taxon=search_taxon)
search_taxon=search_ott_id)
ids = physcraper.IdDicts(conf)
scraper = physcraper.PhyscraperScrape(data_obj, ids)
# sys.stdout.write("Read in tree {} taxa in alignment and tree\n".format(len(scraper.data.aln)))
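
The hunks above move the `ott:123` / `ncbi:123` parsing out of the `if study_id:` branch so that `search_ott_id` is defined on every code path. The sketch below restates that parsing as a standalone function. The name `parse_search_taxon` and the plain-dict stand-in for `ids.ncbi_ott` are hypothetical. Note also that the hunk calls `inst(...)`, which is not a Python builtin; the sketch assumes `int` was intended. Unlike the hunk, which writes to stderr and continues, the sketch raises `ValueError` on malformed input.

```python
def parse_search_taxon(value, ncbi_to_ott):
    """Return an OTT id for a string like 'ott:123' or 'ncbi:123'.

    Hypothetical helper; ncbi_to_ott stands in for ids.ncbi_ott.
    """
    prefix, sep, ident = value.partition(":")
    if not sep or not ident.isdigit():
        raise ValueError(
            "search taxon id must be in format ott:123 or ncbi:123"
        )
    if prefix == "ott":
        # The hunk keeps the OTT id as the string taken from the split.
        return ident
    if prefix == "ncbi":
        # Assumes the mapping is keyed by integer NCBI ids.
        return ncbi_to_ott[int(ident)]
    raise ValueError(
        "search taxon id must be in format ott:123 or ncbi:123"
    )
```

Failing fast avoids silently running a search with `search_ott_id` left as `None` after a mistyped prefix, though whether that is preferable depends on how the script is meant to behave.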
8 changes: 5 additions & 3 deletions bin/tree_comparison.py
@@ -187,7 +187,8 @@
for node in conflict_orig:
if conflict_orig[node]['status'] == 'conflicts_with':
witness = conflict_orig[node]['witness_name']
orig_conf_taxa.add(witness)
for tax in witness:
orig_conf_taxa.add(tax)

sys.stdout.write("\nOriginal tree conflicts with {} taxa in the OpenTree taxonomy:\n".format(len(orig_conf_taxa)))
for tax in orig_conf_taxa:
@@ -201,9 +202,10 @@
for node in conflict:
if conflict[node]['status'] == 'conflicts_with':
witness = conflict[node]['witness_name']
updated_conf_taxa.add(witness)
for tax in witness:
updated_conf_taxa.add(tax)

sys.stdout.write("Updated tree conflicts with {} taxa in the OpenTree taxonomy:\n".format(len(updated_conf_taxa)))
sys.stdout.write("\nUpdated tree conflicts with {} taxa in the OpenTree taxonomy:\n".format(len(updated_conf_taxa)))
for tax in updated_conf_taxa:
sys.stdout.write("{}\n".format(tax))
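
The change in `bin/tree_comparison.py` replaces `set.add(witness)` with a loop over `witness`. That is consistent with `witness_name` being a list of taxon names, since a list is unhashable and cannot be added to a set directly; this is an inference from the diff, not something the source states. A minimal sketch of the same collection step, with a hypothetical function name and made-up input:

```python
def collect_witness_names(conflict):
    """Gather taxon names from every node flagged 'conflicts_with'.

    Assumes each entry's 'witness_name' is a list of strings; a bare
    string is wrapped so it is not split into single characters.
    """
    taxa = set()
    for info in conflict.values():
        if info.get("status") == "conflicts_with":
            witness = info["witness_name"]
            if isinstance(witness, str):
                witness = [witness]
            taxa.update(witness)  # same effect as the hunk's for/add loop
    return taxa
```

`set.update` adds each element of an iterable, so it is equivalent to the explicit `for tax in witness: ... add(tax)` loop in the commit.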

1 change: 1 addition & 0 deletions cond_env.yml
@@ -12,6 +12,7 @@ dependencies:
- coverage
- dendropy
- numpy
- wget

- pandas
- requests
@@ -0,0 +1 @@
(((otu376428,otu376440,(otu376436,otu376444)node999082,otu376450)node999079,((((otu376432,otu376449)node999102,(otu376425,otu376448)node999105)node999101,((otu376433,otu376451)node999109,otu376434,otu376426)node999108)node999100,((otu376437,otu376438)node999096,otu376447)node999095)node999094,((otu376442,otu376429)node999118,(otu376443,otu376431)node999121,(otu376453,otu376441)node999115)node999114,((otu376445,otu376454)node999091,(otu376427,otu376435,otu376446)node999087)node999086)node999078,otu376430,(otu376420,otu376439,otu376452)node999074)node999072;
