Documentation #34

Merged
merged 58 commits into from
Apr 3, 2020
Changes from 50 commits
97d3d0f
initial commit of documentation for readthedocs format
evanroyrees Mar 13, 2020
c936532
first commit
Sidduppal Mar 19, 2020
02c8fde
Commiting all files in Scripts folder
Sidduppal Mar 19, 2020
b55d480
Modification to COPYRIGHT and config.py
Sidduppal Mar 19, 2020
2db1629
stable build
Sidduppal Mar 20, 2020
e7f71f8
Removed scripts_2, can now be fetched and run by any user
Sidduppal Mar 23, 2020
d4c2cf4
Create README.md
Sidduppal Mar 25, 2020
025dec5
Update README.md
Sidduppal Mar 25, 2020
ad624d2
Update README.md
Sidduppal Mar 25, 2020
241dd7a
Update README.md
Sidduppal Mar 25, 2020
a876c33
Update README.md
Sidduppal Mar 25, 2020
8e18f9c
Update README.md
Sidduppal Mar 26, 2020
7b71bbe
automatic argparse, autosummary, automatic copyright update, last upd…
Sidduppal Mar 26, 2020
903e04c
Merge branch 'documentation' of https://github.com/Sidduppal/Autometa…
Sidduppal Mar 26, 2020
60cf464
Update README.md
Sidduppal Mar 26, 2020
f603dd9
Update README.md
Sidduppal Mar 26, 2020
9946922
modified all copyrights added usage and autodoc for all scripts
Sidduppal Mar 31, 2020
25422b3
Merge branch 'documentation' of https://github.com/Sidduppal/Autometa…
Sidduppal Mar 31, 2020
7f8cd8b
changed autometa to run_autometa
Sidduppal Mar 31, 2020
6f639f0
initial commit of documentation for readthedocs format
evanroyrees Mar 13, 2020
c5bc237
first commit
Sidduppal Mar 19, 2020
bdab032
Commiting all files in Scripts folder
Sidduppal Mar 19, 2020
068ec49
Modification to COPYRIGHT and config.py
Sidduppal Mar 19, 2020
2c155c5
stable build
Sidduppal Mar 20, 2020
9d0a8fc
Removed scripts_2, can now be fetched and run by any user
Sidduppal Mar 23, 2020
edc58ca
Create README.md
Sidduppal Mar 25, 2020
0e1757d
Update README.md
Sidduppal Mar 25, 2020
8b3b283
Update README.md
Sidduppal Mar 25, 2020
12c3d01
Update README.md
Sidduppal Mar 25, 2020
8fc2c69
Update README.md
Sidduppal Mar 25, 2020
976420a
Update README.md
Sidduppal Mar 26, 2020
c189649
automatic argparse, autosummary, automatic copyright update, last upd…
Sidduppal Mar 26, 2020
4859440
Update README.md
Sidduppal Mar 26, 2020
1b9ab8d
Update README.md
Sidduppal Mar 26, 2020
9cf7bc4
modified all copyrights added usage and autodoc for all scripts
Sidduppal Mar 31, 2020
a20ecef
changed autometa to run_autometa
Sidduppal Mar 31, 2020
6941c85
Merge branch 'documentation' of https://github.com/Sidduppal/Autometa…
Sidduppal Mar 31, 2020
0fb0178
Applied changes to scripts and docs source files to remove warnings e…
evanroyrees Mar 31, 2020
46923be
Merge branch 'dev' of https://github.com/WiscEvan/Autometa into docum…
evanroyrees Mar 31, 2020
aaa83da
Merge branch 'Sidduppal-documentation' into documentation
evanroyrees Mar 31, 2020
e52c707
fixes PR Review comments Sidduppal/Autometa#1
evanroyrees Apr 1, 2020
68bec2b
environment.yaml and .readthedocs.yaml files for readthedocs integration
evanroyrees Apr 1, 2020
22eb064
attempt to reduce memory consumption in readthedocs.org. Removed pack…
evanroyrees Apr 1, 2020
db0d3c4
removed most dependencies from conda env and have moved to docs/requi…
evanroyrees Apr 1, 2020
1faf667
changed conda file to conda environment.
evanroyrees Apr 1, 2020
df572d8
added pip in environment.yaml dependencies.
evanroyrees Apr 1, 2020
794b469
removed numba from docs/requirements.txt
evanroyrees Apr 1, 2020
081d25e
fixes Sidduppal/documentation#1. minor changes in template.py to refl…
evanroyrees Apr 2, 2020
77150a6
Merge pull request #1 from WiscEvan/documentation
Sidduppal Apr 2, 2020
255d882
addressed Jason's comments for merge to dev
Sidduppal Apr 2, 2020
de08e73
todo box added, function to automatically input modules, sidebar, and…
Sidduppal Apr 3, 2020
51964c7
Updated markers docstring (fixed incorrect f-string) to allow paramet…
evanroyrees Apr 3, 2020
27797f1
Merge pull request #2 from WiscEvan/markers-docstring
Sidduppal Apr 3, 2020
b899010
addressed Jason's comments for merge to dev
Sidduppal Apr 2, 2020
4ef34db
todo box added, function to automatically input modules, sidebar, and…
Sidduppal Apr 3, 2020
f0ccdff
Merge branch 'documentation' of https://github.com/Sidduppal/Autometa…
Sidduppal Apr 3, 2020
01b10e8
final changes, added Ian to copyright, removed hardcoded copyright
Sidduppal Apr 3, 2020
a9fca2c
added reference for todo.py
Sidduppal Apr 3, 2020
20 changes: 20 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,20 @@
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Build documentation in the docs/ directory with Sphinx
sphinx:
  configuration: docs/conf.py

# Optionally build your docs in additional formats such as PDF and ePub
formats: all

conda:
  environment: docs/environment.yaml

python:
  version: 3.7
  install:
    - requirements: docs/requirements.txt
2 changes: 1 addition & 1 deletion Dockerfile
@@ -25,13 +25,13 @@ conda install -c bioconda -c conda-forge --yes \
tqdm \
numpy \
scikit-learn \
scipy \
samtools \
bedtools \
bowtie2 \
hmmer \
prodigal \
diamond \
ipython \
ndcctools \
parallel \
requests \
1 change: 1 addition & 0 deletions README.md
@@ -34,6 +34,7 @@ conda install -n autometa -c bioconda -c conda-forge --yes \
tqdm \
numpy \
scikit-learn \
scipy \
samtools \
bedtools \
bowtie2 \
5 changes: 4 additions & 1 deletion autometa.py
@@ -1,6 +1,7 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
COPYRIGHT
Copyright 2020 Ian J. Miller, Evan R. Rees, Kyle Wolf, Siddharth Uppal,
Shaurya Chanana, Izaak Miller, Jason C. Kwan

@@ -18,6 +19,7 @@

You should have received a copy of the GNU Affero General Public License
along with Autometa. If not, see <http://www.gnu.org/licenses/>.
COPYRIGHT

Main script to run Autometa
"""
@@ -34,6 +36,8 @@
logger = logging.getLogger('autometa')


__version__ = "2.0.0"

def init_logger(fpath=None, level=logging.INFO):
"""Initialize logger.

@@ -137,7 +141,6 @@ def main(args):
main(args)
except KeyboardInterrupt:
logger.info('User cancelled run. Exiting...')
sys.exit(1)
except Exception as err:
issue_request = '''

12 changes: 6 additions & 6 deletions autometa/binning/recursive_dbscan.py
@@ -1,6 +1,7 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
COPYRIGHT
Copyright 2020 Ian J. Miller, Evan R. Rees, Kyle Wolf, Siddharth Uppal,
Shaurya Chanana, Izaak Miller, Jason C. Kwan

@@ -18,6 +19,7 @@

You should have received a copy of the GNU Affero General Public License
along with Autometa. If not, see <http://www.gnu.org/licenses/>.
COPYRIGHT

Cluster contigs recursively searching for bins with highest completeness and purity.
"""
@@ -238,8 +240,7 @@ def get_clusters(master_df, markers_df, domain='bacteria', completeness=20., pur
----------
master_df : pd.DataFrame
index=contig,
cols=
embedded-kmers are cols 'x','y' and 'z'
cols=['x','y','coverage']
markers_df : pd.DataFrame
wide format, i.e. index=contig cols=[marker,marker,...]
domain : str
@@ -304,10 +305,9 @@ def binning(master, markers, domain='bacteria', completeness=20., purity=90.,
----------
master : pd.DataFrame
index=contig,
cols=
embedded-kmers are cols 'x','y' and 'z'
taxa cols should be present if `taxonomy` is True.
i.e. [taxid,superkingdom,phylum,class,order,family,genus,species]
cols=['x','y']
taxa cols should be present if `taxonomy` is True.
i.e. [taxid,superkingdom,phylum,class,order,family,genus,species]
markers : pd.DataFrame
wide format, i.e. index=contig cols=[marker,marker,...]
domain : str
46 changes: 22 additions & 24 deletions autometa/common/coverage.py
@@ -1,6 +1,7 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
COPYRIGHT
Copyright 2020 Ian J. Miller, Evan R. Rees, Kyle Wolf, Siddharth Uppal,
Shaurya Chanana, Izaak Miller, Jason C. Kwan

@@ -18,8 +19,9 @@

You should have received a copy of the GNU Affero General Public License
along with Autometa. If not, see <http://www.gnu.org/licenses/>.
COPYRIGHT

Construct contig coverage table given an input assembly and reads or alignments.
Calculates coverage of contigs
"""


@@ -86,29 +88,25 @@ def make_length_table(fasta, out):

def get(fasta, out, fwd_reads=None, rev_reads=None, se_reads=None, sam=None,
bam=None, lengths=None, bed=None, nproc=1):
"""Get coverages for assembly `fasta` file using provided files:

Either:
`fwd_reads` and `rev_reads` and/or `se_reads`
or:
`sam`
or:
`bam`
or:
`bed`

Will begin coverage calculation based on files provided checking in the
following order:
1. `bed`
2. `bam`
3. `sam`
4. `fwd_reads` and `rev_reads` and `se_reads`

Event sequence to calculate contig coverages:
1. align paired-end reads to generate alignment.sam
2. sort samfile to generate alignment.bam
3. calculate assembly coverages to generate alignment.bed
4. calculate contig coverages to generate coverage.tsv
"""Get coverages for assembly `fasta` file using provided files.
Either: `fwd_reads` and `rev_reads` and/or `se_reads`, or `sam`, or `bam`, or `bed`.

Notes
-----
Will begin coverage calculation based on files provided checking in the
following order:

#. `bed`
#. `bam`
#. `sam`
#. `fwd_reads` and `rev_reads` and `se_reads`

Event sequence to calculate contig coverages:

#. align reads to generate alignment.sam
#. sort samfile to generate alignment.bam
#. calculate assembly coverages to generate alignment.bed
#. calculate contig coverages to generate coverage.tsv


Parameters
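The input priority and event sequence described in the `get` docstring above can be sketched as follows. This is a minimal illustration, not Autometa's implementation: the names `pick_starting_input`, `remaining_steps`, `STAGES`, and `STEP_FOR_STAGE` are hypothetical, and the sketch assumes that paired reads need both `fwd_reads` and `rev_reads` while `se_reads` alone is also sufficient.

```python
# Hypothetical helpers illustrating the docstring; not part of Autometa's API.

# Stages in the order of the docstring's event sequence.
STAGES = ("reads", "sam", "bam", "bed")

# The step that consumes each stage's input, worded as in the docstring.
STEP_FOR_STAGE = {
    "reads": "align reads to generate alignment.sam",
    "sam": "sort samfile to generate alignment.bam",
    "bam": "calculate assembly coverages to generate alignment.bed",
    "bed": "calculate contig coverages to generate coverage.tsv",
}


def pick_starting_input(bed=None, bam=None, sam=None,
                        fwd_reads=None, rev_reads=None, se_reads=None):
    """Return the stage to start from, checking bed, bam, sam, then reads."""
    for stage, value in (("bed", bed), ("bam", bam), ("sam", sam)):
        if value:
            return stage
    # Assumption: paired reads require both mates; single-end reads may stand alone.
    if (fwd_reads and rev_reads) or se_reads:
        return "reads"
    raise ValueError("Provide one of: bed, bam, sam, or reads")


def remaining_steps(stage):
    """List the steps still to run when starting from `stage`."""
    return [STEP_FOR_STAGE[s] for s in STAGES[STAGES.index(stage):]]
```

For example, when both a BAM and a SAM file are supplied, the sketch starts from the BAM file and skips alignment and sorting, so only the last two steps remain.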
2 changes: 2 additions & 0 deletions autometa/common/exceptions.py
@@ -1,6 +1,7 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
COPYRIGHT
Copyright 2020 Ian J. Miller, Evan R. Rees, Kyle Wolf, Siddharth Uppal,
Shaurya Chanana, Izaak Miller, Jason C. Kwan

@@ -18,6 +19,7 @@

You should have received a copy of the GNU Affero General Public License
along with Autometa. If not, see <http://www.gnu.org/licenses/>.
COPYRIGHT

File containing customized AutometaExceptions for more specific exception handling
"""
3 changes: 2 additions & 1 deletion autometa/common/external/bedtools.py
@@ -1,6 +1,7 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
COPYRIGHT
Copyright 2020 Ian J. Miller, Evan R. Rees, Kyle Wolf, Siddharth Uppal,
Shaurya Chanana, Izaak Miller, Jason C. Kwan

@@ -18,7 +19,7 @@

You should have received a copy of the GNU Affero General Public License
along with Autometa. If not, see <http://www.gnu.org/licenses/>.

COPYRIGHT
Script containing wrapper functions for bedtools.
"""

3 changes: 2 additions & 1 deletion autometa/common/external/bowtie.py
@@ -1,6 +1,7 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
COPYRIGHT
Copyright 2020 Ian J. Miller, Evan R. Rees, Kyle Wolf, Siddharth Uppal,
Shaurya Chanana, Izaak Miller, Jason C. Kwan

@@ -18,7 +19,7 @@

You should have received a copy of the GNU Affero General Public License
along with Autometa. If not, see <http://www.gnu.org/licenses/>.

COPYRIGHT
Script containing wrapper functions for bowtie2.
"""

46 changes: 40 additions & 6 deletions autometa/common/external/diamond.py
@@ -1,6 +1,7 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
COPYRIGHT
Copyright 2020 Ian J. Miller, Evan R. Rees, Kyle Wolf, Siddharth Uppal,
Shaurya Chanana, Izaak Miller, Jason C. Kwan

@@ -18,7 +19,7 @@

You should have received a copy of the GNU Affero General Public License
along with Autometa. If not, see <http://www.gnu.org/licenses/>.

COPYRIGHT
Class and functions related to running diamond on metagenome sequences
"""

@@ -38,13 +39,44 @@


class DiamondResult:
"""docstring for DiamondResult.
"""DiamondResult class

Some operator overloading here...
For other examples of this see:
https://www.geeksforgeeks.org/operator-overloading-in-python/
Parameters
----------
qseqid : str
query sequence ID
sseqid : str
subject sequence ID
pident : float
Percentage of identical matches.
length : int
Alignment length.
mismatch : int
Number of mismatches.
gapopen : int
Number of gap openings.
qstart : int
Start of alignment in query.
qend : int
End of alignment in query.
sstart : int
Start of alignment in subject.
send : int
End of alignment in subject sequence.
evalue : float
Expect value.
bitscore : float
Bitscore.

Attributes
----------
sseqids : dict
{sseqid:parameters, sseqid:parameters, ...}
qseqid : str
result query sequence ID

"""

def __init__(self, qseqid, sseqid, pident, length, mismatch, gapopen,
qstart, qend, sstart, send, evalue, bitscore):
self.qseqid = qseqid
@@ -365,7 +397,9 @@ def main(args):

if __name__ == '__main__':
import argparse
parser = argparse.ArgumentParser('Retrieves blastp hits with provided input assembly')
parser = argparse.ArgumentParser(description="""
Retrieves blastp hits with provided input assembly
""")
parser.add_argument('fasta', help='</path/to/faa/file>')
parser.add_argument('database', help='</path/to/diamond/formatted/database>')
parser.add_argument('acc2taxids', help='</path/to/ncbi/prot.accession2taxid.gz>')
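The twelve parameters documented for `DiamondResult` above match the column order of the default BLAST tabular output (qseqid through bitscore). As a rough illustration of how one tab-separated hit line maps onto those typed fields, here is a standalone sketch. It does not use or reproduce the `DiamondResult` class; `COLUMNS` and `parse_hit` are hypothetical names, and the sample line is made up.

```python
# Hypothetical sketch: column names and types taken from the DiamondResult docstring.
COLUMNS = (
    ("qseqid", str),
    ("sseqid", str),
    ("pident", float),
    ("length", int),
    ("mismatch", int),
    ("gapopen", int),
    ("qstart", int),
    ("qend", int),
    ("sstart", int),
    ("send", int),
    ("evalue", float),
    ("bitscore", float),
)


def parse_hit(line):
    """Convert one tab-separated hit line into a dict of typed fields."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {len(values)}")
    return {name: cast(value) for (name, cast), value in zip(COLUMNS, values)}


# Made-up example line, for illustration only.
example = "orf_1\tsubject_1\t98.5\t200\t3\t0\t1\t200\t5\t204\t1e-50\t380.2"
hit = parse_hit(example)
```

Keeping the column names and converters in one table means a malformed line fails early with a column-count error instead of producing silently misaligned fields.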
3 changes: 2 additions & 1 deletion autometa/common/external/hmmer.py
@@ -1,6 +1,7 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
COPYRIGHT
Copyright 2020 Ian J. Miller, Evan R. Rees, Kyle Wolf, Siddharth Uppal,
Shaurya Chanana, Izaak Miller, Jason C. Kwan

@@ -18,7 +19,7 @@

You should have received a copy of the GNU Affero General Public License
along with Autometa. If not, see <http://www.gnu.org/licenses/>.

COPYRIGHT
Functions related to running hmmer on metagenome sequences
"""

34 changes: 27 additions & 7 deletions autometa/common/external/prodigal.py
@@ -1,6 +1,7 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
COPYRIGHT
Copyright 2020 Ian J. Miller, Evan R. Rees, Kyle Wolf, Siddharth Uppal,
Shaurya Chanana, Izaak Miller, Jason C. Kwan

@@ -18,6 +19,7 @@

You should have received a copy of the GNU Affero General Public License
along with Autometa. If not, see <http://www.gnu.org/licenses/>.
COPYRIGHT

Functions to retrieve orfs from provided assembly using prodigal
"""
@@ -176,9 +178,32 @@ def contigs_from_headers(fpath):
First determines if all of ID=3495691_2 from description is in header.
"3495691_2" represents the 3,495,691st gene in the 2nd sequence.

i.e. : record.description
Example
-------
.. code-block:: python

#: prodigal versions < 2.6 record
>>> record.id
'k119_1383959_3495691_2'

>>> record.description
'k119_1383959_3495691_2 # 688 # 1446 # 1 # ID=3495691_2;partial=01;start_type=ATG;rbs_motif=None;rbs_spacer=None'

>>> record.description.split('#')[-1].split(';')[0].strip()
'ID=3495691_2'

>>> orf_id = '3495691_2'

>>> record.id.replace(f'_{orf_id}', '')
'k119_1383959'

#: prodigal versions >= 2.6 record
>>> record.id
'k119_1383959_2'
>>> record.id.rsplit('_',1)[0]
'k119_1383959'

Parameters
----------
fpath : str
@@ -189,11 +214,6 @@ def contigs_from_headers(fpath):
dict
contigs translated from prodigal ORF description. {orf_id:contig_id, ...}

Raises
-------
ExceptionName
Why the exception is raised.

"""
version = get_versions('prodigal')
if version.count('.') >= 2:
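The two header styles shown in the `contigs_from_headers` docstring example can be handled by a single helper, sketched below. This is an illustration only and not the function's actual body, which is truncated in this diff: `contig_from_orf` is a hypothetical name, and instead of branching on the prodigal version it inspects the record itself, which is a different technique from the version check above.

```python
def contig_from_orf(record_id, description):
    """Recover the contig ID from a prodigal ORF record (hypothetical helper).

    Older-style records embed the full 'ID=' value in the record ID, so that
    suffix is stripped. Otherwise only the trailing ORF number is removed.
    """
    # e.g. '... # 1 # ID=3495691_2;partial=01;...' yields '3495691_2'
    id_field = description.split("#")[-1].split(";")[0].strip()
    if id_field.startswith("ID="):
        suffix = "_" + id_field[len("ID="):]
        if record_id.endswith(suffix):
            return record_id[: -len(suffix)]
    # Newer-style record ID: '<contig>_<orf number>'
    return record_id.rsplit("_", 1)[0]


# Values taken from the docstring example above.
old_description = (
    "k119_1383959_3495691_2 # 688 # 1446 # 1 # "
    "ID=3495691_2;partial=01;start_type=ATG;rbs_motif=None;rbs_spacer=None"
)
old_contig = contig_from_orf("k119_1383959_3495691_2", old_description)
new_contig = contig_from_orf("k119_1383959_2", "k119_1383959_2 # 688 # 1446 # 1 # ID=3495691_2")
```

Both calls resolve to the same contig ID, `k119_1383959`, which is the mapping target described in the docstring's return value `{orf_id:contig_id, ...}`.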