Merge pull request #81 from biocore/test
update README and add conda build recipe
ekopylova committed Feb 18, 2018
2 parents 64031ef + 7a5c538 commit e197852
Showing 9 changed files with 196 additions and 213 deletions.
2 changes: 1 addition & 1 deletion .travis.yml
@@ -10,7 +10,7 @@ before_install:
   - echo "$TRAVIS_OS_NAME"
   - if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh; fi
   - if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then wget https://repo.continuum.io/miniconda/Miniconda3-latest-MacOSX-x86_64.sh -O miniconda.sh; fi
-  - if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then brew install homebrew/science/rnammer; fi
+  - if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then brew update && brew install homebrew/science/rnammer; fi
   - bash miniconda.sh -b -p $HOME/miniconda
   - export PATH="$HOME/miniconda/bin:$PATH"
   - hash -r
185 changes: 55 additions & 130 deletions README.org
@@ -13,150 +13,75 @@ As Python 3 matures and majority python packages support Python 3, the scientifi
micronota can annotate multiple features including coding genes, prophage, CRISPR, tRNA, rRNA and other ncRNAs. It has a customizable framework to integrate additional tools and databases. Generally, the annotation can be classified into 2 categories: structural annotation and functional annotation. Structural annotation is the identification of the genetic elements on the sequence and functional annotation is to assign functions to those elements.

* Install

To install the latest release of micronota:
** install via conda
To install the latest release of micronota:
#+BEGIN_SRC sh
conda install micronota
#+END_SRC


Or you can install through ~pip~:
** install via pip
You can install through ~pip~:
#+BEGIN_SRC sh
pip install micronota
#+END_SRC

To install the latest developping version:
To install or update to the latest developping version:
#+BEGIN_SRC
pip install git+git://github.com/biocore/micronota.git
#+END_SRC

* Prepare Databases

To prepare (download and format) the files of TIGRFAM to the right form read by micronota:
#+BEGIN_SRC
micronota database prepare tigrfam --cache_dir ~/database
#+END_SRC
* Print Configure Info

To check the micronota setup, you can run:
micronota relies on external packages like prodigal, aragorn, etc. for annotation. You need to install those external packages (this is implicitly installed when micronota is installed via conda). To simplify the process, you can install with conda:
#+BEGIN_SRC
micronota info
#+END_SRC

It will print out the system info, databases available, external tools, and other configuration info.

* Config Files

There are 3 types of config files user can set for workflow, logging, and parameters, respectively. All of the 3 config files can be specified on the command line (=--cfg=, =--log=, and =--param=) to override the default settings. Besides, you can also put the frequently used settings into global config files. The global config files should be put in the config directory of micronota. On linux it is usually =~/.config/micronota=; on Mac, it is usually =~/Library/Application Support/micronota=. You can also find the directory by running the following command
#+BEGIN_SRC python
python -c "import click; print(click.get_app_dir('micronota'))"
conda install --file https://raw.githubusercontent.com/biocore/micronota/master/ci/conda_requirements.txt
#+END_SRC

If the directory does not exist, just simply create it with =mkdir= command.

It is always good to confirm your settings by printing out the setup:
#+BEGIN_EXAMPLE
micronota info
# OR if you provide it on command line:
micronota --cfg misc.cfg --param param.cfg info
#+END_EXAMPLE

** workflow config
This is how [[https://github.com/biocore/micronota/blob/master/micronota/support_files/misc.cfg][default workflow config]] looks like. You can copy and modify that to create your own config file either put in the config dir (with file name of =misc.cfg=) or provided in command with =--cfg=. micronota will only read one workflow config file, which is the first one it finds in the order of command line > global > default.

Here is an example modified that you can use on command line or global config dir:
#+BEGIN_EXAMPLE
[general]
# use another dir as database dir
db_dir = better/db

[feature]
# run prodigal first
prodigal = 1
# don't run infernal
infernal = 0

# next to annotate CDS
[cds]
# run diamond tegother with uniref50
diamond = uniref50
#+END_EXAMPLE

The format of the config file is widely used in different OS platforms and described [[https://docs.python.org/3/library/configparser.html#supported-ini-file-structure][here]]. =0= / =1=, =no= / =yes= , =false= / =true=, =on= / =off= can all be used to turn off or on each tool. If the tool need a database file to run with, specify the database instead of the indicator.

** Logging config
This is how [[https://github.com/biocore/micronota/blob/master/micronota/support_files/log.cfg][default logging config]] looks like. It is used to config logging utilitiy to print out useful info. You can change logging config similarly as you do for workflow config. The global file should be named as =log.cfg= in the config dir if you plan to define global logging config.

For example, if you want to reduce the verbosity of logging, you can change level to =ERROR= in your global logging config file:
#+BEGIN_EXAMPLE
[loggers]
keys=root

[handlers]
keys=consoleHandler

[formatters]
keys=simpleFormatter

[logger_root]
level=ERROR
handlers=consoleHandler

[handler_consoleHandler]
class=StreamHandler
formatter=simpleFormatter
args=(sys.stdout,)

[formatter_simpleFormatter]
format=%(asctime)s %(name)s %(levelname)s %(message)s
datefmt=%Y-%m-%d %H:%M:%S
#+END_EXAMPLE
** Parameter config
The parameter config is used to tune the parameters of each external tools. This is how the [[https://github.com/biocore/micronota/blob/master/micronota/support_files/param.cfg][default parameter config]] looks like. You can specify the parameter for each individual tools. For example, if you want to run Prodigal with genetic translation table 1, instead of the default translation table, you can create a file param.cfg:
#+BEGIN_EXAMPLE
[prodigal]
# set translation table to 1
-t = 1
#+END_EXAMPLE

Different from the other 2 config files, all the param config files will be read by micronota in the order of default, global and command line param config, with the following one overriding its previous.


* Sequence Features to Identify

| Features | Supported | Tools |
|-------------------------+-----------+--------------------------------------------------|
| coding gene | yes | Prodigal |
| tRNA | ongoing | Aragorn |
| ncRNA | yes | Infernal |
| CRISPR | ongoing | MinCED |
| ribosomal binding sites | ongoing | RBSFinder |
| prophage | ongoing | PHAST |
| replication origin | todo | Ori-Finder 1 (bacteria) & Ori-Finder 2 (archaea) |
| microsatellites | todo | nhmmer? |
| signal peptide | ongoing | SignalP |
| transmembrane proteins | ongoing | TMHMM |

* Databases Supported

| Databases | Supported |
|-----------+-----------|
| TIGRFAM | yes |
| UniRef | yes |
| Rfam | ongoing |

* Getting help

To get help with micronota, you should use the [[https://biostars.org/t/micronota][micronota tag]] on Biostars. The developers regularly monitor the =micronota= tag on Biostars.


* Developing

If you're interested in getting involved in micronota development, see [[https://github.com/biocore/micronota/blob/master/CONTRIBUTING.md][CONTRIBUTING.md]].

See the list of [[https://github.com/biocore/micronota/graphs/contributors][micronota's contributors]].

* Prepare Databases
You need to download database files:
| Databases | Supported | Download URL |
|-----------+-----------+--------------|
| UniRef | yes | |
| Rfam | yes | |
| ResFam | ongoing | |

* Usage
** You can see the command supported by micronota with:
#+BEGIN_SRC sh
micronota -h
#+END_SRC
** To run the default annotation pipeline:
#+BEGIN_SRC sh
micronota annotate -i <input.fna> -o <output-dir> --out-fmt genbank --kingdom bacteria
#+END_SRC
** Customize the annotaton:
You can set up what annotation to run with what parameters by providing a config file when running annotation:
#+BEGIN_SRC sh
micronota annotate -i <input.fna> -o <output-dir> --out-fmt genbank --kingdom bacteria --config <your-config>
#+END_SRC

You can modify the [[https://github.com/biocore/micronota/blob/master/micronota/bacteria.yaml][default config file]] to create your own config file.

* Sequence Features to Annotate

| Features | Supported | Tools |
|------------------------------------------+-----------+--------------------------------------------------|
| coding gene | yes | Prodigal |
| tRNA | yes | Aragorn |
| ncRNA | yes | Infernal + Rfam |
| CRISPR | yes | MinCED |
| rho-independent transcription terminator | yes | transtermhp |
| tandem repeat | yes | tandem repeat finder (trf) |
| ribosomal binding sites | ongoing | RBSFinder |
| prophage | ongoing | PHAST |
| replication origin | todo | Ori-Finder 1 (bacteria) & Ori-Finder 2 (archaea) |
| microsatellites | todo | nhmmer? |
| signal peptide | ongoing | SignalP |
| transmembrane proteins | ongoing | TMHMM |

* Contributing

If you're interested in getting involved in micronota development, see [[https://github.com/biocore/micronota/blob/master/CONTRIBUTING.md][CONTRIBUTING.md]].

See the list of [[https://github.com/biocore/micronota/graphs/contributors][micronota's contributors]].

* Licensing

micronota is available under the new BSD license. See [[https://github.com/biocore/micronota/blob/master/COPYING.txt][COPYING.txt]] for micronota's license, and [[https://github.com/biocore/micronota/tree/master/licenses][the licenses directory]] for the licenses of third-party software and databasese that are (either partially or entirely) distributed with micronota.
micronota is available under the new BSD license. See [[https://github.com/biocore/micronota/blob/master/COPYING.txt][COPYING.txt]] for micronota's license, and [[https://github.com/biocore/micronota/tree/master/licenses][the licenses directory]] for the licenses of third-party software and databasese that are (either partially or entirely) distributed with micronota.
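The `annotate` invocation shown in the new Usage section of the README can also be assembled from a script. The following is a minimal sketch, not part of this commit: the helper name `annotate_command` and the file names `genome.fna`, `out`, and `my.yaml` are hypothetical, and only the flags that appear in the README (`-i`, `-o`, `--out-fmt`, `--kingdom`, `--config`) are used.

```python
import shlex


def annotate_command(input_fna, output_dir, kingdom='bacteria',
                     out_fmt='genbank', config=None):
    """Build the argument list for `micronota annotate`.

    Hypothetical helper: it only arranges the flags shown in the README
    and does not check that micronota is installed.
    """
    argv = ['micronota', 'annotate',
            '-i', input_fna,
            '-o', output_dir,
            '--out-fmt', out_fmt,
            '--kingdom', kingdom]
    # --config is optional; the README uses it to customize the annotation
    if config is not None:
        argv += ['--config', config]
    return argv


cmd = annotate_command('genome.fna', 'out', config='my.yaml')
# Print a shell-safe rendering of the command
print(' '.join(shlex.quote(arg) for arg in cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it; keeping the arguments as a list avoids shell quoting problems with unusual file names.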
73 changes: 73 additions & 0 deletions micronota/bacteria.yaml
@@ -0,0 +1,73 @@
general:
  protein_xref: '~/database/protein.sqlite'
CDS:
  prodigal:
    params: '-f gff'
    priority: 100
    threads: 1
rho_independent_terminator:
  transtermhp:
    params: '-p $TRANSTERMHP'
    # '--all-context' # output all predicted terminators instead of legitimate ones
    priority: 99
    threads: 1
ncRNA:
  cmscan:
    params: ''
    priority: 50
    threads: 1
    db: '~/database/Rfam/v12.2/rfam-tRNA-rRNA.cm'
    output: 'cmscan'
CRISPR:
  minced:
    params: ''
    priority: 50
    threads: 1
tRNA:
  aragorn:
    params: '-w'
    priority: 50
    threads: 1
tandem_repeats:
  tandem_repeats_finder:
    params: ''
    priority: 50
    threads: 1
    output: 'tandem_repeats_finder'
rRNA:
  rnammer:
    params: '-m lsu,ssu,tsu'
  # cmscan_rRNA:
  #   params: ''
  #   priority: 50
  #   threads: 1
  #   db: '/Users/zech/database/Rfam/v12.2/bacteria.cm'
protein:
  diamond_uniref90:
    params: '--index-chunks 1 --id 90 --subject-cover 80 --query-cover 80 --max-target-seqs 3'
    priority: 50
    threads: 1
    db: '~/database/uniref/20161130/uniref90.dmnd'
    input: 'prodigal.faa'
    output: 'diamond_uniref90.faa'
  diamond_uniref50:
    params: '--index-chunks 1 --id 50 --subject-cover 80 --query-cover 80 --max-target-seqs 3'
    priority: 50
    threads: 1
    db: '~/database/uniref/20161130/uniref50.dmnd'
    input: 'diamond_uniref90.faa'
    output: 'diamond_uniref50.faa'
  # hmmscan_tigrfam:
  #   params: '−−cug_nc'
  #   priority: 60
  #   threads: 1
  #   db: '~/database/TIGRFAM/tigrfam.hmm'
  #   input: ''
  #   output: ''
  # hmmscan_tigrfam:
  #   params: '−−cug_ga'
  #   priority: 60
  #   threads: 1
  #   db: '~/database/TIGRFAM/tigrfam.hmm'
  #   input: ''
  #   output: ''
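In the `protein` section of this config, each DIAMOND step names an `input` and an `output` file, and the `input` of `diamond_uniref50` is the `output` of `diamond_uniref90`. The sketch below illustrates that chaining only. It is not how micronota schedules its tools (the commit does not show that), the function name `chain_order` is hypothetical, and the two entries are transcribed by hand into a dict rather than parsed from the YAML file.

```python
# Excerpt of the `protein` section, transcribed as a plain dict
# (only the `input` and `output` keys are kept for this illustration).
protein = {
    'diamond_uniref50': {'input': 'diamond_uniref90.faa',
                         'output': 'diamond_uniref50.faa'},
    'diamond_uniref90': {'input': 'prodigal.faa',
                         'output': 'diamond_uniref90.faa'},
}


def chain_order(section, first_input):
    """Order tools so that each one reads the file written by the previous one.

    Hypothetical helper: assumes every tool has exactly one input and one
    output, and that no two tools share the same input file.
    """
    by_input = {cfg['input']: name for name, cfg in section.items()}
    order = []
    current = first_input
    while current in by_input:
        # pop() guarantees each tool is visited at most once
        name = by_input.pop(current)
        order.append(name)
        current = section[name]['output']
    return order


print(chain_order(protein, 'prodigal.faa'))
# → ['diamond_uniref90', 'diamond_uniref50']
```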
3 changes: 2 additions & 1 deletion micronota/cli.py
@@ -55,7 +55,8 @@ def list_commands(self, ctx):
         rv = []
         cmd_folder = abspath(join(dirname(__file__), 'commands'))
         for filename in listdir(cmd_folder):
-            if filename.endswith('.py') and filename != '__init__.py':
+            # don't list the commands that starts with "_"; but they are still available to run
+            if filename.endswith('.py') and not filename.startswith('_'):
                 rv.append(splitext(filename)[0])
         rv.sort()
         return rv
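The file-name filter changed in this hunk can be exercised on its own. The following is a minimal sketch, not code from the repository: the function name `visible_commands` and the sample file list are hypothetical, while the condition itself is the one added by the commit.

```python
from os.path import splitext


def visible_commands(filenames):
    """Return sorted command names, skipping any module whose file name starts with '_'.

    Mirrors the condition added to list_commands; the input list stands in
    for the result of listdir(cmd_folder).
    """
    names = [splitext(f)[0] for f in filenames
             if f.endswith('.py') and not f.startswith('_')]
    names.sort()
    return names


# Hypothetical directory listing used only for illustration
sample = ['annotate.py', '_rfam.py', '_uniprot.py', '__init__.py', 'notes.txt']
print(visible_commands(sample))
# → ['annotate']
```

Because `__init__.py` also starts with an underscore, the new condition covers the case that the old `filename != '__init__.py'` check handled explicitly.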
8 changes: 2 additions & 6 deletions micronota/commands/_rfam.py
@@ -19,12 +19,12 @@

 # keep this command hidden from help msg
 # this option is only available in click v7
-@click.command(hidden=True)
+@click.command()
 @click.option('--operation', type=click.Choice(['bacteria', 'archaea', 'eukarya', 'default']),
               default='default', required=True,
               help='Keep bacteria/archaea/eukarya rRNA models or filter away tRNA, tmRNA, rRNA models (default)')
 @click.argument('infile', type=click.File('r'), nargs=1)
-@click.argument('outfile', type=click.Path(), nargs=1, default=None)
+@click.argument('outfile', type=click.Path(), nargs=1, default='rfam_filtered.cm')
 @click.pass_context
 def cli(ctx, operation, infile, outfile):
     '''Create rfam database for micronota usage.'''
@@ -39,12 +39,8 @@ def cli(ctx, operation, infile, outfile):
('RF01960', 'SSU_rRNA_eukarya'),
('RF02543', 'LSU_rRNA_eukarya')}}
     if operation == 'default':
-        if outfile is None:
-            outfile = 'miscRfam.cm'
         with open(join(outfile), 'w') as out:
             filter_models(infile, out)
     else:
-        if outfile is None:
-            outfile = join(kingdom + '.cm')
         with open(outfile, 'w') as out:
             filter_models(infile, out, negate=True, models=kingdom_models[kingdom])
2 changes: 1 addition & 1 deletion micronota/commands/_uniprot.py
@@ -17,7 +17,7 @@
 logger = getLogger(__name__)


-@click.command(hidden=True)
+@click.command()
 @click.argument('infile', type=click.Path(), nargs=-1)
 @click.argument('outfile', type=click.Path(), nargs=1)
 @click.pass_context