Updated module order
tjakobi committed Feb 23, 2018
1 parent e44990c commit a0c3089
Showing 1 changed file with 69 additions and 105 deletions.
174 changes: 69 additions & 105 deletions docs/Modules.md
@@ -1,21 +1,20 @@
This project contains the framework of the circular RNA toolbox ``circtools``.

# Modules

Circtools currently offers four modules:
Circtools currently offers seven modules:

```
$ circtools
$ circtools --help
usage: circtools [-V] <command> [<args>]
Available commands:
enrich: circular RNA RBP enrichment scan
primer: circular RNA primer design tool
detect: circular RNA detection with DCC
reconstruct: circular RNA reconstruction with FUCHS
quickcheck: circular RNA sequencing library quick checks
circtest: circular RNA statistical testing
enrich: circular RNA RBP enrichment scan
exon: circular RNA alternative exon analysis
quickcheck: circular RNA sequencing library quick checks
primer: circular RNA primer design tool
circtools: a modular, python-based framework for circRNA-related tools that
@@ -26,77 +25,18 @@ positional arguments:
optional arguments:
-h, --help show this help message and exit
-V, --version show program's version number and exit
-V, --version show program's version number and exit
```

### detect
## detect

The ``detect`` command is an interface to [DCC](https://github.com/dieterich-lab/DCC), also developed at the Dieterich lab. Please see the corresponding [manual](https://github.com/dieterich-lab/DCC) on the GitHub project for instructions on how to run DCC. The parameters supplied to circtools are passed directly to DCC.

### reconstruct
## reconstruct

The ``reconstruct`` command is an interface to [FUCHS](https://github.com/dieterich-lab/FUCHS). FUCHS employs DCC-generated data to reconstruct circRNA structures. Please see the corresponding [manual](https://github.com/dieterich-lab/FUCHS) on the GitHub project for instructions on how to run FUCHS. All parameters supplied to circtools are passed directly to FUCHS.


### primer

The ``primer`` command is used to design and visualize primers required for follow-up wet-lab experiments to verify circRNA candidates. The full documentation for the ``primer`` module can be found in its own [manual](R/circtools/vignettes/plot-transcripts.md).

### enrich

The ``enrich`` module may be used to identify circRNAs enriched for specific RNA binding proteins (RBP) based on DCC-identified circRNAs and processed [eCLIP](http://www.nature.com/nmeth/journal/v13/n6/full/nmeth.3810.html) data. For the K562 and HepG2 cell lines, plenty of this data is available through the [ENCODE](https://www.encodeproject.org/search/?type=Experiment&assay_title=eCLIP)
project. The enrich module understands the following options:

```
usage: circtools [-h] -c CIRC_RNA_INPUT -b BED_INPUT -a ANNOTATION -g
GENOME_FILE [-o OUTPUT_DIRECTORY] [-i NUM_ITERATIONS]
[-p NUM_PROCESSES] [-t TMP_DIRECTORY] [-T THRESHOLD]
[-P PVAL] [-H HAS_HEADER] [-F OUTPUT_FILENAME]
[-I INCLUDE_FEATURES]
circular RNA RBP enrichment tools
optional arguments:
-h, --help show this help message and exit
Required options:
-c CIRC_RNA_INPUT, --circ-file CIRC_RNA_INPUT
Path to the CircRNACount file generated by DCC
-b BED_INPUT, --bed-input BED_INPUT
One or more BED files containing features to overlap
-a ANNOTATION, --annotation ANNOTATION
Genome reference annotation file used to not shuffle
into intragenic regions
-g GENOME_FILE, --genome GENOME_FILE
Genome file for use with bedtools shuffle. See
bedtools man page for details.
Additional options:
-o OUTPUT_DIRECTORY, --output OUTPUT_DIRECTORY
The output folder for files created by circtest
[default: .]
-i NUM_ITERATIONS, --iterations NUM_ITERATIONS
Number of iterations for CLIP shuffling [default:
1000]
-p NUM_PROCESSES, --processes NUM_PROCESSES
Number of threads to distribute the work to
-t TMP_DIRECTORY, --temp TMP_DIRECTORY
Temporary directory used by pybedtools
-T THRESHOLD, --threshold THRESHOLD
p-value cutoff
-P PVAL, --pval PVAL p-value cutoff
-H HAS_HEADER, --header HAS_HEADER
Defines if the circRNA input file has a header line
[default: no]
-F OUTPUT_FILENAME, --output-filename OUTPUT_FILENAME
Defines the output file prefix [default: output]
-I INCLUDE_FEATURES, --include-features INCLUDE_FEATURES
Defines the the features which should be used for
shuffling. May be specified multiple times. [default:
all - shuffle over the whole genome]
```
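
The options listed above can be combined into a single call. The following is a minimal Python sketch, assuming the module is invoked as `circtools enrich` (consistent with the `<command>` usage shown earlier). The helper name `build_enrich_command` and all input file names are hypothetical placeholders; only the flags are taken from the help text. The sketch assembles and prints the command line and does not run circtools.

```python
import shlex
from typing import List, Optional


def build_enrich_command(
    circ_file: str,
    bed_input: str,
    annotation: str,
    genome: str,
    output_dir: str = ".",
    iterations: int = 1000,
    processes: Optional[int] = None,
) -> List[str]:
    """Assemble an argument list for the enrich module (hypothetical helper)."""
    cmd = [
        "circtools", "enrich",        # assumed invocation: circtools <command>
        "-c", circ_file,              # CircRNACount file generated by DCC
        "-b", bed_input,              # BED file with features to overlap
        "-a", annotation,             # genome reference annotation
        "-g", genome,                 # genome file for bedtools shuffle
        "-o", output_dir,             # output folder [default: .]
        "-i", str(iterations),        # shuffling iterations [default: 1000]
    ]
    if processes is not None:
        cmd += ["-p", str(processes)]  # number of threads
    return cmd


# All paths below are placeholders, not files shipped with circtools.
command = build_enrich_command(
    circ_file="CircRNACount",
    bed_input="rbp_eclip_peaks.bed",
    annotation="annotation.gtf",
    genome="genome.chrom.sizes",
    output_dir="enrich_out",
    processes=4,
)
print(shlex.join(command))
```

Building the call as a list keeps paths with spaces intact; the list could be handed to `subprocess.run(command, check=True)` once the placeholder paths are replaced with real files.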
### circtest
## circtest

The ``circtest`` command is an interface to [CircTest](https://github.com/dieterich-lab/CircTest). The module is a convenient way to apply statistical testing to circRNA candidates generated with DCC without having to write an R script for each new experiment. For detailed information on the implementation itself, take a look at the [CircTest documentation](https://github.com/dieterich-lab/CircTest). In essence, the module allows dynamic grouping of the columns (samples) in the DCC data.

@@ -170,7 +110,62 @@ circtools circtest -d DCC_DIR
```


### exon
## enrich

The ``enrich`` module may be used to identify circRNAs enriched for specific RNA binding proteins (RBP) based on DCC-identified circRNAs and processed [eCLIP](http://www.nature.com/nmeth/journal/v13/n6/full/nmeth.3810.html) data. For the K562 and HepG2 cell lines, plenty of this data is available through the [ENCODE](https://www.encodeproject.org/search/?type=Experiment&assay_title=eCLIP)
project. The enrich module understands the following options:

```
usage: circtools [-h] -c CIRC_RNA_INPUT -b BED_INPUT -a ANNOTATION -g
GENOME_FILE [-o OUTPUT_DIRECTORY] [-i NUM_ITERATIONS]
[-p NUM_PROCESSES] [-t TMP_DIRECTORY] [-T THRESHOLD]
[-P PVAL] [-H HAS_HEADER] [-F OUTPUT_FILENAME]
[-I INCLUDE_FEATURES]
circular RNA RBP enrichment tools
optional arguments:
-h, --help show this help message and exit
Required options:
-c CIRC_RNA_INPUT, --circ-file CIRC_RNA_INPUT
Path to the CircRNACount file generated by DCC
-b BED_INPUT, --bed-input BED_INPUT
One or more BED files containing features to overlap
-a ANNOTATION, --annotation ANNOTATION
Genome reference annotation file used to not shuffle
into intragenic regions
-g GENOME_FILE, --genome GENOME_FILE
Genome file for use with bedtools shuffle. See
bedtools man page for details.
Additional options:
-o OUTPUT_DIRECTORY, --output OUTPUT_DIRECTORY
The output folder for files created by circtest
[default: .]
-i NUM_ITERATIONS, --iterations NUM_ITERATIONS
Number of iterations for CLIP shuffling [default:
1000]
-p NUM_PROCESSES, --processes NUM_PROCESSES
Number of threads to distribute the work to
-t TMP_DIRECTORY, --temp TMP_DIRECTORY
Temporary directory used by pybedtools
-T THRESHOLD, --threshold THRESHOLD
p-value cutoff
-P PVAL, --pval PVAL p-value cutoff
-H HAS_HEADER, --header HAS_HEADER
Defines if the circRNA input file has a header line
[default: no]
-F OUTPUT_FILENAME, --output-filename OUTPUT_FILENAME
Defines the output file prefix [default: output]
-I INCLUDE_FEATURES, --include-features INCLUDE_FEATURES
Defines the the features which should be used for
shuffling. May be specified multiple times. [default:
all - shuffle over the whole genome]
```


## exon

The exon module of circtools employs the [ballgown R package](https://www.bioconductor.org/packages/release/bioc/html/ballgown.html) to combine data generated with DCC and circtest with ballgown-compatible `stringtie` output or cufflinks output converted via [tablemaker](https://github.com/leekgroup/tablemaker) in order to gain deeper insights into differential exon usage within circRNA candidates.

@@ -230,39 +225,8 @@ Output options:
[Default: exon_analysis]
```

### quickcheck

The quickcheck module of circtools is an easy way to check the results of a DCC run for problems and to quickly assess the number of circRNAs in a given experiment. The module needs the mapping log files produced by STAR as well as the directory with the DCC results. The module then generates a series of figures in PDF format to assess the results.

```
usage: circtools [-h] -d DCC_DIR -s STAR_DIR -l CONDITION_LIST -g GROUPING
[-o OUTPUT_DIRECTORY] [-n OUTPUT_NAME]
## primer

circular RNA sequencing library quality assessment
optional arguments:
-h, --help show this help message and exit
Required:
-d DCC_DIR, --DCC DCC_DIR
Path to the detect/DCC data directory
-s STAR_DIR, --star STAR_DIR
Path to the base STAR data directory containing sub-
folders with per-sample mappings
-l CONDITION_LIST, --condition-list CONDITION_LIST
Comma-separated list of conditions which should be
comparedE.g. "RNaseR +","RNaseR -"
-g GROUPING, --grouping GROUPING
Comma-separated list describing the relation of the
columns specified via -c to the sample names specified
via -l; e.g. -g 1,2 and -r 3 would assign sample1 to
each even column and sample 2 to each odd column
Output options:
-o OUTPUT_DIRECTORY, --output-directory OUTPUT_DIRECTORY
The output directory for files created by circtest
[Default: ./]
-n OUTPUT_NAME, --output-name OUTPUT_NAME
The output name for files created by circtest
[Default: quickcheck]
```
The ``primer`` command is used to design and visualize primers required for follow-up wet-lab experiments to verify circRNA candidates. The full documentation for the ``primer`` module can be found in its own [manual](R/circtools/vignettes/plot-transcripts.md).
