Commit

PR requests part-2
tkmamidi committed Feb 1, 2024
1 parent 9e0dc66 commit e324714
Showing 7 changed files with 25 additions and 12 deletions.
2 changes: 1 addition & 1 deletion .test_data/README
@@ -4,4 +4,4 @@ This directory has 3 files -

`testing_variants_hg38.vcf.gz` - We custom made a test VCF file with few variants from every chromosome (1-22,X,Y)

-`file_list` - contains list of above 2 test vcf files with relative path. This file is used to test nextflow pipeline
+`file_list.txt` - contains list of above 2 test vcf files with relative path. This file is used to test nextflow pipeline
File renamed without changes.
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -10,3 +10,11 @@ YYYY-MM-DD John Doe
```

---

+```txt
+2024-02-01 Tarun Mamidi
+* Uses OpenCRAVAT for annotations
+* Uses Neural Networks from keras instead of traditional scikit-learn models
+* Nextflow pipeline to annotate, parse and DITTO predictions
+```
21 changes: 13 additions & 8 deletions README.md
@@ -14,12 +14,16 @@ Markdown](https://github.com/uab-cgds-worthey/DITTO/actions/workflows/linting.ym
DITTO is an explainable neural network that can be helpful for accurate and rapid interpretation of small
genetic variants for pathogenicity using patient’s genotype (VCF) information.

-## Usage
+## Using DITTO

+DITTO scores for variants can be obtained by the below 3 ways. Webapp and API are for single variant analysis and the
+local setup is for batch/bulk variant predictions.

### Webapp
+<!-- markdown-link-check-disable -->
DITTO is available for public use at this [website](https://cgds-ditto.streamlit.app/).
+<!-- markdown-link-check-enable -->

### API

DITTO is not hosted as a public API but one can serve up locally to query DITTO scores. Please follow the instructions
@@ -72,25 +76,26 @@ Please follow the steps mentioned in [install_openCravat.md](docs/install_openCr
#### Run DITTO pipeline

-Create an environment via conda or pip. Below is an example to install `nextflow`.
+Create an environment via conda. Below is an example to install `nextflow`.

- [Anaconda virtual environment](https://docs.anaconda.com/free/anaconda/install/index.html)

```sh
# create environment. Needed only the first time. Please use the above link if you're not using Mac.
-conda create --name envi ditto-env
+conda create --name ditto-env

conda activate ditto-env

# Install nextflow
conda install bioconda::nextflow
```

-Please make a samplesheet with VCF files (incl. path). Please make sure to edit the directory paths as needed.
+Please make a samplesheet with VCF files (incl. path). Please make sure to edit the directory paths as needed and run
+the pipeline as shown below.

```sh
nextflow run pipeline.nf \
-  --outdir /data/ \
+  --outdir ./data/ \
-work-dir ./wor_dir \
--build hg38 -with-report \
--oc_modules /data/opencravat/modules \
@@ -114,6 +119,6 @@ For queries, please open a GitHub issue.
For urgent queries, send an email with clear description to

|Name | Email |
-------|--------|
-Tarun Mamidi | <tmamidi@uab.edu>
-Liz Worthey | <lworthey@uab.edu>
+|------|--------|
+|Tarun Mamidi | <tmamidi@uab.edu>|
+|Liz Worthey | <lworthey@uab.edu>|
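
Note on the sample sheet: the README change asks users to "make a samplesheet with VCF files (incl. path)", and `.test_data/README` describes `file_list.txt` as a list of VCF files with their paths. A minimal sketch of building such a file follows. It assumes the sheet is plain text with one VCF path per line, which the diff does not confirm. The function name `make_sample_sheet` and the placeholder file names are hypothetical.

```sh
#!/bin/bash
set -euo pipefail

# Hypothetical helper (not part of the repository): write the path of every
# *.vcf.gz file in a directory to a sample sheet, one path per line.
# Assumption: the pipeline reads a plain list of paths.
make_sample_sheet() {
    local vcf_dir="$1"
    local out_file="$2"
    find "$vcf_dir" -maxdepth 1 -type f -name '*.vcf.gz' | sort > "$out_file"
}

# Demo on a temporary directory holding two empty placeholder files.
demo_dir="$(mktemp -d)"
touch "$demo_dir/sample_a.vcf.gz" "$demo_dir/sample_b.vcf.gz" "$demo_dir/notes.txt"

make_sample_sheet "$demo_dir" "$demo_dir/file_list.txt"
cat "$demo_dir/file_list.txt"
```

The resulting file could then be passed to the pipeline through `--sample_sheet`, the option used in `model.job`.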
2 changes: 1 addition & 1 deletion model.job
@@ -28,6 +28,6 @@ module load Anaconda3
--outdir /data/results \
-work-dir .work_dir/ \
--build hg38 -c cheaha.config -with-report \
-    --sample_sheet .test_data/file_list -resume
+    --sample_sheet .test_data/file_list.txt -resume

#https://training.nextflow.io/basic_training/cache_and_resume/#how-to-organize-in-silico-experiments
2 changes: 1 addition & 1 deletion pipeline.nf
@@ -2,7 +2,7 @@
nextflow.enable.dsl=2

// Define the command-line options to specify the path to VCF files
-params.sample_sheet = '.test_data/file_list'
+params.sample_sheet = '.test_data/file_list.txt'
params.build = "hg38"
params.oc_modules = "/data/project/worthey_lab/projects/experimental_pipelines/tarun/opencravat/modules"
// Define the Scratch directory
2 changes: 1 addition & 1 deletion src/analysis/filter.sh
@@ -1,5 +1,5 @@
#!/bin/bash

+set -euo pipefail
# Filter the DITTO scores and other annotations after running the pipeline. Example tested on CAGI project

# Specify the input folder containing the CSV files
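
Note on `set -euo pipefail`: the line added to `filter.sh` makes the script stop on a failing command (`-e`), treat unset variables as errors (`-u`), and report a pipeline as failed when any stage fails (`-o pipefail`). The sketch below illustrates the last two effects in isolation. It is not taken from `filter.sh`, and the variable names are illustrative only.

```sh
#!/bin/bash
# Each case runs in a child shell so that a failure there does not stop this demo.

# Default behaviour: a pipeline's status is that of its last command (cat),
# so the failure of `false` is hidden.
status_default=0
bash -c 'false | cat' || status_default=$?

# With pipefail, the failing stage makes the whole pipeline fail.
status_pipefail=0
bash -c 'set -o pipefail; false | cat' || status_pipefail=$?

# With -u, expanding an unset variable is an error instead of an empty string.
status_nounset=0
bash -c 'set -u; echo "$undefined_variable"' 2>/dev/null || status_nounset=$?

echo "default=$status_default pipefail=$status_pipefail nounset=$status_nounset"
```

In a filtering script that pipes CSV files through several commands, this means a failure in an early stage is reported rather than silently producing partial output.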
