formatting document

uab-cgds-worthey · Jan 25, 2024 · a150750
1 parent 7e5e303

Showing 4 changed files with 15 additions and 20 deletions.
diff --git a/README.md b/README.md
@@ -7,13 +7,11 @@ Markdown](https://github.com/uab-cgds-worthey/DITTO/actions/workflows/linting.ym
 
 ***!!! For research purposes only !!!***
 
-> **_NOTE:_**  In a past life, DITTO used a different remote Git management provider, [UAB
+> ***NOTE:***  In a past life, DITTO used a different remote Git management provider, [UAB
 > Gitlab](https://gitlab.rc.uab.edu/center-for-computational-genomics-and-data-science/sciops/ditto). It was migrated to
 > Github in April 2023, and the Gitlab version has been archived.
 
-
-**Aim:** We aim to develop a pipeline for accurate and rapid interpretation of genetic variants for pathogenicity using patient’s genotype (VCF) information.
-
+We aim to develop a pipeline for accurate and rapid interpretation of genetic variants for pathogenicity using patient’s genotype (VCF) information.
 
 ## Usage
 
@@ -28,7 +26,7 @@ in this [GitHub repo](https://github.com/uab-cgds-worthey/DITTO-API).
 
 ### Setting up to use locally
 
-> **_NOTE:_** This setup will allow one to annotate a VCF sample and make DITTO predictions. Currently tested only in Cheaha (UAB HPC). Docker versions may need to be explored later to make it
+> ***NOTE:*** This setup will allow one to annotate a VCF sample and make DITTO predictions. Currently tested only in Cheaha (UAB HPC). Docker versions may need to be explored later to make it
 > useable in Mac and Windows.
 
 #### System Requirements
@@ -63,12 +61,11 @@ git clone https://github.com/uab-cgds-worthey/DITTO.git
 
 Please follow the steps mentioned in [install_openCravat.md](docs/install_openCravat.md).
 
-> **_NOTE:_** Current version of OpenCravat that we're using doesn't support "Spanning or overlapping deletions"
+> ***NOTE:*** Current version of OpenCravat that we're using doesn't support "Spanning or overlapping deletions"
 > variants i.e. variants with `*` in `ALT Allele` column. More on these variants
 > [here](https://gatk.broadinstitute.org/hc/en-us/articles/360035531912-Spanning-or-overlapping-deletions-allele-).
 > These will be ignored when running the pipeline.
 
-
 #### Run DITTO pipeline
 
 Create an environment via conda or pip. Below is an example to install `nextflow`.
@@ -85,7 +82,6 @@ conda activate ditto-env
 conda install bioconda::nextflow
 ```
 
-
 Please make a samplesheet with VCF files (incl. path). Please make sure to edit the directory paths as needed.
 
 ```sh
@@ -103,14 +99,12 @@ To run on UAB cheaha, please update the `model.job` file and submit a slurm job
 sbatch model.job
 ```
 
-
 ## Reproducing the DITTO model
 
 Detailed instructions on reproducing the model is explained in [build_DITTO.md](docs/build_DITTO.md)
 
-
 ## Contact information
 
 For queries, send an email with clear description to
 
-Tarun Mamidi    -   tmamidi@uab.edu
+Tarun Mamidi    -   <tmamidi@uab.edu>
diff --git a/docs/build_DITTO.md b/docs/build_DITTO.md
@@ -11,7 +11,6 @@ sources.
 
 :fire: DITTO is currently trained on variants from ClinVar and is not intended for clinical use.
 
-
 ## System Requirements
 
 *OS:*
@@ -30,18 +29,16 @@ sources.
 - Storage: ~1TB (includes annotation databases from OpenCravat)
 - RAM: ~50GB
 
-> **_NOTE:_** We used 10 CPU cores, 50GB memory for training DITTO. The tuning and training process took ~16 hrs. Since
+> ***NOTE:*** We used 10 CPU cores, 50GB memory for training DITTO. The tuning and training process took ~16 hrs. Since
 > DITTO uses tensorflow architecture, this process can be potentially accelerated using GPUs.
 
-
 ## Installation
 
-### Requirements:
+### Requirements
 
 - DITTO repo from GitHub
 - OpenCravat with databases to annotate
 
-
 To fetch DITTO source code, change in to directory of your choice and run:
 
 ```sh
@@ -75,7 +72,7 @@ Download the latest clinVar variants: [Download VCF](https://ftp.ncbi.nlm.nih.go
 oc run clinvar.vcf.gz -l hg38 -t csv --package mypackage -d path/to/output/directory/
 ```
 
-> **_NOTE:_** By default OpenCravat uses all available CPUs. Please specify the number of CPU cores using this parameter
+> ***NOTE:*** By default OpenCravat uses all available CPUs. Please specify the number of CPU cores using this parameter
 > in the above command `--mp 2`. Minimum number of CPUs to use is 2.
 
 ## Preprocessing
@@ -116,7 +113,6 @@ Follow the below steps to install and add more databases for annotation and befo
 
 4. Follow the steps from Preprocessing above to parse, filter, process, tune and train DITTO.
 
-
 ## Benchmarking
 
 Please follow the [python notebook](../src/analysis/opencravat_latest_benchmarking-Consequence_80_20.ipynb) to benchmark

diff --git a/docs/install_openCravat.md b/docs/install_openCravat.md
@@ -1,9 +1,11 @@
 # OpenCravat
+
 Original documentation for OpenCravat can be found [here](https://open-cravat.readthedocs.io/en/latest/index.html).
 
 ## Installation
 
 ### Create conda environment
+
 ```sh
 # create conda environment. Needed only the first time.
 conda create -n opencravat
@@ -13,9 +15,11 @@ conda activate opencravat
 ```
 
 ### Install openCravat
+
 ```sh
 pip3 install open-cravat==2.4.1
 ```
+
 ### Set Modules Directory
 
 Use `oc config md` to see where modules directory is currently pointed to. To change the modules directory, use `oc
@@ -24,6 +28,7 @@ config md [new directory]` to point OpencRAVAT to the new directory.
 Test it by using `oc config md` command. It should output the new modules directory.
 
 ### Install necessary modules for DITTO
+
 ```sh
 oc module install-base
 

diff --git a/src/analysis/filter.sh b/src/analysis/filter.sh
@@ -1,7 +1,7 @@
-# Filter the DITTO scores and other annotations after running the pipeline. Example tested on CAGI project
-
 #!/bin/bash
 
+# Filter the DITTO scores and other annotations after running the pipeline. Example tested on CAGI project
+
 # Specify the input folder containing the CSV files
 input_folder="/data/project/worthey_lab/projects/experimental_pipelines/tarun/DITTO/data/processed/CAGI_TR/"