Pipeline for analyzing results of in-vivo nanoparticle experiments from the Dahlman Lab at Georgia Tech.
How to run
Clone this repository into your home directory
Open a terminal window
Run the following command:
The pipeline needs the following inputs:
File path to fastq.gz files
Output directory location
Barcode list (CSV file, one 8-nt barcode per line)
The pipeline produces the following outputs:
Raw counts of each barcode per read file
Barcode counts normalized to input
Average and standard deviation of normalized barcode counts between replicates of the same cell type
Same as cell_type_variance.csv except replicates are included next to their aggregate metrics
Base percentages for each position in the three randomized PCR regions
Graphs generated by the normalization.R script
For each fastq file, the number of reads with an 'N' in the barcode region and the number of reads where no barcode was found in the expected region
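To make the normalized counts and the per-cell-type average and standard deviation listed above more concrete, here is a minimal Python sketch. The exact normalization the pipeline applies is not spelled out in this README, so the sketch assumes one common approach: convert raw counts to fractions of each sample's total, then divide by the matching fraction in the input sample. The function names and the choice of sample standard deviation are illustrative assumptions, not the pipeline's own code.

```python
from statistics import mean, stdev


def to_fractions(counts):
    """Convert raw barcode counts to fractions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no barcode counts")
    return {barcode: n / total for barcode, n in counts.items()}


def normalize_to_input(sample_counts, input_counts):
    """Assumed normalization: sample fraction divided by input fraction."""
    sample_frac = to_fractions(sample_counts)
    input_frac = to_fractions(input_counts)
    return {
        barcode: sample_frac.get(barcode, 0.0) / frac
        for barcode, frac in input_frac.items()
        if frac > 0  # barcodes absent from the input cannot be normalized
    }


def summarize_replicates(replicates):
    """Mean and sample standard deviation per barcode.

    `replicates` is a list of at least two dicts (barcode -> normalized
    count), one dict per replicate of the same cell type.
    """
    summary = {}
    for barcode in replicates[0]:
        values = [rep[barcode] for rep in replicates]
        summary[barcode] = (mean(values), stdev(values))
    return summary


# Hypothetical counts for two barcodes and two replicates
input_counts = {"AAAAAAAA": 50, "CCCCCCCC": 50}
rep1 = normalize_to_input({"AAAAAAAA": 30, "CCCCCCCC": 10}, input_counts)
rep2 = normalize_to_input({"AAAAAAAA": 10, "CCCCCCCC": 30}, input_counts)
summary = summarize_replicates([rep1, rep2])
```

With this assumed normalization, a value above 1 means a barcode is enriched relative to the input and a value below 1 means it is depleted.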
Barcodes were located in each read using the probe binding site 'CCTGCTAGTCCACGTCCATGTCCACC'. Reads in which the probe binding site was not found were skipped. This most likely happens when a base in the probe binding site could not be called confidently and was assigned an 'N'. Reads with an 'N' in the barcode region were skipped as well.
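The skip rules above can be sketched in Python roughly as follows. This is an illustration, not the pipeline's actual code: the function name `extract_barcode` and the reason labels are hypothetical, and the sketch assumes the 8-nt barcode sits immediately after the probe binding site, which this README does not state.

```python
PROBE = "CCTGCTAGTCCACGTCCATGTCCACC"  # probe binding site quoted above
BARCODE_LEN = 8  # barcodes are 8 nt, per the barcode list description


def extract_barcode(read, probe=PROBE, barcode_len=BARCODE_LEN):
    """Return (barcode, reason); barcode is None when the read is skipped.

    ASSUMPTION: the barcode directly follows the probe binding site.
    """
    start = read.find(probe)
    if start == -1:
        # Also covers reads where an 'N' was called inside the probe site,
        # because the exact match fails.
        return None, "probe_not_found"
    begin = start + len(probe)
    barcode = read[begin:begin + barcode_len]
    if len(barcode) < barcode_len:
        # Extra case for this sketch: read ends before a full barcode.
        return None, "truncated"
    if "N" in barcode:
        return None, "n_in_barcode"
    return barcode, "ok"


# Hypothetical reads
good = extract_barcode("AA" + PROBE + "ACGTACGT" + "TT")
no_probe = extract_barcode("ACGTACGTACGT")
n_call = extract_barcode(PROBE + "ACGNACGT")
```

Tallying the returned reasons per fastq file would give counts like the ones reported for reads with an 'N' in the barcode region and reads where no barcode was found.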
This pipeline requires Python version 3.7.x. Important note: on macOS Mojave 10.14.6, Python 3.7.3 causes Tkinter to crash the OS. This is a known macOS bug. If you are using macOS 10.14.6, downgrade your Python version to 3.7.0.
Author: Jack Feldman
Dahlman Lab, Georgia Tech