Skip to content

Scripts developed in the GuazzaroniLab for long-read data processing and 16S profiling of microbial communities in hospital surfaces

License

Notifications You must be signed in to change notification settings

GuazzaroniLab/nanopore_ICU_profiling

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Nanopore sequencing provides rapid and reliable insight into microbial profiles of Intensive Care Units

Guilherme Marcelino Viana de Siqueira, Felipe Marcelo Pereira-dos-Santos, Rafael Silva-Rocha, María-Eugenia Guazzaroni

In this repository we provide a custom pipeline for taxonomic assignment of long-read 16S sequencing. We have submitted this work in Biorxiv!!

Usage

In Python, import the library provided in minion_analysis.py with the following command:

import minion_analysis

Next, provide paths in which demultiplexed sequencing files (.FASTQ) might be found. Assuming that we have our demultiplexed files for two sequencing runs in the subdirectories Experiment1 and Experiment2, within a directory experiment_data in the user's Documents directory, the command would look like:

path = '/home/user/Documents/experiment_data'
lib = ['Experiment1', 'Experiment2']

The next steps automate the pipeline guppy_barcoder (for removing barcodes) -> NanoFilt -> Minimap2. The minimap2 step depends on the file refseq_16S.fa, that can be found in this repository as a .zip file

HC_analysis = minion_analysis.Analysis(path, lib)

HC_analysis.merge_files(cpus = 4)

HC_analysis.guppy_barcoder(cpus = 2)

HC_analysis.nanofilt(q = 7, cpus=2, just_filt = False)

HC_analysis.minimap(refseqpath = path, nanofilt_q = 7, cpus = 2)

After processing the files, obtaining the read count for each resulting alignment files (.paf)can be done with the alignment-parser.R script.

Rscript --vanilla ./alignment-parser.R [path] [lib(s)]

For this, the nametable.tsv file (that relates NCBI accession numbers and species names) should be located in a directory within the working environment.

UPDATE July 21st 2021: We now provide an example for our downstream analysis in R. Data and code can be found in the directory analysis-example. Please, feel free to contact us at viana.guilherme@usp.br for any questions or suggestions.

About

Scripts developed in the GuazzaroniLab for long-read data processing and 16S profiling of microbial communities in hospital surfaces

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages