Penguin

A tool to extract and classify pseudouridine signals from fast5 files.

Tool workflow

Penguin takes a path to fast5 files as input. If you do not provide a SAM file, you must provide a reference genome to align against so that the tool can create the SAM file itself. If you do not provide a BED file, the default one included with the tool is used. The tool then identifies all fast5 files and creates a coordinate file listing the IDs of the files that are modified.
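The input rules in this workflow can be sketched in a few lines of Python. This is only an illustration of the decision logic described above, not the code in main.py: the function name plan_inputs, the returned dictionary keys, and the DEFAULT_BED placeholder are all hypothetical. The sketch follows the paragraph above and raises an error when neither a SAM file nor a reference genome is given; the options list mentions a default reference, so the real tool may fall back to that instead.

```python
from typing import Optional

# Hypothetical placeholder: the real name of the bundled BED file is not stated here.
DEFAULT_BED = "default.bed"


def plan_inputs(fast5_dir: str,
                sam: Optional[str] = None,
                bed: Optional[str] = None,
                ref: Optional[str] = None) -> dict:
    """Decide which inputs to use and whether an alignment step is needed."""
    if not fast5_dir:
        raise ValueError("a fast5 path is required")

    # No SAM file means one has to be created by aligning to a reference.
    needs_alignment = sam is None
    if needs_alignment and ref is None:
        raise ValueError("without a SAM file, a reference genome is needed to create one")

    return {
        "fast5_dir": fast5_dir,
        "sam": sam,
        # Fall back to the bundled BED file when none is supplied.
        "bed": bed if bed is not None else DEFAULT_BED,
        "ref": ref,
        "needs_alignment": needs_alignment,
    }


if __name__ == "__main__":
    print(plan_inputs("reads/", sam="aln.sam"))
```

Keeping the decision separate from the actual alignment and signal extraction makes the rules easy to check before any long-running step starts.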

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

N/A

Installing

First, run install.sh to get all the files required to run the program.

./install.sh

OR

[HIGHLY RECOMMENDED] Docker installation: download Docker (there is no need to clone the repository), then open a command line and enter:

docker pull danielacevedo01/penguin:flappie-latest
docker run -i -t danielacevedo01/penguin:flappie-latest /bin/bash
cd home/danny/penguin
git pull

Running the tool

-i    fast5 path (required)
-s    SAM file (created if not included)
-b    BED file (default used if not included)
-ref  reference genome (default used if not included)
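For readers who want to see how these four options might be declared, here is a minimal argparse sketch. It is an assumption about the shape of the interface, not a copy of main.py: the function name build_parser and the destination names (fast5_path, sam_file, bed_file, reference) are hypothetical, and only the flags themselves come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the four options listed above."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Extract and classify pseudouridine signals from fast5 files",
    )
    parser.add_argument("-i", dest="fast5_path", required=True,
                        help="path to the fast5 files (required)")
    parser.add_argument("-s", dest="sam_file", default=None,
                        help="SAM file (created if not included)")
    parser.add_argument("-b", dest="bed_file", default=None,
                        help="BED file (default used if not included)")
    parser.add_argument("-ref", dest="reference", default=None,
                        help="reference genome (default used if not included)")
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list so the sketch runs without a real command line.
    args = build_parser().parse_args(["-i", "reads/", "-s", "aln.sam"])
    print(args.fast5_path, args.sam_file, args.bed_file, args.reference)
```

Leaving the optional values as None lets later code decide where each default comes from, rather than hard-coding file names in the parser.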

Example

python3 main.py -i ~/fast5_directory/ -s ~/sam_directory/my_sam_file.sam -b ~/bed_directory/my_bed_file.bed

Built With

TensorFlow - Used to generate machine learning models
Scrappie - Used as the default basecaller
Nanopolish - Used to create k-mers for the machine learning models

Contributing

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.

Authors

Daniel Acevedo - Initial work - Daniel235
Doaa Salem - Models - hsdoaa

See also the list of contributors who participated in this project.

License

This project is licensed under the MIT License. See the LICENSE.md file for details.

Acknowledgments

Falcon
Janga Lab (https://jangalab.sitehost.iu.edu/)

