
SahilBShah/Vaccinia_intron_coordinates


Vaccinia intron coordinates

Michael Ly, Hannah M Burgess, Sahil B. Shah, Ian Mohr, Britt A. Glaunsinger

Function overview

This script derives intron coordinates from the exon records in a gtf file and writes a new gtf file containing all genomic coordinates, including the introns. The RNA-seq data needed to replicate our work in the paper "Vaccinia virus D10 has broad decapping activity that is regulated by mRNA splicing" (pre-print: https://www.biorxiv.org/content/10.1101/2021.11.10.468017v1) are available under the GEO accession number GSE185520. The script is not specific to that dataset, however, and can be run on any gtf file of interest.
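To illustrate the idea of deriving introns from exon data, here is a minimal sketch that treats an intron as the gap between consecutive exons of a single transcript. It is not the code in gtf_intron_extraction.py: the function name `introns_from_exons` and the tuple-based input are hypothetical, and the sketch assumes 1-based, inclusive coordinates as used in gtf files.

```python
from typing import List, Tuple

# Assumption: intervals are (start, end), 1-based and inclusive, as in gtf files.
Interval = Tuple[int, int]


def introns_from_exons(exons: List[Interval]) -> List[Interval]:
    """Return the gaps between the exons of one transcript.

    Hypothetical helper for illustration only; the repository's script
    may compute introns differently.
    """
    ordered = sorted(exons)
    if not ordered:
        return []

    introns: List[Interval] = []
    current_end = ordered[0][1]  # furthest exon end seen so far
    for start, end in ordered[1:]:
        # A gap exists only if the next exon starts beyond the position
        # immediately after the furthest end seen so far.
        if start > current_end + 1:
            introns.append((current_end + 1, start - 1))
        current_end = max(current_end, end)
    return introns


if __name__ == "__main__":
    example_exons = [(100, 200), (301, 400), (500, 600)]
    print(introns_from_exons(example_exons))
```

Tracking the furthest exon end, rather than only the previous exon's end, keeps the sketch from reporting a spurious intron when exons overlap or are nested.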

Requirements

Most packages that this script relies upon are included in the standard scientific/numeric python distributions. However, the gtfparse package must be installed separately to read in gtf files. This can be done with the command:

pip install gtfparse

The AGEpy package, developed at the Bioinformatics Core Facility of the Max Planck Institute for Biology of Ageing, is also required; it is used to write python data frames out as a gtf file. Installation instructions for this package can be found here: https://github.com/mpg-age-bioinformatics/AGEpy.

Input files

The only required input is a gtf file.
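For readers unfamiliar with the format, each gtf record is one line of nine tab-separated columns (seqname, source, feature, start, end, score, strand, frame, attribute). The sketch below parses a single record using only the python standard library. It is an illustration, not how the script reads its input (the script uses gtfparse); the function name `parse_gtf_line` and the example record are hypothetical, and the attribute parsing is simplified in that it assumes no semicolons inside quoted values.

```python
from typing import Any, Dict

GTF_COLUMNS = 9  # seqname, source, feature, start, end, score, strand, frame, attribute


def parse_gtf_line(line: str) -> Dict[str, Any]:
    """Parse one gtf record into a dictionary (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != GTF_COLUMNS:
        raise ValueError(f"expected {GTF_COLUMNS} tab-separated columns, got {len(fields)}")

    seqname, source, feature, start, end, score, strand, frame, attribute = fields

    # Attributes look like: gene_id "G1"; transcript_id "T1";
    # Simplification: assumes values contain no semicolons.
    attributes: Dict[str, str] = {}
    for item in attribute.split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")
        attributes[key] = value.strip().strip('"')

    return {
        "seqname": seqname,
        "source": source,
        "feature": feature,
        "start": int(start),  # 1-based, inclusive
        "end": int(end),      # 1-based, inclusive
        "score": score,
        "strand": strand,
        "frame": frame,
        "attributes": attributes,
    }


if __name__ == "__main__":
    # Made-up record for illustration; not taken from the paper's data.
    example = 'chr1\texample\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";'
    print(parse_gtf_line(example))
```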

Output location

The new gtf file is written to the folder that contains this script, under the file name "genes_w_introns.gtf".

Working examples

Example command to run the script:

python3 gtf_intron_extraction.py gene_coordinates.gtf
