
dabane-ghassan/GFF-GTF-analysis


GFF-GTF-analysis

General Feature Format (GFF) files, and the closely related GTF files, consist of one line per feature, each containing 9 tab-separated columns of data. For a more detailed explanation of this file format and its columns, please refer to https://www.ensembl.org/info/website/upload/gff.html.
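To make the 9-column layout concrete, here is a minimal sketch that splits one GTF line into named fields. It is not taken from the GeneralFormat script: the `Feature` type, the `parse_line` helper and the example record are hypothetical, and the column names follow the Ensembl description linked above.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """One GFF/GTF record (hypothetical helper type, not part of GeneralFormat)."""
    seqname: str
    source: str
    feature: str
    start: int      # 1-based, inclusive
    end: int        # 1-based, inclusive
    score: str
    strand: str
    frame: str
    attribute: str


def parse_line(line: str) -> Feature:
    """Split a tab-separated GFF/GTF line into its 9 columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    seqname, source, feature, start, end, score, strand, frame, attribute = fields
    return Feature(seqname, source, feature, int(start), int(end),
                   score, strand, frame, attribute)


# Made-up record for illustration only.
example = "\t".join([
    "chr1", "toy", "exon", "100", "199", ".", "+", ".",
    'gene_id "G1"; transcript_id "T1";',
])
feat = parse_line(example)
```

Keeping `score` and `frame` as strings avoids failing on the `.` placeholder that marks a missing value.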

Importing the script and initializing the file

import GeneralFormat as gf 
file_path = "hg38_5k.gtf"
gtf = gf.GeneralFormat(file_path)

Getting the number of non-redundant transcripts in the file

gtf.nb_nr_tx() 

Getting the number of exons per transcript

gtf.ex_per_tx() 
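As a rough illustration of what such a count involves (a sketch, not the script's actual implementation), the snippet below tallies `exon` lines per `transcript_id`. The function name `exons_per_transcript` and the toy records are hypothetical, and it assumes GTF-style attributes of the form `transcript_id "..."`.

```python
import re
from collections import Counter

# Assumes GTF-style attributes such as: transcript_id "T1";
TX_ID = re.compile(r'transcript_id "([^"]+)"')


def exons_per_transcript(lines):
    """Count exon features per transcript (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "exon":
            continue  # only exon records contribute
        match = TX_ID.search(fields[8])
        if match:
            counts[match.group(1)] += 1
    return dict(counts)


def make_line(feature, start, end, tx):
    """Build a made-up GTF line for the toy example."""
    return "\t".join(["chr1", "toy", feature, str(start), str(end),
                      ".", "+", ".", f'gene_id "G1"; transcript_id "{tx}";'])


toy_lines = [
    make_line("transcript", 100, 349, "T1"),
    make_line("exon", 100, 199, "T1"),
    make_line("exon", 300, 349, "T1"),
    make_line("exon", 500, 599, "T2"),
]
exon_counts = exons_per_transcript(toy_lines)
```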

Returning the length (in bp) of the complementary DNA (cDNA) for each transcript (the sum of its exon lengths)

gtf.cdna_per_tx() 
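The arithmetic behind a per-transcript exon length can be sketched as follows. This is an assumption about the calculation, not the script's code: because GFF/GTF coordinates are 1-based and inclusive, each exon contributes `end - start + 1` bp. The helper `exon_length_per_transcript` and the toy records are hypothetical.

```python
import re
from collections import defaultdict

# Assumes GTF-style attributes such as: transcript_id "T1";
TX_ID = re.compile(r'transcript_id "([^"]+)"')


def exon_length_per_transcript(lines):
    """Sum exon lengths (bp) per transcript (hypothetical helper)."""
    lengths = defaultdict(int)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "exon":
            continue
        match = TX_ID.search(fields[8])
        if match:
            start, end = int(fields[3]), int(fields[4])
            # Coordinates are 1-based and inclusive, hence the +1.
            lengths[match.group(1)] += end - start + 1
    return dict(lengths)


def make_exon(start, end, tx):
    """Build a made-up exon line for the toy example."""
    return "\t".join(["chr1", "toy", "exon", str(start), str(end),
                      ".", "+", ".", f'gene_id "G1"; transcript_id "{tx}";'])


toy_exons = [
    make_exon(100, 199, "T1"),  # 100 bp
    make_exon(300, 349, "T1"),  # 50 bp
    make_exon(500, 599, "T2"),  # 100 bp
]
exon_lengths = exon_length_per_transcript(toy_exons)
```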

Getting the genome coverage (exons+introns) for every transcript in the file

gtf.tx_coverage() 
  • In each of the cases above, the output is a dictionary that maps each transcript to the requested value.
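One plausible reading of transcript coverage (exons+introns) is the genomic span from the first exon start to the last exon end. The sketch below works under that assumption, and the script may define coverage differently. The helper `span_per_transcript` and the toy records are hypothetical; the result is a dictionary keyed by transcript.

```python
import re

# Assumes GTF-style attributes such as: transcript_id "T1";
TX_ID = re.compile(r'transcript_id "([^"]+)"')


def span_per_transcript(lines):
    """Genomic span (bp) covered by each transcript's exons and introns."""
    bounds = {}  # transcript_id -> (min_start, max_end)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "exon":
            continue
        match = TX_ID.search(fields[8])
        if not match:
            continue
        tx = match.group(1)
        start, end = int(fields[3]), int(fields[4])
        lo, hi = bounds.get(tx, (start, end))
        bounds[tx] = (min(lo, start), max(hi, end))
    # 1-based inclusive coordinates: span = last end - first start + 1.
    return {tx: hi - lo + 1 for tx, (lo, hi) in bounds.items()}


def make_exon(start, end, tx):
    """Build a made-up exon line for the toy example."""
    return "\t".join(["chr1", "toy", "exon", str(start), str(end),
                      ".", "+", ".", f'gene_id "G1"; transcript_id "{tx}";'])


toy_exons = [
    make_exon(100, 199, "T1"),
    make_exon(300, 349, "T1"),  # intron 200-299 is included in the span
    make_exon(500, 599, "T2"),
]
spans = span_per_transcript(toy_exons)
```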
