GFF Output Report
To align QuagmiR outputs with the guidelines provided by the miRTop community, the reports now include an output in the mirGFF3 format.
Generating the GFF report requires miRBase21-master.tsv as a reference file.
- Column 1: seqID: precursor name
- Column 2: source: databases used (miRBase21, miRBase22)
- Column 3: type: based on Sequence Ontology guidelines, either ref_miRNA or isomiR (SO:0002166 for ref_miRNA and SO:0002167 for isomiR)
- Column 4: start: start position based on the precursor sequence
- Column 5: end: end position based on the precursor sequence
- Column 6: score: distance score calculated by QuagmiR based on the Levenshtein edit distance
- Column 7: strand: reads are mapped against the precursor sequence, so this is always +
- Column 8: phase: currently not relevant; reported as .
- Column 9: attributes:
  - UID: unique ID based on MINTplates sequences, structured as isomiRNA-length-unique_sequence_code (e.g. isomiRNA-22-RKREZNPN1)
  - Name: mature miRNA name
  - Parent: primary miRNA name
  - Variant: categorical types describing the gain and loss of nucleotides (adapted from isomiR-SEA)
    - iso_5p: gain/loss of nucleotides on the 5' end
    - iso_3p: gain/loss of nucleotides on the 3' end
  - Hits: number of matching locations in the database
  - Genomic: chr:start-end genomic position as indicated by QuagmiR (reference genome: GRCh38/hg38)
  - Expression: number of counts
  - Filter: PASS/REJECT (the current version only reports PASS sequences)
  - sequence: sequence of the read
  - number_of_paralogs: number of matching paralogs
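The column layout above can be read back with a few lines of Python. The following is a minimal sketch, not part of QuagmiR: the function name `parse_gff_line` and the dictionary keys are hypothetical, and it assumes the first eight columns contain no whitespace while the attributes are separated by semicolons, as in the example records of this report.

```python
def parse_gff_line(line):
    """Split one mirGFF3 record into its nine columns and an attribute dict."""
    # Columns 1-8 hold no whitespace, column 9 (attributes) may, so split
    # at most eight times on any run of whitespace (tabs or spaces).
    fields = line.strip().split(None, 8)
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    seqid, source, so_type, start, end, score, strand, phase, attr_text = fields

    attributes = {}
    for item in attr_text.split(";"):
        item = item.strip()
        if not item:
            continue
        # Split on the first "=" only, so values such as "iso_3p:-1" stay intact.
        key, _, value = item.partition("=")
        attributes[key] = value

    return {
        "seqid": seqid,
        "source": source,
        "type": so_type,
        "start": int(start),
        "end": int(end),
        "score": float(score),
        "strand": strand,
        "phase": phase,
        "attributes": attributes,
    }


example = (
    "hsa-mir-20a miRBase21 SO:0002167 7 29 1.0 + . "
    "UID=isomiRNA-22-U0XZH3PKO; Name=hsa-miR-20a-5p; Parent=hsa-mir-20a; "
    "Variant=iso_3p:-1; Hits=1; Genomic=chr13:91350972-91350994; "
    "Expression=27; Filter=Pass; sequence=TAAAGTGCTTATAGTGCAGGTA; "
    "number_of_paralogs=1"
)
record = parse_gff_line(example)
print(record["type"], record["attributes"]["Variant"], record["attributes"]["Expression"])
```

Attribute values are kept as strings, so counts such as Expression need an explicit `int(...)` before they are summed.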
## VERSION: 1.0
## source-ontology: miRBase v21 doi:10.25504/fairsharing.hmgte8
## COLDATA: data/sample.fastq
hsa-mir-20a miRBase21 SO:0002167 7 29 1.0 + . UID=isomiRNA-22-U0XZH3PKO; Name=hsa-miR-20a-5p; Parent=hsa-mir-20a; Variant=iso_3p:-1; Hits=1; Genomic=chr13:91350972-91350994; Expression=27; Filter=Pass; sequence=TAAAGTGCTTATAGTGCAGGTA; number_of_paralogs=1
hsa-mir-20a miRBase21 SO:0002166 7 30 0.0 + . UID=isomiRNA-23-U0XZH3PK0H; Name=hsa-miR-20a-5p; Parent=hsa-mir-20a; Variant=NA; Hits=1; Genomic=chr13:91350972-91350995; Expression=23; Filter=Pass; sequence=TAAAGTGCTTATAGTGCAGGTAG; number_of_paralogs=1
hsa-mir-20a miRBase21 SO:0002167 7 28 2.0 + . UID=isomiRNA-21-U0XZH3PKE; Name=hsa-miR-20a-5p; Parent=hsa-mir-20a; Variant=iso_3p:-2; Hits=1; Genomic=chr13:91350972-91350993; Expression=3; Filter=Pass; sequence=TAAAGTGCTTATAGTGCAGGT; number_of_paralogs=1
hsa-mir-20a miRBase21 SO:0002167 7 31 1.0 + . UID=isomiRNA-24-U0XZH3PK2X; Name=hsa-miR-20a-5p; Parent=hsa-mir-20a; Variant=iso_3p:+1; Hits=1; Genomic=chr13:91350972-91350996; Expression=3; Filter=Pass; sequence=TAAAGTGCTTATAGTGCAGGTAGA; number_of_paralogs=1
hsa-mir-20a miRBase21 SO:0002167 7 30 1.0 + . UID=isomiRNA-23-U0XZH3PK0I; Name=hsa-miR-20a-5p; Parent=hsa-mir-20a; Variant=iso_snp; Hits=1; Genomic=chr13:91350972-91350995; Expression=3; Filter=Pass; sequence=TAAAGTGCTTATAGTGCAGGTAT; number_of_paralogs=1
hsa-mir-20a miRBase21 SO:0002167 7 27 3.0 + . UID=isomiRNA-20-U0XZH3PK; Name=hsa-miR-20a-5p; Parent=hsa-mir-20a; Variant=iso_3p:-3; Hits=1; Genomic=chr13:91350972-91350992; Expression=1; Filter=Pass; sequence=TAAAGTGCTTATAGTGCAGG; number_of_paralogs=1
hsa-mir-20a miRBase21 SO:0002167 7 30 1.0 + . UID=isomiRNA-23-U0XZH3PK0F; Name=hsa-miR-20a-5p; Parent=hsa-mir-20a; Variant=iso_snp; Hits=1; Genomic=chr13:91350972-91350995; Expression=1; Filter=Pass; sequence=TAAAGTGCTTATAGTGCAGGTAA; number_of_paralogs=1
hsa-mir-20a miRBase21 SO:0002167 7 31 1.0 + . UID=isomiRNA-24-U0XZH3PK29; Name=hsa-miR-20a-5p; Parent=hsa-mir-20a; Variant=iso_3p:+1; Hits=1; Genomic=chr13:91350972-91350996; Expression=1; Filter=Pass; sequence=TAAAGTGCTTATAGTGCAGGTAGC; number_of_paralogs=1
hsa-mir-20a miRBase21 SO:0002167 7 31 1.0 + . UID=isomiRNA-24-U0XZH3PK2Z; Name=hsa-miR-20a-5p; Parent=hsa-mir-20a; Variant=iso_3p:+1; Hits=1; Genomic=chr13:91350972-91350996; Expression=1; Filter=Pass; sequence=TAAAGTGCTTATAGTGCAGGTAGT; number_of_paralogs=1
hsa-mir-20a miRBase21 SO:0002167 7 29 2.0 + . UID=isomiRNA-22-UKXZH3PKO; Name=hsa-miR-20a-5p; Parent=hsa-mir-20a; Variant=iso_3p:-1,iso_snp_seed; Hits=1; Genomic=chr13:91350972-91350994; Expression=1; Filter=Pass; sequence=TAAGGTGCTTATAGTGCAGGTA; number_of_paralogs=1
hsa-mir-20a miRBase21 SO:0002167 8 31 2.0 + . UID=isomiRNA-23-B3QXV4J3Z; Name=hsa-miR-20a-5p; Parent=hsa-mir-20a; Variant=iso_5p:-1,iso_3p:+1; Hits=1; Genomic=chr13:91350973-91350996; Expression=1; Filter=Pass; sequence=AAAGTGCTTATAGTGCAGGTAGT; number_of_paralogs=1
hsa-mir-19b-1 miRBase21 SO:0002166 53 76 0.0 + . UID=isomiRNA-23-9VBMJVBD0L; Name=hsa-miR-19b-3p-1-2; Parent=hsa-mir-19b-1; Variant=NA; Hits=2; Genomic=chr13:91351145-91351168; Expression=4; Filter=Pass; sequence=TGTGCAAATCCATGCAAAACTGA; number_of_paralogs=2
hsa-mir-19b-1 miRBase21 SO:0002167 53 75 1.0 + . UID=isomiRNA-22-9VBMJVBDP; Name=hsa-miR-19b-3p-1-2; Parent=hsa-mir-19b-1; Variant=iso_3p:-1; Hits=2; Genomic=chr13:91351145-91351167; Expression=2; Filter=Pass; sequence=TGTGCAAATCCATGCAAAACTG; number_of_paralogs=2
hsa-mir-92a-1 miRBase21 SO:0002166 47 69 0.0 + . UID=isomiRNA-22-VY2ZSR67N; Name=hsa-miR-92a-3p-1-2; Parent=hsa-mir-92a-1; Variant=NA; Hits=2; Genomic=chr13:91351261-91351283; Expression=7; Filter=Pass; sequence=TATTGCACTTGTCCCGGCCTGT; number_of_paralogs=2
hsa-mir-92a-1 miRBase21 SO:0002167 47 68 1.0 + . UID=isomiRNA-21-VY2ZSR670; Name=hsa-miR-92a-3p-1-2; Parent=hsa-mir-92a-1; Variant=iso_3p:-1; Hits=2; Genomic=chr13:91351261-91351282; Expression=2; Filter=Pass; sequence=TATTGCACTTGTCCCGGCCTG; number_of_paralogs=2
hsa-mir-92a-1 miRBase21 SO:0002167 47 70 1.0 + . UID=isomiRNA-23-VY2ZSR670B; Name=hsa-miR-92a-3p-1-2; Parent=hsa-mir-92a-1; Variant=iso_3p:+1; Hits=2; Genomic=chr13:91351261-91351284; Expression=1; Filter=Pass; sequence=TATTGCACTTGTCCCGGCCTGTA; number_of_paralogs=2
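To illustrate the distance score in column 6, the sketch below computes a plain Levenshtein edit distance between the ref_miRNA sequence of hsa-miR-20a-5p and three of its isomiRs from the records above. The function `levenshtein` is a textbook dynamic-programming version written for this example; that QuagmiR computes its score in exactly this way is an assumption, although the values agree with the scores reported for these rows.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    # previous[j] holds the distance between the current prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


# Sequences copied from the example records (Variant -> read sequence).
reference = "TAAAGTGCTTATAGTGCAGGTAG"  # ref_miRNA row, score 0.0
isomirs = {
    "iso_3p:-1": "TAAAGTGCTTATAGTGCAGGTA",
    "iso_snp": "TAAAGTGCTTATAGTGCAGGTAT",
    "iso_5p:-1,iso_3p:+1": "AAAGTGCTTATAGTGCAGGTAGT",
}
for variant, sequence in isomirs.items():
    print(variant, levenshtein(reference, sequence))
# iso_3p:-1 1
# iso_snp 1
# iso_5p:-1,iso_3p:+1 2
```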
The miRBase21-master.tsv reference file required for the GFF report has the following structure:
MIRNA PRI.ACCESSION PRIMIRNA PRI.SEQUENCE ACCESSION SEQUENCE PARALOGS MOTIF.13 N.MOTIF NON.N DUPLIMOTIF UNIQUE.MOTIF MOTIF.LEN SEED FAMILY STRAND CHROMOSOME X.COORDINATE Y.COORDINATE DIRECTION EXTENDED.SEQUENCE DUPLI.ID SECONDARY.STRUCTURE ENERGY
hsa-let-7a-2-3p MI0000061 hsa-let-7a-2 AGGTTGAGGTAGTAGGTTGTATAGTTTAGAATTACATCAAGGGAGATAACTGTACAGCCTCCTAGCTTTCCT MIMAT0010195 CTGTACAGCCTCCTAGCTTTCC 0 ACAGCCTCCTAGC NNANCCTCCTNNN 7 FALSE AGCCTCCTA 9 TGTACAG TRUE 3P chr11 122146422 122146693 - GCCCAAATAGGTGACAGCACGATGAATCATTATAAGACTAACTTGTAATTTCCCTGCTTAAGAAATGGTAGTTTTCCAGCCATTGTGACTGCATGCTCCCAGGTTGAGGTAGTAGGTTGTATAGTTTAGAATTACATCAAGGGAGATAACTGTACAGCCTCCTAGCTTTCCTTGGGTCTTGCACTAAACAACATGGTGAGAACGATCATGATTCCTCCAGGCCTTTTCTCCCTATGAAAGGTAAGATTGGGTACGATTATTTTATGGTATTT (((..(((.(((.(((((((((((((.........(((......)))))))))))))))).))).))).))) -25.2
hsa-let-7a-3p-1-2 MI0000060 hsa-let-7a-1 TGGGATGAGGTAGTAGGTTGTATAGTTTTAGGGTCACACCCACCACTGGGAGATAACTATACAATCTACTGTCTTTCCTA MIMAT0004481 CTATACAATCTACTGTCTTTC 2 TACAATCTACTGT NNNAATCNACNNN 6 FALSE AATCTAC 7 TATACAA TRUE 3P chr9 94175857 94176136 + TCACACAGGAAACCAGGATTACCGAGGAGGAAAAAAAGCCTTCCTGTGGTGCTCAACTGTGATTCCTTTTCACCATTCACCCTGGATGTTCTCTTCACTGTGGGATGAGGTAGTAGGTTGTATAGTTTTAGGGTCACACCCACCACTGGGAGATAACTATACAATCTACTGTCTTTCCTAACGTGATAGAAAAGTCTGCATCCAGGCGGTCTGATAGAAAGTCAGTTAACTAATTGTACAATATTTAAGATTAACTTGTCTTAAAGAGATGTAGTGCAGC (((((.(((((((((((((((((((((.....(((...((((....)))).))))))))))))))))))))))))))))) -34.2
hsa-let-7a-3p-1-2 MI0000062 hsa-let-7a-3 GGGTGAGGTAGTAGGTTGTATAGTTTGGGGCTCTGCCCTGCTATGGGATAACTATACAATCTACTGTCTTTCCT MIMAT0004481 CTATACAATCTACTGTCTTTC 2 TACAATCTACTGT NNNAATCNACNNN 6 FALSE AATCTAC 7 TATACAA TRUE 3P chr22 46112649 46112922 + TCGAGCCCCTGTTCTCCTCAGCCCTCTTTCCTCCCGCGTCCCCAGGAGGTGCCTCTGGAAGCCACGGAGTCCCATCGGCACCAAGACCGACTGCCCTTTGGGGTGAGGTAGTAGGTTGTATAGTTTGGGGCTCTGCCCTGCTATGGGATAACTATACAATCTACTGTCTTTCCTGAAGTGGCTGTAATATCTGCGGTGGACAGAGCGTCTGGAACCCTGGCTGGGAGCGGGCAGGGCCAGGTTTGGGGGCAGCCTTGGCAGCAGTCGGGGGCAG (((((.(((((((((((((((((((((.....(((...((((....)))).))))))))))))))))))))))))))))) -34.2
hsa-let-7b-3p MI0000063 hsa-let-7b CGGGGTGAGGTAGTAGGTTGTGTGGTTTCAGGGCAGTGATGTTGCCCCTCGGAAGATAACTATACAACCTACTGCCTTCCCTG MIMAT0004482 CTATACAACCTACTGCCTTCCC 0 ACAACCTACTGCC NNNACCTACTNNN 7 FALSE ACCTACT 7 TATACAA TRUE 3P chr22 46113586 46113868 + CCTGCCCAGCCCTCCTGCTCTGGTGACTGAGGACCGCCAGGCAGGGGCTGGTGCTGGGCGGGGGGCGGCGGGCCCTCCCGCAGTGCAAGGCCGGGCCTGGCGGGGTGAGGTAGTAGGTTGTGTGGTTTCAGGGCAGTGATGTTGCCCCTCGGAAGATAACTATACAACCTACTGCCTTCCCTGAGGAGCCCAGTGACACGACCCCATGGGAGGGCCGCCCCCTACCTCAGTGACACGACCCCACGGGAGGGCTGCCCCCCACCTCAGTGACCTGCAGGGGGCC (((((.(((((((((((((((((((((((.((((((.....))))))...))).....))))))))))))))))))))))))) -46.7
hsa-let-7c-3p MI0000064 hsa-let-7c GCATCCGGGTTGAGGTAGTAGGTTGTATGGTTTAGAGTTACACCCTGGGAGTTAACTGTACAACCTTCTAGCTTTCCTTGGAGC MIMAT0026472 CTGTACAACCTTCTAGCTTTCC 0 ACAACCTTCTAGC NNNANCTTNTANN 6 FALSE ACCTTCTA 8 TGTACAA FALSE 3P chr21 16539728 16540011 + TATCTATATCCTTGCCAAGCCCTTAGGTGTATGGCTGCCATATTTGGAGGAGCTGACTGAAGATATGATAAGGAGTTTGAAGCAACATTGGAAGCTGTGTGCATCCGGGTTGAGGTAGTAGGTTGTATGGTTTAGAGTTACACCCTGGGAGTTAACTGTACAACCTTCTAGCTTTCCTTGGAGCACACTTGAGCCGTCGAGGAATTCTTCATCACTTTAACCTGATTGAGCCAATTTGTGTGCAAGAAGGTAATGTGTCATGAGTATCTTGGATCATTGATTTG ((.((((((..(((.(((.(((((((((((((..((.(..((...))..).))))))))))))))).))).)))..)))))))) -31.6
hsa-let-7d-3p MI0000065 hsa-let-7d CCTAGGAAGAGGTAGTAGGTTGCATAGTTTTAGGGCAGGGATTTTGCCCACAAGGAGGTAACTATACGACCTGCTGCCTTTCTTAGG MIMAT0004484 CTATACGACCTGCTGCCTTTCT 0 ACGACCTGCTGCC NNGANCTGCTNNN 7 FALSE GACCTGCTG 9 TATACGA FALSE 3P chr9 94178734 94179020 + TTGAATTAGAAACAAAACTCAAAGAACATGACCTAATTTAACAGGTTAATTTGAAGTGCATCTGCCAAGTAGAAGACCAGCAAGAAAAAAAAAATGGGTTCCTAGGAAGAGGTAGTAGGTTGCATAGTTTTAGGGCAGGGATTTTGCCCACAAGGAGGTAACTATACGACCTGCTGCCTTTCTTAGGGCCTTATTATTCACCGATAACCTGTTTCCTTGCTACTTTGCTTTGGTGTAAGCAGAGTTCTTTCTGTAGGTTTTTTCAAATGAAAACATTGCAAGAATAT (((((((.((((((((((((((.((((((...((((((.....))))))..........)))))).))))))))))))))))))))) -42.7
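As a rough illustration of how such a reference table could be used, the sketch below loads it with Python's `csv` module and locates a mature sequence inside its precursor. Several things here are assumptions: that the file is tab-separated (suggested only by the .tsv extension), the helper name `load_reference`, and the trimmed in-memory sample, which keeps just the first seven columns of the hsa-let-7a-2-3p row shown above. In the example GFF records, end minus start equals the read length (7 and 30 for a 23-nt sequence), which is consistent with a 0-based, end-exclusive offset; that reading is an inference from the data, not something the report states.

```python
import csv
import io

# Hypothetical trimmed sample: the first seven header names and the matching
# values of the hsa-let-7a-2-3p row, joined with tabs.
header = ["MIRNA", "PRI.ACCESSION", "PRIMIRNA", "PRI.SEQUENCE",
          "ACCESSION", "SEQUENCE", "PARALOGS"]
values = [
    "hsa-let-7a-2-3p", "MI0000061", "hsa-let-7a-2",
    "AGGTTGAGGTAGTAGGTTGTATAGTTTAGAATTACATCAAGGGAGATAACTGTACAGCCTCCTAGCTTTCCT",
    "MIMAT0010195", "CTGTACAGCCTCCTAGCTTTCC", "0",
]
sample = "\t".join(header) + "\n" + "\t".join(values) + "\n"


def load_reference(handle):
    """Index reference rows by (precursor name, mature miRNA name)."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {(row["PRIMIRNA"], row["MIRNA"]): row for row in reader}


reference = load_reference(io.StringIO(sample))
row = reference[("hsa-let-7a-2", "hsa-let-7a-2-3p")]

# Offset of the mature sequence within the precursor (0-based, end-exclusive).
start = row["PRI.SEQUENCE"].find(row["SEQUENCE"])
end = start + len(row["SEQUENCE"])
print(row["MIRNA"], start, end)
```

For the real file, `load_reference(open("miRBase21-master.tsv", newline=""))` would be the equivalent call, assuming the file sits in the working directory.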