Low quantity of RNA editings found #37

Open
s-a-nersisyan opened this issue Jul 7, 2017 · 2 comments

@s-a-nersisyan

Hi, Qing,

I'm using GIREMI to find RNA editing (RNAE) sites in my RNA-seq data. I have a well-aligned file, src.bam; the reads are paired-end (and properly paired). I perform the following steps:

  1. Sort the BAM in coordinate order and index it with samtools sort and samtools index. Output file: sorted.bam
  2. Variant calling: samtools mpileup -ugf genome.fa sorted.bam -q 1 | bcftools view -bvcg - > raw.bcf
  3. Filter the results:
    bcftools view raw.bcf | vcfutils.pl varFilter -D100 | awk '$6>=10' | grep "0/1:" > filtered.vcf
    Here grep "0/1:" keeps only heterozygous calls, which removes homozygous SNVs.
  4. Annotate filtered.vcf and generate the SNV list file (dbSNP status, gene name, strand, etc.) with snpEff and my own simple Python script.
  5. Run GIREMI: ./giremi -f genome.fa -l SNV_list.txt -o result sorted.bam
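
For readers who want to see exactly what the filter in step 3 keeps, here is a minimal Python sketch of the awk '$6>=10' | grep "0/1:" part. It is only an illustration, not the script used above: the function names is_het_call and filter_vcf are hypothetical, it assumes a tab-separated single-sample VCF with GT as the first FORMAT key, and it checks the genotype field of the sample column rather than matching "0/1:" anywhere in the line as grep does.

```python
def is_het_call(vcf_line, min_qual=10.0):
    """Return True for a VCF data line with QUAL >= min_qual and genotype 0/1.

    Assumption: single-sample VCF, tab-separated, GT is the first FORMAT key.
    """
    if vcf_line.startswith("#"):
        return False  # header lines are never kept
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return False  # no sample column, so no genotype to inspect
    try:
        qual = float(fields[5])  # column 6 is QUAL (awk's $6)
    except ValueError:
        return False  # QUAL is "." or otherwise non-numeric
    genotype = fields[9].split(":")[0]
    return qual >= min_qual and genotype == "0/1"


def filter_vcf(lines, min_qual=10.0):
    """Keep only heterozygous calls that pass the quality threshold."""
    return [line for line in lines if is_het_call(line, min_qual)]


if __name__ == "__main__":
    # Synthetic example records, not real data.
    example = [
        "##fileformat=VCFv4.1\n",
        "chr1\t100\t.\tA\tG\t25.0\t.\tDP=30\tGT:PL:GQ\t0/1:50,0,60:53\n",
        "chr1\t200\t.\tA\tG\t40.0\t.\tDP=30\tGT:PL:GQ\t1/1:80,20,0:40\n",
        "chr1\t300\t.\tT\tC\t5.0\t.\tDP=30\tGT:PL:GQ\t0/1:20,0,30:18\n",
    ]
    for kept in filter_vcf(example):
        print(kept, end="")
```

In this toy input only the record at position 100 passes: the record at position 200 is homozygous (1/1) and the record at position 300 has QUAL below 10.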

My problem is that GIREMI generates a .res file with 56616 lines, and only 3654 of them are marked as RNA editing sites. I get meanMI:0.669357 sdMI:0.143034, which seems to be normal (judging from previous issues). I think this is a very low number of RNA editing sites (my BAM file is 15 GB), and they generally don't match the results of other RNA editing detection tools. Maybe I'm doing something wrong or missing an important step? Looking forward to your answer, and thank you very much!
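
A count like the one above can be reproduced with a short tally over the .res file. The following Python sketch is a rough illustration under stated assumptions, not a documented GIREMI interface: it assumes the file is whitespace-separated and that the last column is an integer flag marking whether a site was called as RNA editing (0 for not called, nonzero for called). The column layout should be checked against the GIREMI README, and the function name count_rnae_flags is hypothetical.

```python
from collections import Counter


def count_rnae_flags(res_lines):
    """Tally the integer flag found in the last column of each line.

    Assumption: the last whitespace-separated field is the RNA editing flag.
    Lines whose last field is not an integer (e.g. a header) are skipped.
    """
    counts = Counter()
    for line in res_lines:
        fields = line.split()
        if not fields:
            continue  # blank line
        flag = fields[-1]
        if flag.isdigit():
            counts[int(flag)] += 1
    return counts


if __name__ == "__main__":
    # Synthetic toy rows with a made-up, shortened column layout.
    toy = [
        "chr coordinate strand ifRNAE",
        "chr1 1000 + 0",
        "chr1 2000 + 1",
        "chr2 3000 - 2",
        "chr2 4000 - 0",
    ]
    counts = count_rnae_flags(toy)
    called = sum(n for flag, n in counts.items() if flag != 0)
    print(dict(counts), "called:", called)
```

To run it on real output, pass an open file handle instead of the toy list. Any iterable of lines works.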

@zhqingit
Owner

zhqingit commented Jul 10, 2017 via email

@s-a-nersisyan
Author

Hello Qing,
Sorry for the late reply. I re-tested the whole procedure on the https://www.encodeproject.org/experiments/ENCSR000COR/ dataset, and it found 8228 RNA editing sites (5114 of them were found by MI). I think this is a low number of RNAE sites and that it should find many more. Please find the SNV list and .res file in the attachment. Thank you!
ENCFF836AMM.zip
