A problem in using hg19_snp138.txt #199

Open
SYSUMSD opened this issue Jul 24, 2022 · 2 comments

Comments

@SYSUMSD

SYSUMSD commented Jul 24, 2022

I am using ANNOVAR to annotate human SNPs.

After using convert2annovar.pl to convert a list of SNP names into the format that table_annovar.pl requires, I found that rs12990866 maps to 9 locations on different chromosomes.

So I checked hg19_snp138.txt and found something strange.

rs12990866 appears on 8 lines in hg19_snp138.txt, so this SNP ID seems to correspond to 8 locations in snp138:

1239 chr14 85836141 85836142 rs12990866 0 - A A C/T genomic single unknown 0 0 unknown exact 3 MultipleAlignmentsABI,BCMHGSC_JDW,SSAHASNP, 0
645 chr18 7866051 7866052 rs12990866 0 - T T C/T genomic single unknown 0 0 unknown exact 3 ObservedMismatch,MultipleAlignmentABI,BCMHGSC_JDW,SSAHASNP, 0
1738 chr3 151204911 151204912 rs12990866 0 + T T C/T genomic single unknown 0 0 unknown exact 3 MultipleAlignmentsABI,BCMHGSC_JDW,SSAHASNP, 0
998 chr6 54195546 54195547 rs12990866 0 + C C C/T genomic single unknown 0 0 unknown exact 3 MultipleAlignmentsABI,BCMHGSC_JDW,SSAHASNP, 0
1490 chr6 118745488 118745489 rs12990866 0 + T T C/T genomic single unknown 0 0 unknown exact 3 MultipleAlignmentsABI,BCMHGSC_JDW,SSAHASNP, 0
947 chr8 47522774 47522775 rs12990866 0 + T T C/T genomic single unknown 0 0 unknown exact 3 MultipleAlignmentsABI,BCMHGSC_JDW,SSAHASNP, 0
1510 chr9 121260627 121260628 rs12990866 0 - A A C/T genomic single unknown 0 0 unknown exact 3 MultipleAlignmentsABI,BCMHGSC_JDW,SSAHASNP, 0
1639 chrX 138258432 138258433 rs12990866 0 - G G C/T genomic single unknown 0 0 unknown exact 3 MultipleAlignmentsABI,BCMHGSC_JDW,SSAHASNP, 0

But according to NCBI, rs12990866 is located at chr2:209372299 (https://www.ncbi.nlm.nih.gov/snp/rs12990866).

Are there errors in hg19_snp138.txt?
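
For anyone who wants to check how many positions a given rsID has in a snp138-style file, here is a minimal Python sketch. It assumes whitespace-separated rows whose first five columns are laid out as in the lines quoted above (an index number, chromosome, start, end, rsID). The function names are hypothetical and are not part of ANNOVAR.

```python
from collections import defaultdict


def positions_by_rsid(lines):
    """Group (chrom, start, end) tuples by rsID.

    Assumes each row starts with: index, chrom, start, end, rsID,
    as in the snp138 rows quoted in this issue.
    """
    hits = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated rows
        chrom, start, end, name = fields[1], int(fields[2]), int(fields[3]), fields[4]
        hits[name].append((chrom, start, end))
    return hits


def multi_mapped(lines):
    """Return only the rsIDs that appear at more than one position."""
    return {rsid: locs for rsid, locs in positions_by_rsid(lines).items() if len(locs) > 1}


if __name__ == "__main__":
    sample = [
        "1239 chr14 85836141 85836142 rs12990866 0 - A A C/T genomic single",
        "645 chr18 7866051 7866052 rs12990866 0 - T T C/T genomic single",
    ]
    for rsid, locs in multi_mapped(sample).items():
        print(rsid, len(locs), locs)
```

To scan the real file, an open file handle can be passed instead of the sample list, for example `multi_mapped(open("hg19_snp138.txt"))`. Note that this holds every rsID in memory, which may be heavy for the full table.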

@kaichop
Contributor

kaichop commented Jul 27, 2022 via email

@SYSUMSD
Author

SYSUMSD commented Jul 27, 2022

@kaichop I think it's a good idea.
But I ran into some problems.

  1. The avinput file seems to need the start and end positions of the reference allele. In my data, though, I can't tell which allele is the reference, so I can't calculate the start and end positions. Can I use allele 1 as the reference allele?
  2. In your ex1.avinput, rs2066847 looks like this:
    50763778 50763778 - C comments: rs2066847 (c.3016_3017insC), a frameshift SNP in NOD2
    But in my data, this SNP looks like this:
    16 rs2066847 50763778 G GC
    For indels, my data uses a different format for allele1 and allele2.

What could these differences affect?
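
To make the difference between the two indel notations concrete, here is a rough Python sketch that turns an anchored pair such as `G` / `GC` into the dash-style coordinates seen in the ex1.avinput row. It assumes that allele 1 is the reference allele and that the position refers to the first base of allele 1. Both are assumptions that would need to be checked against the actual data. The function `to_avinput` is a hypothetical helper, not an ANNOVAR function.

```python
def to_avinput(chrom, pos, ref, alt):
    """Convert an anchored (pos, ref, alt) record to (chrom, start, end, ref, alt).

    Assumption: `ref` is the reference allele and `pos` is the 1-based
    position of its first base. Shared leading bases are trimmed, and an
    empty allele is written as "-".
    """
    if ref == alt:
        raise ValueError("ref and alt are identical; nothing to convert")
    # Trim the bases that both alleles share at the left end.
    while ref and alt and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    if not ref:
        # Insertion: start and end both point at the base before the inserted sequence.
        return (chrom, pos - 1, pos - 1, "-", alt)
    # Substitution or deletion: the span covers the remaining reference bases.
    end = pos + len(ref) - 1
    return (chrom, pos, end, ref, alt if alt else "-")


if __name__ == "__main__":
    # The rs2066847 record from this comment.
    print(*to_avinput("16", 50763778, "G", "GC"), sep="\t")
```

Under these assumptions, the `16 rs2066847 50763778 G GC` record comes out with the same start, end, `-` and `C` values as the ex1.avinput row quoted above. If allele 1 is not actually the reference, the result would describe a different variant (a deletion instead of an insertion, for example), so the allele order needs to be confirmed first.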
