A problem in using hg19_snp138.txt #199
Comments
rs identifiers are compiled by dbSNP. If the same sequence context occurs
multiple times in the reference genome, UCSC will map a single rs ID to
multiple positions. Nothing is wrong with the file; it is just a bad SNP.
hg19_snp138 is actually generated by UCSC, not ANNOVAR.
I suggest not using rs IDs or dbSNP in any genomic data analysis. Use
chr:start-end coordinates instead whenever possible.
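A minimal sketch of what keying variants by position rather than by rs ID could look like. It assumes the rows follow the UCSC column order (bin, chrom, chromStart, chromEnd, name, ...) and that chromStart is 0-based while chromEnd is exclusive, as is usual for UCSC tables; `locus_key` and `row_to_key` are hypothetical helper names, not part of ANNOVAR.

```python
from typing import Tuple


def locus_key(chrom: str, chrom_start: int, chrom_end: int) -> str:
    """Build a 1-based, inclusive chr:start-end key.

    Assumption: inputs are 0-based, half-open coordinates (UCSC style).
    """
    if chrom_end <= chrom_start:
        # Zero-length features (e.g. insertions) are not handled in this sketch.
        raise ValueError("zero-length or inverted interval")
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"


def row_to_key(line: str) -> Tuple[str, str]:
    """Return (rs_id, locus_key) for one whitespace-separated row."""
    fields = line.split()
    # Assumed column order: bin, chrom, chromStart, chromEnd, name, ...
    chrom, start, end, rs_id = fields[1], int(fields[2]), int(fields[3]), fields[4]
    return rs_id, locus_key(chrom, start, end)


if __name__ == "__main__":
    row = "1239 chr14 85836141 85836142 rs12990866 0 - A A C/T genomic single unknown"
    print(row_to_key(row))
```

With a key like this, two records are treated as the same variant only when they share a chromosome and coordinates, regardless of how many places an rs ID has been mapped to.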
On Sun, Jul 24, 2022 at 8:59 AM SIDI MA ***@***.***> wrote:
I have used ANNOVAR to annotate human SNPs.
After using convert2annovar.pl to convert a list of SNP names to the format that table_annovar.pl requires, I found that rs12990866 has 9 locations on different chromosomes.
So I checked hg19_snp138.txt and found something strange.
rs12990866 corresponds to 8 lines in hg19_snp138.txt. It looks like this SNP ID corresponds to 8 locations in snp138:
1239 chr14 85836141 85836142 rs12990866 0 - A A C/T genomic single unknown 0 0 unknown exact 3 MultipleAlignmentsABI,BCMHGSC_JDW,SSAHASNP, 0
645 chr18 7866051 7866052 rs12990866 0 - T T C/T genomic single unknown 0 0 unknown exact 3 ObservedMismatch,MultipleAlignmentABI,BCMHGSC_JDW,SSAHASNP, 0
1738 chr3 151204911 151204912 rs12990866 0 + T T C/T genomic single unknown 0 0 unknown exact 3 MultipleAlignmentsABI,BCMHGSC_JDW,SSAHASNP, 0
998 chr6 54195546 54195547 rs12990866 0 + C C C/T genomic single unknown 0 0 unknown exact 3 MultipleAlignmentsABI,BCMHGSC_JDW,SSAHASNP, 0
1490 chr6 118745488 118745489 rs12990866 0 + T T C/T genomic single unknown 0 0 unknown exact 3 MultipleAlignmentsABI,BCMHGSC_JDW,SSAHASNP, 0
947 chr8 47522774 47522775 rs12990866 0 + T T C/T genomic single unknown 0 0 unknown exact 3 MultipleAlignmentsABI,BCMHGSC_JDW,SSAHASNP, 0
1510 chr9 121260627 121260628 rs12990866 0 - A A C/T genomic single unknown 0 0 unknown exact 3 MultipleAlignmentsABI,BCMHGSC_JDW,SSAHASNP, 0
1639 chrX 138258432 138258433 rs12990866 0 - G G C/T genomic single unknown 0 0 unknown exact 3 MultipleAlignmentsABI,BCMHGSC_JDW,SSAHASNP, 0
But in NCBI, rs12990866 is located at chr2:209372299 (https://www.ncbi.nlm.nih.gov/snp/rs12990866).
Are there errors in hg19_snp138.txt?
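The check described in the question (how many locations a given rs ID has in the file) can be sketched as a small script. This is only an illustration: it assumes whitespace-separated rows with the rs ID in the fifth column, and `loci_by_rsid` and `multi_mapped` are hypothetical helper names. The demo rows are the eight rows quoted in the question, truncated to their first five columns.

```python
from collections import defaultdict


def loci_by_rsid(lines):
    """Map each rs ID to the set of (chrom, chromStart, chromEnd) it appears at."""
    loci = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        # Assumed column order: bin, chrom, chromStart, chromEnd, name, ...
        loci[fields[4]].add((fields[1], int(fields[2]), int(fields[3])))
    return loci


def multi_mapped(lines):
    """Return only the rs IDs that appear at more than one locus."""
    return {rs: sorted(pos) for rs, pos in loci_by_rsid(lines).items() if len(pos) > 1}


if __name__ == "__main__":
    rows = [
        "1239 chr14 85836141 85836142 rs12990866",
        "645 chr18 7866051 7866052 rs12990866",
        "1738 chr3 151204911 151204912 rs12990866",
        "998 chr6 54195546 54195547 rs12990866",
        "1490 chr6 118745488 118745489 rs12990866",
        "947 chr8 47522774 47522775 rs12990866",
        "1510 chr9 121260627 121260628 rs12990866",
        "1639 chrX 138258432 138258433 rs12990866",
    ]
    for rs_id, positions in multi_mapped(rows).items():
        print(rs_id, len(positions), "loci")
```

To scan the real file, the same functions could be given an open file handle for hg19_snp138.txt instead of the `rows` list, since they only iterate over lines.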
@kaichop I think it's a good idea.
What can these differences affect?