Utilize NCBI's Variation Services API to populate missing RSIDs.
An idea for how this could be performed: for all SNPs in a dataset, look up RSIDs in batches of 50k SNPs (1 request / second) using the VCF endpoint. Then, use the resulting information to populate any missing RSIDs. However, note that this method would require the REF allele and a valid ALT allele.
As an example, one row for this query (using GRCh38) could be constructed as follows:
1 817186 . G A,C,G,T
Query:
curl -X POST "https://api.ncbi.nlm.nih.gov/variation/v0/vcf/file/set_rsids?assembly=GCF_000001405.38" -H "accept: text/plain; charset=utf-8" -H "Content-Type: text/plain; charset=utf-8" -d "1\t817186\t.\tG\tA,C,G,T"
Which would return:
1 817186 rs3094315 G A,C,G,T
This could also be used to verify SNP positions and help with #6.
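The batching idea above could be sketched roughly as follows in Python. This is only an illustration under several assumptions: the helper names (`make_vcf_row`, `batches`, `parse_rsids`, `post_batch`, `lookup_rsids`) are hypothetical, the endpoint is assumed to accept several newline-separated rows in one request body, and the response is assumed to echo each row with the ID column filled in, as in the single-row example. The URL and headers are copied from the curl command; the batch size and delay come from the 50k / 1 request per second figures.

```python
import time
import urllib.request

# Endpoint and assembly accession copied from the curl example in this issue.
SET_RSIDS_URL = (
    "https://api.ncbi.nlm.nih.gov/variation/v0/vcf/file/set_rsids"
    "?assembly=GCF_000001405.38"
)


def make_vcf_row(chrom, pos, ref, alts):
    """Build one tab-separated row: CHROM, POS, ID ('.' = unknown), REF, ALT."""
    return "\t".join([str(chrom), str(pos), ".", ref, ",".join(alts)])


def batches(rows, size=50_000):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def parse_rsids(vcf_text):
    """Map (chrom, pos) -> ID for each returned row whose ID starts with 'rs'.

    Assumes the response has the same column layout as the request rows.
    """
    found = {}
    for line in vcf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and any VCF header lines
        fields = line.split()
        if len(fields) >= 3 and fields[2].startswith("rs"):
            found[(fields[0], int(fields[1]))] = fields[2]
    return found


def post_batch(rows):
    """POST one batch of rows and return the response body as text."""
    request = urllib.request.Request(
        SET_RSIDS_URL,
        data="\n".join(rows).encode("utf-8"),
        headers={
            "accept": "text/plain; charset=utf-8",
            "Content-Type": "text/plain; charset=utf-8",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def lookup_rsids(rows, size=50_000, delay=1.0, send=post_batch):
    """Send rows in batches, sleeping `delay` seconds between requests."""
    found = {}
    for i, batch in enumerate(batches(rows, size)):
        if i:
            time.sleep(delay)  # stay at roughly one request per second
        found.update(parse_rsids(send(batch)))
    return found
```

Here `make_vcf_row(1, 817186, "G", ["A", "C", "G", "T"])` builds the example row shown earlier, and the dictionary returned by `lookup_rsids` could then be used to fill in RSIDs for SNPs that lack one. The `send` parameter exists only so the network call can be swapped out when testing.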
Thanks to @gedankenstuecke for helping develop the idea!