SNP Info Extractor

A simple Python script that looks up SNPs by rsID on NCBI SNP pages and saves basic information about each one to an Excel file.
For each rsID it retrieves the chromosome, the GRCh37 and GRCh38 positions, the alleles, and the gene consequence.

Features

Input: a comma-separated list of rsIDs (e.g. rs429358, rs7412)
Output: Excel file with columns: rsID, Chromosome, GRCh37 Position, GRCh38 Position, Alleles, Gene Consequence
Ideal for small-scale SNP lookups.
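
As an illustration of the input format described above, here is a minimal sketch of how a comma-separated rsID string could be turned into a clean list. The helper name parse_rsids and its validation rules (lowercasing, rejecting anything that is not "rs" followed by digits, dropping duplicates) are assumptions for this example, not necessarily what dbSNP_dataminer.py does.

```python
import re


def parse_rsids(text):
    """Split a comma-separated string into a list of unique, lowercased rsIDs.

    Hypothetical helper for illustration; the real script may parse its
    input differently.
    """
    rsids = []
    for token in text.split(","):
        token = token.strip().lower()
        if not token:
            continue  # ignore empty entries such as "rs1,,rs2"
        if not re.fullmatch(r"rs\d+", token):
            raise ValueError(f"not a valid rsID: {token!r}")
        if token not in rsids:
            rsids.append(token)  # keep first occurrence, preserve order
    return rsids


print(parse_rsids("rs429358, rs7412"))
```

Validating up front means a typo in one ID is reported before any web requests are made.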

Requirements

Python 3.8+
Python packages:

pip install requests beautifulsoup4 pandas openpyxl lxml

Output Columns

rsID — SNP identifier
Chromosome — Chromosome number (if available)
GRCh37 Position — Numeric position on GRCh37
GRCh38 Position — Numeric position on GRCh38
Alleles — Allele string
Gene Consequence — Consequence or None if missing
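
The position columns hold plain numbers, so a placement string has to be reduced to its coordinate at some point. The sketch below assumes the placement is written in HGVS genomic notation (for example NC_000019.10:g.44908684T>C); that format, the helper name position_from_hgvs, and the sample string are assumptions for illustration only.

```python
import re


def position_from_hgvs(placement):
    """Return the numeric coordinate from an HGVS-style genomic string.

    Assumes the form '<accession>:g.<position><change>'. Returns None when
    the string is missing or does not match, so the cell can stay blank.
    """
    if not placement:
        return None
    match = re.search(r":g\.(\d+)", placement)
    return int(match.group(1)) if match else None


# Illustrative input, hand-written for this example.
print(position_from_hgvs("NC_000019.10:g.44908684T>C"))
```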

How it Works

For each rsID, fetches the NCBI SNP web page at https://www.ncbi.nlm.nih.gov/snp/{rsid}.
Uses BeautifulSoup to parse the HTML and extract:
- the Position, Alleles, and Gene : Consequence fields
- the placement table (genomics_placements_table), which supplies the GRCh37/GRCh38 coordinates
Combines all results in a pandas DataFrame and writes it to Excel.
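
The parsing step can be sketched roughly as follows. This is not the script's actual code: the function names are hypothetical, and the HTML fragment is hand-written to imitate an assumed page layout (label/value pairs in dt/dd elements, plus a table with id genomics_placements_table). The live NCBI markup may differ and can change without notice. The sketch uses the built-in html.parser so it runs without lxml.

```python
from bs4 import BeautifulSoup

# Hand-written fragment imitating the ASSUMED page layout; values are
# illustrative and were not copied from NCBI.
SAMPLE_HTML = """
<dl>
  <dt>Alleles</dt><dd>T&gt;C</dd>
  <dt>Gene : Consequence</dt><dd>APOE : Missense Variant</dd>
</dl>
<table id="genomics_placements_table">
  <tr><th>Sequence name</th><th>Change</th></tr>
  <tr><td>GRCh37.p13 chr 19</td><td>NC_000019.9:g.45411941T&gt;C</td></tr>
  <tr><td>GRCh38.p14 chr 19</td><td>NC_000019.10:g.44908684T&gt;C</td></tr>
</table>
"""


def snp_url(rsid):
    """Build the page URL for one rsID (URL pattern taken from this README)."""
    return f"https://www.ncbi.nlm.nih.gov/snp/{rsid}"


def parse_snp_page(html):
    """Extract a few fields from page HTML; missing fields come back as None."""
    soup = BeautifulSoup(html, "html.parser")

    # Collect label/value pairs from <dt>/<dd> elements (assumed layout).
    fields = {}
    for dt in soup.find_all("dt"):
        dd = dt.find_next_sibling("dd")
        if dd is not None:
            fields[dt.get_text(strip=True)] = dd.get_text(" ", strip=True)

    # Read the placement table and keep one row per genome build.
    placements = {}
    table = soup.find("table", id="genomics_placements_table")
    if table is not None:
        for tr in table.find_all("tr"):
            cells = [td.get_text(" ", strip=True) for td in tr.find_all("td")]
            if len(cells) >= 2:  # header rows have no <td> and are skipped
                for build in ("GRCh37", "GRCh38"):
                    if cells[0].startswith(build):
                        placements[build] = cells[1]

    return {
        "Alleles": fields.get("Alleles"),
        "Gene Consequence": fields.get("Gene : Consequence"),
        "GRCh37 Placement": placements.get("GRCh37"),
        "GRCh38 Placement": placements.get("GRCh38"),
    }


print(snp_url("rs429358"))
print(parse_snp_page(SAMPLE_HTML))
```

In a full run, the HTML would come from something like requests.get(snp_url(rsid), timeout=30).text, and a list of such row dictionaries could be passed to pandas.DataFrame(rows).to_excel("output.xlsx", index=False); the output file name here is only a placeholder.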

Important Notes

The script scrapes public NCBI pages. For large-scale queries, use the NCBI Entrez E-utilities instead.
Some fields may be missing for certain SNPs; these will appear blank in Excel.
An internet connection is required.
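
For readers who outgrow scraping, the sketch below shows how a request to the E-utilities ESummary endpoint for the snp database could be built. It only constructs the URL and does not send it. The helper name build_esummary_url is hypothetical, and this approach is not part of the script itself. Note that the snp database expects numeric IDs, so the "rs" prefix is stripped.

```python
from urllib.parse import urlencode

ESUMMARY_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def build_esummary_url(rsids):
    """Build one ESummary request URL covering several rsIDs.

    Hypothetical helper for illustration. IDs are sent as a comma-separated
    list of numbers, with any leading 'rs' removed.
    """
    numeric_ids = [rsid.strip().lower().removeprefix("rs") for rsid in rsids]
    query = {
        "db": "snp",
        "id": ",".join(numeric_ids),
        "retmode": "json",
    }
    # safe="," keeps the commas readable instead of encoding them as %2C.
    return f"{ESUMMARY_URL}?{urlencode(query, safe=',')}"


print(build_esummary_url(["rs429358", "rs7412"]))
```

Batching many IDs into one request like this is the main advantage over fetching one web page per SNP. The str.removeprefix call requires Python 3.9 or later.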

License

This project is licensed under the MIT License. See LICENSE for details.

Contact / Author

Created by Rahul Madhav

Scriptococcus/dbSNP_Dataminer
