# Instructions

List your SNPs in a plain-text input file, one rsID per line, and set `my_snps_file` to its name. Then run the entire notebook; the ID mapping file will be written to your working directory.
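
As a minimal sketch of the expected input format, the snippet below writes a small example file and reads it back the same way the notebook does. The file name `example_rsids.txt` is hypothetical, the rsIDs are illustrative placeholders, and the `^rs[0-9]+$` check is an assumption about what a well-formed rsID looks like, not something the notebook enforces.

```r
# Hypothetical example file; the notebook reads whatever my_snps_file points to.
example_file <- file.path(tempdir(), "example_rsids.txt")
writeLines(c("rs699", "rs4680", "rs1801133"), example_file)

# Read it the same way the notebook does: no header, a single column (V1).
example_table <- read.table(example_file, header = FALSE, stringsAsFactors = FALSE)
example_snps <- example_table[["V1"]]

# Assumed sanity check: every line is "rs" followed by digits.
looks_valid <- grepl("^rs[0-9]+$", example_snps)
example_snps[!looks_valid]  # lines that would need fixing before running the notebook
```

Running a check like this before the query may help catch stray headers or blank lines in the input.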

## Example

```
my_snps_file = "160427-humsavar_r3_rsids_input.txt"
my_selected_outputs = c('refsnp_id',
                        'refsnp_source',
                        'allele',
                        'minor_allele',
                        'minor_allele_freq',
                        'ensembl_gene_stable_id',
                        'ensembl_transcript_stable_id',
                        'polyphen_prediction',
                        'polyphen_score',
                        'sift_prediction',
                        'sift_score'
                        )
```

## Other useful outputs

See http://www.ensembl.org/biomart/martview for the full list of available attributes.

```
additional_attributes
```
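
The attribute list can also be browsed from R. This is a minimal sketch, assuming a mart object such as the `hs_snps` created in the Code section: biomaRt's `listAttributes()` returns a data frame with `name` and `description` columns, which can be filtered by keyword. The helper `find_attributes` and the small stand-in table (with made-up descriptions) are hypothetical and exist only so the example runs without a network connection.

```r
# Hypothetical helper: keep rows whose name or description matches a keyword.
find_attributes <- function(attr_table, pattern) {
  hit <- grepl(pattern, attr_table$name, ignore.case = TRUE) |
    grepl(pattern, attr_table$description, ignore.case = TRUE)
  attr_table[hit, , drop = FALSE]
}

# Stand-in table for offline illustration; with a live connection use:
#   attr_table <- listAttributes(hs_snps)
attr_table <- data.frame(
  name = c("refsnp_id", "chr_name", "sift_score"),
  description = c("Variant name", "Chromosome name", "SIFT score"),
  stringsAsFactors = FALSE
)

# Show every attribute that mentions "sift".
find_attributes(attr_table, "sift")
```

Any attribute name found this way can be appended to `my_selected_outputs` in the Input section.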

## Caveats

* None?

# Input

In [1]:
my_snps_file = "160427-humsavar_r3_rsids_input.txt"
my_selected_outputs = c('refsnp_id',
                        'refsnp_source',
                        'allele',
                        'minor_allele',
                        'minor_allele_freq',
                        'ensembl_gene_stable_id',
                        'ensembl_transcript_stable_id',
                        'polyphen_prediction',
                        'polyphen_score',
                        'sift_prediction',
                        'sift_score'
                        )

# Code

In [2]:
# Loading biomaRt
options(useHTTPS=FALSE)
library("biomaRt")

# date format is yymmdd (e.g. 160428 = 2016-04-28)
short_date = format(Sys.time(), "%y%m%d")

In [3]:
hs_snps = useMart(biomart="ENSEMBL_MART_SNP", dataset="hsapiens_snp", host="www.ensembl.org")

In [4]:
my_snps_table = read.table(my_snps_file, header=FALSE)
my_snps = my_snps_table[['V1']]

In [None]:
my_snps_mapped <- getBM(attributes = my_selected_outputs,
                        filters = 'snp_filter',
                        values = my_snps, 
                        mart = hs_snps)

In [None]:
write.csv(my_snps_mapped, file=paste(short_date,'-',my_snps_file,sep=''))

# Unsuccessful mappings

Input rsIDs that could not be mapped are listed below. Please note that these will not be in the output file!

In [None]:
setdiff(my_snps, my_snps_mapped$refsnp_id) 
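
To keep a record of the unmapped rsIDs, the `setdiff()` result could be saved alongside the main output. The sketch below uses toy stand-ins for `my_snps` and `my_snps_mapped` so that it runs without querying Ensembl; the values are illustrative only, and the output file name is a hypothetical choice.

```r
# Toy stand-ins for the notebook's objects (illustrative values only).
toy_snps <- c("rs699", "rs4680", "rs0000000")
toy_mapped <- data.frame(
  refsnp_id = c("rs699", "rs4680", "rs4680"),
  stringsAsFactors = FALSE
)

# Inputs that never appear in the mapped table. Repeated rows in the
# mapped table do not affect the result, because setdiff() works on sets.
toy_unmapped <- setdiff(toy_snps, toy_mapped$refsnp_id)

# Hypothetical output file, one unmapped rsID per line.
unmapped_file <- file.path(tempdir(), "unmapped_rsids.txt")
writeLines(toy_unmapped, unmapped_file)
```

In the notebook itself, the same pattern would apply to `my_snps` and `my_snps_mapped$refsnp_id`.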