### Initialize Client
We use the GA4GH client library to ensure that each request is formulated in accordance with the GA4GH requirements.

In [1]:
from ga4gh_client import client
c = client.HttpClient("http://127.0.0.1:8000/data/ga4gh/v0.6.0a7/")

### Obtain available datasets
Datasets are the highest data level, and the information returned can be used to drill down to the lowest data level, that is, each individual variant.

In [3]:
dataset = c.search_datasets().next()
print dataset

id: "brca"
name: "brca-exchange-variants"
description: "Variants observed in brca-exchange project"



### Get dataset by id method
We can obtain a given dataset just by knowing its id. In the previous call we searched the available datasets and took the first one; currently only one dataset is supported.

In [4]:
individual_dataset = c.get_dataset(dataset_id="brca")
print individual_dataset

id: "brca"
name: "brca-exchange-variants"
description: "Variants observed in brca-exchange project"



### Obtaining Variant Sets
By querying datasets we obtain the dataset id, which we can then use to query the variant sets and narrow down which set we want to retrieve variants from.

In [5]:
# `dataset.id` was obtained in the search datasets call above
variant_sets = list(c.search_variant_sets(dataset_id=dataset.id))
Sets = {}
for variant_set in variant_sets:
    # Keep a few descriptive fields for each variant set, keyed by its id
    Sets[variant_set.id] = {"Name": variant_set.name,
                            "Reference Set Id": variant_set.reference_set_id,
                            "Data Set Id": variant_set.dataset_id}
    print "Variant Set Id: {}\n\tName: {}\n\tReference Set Id: {}\n\tData Set Id: {}\n".format(
        variant_set.id, variant_set.name, variant_set.reference_set_id, variant_set.dataset_id)

Variant Set Id: brca-hg36
	Name: brca-exchange-variants-hg36
	Reference Set Id: Genomic-Coordinate-hg36
	Data Set Id: brca

Variant Set Id: brca-hg37
	Name: brca-exchange-variants-hg37
	Reference Set Id: Genomic-Coordinate-hg37
	Data Set Id: brca

Variant Set Id: brca-hg38
	Name: brca-exchange-variants-hg38
	Reference Set Id: Genomic-Coordinate-hg38
	Data Set Id: brca



##### Note: only selected fields were shown for illustration purposes. Retrieving an individual variant set also provides other informational parameters, stored as metadata, each with an individual description.
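
Since the three variant sets differ only in their genome build, it can be handy to look one up by build name. The following is a minimal sketch, not part of the client library: `VariantSet` is a hypothetical stand-in for the records returned by `search_variant_sets` (so the snippet runs without a server), and `index_by_build` assumes the build name is the last hyphen-separated token of the reference set id, which holds for the three ids shown above but is not a documented convention.

```python
from collections import namedtuple

# Hypothetical stand-in for the records returned by c.search_variant_sets(...)
VariantSet = namedtuple("VariantSet", ["id", "name", "reference_set_id", "dataset_id"])

# Values copied from the output shown above
variant_sets = [
    VariantSet("brca-hg36", "brca-exchange-variants-hg36", "Genomic-Coordinate-hg36", "brca"),
    VariantSet("brca-hg37", "brca-exchange-variants-hg37", "Genomic-Coordinate-hg37", "brca"),
    VariantSet("brca-hg38", "brca-exchange-variants-hg38", "Genomic-Coordinate-hg38", "brca"),
]


def index_by_build(sets):
    """Key each variant set by the build suffix of its reference set id.

    Assumption: the suffix after the last '-' names the build (e.g. 'hg37').
    """
    return {vs.reference_set_id.rsplit("-", 1)[-1]: vs for vs in sets}


by_build = index_by_build(variant_sets)
print(by_build["hg37"].id)  # brca-hg37
```

With a live client, the same function could be applied to `list(c.search_variant_sets(dataset_id=dataset.id))`, provided the returned objects expose `reference_set_id` as they do in the cell above.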

### Get variant set by id method
We can also retrieve a specific variant set by its id. Currently the sets are distinguished by the supported genomic coordinate systems, so only three sets are available. GA4GH supports hg37 coordinates, so we use that set to demonstrate the get variant set function.

In [6]:
Varset = c.get_variant_set(variant_set_id="brca-hg37")
print "Variant Id: {}\nName: {}\nDataset Id: {}\nReference Set Id: {}\n".format(Varset.id, Varset.name, Varset.dataset_id, Varset.reference_set_id)
for i in Varset.metadata:
    print "Metadata Field: {} ;  Value: {} ;  Type: {}".format(i.key, i.value, i.type)

Variant Id: brca-hg37
Name: brca-exchange-variants-hg37
Dataset Id: brca
Reference Set Id: Genomic-Coordinate-hg37

Metadata Field: id ;  Value: - ;  Type: AutoField
Metadata Field: Variant_in_ENIGMA ;  Value: - ;  Type: BooleanField
Metadata Field: Variant_in_ClinVar ;  Value: - ;  Type: BooleanField
Metadata Field: Variant_in_1000_Genomes ;  Value: - ;  Type: BooleanField
Metadata Field: Variant_in_ExAC ;  Value: - ;  Type: BooleanField
Metadata Field: Variant_in_LOVD ;  Value: - ;  Type: BooleanField
Metadata Field: Variant_in_BIC ;  Value: - ;  Type: BooleanField
Metadata Field: Variant_in_ESP ;  Value: - ;  Type: BooleanField
Metadata Field: Variant_in_exLOVD ;  Value: - ;  Type: BooleanField
Metadata Field: Source ;  Value: - ;  Type: TextField
Metadata Field: URL_ENIGMA ;  Value: - ;  Type: TextField
Metadata Field: Condition_ID_type_ENIGMA ;  Value: - ;  Type: TextField
Metadata Field: Condition_ID_value_ENIGMA ;  Value: - ;  Type: TextField
Metadata Field: Condition_category_E

### Searching variants
Given a variant set id obtained in the previous call and a reference name, we can make a tailored search for variants and obtain those that match our request.


In [7]:
counter = 0
for Vars in c.search_variants(reference_name="chr17", variant_set_id="brca-hg37", start=41246794, end=41296814):
    print "Variant Id: {},\tVariant Set Id: {},\tReference Name: {}\n\tVariant Start: {},\tVariant End: {}\n\tReference Bases: {},\tAlternate Bases: {},\t".format(
        Vars.id, Vars.variant_set_id,Vars.reference_name,Vars.start, Vars.end,Vars.reference_bases,Vars.alternate_bases)
    if counter >= 5:
        break
    counter += 1

Variant Id: hg37-10795,	Variant Set Id: brca-hg37,	Reference Name: 17
	Variant Start: 41246798,	Variant End: 41246798
	Reference Bases: C,	Alternate Bases: [u'T'],	
Variant Id: hg37-10796,	Variant Set Id: brca-hg37,	Reference Name: 17
	Variant Start: 41246801,	Variant End: 41246803
	Reference Bases: AGT,	Alternate Bases: [u'A'],	
Variant Id: hg37-10797,	Variant Set Id: brca-hg37,	Reference Name: 17
	Variant Start: 41246803,	Variant End: 41246803
	Reference Bases: T,	Alternate Bases: [u'A'],	
Variant Id: hg37-10798,	Variant Set Id: brca-hg37,	Reference Name: 17
	Variant Start: 41246804,	Variant End: 41246804
	Reference Bases: G,	Alternate Bases: [u'C'],	
Variant Id: hg37-10800,	Variant Set Id: brca-hg37,	Reference Name: 17
	Variant Start: 41246805,	Variant End: 41246805
	Reference Bases: G,	Alternate Bases: [u'GT'],	
Variant Id: hg37-10799,	Variant Set Id: brca-hg37,	Reference Name: 17
	Variant Start: 41246805,	Variant End: 41246805
	Reference Bases: G,	Alternate Bases: [u'T'],	


##### Note: only a selection of parameters was displayed for the variant search method. The defined metadata fields are also available in this request; the potential fields are listed in the "Get variant set by id method" example.
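
The search loop above caps its output with a counter and a `break`, which prints six records before stopping. A minimal sketch of the same idea using `itertools.islice` from the standard library follows. `fake_search_variants` is a hypothetical stand-in for `c.search_variants(...)` so that the snippet runs without a server, and `take` is an illustrative helper name, not part of the client library.

```python
from itertools import islice


def fake_search_variants():
    """Hypothetical stand-in for c.search_variants(...): yields ids lazily, without end."""
    n = 10795
    while True:
        yield "hg37-{}".format(n)
        n += 1


def take(results, limit):
    """Return at most `limit` items from a (possibly very long) result iterator."""
    return list(islice(results, limit))


# Only six items are ever pulled from the generator
first_six = take(fake_search_variants(), 6)
print(first_six)
```

Because `islice` consumes the iterator lazily, nothing beyond the requested number of records is pulled from the generator. Whether this also saves server requests depends on how the client pages its results, which is not shown here.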

### Get variant by id
In the above example we obtained the ids of the variants contained within the requested genomic range. Knowing such an id, we can retrieve a single variant directly; the other fields stored in its info are also displayed in this call.
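
The cell that follows reads each entry of the variant's `info` map by taking its first value and falling back from `string_value` to `number_value`. The sketch below wraps that pattern in a reusable helper. `FakeValue` and `FakeListValue` are hypothetical stand-ins that only mimic the attributes used in this notebook (unset fields default to `""` and `0.0`), and `info_to_dict` is an illustrative name; the guard for an empty value list is an addition that the original cell does not have.

```python
class FakeValue(object):
    """Hypothetical stand-in for one info value; unset fields default to '' and 0.0."""

    def __init__(self, string_value="", number_value=0.0):
        self.string_value = string_value
        self.number_value = number_value


class FakeListValue(object):
    """Hypothetical stand-in for the list wrapper holding an info entry's values."""

    def __init__(self, *values):
        self.values = list(values)


def info_to_dict(info):
    """Map each info key to its first value, preferring the string form."""
    flat = {}
    for key in info:
        values = info[key].values
        if not values:
            # Added guard: an entry with no values maps to None instead of raising
            flat[key] = None
            continue
        first = values[0]
        flat[key] = first.string_value or first.number_value
    return flat


# Sample entries taken from the output shown in this section
info = {
    "Gene_Symbol": FakeListValue(FakeValue(string_value="BRCA2")),
    "Variant_in_LOVD": FakeListValue(FakeValue(number_value=0.0)),
    "Empty_Field": FakeListValue(),
}
flat = info_to_dict(info)
print(flat["Gene_Symbol"])  # BRCA2
```

Note that the `or` fallback cannot tell an empty string apart from an unset string, so a field that is genuinely empty is reported as `0.0`.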

In [8]:
SingleVar = c.get_variant(variant_id="hg37-398")
# Print the core fields first, then every entry of the variant's info map
print "Variant Id: {},\tVariant Set Id: {},\tReference Name: {}\n\tVariant Start: {},\tVariant End: {}\n\tReference Bases: {},\tAlternate Bases: {},\t".format(
    SingleVar.id, SingleVar.variant_set_id, SingleVar.reference_name, SingleVar.start,
    SingleVar.end, SingleVar.reference_bases, SingleVar.alternate_bases)
for key in SingleVar.info:
    first_value = SingleVar.info[key].values[0]
    print "{}: \t{}".format(key, first_value.string_value or first_value.number_value)

Variant Id: hg37-398,	Variant Set Id: brca-hg37,	Reference Name: 13
	Variant Start: 32899305,	Variant End: 32899306
	Reference Bases: TC,	Alternate Bases: [u'T'],	
Hg37_End: 	32899306
Date_Last_Updated_ClinVar: 	2013-02-01,2003-12-23
Hg38_Start: 	32325168
Variant_in_LOVD: 	0.0
Source_URL: 	http://www.ncbi.nlm.nih.gov/clinvar/?term=SCV000146782, http://www.ncbi.nlm.nih.gov/clinvar/?term=SCV000072365
Source: 	BIC,ClinVar
Chr: 	13
Pathogenicity_expert: 	Not Yet Classified
HGVS_Protein: 	NP_000050.2:p.(Ser137PhefsTer15)
Ref: 	TC
id: 	398
Reference_Sequence: 	NM_000059.3
Submitter_ClinVar: 	Breast_Cancer_Information_Core_(BIC)_(BRCA2),Invitae_
Variant_in_ESP: 	0.0
Hg36_End: 	31797306
Variant_in_ExAC: 	0.0
Clinical_classification_BIC: 	Class 5
Variant_in_exLOVD: 	0.0
Gene_Symbol: 	BRCA2
Variant_in_ENIGMA: 	0.0
Method_ClinVar: 	clinical_testing,literature_only
Pathogenicity_all: 	Pathogenic,not_provided (ClinVar); Class 5 (BIC)
Germline_or_Somatic_BIC: 	G
Genomic_Coordinate_hg37: 	chr13:g.328