## GA4GH 1000 Genomes Datasets Example
This example illustrates how to access the read group sets and reads available on the server.

### Initialize the client
In this step we create a client object which will be used to communicate with the server. It is initialized with the server's URL.

In [1]:
import ga4gh.client as client
c = client.HttpClient("http://1kgenomes.ga4gh.org")

### Search read group sets
We can obtain read group sets via `search_read_group_sets`. This request takes `dataset_id` as its main parameter; the `1kg_metadata_service` examples show how to obtain it with a `search_datasets` request.

In [2]:
counter = 0
# Iterate over every read group set in the dataset, printing only the first five.
for read_group_set in c.search_read_group_sets(dataset_id="WyIxa2dlbm9tZXMiXQ"):
    counter += 1
    if counter < 6:
        print("\nRead Group Set: {}\n\tId: {}\n\tData Set: {}\n\tAligned Read Count: {}\n\tUnaligned Read Count: {}".format(
            read_group_set.name, read_group_set.id, read_group_set.dataset_id,
            read_group_set.stats.aligned_read_count, read_group_set.stats.unaligned_read_count))
        # Each read group set contains one or more read groups.
        for read_group in read_group_set.read_groups:
            print("\tReads:\n\t\tId: {}\n\t\tName: {}\n\t\tDescription: {}\n\t\tBio Sample Id: {}".format(
                read_group.id, read_group.name, read_group.description, read_group.bio_sample_id))
print("\nTotal available 'read group sets': {}, for this dataset id".format(counter))


Read Group Set: HG03270
	Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJIRzAzMjcwIl0
	Data Set: WyIxa2dlbm9tZXMiXQ
	Aligned Read Count: 177645990
	Unaligned Read Count: 746202
	Reads:
		Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJIRzAzMjcwIiwiRVJSMTgxMzI5Il0
		Name: ERR181329
		Description: SRP015238
		Bio Sample Id: WyIxa2dlbm9tZXMiLCJiIiwiSEcwMzI3MCJd
	Reads:
		Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJIRzAzMjcwIiwiRVJSMTg0MzI4Il0
		Name: ERR184328
		Description: SRP015238
		Bio Sample Id: WyIxa2dlbm9tZXMiLCJiIiwiSEcwMzI3MCJd
	Reads:
		Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJIRzAzMjcwIiwiRVJSMTg0MzM2Il0
		Name: ERR184336
		Description: SRP015238
		Bio Sample Id: WyIxa2dlbm9tZXMiLCJiIiwiSEcwMzI3MCJd
	Reads:
		Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJIRzAzMjcwIiwiRVJSMTg0MzQ0Il0
		Name: ERR184344
		Description: SRP015238
		Bio Sample Id: WyIxa2dlbm9tZXMiLCJiIiwiSEcwMzI3MCJd

Read Group Set: HG03271
	Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJIRzAzMjcxIl0
	Data Set: WyIxa2dlbm9tZXMiXQ
	Aligned Read Count: 201280730
	Unaligned Read Count: 944735
	Reads:
		

###### Note: only a small subset of fields is shown here. The data returned by the server is richer and contains other fields which may be of interest.
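
The search cell above walks through every read group set to compute the total while printing only the first five. When only a preview is needed, iteration can stop early. The following is a minimal sketch using `itertools.islice`; `preview` and `fake_search` are hypothetical names introduced here, and `fake_search` is only a stand-in for `c.search_read_group_sets(...)`, which is assumed to return a lazy iterator.

```python
from itertools import islice


def preview(results, limit=5):
    """Return at most `limit` items from `results` without consuming the rest."""
    return list(islice(results, limit))


fetched = []  # records which items the stand-in generator actually produced


def fake_search(total=7):
    # Hypothetical stand-in for c.search_read_group_sets(dataset_id=...);
    # it yields placeholder names instead of contacting a server.
    for i in range(total):
        name = "set-{}".format(i)
        fetched.append(name)
        yield name


first_five = preview(fake_search(), limit=5)
print(first_five)    # the first five placeholder names
print(len(fetched))  # 5: the generator was not advanced past the fifth item
```

With the real client this would presumably be called as `preview(c.search_read_group_sets(dataset_id=...))`, at the cost of no longer knowing the total number of read group sets.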

### Get read group set
Similarly, we can obtain an individual read group set by providing its id.

In [3]:
# Fetch a single read group set directly by its id.
read_group_set = c.get_read_group_set(read_group_set_id="WyIxa2dlbm9tZXMiLCJyZ3MiLCJOQTE5Njc4Il0")
print("\nRead Group Set: {}\n\tId: {}\n\tData Set: {}\n\tAligned Read Count: {}\n\tUnaligned Read Count: {}".format(
    read_group_set.name, read_group_set.id, read_group_set.dataset_id,
    read_group_set.stats.aligned_read_count, read_group_set.stats.unaligned_read_count))
for read_group in read_group_set.read_groups:
    print("\tReads:\n\t\tId: {}\n\t\tName: {}\n\t\tDescription: {}\n\t\tBio Sample Id: {}".format(
        read_group.id, read_group.name, read_group.description, read_group.bio_sample_id))
print("\nTotal available reads: {}, for group set: {}".format(len(read_group_set.read_groups), read_group_set.name))


Read Group Set: NA19678
	Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJOQTE5Njc4Il0
	Data Set: WyIxa2dlbm9tZXMiXQ
	Aligned Read Count: 449711566
	Unaligned Read Count: 5831622
	Reads:
		Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJOQTE5Njc4IiwiU1JSMDM0NTc4Il0
		Name: SRR034578
		Description: SRP000803
		Bio Sample Id: WyIxa2dlbm9tZXMiLCJiIiwiTkExOTY3OCJd
	Reads:
		Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJOQTE5Njc4IiwiU1JSMDM0NTc5Il0
		Name: SRR034579
		Description: SRP000803
		Bio Sample Id: WyIxa2dlbm9tZXMiLCJiIiwiTkExOTY3OCJd
	Reads:
		Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJOQTE5Njc4IiwiU1JSMDM1NDg4Il0
		Name: SRR035488
		Description: SRP000803
		Bio Sample Id: WyIxa2dlbm9tZXMiLCJiIiwiTkExOTY3OCJd
	Reads:
		Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJOQTE5Njc4IiwiU1JSMDM4NTg1Il0
		Name: SRR038585
		Description: SRP000803
		Bio Sample Id: WyIxa2dlbm9tZXMiLCJiIiwiTkExOTY3OCJd
	Reads:
		Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJOQTE5Njc4IiwiU1JSMDUxNTc1Il0
		Name: SRR051575
		Description: SRP000803
		Bio Sample Id: WyIxa2dlbm9tZXMiLCJiIiwiTkExOTY3OCJd
	Re

###### Note: as in the previous example, only a few fields are shown for illustration. The data returned by the server is far richer; this format is used only for a more readable presentation.
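
The ids shown in the outputs above look like unpadded, URL-safe base64 encodings of small JSON arrays. This is an observation drawn from the values in this example, not a documented guarantee, so requests should still treat ids as opaque strings. As a rough sketch for inspecting them while exploring, the hypothetical helper `decode_id` below restores the padding and decodes the value:

```python
import base64
import json


def decode_id(encoded_id):
    """Decode an unpadded URL-safe base64 id into the JSON value it encodes."""
    # Restore the '=' padding that base64 decoding requires.
    padded = encoded_id + "=" * (-len(encoded_id) % 4)
    return json.loads(base64.urlsafe_b64decode(padded).decode("utf-8"))


# Ids copied from the cells above.
print(decode_id("WyIxa2dlbm9tZXMiXQ"))                       # dataset id
print(decode_id("WyIxa2dlbm9tZXMiLCJyZ3MiLCJOQTE5Njc4Il0"))  # read group set id
```

Decoding is useful only for reading the output; building ids by hand would rely on an encoding the server may change.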

### Search reads
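This request returns reads for the read groups of the read group set we obtained before. For each read group, we search positions 0 to 1000000 of the given reference and print only the first read returned.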

In [4]:
for read_group in read_group_set.read_groups:
    # Take only the first read found between positions 0 and 1000000 of the reference.
    read = next(c.search_reads(read_group_ids=[read_group.id], start=0, end=1000000, reference_id="WyJOQ0JJMzciLCIxIl0"))
    print("Id: {}\nRead Group Id: {}\nFragment Name: {}\nRead Sequence: {}\n".format(read.id, read.read_group_id, read.fragment_name, read.aligned_sequence))

Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJOQTE5Njc4IiwiU1JSMDM0NTc4LjE3MzYwMCJd
Read Group Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJOQTE5Njc4IiwiU1JSMDM0NTc4Il0
Fragment Name: SRR034578.173600
Read Sequence: ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCC

Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJOQTE5Njc4IiwiU1JSMDM0NTc5LjY2MzcyODkiXQ
Read Group Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJOQTE5Njc4IiwiU1JSMDM0NTc5Il0
Fragment Name: SRR034579.6637289
Read Sequence: CCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCT

Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJOQTE5Njc4IiwiU1JSMDM1NDg4Ljc1NTQ1NDciXQ
Read Group Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJOQTE5Njc4IiwiU1JSMDM1NDg4Il0
Fragment Name: SRR035488.7554547
Read Sequence: ACCCTGACCCCGACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTCACCCTCACCCTAACCCCTAAAC

Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJOQTE5Njc4IiwiU1JSMDM4NTg1LjE1MzI5MDkiXQ
Read Group Id: WyIxa2dlbm9tZXMiLCJyZ3MiLCJOQTE5Njc4IiwiU1JSMDM4NTg1Il0
Fragment Name: SRR038585.1532909
Read Sequence: CCCTGACCC
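
The search reads cell above takes the first result from the iterator directly, which raises `StopIteration` if a read group has no reads in the requested range. A minimal sketch of a more forgiving pattern is shown here; `first_or_none` is a hypothetical helper, and the lists stand in for the iterator returned by `c.search_reads(...)`.

```python
def first_or_none(results):
    """Return the first item of `results`, or None when it yields nothing."""
    return next(iter(results), None)


# Hypothetical stand-ins for the iterator returned by c.search_reads(...).
print(first_or_none(iter(["read-a", "read-b"])))  # read-a
print(first_or_none(iter([])))                    # None
```

The caller can then skip read groups for which the helper returns `None` instead of aborting the whole loop.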

#### For documentation on the service and more information, see:
https://ga4gh-schemas.readthedocs.io/en/latest/schemas/read_service.proto.html