# 2 Exploring the SeqRepo DataProxy
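The SeqRepo DataProxy provides sequence-related functions: deriving refget accessions, retrieving sequence metadata, fetching sequence, and translating identifiers between namespaces.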

#### Step 1 - Setup Data Proxy Access
The *DataProxy* provides access to reference sequences and their metadata. Here it is created from the URL of a SeqRepo REST service.

In [1]:
from ga4gh.vrs.dataproxy import create_dataproxy
seqrepo_rest_service_url = "seqrepo+https://services.genomicmedlab.org/seqrepo"
seqrepo_dataproxy = create_dataproxy(uri=seqrepo_rest_service_url)

#### Step 2 - Information on RefSeq accessions

When building *SequenceLocation* objects, it is often necessary to obtain the refget accession for a public accession identifier. The *DataProxy* method *derive_refget_accession* does this.

In [2]:
seqrepo_dataproxy.derive_refget_accession('refseq:NM_002439.5')

'SQ.Pw3Ch0x3XWD6ljsnIfmk_NERcZCI9sNM'
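For intuition about what such an accession is: as we understand the GA4GH scheme, the part after `SQ.` is a sha512t24u digest of the sequence itself, that is, the first 24 bytes of its SHA-512 hash in base64url encoding. The sketch below reproduces that calculation with the standard library only. The helper names `sha512t24u` and `refget_accession` are ours and are not presented as the `ga4gh.vrs` API, and the sketch assumes the sequence is already plain uppercase ASCII.

```python
import base64
import hashlib


def sha512t24u(blob: bytes) -> str:
    """Base64url-encode the first 24 bytes of the SHA-512 digest of blob."""
    digest = hashlib.sha512(blob).digest()
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


def refget_accession(sequence: str) -> str:
    """Hypothetical helper: prefix the sequence digest with 'SQ.'.

    Assumes the sequence is already normalized ASCII; no validation is done.
    """
    return "SQ." + sha512t24u(sequence.encode("ascii"))


accession = refget_accession("ACGT")
print(accession)
```

Because the digest depends only on the sequence content, two accessions from different databases that name the same sequence resolve to the same refget accession. In practice, *derive_refget_accession* looks the value up for you, so there is no need to download and hash the sequence yourself.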

The *DataProxy* *get_metadata* method returns metadata for an accession, including the date it was added, its aliases, its alphabet, and the sequence length.

In [3]:
seqrepo_dataproxy.get_metadata("refseq:NM_000551.3")

{'added': '2016-08-24T05:03:11Z',
 'aliases': ['MD5:215137b1973c1a5afcf86be7d999574a',
  'NCBI:NM_000551.3',
  'refseq:NM_000551.3',
  'SEGUID:T12L0p2X5E8DbnL0+SwI4Wc1S6g',
  'SHA1:4f5d8bd29d97e44f036e72f4f92c08e167354ba8',
  'VMC:GS_v_QTc1p-MUYdgrRv4LMT6ByXIOsdw3C_',
  'sha512t24u:v_QTc1p-MUYdgrRv4LMT6ByXIOsdw3C_',
  'ga4gh:SQ.v_QTc1p-MUYdgrRv4LMT6ByXIOsdw3C_'],
 'alphabet': 'ACGT',
 'length': 4560}
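Each alias is a `namespace:value` pair, so the list is easier to work with once grouped by namespace. A minimal sketch, using the alias list from the output above; `group_aliases` is a hypothetical helper of ours, not a *DataProxy* method.

```python
from collections import defaultdict

# Alias list copied from the get_metadata output shown above.
aliases = [
    "MD5:215137b1973c1a5afcf86be7d999574a",
    "NCBI:NM_000551.3",
    "refseq:NM_000551.3",
    "SEGUID:T12L0p2X5E8DbnL0+SwI4Wc1S6g",
    "SHA1:4f5d8bd29d97e44f036e72f4f92c08e167354ba8",
    "VMC:GS_v_QTc1p-MUYdgrRv4LMT6ByXIOsdw3C_",
    "sha512t24u:v_QTc1p-MUYdgrRv4LMT6ByXIOsdw3C_",
    "ga4gh:SQ.v_QTc1p-MUYdgrRv4LMT6ByXIOsdw3C_",
]


def group_aliases(alias_list):
    """Hypothetical helper: map each namespace to the values listed under it."""
    grouped = defaultdict(list)
    for alias in alias_list:
        # Split on the first colon only; the value itself may contain punctuation.
        namespace, _, value = alias.partition(":")
        grouped[namespace].append(value)
    return dict(grouped)


by_namespace = group_aliases(aliases)
print(by_namespace["ga4gh"])
print(by_namespace["refseq"])
```

Note that the `sha512t24u` and `ga4gh` aliases carry the same digest; the `ga4gh` form adds the `SQ.` type prefix.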

The *DataProxy* *get_sequence* method returns the sequence for a given identifier, optionally limited to the interbase interval from *start* to *end*.

In [4]:
identifier = "ga4gh:SQ.v_QTc1p-MUYdgrRv4LMT6ByXIOsdw3C_"
seqrepo_dataproxy.get_sequence(identifier, start=0, end=51)

'CCTCGCCTCCGTTACAACGGCCTACGGTGCTGGAGGATCCTTCTGCGCACG'
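Interbase coordinates count the positions between residues, which is the same convention as Python slicing: the interval from 0 to 51 covers 51 residues. The sketch below works on the 51-residue string returned above and needs no network access. The helper names `interbase_slice` and `interbase_to_one_based` are ours, for illustration only.

```python
# First 51 residues of the sequence, copied from the get_sequence output above.
prefix = "CCTCGCCTCCGTTACAACGGCCTACGGTGCTGGAGGATCCTTCTGCGCACG"


def interbase_slice(sequence: str, start: int, end: int) -> str:
    """Return the residues between interbase positions start and end."""
    if not 0 <= start <= end <= len(sequence):
        raise ValueError("expected 0 <= start <= end <= len(sequence)")
    return sequence[start:end]


def interbase_to_one_based(start: int, end: int) -> tuple:
    """Convert a non-empty interbase interval to 1-based inclusive positions."""
    if start >= end:
        raise ValueError("an empty interval has no 1-based inclusive equivalent")
    return start + 1, end


print(interbase_slice(prefix, 0, 10))   # first ten residues
print(len(interbase_slice(prefix, 0, 51)))
print(interbase_to_one_based(0, 51))
```

The length of an interbase interval is simply `end - start`, and an interval with `start == end` is empty, which is how an insertion point between two residues can be expressed.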

The *DataProxy* *translate_sequence_identifier* method returns a list of equivalent identifiers in the given namespace.

In [5]:
seqrepo_dataproxy.translate_sequence_identifier("GRCh38:19", "ga4gh")

['ga4gh:SQ.IIB53T8CNeJJdUqzn9V_JnRtQadwWCbl']

In [6]:
seqrepo_dataproxy.translate_sequence_identifier("ga4gh:SQ.IIB53T8CNeJJdUqzn9V_JnRtQadwWCbl", "GRCh38")

['GRCh38:19', 'GRCh38:chr19']
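Conceptually, translation is a lookup within a group of aliases that all name the same sequence, filtered by the requested namespace. The toy model below illustrates that idea with the identifiers from the two cells above. It is an in-memory sketch under our own assumptions (`ALIAS_GROUPS` and `translate` are hypothetical names) and says nothing about how SeqRepo implements the lookup.

```python
# Toy alias table built only from identifiers shown in the cells above.
ALIAS_GROUPS = [
    {
        "ga4gh:SQ.IIB53T8CNeJJdUqzn9V_JnRtQadwWCbl",
        "GRCh38:19",
        "GRCh38:chr19",
    },
]


def translate(identifier: str, namespace: str) -> list:
    """Hypothetical helper: list the aliases of identifier in one namespace."""
    wanted_prefix = namespace + ":"
    for group in ALIAS_GROUPS:
        if identifier in group:
            # Sort so the result does not depend on set iteration order.
            return sorted(a for a in group if a.startswith(wanted_prefix))
    raise KeyError(identifier)


print(translate("GRCh38:19", "ga4gh"))
print(translate("ga4gh:SQ.IIB53T8CNeJJdUqzn9V_JnRtQadwWCbl", "GRCh38"))
```

The result is a list because a namespace may hold several names for the same sequence, as with `GRCh38:19` and `GRCh38:chr19`.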