# DataPath Example 2
This notebook gives a basic example of how to access data through `DataPath`s: starting a path, fetching entities, limiting results, and selecting attributes.
It assumes that you understand the concepts presented in the 
Example 1 notebook.

In [1]:
# Import deriva modules
from deriva.core import ErmrestCatalog, get_credential

In [2]:
# Connect with the deriva catalog
protocol = 'https'
hostname = 'www.facebase.org'
catalog_number = 1
credential = get_credential(hostname)
catalog = ErmrestCatalog(protocol, hostname, catalog_number, credential)

In [3]:
# Get the path builder interface for this catalog
pb = catalog.getPathBuilder()

## DataPaths
The `PathBuilder` object allows you to begin `DataPath`s from the base `Table`s. A `DataPath` begins with a `Table` (or a `TableAlias`, to be discussed later) as its "root". From that root you can `link` to other tables, `filter` the results, and fetch the path's `entities`.

### Start a path rooted at a table from the catalog
We will reference a table through the PathBuilder variable `pb` created above. Using the PathBuilder, we first reference the "isa" schema, then its "dataset" table, and from that table we start a path.

In [4]:
path = pb.schemas['isa'].tables['dataset'].path

We could have used the more compact dot-notation to start the same path.

In [5]:
path = pb.isa.dataset.path

### Getting the URI of the current path
All DataPaths have URIs for the referenced resources in ERMrest. The URI identifies the resources which are available through "RESTful" Web protocols supported by ERMrest.

In [6]:
print(path.uri)

https://www.facebase.org/ermrest/catalog/1/entity/dataset:=isa:dataset
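
To make the structure of this URI easier to read, the following sketch rebuilds it from its parts with plain string formatting. It only mirrors the URI printed above; the helper `entity_uri` is hypothetical and is not part of deriva, and the layout it encodes is inferred from this one example.

```python
def entity_uri(protocol, hostname, catalog_number, schema, table, alias=None):
    """Rebuild an entity URI like the one printed above.

    Illustrative helper only (an assumption for this sketch, not a deriva API).
    """
    alias = alias or table  # in the printed URI the alias equals the table name
    base = f"{protocol}://{hostname}/ermrest/catalog/{catalog_number}"
    return f"{base}/entity/{alias}:={schema}:{table}"


uri = entity_uri('https', 'www.facebase.org', 1, 'isa', 'dataset')
print(uri)
```

Read left to right, the URI names the server, the catalog number, the kind of resource (`entity`), and finally the root of the path as `alias:=schema:table`.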


## EntitySets
The entities from a DataPath are accessed through a pythonic container object, the `EntitySet`. The `EntitySet` is returned by the DataPath's `entities()` method.

In [7]:
entities = path.entities()

### Fetch entities from the catalog
Now we can get entities from the server using the EntitySet's `fetch()` method.

In [8]:
entities.fetch()

<deriva.core.datapath.EntitySet at 0x10f24c9b0>

`EntitySet`s behave like Python containers. For example, we can check the number of rows in this `EntitySet`.

In [9]:
len(entities)

777

**Note**: If we had not explicitly called the `fetch()` method, then it would have been called implicitly on the first container operation such as `len(...)`, `list(...)`, `iter(...)` or get item `[...]`.
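
The implicit-fetch behavior described in the note can be pictured with a small stand-alone container. This is only a sketch of the general "fetch on first use" pattern; the class `LazyEntities` and its `fetch_count` attribute are hypothetical and do not reflect how deriva implements `EntitySet`.

```python
class LazyEntities:
    """Toy container that fetches on first use (hypothetical, not deriva's EntitySet)."""

    def __init__(self, fetch_fn):
        self._fetch_fn = fetch_fn  # callable standing in for the request to the server
        self._results = None       # None means "nothing fetched yet"
        self.fetch_count = 0       # counts how many times a fetch actually ran

    def fetch(self):
        self._results = list(self._fetch_fn())
        self.fetch_count += 1
        return self

    def _ensure_fetched(self):
        # Trigger a fetch only if no results have been loaded so far
        if self._results is None:
            self.fetch()

    def __len__(self):
        self._ensure_fetched()
        return len(self._results)

    def __getitem__(self, index):
        self._ensure_fetched()
        return self._results[index]

    def __iter__(self):
        self._ensure_fetched()
        return iter(self._results)


lazy = LazyEntities(lambda: [{'accession': 'FB00000965'}, {'accession': 'FB00000942'}])
count_before = lazy.fetch_count  # no container operation yet, so nothing was fetched
size = len(lazy)                 # first container operation triggers the fetch
first = lazy[0]                  # later operations reuse the fetched results
count_after = lazy.fetch_count
print(count_before, size, count_after)
```

In this toy version the fetch runs once, on the first `len(...)`, and every later container operation reuses the stored results.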

### Get an entity
To get one entity from the set, use the usual container operator to get an item.

In [10]:
entities[9]

{'id': 6488,
 'accession': 'FB00000388.01',
 'title': 'microMRI images of skulls of Wnt1 * Tgfbr2F/F * Alk5F/+ mice at E18.5',
 'project': 151,
 'funding': 'PIs: Scott Fraser and Seth Ruffins. This work was funded by a grant from NIH NIDCR to Scott Fraser.\r\n',
 'summary': ' microMRI images of skulls of Wnt1Tgfbr2F/F * Alk5F/+ mice at E18.5. The dataset contains 3-D images in niftii format for 1 mouse. ',
 'description': ' microMRI images of skulls of Wnt1Tgfbr2F/F * Alk5F/+ mice at E18.5. The dataset contains 3-D images in niftii format for 1 mouse.  \n',
 'view_gene_summary': None,
 'mouse_genetic': None,
 'human_anatomic': None,
 'study_design': None,
 'release_date': '2015-06-01',
 'status': 2,
 'gene_summary': None,
 'thumbnail': 382,
 'show_in_jbrowse': None,
 '_keywords': 'Functional Analysis of Neural Crest and Palate Released',
 'RID': 13624,
 'RCB': None,
 'RMB': None,
 'RCT': '2017-09-22T17:33:18.797126-07:00',
 'RMT': '2018-03-12T03:29:19.902128-07:00'}

### Get a specific attribute value from an entity
To get one attribute value from an entity, index the entity with the `name` property of the corresponding `Column`.

In [11]:
# Keep a reference to the table so its columns can be used as keys
dataset = pb.schemas['isa'].tables['dataset']
print(entities[9][dataset.accession.name])

FB00000388.01


## Fetch a Limited Number of Entities
To limit the number of entities fetched from the catalog, call `fetch(limit=...)` explicitly with the desired upper bound.

In [12]:
entities.fetch(limit=3)
len(entities)

3
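
One way to picture the limit is as a cap requested from the server rather than a slice applied locally. The sketch below assumes the cap travels as a `limit` query parameter appended to the path URI; the notebook does not show the request itself, so treat that as an assumption. The helper `limited_uri` is hypothetical and not part of deriva.

```python
from urllib.parse import urlencode


def limited_uri(path_uri, limit=None):
    """Append a `limit` query parameter to a path URI.

    Illustrative helper only; the query-parameter form is an assumption of this sketch.
    """
    if limit is None:
        return path_uri  # no cap requested, so leave the URI unchanged
    return f"{path_uri}?{urlencode({'limit': limit})}"


base = 'https://www.facebase.org/ermrest/catalog/1/entity/dataset:=isa:dataset'
capped = limited_uri(base, limit=3)
print(capped)
```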

### Iterate over the EntitySet
`EntitySet`s are iterable like a typical container.

In [13]:
for entity in entities:
    print(entity[dataset.accession.name])

FB00000965
FB00000942
FB00000952


## Convert to Pandas DataFrame
An `EntitySet` can be converted to a pandas `DataFrame` through its `dataframe` property.

In [14]:
entities.dataframe

,RCB,RCT,RID,RMB,RMT,_keywords,accession,description,funding,gene_summary,...,mouse_genetic,project,release_date,show_in_jbrowse,status,study_design,summary,thumbnail,title,view_gene_summary
0,https://auth.globus.org/f226978f-e0be-4f47-a57...,2018-03-23T14:34:48.481747-07:00,61861,https://auth.globus.org/f226978f-e0be-4f47-a57...,2018-03-23T14:34:48.481747-07:00,,FB00000965,We have generated histone ChIP-seq libraries f...,,,...,,153,2018-03-23,True,1,We have generated histone ChIP-seq libraries f...,,442,ChIP-seq and RNA-seq of mouse e15.5 mandibular...,
1,https://auth.globus.org/de244c2a-618a-4f51-949...,2018-02-27T14:15:27.183844-08:00,38541,https://auth.globus.org/b506963e-d274-11e5-99f...,2018-03-13T19:23:41.416142-07:00,Rapid Identification and Validation of Human C...,FB00000942,**This is restricted-access human data.** To...,,,...,,309,2018-02-27,,2,,,442,(FB0070) 12yo girl w/ micrognathia (progressiv...,
2,https://auth.globus.org/de244c2a-618a-4f51-949...,2018-02-27T14:29:03.694379-08:00,38561,https://auth.globus.org/b506963e-d274-11e5-99f...,2018-03-13T19:30:20.424786-07:00,Rapid Identification and Validation of Human C...,FB00000952,**This is restricted-access human data.** To...,,,...,,309,2018-02-27,,2,,,442,"(FB0108) Right coronal synostosis, anisometrop...",


## Selecting Attributes
It is also possible to fetch only a subset of attributes from the catalog. The `entities(...)` method accepts a variable number of positional arguments followed by keyword arguments. Each argument must be a `Column` object from the table's `columns` container.

### Renaming selected attributes
To rename the selected attributes, use keyword arguments in the method. For example, `entities(..., new_name=table.column)` will return `table.column` under the name `new_name` in the entities returned from the server. (It will not change anything in the stored catalog data.) Note that in Python, keyword arguments _must come after_ positional arguments.

In [15]:
entities = path.entities(dataset.accession, dataset.title, The_Statues_Code=dataset.status).fetch(limit=5)

### Convert to list
Now we can look at the results of the fetch above. To demonstrate a different access mode, we convert the entities to a standard Python list and dump it to the console.

In [16]:
list(entities)

[{'accession': 'FB00000965',
  'title': 'ChIP-seq and RNA-seq of mouse e15.5 mandibular process',
  'The_Statues_Code': 1},
 {'accession': 'FB00000942',
  'title': '(FB0070) 12yo girl w/ micrognathia (progressive, she did not have as a child) & contracture of fingers',
  'The_Statues_Code': 2},
 {'accession': 'FB00000952',
  'title': '(FB0108) Right coronal synostosis, anisometropia and aniseikonia, ptosis of left eyelid, hyperopia',
  'The_Statues_Code': 2},
 {'accession': 'FB00000972',
  'title': 'E18.5 wildtype mouse microCT',
  'The_Statues_Code': 1},
 {'accession': 'FB00000964',
  'title': 'Cell Proliferation Heat Map of E10.5 mouse embryo',
  'The_Statues_Code': 1}]
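
The effect of selecting and renaming attributes can be reproduced locally on plain dictionaries, which may help when reasoning about the shape of the results. The sketch below is not how deriva or ERMrest performs the projection (that happens on the server); the `project` function is hypothetical, and the sample rows reuse values shown earlier in this notebook.

```python
def project(rows, *names, **renamed):
    """Keep only the selected keys; keyword arguments map new_name -> original key.

    Illustrative helper only (an assumption for this sketch, not a deriva API).
    """
    projected = []
    for row in rows:
        out = {name: row[name] for name in names}                    # positional: keep the name
        out.update({new: row[old] for new, old in renamed.items()})  # keyword: rename
        projected.append(out)
    return projected


rows = [
    {'accession': 'FB00000965', 'status': 1, 'project': 153, 'RID': 61861,
     'title': 'ChIP-seq and RNA-seq of mouse e15.5 mandibular process'},
    {'accession': 'FB00000942', 'status': 2, 'project': 309, 'RID': 38541,
     'title': '(FB0070) 12yo girl w/ micrognathia (progressive, she did not have as a child) & contracture of fingers'},
]

result = project(rows, 'accession', 'title', The_Statues_Code='status')
print(result[0])
```

As in the `entities(...)` call, unselected keys such as `project` and `RID` are dropped, and the original `status` key appears only under its new name.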