# Datapath Example 2
This notebook shows the basics of accessing data through data paths.
It assumes that you are familiar with the concepts presented in the
Example 1 notebook.

In [1]:
# Import deriva modules
from deriva_common import ErmrestCatalog, get_credential

In [2]:
# Connect with the deriva catalog
protocol = 'https'
hostname = 'www.facebase.org'
catalog_number = 1
credential = None
# If you need to authenticate, use Deriva Auth agent and get the credential
# credential = get_credential(hostname)
catalog = ErmrestCatalog(protocol, hostname, catalog_number, credential)

In [3]:
# Get the path builder interface for this catalog
pb = catalog.getPathBuilder()

## DataPaths
The path builder interface allows you to build data paths from the base tables that are accessed via the interface. A data path begins with a table (or an "aliased" table, to be discussed later). From this "root" one can "`link`", "`filter`", "`reset_context`", project "`attributes`", and fetch the referenced "`entities`".

Begin by getting a base table, "`dataset`", from the "`isa`" schema accessible from the path builder.

In [4]:
dataset = pb.isa.dataset

## DataPath URIs
All data paths have URIs for the referenced resources in ERMrest. The URI identifies the resources which are available through Web protocols.

In [5]:
print(dataset.uri)

https://www.facebase.org/ermrest/catalog/1/entity/isa:dataset
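
The URI printed above follows a regular pattern. As a rough sketch, the same string can be assembled from the connection parameters and the schema-qualified table name. The helper `entity_uri` is hypothetical and not part of the deriva API, and it skips the URL-quoting that a real client would need for unusual names.

```python
def entity_uri(protocol, hostname, catalog_number, schema, table):
    """Build an ERMrest-style entity URI.

    Hypothetical helper for illustration only; it does not quote
    special characters in the schema or table name.
    """
    return "{}://{}/ermrest/catalog/{}/entity/{}:{}".format(
        protocol, hostname, catalog_number, schema, table)


# Same connection parameters as used earlier in this notebook.
uri = entity_uri("https", "www.facebase.org", 1, "isa", "dataset")
print(uri)
```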


## EntitySets
The data from a data path are accessed through a pythonic container object, the "`EntitySet`." The entity set is returned by the datapath object's "`entities()`" method. The entity set fetches the entities from ERMrest on first access (such as checking its length, iterating over it, or getting an item).

In [6]:
entityset = dataset.entities()

Entity sets behave like Python containers. For example, we can check the number of rows in this entity set. Because this is the first access to the entity set, this call is also the point at which the actual fetch from the server is performed.

In [7]:
len(entityset)

712
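
The fetch-on-first-access behavior can be illustrated with a small stand-alone container. This is only a sketch of the general lazy-loading pattern, not the deriva implementation: the class `LazyEntitySet`, the `fetch_count` counter, and the `fake_fetch` function (which stands in for the HTTP request) are all hypothetical names introduced here.

```python
class LazyEntitySet:
    """Hypothetical stand-in for an entity set (not the deriva class)."""

    def __init__(self, fetch):
        self._fetch = fetch    # callable that returns a list of dict rows
        self._rows = None      # nothing has been fetched yet
        self.fetch_count = 0   # number of times the "server" was contacted

    def _ensure_fetched(self):
        # Fetch once, on first access, and cache the rows afterwards.
        if self._rows is None:
            self._rows = list(self._fetch())
            self.fetch_count += 1
        return self._rows

    def __len__(self):
        return len(self._ensure_fetched())

    def __iter__(self):
        return iter(self._ensure_fetched())

    def __getitem__(self, index):
        return self._ensure_fetched()[index]


def fake_fetch():
    # Stands in for the request to the server; two placeholder rows.
    return [{"accession": "FB00000827"}, {"accession": "FB00000861"}]


lazy = LazyEntitySet(fake_fetch)
count_before = lazy.fetch_count   # creating the set fetched nothing
size = len(lazy)                  # first access triggers the fetch
first = lazy[0]                   # served from the cached rows
count_after = lazy.fetch_count
print(count_before, size, count_after)
```

In this sketch, checking the length, indexing, and iterating all share one cached result, which mirrors the behavior described above.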

### Get an entity
Access and print one of the entities in the set.

In [8]:
print(entityset[9])

{'id': 10521, 'accession': 'FB00000288.02', 'title': 'microCT - Soft Tissue of Smad4fl/fl Control Mouse at E18.5', 'project': 156, 'funding': 'This study was supported by grants from the National Institute of Dental and Craniofacial Research, NIH (DE012711, DE014078, DE017007, and DE020065) to Yang Chai.', 'summary': 'Mouse ID: DH202; This dataset includes a microCT scan of the skull of a Smad4fl/fl mouse at E18.5. The scan is in NiFTI format, which can be read by a number of free software applications including those listed below. To receive the scan in DICOM format, please email help@facebase.org. The following programs allow you to load and explore files in NiFTI format: -- ImageJ, a java-based program that runs on most operating systems, including MAC OSX -- MBAT, from the Laboratory of Neural Imaging at USC -- Mango, from Neuroimaging Informatics Tools and Resources Clearinghouse, offers desktop, web, and iPad-compatible versions. -- MRIcro, which runs on Windows and Linux systems

### Get a column value from an entity
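Access and print one attribute (column) from one entity in the set. Each entity is a dictionary keyed by column name, so the column object's `name` property supplies the key.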

In [9]:
print(entityset[9][dataset.accession.name])

FB00000288.02


## Set a Limit
Set a limit on the query. Here we request only 3 entities from the server.

In [10]:
entityset_limit3 = dataset.entities(limit=3)
len(entityset_limit3)

3
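
One way to picture the effect of the limit is as a query parameter on the request URI. The sketch below assumes the limit is passed as ERMrest's `limit` query parameter. The helper `with_limit` is hypothetical, and the URL that the library actually generates may differ.

```python
from urllib.parse import urlencode


def with_limit(uri, limit):
    """Append a limit query parameter to a URI (hypothetical helper)."""
    return "{}?{}".format(uri, urlencode({"limit": limit}))


# Assumption: the entity URI shown earlier, restricted to 3 rows.
limited_uri = with_limit(
    "https://www.facebase.org/ermrest/catalog/1/entity/isa:dataset", 3)
print(limited_uri)
```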

### Iterate over the entity set
Entity sets are iterable.

In [11]:
for e in entityset_limit3:
    print(e[dataset.accession.name])

FB00000827
FB00000861
FB00000380.01


## Convert to Pandas DataFrame
Entity sets can be transformed into the popular Pandas DataFrame.

In [12]:
entityset_limit3.dataframe

| | _keywords | accession | description | funding | gene_summary | human_anatomic | id | mouse_genetic | project | release_date | show_in_jbrowse | status | study_design | summary | thumbnail | title | view_gene_summary | view_related_datasets |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Epigenetic landscapes and regulatory divergenc... | FB00000827 | Tg(hg19_chr7:145843942-145844366::LacZ) activi... | U01 DE024430 | | | 14062 | | 307 | 2016-01-01 | | 3 | | Activity of human neural crest enhancer in E11... | 442 | Activity of human neural crest enhancer in E11... | | |
| 1 | Developing 3D Craniofacial Morphometry Data an... | FB00000861 | Human-subject dataset with scans (in .ply form... | | | | 14096 | | 301 | 2017-03-03 | | 3 | | Human-subject dataset with scans for the 3D cr... | 442 | Developing 3D Craniofacial Morphometry Data an... | | |
| 2 | Functional Analysis of Neural Crest and Palate... | FB00000380.01 | microMRI images of skulls of Ctgftm1Kml/Ctgft... | PIs: Scott Fraser and Seth Ruffins. This work ... | | | 6399 | | 151 | 2015-06-01 | | 2 | | microMRI images of skulls of Ctgftm1Kml/Ctgft... | 382 | microMRI images of skulls of Ctgftm1Kml/Ctgftm... | | |


## Fetch a Subset of Attributes
It is also possible to get a subset of attributes from the server. The
`attributes(...)` method accepts a variable argument list. Each argument
must be a column object from the table's `columns` container. In this case
we can access each column directly as a property of the table, since the
column name is a valid Python identifier.

In [13]:
entityset_attrs = dataset.attributes(dataset.accession, dataset.title, dataset.status).entities(limit=5)

### Convert to list
Convert the entity set to a standard python list and dump to the console.

In [14]:
list(entityset_attrs)

[{'accession': 'FB00000827',
  'status': 3,
  'title': 'Activity of human neural crest enhancer in E11.5 mouse embryo - Tg(hg19_chr7:145843942-145844366::LacZ)'},
 {'accession': 'FB00000861',
  'status': 3,
  'title': 'Developing 3D Craniofacial Morphometry Data and Tools to Transform Dysmorphology'},
 {'accession': 'FB00000380.01',
  'status': 2,
  'title': 'microMRI images of skulls of Ctgftm1Kml/Ctgftm1Kml mice at E14.5 '},
 {'accession': 'FB00000393.01',
  'status': 2,
  'title': 'microMRI images of skulls of Tgfbr2fl/fl mice at E18.5'},
 {'accession': 'FB00000155',
  'status': 2,
  'title': 'Micro-CT images of adult mouse skulls, Collaborative Cross NZO x WSB'}]
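
To make concrete what a projection returns, the following sketch applies the same idea to plain dictionaries on the client side. The `project` helper is hypothetical and is not how `attributes(...)` is implemented. As described above, the library requests the subset from the server, whereas this sketch only filters rows that are already in memory. The sample rows reuse values shown earlier in this notebook.

```python
def project(rows, *column_names):
    """Keep only the named keys in each row (client-side illustration)."""
    return [{name: row[name] for name in column_names} for row in rows]


# Two rows with a few of the columns shown in the DataFrame output above.
rows = [
    {"id": 14062, "accession": "FB00000827", "status": 3, "project": 307},
    {"id": 14096, "accession": "FB00000861", "status": 3, "project": 301},
]

projected = project(rows, "accession", "status")
print(projected)
```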