# 1 - Introduction to Catalog and Basic Python Client Interactions

## Before starting, complete the installation steps below

### INSTALL REQUIREMENTS
1) Install Globus Catalog-client (https://github.com/globusonline/catalog-client)
git clone https://github.com/globusonline/catalog-client
cd catalog-client
python setup.py install --user

2) Install Globus Transfer API (https://github.com/globusonline/transfer-api-client-python)
git clone https://github.com/globusonline/transfer-api-client-python
cd transfer-api-client-python
python setup.py install --user

### Globus Online - Catalog Command Line Client
https://github.com/globusonline/catalog-client

Catalog User Interface: https://catalog-alpha.globuscs.info/
Contact: Ben Blaiszik (blaiszik@uchicago.edu)

### OBTAIN GLOBUS CREDENTIALS
Sign up at https://www.globus.org/SignUp. **These are the credentials you will use with Catalog.**


# Catalog Data Model

* <b> Catalogs </b>
    * Have specified "vocabularies" or tag definitions.
        * e.g. beam_energy - float, description - text
    * Catalogs contain many datasets
        
* <b> Datasets </b>
    * Datasets can have tags added and ACLs specified
    * Datasets contain many members
* <b> Members </b>
    * Members can have tags added
    * Generally point to a data file or directory on a Globus endpoint
    * Can also point to a more general URI

<img src="img/catalog-model.png" width=70%>
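The hierarchy above can be pictured as nested Python data. This is only an illustrative sketch: the keys `annotation_defs`, `annotations`, `datasets`, and `members` are hypothetical names chosen for this example, and the nesting is not the format the service returns.

```python
# Hypothetical in-memory picture of the catalog -> dataset -> member hierarchy.
# The nesting and most key names are assumptions made for illustration only.
demo_catalog = {
    "config": {"name": "Example Catalog"},
    # The catalog's vocabulary: annotation (tag) name -> type
    "annotation_defs": {"beam_energy": "float8", "description": "text"},
    "datasets": [
        {
            "name": "Example Dataset",
            "annotations": {"beam_energy": 1.1},
            "members": [
                {"data_type": "file",
                 "data_uri": "globus://go#ep1/~/test.tst"},
            ],
        },
    ],
}


def count_members(cat):
    """Return the total number of members across all datasets."""
    return sum(len(ds["members"]) for ds in cat["datasets"])


print(count_members(demo_catalog))
```

In the real service each level is created and fetched through separate client calls, as the following sections show.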

# Imports and Authentication
* In some versions of IPython Notebook, this cell seems to fail. If it does, paste the code into an IPython shell instead (start one with ipython -i).

In [102]:
import os
from globusonline.catalog.client.catalog_wrapper import *
from globusonline.catalog.client.operators import Op
from globusonline.catalog.client.rest_client import RestClientError

# Store authentication data in a local file
token_file = os.getenv('HOME','')+"/.ssh/gotoken.txt"
wrap = CatalogWrapper(token_file=token_file)
client = wrap.catalogClient
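Before constructing the wrapper, it can help to check whether the token file is already present. The sketch below builds the same path as the cell above; the helper `default_token_path` is a hypothetical name, and what `CatalogWrapper` does when the file is missing is not shown in this notebook.

```python
import os


def default_token_path(home=None):
    """Build the token file path used above: <home>/.ssh/gotoken.txt."""
    if home is None:
        home = os.path.expanduser("~")
    return os.path.join(home, ".ssh", "gotoken.txt")


demo_path = default_token_path()
if os.path.isfile(demo_path):
    print("Found token file: %s" % demo_path)
else:
    print("No token file yet at %s" % demo_path)
```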

# Create a Catalog and Save the ID

In [None]:
catalog_info = {
    "config": {
        "name": "Ben Demo Catalog"
    }
}
_,response = client.create_catalog(catalog_info)
catalog_id = response['id']
response

# Create a Dataset within the New Catalog and Save the Dataset ID

In [None]:
dataset_info = {"name":"New Dataset"}
_,response = client.create_dataset(catalog_id, dataset_info)
dataset_id = response['id']
response

# Add a Member to the New Dataset and Save the Member ID

In [None]:
member_info = {"data_type":"file", "data_uri":"globus://go#ep1/~/test.tst"}
_,response = client.create_member(catalog_id, dataset_id, member_info)
member_id = response['id']

In [None]:
response
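The create calls in this notebook share one pattern: each returns a pair whose second element is a dict-like response containing an `id`. A small helper can capture that pattern. This is a sketch under that assumption; `create_and_get_id` and `fake_create_dataset` are hypothetical names, and the fake function stands in for the real client so the example runs without a server.

```python
def create_and_get_id(create_fn, *args):
    """Call a create_* method and return (id, full_response).

    Assumes the method returns a 2-tuple whose second element
    is a dict containing an 'id' key.
    """
    _, response = create_fn(*args)
    if "id" not in response:
        raise KeyError("response has no 'id': %r" % (response,))
    return response["id"], response


# Stand-in for client.create_dataset; it only mimics the return shape.
def fake_create_dataset(catalog_id, dataset_info):
    return None, {"id": 7, "name": dataset_info["name"]}


demo_id, demo_response = create_and_get_id(
    fake_create_dataset, 1, {"name": "New Dataset"})
print(demo_id)
```

With a live client, the same helper could be called as `create_and_get_id(client.create_dataset, catalog_id, dataset_info)`.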

# Get all Members in a Dataset

In [None]:
_, response = client.get_members(catalog_id, dataset_id)
for member in response:
    print("[%s] %s  %s" % (member['id'], member['data_type'], member['data_uri']))

In [None]:
response

# Get all Datasets in a Catalog

In [None]:
_,response = client.get_datasets(catalog_id)
for dataset in response:
    print("[%s] %s" % (dataset['id'], dataset['name']))

In [None]:
response

# List all Catalogs in the Database

In [None]:
_,response = client.get_catalogs()
for catalog in response:
    print("[%s] %s" % (catalog['id'], catalog['config']['name']))

# Add an Annotation Definition and Apply it to a Dataset
* Available annotation types: text, int8, float8, boolean, timestamptz, date


In [None]:
help(client.create_annotation_def)

In [None]:
new_annotations = [ {"name":"beam_energy", "type":"float8"},
                    {"name":"reference", "type":"text"}, 
                    {"name":"sample_number", "type":"int8"}]
responses = []
for annotation in new_annotations:
    _,response = client.create_annotation_def(catalog_id, annotation['name'],annotation['type'])
    responses.append(response)

In [None]:
responses

In [None]:
help(client.add_dataset_annotations)

In [None]:
_,response = client.add_dataset_annotations(
    catalog_id, dataset_id,
    {"beam_energy":"1.1", "reference":"this is a reference", "sample_number":1})
response
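Each annotation definition has one of the types listed at the top of this section, so values can be sanity-checked locally before they are sent. The sketch below is an assumption-heavy illustration: the mapping `PYTHON_TYPES` and the helper `check_annotations` are hypothetical and not part of the client library, and the service may well accept and convert string values such as "1.1", in which case this check is stricter than necessary.

```python
import datetime

# Assumed mapping from catalog annotation types to Python types.
# This mapping is hypothetical and is not part of the client library.
PYTHON_TYPES = {
    "text": (str,),
    "int8": (int,),
    "float8": (float, int),
    "boolean": (bool,),
    "timestamptz": (datetime.datetime,),
    "date": (datetime.date,),
}


def check_annotations(defs, values):
    """Return a list of problems in `values`, given `defs` (name -> type)."""
    problems = []
    for name, value in sorted(values.items()):
        if name not in defs:
            problems.append("%s: no such annotation definition" % name)
            continue
        allowed = PYTHON_TYPES[defs[name]]
        # bool is a subclass of int, so reject it explicitly for numeric types.
        wrong_bool = isinstance(value, bool) and bool not in allowed
        if wrong_bool or not isinstance(value, allowed):
            problems.append(
                "%s: expected %s, got %r" % (name, defs[name], value))
    return problems


demo_defs = {"beam_energy": "float8", "reference": "text",
             "sample_number": "int8"}
demo_values = {"beam_energy": "1.1", "reference": "this is a reference",
               "sample_number": 1}
print(check_annotations(demo_defs, demo_values))
```

Here the string "1.1" is flagged because the sketch expects a Python float for a float8 annotation.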

# Retrieve Annotations on a Dataset

In [None]:
catalog_annotations = []
_,annotation_list = client.get_annotation_defs(catalog_id)
for annotation in annotation_list:
    catalog_annotations.append(annotation['name'])

_,response = client.get_dataset_annotations(catalog_id, dataset_id, catalog_annotations)
response

# Query for Datasets in a Catalog

In [None]:
help(client.get_datasets)

### Valid Operators

In [None]:
Op

In [None]:
_,response = client.get_datasets(catalog_id, selector_list=[("beam_energy", Op['GT'], 1)])
response
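To make the meaning of a selector such as `("beam_energy", Op['GT'], 1)` concrete, the sketch below applies the same idea to plain dicts on the client side. It is only an illustration of the intended semantics, not how the service evaluates queries. The operator names in `LOCAL_OPS` are assumptions and need not match the keys of `Op`, and combining several selectors with AND and skipping datasets that lack the annotation are also assumptions.

```python
import operator

# Local stand-ins for comparison operators; these names are assumptions
# and need not match the keys of the client's Op mapping.
LOCAL_OPS = {"GT": operator.gt, "LT": operator.lt, "EQ": operator.eq}


def filter_locally(datasets, selector_list):
    """Keep datasets that satisfy every (annotation, op_name, value) selector."""
    kept = []
    for ds in datasets:
        ok = True
        for name, op_name, value in selector_list:
            # A dataset without the annotation never matches (assumption).
            if name not in ds or not LOCAL_OPS[op_name](ds[name], value):
                ok = False
                break
        if ok:
            kept.append(ds)
    return kept


demo_datasets = [
    {"id": 1, "beam_energy": 1.1},
    {"id": 2, "beam_energy": 0.5},
    {"id": 3},
]
demo_matches = filter_locally(demo_datasets, [("beam_energy", "GT", 1)])
print([ds["id"] for ds in demo_matches])
```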