# DataFlow API walkthrough
Suhas Somnath <br>
4/6/2022 <br>
Oak Ridge National Laboratory

In [1]:
api_key = "Bearer eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyX2lkIjoyLCJleHAiOjE2ODAzMDcyMDB9.MySTFxu-_PkSDSFWbls6ASMRTZ36CoPDX5AofHnqDm0"
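
Hard-coding a key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of one alternative is to read the key from an environment variable. The variable name `DATAFLOW_API_KEY` and the helper `load_api_key` are hypothetical choices made for this illustration and are not defined by DataFlow:

```python
import os


def load_api_key(env_var="DATAFLOW_API_KEY"):
    """Return the API key stored in the environment variable `env_var`.

    The variable name is an assumption made for this sketch; store the full
    string you would otherwise paste into the notebook, "Bearer " prefix included.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key


# Usage (assuming the variable has been exported in the shell beforehand):
# api_key = load_api_key()
```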

In [2]:
from dlow import API

## 1. Instantiate the API

In [3]:
api = API(api_key)

Using staging server as default


## 2. Check to see if Globus endpoints are active:

In [4]:
response = api.endpoints_active()
response

{'source_activation': {'code': 'AlreadyActivated'},
 'destination_activation': {'code': 'AutoActivated.CachedCredential'}}
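
The response can also be checked programmatically instead of by eye. The sketch below assumes that any code beginning with `AlreadyActivated` or `AutoActivated` means the endpoint is usable; that rule is inferred from the two codes shown above rather than taken from the DataFlow documentation, and the helper name `endpoints_are_active` is hypothetical:

```python
def endpoints_are_active(response):
    """Map each entry of an endpoints_active() response to True or False.

    Assumption: codes starting with "AlreadyActivated" or "AutoActivated"
    indicate an active endpoint; every other code is treated as inactive.
    """
    active_prefixes = ("AlreadyActivated", "AutoActivated")
    return {
        name: str(info.get("code", "")).startswith(active_prefixes)
        for name, info in response.items()
    }


# The response printed above, copied in so the sketch runs on its own
response = {
    "source_activation": {"code": "AlreadyActivated"},
    "destination_activation": {"code": "AutoActivated.CachedCredential"},
}
status = endpoints_are_active(response)
print(status)
# {'source_activation': True, 'destination_activation': True}
```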

## 3. Activate one or both endpoints as necessary:

In [5]:
response = api.endpoints_activate("syz", "password", endpoint="destination")
response

{'status': 'ok'}

In [6]:
response = api.endpoints_active()
response

{'source_activation': {'code': 'AlreadyActivated'},
 'destination_activation': {'code': 'AlreadyActivated'}}
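
Steps 2 and 3 can be combined so that only inactive endpoints are activated and the password is prompted for rather than typed into a cell. This is a sketch under several assumptions: the helper `activate_inactive_endpoints` is hypothetical, the "active" rule is inferred from the codes shown above, and `endpoint="source"` is assumed to be accepted in the same way as the `endpoint="destination"` call shown in this walkthrough:

```python
from getpass import getpass

# Assumption: these prefixes mark an endpoint as already usable
ACTIVE_PREFIXES = ("AlreadyActivated", "AutoActivated")


def activate_inactive_endpoints(api, username, ask_password=getpass):
    """Activate only the endpoints that endpoints_active() reports as inactive.

    `api` is any object exposing endpoints_active() and endpoints_activate()
    as used in this walkthrough. Returns the names of the endpoints activated.
    """
    activated = []
    password = None
    for key, info in api.endpoints_active().items():
        if str(info.get("code", "")).startswith(ACTIVE_PREFIXES):
            continue  # nothing to do for this endpoint
        # "destination_activation" -> "destination" (naming assumed from the output above)
        endpoint = key.split("_")[0]
        if password is None:
            password = ask_password("Password: ")  # prompt once, only when needed
        result = api.endpoints_activate(username, password, endpoint=endpoint)
        if result.get("status") != "ok":
            raise RuntimeError(f"Could not activate {endpoint}: {result}")
        activated.append(endpoint)
    return activated
```

With the `api` object from step 1, `activate_inactive_endpoints(api, "syz")` would prompt for a password only if at least one endpoint needs activation.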

## 4. Create a measurement Dataset
This creates a directory at the destination Globus Endpoint:

In [7]:
response = api.dataset_create("Atomic Force Microscopy Scan of PZT",
                               # metadata = {"scientific metadata": "coming very soon!"},
                              )
response

{'id': 14,
 'name': 'Atomic Force Microscopy Scan of PZT',
 'creator': {'id': 2, 'name': 'Suhas Somnath'},
 'instrument': None,
 'metadata_field_values': []}

Getting the dataset ID programmatically to use later on:

In [9]:
dataset_id = response['id']
dataset_id

14

## 5. Upload data file(s) to Dataset

In [10]:
response = api.file_upload("./AFM_Topography.PNG", dataset_id)
response

using Globus since other file transfer adapters have not been implemented


{'id': 46,
 'name': 'AFM_Topography.PNG',
 'file_length': 133776,
 'file_type': '',
 'created_at': '2022-04-06 18:03:23 UTC',
 'relative_path': '',
 'is_directory': False}

Upload another data file to the same dataset, this time into the nested directory ``foo/bar`` via ``relative_path``:

In [11]:
response = api.file_upload("./measurement_configuration.txt", dataset_id, relative_path="foo/bar")
response

using Globus since other file transfer adapters have not been implemented


{'id': 47,
 'name': 'measurement_configuration.txt',
 'file_length': 105,
 'file_type': '',
 'created_at': '2022-04-06 18:04:59 UTC',
 'relative_path': 'foo/bar',
 'is_directory': False}
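
When many files need to go into one dataset, the two calls above generalize to a loop over a local directory. The following is a rough sketch, not part of DataFlow: `plan_uploads` and `upload_tree` are hypothetical helpers, and the only API call used is `file_upload` with the arguments shown above. Files at the top level are uploaded without `relative_path`, mirroring the first upload:

```python
from pathlib import Path


def plan_uploads(local_root):
    """Return (file_path, relative_dir) pairs for every file under local_root.

    relative_dir is "" for files directly inside local_root, otherwise the
    POSIX-style sub-directory such as "foo/bar".
    """
    root = Path(local_root)
    plan = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel_dir = path.parent.relative_to(root).as_posix()
            plan.append((str(path), "" if rel_dir == "." else rel_dir))
    return plan


def upload_tree(api, local_root, dataset_id):
    """Upload every file under local_root, preserving sub-directories."""
    responses = []
    for file_path, rel_dir in plan_uploads(local_root):
        if rel_dir:
            responses.append(api.file_upload(file_path, dataset_id, relative_path=rel_dir))
        else:
            responses.append(api.file_upload(file_path, dataset_id))
    return responses
```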

## 6. Search Dataset:

In [12]:
response = api.dataset_search("Scan")
response

{'total': 1,
 'has_more': False,
 'results': [{'id': 14,
   'created_at': '2022-04-06T17:57:50Z',
   'name': 'Atomic Force Microscopy Scan of PZT'}]}

In [13]:
dset_id = response['results'][0]['id']
dset_id

14
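
Taking `results[0]` works here because the search returned exactly one hit. A slightly more defensive sketch selects the dataset by exact name and fails loudly when the match is missing or ambiguous. The helper `find_dataset_id` is hypothetical; it relies only on the `results`, `name`, and `id` keys visible in the response above:

```python
def find_dataset_id(search_response, name):
    """Return the id of the single search result whose name equals `name`."""
    matches = [
        result["id"]
        for result in search_response.get("results", [])
        if result.get("name") == name
    ]
    if len(matches) != 1:
        raise LookupError(f"Expected exactly one dataset named {name!r}, found {len(matches)}")
    return matches[0]


# The search response printed above, copied in so the sketch runs on its own
search_response = {
    "total": 1,
    "has_more": False,
    "results": [
        {
            "id": 14,
            "created_at": "2022-04-06T17:57:50Z",
            "name": "Atomic Force Microscopy Scan of PZT",
        }
    ],
}
print(find_dataset_id(search_response, "Atomic Force Microscopy Scan of PZT"))
# 14
```

If `has_more` is `True`, the wanted dataset may not be in the returned results at all, so a `LookupError` from this helper does not prove that the dataset is absent.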

## 7. View this Dataset:

In [14]:
response = api.dataset_info(dset_id)
response

{'id': 14,
 'name': 'Atomic Force Microscopy Scan of PZT',
 'creator': {'id': 2, 'name': 'Suhas Somnath'},
 'instrument': None,
 'metadata_field_values': []}

## 8. View files uploaded via DataFlow:
This step does not use DataFlow; it simply inspects the destination file system.

Datasets are grouped into one directory per date:

In [24]:
!ls -hlt ~/dataflow/untitled_instrument/

total 23K
drwxr-xr-x 3 syz users 3 Apr  6 14:02 2022-04-06
drwxr-xr-x 3 syz users 3 Apr  4 13:47 2022-04-04
drwxr-xr-x 4 syz users 4 Apr  1 20:18 2022-04-01
drwxr-xr-x 4 syz users 4 Mar 30 14:10 2022-03-30
drwxr-xr-x 3 syz users 3 Oct 21 13:56 2021-10-21
drwxr-xr-x 3 syz users 3 Sep 28  2021 2021-09-28
drwxr-xr-x 6 syz users 6 Sep 21  2021 2021-09-21
drwxr-xr-x 4 syz users 4 Jul 30  2021 2021-07-30


There may be more than one dataset per day. Here we have only one:

In [23]:
!ls -hlt ~/dataflow/untitled_instrument/2022-04-06/

total 512
drwxr-xr-x 3 syz users 5 Apr  6 14:05 135750_atomic_force_microscopy_scan_of_pzt


Viewing the root directory of the dataset we just created:

In [21]:
!ls -hlt ~/dataflow/untitled_instrument/2022-04-06/135750_atomic_force_microscopy_scan_of_pzt/

total 157K
drwxr-xr-x 3 syz users    3 Apr  6 14:05 foo
-rw-r--r-- 1 syz users 131K Apr  6 14:04 AFM_Topography.PNG
-rw-r--r-- 1 syz users    2 Apr  6 14:02 metadata.json


We will very soon be able to specify root-level metadata that will be stored in ``metadata.json``.

We can also see the nested directories ``foo/bar``, where we uploaded the second file:

In [22]:
!ls -hlt ~/dataflow/untitled_instrument/2022-04-06/135750_atomic_force_microscopy_scan_of_pzt/foo/

total 512
drwxr-xr-x 2 syz users 3 Apr  6 14:05 bar


Looking at the innermost directory, ``bar``:

In [25]:
!ls -hlt ~/dataflow/untitled_instrument/2022-04-06/135750_atomic_force_microscopy_scan_of_pzt/foo/bar

total 9.5K
-rw-r--r-- 1 syz users 105 Apr  6 14:05 measurement_configuration.txt
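
The directory-by-directory `ls` calls above can be replaced by a single recursive listing in Python. This sketch uses only the standard library; `list_tree` is a hypothetical helper, and the path in the usage lines is the one shown in this walkthrough, which will differ on other systems:

```python
import os


def list_tree(root):
    """Return (relative_path, size_in_bytes) for every file below root, sorted by path."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            rel_path = os.path.relpath(full_path, root)
            entries.append((rel_path.replace(os.sep, "/"), os.path.getsize(full_path)))
    return sorted(entries)


# Prints nothing if the directory does not exist on this machine
dataset_day = os.path.expanduser("~/dataflow/untitled_instrument/2022-04-06")
for rel_path, size in list_tree(dataset_day):
    print(f"{size:>10d}  {rel_path}")
```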
