# TOY PIPELINE

In this notebook we set up a toy pipeline that takes metadata from elabFTW and inserts them into an HDF5 file. To do this we will
- get the metadata from elabFTW through its API
- read the output file
- create an empty HDF5 file
- map the metadata we are interested in onto the HDF5 fields

### Set up

In [1]:
pip install nexusformat

Note: you may need to restart the kernel to use updated packages.


In [20]:
import datetime
# the python library for elabftw
import elabapi_python
import json
import csv
import os
import ast
import pprint
import h5py
from nexusformat.nexus import nxload

path = ""  # insert your path, ending with a path separator; "" is the current directory




###  API configuration

In [21]:

# replace with the URL of your instance
API_HOST_URL = 'https://nffa-di-electronic-lab.areasciencepark.it/api/v2/'
# replace with your api key
API_KEY = ''#insert your key

### ElabFTW get

Now we use the elabFTW API to get the experiment with id=48 and save the response in a JSON file, 'exp48.json'. We are not going to use the Python library elabapi_python here: with the 'os' package we work as if we were in the shell and call the curl command.

In [22]:
# GET experiment 48 from the API and redirect the JSON response to exp48.json
os.system('curl -X GET "' + API_HOST_URL + 'experiments/48" -H "Authorization: ' + API_KEY + '" -H "accept: application/json" > "' + path + 'exp48.json"')

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2464    0  2464    0     0   8730      0 --:--:-- --:--:-- --:--:--  8737


0

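The returned value '0' is the exit status of the shell command: curl ran without errors. It does not guarantee that the request was accepted, because by default curl also exits with 0 when the server answers with an HTTP error.
Now let's look at the file.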
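
As an aside, the same request can be made without leaving Python. The following is only a minimal sketch that uses the standard library module `urllib.request`. The helper names `build_experiment_request` and `fetch_experiment` are hypothetical, and the sketch assumes that the API key is sent as-is in the `Authorization` header, as in the curl command above.

```python
import json
import urllib.request


def build_experiment_request(host_url, api_key, exp_id):
    """Build (but do not send) a GET request for one experiment.

    Hypothetical helper: host_url is the API base URL, e.g. API_HOST_URL.
    """
    url = host_url.rstrip('/') + '/experiments/' + str(exp_id)
    headers = {'Authorization': api_key, 'accept': 'application/json'}
    return urllib.request.Request(url, headers=headers, method='GET')


def fetch_experiment(host_url, api_key, exp_id):
    """Send the request and return the decoded JSON response as a dictionary."""
    request = build_experiment_request(host_url, api_key, exp_id)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx answers,
    # so a failed request does not go unnoticed
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode('utf-8'))
```

With the variables defined above, `dic_exp = fetch_experiment(API_HOST_URL, API_KEY, 48)` would give the dictionary directly, without the intermediate file.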

In [23]:
with open(path+'exp48.json', "r") as jsonfile:
    dic_exp = json.load(jsonfile)
dic_exp

{'access_key': None,
 'body': '',
 'body_html': '',
 'canread': '{"base": 30, "teams": [], "users": [], "teamgroups": []}',
 'canwrite': '{"base": 20, "teams": [], "users": [], "teamgroups": []}',
 'category': 6,
 'category_color': 'b1a9d1',
 'category_title': 'NXem_toy',
 'comments': [],
 'content_type': 1,
 'created_at': '2024-10-10 09:49:44',
 'custom_id': None,
 'date': '2024-10-10',
 'elabid': '20241010-2c7fc8842cab5d99c5d67ec8d8bbe743a5f9376e',
 'events_start': None,
 'experiments_links': [],
 'firstname': 'Federica',
 'fullname': 'Federica Bazzocchi',
 'has_attachment': None,
 'has_comment': 0,
 'id': 48,
 'items_links': [],
 'lastchangeby': 3,
 'lastname': 'Bazzocchi',
 'locked': 0,
 'locked_at': None,
 'lockedby': None,
 'metadata': '{"elabftw": {"extra_fields_groups": [{"id": 1, "name": "Sample"}, {"id": 2, "name": "Coordinate_System_Set"}, {"id": 3, "name": "Fields"}]}, "extra_fields": {"Definition": {"type": "text", "value": "NXem", "group_id": 3, "required": true}, "Start_

It is a JSON file that contains all the information present in the experiment page (all the metadata!), but we built the template in such a way that we are interested only in the value corresponding to the 'metadata' key. Notice that neglecting comments and attachments was our choice: if we wanted to use them, we would have to change the mapping!
Let's look at the value corresponding to the 'metadata' key.

In [24]:
dic_exp['metadata']

'{"elabftw": {"extra_fields_groups": [{"id": 1, "name": "Sample"}, {"id": 2, "name": "Coordinate_System_Set"}, {"id": 3, "name": "Fields"}]}, "extra_fields": {"Definition": {"type": "text", "value": "NXem", "group_id": 3, "required": true}, "Start_Time": {"type": "date", "value": "2024-10-10", "group_id": 3, "required": true}, "Experiment_alias": {"type": "text", "value": "em_toy_id1", "group_id": 3, "required": true}}}'

It is a string, so let us transform it into a dictionary. Since the string is itself JSON, json.loads parses it directly and converts the JSON literal true into Python's True.

In [25]:
dic_meta=json.loads(dic_exp['metadata'])
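
The metadata string is itself valid JSON, so the standard `json` module can parse it without rewriting its literals. The minimal sketch below uses a short hypothetical sample (the `comment` and `empty` keys are invented for illustration and do not appear in the experiment above). It shows how `json.loads` treats `true` and `null`, and what a textual replacement of 'true' would do to a string value that happens to contain that word.

```python
import json

# Hypothetical sample, not taken from the experiment above
sample = '{"comment": "a true value", "required": true, "empty": null}'

# json.loads maps JSON true/false/null to Python True/False/None
parsed = json.loads(sample)
print(parsed)

# A plain text replacement also rewrites the word inside the string value
replaced = sample.replace('true', 'True')
print(replaced)
```

The second print shows that the comment text is altered too, which is why parsing the string as JSON is the safer route for metadata typed in by users.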


Let's explore 

In [26]:
for key, value in dic_meta.items():
    print('key:',key)
    print(value)

key: elabftw
{'extra_fields_groups': [{'id': 1, 'name': 'Sample'}, {'id': 2, 'name': 'Coordinate_System_Set'}, {'id': 3, 'name': 'Fields'}]}
key: extra_fields
{'Definition': {'type': 'text', 'value': 'NXem', 'group_id': 3, 'required': True}, 'Start_Time': {'type': 'date', 'value': '2024-10-10', 'group_id': 3, 'required': True}, 'Experiment_alias': {'type': 'text', 'value': 'em_toy_id1', 'group_id': 3, 'required': True}}


The value of the 'elabftw' key lists the groups into which we have organized the metadata. We have used only the group 'Fields', which holds the metadata that will be mapped onto fields of the HDF5 file.
The value of the 'extra_fields' key is a dictionary where:
- the keys are the field names
- the values are dictionaries with the keys 'type', 'value' (the value entered in the field: this is what we are interested in!), 'group_id' (the corresponding group; in our case only one group is used, 'Fields') and 'required'
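
The structure described above can be walked with a few lines of Python. This is only a sketch: `dic_meta_sample` is a copy of the dictionary printed above, and the names `group_names` and `rows` are introduced here for illustration.

```python
# Copy of the dictionary printed above
dic_meta_sample = {
    'elabftw': {'extra_fields_groups': [
        {'id': 1, 'name': 'Sample'},
        {'id': 2, 'name': 'Coordinate_System_Set'},
        {'id': 3, 'name': 'Fields'},
    ]},
    'extra_fields': {
        'Definition': {'type': 'text', 'value': 'NXem', 'group_id': 3, 'required': True},
        'Start_Time': {'type': 'date', 'value': '2024-10-10', 'group_id': 3, 'required': True},
        'Experiment_alias': {'type': 'text', 'value': 'em_toy_id1', 'group_id': 3, 'required': True},
    },
}

# Lookup table: group id -> group name
group_names = {g['id']: g['name']
               for g in dic_meta_sample['elabftw']['extra_fields_groups']}

# One (field name, value, group name) tuple per extra field
rows = [(name, props['value'], group_names[props['group_id']])
        for name, props in dic_meta_sample['extra_fields'].items()]

for row in rows:
    print(row)
```

Every row ends with 'Fields', confirming that all three metadata entries belong to the only group we have used.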

# Preparing the hdf5/NeXus file

First we initialize a file, then we create the main group corresponding to the NXentry

In [27]:
f=h5py.File(path+'NXem_toy.nxs','w')
f.attrs['default']='entry'

In [28]:
g_entry=f.create_group("entry")
g_entry.attrs["NX_class"]="NXentry"

In [29]:
g_entry["definition"]=str(dic_meta['extra_fields']['Definition']['value'])
g_entry["start_time"]=dic_meta['extra_fields']['Start_Time']['value']
f["/entry"].create_dataset("experiment_alias",data=dic_meta['extra_fields']['Experiment_alias']['value'])
#g_entry['experiment_alias']=dic_meta['extra_fields']['Experiment_alias']['value']

<HDF5 dataset "experiment_alias": shape (), type "|O">

In [30]:
def printname(name):
    print(name)
f.visit(printname)    

entry
entry/definition
entry/experiment_alias
entry/start_time


In [31]:
f.close()

In [32]:
test=nxload(path+'NXem_toy.nxs')
print(test.tree)

root:NXroot
  @default = 'entry'
  entry:NXentry
    definition = 'NXem'
    experiment_alias = 'em_toy_id1'
    start_time = '2024-10-10'
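
The content can also be checked with h5py alone. The sketch below is a self-contained round trip on a temporary file (the file name `readback_demo.nxs` is hypothetical): it writes a reduced version of the structure above and reads it back. It assumes that h5py is installed. Depending on the h5py version, a string dataset may be returned as bytes, so the sketch decodes it when needed.

```python
import os
import tempfile

import h5py

# Temporary location, so the file created above is not touched
tmp_path = os.path.join(tempfile.mkdtemp(), 'readback_demo.nxs')

# The "with" block closes the file automatically, even if an error occurs
with h5py.File(tmp_path, 'w') as f:
    f.attrs['default'] = 'entry'
    entry = f.create_group('entry')
    entry.attrs['NX_class'] = 'NXentry'
    entry['definition'] = 'NXem'

with h5py.File(tmp_path, 'r') as f:
    names = []
    f.visit(names.append)  # collect the name of every group and dataset
    nx_class = f['entry'].attrs['NX_class']
    definition = f['entry/definition'][()]  # read the scalar dataset
    if isinstance(definition, bytes):
        definition = definition.decode('utf-8')

print(names)
print(nx_class, definition)
```

Opening the file in a `with` block also removes the need for the explicit `f.close()` call.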


## Pre-Exercise 
Visualize the nxs file we have created with the online service https://myhdf5.hdfgroup.org/

## Exercise 1 
Using the str.lower() method, define the field names g_entry[field_name] as a function of the keys of the dictionary dic_meta['extra_fields'].


## Exercise 2 
Using the result of Exercise 1, try to define a function that takes dic_meta['extra_fields'] as input and gives as output the 3 fields g_entry[field_name].
