<img src='https://www.actris.eu/sites/default/files/inline-images/Actris%20logo.png' width=200 align=right>

# ACTRIS DC 
## Search with ACTRIS Metadata REST API

### Using ACTRIS metadata catalog REST API

Using the ACTRIS REST API, you can access all ACTRIS metadata: stations, instruments, networks, etc. It is possible to retrieve the full metadata archive at once, but this can take some time.

ACTRIS REST API documentation: https://prod-actris-md.nilu.no/index.html

For the latest version, which includes the metadata schema for model data, see the development server: https://dev-actris-md.nilu.no/index.html

The ACTRIS REST API uses the ACTRIS vocabulary for several of the search criteria. The vocabulary can be found here: https://vocabulary.actris.nilu.no/skosmos/actris_vocab/en/

NB! Both the production version and the development version of the ACTRIS REST API are currently under development. A fully stable version should be launched in April 2024 at the latest.

In [2]:
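# import packages

import pandas as pd
import requests
import json
import ipywidgets as widgets
from IPython.display import display  # explicit import so display() also works outside Jupyter's default namespace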

## Browse the metadata archive

This is an example of how to browse the ACTRIS REST API metadata catalog and get used to its search elements. Some of the most used metadata elements in the ACTRIS metadata catalog are displayed, with all their values, as dropdown widgets.

In [6]:
# Vocabulary categories

response = requests.get("https://dev-actris-md.nilu.no/Vocabulary/categories") # get all vocabulary categories in metadata archive
archive = response.json()
df = pd.DataFrame(archive)

# dropdown widget 
dropdown_categories = widgets.Dropdown(
    options=list(df['category']),
    value=list(df['category'])[0],
    description='Categories:',
    disabled=False,
)

display(dropdown_categories)

Dropdown(description='Categories:', options=('compliance', 'constrainttype', 'contentattribute', 'contenttype'…

In [9]:
# Vocabulary category values: choose one of the categories above and explore its values.

category = 'instrumenttype' # gives all instrument categories
#category = 'contentattribute' # gives all variable categories

response = requests.get("https://dev-actris-md.nilu.no/Vocabulary/{}".format(category))  # get all values of the chosen vocabulary category
archive = response.json()
df = pd.DataFrame(archive)

# dropdown widget 
dropdown_category = widgets.Dropdown(
    options=list(df['label']),
    value=list(df['label'])[0],
    description='{}:'.format(category),
    disabled=False,
    
)

display(dropdown_category)

JSONDecodeError: Expecting value: line 1 column 1 (char 0)
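
The `JSONDecodeError` above is what `response.json()` raises when the response body is not JSON, for example an empty body or an HTML error page. The following is a minimal sketch of a guard that turns this into a readable message. It assumes only that the response object exposes `status_code` and `text`, as `requests.Response` does; `decode_response` and `FakeResponse` are hypothetical names used for illustration, and the sample bodies are made up.

```python
import json


def decode_response(response):
    """Return the parsed JSON body, or raise a RuntimeError that explains the failure.

    `response` is assumed to behave like requests.Response
    (attributes `status_code` and `text`).
    """
    if not 200 <= response.status_code < 300:
        raise RuntimeError(f"Request failed with HTTP status {response.status_code}")
    try:
        return json.loads(response.text)
    except json.JSONDecodeError as err:
        preview = response.text[:60]
        raise RuntimeError(f"Response body is not JSON, starts with: {preview!r}") from err


class FakeResponse:
    """Hypothetical stand-in for requests.Response so the sketch runs offline."""

    def __init__(self, status_code, text):
        self.status_code = status_code
        self.text = text


# A body that is valid JSON is returned as Python objects.
ok = decode_response(FakeResponse(200, '[{"label": "example"}]'))
print(ok)

# A body that is not JSON produces a descriptive error instead of a bare JSONDecodeError.
try:
    decode_response(FakeResponse(200, "<html>Service unavailable</html>"))
except RuntimeError as err:
    message = str(err)
    print(message)
```

In the notebook, the same helper could wrap a real request, for instance `decode_response(requests.get(url))`, before the result is passed to `pd.DataFrame`.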

In [7]:
# Facilities

response = requests.get("https://dev-actris-md.nilu.no/Facilities") # get all facilities in metadata archive
archive = response.json()
df = pd.DataFrame(archive)

# dropdown widget 
dropdown_facilities = widgets.Dropdown(
    options=list(df['name']),
    value=list(df['name'])[0],
    description='Facilities:',
    disabled=False,
)

display(dropdown_facilities)

Dropdown(description='Facilities:', options=('Primorskaya', 'Hvasser', 'Cottered', 'Anholt', 'Ansbach', 'Malvi…

In [44]:
# Show facility chosen in dropdown menu
df[df['name']==dropdown_facilities.value]

| | num_id | identifier | name | lat | lon | alt | country_code | wmo_region | identifier_type | uri |
|---|---|---|---|---|---|---|---|---|---|---|
| 147 | 626 | 8d7e | Experimental Lakes Area | 49.663889 | -93.721111 | 369.0 | CA | North and Central America | other PID | https://dev-dc.actris.nilu.no/facility/8d7e |


In [45]:
# show all metadata for Norwegian facilities
facilities_norway = df[df['country_code']=='NO'] # select Norwegian facilities
facilities_norway # show the selection as a table

| | num_id | identifier | name | lat | lon | alt | country_code | wmo_region | identifier_type | uri |
|---|---|---|---|---|---|---|---|---|---|---|
| 3 | 933 | 07oj | Sandve | 59.2 | 5.2 | 15.0 | NO | Europe | other PID | https://dev-dc.actris.nilu.no/facility/07oj |
| 19 | 443 | 1dm1 | Ny-Ålesund - Gruvebadet | 78.918611 | 11.891667 | 40.0 | NO | Europe | other PID | https://dev-dc.actris.nilu.no/facility/1dm1 |
| 41 | 940 | 2mgh | Karpdalen | 69.656165 | 30.421796 | 70.0 | NO | Europe | other PID | https://dev-dc.actris.nilu.no/facility/2mgh |
| 71 | 420 | 4at2 | Hurdal | 60.372386 | 11.078142 | 300.0 | NO | Europe | other PID | https://dev-dc.actris.nilu.no/facility/4at2 |
| 75 | 937 | 4ke6 | Møsvatn | 59.833333 | 8.333333 | 940.0 | NO | Europe | other PID | https://dev-dc.actris.nilu.no/facility/4ke6 |
| 82 | 931 | 4xxq | Svanvik | 69.45 | 30.033333 | 30.0 | NO | Europe | other PID | https://dev-dc.actris.nilu.no/facility/4xxq |
| 168 | 417 | 9cxe | Birkenes II | 58.38853 | 8.252 | 219.0 | NO | Europe | other PID | https://dev-dc.actris.nilu.no/facility/9cxe |
| 200 | 923 | azm0 | Tustervatn | 65.833333 | 13.916667 | 439.0 | NO | Europe | other PID | https://dev-dc.actris.nilu.no/facility/azm0 |
| 208 | 477 | bc6t | Andøya | 69.278333 | 16.011666 | 380.0 | NO | Europe | other PID | https://dev-dc.actris.nilu.no/facility/bc6t |
| 220 | 935 | byxy | Øverbygd | 69.05 | 19.366667 | 90.0 | NO | Europe | other PID | https://dev-dc.actris.nilu.no/facility/byxy |
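
The same kind of selection works on any column of the facilities output, not only `country_code`. As a small sketch, the block below picks out facilities north of the Arctic Circle from three rows copied from the output above. It uses plain dictionaries so it runs without the API; the function name `facilities_north_of` and the constant `ARCTIC_CIRCLE_LAT` are hypothetical and chosen for illustration.

```python
# Three rows copied from the facilities output above, reduced to the fields used here.
facilities = [
    {"identifier": "1dm1", "name": "Ny-Ålesund - Gruvebadet", "lat": 78.918611, "country_code": "NO"},
    {"identifier": "4at2", "name": "Hurdal", "lat": 60.372386, "country_code": "NO"},
    {"identifier": "bc6t", "name": "Andøya", "lat": 69.278333, "country_code": "NO"},
]

ARCTIC_CIRCLE_LAT = 66.56  # approximate latitude of the Arctic Circle, degrees north


def facilities_north_of(rows, min_lat):
    """Return the names of facilities with latitude above min_lat, ordered south to north."""
    selected = [row for row in rows if row["lat"] > min_lat]
    return [row["name"] for row in sorted(selected, key=lambda row: row["lat"])]


arctic_names = facilities_north_of(facilities, ARCTIC_CIRCLE_LAT)
print(arctic_names)
```

With the facilities DataFrame from the cells above, the equivalent pandas selection would be `df[df['lat'] > 66.56]`.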


In [8]:
response = requests.get("https://dev-actris-md.nilu.no/Providers") # get all providers in metadata archive
archive = response.json()
df = pd.DataFrame(archive)

# dropdown widget
dropdown_providers = widgets.Dropdown(
    options=list(df['name']),
    value=list(df['name'])[0],
    description='Providers:',
    disabled=False,
)

display(dropdown_providers)

Dropdown(description='Providers:', options=('Norwegian Institute for Air Research', 'admin', 'Barcelona Superc…

In [88]:
df[df['name']==dropdown_providers.value]

| | id | name | acronym | created | description |
|---|---|---|---|---|---|
| 8 | 14 | IN-SITU | IN-SITU | 2020-06-29T07:20:45.2311600Z | ACTRIS In situ data centre unit (In-Situ) |


## Accessing metadata

The full ACTRIS metadata catalog can be accessed at https://prod-actris-md.nilu.no/Metadata/, but retrieving it can take some time. It is therefore best to search using the available search elements, such as instrument, country, station, provider, etc.
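
As a sketch of narrowing a request to one station, the block below looks up a facility identifier by name and builds a facility-scoped metadata URL. The URL pattern `/Metadata/facility/<identifier>` is the one used for Birkenes II in this notebook; whether other search elements follow the same pattern is not assumed here. The helper name `facility_metadata_url` is hypothetical, and the two rows are copied from the facilities output earlier in this notebook.

```python
BASE_URL = "https://dev-actris-md.nilu.no"  # development server used in this notebook

# Two rows copied from the facilities output earlier in this notebook.
facilities = [
    {"identifier": "9cxe", "name": "Birkenes II"},
    {"identifier": "4at2", "name": "Hurdal"},
]


def facility_metadata_url(rows, name, base_url=BASE_URL):
    """Look up a facility by its exact name and return the URL of its metadata records."""
    for row in rows:
        if row["name"] == name:
            return f"{base_url}/Metadata/facility/{row['identifier']}"
    raise KeyError(f"No facility named {name!r}")


url = facility_metadata_url(facilities, "Birkenes II")
print(url)
```

The resulting URL can be passed to `requests.get` in the same way as the other requests in this notebook.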


In [9]:
# get all metadata in catalogue 
response = requests.get("https://dev-actris-md.nilu.no/Metadata/") 
metadata_archive = response.json() 


In [10]:
metadata_archive # show metadata

[{'md_metadata': {'id': 126733,
   'provider': {'name': 'GRES', 'atom': 'http://localhost:5009/providers/13'},
   'file_identifier': 'groundbased_ftir.c2h6_awi019_paramaribo_20040926t164033z_20041123t121322z_004.hdf',
   'language': 'en',
   'hierarchy_level': 'dataset',
   'online_resource': {'linkage': 'https://gres-aeris.ipsl.fr/'},
   'datestamp': '2024-03-04T19:33:19.0000000Z',
   'created': '2024-03-04T18:46:12.0000000Z',
   'contact': [{'first_name': 'Cathy',
     'last_name': 'Boonne',
     'organisation_name': 'Institut Pierre Simon Laplace (IPSL-CNRS)',
     'role_code': ['custodian'],
     'country_code': 'FR',
     'delivery_point': 'Boite 101',
     'address_city': '4 place Jussieu',
     'administrative_area': 'Paris',
     'postal_code': 75252,
     'email': 'gres-contact@aeris-data.fr',
     'position_name': 'engineer'}]},
  'md_identification': {'abstract': 'Ethane partial and total column from FTIR at Paramaribo - 26 September 2004',
   'title': 'Ethane partial and to

In [11]:
# get all metadata from station Birkenes II (9cxe) in catalogue 
response = requests.get("https://dev-actris-md.nilu.no/Metadata/facility/9cxe") 
metadata_archive = response.json() 
metadata_archive # show metadata

In [None]:
# Each metadata element consists of a dictionary with keys shown in the dropdown menu

dropdown_md_elements = widgets.Dropdown(
    options=list(metadata_archive[0].keys()),
    value=list(metadata_archive[0].keys())[0],
    description='',
    disabled=False,
)

dropdown_md_elements

Dropdown(options=('md_metadata', 'md_identification', 'md_constraints', 'md_keywords', 'md_data_identification…

In [None]:
metadata_archive[0][dropdown_md_elements.value]

{'attribute_descriptions': ['acetaldehyde amount fraction'],
 'content_type': 'physicalMeasurement'}

In [72]:
# Most of these keys hold a nested dictionary with metadata information.
# An example is md_metadata
md_list = []
for f in metadata_archive:
    md_list.append(f['md_metadata']) 
df_md_metadata = pd.DataFrame.from_records(md_list)

df_md_metadata.iloc[0] #only show first element in list of metadata

id                                                                11
provider           {'name': 'IN-SITU', 'atom': 'http://localhost:...
file_identifier                                         UMFT-FGZ3.nc
language                                                          en
hierarchy_level                                              dataset
online_resource                  {'linkage': 'http://ebas.nilu.no/'}
datestamp                               2023-12-07T23:00:00.0000000Z
created                                 2023-12-08T20:30:03.0000000Z
contact            [{'first_name': 'Markus', 'last_name': 'Fiebig...
Name: 0, dtype: object
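
In the output above, `provider` and `online_resource` are still dictionaries inside a single column. A minimal sketch of flattening such nested dictionaries into dot-separated keys is shown below, so that each value gets its own column when the result is passed to `pd.DataFrame.from_records`. The function name `flatten` is hypothetical, and the sample record is a shortened copy of the output above.

```python
def flatten(record, prefix=""):
    """Flatten nested dictionaries into one level with dot-separated keys.

    Lists are left untouched because they can hold several entries
    (for example several contacts).
    """
    flat = {}
    for key, value in record.items():
        full_key = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into the nested dictionary, extending the key prefix.
            flat.update(flatten(value, prefix=f"{full_key}."))
        else:
            flat[full_key] = value
    return flat


# Shortened record based on the output above. The provider 'atom' value is
# truncated in that output and is therefore left out here.
md_metadata = {
    "id": 11,
    "provider": {"name": "IN-SITU"},
    "file_identifier": "UMFT-FGZ3.nc",
    "online_resource": {"linkage": "http://ebas.nilu.no/"},
}

flat_record = flatten(md_metadata)
print(flat_record)
```

pandas also provides `pd.json_normalize`, which flattens nested dictionaries in a similar way and may be the more convenient choice inside the notebook.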

In [82]:
# The 'contact' column above holds more information about the contact person(s) for each dataset.

df_md_metadata.iloc[0]['contact'] # show contact information for first dataset

[{'first_name': 'Markus',
  'last_name': 'Fiebig',
  'organisation_name': 'NILU',
  'role_code': ['custodian'],
  'country_code': 'NO',
  'delivery_point': 'Instituttveien 18',
  'address_city': 'Kjeller',
  'administrative_area': 'Viken',
  'postal_code': 2007,
  'email': 'ebas@nilu.no',
  'position_name': 'Senior Scientist'}]
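
Because `contact` is a list of dictionaries, collecting one field across all datasets needs a nested loop. The sketch below gathers the email addresses of all contacts holding a given role. The helper name `contact_emails` is hypothetical, and the two records are shortened copies of contact values shown in the outputs of this notebook.

```python
def contact_emails(md_metadata_records, role="custodian"):
    """Return the sorted, de-duplicated email addresses of contacts holding the given role."""
    emails = set()
    for record in md_metadata_records:
        for contact in record.get("contact", []):
            if role in contact.get("role_code", []) and contact.get("email"):
                emails.add(contact["email"])
    return sorted(emails)


# Two shortened md_metadata records using contact values shown in this notebook's outputs.
records = [
    {"id": 11,
     "contact": [{"last_name": "Fiebig", "role_code": ["custodian"], "email": "ebas@nilu.no"}]},
    {"id": 126733,
     "contact": [{"last_name": "Boonne", "role_code": ["custodian"], "email": "gres-contact@aeris-data.fr"}]},
]

custodian_emails = contact_emails(records)
print(custodian_emails)
```

In the notebook, the same function could be applied to `md_list`, the list of `md_metadata` dictionaries built above.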

In [85]:
# Another example of extracting metadata, here the content information.
content_list = []
for f in metadata_archive:
    content_list.append(f['md_content_information'])

df_content_information = pd.DataFrame.from_records(content_list)
# Displays the content information for all datasets from Birkenes II
df_content_information


| | attribute_descriptions | content_type |
|---|---|---|
| 0 | [acetaldehyde amount fraction] | physicalMeasurement |
| 1 | [acetaldehyde amount fraction] | physicalMeasurement |
| 2 | [aerosol particle light absorption coefficient] | physicalMeasurement |
| 3 | [aerosol particle light absorption coefficient] | physicalMeasurement |
| 4 | [aerosol particle light absorption coefficient] | physicalMeasurement |
| 5 | [aerosol particle light absorption coefficient] | physicalMeasurement |
| 6 | [aerosol particle light absorption coefficient] | physicalMeasurement |
| 7 | [aerosol particle light absorption coefficient] | physicalMeasurement |
| 8 | [aerosol particle light absorption coefficient] | physicalMeasurement |
| 9 | [aerosol particle light absorption coefficient] | physicalMeasurement |
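
Many datasets share the same variable, so a count per variable gives a quicker overview than the full table. The sketch below counts how often each entry of `attribute_descriptions` occurs, using sample records that mirror the ten rows shown above. The function name `count_variables` is hypothetical.

```python
from collections import Counter


def count_variables(content_records):
    """Count how many datasets list each variable in 'attribute_descriptions'."""
    counts = Counter()
    for record in content_records:
        # Each record may list several variables, so update with the whole list.
        counts.update(record.get("attribute_descriptions", []))
    return counts


# Sample records mirroring the ten rows shown above.
content_records = (
    [{"attribute_descriptions": ["acetaldehyde amount fraction"],
      "content_type": "physicalMeasurement"}] * 2
    + [{"attribute_descriptions": ["aerosol particle light absorption coefficient"],
        "content_type": "physicalMeasurement"}] * 8
)

variable_counts = count_variables(content_records)
for variable, n in variable_counts.most_common():
    print(n, variable)
```

In the notebook, the function could be called on the list of `md_content_information` dictionaries collected from `metadata_archive`.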


In [87]:
# Another example of extracting metadata, here the distribution information.
# The distribution information includes data format, dataset url, protocol, restrictions and more.

distribution_list = []
for f in metadata_archive:
    # keep only the first distribution entry of each dataset
    distribution_list.append(f['md_distribution_information'][0])

df_distribution_information = pd.DataFrame.from_records(distribution_list)
df_distribution_information.iloc[0]  # show the distribution information for the first dataset
# To see the distribution information for all Birkenes II datasets, remove .iloc[0]

data_format                                                       NETCDF
version_data_format                                                    4
dataset_url            https://dev-thredds.nilu.no/thredds/fileServer...
protocol                                                            HTTP
function                                                        download
restriction                                               {'set': False}
transfersize                                                    203290.0
Name: 0, dtype: object
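
The distribution information is what is needed to fetch the data files themselves. The sketch below selects the URLs of datasets that are offered for download and are not restricted. It assumes that `restriction['set']` being `True` marks a restricted dataset, which the output above suggests but does not confirm. The function name `downloadable_urls` is hypothetical, and the URLs in the sample records are placeholders, since the real `dataset_url` is truncated in the output above.

```python
def downloadable_urls(distribution_records):
    """Return dataset URLs that are offered for download and carry no restriction.

    Assumption: restriction['set'] == True means access to the dataset is restricted.
    """
    urls = []
    for record in distribution_records:
        restricted = record.get("restriction", {}).get("set", False)
        if record.get("function") == "download" and not restricted:
            urls.append(record["dataset_url"])
    return urls


# Hypothetical records with the same keys as the output above; the URLs are placeholders.
distribution_records = [
    {"dataset_url": "https://example.org/open.nc", "protocol": "HTTP",
     "function": "download", "restriction": {"set": False}, "transfersize": 203290.0},
    {"dataset_url": "https://example.org/restricted.nc", "protocol": "HTTP",
     "function": "download", "restriction": {"set": True}, "transfersize": 1000.0},
]

open_urls = downloadable_urls(distribution_records)
print(open_urls)
```

In the notebook, the function could be called on the list of first distribution entries collected from `metadata_archive`.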