<img src='https://www.actris.eu/sites/default/files/inline-images/Actris%20logo.png' width=200 align=right>

# ACTRIS DC 
## Search with the ACTRIS Metadata REST API

### Using the ACTRIS metadata catalog REST API

Using the ACTRIS REST API, you can access all ACTRIS metadata: stations, instruments, networks, etc. It is possible to retrieve the full metadata archive in one request, but this can take some time.

ACTRIS REST API documentation: https://prod-actris-md.nilu.no/index.html

For the latest version, including the metadata schema for model data, see the development instance: https://dev-actris-md.nilu.no/index.html

The ACTRIS REST API uses the ACTRIS vocabulary for several of the search criteria. The vocabulary can be found here: https://vocabulary.actris.nilu.no/skosmos/actris_vocab/en/

NB! The ACTRIS REST API is currently under development, in both the production and the development version. A fully stable version should be launched in April 2024 at the latest.

In [1]:
# import packages

import pandas as pd
import requests
import json
import ipywidgets as widgets

## Browse the metadata archive

This is an example of how to browse the ACTRIS REST API metadata catalog and get familiar with each search element. Some of the most used metadata elements in the ACTRIS metadata catalog are displayed, with all their values, as dropdown widgets.

In [10]:
# Vocabulary categories

response = requests.get("https://prod-actris-md.nilu.no/Vocabulary/categories") # get all vocabulary categories in metadata archive
archive = response.json()
df = pd.DataFrame(archive)

# dropdown widget 
dropdown_categories = widgets.Dropdown(
    options=list(df['category']),
    value=list(df['category'])[0],
    description='Categories:',
    disabled=False,
)

display(dropdown_categories)

Dropdown(description='Categories:', options=('compliance', 'constrainttype', 'contentattribute', 'contenttype'…

In [11]:
# Vocabulary category values: choose one of the categories above and explore its values.

category = 'instrumenttype' # gives all instrument types
#category = 'contentattribute' # gives all variable categories

response = requests.get("https://prod-actris-md.nilu.no/Vocabulary/{}".format(category))  # get all values for the chosen vocabulary category
archive = response.json()
df = pd.DataFrame(archive)

# dropdown widget 
dropdown_category = widgets.Dropdown(
    options=list(df['label']),
    value=list(df['label'])[0],
    description='{}:'.format(category),
    disabled=False,
    
)

display(dropdown_category)

Dropdown(description='instrumenttype:', options=('absorption filter sampler', 'absorption solution sampler', '…

In [12]:
# Facilities

response = requests.get("https://prod-actris-md.nilu.no/Facilities") # get all facilities in metadata archive
archive = response.json()
df = pd.DataFrame(archive)

# dropdown widget 
dropdown_facilities = widgets.Dropdown(
    options=list(df['name']),
    value=list(df['name'])[0],
    description='Facilities:',
    disabled=False,
)

display(dropdown_facilities)

Dropdown(description='Facilities:', options=('Primorskaya', 'Hvasser', 'Cottered', 'Anholt', 'Ansbach', 'Malvi…

In [13]:
# Show facility chosen in dropdown menu
df[df['name']==dropdown_facilities.value]

Unnamed: 0,num_id,identifier,name,lat,lon,alt,country_code,identifier_type,uri,wmo_region,active,contact_organisation,facility_type,actris_national_facility,actris_nf_uri
0,2489,00LJ,Primorskaya,43.629167,132.236944,85.0,RU,other PID,https://prod-actris-md.nilu.no/facilities/00LJ,,,,,,


In [14]:
# show all metadata for Norwegian facilities
facilities_norway = df[df['country_code']=='NO'] # select Norwegian facilities
facilities_norway # show archive as table

Unnamed: 0,num_id,identifier,name,lat,lon,alt,country_code,identifier_type,uri,wmo_region,active,contact_organisation,facility_type,actris_national_facility,actris_nf_uri
1,2490,03MW,Hvasser,59.066667,10.433333,35.0,NO,other PID,https://prod-actris-md.nilu.no/facilities/03MW,,,,,,
5,2494,06HE,Malvik (moss),63.378300,10.605783,150.0,NO,other PID,https://prod-actris-md.nilu.no/facilities/06HE,,,,,,
7,313,07oj,Sandve,59.200000,5.200000,15.0,NO,other PID,https://prod-actris-md.nilu.no/facilities/07oj,Europe,,,,,
10,2499,09GZ,Nordre Osen (moss),61.321567,11.798717,470.0,NO,other PID,https://prod-actris-md.nilu.no/facilities/09GZ,,,,,,
12,2501,0AYU,Åsane (moss),60.493183,5.387733,80.0,NO,other PID,https://prod-actris-md.nilu.no/facilities/0AYU,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1355,3839,ZOS0,Vårli,63.457910,8.604890,40.0,NO,other PID,https://prod-actris-md.nilu.no/facilities/ZOS0,,,,,,
1356,3840,ZPYU,Eikefjord (moss),61.616817,5.626817,80.0,NO,other PID,https://prod-actris-md.nilu.no/facilities/ZPYU,,,,,,
1357,3841,ZQNJ,Ringebu (moss),61.615333,10.070200,600.0,NO,other PID,https://prod-actris-md.nilu.no/facilities/ZQNJ,,,,,,
1358,3842,ZR67,Hvittingfoss (moss),59.509767,9.910867,70.0,NO,other PID,https://prod-actris-md.nilu.no/facilities/ZR67,,,,,,


In [15]:
response = requests.get("https://prod-actris-md.nilu.no/Providers") # get all providers in metadata archive
archive = response.json()
df = pd.DataFrame(archive)

# dropdown widget
dropdown_providers = widgets.Dropdown(
    options=list(df['name']),
    value=list(df['name'])[0],
    description='Providers:',
    disabled=False,
)

display(dropdown_providers)

Dropdown(description='Providers:', options=('admin', 'DVAS', 'Norwegian Institute for Air Research', 'GRES', '…

In [16]:
df[df['name']==dropdown_providers.value]

Unnamed: 0,id,name,acronym,description,created
0,8,admin,admin,admin,2020-06-04T11:05:00.8037550Z


## Accessing metadata

The full ACTRIS metadata catalog can be accessed at https://prod-actris-md.nilu.no/Metadata/, but this can take some time. It is therefore best to search using the available search elements, such as instrument, country, station, or provider.
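
As a minimal sketch of searching by one element instead of downloading everything, the helper below builds the facility-scoped URL that is used later in this notebook (`/Metadata/facility/<identifier>`). The function name `facility_metadata_url` and the constant `BASE_URL` are illustrative names introduced here, not part of the ACTRIS API.

```python
from urllib.parse import quote

# Production API root used throughout this notebook
BASE_URL = "https://prod-actris-md.nilu.no"


def facility_metadata_url(identifier, base_url=BASE_URL):
    """Return the metadata search URL for one facility identifier.

    Hypothetical helper: it only assembles the URL pattern seen in this notebook.
    """
    # Percent-encode the identifier so unexpected characters cannot break the path.
    return "{}/Metadata/facility/{}".format(base_url.rstrip("/"), quote(identifier, safe=""))


# Birkenes II has the identifier '9cxe'
print(facility_metadata_url("9cxe"))
```

The resulting URL can be passed to `requests.get(url, timeout=60)`; calling `response.raise_for_status()` before `response.json()` makes a failed request visible instead of silently parsing an error page. The 60 second timeout is an arbitrary choice.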


In [9]:
# get all metadata in catalogue 
response = requests.get("https://prod-actris-md.nilu.no/Metadata/") 
metadata_archive = response.json() 

In [17]:
metadata_archive # show metadata

[{'md_metadata': {'id': 150540,
   'provider': {'name': 'ARES', 'atom': 'http://localhost:5009/providers/10'},
   'file_identifier': 'EARLINET_AerRemSen_dus_Lev01_b1064_202311130600_202311130700_v01_qc03.nc',
   'language': 'en',
   'hierarchy_level': 'dataset',
   'online_resource': {'linkage': 'https://data.earlinet.org/'},
   'datestamp': '2023-11-13T14:06:00.0000000Z',
   'created': '2024-06-11T08:13:24.0000000Z',
   'contact': [{'first_name': 'Lucia',
     'last_name': 'Mona',
     'organisation_name': 'CNR-IMAA',
     'role_code': ['custodian',
      'distributor',
      'pointOfContact',
      'processor',
      'publisher',
      'resourceProvider'],
     'country_code': 'IT',
     'delivery_point': 'Contrada S.Loja, Zona Industriale - Tito Scalo I-85050 Potenza',
     'address_city': 'Potenza',
     'email': 'lucia.mona@cnr.it',
     'position_name': 'Senior Researcher'}]},
  'md_identification': {'abstract': 'Profiles of aerosol optical properties',
   'title': 'Aerosol parti

In [18]:
# get all metadata from station Birkenes II (9cxe) in catalogue 
response = requests.get("https://prod-actris-md.nilu.no/Metadata/facility/9cxe") 
metadata_archive = response.json() 
metadata_archive # show metadata

[{'md_metadata': {'id': 203849,
   'provider': {'name': 'IN-SITU',
    'atom': 'http://localhost:5009/providers/14'},
   'file_identifier': 'P3HD-KWCT.nc',
   'language': 'en',
   'hierarchy_level': 'dataset',
   'online_resource': {'linkage': 'http://ebas.nilu.no/'},
   'datestamp': '2024-06-13T22:00:00.0000000Z',
   'created': '2024-06-14T08:17:20.0000000Z',
   'contact': [{'first_name': 'Markus',
     'last_name': 'Fiebig',
     'organisation_name': 'NILU',
     'role_code': ['custodian'],
     'country_code': 'NO',
     'delivery_point': 'Instituttveien 18',
     'address_city': 'Kjeller',
     'administrative_area': 'Viken',
     'postal_code': 2007,
     'email': 'ebas@nilu.no',
     'position_name': 'Senior Scientist'}]},
  'md_identification': {'abstract': 'PM mass at Birkenes II. These measurements are gathered as a part of the following projects NILU, GAW-WDCA, ACTRIS, EMEP_preliminary',
   'title': 'PM mass at Birkenes II',
   'date_type': 'creation',
   'contact': [{'first_

In [19]:
# Each metadata element consists of a dictionary with keys shown in the dropdown menu

dropdown_md_elements = widgets.Dropdown(
    options=list(metadata_archive[0].keys()),
    value=list(metadata_archive[0].keys())[0],
    description='',
    disabled=False,
)

dropdown_md_elements

Dropdown(options=('md_metadata', 'md_identification', 'md_constraints', 'md_keywords', 'md_data_identification…

In [20]:
metadata_archive[0][dropdown_md_elements.value]

{'id': 203849,
 'provider': {'name': 'IN-SITU', 'atom': 'http://localhost:5009/providers/14'},
 'file_identifier': 'P3HD-KWCT.nc',
 'language': 'en',
 'hierarchy_level': 'dataset',
 'online_resource': {'linkage': 'http://ebas.nilu.no/'},
 'datestamp': '2024-06-13T22:00:00.0000000Z',
 'created': '2024-06-14T08:17:20.0000000Z',
 'contact': [{'first_name': 'Markus',
   'last_name': 'Fiebig',
   'organisation_name': 'NILU',
   'role_code': ['custodian'],
   'country_code': 'NO',
   'delivery_point': 'Instituttveien 18',
   'address_city': 'Kjeller',
   'administrative_area': 'Viken',
   'postal_code': 2007,
   'email': 'ebas@nilu.no',
   'position_name': 'Senior Scientist'}]}
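
Because each record is a dictionary of nested dictionaries, reaching a single value takes several chained lookups, and a missing key raises a `KeyError`. The sketch below shows one possible way to read a value by a dotted path with a fallback. `get_nested` is a hypothetical helper, and `sample` is an abbreviated copy of the record shown above.

```python
def get_nested(record, path, default=None):
    """Follow a dotted path such as 'md_metadata.provider.name' through nested dicts."""
    current = record
    for key in path.split("."):
        # Stop as soon as the path leaves dictionary territory or a key is missing.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Abbreviated copy of the first Birkenes II record
sample = {
    "md_metadata": {
        "id": 203849,
        "provider": {"name": "IN-SITU"},
        "file_identifier": "P3HD-KWCT.nc",
    }
}

print(get_nested(sample, "md_metadata.provider.name"))
print(get_nested(sample, "md_metadata.language", default="unknown"))
```

With real data, the same call works on any element of `metadata_archive`, for example `get_nested(metadata_archive[0], "md_metadata.online_resource.linkage")`.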

In [21]:
# Most of these keys hold a nested dictionary with metadata information.
# An example is md_metadata
md_list = []
for f in metadata_archive:
    md_list.append(f['md_metadata']) 
df_md_metadata = pd.DataFrame.from_records(md_list)

df_md_metadata.iloc[0] #only show first element in list of metadata

id                                                            203849
provider           {'name': 'IN-SITU', 'atom': 'http://localhost:...
file_identifier                                         P3HD-KWCT.nc
language                                                          en
hierarchy_level                                              dataset
online_resource                  {'linkage': 'http://ebas.nilu.no/'}
datestamp                               2024-06-13T22:00:00.0000000Z
created                                 2024-06-14T08:17:20.0000000Z
contact            [{'first_name': 'Markus', 'last_name': 'Fiebig...
Name: 0, dtype: object

In [22]:
# The 'contact' column above holds more information about the contact person(s) for each dataset.

df_md_metadata.iloc[0]['contact'] # show contact information for first dataset

[{'first_name': 'Markus',
  'last_name': 'Fiebig',
  'organisation_name': 'NILU',
  'role_code': ['custodian'],
  'country_code': 'NO',
  'delivery_point': 'Instituttveien 18',
  'address_city': 'Kjeller',
  'administrative_area': 'Viken',
  'postal_code': 2007,
  'email': 'ebas@nilu.no',
  'position_name': 'Senior Scientist'}]
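
Since `contact` is a list of dictionaries, one dataset can have several contacts, and the nested structure is awkward to read in a table. The following sketch flattens the contacts into one row per dataset and contact. The function `contact_rows` and its output column names are assumptions made for illustration; the sample record is an abbreviated copy of the output above.

```python
def contact_rows(records):
    """Return one flat dictionary per (dataset, contact) pair."""
    rows = []
    for record in records:
        md = record.get("md_metadata") or {}
        # 'contact' is a list, so a dataset may contribute several rows.
        for contact in md.get("contact") or []:
            full_name = "{} {}".format(
                contact.get("first_name", ""), contact.get("last_name", "")
            ).strip()
            rows.append({
                "file_identifier": md.get("file_identifier"),
                "name": full_name,
                "organisation": contact.get("organisation_name"),
                "email": contact.get("email"),
                "roles": ", ".join(contact.get("role_code") or []),
            })
    return rows


# Abbreviated copy of the first Birkenes II record
sample_records = [{
    "md_metadata": {
        "file_identifier": "P3HD-KWCT.nc",
        "contact": [{
            "first_name": "Markus",
            "last_name": "Fiebig",
            "organisation_name": "NILU",
            "role_code": ["custodian"],
            "email": "ebas@nilu.no",
        }],
    }
}]

for row in contact_rows(sample_records):
    print(row)
```

The returned list can be handed directly to `pd.DataFrame(...)`, for example `pd.DataFrame(contact_rows(metadata_archive))`, to get a table of contacts for all datasets.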

In [23]:
# Another example of extracting metadata, here the content information.
files_list = []
for f in metadata_archive:
    content_info = f['md_content_information'] # dictionary with variable names and content type
    files_list.append(content_info)

df_content_information = pd.DataFrame.from_records(files_list)
# Displays the content information for all datasets from Birkenes II 
df_content_information 


Unnamed: 0,attribute_descriptions,content_type
0,[aerosol particle mass concentration],physicalMeasurement
1,[aerosol particle light absorption coefficient],physicalMeasurement
2,[aerosol particle light absorption coefficient],physicalMeasurement
3,[aerosol particle light absorption coefficient],physicalMeasurement
4,[aerosol particle light absorption coefficient],physicalMeasurement
...,...,...
105,[aerosol particle elemental carbon mass concen...,physicalMeasurement
106,[ozone mass concentration],physicalMeasurement
107,[hydrogen amount fraction],physicalMeasurement
108,[aerosol particle aluminium mass concentration...,physicalMeasurement


In [24]:
# Another example of extracting metadata, here the distribution information.
# The distribution information includes data format, dataset url, protocol, restrictions and more.

files_list = []
for f in metadata_archive:
    distribution = f['md_distribution_information'][0] # first distribution entry of the dataset
    files_list.append(distribution)

df_distribution_information = pd.DataFrame.from_records(files_list)
df_distribution_information.iloc[0] #show the distribution information for the first dataset. 
# If you wish to see distribution information about all Birkenes II datasets, remove .iloc[0]

data_format                                                       NETCDF
version_data_format                                                    4
dataset_url            https://thredds.nilu.no/thredds/dodsC/ebas_doi...
protocol                                                         OPeNDAP
function                                                       streaming
restriction                                               {'set': False}
transfersize                                                   2143439.0
Name: 0, dtype: object
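
The cell above reads only the first entry of `md_distribution_information`, which is a list. A possible extension, sketched below, walks every entry and collects the `dataset_url` values, optionally restricted to one protocol. `dataset_urls` is a hypothetical helper; the URLs under `example.org` and the `HTTP` protocol value are made-up placeholders, while `OPeNDAP` is the value seen in the output above.

```python
def dataset_urls(records, protocol=None):
    """Collect dataset URLs from all distribution entries, optionally for one protocol."""
    urls = []
    for record in records:
        for dist in record.get("md_distribution_information") or []:
            # Keep the entry when no filter is set or the protocol matches exactly.
            if protocol is None or dist.get("protocol") == protocol:
                url = dist.get("dataset_url")
                if url:
                    urls.append(url)
    return urls


# Placeholder records: the URLs and the 'HTTP' protocol value are invented for this example
sample_records = [
    {"md_distribution_information": [
        {"protocol": "OPeNDAP", "dataset_url": "https://example.org/dodsC/a.nc"},
        {"protocol": "HTTP", "dataset_url": "https://example.org/files/a.nc"},
    ]},
    {"md_distribution_information": [
        {"protocol": "OPeNDAP", "dataset_url": "https://example.org/dodsC/b.nc"},
    ]},
]

print(dataset_urls(sample_records, protocol="OPeNDAP"))
```

With the real archive, `dataset_urls(metadata_archive, protocol="OPeNDAP")` would return the streaming URLs for all Birkenes II datasets, assuming every entry follows the structure shown above.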