<img src='https://www.actris.eu/sites/default/files/inline-images/Actris%20logo.png' width=200 align=right>

# ACTRIS DC 
## Search with ACTRIS Metadata REST API

The goal of this notebook is to provide a guide on how to access data through the ACTRIS Metadata REST API. This is a machine-to-machine approach to accessing data, suited to cases where you plan to access large amounts of data or prefer to work through a programming interface.

Let's get started!

### Using ACTRIS metadata catalog REST API

ACTRIS metadata catalog REST API: https://prod-actris-md2.nilu.no/index.html

The ACTRIS REST API uses the ACTRIS vocabulary for several of the search criteria. The vocabulary can be found here: https://vocabulary.actris.nilu.no/skosmos/actris_vocab/en/

### Import libraries

In [1]:
# Library for working with tabular data (DataFrames)
import pandas as pd

# Libraries for working with JSON files, making HTTP requests, and handling file system operations
import json
import requests
import os

# Library for creating Python widgets
import ipywidgets as widgets

# Library for creating interactive plots
import plotly.express as px

## Accessing metadata

The full ACTRIS metadata catalog can be accessed at https://prod-actris-md2.nilu.no/metadata/. The API is paginated, so to retrieve every record that matches your search you have to request successive pages until you reach an empty one.
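The paging described above can be sketched as a loop that requests `/metadata/page/0`, `/metadata/page/1`, and so on, stopping at the first empty page. This is a minimal sketch rather than an official client: `fetch_page` and `fetch_all_pages` are hypothetical helper names, and the code assumes that an empty page comes back as an empty JSON list. The `max_pages` argument is an added safety limit, since the full catalogue may be large.

```python
BASE_URL = "https://prod-actris-md2.nilu.no/metadata"


def fetch_page(page, base_url=BASE_URL, timeout=30):
    """Fetch one page of the catalogue and return the decoded JSON list."""
    # Imported here so the paging logic below can be tried without network access.
    import requests

    response = requests.get(f"{base_url}/page/{page}", timeout=timeout)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()


def fetch_all_pages(get_page=fetch_page, max_pages=None):
    """Collect records page by page until an empty page is returned.

    Assumption: an exhausted catalogue yields an empty list.
    """
    records = []
    page = 0
    while max_pages is None or page < max_pages:
        batch = get_page(page)
        if not batch:  # empty page: nothing more to fetch
            break
        records.extend(batch)
        page += 1
    return records
```

For example, `fetch_all_pages(max_pages=3)` would retrieve at most the first three pages.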


In [19]:
# Get the first page (page 0) of the metadata catalogue
response = requests.get("https://prod-actris-md2.nilu.no/metadata/page/0")
response.raise_for_status()  # stop early on an HTTP error
metadata_archive = response.json()

In [20]:
metadata_archive[0]  # show the metadata record of the first dataset

{'id': '674794b5a34b92517e7f7af3',
 'md_metadata': {'id': 16519,
  'provider': {'name': 'CLU', 'atom': 'http://localhost:5009/providers/11'},
  'file_identifier': '20230205_galati_lwc-scaled-adiabatic.nc',
  'language': 'en',
  'hierarchy_level': 'dataset',
  'online_resource': {'linkage': 'https://cloudnet.fmi.fi/'},
  'datestamp': '2024-05-31T07:53:59.647000Z',
  'created': '2024-05-31T07:54:00Z',
  'contact': [{'first_name': 'Ewan',
    'last_name': "O'Connor",
    'organisation_name': 'Finnish Meteorological Institute (FMI)',
    'role_code': ['pointOfContact'],
    'country_code': 'FI'}]},
 'md_identification': {'abstract': 'Liquid water content data derived from cloud remote sensing measurements at Galați',
  'title': 'Liquid water content data derived from cloud remote sensing measurements at Galați',
  'date_type': 'creation',
  'contact': [{'first_name': 'Simo',
    'last_name': 'Tukiainen',
    'organisation_name': 'Finnish Meteorological Institute (FMI)',
    'role_code': ['

In [21]:
# Most of the top-level keys hold a nested dictionary of metadata.
# One example is md_metadata.
md_list = []
for f in metadata_archive:
    md_list.append(f['md_metadata'])
df_md_metadata = pd.DataFrame.from_records(md_list)

df_md_metadata.iloc[0]  # only show the first record in the list of metadata

id                                                             16519
provider           {'name': 'CLU', 'atom': 'http://localhost:5009...
file_identifier              20230205_galati_lwc-scaled-adiabatic.nc
language                                                          en
hierarchy_level                                              dataset
online_resource              {'linkage': 'https://cloudnet.fmi.fi/'}
datestamp                                2024-05-31T07:53:59.647000Z
created                                         2024-05-31T07:54:00Z
contact            [{'first_name': 'Ewan', 'last_name': 'O'Connor...
Name: 0, dtype: object
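Columns such as `provider` and `online_resource` hold nested dictionaries, which makes them awkward to filter on. One possible way to handle this is to flatten each record into dotted keys before building the DataFrame. The helper below is a minimal sketch: `flatten_record` is a hypothetical name, not part of the API, and the sample record is a trimmed copy of the output shown above.

```python
def flatten_record(record, parent_key="", sep="."):
    """Flatten nested dictionaries into a single dict with dotted keys.

    Lists (for example the 'contact' list) are kept as they are.
    """
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten_record(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Trimmed copy of the first md_metadata record shown above
sample = {
    "id": 16519,
    "provider": {"name": "CLU", "atom": "http://localhost:5009/providers/11"},
    "file_identifier": "20230205_galati_lwc-scaled-adiabatic.nc",
    "online_resource": {"linkage": "https://cloudnet.fmi.fi/"},
}
flat_sample = flatten_record(sample)
print(sorted(flat_sample))
```

With the notebook's variables this could be used as `pd.DataFrame.from_records([flatten_record(m) for m in md_list])`. pandas also offers `pd.json_normalize(md_list)`, which flattens nested dictionaries in a similar way.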

In [22]:
# The 'contact' column above holds details about the contact person(s) for each dataset.

df_md_metadata.iloc[0]['contact'] # show contact information for first dataset

[{'first_name': 'Ewan',
  'last_name': "O'Connor",
  'organisation_name': 'Finnish Meteorological Institute (FMI)',
  'role_code': ['pointOfContact'],
  'country_code': 'FI'}]
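Because `contact` is a list of dictionaries, picking out the people with a given role takes a small loop. The following is a sketch under stated assumptions: `contact_names` is a hypothetical helper, and it assumes every contact entry carries the `first_name`, `last_name` and `role_code` keys seen in the output above.

```python
def contact_names(contacts, role="pointOfContact"):
    """Return 'First Last' for every contact that has the given role code."""
    names = []
    for contact in contacts:
        if role in contact.get("role_code", []):
            first = contact.get("first_name", "")
            last = contact.get("last_name", "")
            names.append(f"{first} {last}".strip())
    return names


# Contact list copied from the output above
sample_contacts = [
    {
        "first_name": "Ewan",
        "last_name": "O'Connor",
        "organisation_name": "Finnish Meteorological Institute (FMI)",
        "role_code": ["pointOfContact"],
        "country_code": "FI",
    }
]
print(contact_names(sample_contacts))
```

In the notebook, `df_md_metadata['contact'].apply(contact_names)` would apply the same helper to every dataset.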

In [23]:
# Another example of extracting metadata, here the content information.
files_list = []
for f in metadata_archive:
    content_info = f['md_content_information']
    files_list.append(content_info)

df_content_information = pd.DataFrame.from_records(files_list)
# Display the content information for all datasets on the retrieved page
df_content_information


                              attribute_descriptions         content_type
0                [liquid droplet mass concentration]  physicalMeasurement
1                  [ice particle mass concentration]  physicalMeasurement
2  [air vertical velocity, drizzle droplet equivo...  physicalMeasurement
3                  [hydrometeor type classification]  physicalMeasurement
4                [liquid droplet mass concentration]  physicalMeasurement
5                  [ice particle mass concentration]  physicalMeasurement
6  [air vertical velocity, drizzle droplet equivo...  physicalMeasurement
7                  [hydrometeor type classification]  physicalMeasurement
8                [liquid droplet mass concentration]  physicalMeasurement
9                  [ice particle mass concentration]  physicalMeasurement
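Since `attribute_descriptions` is a list per dataset, a quick way to see which variables occur on a page is to count them. This is a minimal sketch: `count_attributes` is a hypothetical helper, and the sample records are a hand-trimmed subset of the rows above, not real API output.

```python
from collections import Counter


def count_attributes(content_records):
    """Count how often each attribute description occurs across datasets."""
    counts = Counter()
    for record in content_records:
        counts.update(record.get("attribute_descriptions", []))
    return counts


# Hand-trimmed sample based on rows 0, 1 and 4 of the table above
sample_content = [
    {"attribute_descriptions": ["liquid droplet mass concentration"],
     "content_type": "physicalMeasurement"},
    {"attribute_descriptions": ["ice particle mass concentration"],
     "content_type": "physicalMeasurement"},
    {"attribute_descriptions": ["liquid droplet mass concentration"],
     "content_type": "physicalMeasurement"},
]
attribute_counts = count_attributes(sample_content)
print(attribute_counts.most_common())
```

In the notebook, the same function could be called with the list of `md_content_information` dictionaries collected from `metadata_archive`.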


In [24]:
# Another example of extracting metadata, here the distribution information.
# The distribution information includes data format, dataset URL, protocol, restrictions and more.

files_list = []
for f in metadata_archive:
    dist_info = f['md_distribution_information'][0]  # first distribution entry only
    files_list.append(dist_info)

df_distribution_information = pd.DataFrame.from_records(files_list)
df_distribution_information.iloc[0]  # show the distribution information for the first dataset
# To see the distribution information for all datasets on the page, remove .iloc[0]

data_format                                                       netcdf
version_data_format                                       HDF5 (NetCDF4)
dataset_url            https://cloudnet.fmi.fi/api/download/product/0...
protocol                                                            HTTP
function                                                        download
restriction                                               {'set': False}
transfersize                                                       0.237
description                                 Direct download of data file
Name: 0, dtype: object
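The `dataset_url`, `function` and `restriction` fields are what you would typically need in order to fetch the data files themselves. The sketch below collects the URLs that look directly downloadable. `downloadable_urls` is a hypothetical helper, the `example.org` URLs are placeholders rather than real datasets, and the reading of `restriction['set'] == False` as "no access restriction" is an assumption that should be checked against the API documentation.

```python
def downloadable_urls(metadata_records):
    """Collect dataset URLs from unrestricted 'download' distribution entries."""
    urls = []
    for record in metadata_records:
        # Look at every distribution entry, not only the first one
        for dist in record.get("md_distribution_information", []):
            # Assumption: restriction['set'] is False when access is unrestricted
            restricted = dist.get("restriction", {}).get("set", False)
            url = dist.get("dataset_url")
            if dist.get("function") == "download" and not restricted and url:
                urls.append(url)
    return urls


# Placeholder records that mimic the structure shown above
sample_records = [
    {"md_distribution_information": [
        {"dataset_url": "https://example.org/file-a.nc",
         "function": "download", "restriction": {"set": False}}]},
    {"md_distribution_information": [
        {"dataset_url": "https://example.org/file-b.nc",
         "function": "download", "restriction": {"set": True}}]},
    {"md_distribution_information": []},
]
print(downloadable_urls(sample_records))
```

In the notebook, `downloadable_urls(metadata_archive)` would return the candidate download links for the retrieved page.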