Paola Nassisi edited this page Dec 7, 2017 · 8 revisions

Welcome to the esgf-dashboard wiki!

ESGF Dashboard

The ESGF Dashboard system provides a distributed and scalable monitoring framework that captures usage metrics at the single site level, at the ENES archive level and at the global ESGF level. The Dashboard collects and stores a high volume of heterogeneous metrics, covering aggregated cross-project and project-specific download statistics as well as the status of the federated archive in terms of published datasets, models and institutes involved.

The Dashboard has a plug-in based architecture composed of (i) an Information Provider, responsible for collecting an extended set of data usage information, (ii) a data warehouse system, and (iii) a GUI that visualizes the statistics through a centralized approach.

Dashboard Information Provider

The Information Provider, included in the ESGF software stack, can be configured to collect information at the single site level (data node) or to gather statistics from other nodes (collector node).

At the data node level, it retrieves download information from the THREDDS web catalog, discovers the related metadata from the SOLR service on the reference ESGF index node and aggregates the data into the storage database. At the collector node level, the Information Provider instead federates the local statistics by querying the ESGF REST API service of a set of data nodes.
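The federation step at the collector node can be illustrated with a minimal sketch. It assumes each data node has already returned a small dictionary of local totals; the key names (`downloads`, `volume_bytes`, `total_duration_s`) and the node names are hypothetical, and the actual REST query is omitted because its endpoint layout is not described here.

```python
def federate(per_node_stats):
    """Merge per-node statistics into federation-level totals.

    per_node_stats maps a node name to a dict with the hypothetical keys
    'downloads', 'volume_bytes' and 'total_duration_s'.
    """
    totals = {"downloads": 0, "volume_bytes": 0, "total_duration_s": 0.0}
    for stats in per_node_stats.values():
        for key in totals:
            totals[key] += stats[key]
    # Derive the average from summed totals so that busy nodes weigh more
    # than idle ones; averaging per-node averages would be biased.
    downloads = totals["downloads"]
    totals["average_duration_s"] = (
        totals["total_duration_s"] / downloads if downloads else 0.0
    )
    return totals


# Illustrative input: values a collector might have received from two nodes.
nodes = {
    "node-a": {"downloads": 10, "volume_bytes": 1000, "total_duration_s": 20.0},
    "node-b": {"downloads": 30, "volume_bytes": 5000, "total_duration_s": 120.0},
}
federated = federate(nodes)
```

Summing raw totals rather than averaging per-node averages keeps the federated mean duration correct when nodes serve very different numbers of downloads.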

Cataloguing this large collection of data and extracting relevant, reliable information to support decision-makers required an extension of the esgcet database. To this end, the data warehouse that supports the Information Provider includes not only logging information but also project-specific metadata, client geolocation, the status of the published datasets and so on.

All download information gathered on the data nodes is processed by an Extraction-Transformation-Loading (ETL) system included in the ESGF Dashboard and ingested into the data warehouse, from which the reference collector node retrieves it through the ESGF REST API service.
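The ETL flow described above might look roughly like the following sketch. It is not the Dashboard's implementation: an in-memory SQLite database stands in for the data warehouse, the log rows and the table schema are invented for illustration, and the project name is derived from the file path only as a stand-in for the metadata lookup against SOLR.

```python
import sqlite3

# Hypothetical raw log rows: (file path, bytes transferred, client country).
RAW_LOG = [
    ("/cmip5/tas_day.nc", 1000, "IT"),
    ("/cmip5/pr_day.nc", 2500, "DE"),
    ("/cordex/tas_mon.nc", 400, "IT"),
]


def extract():
    """Extraction: read the raw download log (here, a static list)."""
    return list(RAW_LOG)


def transform(rows):
    """Transformation: derive a project name from the first path component."""
    out = []
    for path, size, country in rows:
        project = path.strip("/").split("/")[0]
        out.append((project, size, country))
    return out


def load(conn, rows):
    """Loading: insert the transformed rows into the stand-in warehouse."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS downloads "
        "(project TEXT, size_bytes INTEGER, country TEXT)"
    )
    conn.executemany("INSERT INTO downloads VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))

# Project-specific download volume, as a collector might later query it.
per_project = dict(
    conn.execute(
        "SELECT project, SUM(size_bytes) FROM downloads GROUP BY project"
    )
)
```

Keeping the three stages as separate functions means the transformation can be swapped for a real metadata lookup without touching extraction or loading.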

esgf-stats-api github repository →

Dashboard user interface

The system offers an analytics web interface with a set of simple graphical widgets (e.g. charts, maps, reports) that give users a comprehensive view of data usage statistics.

The ESGF Dashboard UI is a web application installed on a collector node, through which the user can access the different metrics and statistics.

The visualized statistics resulting from the aggregation process of the Information Provider are:
* total volume of the downloads;
* total number of the downloads;
* total number of successful downloads;
* total number of downloads of replicated files;
* average download duration.

All these metrics are visualized through intuitive and interactive widgets.
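As a rough illustration of how the five statistics listed above relate to individual download records, the following sketch computes them from a small in-memory log. The `Download` record and its field names are assumptions made for this example, not the Dashboard's schema, and failed downloads are counted towards the total volume here purely as a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Download:
    """One download log entry; field names are illustrative assumptions."""
    size_bytes: int
    duration_s: float
    success: bool
    replica: bool


def summarize(downloads):
    """Compute the five aggregate metrics over a list of Download records."""
    count = len(downloads)
    total_duration = sum(d.duration_s for d in downloads)
    return {
        "total_volume_bytes": sum(d.size_bytes for d in downloads),
        "total_downloads": count,
        "successful_downloads": sum(1 for d in downloads if d.success),
        "replica_downloads": sum(1 for d in downloads if d.replica),
        # Guard against an empty log to avoid division by zero.
        "average_duration_s": total_duration / count if count else 0.0,
    }


sample = [
    Download(size_bytes=1000, duration_s=2.0, success=True, replica=False),
    Download(size_bytes=3000, duration_s=4.0, success=True, replica=True),
    Download(size_bytes=500, duration_s=6.0, success=False, replica=False),
]
stats = summarize(sample)
```

Each entry in the returned dictionary corresponds to one item in the list of visualized statistics, so a widget could read its value directly by key.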

esgf-dashboard-ui github repository →

Visit the dashboard user interface →