# Accessing Prometheus Metrics 

## Using the Prometheus API Client

In this notebook, we will learn how to use the Prometheus API client to fetch raw metrics from a Prometheus host and format them for data science analysis.
You can find more information about the functions of the API client here: [https://prometheus-api-client-python.readthedocs.io/en/master/source/prometheus_api_client.html#prometheus-api-client-package](https://prometheus-api-client-python.readthedocs.io/en/master/source/prometheus_api_client.html#prometheus-api-client-package)

For those wondering "What is a notebook?", let's do a quick introduction!

### Intro to Jupyter Notebook

The [Jupyter Notebook](https://jupyter.org/) is an incredibly powerful tool for interactively developing and presenting data science projects in a Python environment. Two main components of a notebook are:

- **Kernel** - computational engine that executes the code contained in a notebook document
- **Cell** - container for text to be displayed in the notebook or code to be executed by the notebook’s kernel

To help you get started, below is a list of a few useful keyboard shortcuts you can try while exploring this notebook.

#### Keyboard shortcuts

- To run a cell, press: `Shift + Enter`

- To add a cell before your current cell, press: `Esc + a`

- To add a cell after your current cell, press: `Esc + b`

- To cut (delete) a cell, press: `Esc + x`

- To be able to edit a cell, press: `Enter`

- To see more documentation about a function, type `?function_name`

- To see source code, type `??function_name`

- To quickly see possible arguments for a function, press `Shift + Tab` after typing the function name.

`Esc` and `Enter` take you into different modes. Press `Esc + h` to see all shortcuts.

Now that you have an open notebook in front of you, its interface will hopefully not look entirely alien; after all, Jupyter is essentially just an advanced word processor. Why not take a look around? Check out the menus to get a feel for it, and take a few moments to scroll down the list of commands in the command palette, which opens from the small button with the keyboard icon (or `Ctrl + Shift + P`).

#### Installing the library

Now that you are a bit familiar with the notebook environment, let's start exploring the Prometheus API Client library. To begin any exploratory analysis, we need to first install all the required packages.

For this notebook in particular, the prometheus api client library needs to be installed: [https://pypi.org/project/prometheus-api-client/](https://pypi.org/project/prometheus-api-client/)

Start executing each of the cells below. When you run a cell, its output will display below and the label to its left will have changed from `In []` to `In [1]`. The `In` part of the label is simply short for Input, while the label number indicates when the cell was executed on the kernel. Run the cell again and observe how the label changes.

In [None]:
# %pip installs into the environment of the running kernel
%pip install prometheus-api-client matplotlib

We will now import a few classes from the Prometheus API client library which will help us connect to a Prometheus host and fetch the relevant metrics from it.

In [None]:
from prometheus_api_client import Metric, MetricsList, PrometheusConnect
import os

With the necessary packages installed and imported, we can start collecting some data from a Prometheus host.

#### Connecting to Prometheus

The `PrometheusConnect` class of the library is used to connect to a Prometheus host and collect metrics from it. It stores the following connection parameters:

- `url` - (str) URL of the Prometheus host
- `headers` - (dict) A dictionary of HTTP headers to be used to communicate with the host. Example: `{"Authorization": "bearer my_oauth_token_to_the_host"}`
- `disable_ssl` - (bool) If set to True, will disable SSL certificate verification for the HTTP requests made to the Prometheus host

To know more about this class, you can refer to the documentation here: [https://prometheus-api-client-python.readthedocs.io/en/master/source/prometheus_api_client.html#module-prometheus_api_client.prometheus_connect](https://prometheus-api-client-python.readthedocs.io/en/master/source/prometheus_api_client.html#module-prometheus_api_client.prometheus_connect)
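
As a rough sketch, the three connection parameters above could be assembled in one place before creating the connection. The helper name `connection_settings`, the host `prometheus.example.com` and the environment variable `PROM_TOKEN` are made up for illustration and are not part of the library.

```python
import os


def connection_settings(host, port="9090", token=None, verify_ssl=True, scheme="http"):
    """Assemble keyword arguments for PrometheusConnect (hypothetical helper)."""
    settings = {
        "url": f"{scheme}://{host}:{port}",
        "disable_ssl": not verify_ssl,
    }
    if token:
        # Header format copied from the example in the parameter list above.
        settings["headers"] = {"Authorization": f"bearer {token}"}
    return settings


# PROM_TOKEN is a made-up environment variable name used only for this sketch.
settings = connection_settings(
    "prometheus.example.com", token=os.getenv("PROM_TOKEN")
)
print(settings)
```

The resulting dictionary could then be passed along as `PrometheusConnect(**settings)`.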

In [None]:
# Read the port from the environment, falling back to the Prometheus default
prom_port = os.getenv("PROMETHEUS_DEMO_SERVICE_SERVICE_PORT", "9090")
prom_url = f"http://demo.robustperception.io:{prom_port}"
print("Prometheus URL:", prom_url)

In [None]:
# Creating the prometheus connect object with the required parameters
pc = PrometheusConnect(url=prom_url, disable_ssl=True)

The host used here is a public demo instance of Prometheus hosted by [Robust Perception](http://demo.robustperception.io:9090/consoles/index.html).

In [None]:
# Let's print only the first 10 metric names as the list is too long.
pc.all_metrics()[:10]

#### Fetching Metrics from Prometheus

Every metric in Prometheus is stored as time series data: streams of timestamped values belonging to the same metric and the same set of labeled dimensions. Each of these time series is uniquely identified by:

- **metric name** - Specifies the general feature of a system that is measured. E.g. `http_requests_total` - the total number of HTTP requests received.
- **labels** - Provide more details to identify a particular dimensional instantiation of the metric. E.g. `http_requests_total{method="POST", handler="/api/tracks"}`: all HTTP requests that used the `POST` method on the `/api/tracks` handler.

Prometheus provides a functional query language called **PromQL (Prometheus Query Language)** that lets the user select and aggregate time series data in real time.
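
To make the relation between a metric name, its labels and a PromQL selector concrete, here is a minimal sketch that builds the selector string from a dictionary. `build_selector` is a hypothetical helper written for this notebook, not a function of the library, and it only covers exact-match (`=`) label matchers.

```python
def build_selector(metric_name, labels=None):
    """Return a PromQL selector such as name{key="value"} (exact matches only)."""
    if not labels:
        return metric_name
    matchers = []
    for key, value in labels.items():
        # Escape backslashes and double quotes so the value stays a valid string.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        matchers.append(f'{key}="{escaped}"')
    return metric_name + "{" + ",".join(matchers) + "}"


selector = build_selector(
    "http_requests_total", {"method": "POST", "handler": "/api/tracks"}
)
print(selector)
```

A string built this way is the kind of query text that PromQL accepts for selecting a single label configuration.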

The `custom_query()` method in the library can be used to fetch metrics with a PromQL query.

Parameters:

- **query** – (str) A PromQL query. A few examples can be found at [https://prometheus.io/docs/prometheus/latest/querying/examples/](https://prometheus.io/docs/prometheus/latest/querying/examples/)
- **params** – (dict) Optional dictionary containing GET parameters to be sent along with the API request, such as `time`

Let's try to fetch the values for a given metric.

In [None]:
# Here, we are fetching the values of a particular metric name
pc.custom_query(query="prometheus_http_requests_total")

Now, let's see if we can fetch a particular label configuration of this metric.

In [None]:
pc.custom_query(query="prometheus_http_requests_total{code='200'}")

We see that this filters the data to contain only values for the specified label configuration.
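
Each entry returned by instant queries like the ones above is a dictionary with a `metric` key (the label set, with the metric name stored under `__name__`) and a `value` key holding a `[timestamp, "value"]` pair, where the value arrives as a string. The sketch below unpacks one such entry. The sample data is hand-written for illustration and `parse_instant_sample` is a hypothetical helper.

```python
from datetime import datetime, timezone

# Hand-written sample in the shape of an instant-query result; values are made up.
sample_result = [
    {
        "metric": {
            "__name__": "prometheus_http_requests_total",
            "code": "200",
            "handler": "/metrics",
        },
        "value": [1700000000, "42"],
    },
]


def parse_instant_sample(entry):
    """Split one result entry into (name, labels, timestamp, float value)."""
    labels = dict(entry["metric"])
    # Aggregations such as sum() drop the metric name, so default to None.
    name = labels.pop("__name__", None)
    timestamp, raw_value = entry["value"]
    when = datetime.fromtimestamp(float(timestamp), tz=timezone.utc)
    return name, labels, when, float(raw_value)


for entry in sample_result:
    print(parse_instant_sample(entry))
```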

Can we now fetch the sum of this metric?

In [None]:
pc.custom_query(query="sum(prometheus_http_requests_total)")
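
Prometheus serves instant queries through its HTTP API at `/api/v1/query`, and the optional `params` dictionary corresponds to extra GET parameters such as `time`. The sketch below only builds such a URL with the standard library so we can see what a request could look like; it sends nothing, and `instant_query_url` is a hypothetical helper rather than the client's own code.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, query, params=None):
    """Build the URL for an instant query against the Prometheus HTTP API."""
    query_params = {"query": query}
    # Extra GET parameters (for example "time") are merged in as-is.
    query_params.update(params or {})
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode(query_params)}"


url = instant_query_url(
    "http://demo.robustperception.io:9090",
    "sum(prometheus_http_requests_total)",
    params={"time": "1700000000"},
)
print(url)
```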

**Question 1:** Pick any metric and label configuration and fetch its values using each of the following query functions:

- **sum()**
- **rate()** [over 5 minutes]
- **count()**

An explanation of these query functions can be found at: [https://prometheus.io/docs/prometheus/latest/querying/functions/](https://prometheus.io/docs/prometheus/latest/querying/functions/)

[Hint: Refer to [https://prometheus.io/docs/prometheus/latest/querying/examples/](https://prometheus.io/docs/prometheus/latest/querying/examples/) for querying examples]

#### Collecting Historical Data

Suppose we want to fetch historical data, say the past few days of data. We can do so by using the `get_metric_range_data()` method, which fetches the data for the specified metric label configuration within the specified time range. It takes the following parameters:

- **metric_name** – (str) The name of the metric
- **label_config** – (dict) A dictionary specifying metric labels and their values
- **start_time** – (datetime) A datetime object that specifies the metric range start time
- **end_time** – (datetime) A datetime object that specifies the metric range end time
- **chunk_size** – (timedelta) Duration of metric data downloaded in one request
- **store_locally** – (bool) If set to True, will store data locally at `./metrics/hostname/metric_date/name_time.json.bz2`
- **params** – (dict) Optional dictionary containing GET parameters to be sent along with the API request, such as `time`

Let's fetch the past 2 days of data for a specific metric-label configuration in chunks of 1 day.

We have set up a metric `test` with 3 weeks of historical data, but there is no current metric data available for it.

In [None]:
# Import the required datetime functions
from prometheus_api_client.utils import parse_datetime
from datetime import timedelta

start_time = parse_datetime("2d")
end_time = parse_datetime("now")
chunk_size = timedelta(days=1)

metric_data = pc.get_metric_range_data(
    "test{job='testdata'}",  # this is the metric name and label config
    start_time=start_time,
    end_time=end_time,
    chunk_size=chunk_size,
)

In [None]:
len(metric_data)

In [None]:
type(metric_data)

Let's take a closer look at the `metric_data` that we fetched.

In [None]:
for metric in metric_data:
    print(metric["metric"], "\n")
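
Because the range was requested in chunks, the same time series can show up more than once in `metric_data`, once per chunk. To picture how a time range breaks into `chunk_size` windows, here is a small standard-library sketch. It illustrates the idea only and is not the library's actual implementation; `chunk_windows` and the dates are made up for the example.

```python
from datetime import datetime, timedelta


def chunk_windows(start_time, end_time, chunk_size):
    """Yield consecutive (start, end) windows covering start_time to end_time."""
    if chunk_size <= timedelta(0):
        raise ValueError("chunk_size must be positive")
    window_start = start_time
    while window_start < end_time:
        # The last window is shortened so it never runs past end_time.
        window_end = min(window_start + chunk_size, end_time)
        yield window_start, window_end
        window_start = window_end


end = datetime(2024, 1, 3)
start = end - timedelta(days=2)
windows = list(chunk_windows(start, end, timedelta(days=1)))
for window in windows:
    print(window)
```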

**Question 2:** Can you fetch the past 12 hours of data for the metric in chunks of 1 hour?

We can also fetch the **current** metric value for a specified metric and label configuration using the `get_current_metric_value()` method.

In [None]:
pc.get_current_metric_value(
    metric_name="prometheus_http_requests_total",
    label_config={"code": "200", "handler": "/metrics"},
)

**Question 3:** Fetch the current metric value for any metric, trying a few different label configurations.

To keep track of multiple metrics each with multiple chunks distributed in a list, we created the `Metric` and `MetricsList` classes.

### How MetricsList works

To combine the chunks for each metric, we can initialize a `MetricsList` object. It creates a list of `Metric` objects, where each object is unique for a specific time-series.

To know more about the `MetricsList` module, you can refer to the documentation here: [https://prometheus-api-client-python.readthedocs.io/en/master/source/prometheus_api_client.html#module-prometheus_api_client.metrics_list](https://prometheus-api-client-python.readthedocs.io/en/master/source/prometheus_api_client.html#module-prometheus_api_client.metrics_list)

In [None]:
metrics_object_list = MetricsList(metric_data)
print(len(metrics_object_list))
for item in metrics_object_list:
    print(type(item))

Let's see what each of these metric objects looks like.

In [None]:
for item in metrics_object_list:
    print(item.metric_name, item.label_config, "\n")

Each of these items is a unique metric time-series; none of them is repeated. The constructor for `MetricsList` combined all the chunks for each metric time-series into a single `Metric` object.
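
The idea of combining chunks can be sketched with plain dictionaries: group the raw chunks by their label set and join their samples. The data below is made up, `merge_chunks` is a hypothetical helper, and the real `MetricsList` constructor may work differently internally.

```python
from collections import defaultdict

# Made-up raw chunks in the shape of range-query results.
raw_chunks = [
    {
        "metric": {"__name__": "test", "job": "testdata"},
        "values": [[10, "1"], [20, "2"]],
    },
    {
        "metric": {"__name__": "test", "job": "other"},
        "values": [[10, "7"]],
    },
    {
        "metric": {"__name__": "test", "job": "testdata"},
        "values": [[30, "3"]],
    },
]


def merge_chunks(chunks):
    """Group chunks by label set and join their samples in time order."""
    merged = defaultdict(list)
    for chunk in chunks:
        # A frozenset of label pairs is hashable and ignores label ordering.
        series_key = frozenset(chunk["metric"].items())
        merged[series_key].extend(chunk["values"])
    return {
        key: sorted(values, key=lambda pair: pair[0])
        for key, values in merged.items()
    }


series = merge_chunks(raw_chunks)
print(len(raw_chunks), "chunks ->", len(series), "unique series")
```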

#### More about the Metric class

Let's look at one of the metrics from the `metrics_object_list` to learn more about the `Metric` class.

To know more about the `Metric` class, you can refer to the documentation here: [https://prometheus-api-client-python.readthedocs.io/en/master/source/prometheus_api_client.html#module-prometheus_api_client.metric](https://prometheus-api-client-python.readthedocs.io/en/master/source/prometheus_api_client.html#module-prometheus_api_client.metric)

In [None]:
print(metrics_object_list)
my_metric_object = metrics_object_list[1]  # One of the metrics from the list
print(type(my_metric_object))

What happens when we try to print the object?

In [None]:
print(my_metric_object)

We see that the `Metric` object mainly comprises the following 3 properties:

- **metric_name:** stores the name of the metric as a string

- **label_config:** stores metric labels and values as a dict

- **metric_values:** stores metric values as a pandas DataFrame

**Question 4:** Can you display the metric name?

**Question 5:** Can you display the label configurations of this metric?

**Question 6:** Can you display the metric values?

For a data scientist in particular, storing metric time series values in a `Metric` object makes them easier to manipulate and use for further exploratory data analysis.

#### Plotting

The `Metric` class also has a `plot()` method which lets you plot a simple line graph of the metric time series.

In [None]:
my_metric_object.plot()

**Question 7:** Can you plot the graphs for each of the unique metric time series in the metric object list?

#### The == operator

The `==` comparison operator checks whether two `Metric` objects, say `metric_object_1` and `metric_object_2`, belong to the same metric time-series and returns a Boolean `True`/`False` value.
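
Here is a toy sketch of what such a comparison might look like, assuming that "same time-series" means the same metric name and an identical label set, with the stored values playing no part. `SeriesId` is a made-up stand-in for illustration, not the library's `Metric` class.

```python
class SeriesId:
    """Toy stand-in for the identity part of a metric time series."""

    def __init__(self, metric_name, label_config):
        self.metric_name = metric_name
        self.label_config = dict(label_config)

    def __eq__(self, other):
        if not isinstance(other, SeriesId):
            return NotImplemented
        # Two objects match when both the name and the full label set agree.
        return (
            self.metric_name == other.metric_name
            and self.label_config == other.label_config
        )


chunk_1 = SeriesId("test", {"job": "testdata", "instance": "a"})
chunk_2 = SeriesId("test", {"instance": "a", "job": "testdata"})
chunk_3 = SeriesId("test", {"job": "testdata", "instance": "b"})

print(chunk_1 == chunk_2)  # same name and labels; label order does not matter
print(chunk_1 == chunk_3)  # the instance label differs
```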

Let's initialize a `Metric` object for each of the chunks that we downloaded from Prometheus in `metric_data`.

In [None]:
metric_object_chunk_list = []
for raw_metric in metric_data:
    metric_object_chunk_list.append(Metric(raw_metric))

In [None]:
metric_object_chunk_list

Let's look at the `metric_name` and `label_config` for the first two metrics.

In [None]:
print(
    metric_object_chunk_list[0].metric_name,
    metric_object_chunk_list[0].label_config,
)
print("\n------------------------------------------------------------\n")
print(
    metric_object_chunk_list[1].metric_name,
    metric_object_chunk_list[1].label_config,
)

**Question 8:** Are the two metrics the same?

- If yes, can you find any two metrics which are different? And if no, identify any two metrics which are the same.

#### The + operator

The `+` operator allows you to add two `Metric` objects that belong to the same metric time-series. It returns a new `Metric` object with the combined `metric_values` that are stored in both objects.
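
Reading "combined" as joining the samples of both objects rather than adding their values arithmetically (an interpretation of the sentence above), the idea can be sketched on plain lists of `(timestamp, value)` pairs. `combine_values` is a hypothetical helper, and dropping duplicate timestamps is an assumption of this sketch.

```python
def combine_values(values_a, values_b):
    """Merge two lists of (timestamp, value) samples from the same series."""
    combined = {}
    for timestamp, value in list(values_a) + list(values_b):
        # Keep one sample per timestamp; the later list wins on overlap (assumption).
        combined[timestamp] = value
    return sorted(combined.items())


first_chunk = [(10, 1.0), (20, 2.0)]
second_chunk = [(20, 2.0), (30, 3.0)]
combined = combine_values(first_chunk, second_chunk)
print(combined)
```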

**Question 9:** Identify and add any two metric objects which belong to the same metric time series.

**Question 10:** Plot the following:

- Each of the individual metric objects (i.e. metric-1 and metric-2)
- The combined sum of metric-1 and metric-2

**Question 11:** What happens when you try to add two Metric objects that belong to different metric time series?

### Snapshot of a Metric

The `MetricSnapshotDataFrame` class represents a Prometheus query response as a pandas DataFrame. It unpacks the metric label values, extracts the first or last timestamp-value pair when multiple pairs are returned, and concatenates the results before passing them to the pandas `DataFrame` constructor.

To know more about this module, you can refer to the documentation here: [https://prometheus-api-client-python.readthedocs.io/en/master/source/prometheus_api_client.html#module-prometheus_api_client.metric_snapshot_df](https://prometheus-api-client-python.readthedocs.io/en/master/source/prometheus_api_client.html#module-prometheus_api_client.metric_snapshot_df)

In [None]:
from prometheus_api_client import MetricSnapshotDataFrame

metric_data = pc.get_current_metric_value(
    metric_name="prometheus_http_request_duration_seconds_sum",
    label_config={"job": "prometheus"},
)
metric_df = MetricSnapshotDataFrame(metric_data)
metric_df.head()
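
To see what this unpacking amounts to, the sketch below flattens a hand-written query response into one row per time series, with one column per label. The sample data, the helper `snapshot_rows` and the column names `timestamp` and `value` are assumptions made for illustration; the library's own column names and options may differ.

```python
# Made-up query results; the second entry carries several timestamp-value pairs.
sample_results = [
    {"metric": {"__name__": "up", "job": "prometheus"}, "value": [100, "1"]},
    {
        "metric": {"__name__": "up", "job": "node"},
        "values": [[90, "0"], [100, "1"]],
    },
]


def snapshot_rows(results, keep="last"):
    """Flatten results into one row per series with label columns."""
    rows = []
    for entry in results:
        if "value" in entry:
            timestamp, value = entry["value"]
        else:
            # Several pairs: keep only the first or the last one.
            pairs = entry["values"]
            timestamp, value = pairs[-1] if keep == "last" else pairs[0]
        row = dict(entry["metric"])  # one column per label
        row["timestamp"] = timestamp
        row["value"] = float(value)
        rows.append(row)
    return rows


rows = snapshot_rows(sample_results)
for row in rows:
    print(row)
```

A list of dictionaries like `rows` is one of the inputs that the pandas `DataFrame` constructor accepts.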

### END

Great, you have successfully learnt how to fetch, manipulate and format metrics from Prometheus using the API client library! :) You can now get a better understanding of the metrics collected from your systems and applications.