# Experiment

This experiment shows an approach to time-series data analysis involving machine learning, developed for an MSc thesis project called 'Observing and Controlling Performance in Microservices'.

### Context:
For this experiment, the values to be analysed are stored in a time-series database. These values represent metrics extracted from the span data produced by a certain system.

### Objective:
The objective is to answer a question related to anomaly detection: "What is the overall reliability of the system?". To do this, we break it down into two questions:
1. How are requests being handled by a specific service? (Identify services that are experiencing periods of unreliability.)
2. Is there any problem related to the response time?

### Features involved:
- Status Codes -- (Question 1);
- Response time -- (Question 2);

### Considerations:
- Multiple features to be analyzed;
- Possible correlation between features;
- Unlabeled data --> (Un)supervised learning;

In [1]:
import pandas as pd

The code below was extracted from a module, the OpenTSDB client, included in Graphy. The function retrieves data from OpenTSDB through the REST API defined by the database developers. To retrieve data, the metric name, start timestamp, and end timestamp must be provided.

In [2]:
import json
import logging
import sys
from typing import Optional

import requests

logger = logging.getLogger(__name__)

opentsdb_address = 'http://127.0.0.1:4242/'
api_query = 'api/query'

def get_metrics(name: str, start_timestamp: int, end_timestamp: int) -> Optional[dict]:
    """
    Gets the metrics from OpenTSDB.

    :param name: The name of the metrics.
    :param start_timestamp: The start unix timestamp of the metric.
    :param end_timestamp: The end unix timestamp of the metric.
    :return: The metrics as a dictionary if success, None otherwise.
    """
    # Only the result of the first sub-query (the named metric) is read below
    json_body = {
        "start": start_timestamp,
        "end": end_timestamp,
        "queries": [{"aggregator": "sum", "metric": name},
                    {"aggregator": "sum", "tsuids": ["000001000002000042", "000001000002000043"]}]
    }

    data = None
    try:
        response = requests.post(opentsdb_address + api_query, data=json.dumps(json_body),
                                 headers={'content-type': 'application/json'})
        if response.status_code == 200:
            response_text = json.loads(response.text)
            if response_text:
                # 'dps' maps each timestamp to the aggregated value
                data = response_text[0].get('dps', None)
        return data
    except requests.exceptions.ConnectionError as ex:
        # requests raises its own ConnectionError, not the built-in one
        logger.error('{}: {}'.format(type(ex), ex))
        sys.exit(1)
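
To make the return value concrete, the sketch below parses a hand-written sample of the kind of JSON body `api/query` returns and extracts the `dps` mapping the same way `get_metrics` does. The payload values and the helper name `extract_dps` are assumptions for illustration, not data from the real system.

```python
import json
from typing import Optional

# Hypothetical sample of an OpenTSDB api/query response body: a JSON list with
# one object per returned time series; "dps" maps timestamps to values.
sample_body = json.dumps([
    {
        "metric": "huawei.status_code.nova-api-cascading.2XX",
        "tags": {},
        "aggregateTags": [],
        "dps": {"1530057600": 12, "1530057660": 15},
    }
])

def extract_dps(body: str) -> Optional[dict]:
    """Return the data points of the first series, or None if there is none."""
    series = json.loads(body)
    if not series:
        return None
    return series[0].get('dps')

dps = extract_dps(sample_body)   # data points of the first series
empty = extract_dps('[]')        # an empty result list yields None

print(dps)
print(empty)
```

Note that the timestamps arrive as string keys, so they need to be converted to numbers before any time-based sorting or arithmetic.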

# Data to be analyzed

The data is gathered from a time-series database (OpenTSDB). In this case, we retrieve a few metrics from it to perform the analysis.

We use an outer join to merge the data from multiple features. This merge method preserves every data point and fills the values that are missing in either input with NaN.
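
As a small illustration of that behaviour, the sketch below merges two toy frames that share only one timestamp. The values are made up; only the column names follow the ones used in this notebook.

```python
import pandas as pd

# Toy data (made up): the two features share only the timestamp 20
left = pd.DataFrame({'time': [10, 20], 'status_code.2XX': [5.0, 7.0]})
right = pd.DataFrame({'time': [20, 30], 'status_code.4XX': [1.0, 2.0]})

# The outer join on the shared 'time' column keeps all three timestamps
merged = (pd.merge(left, right, on='time', how='outer')
            .sort_values('time')
            .reset_index(drop=True))

print(merged)
print(merged.isna().sum())
```

Each feature ends up with one NaN: the row whose timestamp exists only in the other frame.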

In [4]:
start_timestamp = 1530057600
end_timestamp = 1530316800

service_name = 'nova-api-cascading'

metric_name = 'huawei.status_code.{}.2XX'.format(service_name)
metric_name_2 = 'huawei.status_code.{}.4XX'.format(service_name)
metric_name_3 = 'huawei.status_code.{}.5XX'.format(service_name)

metric_name_4 = 'huawei.response_time_avg.{}'.format(service_name)

metrics = get_metrics(metric_name, start_timestamp, end_timestamp)
metrics_2 = get_metrics(metric_name_2, start_timestamp, end_timestamp)
metrics_3 = get_metrics(metric_name_3, start_timestamp, end_timestamp)
metrics_4 = get_metrics(metric_name_4, start_timestamp, end_timestamp)

df_1 = pd.DataFrame(metrics.items(), columns=['time', 'status_code.2XX'])
df_2 = pd.DataFrame(metrics_2.items(), columns=['time', 'status_code.4XX'])
df_3 = pd.DataFrame(metrics_3.items(), columns=['time', 'status_code.5XX'])
df_4 = pd.DataFrame(metrics_4.items(), columns=['time', 'response_time_avg'])

df = pd.merge(df_1, df_2, how='outer')
df = pd.merge(df, df_3, how='outer')
df = pd.merge(df, df_4, how='outer')

# df.info() prints its summary itself and returns None, so call it directly
print('\nData info:')
df.info()

print('\nData:\n{}'.format(df))

print('\nMissing values counting:\n{}'.format(df.isna().sum()))

Running this cell failed because the query for the 5XX metric returned no data (`metrics_3` was `None`):

AttributeError: 'NoneType' object has no attribute 'items'
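
One way to keep the cell from failing when a metric has no data is to convert each result through a small helper that tolerates `None`. The helper name `metrics_to_frame` and the sample values are assumptions for illustration; this is a sketch, not the Graphy implementation.

```python
from typing import Optional

import pandas as pd

def metrics_to_frame(metrics: Optional[dict], column: str) -> pd.DataFrame:
    """Build a (time, column) frame; a missing metric yields an empty frame."""
    rows = list((metrics or {}).items())
    frame = pd.DataFrame(rows, columns=['time', column])
    # OpenTSDB returns timestamps as string keys; make both columns numeric
    return frame.astype({'time': 'int64', column: 'float64'})

# Made-up values standing in for the get_metrics results
ok = metrics_to_frame({'1530057600': 12, '1530057660': 15}, 'status_code.2XX')
missing = metrics_to_frame(None, 'status_code.5XX')

# The empty frame still has the right columns, so the outer join succeeds
merged = pd.merge(ok, missing, on='time', how='outer')
print(merged)
```

With this approach, a metric that returned nothing simply shows up as a column of NaN values after the merge, which the missing-value handling can then deal with.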

In [None]:
df_copy = df.copy()

In [None]:
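df_new = df.interpolate(method='linear')

# Fill the remaining 5XX gaps with 0 in place, keeping the other columns in df_new
df_new["status_code.5XX"] = df_new["status_code.5XX"].fillna(0)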

df_new

# print('\nMissing values counting:\n{}'.format(df_new.isna().sum()))
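
The effect of the two filling steps can be checked on a toy frame. The sketch below uses made-up values, with one interior gap in the 2XX column and a 5XX column that has no data at all; the name `df_toy` is hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up merged frame: 2XX has an interior gap, 5XX has no data at all
df_toy = pd.DataFrame({
    'time': [10, 20, 30, 40],
    'status_code.2XX': [4.0, np.nan, 8.0, 10.0],
    'status_code.5XX': [np.nan, np.nan, np.nan, np.nan],
})

# Sort by time first so that neighbouring rows are neighbouring in time
df_filled = df_toy.sort_values('time').interpolate(method='linear')

# Interpolation cannot fill a column with no known points; treat it as zero errors
df_filled['status_code.5XX'] = df_filled['status_code.5XX'].fillna(0)

print(df_filled)
print(df_filled.isna().sum())
```

Note that `method='linear'` treats the rows as equally spaced and ignores the actual timestamp values, so it is only a reasonable choice when the samples are roughly evenly spaced in time.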

In [None]:
from matplotlib import pyplot as plt

# Plot each status-code series before (df_copy) and after (df_new) filling the gaps.
# Titles and axis labels now name the column that is actually plotted.
for column in ['status_code.2XX', 'status_code.4XX', 'status_code.5XX']:
    for frame, state in [(df_copy, 'MISSING_VALUES'), (df_new, 'COMPLETE')]:
        frame.plot(x='time', y=column, figsize=(12, 6))
        plt.xlabel('Timestamps')
        plt.ylabel('Call count')
        plt.title('{} ({})'.format(column, state))