# Amun Inspection Performance Analysis [Template]

This notebook is a reusable template for performance analyses based on data from the [Amun Service](https://github.com/thoth-station/amun-api) and [Performance Indicators](https://github.com/thoth-station/performance). Customize it for the specific analysis you want to perform.

## Amun Inspections inputs

**Software stacks and native dependencies**

example:

  * `AICoE TensorFlow` - `tensorflow==2.2.0` available on the AICoE index (inspection identifier contains `rhtf`)
  
**OS images**

example:

  * `rhel-8` 

**Python Interpreters**

example:

  * `3.6` 
  
**Hardware**

example:


### Performance indicators
Performance Indicators (PI) used for performance analysis:

  * [matrix multiplication](https://github.com/thoth-station/performance/blob/master/tensorflow/matmul.py)
  * [convolution 1D](https://github.com/thoth-station/performance/blob/master/tensorflow/conv1d.py)
  * [convolution 2D](https://github.com/thoth-station/performance/blob/master/tensorflow/conv2d.py)

Each performance indicator was run `x` times per inspection run (`batch size == x`). The median of the results for each inspection is reported and used for further comparison.
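To make the aggregation concrete, the sketch below reduces the repeated runs of one performance indicator to a single median per inspection. The inspection identifiers and timings are made-up illustrative values, and `median_elapsed_per_inspection` is a hypothetical helper, not part of the Thoth libraries.

```python
from statistics import median


def median_elapsed_per_inspection(runs):
    """Map each inspection identifier to the median of its elapsed times [ms]."""
    return {inspection_id: median(times) for inspection_id, times in runs.items()}


# Hypothetical elapsed times [ms]: one value per run in the batch (batch size == 5).
example_runs = {
    "inspection-a": [102.0, 98.0, 101.0, 250.0, 99.0],
    "inspection-b": [120.0, 118.0, 121.0, 119.0, 122.0],
}

medians = median_elapsed_per_inspection(example_runs)
print(medians)
```

The median is less sensitive than the mean to an occasional slow run, such as the 250 ms value in the first list, which is a plausible reason for preferring it here.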

## Dataset content

The dataset contains, for each inspection: the inspection specification, build logs, job logs, hardware information of the node where the performance indicator was run, and the actual inspection job result.

No build-time or runtime errors were spotted with the tested stack.


# Analysis

Performance results are reported as elapsed time [ms].

The analyses performed in this notebook are:

example:

- Performance analysis across different TensorFlow stacks (Python packages), with and without an optimized library (e.g. MKL), keeping the hardware, OS image, Python interpreter, and number of CPUs fixed
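The idea of varying one factor while fixing the others can be sketched as follows. This is a minimal illustration on made-up records. The field names (`stack`, `os`, `python`, `ncpus`, `elapsed_ms`), the stack labels, and the `compare_stacks` helper are all hypothetical and do not reflect the actual dataframe columns produced later in this notebook.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: field names and values are illustrative only.
records = [
    {"stack": "with-mkl", "os": "rhel-8", "python": "3.6", "ncpus": 4, "elapsed_ms": 90.0},
    {"stack": "with-mkl", "os": "rhel-8", "python": "3.6", "ncpus": 4, "elapsed_ms": 94.0},
    {"stack": "without-mkl", "os": "rhel-8", "python": "3.6", "ncpus": 4, "elapsed_ms": 130.0},
    {"stack": "without-mkl", "os": "rhel-8", "python": "3.6", "ncpus": 4, "elapsed_ms": 126.0},
    # Different number of CPUs: excluded when ncpus is fixed to 4.
    {"stack": "without-mkl", "os": "rhel-8", "python": "3.6", "ncpus": 8, "elapsed_ms": 70.0},
]


def compare_stacks(records, **fixed):
    """Return the median elapsed time [ms] per stack for records matching `fixed`."""
    by_stack = defaultdict(list)
    for record in records:
        # Keep only records whose fixed factors all match the requested values.
        if all(record.get(key) == value for key, value in fixed.items()):
            by_stack[record["stack"]].append(record["elapsed_ms"])
    return {stack: median(times) for stack, times in by_stack.items()}


result = compare_stacks(records, os="rhel-8", python="3.6", ncpus=4)
print(result)
```

Filtering before aggregating means that any remaining difference between stacks is not caused by the fixed factors.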


## Assign environment variables and import libraries

In [1]:
%env THOTH_DEPLOYMENT_NAME     ocp-stage
%env THOTH_CEPH_HOST           https://s3.upshift.redhat.com/
%env THOTH_CEPH_BUCKET         thoth
%env THOTH_CEPH_BUCKET_PREFIX  data

env: THOTH_DEPLOYMENT_NAME=ocp-stage
env: THOTH_CEPH_HOST=https://s3.upshift.redhat.com/
env: THOTH_CEPH_BUCKET=thoth
env: THOTH_CEPH_BUCKET_PREFIX=data
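The `%env` magic is only available in IPython and Jupyter. When the same steps are run as a plain Python script, the variables can be set through `os.environ` instead. The sketch below reuses the values from the cell above. It assumes the Thoth libraries read these variables when they are used, so the assignment should come before the imports in the next cell.

```python
import os

# Values copied from the %env cell above; adjust them for your deployment.
thoth_settings = {
    "THOTH_DEPLOYMENT_NAME": "ocp-stage",
    "THOTH_CEPH_HOST": "https://s3.upshift.redhat.com/",
    "THOTH_CEPH_BUCKET": "thoth",
    "THOTH_CEPH_BUCKET_PREFIX": "data",
}

# Make the settings visible to code that runs later in this process.
os.environ.update(thoth_settings)

for name in thoth_settings:
    print(f"{name}={os.environ[name]}")
```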


In [1]:
from thoth.report_processing.components.inspection import AmunInspections
from thoth.report_processing.components.inspection import AmunInspectionsSummary
from thoth.report_processing.components.inspection import AmunInspectionsStatistics

inspection = AmunInspections()
inspection_runs_summary = AmunInspectionsSummary()
inspection_statistics = AmunInspectionsStatistics()

import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 1000)
pd.set_option('display.width', 1500)
pd.options.plotting.backend = "plotly"  # Use plotly for DataFrame.plot(); set to "matplotlib" to switch back

### Extract dataset if data are not retrieved from Ceph

In [None]:
from thoth.report_processing.utils import extract_zip_file

FILE_NAME = "dataset-name.zip"  # Path to the zipped dataset
extract_zip_file(FILE_NAME)
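If `thoth.report_processing` is not available, a similar result can likely be obtained with the standard library. The sketch below assumes that `extract_zip_file` simply unpacks the archive into the working directory, which is not confirmed here. `extract_dataset` is a hypothetical helper, not part of the Thoth libraries.

```python
import zipfile
from pathlib import Path


def extract_dataset(archive_path, target_dir="."):
    """Unpack a zip archive into target_dir and return the extracted member names."""
    archive_path = Path(archive_path)
    if not archive_path.is_file():
        raise FileNotFoundError(f"Dataset archive not found: {archive_path}")
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()
```

Calling `extract_dataset("dataset-name.zip")` would unpack the archive next to the notebook. The returned member names give a quick check that the expected `inspections` directory was included.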

In [None]:
inspections_identifiers = ['']  # List of identifiers for the analysis

## Retrieve and process data

### From Ceph

In [None]:
inspection_runs = inspection.aggregate_thoth_inspections_results(
    inspections_identifiers=inspections_identifiers,
)

### From local path

In [None]:
from pathlib import Path

current_path = Path.cwd()

inspection_runs = inspection.aggregate_thoth_inspections_results(
    inspections_identifiers=inspections_identifiers,
    is_local=True,
    repo_path=current_path.joinpath("inspections")
)

In [None]:
processed_inspection_runs, failed_inspection_runs = inspection.process_inspection_runs(
    inspection_runs,
)

In [None]:
inspections_df = inspection.create_inspections_dataframe(
    processed_inspection_runs=processed_inspection_runs,
    include_statistics=True
)

In [None]:
inspections_df

# Inspections summary report

In [None]:
report_results, _ = inspection_runs_summary.produce_summary_report(inspections_df=inspections_df)

## Hardware

In [None]:
report_results["hardware"]['platform']

In [None]:
report_results["hardware"]['processor']

In [None]:
report_results["hardware"]['flags']

In [None]:
report_results["hardware"]['ncpus']

In [None]:
report_results["hardware"]['info']

## Operating System

In [None]:
report_results["base_image"]['base_image']

In [None]:
report_results["base_image"]['number_cpus_run']

## Performance Indicator

In [None]:
report_results["pi"]['pi']

## Software Stack

In [None]:
report_results["software_stack"]['requirements_locked']

In [None]:
python_packages_dataframe, python_packages_versions = inspection.create_python_package_df(inspections_df=inspections_df)
python_packages_dataframe

# Create final dataframe

In [None]:
final_dataframe = inspection.create_final_dataframe(
    inspections_df=inspections_df,
    include_statistics=True
)
final_dataframe

# Plot results