# Tutorial 1 - Basics

Welcome to the *IBM SMF Explorer* Basics Tutorial.
You will learn how to get started with the *IBM SMF Explorer* framework and access SMF data.

This tutorial contains code cells. When you come across a code cell, execute it by selecting it and pressing ``Ctrl``+``Enter``.
Feel free to change the code, but keep in mind that later parts of the tutorial might be affected.


## Getting started

To start working with the *IBM SMF Explorer*, you need to import the ``smfexplorer`` package:

In [1]:
import smfexplorer

This tutorial was created for *SMF Explorer* version **1.0.2**.
To check the version, execute the following command: ``smfexplorer.__version__``.

In [2]:
smfexplorer.__version__

'1.0.4'
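
The installed version can differ from the version the tutorial was written for, so a quick comparison can help to spot a mismatch. The following is a minimal sketch: `version_tuple`, `is_at_least` and `TUTORIAL_VERSION` are hypothetical helpers that are not part of the *SMF Explorer*, and the code assumes purely numeric, dot-separated version strings.

```python
TUTORIAL_VERSION = "1.0.2"  # version this tutorial was created for


def version_tuple(version):
    """Turn a numeric, dot-separated version string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_at_least(installed, required=TUTORIAL_VERSION):
    """Return True if `installed` is the same as or newer than `required`."""
    return version_tuple(installed) >= version_tuple(required)


# In the notebook you would pass smfexplorer.__version__ instead of a literal.
print(is_at_least("1.0.4"))  # True
```

Comparing tuples of integers avoids the pitfall of comparing strings, where `"1.10.0"` would sort before `"1.9.0"`.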

All requests in the *SMF Explorer* are managed by an object called a **Context**.
A Context represents a connection to one or multiple SMF dumps/datasets and manages the state of all requests dispatched against it.
Creating separate contexts allows you to run the same requests against different SMF dumps in an easy manner.

The Context can be created by calling the ``new_context()`` function of the ``smfexplorer`` module.
If multiple datasets are defined (i.e., several dataset names are passed as arguments to ``new_context()``), the *SMF Explorer* executes requests against all of them and concatenates the data in the order in which the names were specified.

To separately fetch the data from different datasets, you can create as many Context objects as you like. 

> **Note**: if you are working with two or more datasets that were created as a result of the dump-dataset switch (i.e., one dataset is a continuation of another), we recommend assigning these datasets to one Context. 
If you want to work with datasets that contain different SMF records, we advise creating multiple Contexts for better performance.

In [3]:
DATASET = "YOUR.SMF.DATA"
ctx = smfexplorer.new_context(DATASET)

# Multiple datasets assigned to one Context instance:
# ctx2 = smfexplorer.new_context('YOUR.SMF.DATA1','YOUR.SMF.DATA2')
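
If you prefer to fetch the data of each dataset separately, one Context per dataset can be kept in a dictionary. The following is a minimal sketch: `contexts_per_dataset` and `fake_new_context` are hypothetical names introduced for illustration, and the stand-in factory only exists so that the code runs without SMF data. In the notebook you would pass ``smfexplorer.new_context`` as the factory.

```python
def contexts_per_dataset(dataset_names, new_context):
    """Create one Context per dataset name and return them keyed by name."""
    return {name: new_context(name) for name in dataset_names}


def fake_new_context(name):
    """Stand-in for smfexplorer.new_context (assumption, for illustration)."""
    return f"<Context for {name}>"


contexts = contexts_per_dataset(
    ["YOUR.SMF.DATA1", "YOUR.SMF.DATA2"], fake_new_context
)
for name, context in contexts.items():
    print(name, context)
```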

*IBM SMF Explorer* provides utility functions that can help you to understand your input dataset. 

You can check if a dataset is available:

In [4]:
print(smfexplorer.check_dataset(DATASET))
print(smfexplorer.check_dataset("WRONG.SMF.DATA"))

True
False
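
When you work with several dataset names, the availability check can be used to sort them into usable and unusable names before a Context is created. The following is a minimal sketch: `split_by_availability` and `fake_check` are hypothetical helpers, and the stand-in check only mimics the two results shown above. In the notebook you would pass ``smfexplorer.check_dataset`` instead.

```python
def split_by_availability(dataset_names, check_dataset):
    """Return two lists: names that pass the check and names that do not."""
    available, missing = [], []
    for name in dataset_names:
        if check_dataset(name):
            available.append(name)
        else:
            missing.append(name)
    return available, missing


def fake_check(name):
    """Stand-in for smfexplorer.check_dataset (assumption, for illustration)."""
    return name == "YOUR.SMF.DATA"


available, missing = split_by_availability(
    ["YOUR.SMF.DATA", "WRONG.SMF.DATA"], fake_check
)
print("available:", available)
print("missing:", missing)
```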


## Get metadata of the dataset

### Get available records

You can fetch a list of the SMF types and subtypes available in your SMF dataset using the ``get_available_records()`` function.

To get only the list of types, use the ``get_available_types()`` function. To get the subtypes of a given type, use the ``get_available_subtypes()`` function.

In [4]:
# get available types
ctx.get_available_types()

Unnamed: 0,type,count
0,2,1
1,3,1
2,14,50480
3,15,7753
4,62,30265
5,70,36
6,80,75008
7,82,12065
8,113,2664


In [6]:
# get available subtypes for the SMF type 70
ctx.get_available_subtypes(70)

Unnamed: 0,subtype,count
0,1,27
1,2,9


In [7]:
# get a complete SMF record overview
ctx.get_available_records()

Unnamed: 0,type,subtype,count
0,2,0,1
1,3,0,1
2,14,0,50480
3,15,0,7753
4,62,0,30265
5,70,1,27
6,70,2,9
7,80,0,75008
8,82,1,1
9,82,18,2
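
The three overviews are consistent with each other: summing the subtype counts of a type gives the count reported for that type. The following is a minimal sketch in plain Python that uses a few rows copied from the output above; `count_per_type` is a hypothetical helper, not part of the *SMF Explorer*.

```python
from collections import defaultdict

# (type, subtype, count) rows copied from the record overview above (excerpt).
records = [
    (14, 0, 50480),
    (15, 0, 7753),
    (70, 1, 27),
    (70, 2, 9),
]


def count_per_type(rows):
    """Sum the record counts of all subtypes that belong to the same type."""
    totals = defaultdict(int)
    for smf_type, _subtype, count in rows:
        totals[smf_type] += count
    return dict(totals)


totals = count_per_type(records)
print(totals[70])  # 36, the count listed for type 70 in the type overview
```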


### Get description of the dataset

You can request the metadata of your SMF dataset using the ``get_dataset_description()`` function.

In [5]:
desc = ctx.get_dataset_description()

The result of the method is a dictionary (``dict``). The keys of the dictionary are the dataset names used to create the context, and the values are the descriptions of the corresponding datasets. You can use the name of a dataset to extract its description from the result as follows:

In [7]:
dataset_desc = desc[DATASET]

The description of each dataset is also stored in a ``dict`` object. Use the ``keys()`` method of the ``dict`` to see what kinds of information are available:

In [9]:
dataset_desc.keys()

dict_keys(['creation_date', 'count', 'system_ids', 'estimated_size_in_bytes', 'first_time_in_buffer', 'last_time_in_buffer', 'earliest_start', 'lastest_start', 'earliest_end', 'lastest_end', 'type_info'])

You can use the key to retrieve information from the description. For example, you can get all the system IDs inside the dataset by calling:

In [10]:
dataset_desc['system_ids']

['J80 ']

Or you can get detailed information about the records of each type and subtype by calling:

In [13]:
dataset_desc['type_info']

Unnamed: 0,type,subtype,count,system_ids,estimated_size_in_bytes,first_time_in_buffer,last_time_in_buffer,earliest_start,lastest_start,earliest_end,lastest_end
0,113,1,1296,[J80 ],1973376,2019-07-25 00:00:20.550,2019-07-25 05:46:04.660,NaT,NaT,NaT,NaT
1,113,2,1368,[J80 ],2008224,2019-07-25 00:00:20.550,2019-07-25 05:46:04.660,NaT,NaT,NaT,NaT
2,14,0,50480,[J80 ],22811003,2019-07-25 00:00:00.000,2019-07-25 05:59:57.750,NaT,NaT,NaT,NaT
3,15,0,7753,[J80 ],3731270,2019-07-25 00:00:01.990,2019-07-25 05:59:55.880,NaT,NaT,NaT,NaT
4,2,0,1,[J80 ],18,2019-07-25 08:15:10.430,2019-07-25 08:15:10.430,NaT,NaT,NaT,NaT
5,3,0,1,[J80 ],18,2019-07-25 08:15:16.510,2019-07-25 08:15:16.510,NaT,NaT,NaT,NaT
6,62,0,30265,[J80 ],7735330,2019-07-25 00:00:00.510,2019-07-25 05:59:39.070,NaT,NaT,NaT,NaT
7,70,1,27,[J80 ],694224,2019-07-25 00:29:35.070,2019-07-25 05:59:35.080,2019-07-24 23:59:35,2019-07-25 05:29:35,2019-07-25 00:29:34.919,2019-07-25 05:59:34.926
8,70,2,9,[J80 ],37152,2019-07-25 00:29:35.140,2019-07-25 05:59:35.200,2019-07-24 23:59:35,2019-07-25 05:29:35,2019-07-25 00:29:34.919,2019-07-25 05:59:34.926
9,80,0,75008,[J80 ],22438707,2019-07-25 00:00:00.000,2019-07-25 05:59:59.220,NaT,NaT,NaT,NaT
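
The system ID in the outputs above (``'J80 '``) carries a trailing blank, which can get in the way when you compare it with other strings. The following is a minimal sketch of a clean-up step: `clean_system_ids` is a hypothetical helper, and `example_desc` is a stand-in dictionary that only contains the `system_ids` value shown earlier.

```python
def clean_system_ids(dataset_desc):
    """Return the system IDs of a dataset description without padding blanks."""
    return [sid.strip() for sid in dataset_desc["system_ids"]]


# Stand-in for desc[DATASET]; only the key used by the helper is included.
example_desc = {"system_ids": ["J80 "]}

print(clean_system_ids(example_desc))  # ['J80']
```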


## Fetching Data

The standard way to fetch data using the *IBM SMF Explorer* is to define the list of SMF fields you want to request.
For this purpose, the framework provides predefined definitions for many SMF fields.
To access those definitions you need to import them from the ``smfexplorer.fields`` module.
Fields are defined in classes that correspond to the SMF type and subtype.


In the following example, we import all fields for the SMF type 70 subtype 1 record.

> **Note**: the field classes follow the naming scheme SMF**XX**S**Y**, where *XX* represents the record type and *Y* its subtype.

In [8]:
from smfexplorer.fields import SMF70S1
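
The naming scheme can also be read programmatically. The following is a minimal sketch that splits a class name such as `SMF70S1` into its record type and subtype; `parse_field_class_name` is a hypothetical helper that only looks at the name and does not touch the ``smfexplorer`` package.

```python
import re

# SMF<type>S<subtype>, for example SMF70S1 or SMF99S12
_CLASS_NAME = re.compile(r"^SMF(\d+)S(\d+)$")


def parse_field_class_name(name):
    """Return (record_type, subtype) for a name that follows SMFXXSY."""
    match = _CLASS_NAME.match(name)
    if match is None:
        raise ValueError(f"{name!r} does not follow the SMFXXSY scheme")
    return int(match.group(1)), int(match.group(2))


print(parse_field_class_name("SMF70S1"))   # (70, 1)
print(parse_field_class_name("SMF99S12"))  # (99, 12)
```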

After importing from the fields module, you can access the documentation of an SMF field by pressing ``shift``+``tab``.
To test this out, place the cursor behind ``SMF70S1.lpar_busy`` in the cell below and press the key combination.
Alternatively, you can use the IPython **?** syntax to get help.

In [9]:
# select this cell and press shift+tab to see the documentation
SMF70S1.lpar_busy

<Field:lpar_busy:SMF70_LPAR_BUSY:<RecordMap:SMF70LCS:<Record:70-1>>>

In [None]:
?SMF70S1.lpar_name

Such information is provided on each field.
Feel free to have a look into the properties available.

To make working with SMF data easier, additional **virtual fields** are provided.
Virtual fields are derived from the SMF fields by the *IBM SMF Explorer*.
For example, the ``ziip_boost`` field shows whether zIIP Boost was active.

The virtual field ``ziip_boost`` uses the ``fla`` (SMF70FLA) field to extract the information.

In [None]:
?SMF70S1.ziip_boost

### Request() Method

To use the field definitions for data fetching, call the ``request()`` method of a Context.
You need to provide a list of the fields you would like to request.
To trigger the request, call ``.run()``.
When the request succeeds, it returns a pandas DataFrame.

In [10]:
ctx.request(
    [
        SMF70S1.timestamp,
        SMF70S1.sid,
        SMF70S1.lpar_name,
        SMF70S1.system_name,
        SMF70S1.sysplex_name,
        SMF70S1.lpar_system_name,
        SMF70S1.lpar_number,
        SMF70S1.lpar_cpu_count,
    ]
).run()

Unnamed: 0,timestamp,sid,lpar_name,system_name,sysplex_name,lpar_system_name,lpar_number,lpar_cpu_count
0,2019-07-25 00:29:35.070,J80,J80,J80,UTCPLXJ8,J80-J80,7,88
1,2019-07-25 00:29:35.070,J80,CF22,,,CF22-,1,1
2,2019-07-25 00:29:35.070,J80,CF3,,,CF3-,2,6
3,2019-07-25 00:29:35.070,J80,CT2,CT2,CT2PLEX,CT2-CT2,3,15
4,2019-07-25 00:29:35.070,J80,JA0,JA0,UTCPLXJ8,JA0-JA0,4,84
...,...,...,...,...,...,...,...,...
247,2019-07-25 05:59:35.080,J80,JJ0,,,,24,0
248,2019-07-25 05:59:35.080,J80,Z2,Z2,ZPETPLX2,Z2-Z2,25,35
249,2019-07-25 05:59:35.080,J80,CT1,,,,26,0
250,2019-07-25 05:59:35.080,J80,ISKLMLX1,,,ISKLMLX1-,27,1


The *IBM SMF Explorer* provides a fluent API.
That means you chain methods together to configure a request.
You have seen the most basic form of such a chain: ``request().run()``. The ``request()`` method starts a chain and the ``run()`` method ends it.
You will learn about other methods that can be used in the chain later and in other tutorials.
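
If the fluent style is new to you, the following toy class illustrates the general pattern. It is a minimal sketch and not the *SMF Explorer* implementation: `ToyRequest` and its `first()` step are hypothetical and exist only to show that every configuration method returns the object itself, so that calls can be chained until `run()` ends the chain.

```python
class ToyRequest:
    """Toy request builder that illustrates method chaining."""

    def __init__(self, rows, fields):
        self._rows = rows
        self._fields = list(fields)
        self._max_rows = None  # None means "no limit"

    def first(self, max_rows):
        """Hypothetical chain step: store a row limit and return self."""
        self._max_rows = max_rows
        return self  # returning self is what makes chaining possible

    def run(self):
        """End of the chain: apply the stored configuration, return rows."""
        selected = [{f: row[f] for f in self._fields} for row in self._rows]
        return selected[: self._max_rows]


# Two rows taken from the LPAR table above, reduced to three columns.
rows = [
    {"lpar_name": "J80", "lpar_number": 7, "lpar_cpu_count": 88},
    {"lpar_name": "CF22", "lpar_number": 1, "lpar_cpu_count": 1},
]

result = ToyRequest(rows, ["lpar_name", "lpar_number"]).first(1).run()
print(result)  # [{'lpar_name': 'J80', 'lpar_number': 7}]
```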

Note that you cannot combine fields arbitrarily in a request.
Not all fields are compatible, because they may originate from different structures and cannot be displayed in a single table in a logical or useful way.

The following request, for example, tries to combine SMF 70 subtype 1 and SMF 72 subtype 3 data into one table and **raises an exception**.
In general, you cannot combine fields of different record types.

**WARNING**: the following code causes an error

In [None]:
from smfexplorer.fields import SMF72S3

ctx.request([SMF70S1.timestamp, SMF70S1.sid, SMF72S3.utilization_total]).run()

#### Raw fields

The *IBM SMF Explorer* applies different transformations to the raw SMF data before returning it to the user.
Sometimes you might want to disable the post-processing of some fields to get the original values.

You can use the `raw` property of each field to see raw SMF values.
What `raw` returns depends on the type of the field.
For normal fields, post-processing is disabled.
For virtual fields, the raw value of the source field will be returned.

In the example below, you can see how `raw` fetches the `cpu_type` value without the *IBM SMF Explorer* post-processing.

In [13]:
ctx.request(
    [
        SMF70S1.cpu_type,
        SMF70S1.cpu_type.raw,
    ]
).run()

Unnamed: 0,cpu_type,cpu_type_raw
0,CP,0
1,CP,0
2,CP,0
3,CP,0
4,CP,0
...,...,...
643,zIIP,2
644,zIIP,2
645,zIIP,2
646,zIIP,2
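
Requesting a field together with its `raw` variant makes it possible to see which raw value stands behind which processed value. The following is a minimal sketch in plain Python that collects the distinct pairs from a few rows copied from the output above; `raw_to_label` is a hypothetical helper, and only the pairs visible in the output are used.

```python
# (cpu_type, cpu_type_raw) pairs copied from the output above (excerpt).
pairs = [("CP", 0), ("CP", 0), ("zIIP", 2), ("zIIP", 2)]


def raw_to_label(rows):
    """Map each raw value to the first processed value seen next to it."""
    mapping = {}
    for label, raw in rows:
        mapping.setdefault(raw, label)
    return mapping


print(raw_to_label(pairs))  # {0: 'CP', 2: 'zIIP'}
```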


### Using Sample Requests

To make common tasks easier, *IBM SMF Explorer* comes with a collection of predefined requests.
They let you fetch data from related fields without listing the fields individually.
These requests can be found in the ``samples`` property of any context.
In other tutorials, you will be shown how to create and register your own samples.

To fetch the same information as above (i.e., in the first successful ``ctx.request()`` call), we can use the ``lpar_information()`` sample request.

In [12]:
ctx.samples.lpar_information().run()

Unnamed: 0,timestamp,sid,lpar_name,system_name,sysplex_name,lpar_system_name,lpar_number,lpar_cpu_count
0,2019-07-25 00:29:35.070,J80,J80,J80,UTCPLXJ8,J80-J80,7,88
1,2019-07-25 00:29:35.070,J80,CF22,,,CF22-,1,1
2,2019-07-25 00:29:35.070,J80,CF3,,,CF3-,2,6
3,2019-07-25 00:29:35.070,J80,CT2,CT2,CT2PLEX,CT2-CT2,3,15
4,2019-07-25 00:29:35.070,J80,JA0,JA0,UTCPLXJ8,JA0-JA0,4,84
...,...,...,...,...,...,...,...,...
247,2019-07-25 05:59:35.080,J80,JJ0,,,,24,0
248,2019-07-25 05:59:35.080,J80,Z2,Z2,ZPETPLX2,Z2-Z2,25,35
249,2019-07-25 05:59:35.080,J80,CT1,,,,26,0
250,2019-07-25 05:59:35.080,J80,ISKLMLX1,,,ISKLMLX1-,27,1


To fetch additional fields together with a sample, use the named parameter ``display`` of the ``run()`` method.
You can provide a list of fields to this parameter just like in the ``request()`` method.

In [11]:
ctx.samples.lpar_information().run(display=[SMF70S1.capactiy_group_name])

Unnamed: 0,timestamp,sid,lpar_name,system_name,sysplex_name,lpar_system_name,lpar_number,lpar_cpu_count,capactiy_group_name
0,2019-07-25 00:29:35.070,J80,J80,J80,UTCPLXJ8,J80-J80,7,88,
1,2019-07-25 00:29:35.070,J80,CF22,,,CF22-,1,1,
2,2019-07-25 00:29:35.070,J80,CF3,,,CF3-,2,6,
3,2019-07-25 00:29:35.070,J80,CT2,CT2,CT2PLEX,CT2-CT2,3,15,
4,2019-07-25 00:29:35.070,J80,JA0,JA0,UTCPLXJ8,JA0-JA0,4,84,
...,...,...,...,...,...,...,...,...,...
247,2019-07-25 05:59:35.080,J80,JJ0,,,,24,0,
248,2019-07-25 05:59:35.080,J80,Z2,Z2,ZPETPLX2,Z2-Z2,25,35,
249,2019-07-25 05:59:35.080,J80,CT1,,,,26,0,
250,2019-07-25 05:59:35.080,J80,ISKLMLX1,,,ISKLMLX1-,27,1,


# Some available Samples

## Samples for SMF70S1:

``lpar_information()`` --  fields from SMF70S1 on LPARs

``processor_information()`` --  fields from SMF70S1 on processors

## Samples for SMF72S3:

``smf_72_03_sample()`` -- fields for SMF72 subtype 3 analysis


## Samples for SMF99S1:
    
``p_utilization()`` --  CP, zIIP, zAAP and total utilization
    
``rg_capping()`` -- Resource Group and Tenant Resource Group capping information
    
``smf_99_01_sample()`` --  commonly used SMF 99 subtype 1 data


## Samples for SMF99S2:
 

``srv_service()`` -- Service Class service consumption for CP, zIIP and zAAP

``smf_99_02_sample()`` -- commonly used SMF 99 Subtype 2 data

## Samples for SMF99S12:

``hiper_dispatch()`` -- HiperDispatch information per processor type

## Samples for SMF99S14:

``topology()`` -- topology information per processor
