In [1]:
'''
Licensed Materials - Property of IBM
IBM Maximo APM - Predictive Maintenance Insights On-Premises
IBM Maximo APM - Predictive Maintenance Insights SaaS 
© Copyright IBM Corp. 2019 All Rights Reserved.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
'''


# Maximo APM PMI - Predicted Failure Date Model Template

1. [Introduction](#introduction)
2. [Install Maximo APM PMI SDK](#install-maximo-apm-pmi-sdk)
3. [Setup the Model Training Pipeline](#setup-model-training-pipline)
4. [Train the Model Instance](#train-model-instance)
5. [Register the Trained Model Instance](#register-trained-model-instance)
6. [Model Template Internals](#model-template-internals)

<a id='introduction'></a>
## Introduction

###### Use Case Description

This notebook computes the time to failure of a system (such as a process or a unit task) or an asset (such as a device).

It uses the survival_analysis package of SROM, which supports several algorithms for computing survival functions and the median time to failure. The package handles both survival and failure analysis.

Survival analysis is a branch of statistics for analyzing the expected duration of time until one or more events happen including, but not limited to, failure in mechanical systems. This topic is called reliability theory or reliability analysis in engineering, duration analysis or duration modelling in economics, and event history analysis in sociology. Survival analysis attempts to answer questions such as: what is the proportion of a sample that will survive past a certain time? Of those that survive, at what rate will they fail?  How do particular circumstances or characteristics or causes increase or decrease the probability of survival or failure?

Broadly speaking, survival analysis involves the modelling of time-to-event data. In this context of industrial processes and assets, a failure or fault is the "event". The key assumption is that a single event occurs for each subject, after which the subject (an asset or a process) has stopped or failed, meaning it ceases to exist or survive.

This model provides several methods for survival/failure analysis, including:
+ Kaplan-Meier Estimator
+ Nelson-Aalen
+ Cox Regression
+ Aalen's Additive Regression Model
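
To make the first estimator in this list concrete, the following is a minimal NumPy sketch of the Kaplan-Meier product-limit estimate and the median time to failure read from it. It is an illustration only, not the SROM implementation; the function names `kaplan_meier` and `median_time_to_failure` and the sample durations are assumptions made up for this example.

```python
import numpy as np


def kaplan_meier(durations, events):
    """Return (event_times, survival) for the product-limit estimator.

    durations: time each subject was observed
    events:    1 if the failure was observed, 0 if right-censored
    """
    durations = np.asarray(durations, dtype=float)
    events = np.asarray(events, dtype=int)

    event_times = np.unique(durations[events == 1])
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(durations >= t)  # subjects still observed just before t
        failed = np.sum((durations == t) & (events == 1))
        s *= 1.0 - failed / at_risk
        survival.append(s)
    return event_times, np.array(survival)


def median_time_to_failure(event_times, survival):
    """First event time at which the survival estimate drops to 0.5 or below."""
    below = np.where(survival <= 0.5)[0]
    return float(event_times[below[0]]) if below.size else float("inf")


# Hypothetical run lengths in days; 0 marks an asset still running (censored).
days_run = [5, 8, 8, 12, 15, 20]
has_failed = [1, 1, 0, 1, 0, 1]

times, surv = kaplan_meier(days_run, has_failed)
median = median_time_to_failure(times, surv)
print(dict(zip(times, surv.round(3))), median)
```

Censored subjects never trigger a drop in the curve, but they still count in the at-risk total until their observation ends, which is how the estimator uses right-censored samples.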

The notebook shows the model building, training, and registration to PMI.

**Input Data**
The raw data for the time-to-failure model comes from Maximo and the IOT platform. The pipeline first retrieves the asset failure history and the assets' sensor data (the `*_episode` columns), then merges them into a single table with the following columns:

+ id (identifies the process, system, or asset)
+ event_timestamp (the timestamp of the data point)
+ faildate (the date on which the asset failed, taken from the Maximo failure history)
+ installdate (the installation date of the asset, taken from the Maximo asset metadata)
+ `*_episode` (sensor readings related to the failure event)

| id  | event_timestamp | faildate |  c_episode | t_episode | f_episode | p_episode| installdate | 
| ------------- |-------------|-------------|-------|------------|------------|------------|------------|
| SAMPLE_ASSET101-__-BEDFORD | 2018-07-02 | 2018-07-02 | 1 | 7  | 2 | 3| 2017-10-30T00:00:00 |
| SAMPLE_ASSET101-__-BEDFORD | 2019-03-18 | NaT | 1 | 5  | 2 | 5| 2017-10-30T00:00:00 |
| SAMPLE_ASSET102-__-BEDFORD | 2019-03-21 | 2019-03-21 | 2 | 8  | 4 | 3| 2017-10-29T00:00:00 |

In the data set above, each row corresponds to a device event. The `faildate` column holds the date on which the asset failed; it is NaT when no failure was recorded for that row. The model calculates the duration between consecutive failure dates for each asset, starting from the installation date for the first failure. The remaining columns are covariates that are potentially correlated with device failures.
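
The duration calculation can be sketched with pandas on the three sample rows from the table. This is a simplified illustration rather than the template's own code; the helper column names `interval_start` and `interval_end` are assumptions, while `has_failed` and `days_run` follow the names used later in the model template.

```python
import pandas as pd

# Rows copied from the sample table (sensor columns omitted for brevity).
df = pd.DataFrame({
    "id": [
        "SAMPLE_ASSET101-__-BEDFORD",
        "SAMPLE_ASSET101-__-BEDFORD",
        "SAMPLE_ASSET102-__-BEDFORD",
    ],
    "event_timestamp": pd.to_datetime(["2018-07-02", "2019-03-18", "2019-03-21"]),
    "faildate": pd.to_datetime(["2018-07-02", None, "2019-03-21"]),
    "installdate": pd.to_datetime(["2017-10-30", "2017-10-30", "2017-10-29"]),
})

# 1 where a failure was recorded, 0 where the asset is still running (censored).
df["has_failed"] = df["faildate"].notna().astype(int)

# Each interval ends at the failure date, or at the row timestamp if no failure.
df["interval_end"] = df["faildate"].fillna(df["event_timestamp"])
df = df.sort_values(["id", "interval_end"])

# Each interval starts at the asset's previous interval end,
# or at installdate for the asset's first row.
previous_end = df.groupby("id")["interval_end"].shift(1)
df["interval_start"] = previous_end.fillna(df["installdate"])

df["days_run"] = (df["interval_end"] - df["interval_start"]).dt.days
print(df[["id", "has_failed", "days_run"]])
```

The second row has no failure date, so it becomes a censored interval that runs from the previous failure to the row's own timestamp.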

**Output**

The output depends on the type of algorithm that best fits the data. As illustrated in this notebook below, the prediction of the time to the event takes one of several forms. For Kaplan-Meier, a non-parametric method, the output is the median time to failure; the estimator can also return the survival function. It does not take covariates into account. Its key advantage is that it handles right-censored samples well.
For Nelson-Aalen, the result is the cumulative hazard function. This is also a non-parametric method and does not take covariates into account.
For Cox Regression, which is suitable when covariates are available, the result is the hazard function and the expected duration before the event happens. This is a semi-parametric model that factors in the role of covariates in the occurrence of the event.
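
The Nelson-Aalen output can be illustrated in a few lines of NumPy: at each observed failure time, the cumulative hazard grows by the number of failures divided by the number of subjects still at risk. The sketch below is not the SROM implementation; the function name `nelson_aalen` and the sample data are assumptions for this example.

```python
import numpy as np


def nelson_aalen(durations, events):
    """Return (event_times, cumulative_hazard) for right-censored data."""
    durations = np.asarray(durations, dtype=float)
    events = np.asarray(events, dtype=int)

    event_times = np.unique(durations[events == 1])
    # Hazard increment at time t: failures at t / subjects at risk at t.
    increments = [
        np.sum((durations == t) & (events == 1)) / np.sum(durations >= t)
        for t in event_times
    ]
    return event_times, np.cumsum(increments)


# Hypothetical run lengths in days; 0 marks an asset still running (censored).
days_run = [5, 8, 8, 12, 15, 20]
has_failed = [1, 1, 0, 1, 0, 1]

times, cum_hazard = nelson_aalen(days_run, has_failed)
print(dict(zip(times, cum_hazard.round(3))))
```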

<a id="install-maximo-apm-pmi-sdk"></a>
## Install Maximo APM PMI SDK

To install the SDK, you need your Maximo APM PMI instance ID, API base URL, and API key. The instance ID and API base URL can be found in the user welcome letter. For the API key, ask your Maximo administrator to create a user account for you and generate a key. Then create one environment variable for each value:

In [2]:
%%capture
%env APM_ID=4ac3917e
%env APM_API_BASEURL=https://prod.pmi.apm.maximo.ibm.com
%env APM_API_KEY=dp7opk78635sbf809f07o4t53lum3c9eoaovpk9f
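
Because the output of the cell above is captured, a typo in a variable name goes unnoticed until the SDK fails to connect. A small check like the following can confirm that all three variables are set. The helper `missing_settings` is a hypothetical convenience for this notebook, not part of the PMI SDK.

```python
import os

REQUIRED_VARS = ("APM_ID", "APM_API_BASEURL", "APM_API_KEY")


def missing_settings(environ=os.environ, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not environ.get(name)]


missing = missing_settings()
if missing:
    print("Set these variables before installing the SDK:", ", ".join(missing))
else:
    print("All required variables are set.")
```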

Then, install PMI SDK with `pip`. Note that we have to upgrade `pip` first.

In [3]:
!pip install -U pip~=18.1
!pip install pyspark
!pip install -U https://prod.pmi.apm.maximo.ibm.com/ibm/pmi/service/rest/ds/4ac3917e/dp7opk78635sbf809f07o4t53lum3c9eoaovpk9f/lib/download?filename=pmlib-1.0.0.tar.gz

Requirement already up-to-date: pip~=18.1 in /opt/conda/envs/Python36/lib/python3.6/site-packages (18.1)
Collecting https://prod.pmi.apm.maximo.ibm.com/ibm/pmi/service/rest/ds/4ac3917e/dp7opk78635sbf809f07o4t53lum3c9eoaovpk9f/lib/download?filename=pmlib-1.0.0.tar.gz
  Downloading https://prod.pmi.apm.maximo.ibm.com/ibm/pmi/service/rest/ds/4ac3917e/dp7opk78635sbf809f07o4t53lum3c9eoaovpk9f/lib/download?filename=pmlib-1.0.0.tar.gz (792kB)
    100% |████████████████████████████████| 798kB 56.1MB/s eta 0:00:01


Building wheels for collected packages: pmlib
  Running setup.py bdist_wheel for pmlib ... done
  Stored in directory: /home/dsxuser/.tmp/pip-ephem-wheel-cache-hwmz161r/wheels/27/e3/4f/8f4f27f0744ea9362917922e6990c31fc7ed89833aa12f6e3a
Successfully built pmlib
Installing collected packages: pmlib
  Found existing installation: pmlib 1.0.0
    Uninstalling pmlib-1.0.0:
      Successfully uninstalled pmlib-1.0.0
Successfully installed pmlib-1.0.0


<a id="setup-model-training-pipline"></a>
## Setup the Model Training Pipeline

Before you can start working on the model training pipeline, you have to set up an asset group and the asset-sensor relationship in Maximo. See the IBM Maximo APM - Predictive Maintenance Insights SaaS User Guide for details.

Required model pipeline configuration:

* Asset group ID: The unit of model processing is an asset group. Asset groups are managed in the Maximo APM UI. You need the ID of the asset group to be analyzed by this model.
* Asset failure history and installation date as the label: This model requires the asset installation date (Asset attribute **```installdate```** in Maximo) and the asset failure history (Asset Workorder attribute **```faildate```** in Maximo) to extract the label for training.
* Sensor data as features: This model also accepts one or more features from either asset data or IOT data. **Note that these features must be of type Integer or Floating-Point Number.** Features are specified by attribute name, prefixed by type and separated by a colon. For an asset attribute, the prefix is an empty string. For an IOT data attribute, the prefix is the device type registered on Watson IOT Platform.
* Prediction output names: This model generates one output, the predicted failure date. Give it a name containing only alphanumeric characters, dashes, and underscores.
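
The naming rules in this list can be checked before the pipeline is created. The sketch below splits a feature specification into its prefix and attribute name and validates a prediction output name. Both helpers, `split_feature` and `is_valid_output_name`, are hypothetical and only mirror the rules described here; they are not functions of the PMI SDK.

```python
import re

# Alphanumeric characters, dashes, and underscores only.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def split_feature(spec):
    """Split '<prefix>:<attribute>' into (prefix, attribute).

    An empty prefix means the attribute belongs to the asset itself;
    otherwise the prefix is the device type.
    """
    prefix, sep, attribute = spec.partition(":")
    if not sep or not attribute:
        raise ValueError("expected '<prefix>:<attribute>', got %r" % spec)
    return prefix, attribute


def is_valid_output_name(name):
    """True if the name uses only the allowed characters."""
    return NAME_PATTERN.fullmatch(name) is not None


print(split_feature("IIOT:rh"))         # device attribute
print(split_feature(":installdate"))    # asset attribute
print(is_valid_output_name("predicted_time_to_failure"))
```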

Now you can set up a training pipeline based on this model template, with your own data, to train a model instance.

In [4]:
from pmlib.time_to_failure import TimeToFailureAssetGroupPipeline

group = TimeToFailureAssetGroupPipeline(
            asset_group_id='1016', 
            model_pipeline={
                'features': ['IIOT:rh','IIOT:temp','IIOT:pressure','IIOT:flow','IIOT:energy','IIOT:vibration'],
                'features_for_training': [':installdate', ':faildate'],
                'predictions': ['predicted_time_to_failure'],
            })



2020-03-30T09:55:29.751 pmlib.api.init_environ INFO APM_ID=4ac3917e, APM_API_BASEURL=https://prod.pmi.apm.maximo.ibm.com, APM_API_KEY=********
2020-03-30T09:55:29.753 pmlib.util.api_request INFO method=get, url=https://prod.pmi.apm.maximo.ibm.com/ibm/pmi/service/rest/ds/tenant?instanceId=4ac3917e, headers={'apmapitoken': '********'}, timeout=30, ssl_verify=True, json=None, session=None, kwargs={}
2020-03-30T09:55:31.597 pmlib.util.api_request INFO resp.status_code=200, method=get, url=https://prod.pmi.apm.maximo.ibm.com/ibm/pmi/service/rest/ds/tenant?instanceId=4ac3917e
2020-03-30T09:55:31.599 pmlib.api.init_environ DEBUG resp={
    "as_apikey": "********",
    "as_apitoken": "********",
    "as_id": null,
    "as_url": "https://api-us.connectedproducts.internetofthings.ibmcloud.com",
    "info": {
        "API_BASEURL": "https://api-us.connectedproducts.internetofthings.ibmcloud.com",
        "API_KEY": "********",
        "API_TOKEN": "********",
        "COS_BUCKET_KPI": "analytics-

2020-03-30T09:55:37.803 analytics_service.pmlib.loader.AssetLoader._validate_data_items DEBUG all_entity_types={'IOT', 'ZPMI', '1015', 'IIOT', 'ASSET_CACHE', '1011', '1016'}
2020-03-30T09:55:37.804 analytics_service.pmlib.loader.AssetLoader._set_asset_device_mappings INFO input_asset_device_mappings=None, asset_device_mappings={}, entity_type_meta={}
2020-03-30T09:55:37.805 analytics_service.pmlib.time_to_failure.TimeToFailureAssetGroupPipeline.__init__ DEBUG pipeline_config={'features': ['rh', 'temp', 'pressure', 'flow', 'energy', 'vibration'], 'inputs': ['IIOT:rh', 'IIOT:temp', 'IIOT:pressure', 'IIOT:flow', 'IIOT:energy', 'IIOT:vibration', ':installdate', ':faildate'], 'renamed_inputs': ['rh', 'temp', 'pressure', 'flow', 'energy', 'vibration', 'installdate', 'faildate'], 'features_for_training': ['installdate', 'faildate'], 'targets': ['installdate', 'faildate'], 'predictions': ['predicted_time_to_failure'], 'features_resampled': {}}
2020-03-30T09:55:37.807 analytics_service.pmlib.ti

The example above configures a pipeline for this model that accepts **```rh```, ```temp```, ```pressure```, ```flow```, ```energy```**, and **```vibration```** of Watson IOT Platform device type **```IIOT```**. It also uses asset attributes **```installdate```** and **```faildate```** to extract the labels for training. The predicted output of the trained model instance is called **```predicted_time_to_failure```**.

By default, this model also generates a daily aggregated prediction result, whose output name takes the form **```daily_<predicted_failure_date_output_name>```**.
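
The default post-processing configured in the model template uses the `Maximum` function at `Daily` granularity. What such an aggregation computes can be sketched with pandas as follows. The asset IDs and scores are made up, and the code only imitates the aggregation; it is not how the pipeline implements it.

```python
import pandas as pd

prediction = "predicted_time_to_failure"

# Stand-in predictions: several scores per asset per day (values are made up).
scores = pd.DataFrame({
    "id": ["A1", "A1", "A1", "A2"],
    "event_timestamp": pd.to_datetime([
        "2020-02-25 08:00",
        "2020-02-25 17:30",
        "2020-02-26 09:15",
        "2020-02-25 12:00",
    ]),
    prediction: [120.0, 135.5, 76.0, 410.0],
})

# Truncate each timestamp to its day, then keep the maximum score per asset and day.
day = scores["event_timestamp"].dt.floor("D")
daily = (
    scores.groupby(["id", day])[prediction]
    .max()
    .rename("daily_%s" % prediction)
    .reset_index()
)
print(daily)
```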

<a id="train-model-instance"></a>
## Train the Model Instance

With the model pipeline configured, now you can train the model instance:

In [5]:
df = group.execute()

2020-03-30T09:55:37.817 analytics_service.pmlib.cache_loader.AssetCacheRefresher.execute INFO start_ts=None, end_ts=None, entities=None
2020-03-30T09:55:43.384 pmlib.util.api_request INFO method=get, url=https://api-us.connectedproducts.internetofthings.ibmcloud.com/api/meta/v1/CTP-PMI-Democore-31/entityType/ASSET_CACHE, headers={'Content-Type': 'application/json', 'X-api-key': '********', 'X-api-token': '********', 'Cache-Control': 'no-cache'}, timeout=30, ssl_verify=True, json=None, session=None, kwargs={}
2020-03-30T09:55:44.063 pmlib.util.api_request INFO resp.status_code=200, method=get, url=https://api-us.connectedproducts.internetofthings.ibmcloud.com/api/meta/v1/CTP-PMI-Democore-31/entityType/ASSET_CACHE
2020-03-30T09:55:44.065 iotfunctions.metadata.__init__ DEBUG Initializing new entity type using iotfunctions 2.0.3
2020-03-30T09:55:44.066 iotfunctions.util.__init__ DEBUG Starting trace
2020-03-30T09:55:44.067 iotfunctions.util.__init__ DEBUG Trace name: auto_trace_ASSET_CACHE

2020-03-30T09:55:50.743 iotfunctions.metadata.__init__ DEBUG Initialized entity type 
EntityType:iot_iiot
Functions:
Granularities:
No schedules metadata
2020-03-30T09:55:50.744 analytics_service.pmlib.loader.AssetLoader.execute DEBUG before get_data: start_ts=None, end_ts=None, entity_type=IIOT, columns_to_load=['deviceid', 'rcv_timestamp_utc', 'vibration', 'pressure', 'flow', 'energy', 'temp', 'rh'], time_grain=None, agg_methods=None, agg_outputs=None
2020-03-30T09:55:50.746 pmlib.api._validate_resampling DEBUG time_grain=None, agg_methods={}, agg_outputs={}
2020-03-30T09:55:54.741 iotfunctions.metadata.index_df DEBUG Indexed dataframe on id, rcv_timestamp_utc
2020-03-30T09:55:54.758 pmlib.api.get_entity_type_data DEBUG df=shape=(11067, 8), index={'id': 'O', 'rcv_timestamp_utc': '<M8[ns]'}, columns={'deviceid': 'O', 'vibration': 'float64', 'pressure': 'float64', 'flow': 'float64', 'energy': 'float64', 'temp': 'float64', 'rh': 'float64', '_timestamp': '<M8[ns]'}, head(5)=
            

2020-03-30T09:55:57.105 pmlib.api.get_asset_failure_history DEBUG response=<Response [200]>
2020-03-30T09:55:57.107 pmlib.api.get_asset_failure_history DEBUG one_failure_history_record={'date': '2020-01-02T16:55:00+01:00', 'classcode': 'TOP-TECH', 'problemcode': 'CIV', 'description': 'Problem WO with faildate for PMI', 'causecode': 'BREUK', 'remedycode': 'REPARAT', 'wonum': '1006'}
2020-03-30T09:55:57.132 pmlib.api.get_asset_failure_history DEBUG one_failure_history_record={'date': '2020-01-09T16:55:00+01:00', 'classcode': 'TOP-TECH', 'problemcode': 'CIV', 'description': 'Problem WO with faildate for PMI', 'causecode': 'MONTAGE', 'remedycode': 'VERVANG', 'wonum': '1007'}
2020-03-30T09:55:57.135 pmlib.api.get_asset_failure_history DEBUG one_failure_history_record={'date': '2020-01-12T16:55:00+01:00', 'classcode': 'TOP-TECH', 'problemcode': 'CIV', 'description': 'Problem WO with faildate for PMI', 'wonum': '1008'}
2020-03-30T09:55:57.137 pmlib.api.get_asset_failure_history DEBUG one_fail

2020-03-30T09:55:57.553 pmlib.api.get_asset_failure_history DEBUG one_failure_history_record={'date': '2020-03-26T18:30:00+01:00', 'classcode': 'TOP-TECH', 'problemcode': 'PA', 'description': 'Problem WO with faildate for PMI', 'causecode': 'ONDERHD', 'remedycode': 'FMECA', 'wonum': '1016'}
2020-03-30T09:55:57.554 pmlib.api.get_asset_failure_history DEBUG response=<Response [200]>
2020-03-30T09:55:57.555 pmlib.api.get_asset_failure_history DEBUG one_failure_history_record={'date': '2020-03-07T12:02:00+01:00', 'classcode': 'TOP-TECH', 'problemcode': 'ELEK', 'description': 'Fail date WO for PMI', 'causecode': 'ONDERHD', 'remedycode': 'UITVOER', 'wonum': '1003'}
2020-03-30T09:55:57.556 pmlib.api.get_asset_failure_history DEBUG one_failure_history_record={'date': '2020-03-09T10:45:00+01:00', 'classcode': 'TOP-TECH', 'problemcode': 'ELEK', 'description': 'Problem WO with faildate for PMI', 'causecode': 'MONTAGE', 'remedycode': 'REPARAT', 'wonum': '1019'}
2020-03-30T09:55:57.571 pmlib.api.ge

2020-03-30T09:55:59.374 analytics_service.pmlib.loader.AssetLoader.execute DEBUG df_loaded_n_mapped_=shape=(55, 3), index={0: 'int64'}, columns={'asset_id': 'O', 'faildate': '<M8[ns]', 'event_timestamp': '<M8[ns]'}, head(5)=
            asset_id            faildate     event_timestamp
0  ZIOT1001-____-WCM 2020-01-02 15:55:00 2020-01-02 15:55:00
1  ZIOT1001-____-WCM 2020-01-09 15:55:00 2020-01-09 15:55:00
2  ZIOT1001-____-WCM 2020-01-12 15:55:00 2020-01-12 15:55:00
3  ZIOT1001-____-WCM 2020-01-14 07:30:00 2020-01-14 07:30:00
4  ZIOT1001-____-WCM 2020-01-19 16:15:00 2020-01-19 16:15:00
2020-03-30T09:55:59.377 analytics_service.pmlib.loader.AssetLoader.execute DEBUG before merge, df=None
2020-03-30T09:55:59.761 analytics_service.pmlib.loader.AssetLoader.execute DEBUG df_merged=shape=(11112, 11), index={'asset_id': 'O', 'event_timestamp': '<M8[ns]'}, columns={'id': 'O', 'deviceid': 'O', 'vibration': 'float64', 'pressure': 'float64', 'flow': 'float64', 'energy': 'float64', 'temp': 'float64'

2020-03-30T09:55:59.798 analytics_service.pmlib.loader.AssetLoader.execute DEBUG df_merge=shape=(11112, 13), index={0: 'int64'}, columns={'asset_id': 'O', 'event_timestamp': '<M8[ns]', 'id': 'O', 'deviceid': 'O', 'vibration': 'float64', 'pressure': 'float64', 'flow': 'float64', 'energy': 'float64', 'temp': 'float64', 'rh': 'float64', '_timestamp': '<M8[ns]', 'entity_type': 'O', 'faildate': '<M8[ns]'}, head(5)=
            asset_id     event_timestamp   id deviceid  vibration  pressure  \
0  ZIOT1001-____-WCM 2020-01-02 15:55:00  NaN      NaN        NaN       NaN   
1  ZIOT1001-____-WCM 2020-01-09 15:55:00  NaN      NaN        NaN       NaN   
2  ZIOT1001-____-WCM 2020-01-12 15:55:00  NaN      NaN        NaN       NaN   
3  ZIOT1001-____-WCM 2020-01-14 07:30:00  NaN      NaN        NaN       NaN   
4  ZIOT1001-____-WCM 2020-01-19 16:15:00  NaN      NaN        NaN       NaN   

   flow  energy  temp  rh _timestamp entity_type            faildate  
0   NaN     NaN   NaN NaN        NaT    

2020-03-30T09:55:59.973 iotfunctions.pipeline.execute DEBUG columns excluded when dropping null rows ['deviceid', '_timestamp', 'logicalinterface_id', 'devicetype', 'format', 'updated_utc', 'evt_timestamp']
2020-03-30T09:55:59.974 iotfunctions.pipeline.execute DEBUG columns considered when dropping null rows ['vibration', 'pressure', 'flow', 'energy', 'temp', 'rh', 'faildate', 'installdate']
2020-03-30T09:55:59.976 iotfunctions.pipeline.execute DEBUG vibration count not null: 11100
2020-03-30T09:55:59.977 iotfunctions.pipeline.execute DEBUG pressure count not null: 11100
2020-03-30T09:55:59.991 iotfunctions.pipeline.execute DEBUG flow count not null: 11100
2020-03-30T09:55:59.992 iotfunctions.pipeline.execute DEBUG energy count not null: 11100
2020-03-30T09:55:59.994 iotfunctions.pipeline.execute DEBUG temp count not null: 11100
2020-03-30T09:55:59.995 iotfunctions.pipeline.execute DEBUG rh count not null: 11100
2020-03-30T09:55:59.996 iotfunctions.pipeline.execute DEBUG faildate count

2020-03-30T09:56:00.299 iotfunctions.pipeline._execute_stage DEBUG Function TimeToFailureEstimatorFeatureExtraction has no validate_df method. Skipping validation of the dataframe
2020-03-30T09:56:01.692 iotfunctions.metadata.register DEBUG found METRIC column deviceid
2020-03-30T09:56:01.694 iotfunctions.metadata.register DEBUG found METRIC column event_timestamp
2020-03-30T09:56:01.695 iotfunctions.metadata.register DEBUG found METRIC column devicetype
2020-03-30T09:56:01.695 iotfunctions.metadata.register DEBUG found METRIC column logicalinterface_id
2020-03-30T09:56:01.696 iotfunctions.metadata.register DEBUG found METRIC column eventtype
2020-03-30T09:56:01.697 iotfunctions.metadata.register DEBUG found METRIC column format
2020-03-30T09:56:01.698 iotfunctions.metadata.register DEBUG found METRIC column updated_utc
2020-03-30T09:56:02.413 iotfunctions.db.http_request DEBUG http request successful. status 200
2020-03-30T09:56:02.414 iotfunctions.metadata.register DEBUG Metadata reg

2020-03-30T09:56:11.869 analytics_service.pmlib.pipeline._ModelPipelineConfig.__init__ DEBUG kwargs={}
2020-03-30T09:56:11.872 analytics_service.pmlib.time_to_failure.TimeToFailureAssetGroupPipeline.execute DEBUG adjusted after model trained: loader_inputs=('IIOT:rh', 'IIOT:temp', 'IIOT:pressure', 'IIOT:flow', 'IIOT:energy', 'IIOT:vibration'), loader_names=('rh', 'temp', 'pressure', 'flow', 'energy', 'vibration')
2020-03-30T09:56:11.875 analytics_service.pmlib.loader.AssetLoader._validate_mappings DEBUG asset_device_mappings={'ZIOT1001-____-WCM': ['IIOT:ZIOT1001'], 'ZIOT1002-____-WCM': ['IIOT:ZIOT1002'], 'ZIOT1003-____-WCM': ['IIOT:ZIOT1003'], 'ZIOT1004-____-WCM': ['IIOT:ZIOT1004'], 'ZIOT1005-____-WCM': ['IIOT:ZIOT1005']}
2020-03-30T09:56:11.877 analytics_service.pmlib.loader.AssetLoader._validate_mappings DEBUG features_meta={'IIOT': {'vibration', 'pressure', 'flow', 'energy', 'temp', 'rh'}}
2020-03-30T09:56:11.880 analytics_service.pmlib.loader.AssetLoader._validate_mappings DEBUG ma

Once this method completes successfully, you have a trained model instance ready for the next step (see below), and the prediction results are returned as a dataframe for verification.
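
One way to verify the returned dataframe is to summarize the prediction column per asset. The sketch below assumes the dataframe is indexed by `id` and `event_timestamp`, as the log output suggests, and that the prediction column carries the configured output name. The helper `summarize_predictions` and the stand-in data are hypothetical; replace the stand-in with the `df` returned by `group.execute()`.

```python
import pandas as pd


def summarize_predictions(df, column):
    """Per-asset count, min, and max of a prediction column, plus missing values."""
    values = df[column]
    summary = values.groupby(level="id").agg(["count", "min", "max"])
    summary["missing"] = values.isna().groupby(level="id").sum()
    return summary


# Stand-in for the dataframe returned by training (values are made up).
index = pd.MultiIndex.from_tuples(
    [
        ("A1", pd.Timestamp("2020-02-25")),
        ("A1", pd.Timestamp("2020-02-26")),
        ("A2", pd.Timestamp("2020-02-25")),
    ],
    names=["id", "event_timestamp"],
)
predictions = pd.DataFrame(
    {"predicted_time_to_failure": [120.0, None, 410.0]}, index=index
)

summary = summarize_predictions(predictions, "predicted_time_to_failure")
print(summary)
```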

<a id="register-trained-model-instance"></a>
## Register the Trained Model Instance


If the trained model instance looks good, you can register it to Maximo APM PMI:

In [6]:
group.register()

2020-03-30T09:56:11.915 analytics_service.pmlib.time_to_failure.TimeToFailureAssetGroupPipeline.register DEBUG target_pipeilne_class=pmlib.time_to_failure.TimeToFailureAssetGroupPipeline, url=None
2020-03-30T09:56:11.921 analytics_service.pmlib.time_to_failure.TimeToFailureAssetGroupPipeline.register DEBUG catalog_config={'name': 'TimeToFailureAssetGroupPipeline', 'description': 'TimeToFailureAssetGroupPipeline', 'moduleAndTargetName': 'pmlib.time_to_failure.TimeToFailureAssetGroupPipeline', 'url': 'https://prod.pmi.apm.maximo.ibm.com/ibm/pmi/service/rest/ds/4ac3917e/dp7opk78635sbf809f07o4t53lum3c9eoaovpk9f/lib/download?filename=pmlib-1.0.0.tar.gz', 'category': 'TRANSFORMER', 'tags': [], 'output': [{'name': 'names', 'description': 'Provide a list of output names to be generated from the pipeline.', 'dataType': 'ARRAY', 'jsonSchema': {'minItems': 1, '$schema': 'http://json-schema.org/draft-07/schema#', 'type': 'array', 'items': {'type': 'string'}}, 'tags': []}], 'input': [{'name': 'asse

2020-03-30T09:56:13.865 iotfunctions.metadata.register DEBUG found METRIC column format
2020-03-30T09:56:13.870 iotfunctions.metadata.register DEBUG found METRIC column updated_utc
2020-03-30T09:56:14.475 iotfunctions.db.http_request DEBUG http request successful. status 200
2020-03-30T09:56:14.477 iotfunctions.metadata.register DEBUG Metadata registered for table apm_1016 
2020-03-30T09:56:15.269 analytics_service.pmlib.time_to_failure.TimeToFailureEstimatorSrom.save_model DEBUG saved apm/pmi/model/1016/TimeToFailureEstimatorSrom/predicted_time_to_failure_1585562168
2020-03-30T09:56:16.383 analytics_service.pmlib.time_to_failure.TimeToFailureEstimatorSrom.save_model DEBUG saved apm/pmi/model/1016/TimeToFailureEstimatorSrom/predicted_time_to_failure_1585562168_input.gz
2020-03-30T09:56:17.220 analytics_service.pmlib.time_to_failure.TimeToFailureEstimatorSrom.save_model DEBUG saved apm/pmi/model/1016/TimeToFailureEstimatorSrom/predicted_time_to_failure_1585562168_input_after_train_prepr

2020-03-30T09:56:22.897 iotfunctions.db.http_request DEBUG http request successful. status 200
2020-03-30T09:56:22.902 analytics_service.pmlib.time_to_failure.TimeToFailureAssetGroupPipeline._write INFO granularities={'Daily': <iotfunctions.metadata.Granularity object at 0x7fb727fee4a8>, 'GroupDaily': <iotfunctions.metadata.Granularity object at 0x7fb6c410dac8>, 'Hourly': <iotfunctions.metadata.Granularity object at 0x7fb6c410dd30>, 'GroupHourly': <iotfunctions.metadata.Granularity object at 0x7fb6c410de10>, 'Weekly': <iotfunctions.metadata.Granularity object at 0x7fb6c410d2e8>, 'GroupWeekly': <iotfunctions.metadata.Granularity object at 0x7fb6c410d898>, 'Monthly': <iotfunctions.metadata.Granularity object at 0x7fb6c410d198>, 'GroupMonthly': <iotfunctions.metadata.Granularity object at 0x7fb6c410d390>}
2020-03-30T09:56:22.904 analytics_service.pmlib.time_to_failure.TimeToFailureAssetGroupPipeline._parse_kpi_dependency_tree DEBUG raw_metrics_set={'predicted_time_to_failure'}, derived_me

2020-03-30T09:56:32.345 analytics_service.pmlib.persist.PersistColumns.execute DEBUG columns_to_persist=[('value_n', 'daily_predicted_time_to_failure')]
2020-03-30T09:56:32.356 analytics_service.pmlib.persist.PersistColumns.execute DEBUG df_stacked=shape=(62, 1), index={'id': 'O', 'event_timestamp': '<M8[ns]', 2: 'O'}, columns={'value_n': 'float64'}, head(5)=
                                                                       value_n
id                event_timestamp                                             
ZIOT1001-____-WCM 2020-02-25      daily_predicted_time_to_failure  1358.667193
                  2020-02-26      daily_predicted_time_to_failure    76.159989
                  2020-02-27      daily_predicted_time_to_failure   405.344200
                  2020-03-17      daily_predicted_time_to_failure   168.303605
                  2020-03-21      daily_predicted_time_to_failure   168.303605
2020-03-30T09:56:32.368 analytics_service.pmlib.persist.PersistColumns.execute DEBUG 

'D17CE664-6651-49DF-94C5-C584D8B87E62'

Once registration succeeds, you can see this newly trained model instance available for the asset group on IBM Maximo APM UI.

<a id="model-template-internals"></a>
## Model Template Internals

The Time to Failure model has two parts: 
+ **transforming the raw data into the input format of the four survival estimators in SROM** (**`TimeToFailureEstimatorFeatureExtraction`** class in the cell below)
+ **training the Time to Failure model and scoring** (**`TimeToFailureEstimatorSrom`** class in the cell below)

The following cell shows the source of the Maximo APM PMI Time to Failure model template. It is included for reference: the base classes and helpers it relies on are not imported in this notebook, so executing the cell as-is ends with the `NameError` shown after it.

In [7]:
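import numpy as np
import pandas as pd
from pandas.tseries.frequencies import to_offset
from pandas.tseries.offsets import DateOffset

# SromEstimator, BaseTransformer, AssetGroupPipeline, get_logger, log_df_info, and
# generate_time_to_failure_data come from the SDK and its dependencies; their
# imports are not shown in this notebook.


class TimeToFailureEstimatorSrom(SromEstimator):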
    def get_stages(self):
        from srom.survival_analysis.kaplan_meier import KaplanMeier
        from srom.survival_analysis.cox_regression import CoxRegression
        from srom.survival_analysis.aalen_additive_regression import AalenAdditiveRegression
        from srom.survival_analysis.nelson_aalen import NelsonAalen

        event_column = self.features_for_training[-2]
        duration_column = self.features_for_training[-1]

        self.logger.debug('event_column=%s, duration_column=%s' % (event_column, duration_column))

        km = KaplanMeier(duration_column=duration_column, event_column=event_column)
        cr = CoxRegression(duration_column=duration_column, event_column=event_column)
        aar = AalenAdditiveRegression(duration_column=duration_column, event_column=event_column)
        na = NelsonAalen(duration_column=duration_column, event_column=event_column)
        
        return [
            [km, cr, aar,na]
        ]
    
    def get_param_grid(self):
        from srom.pipeline import SROMParamGrid
        return SROMParamGrid(gridtype='empty')
    
    def get_df_for_training(self, df):
        # this model takes all columns as features, so we have to remove any not intended to be features
        # the original two, installdate and faildate, must be removed explicitly before passing for training
        df = super().get_df_for_training(df)
        return df[[column for column in df.columns if column not in self.features_for_training[0:2]]]

    def predict(self, model, df):
        return model.predict(df)

    def get_prediction_result_value_index(self):
        return [0]
    

class TimeToFailureEstimatorFeatureExtraction(BaseTransformer):
    def __init__(self, installdate_column, faildate_column, event_column, duration_column):
        super().__init__()
        self.logger = get_logger(self)
        self.installdate_column = installdate_column
        self.faildate_column = faildate_column
        self.event_column = event_column
        self.duration_column = duration_column

    def execute(self, df):
        self.logger.debug('df_input: %s' % log_df_info(df, head=5))

        df_indices = df.index.names
        df = df.reset_index()

        # add failure event column
        df[self.event_column] = np.where(pd.isnull(df[self.faildate_column]), 0, 1)

        # now calculate the duration based on installdate, faildate, and current time
        # 1. if faildate is NA, it must be the one for current time (since last failure or installdate), set its end time to current time
        # 2. for faildate not NA, set the previous faildate as the start time (by shift())
        # 3. for the very first row with faildate, the start time would be NA, set this row's start time to be the asset's installdate
        # 4. calculate the diff of start and end, take days
        # 5. this must be per asset grouping

        duration_start_column = 'calc_date_start'
        duration_end_column = 'calc_date_end'

        df[duration_end_column] = np.where(pd.notna(df[self.faildate_column]), df[self.faildate_column], df[df_indices[1]])
        df = df.sort_values([df_indices[0], duration_end_column])
        df[duration_start_column] = df.groupby([df_indices[0]])[duration_end_column].shift(1)
        df[duration_start_column] = np.where(pd.notna(df[duration_start_column]), df[duration_start_column], df[self.installdate_column])
        df = df.astype({duration_start_column: 'datetime64[ms]'})
        df[self.duration_column] = (df[duration_end_column] - df[duration_start_column]).dt.days

        self.logger.debug('df_duration_calculated: %s' % log_df_info(df, head=5))

        # clean up columns and set index back
        df = df.drop(labels=[duration_start_column, duration_end_column], axis=1, errors='ignore')
        df = df.set_index(df_indices)

        self.logger.debug('df_final: %s' % log_df_info(df, head=5))

        return df


class TimeToFailureAssetGroupPipeline(AssetGroupPipeline):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

        self.model_template_name = 'Predicted Failure Date'

        # exclude labels from normal fillna/dropna logic

        if self.fillna_exclude is None:
            self.fillna_exclude = []
        if self.pipeline_config.features_for_training is not None:
            self.fillna_exclude.extend(self.pipeline_config.features_for_training)
            self.fillna_exclude = list(set(self.fillna_exclude))

        if self.dropna_exclude is None:
            self.dropna_exclude = []
        if self.pipeline_config.features_for_training is not None:
            self.dropna_exclude.extend(self.pipeline_config.features_for_training)
            self.dropna_exclude = list(set(self.dropna_exclude))

        # default aggregation post-processing
        prediction = self.pipeline_config.predictions[0]
        default_post_processing = [
            {
                "functionName": "Maximum",
                "enabled": True,
                "granularity": "Daily",
                "output": {
                    "name": "daily_%s" % prediction
                },
                "input": {
                    "source": prediction
                }
            }
        ]
        if self.post_processing is None:
            self.post_processing = default_post_processing
        else:
            for agg in default_post_processing:
                if agg not in self.post_processing:
                    self.post_processing.append(agg)

    def prepare_execute(self, pipeline, model_config):
        # this model uses a transformer to generate 2 new features to 'replace' the original 2
        # the model does not expect the original 2, so the estimator class above overrides a method
        # to produce a custom df for training, with the original 2 removed

        installdate_column = model_config.features_for_training[0]
        faildate_column = model_config.features_for_training[1]
        event_column = 'has_failed'
        duration_column = 'days_run'

        model_config.features_for_training.append(event_column)
        model_config.features_for_training.append(duration_column)

        estimator = TimeToFailureEstimatorSrom(**model_config)
        estimator.add_training_preprocessor(
            TimeToFailureEstimatorFeatureExtraction(
                installdate_column=installdate_column,
                faildate_column=faildate_column,
                event_column=event_column,
                duration_column=duration_column,
            )
        )
        pipeline.add_stage(estimator)

    def get_prediction_backtrack(self):
        # singular keywords replace the corresponding fields, so this resets a timestamp to midnight
        reset = DateOffset(hour=0, minute=0, second=0, microsecond=0)
        # a one-day offset
        offset = to_offset('1d')
        return [[reset, offset], [reset]]

    @staticmethod
    def generate_sample_data(sensor_type_name, **kwargs):
        return generate_time_to_failure_data(sensor_type_name=sensor_type_name, **kwargs)


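The default aggregation logic in the class above can be tried in isolation. The following is a minimal sketch, not the library's implementation: `merge_post_processing` is a hypothetical helper, and `ttf` is an assumed prediction name. Unlike the original method, which appends to `self.post_processing` in place, the sketch works on a copy.

```python
import copy


def merge_post_processing(user_steps, prediction):
    """Return user_steps plus the default daily-maximum step, without duplicates."""
    default_steps = [
        {
            "functionName": "Maximum",
            "enabled": True,
            "granularity": "Daily",
            "output": {"name": "daily_%s" % prediction},
            "input": {"source": prediction},
        }
    ]
    if user_steps is None:
        return default_steps
    merged = copy.deepcopy(user_steps)  # leave the caller's list untouched
    for step in default_steps:
        # dicts compare by value, so an identical step is not added twice
        if step not in merged:
            merged.append(step)
    return merged


# hypothetical user-supplied step; "ttf" is an assumed prediction name
custom = [
    {
        "functionName": "Mean",
        "enabled": True,
        "granularity": "Daily",
        "output": {"name": "daily_mean_ttf"},
        "input": {"source": "ttf"},
    }
]
merged = merge_post_processing(custom, "ttf")
print([step["output"]["name"] for step in merged])
```

Because the membership test compares whole dictionaries, a user step that differs from the default in any field (for example, a different output name) is kept alongside the default rather than replacing it.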
## How to override base class
To customize the behavior of the model template, override the relevant method in a subclass. For example, **`TimeToFailureEstimatorSrom(SromEstimator)`** extends the base class **`SromEstimator`**, so you can:
+ override an existing method of **`SromEstimator`**, or of its own base class **`BaseEstimator`**
+ add a new method such as **`get_stages`** to configure the survival models used by the algorithm.

    def get_stages(self):
        from srom.survival_analysis.kaplan_meier import KaplanMeier
        from srom.survival_analysis.cox_regression import CoxRegression
        from srom.survival_analysis.aalen_additive_regression import AalenAdditiveRegression
        from srom.survival_analysis.nelson_aalen import NelsonAalen

        event_column = self.features_for_training[-2]
        duration_column = self.features_for_training[-1]

        self.logger.debug('event_column=%s, duration_column=%s' % (event_column, duration_column))

        km = KaplanMeier(duration_column=duration_column, event_column=event_column)
        cr = CoxRegression(duration_column=duration_column, event_column=event_column)
        aar = AalenAdditiveRegression(duration_column=duration_column, event_column=event_column)
        na = NelsonAalen(duration_column=duration_column, event_column=event_column)
        
        return [
            [km, cr, aar, na]
        ]
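
`get_stages` reads the event and duration columns from the last two entries of `features_for_training`, which works because `prepare_execute` appends them in that order. The sketch below illustrates the convention with plain Python. The column names `installdate` and `faildate`, the helper `extract_survival_features`, and the way it derives the two values are assumptions for illustration only; the actual `TimeToFailureEstimatorFeatureExtraction` transformer may compute them differently.

```python
from datetime import date


def extract_survival_features(install_date, fail_date, as_of):
    """Hypothetical derivation of the event flag and duration for one asset."""
    has_failed = fail_date is not None
    # assumption: an asset that has not failed is observed up to the reference date
    end = fail_date if has_failed else as_of
    return {"has_failed": int(has_failed), "days_run": (end - install_date).days}


# hypothetical names for the two original date columns
features_for_training = ["installdate", "faildate"]
features_for_training.append("has_failed")  # event column, appended first
features_for_training.append("days_run")    # duration column, appended second

# same lookup as in get_stages
event_column = features_for_training[-2]
duration_column = features_for_training[-1]
print(event_column, duration_column)

reference = date(2019, 1, 1)
failed = extract_survival_features(date(2018, 1, 1), date(2018, 3, 2), as_of=reference)
running = extract_survival_features(date(2018, 1, 1), None, as_of=reference)
print(failed, running)
```

Relying on list position keeps the estimator independent of the actual column names, but it also means that the order of the two `append` calls in `prepare_execute` must not change.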

#### Base class `SromEstimator`
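
The constructor of the class below fills in a default for every training option that the caller did not supply. A minimal standalone sketch of that behavior follows. `with_default_training_options` is a hypothetical helper, not part of the SDK, and it returns a new dictionary, whereas the class updates the dictionary it was given.

```python
def with_default_training_options(options=None):
    """Return the caller's options with the defaults filled in for missing keys."""
    defaults = {
        'verbosity': 'low',
        'exectype': 'single_node_complete_search',
        'num_option_per_pipeline': 1,
        'max_eval_time_minute': 1,
    }
    merged = dict(defaults)
    merged.update(options or {})  # caller-supplied values take precedence
    return merged


# override a single option; the other three keep their defaults
options = with_default_training_options({'max_eval_time_minute': 5})
print(options)
```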

In [None]:
class SromEstimator(BaseEstimator):
    def __init__(self, features, targets, predictions, srom_training_options=None, **kwargs):
        super().__init__(features=features, targets=targets, predictions=predictions, **kwargs)
        self._set_srom_training_optins(srom_training_options)

    def _set_srom_training_optins(self, srom_training_options):
        self.srom_training_options = srom_training_options
        if self.srom_training_options is None:
            self.srom_training_options = {}
        if 'verbosity' not in self.srom_training_options:
            self.srom_training_options['verbosity'] = 'low'
        if 'exectype' not in self.srom_training_options:
            self.srom_training_options['exectype'] = 'single_node_complete_search'
        if 'num_option_per_pipeline' not in self.srom_training_options:
            self.srom_training_options['num_option_per_pipeline'] = 1
        if 'max_eval_time_minute' not in self.srom_training_options:
            self.srom_training_options['max_eval_time_minute'] = 1

    def train_model(self, df):
        srom_pipeline = self.create_pipeline()
        srom_pipeline = self.configure_pipeline(srom_pipeline)
        srom_pipeline.set_stages(self.get_stages())

        df_train = df
        if isinstance(srom_pipeline, AnomalyPipeline):
            # the first training feature is the label column
            label = self.features_for_training[0]
            # rows without a label form the training set (label column dropped)
            df_train = df[pd.isna(df[label])].drop(labels=label, axis=1, errors='ignore').reset_index(drop=True)
            self.logger.debug('trainX: %s' % log_df_info(df_train, head=5))

            # rows with a label form the validation set
            validX = df[pd.notna(df[label])].drop(labels=label, axis=1, errors='ignore').reset_index(drop=True)
            validy = df[pd.notna(df[label])][label].reset_index(drop=True)
            self.logger.debug('validX: %s' % log_df_info(validX, head=5))
            self.logger.debug('validy: %s' % log_df_info(validy, head=5))

            srom_pipeline.execute(
                trainX=df_train, 
                validX=validX, 
                validy=validy, 
                param_grid=self.get_param_grid(), 
                **self.srom_training_options)
        else:
            srom_pipeline.execute(
                df_train, 
                param_grid=self.get_param_grid(), 
                **self.srom_training_options)

        srom_pipeline.fit(df_train)

        return srom_pipeline

    def create_pipeline(self):
        return SROMPipeline()

    def configure_pipeline(self, srom_pipeline):
        return srom_pipeline