<h1><center>Net Diagnostics Ingest Management</center></h1>
<a id="tc"></a>

## Table of Contents
1. [Configuration](#configuration) 
2. [Start ND Ingest Part 0 (TEST)](#ingest0)
3. [Start ND Ingest Part 1](#ingest1)
4. [Start ND Ingest Part 2](#ingest2)
5. [Start ND Ingest Transactions Part 1](#ingest3)
6. [Start ND Ingest Transactions Part 2](#ingest4)
7. [Start ND Augmentation](#augmentation)
8. [Remove Duplicates](#rmduplicates)
9. [Push Metrics to DB](#todb)
10. [Prepare ND LGBM Signature](#signature)
11. [Train and Tune LGBM Model](#hptuning)
12. [Trained Model Deployment](#deployment)
13. [Cold Start Prediction before Anomaly Detection](#coldstart)
14. [Batch Prediction](#prediction)
15. [Batch Anomaly Detection for Test](#detection)
16. [Analytics](#analytics)
17. [Streaming](#streaming)
18. [Push Notebook to GCS Bucket](#gcs)

<a id="configuration"></a>
## Configuration
[back to Table Of Contents](#tc)

In [2]:
import os
import json

def get_transition(transition_file):
    with open(transition_file, 'r') as f:
        return json.load(f)
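
The later cells write each transition file by hand with `open(...)` and `json.dumps`. A minimal sketch of a matching writer is shown below; `save_transition` and the example file name are hypothetical and not part of the original notebook, while `get_transition` is repeated from the cell above so the block runs on its own.

```python
import json
import os
import tempfile


def get_transition(transition_file):
    # Same reader as in the Configuration cell.
    with open(transition_file, 'r') as f:
        return json.load(f)


def save_transition(transition_file, transition):
    # Hypothetical counterpart: persist a transition dict as JSON.
    with open(transition_file, 'w') as f:
        json.dump(transition, f)


# Round trip through a temporary directory (example data only).
path = os.path.join(tempfile.mkdtemp(), 'example_transition.json')
save_transition(path, {"EXAMPLE_STATE": "DONE"})
restored = get_transition(path)
```

Keeping the reader and writer side by side makes it easier to keep the keys written by one step aligned with the keys read by the next.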

In [15]:
BASE_PATH='/home/jovyan/work/data'
PROJECT_PATH = f'{BASE_PATH}/poc'
BUCKET = 'ai4ops-main-storage-bucket'
PROJECT = 'kohls-kos-cicd'
CLUSTER = 'ai4ops'
REGION='global'
AI_PLATFORM_REGION = 'us-central1'

In [16]:
from mldsl import *
import importlib
from datetime import datetime
from time import sleep  # used by the job-state checks in later cells
import sys
import pyspark
import json

<a id="ingest0"></a>
## Start ND Ingest Part 0 (TEST)
[back to Table Of Contents](#tc)

In [17]:
SCRIPT_PATH = f"{PROJECT_PATH}/spark/ingest"
RESOURCES='/opt/dataproc/.resources'

DURATION = '1' # seconds
POOL_SIZE = '2'
TIMEOUT = '1440' # minutes
WRITE_FORMAT = 'csv'
SLICE_SIZE = '10000'


In [18]:
builder = DataprocJobBuilder()
session = GCPSessionFactory.build_session(job_bucket=BUCKET,job_region=REGION, cluster=CLUSTER, job_project_id=PROJECT, 
                                          ml_region=AI_PLATFORM_REGION)

In [19]:
CONFIG_NAME='job_part_kohls_nd_08_00.json'

In [20]:
arguments = {'--token_file_gcs_path':f'gs://{BUCKET}/resources/kohls_nd.txt',\
             '--res_path':RESOURCES,\
             '--duration':DURATION,\
             '--pool_size':POOL_SIZE,\
             '--timeout':TIMEOUT,\
             '--write_format':WRITE_FORMAT,\
             '--slice_size':SLICE_SIZE
            }

In [21]:
TIMESTAMP = int(datetime.now().timestamp())

test_nd_job_name = f"api_ai4ops_ingest_from_nd_{TIMESTAMP}"

arguments['--output_file_pattern_path'] = f'gs://{BUCKET}/nd_history/{test_nd_job_name}'
arguments['--tasks_file_path'] = CONFIG_NAME

test_nd_job = builder.job_file(f'{SCRIPT_PATH}/nd_history_ingest_batch.py')\
.job_id(test_nd_job_name)\
.py_file(f'{SCRIPT_PATH}/apigee_ingest_utils.py')\
.py_file(f'{SCRIPT_PATH}/ai4ops_db.py')\
.py_file(f'{SCRIPT_PATH}/yarn_logging.py')\
.py_file(f'{SCRIPT_PATH}/nd_ingest.py')\
.jar(f"gs://{BUCKET}/resources/spark.http.apigee-1.0-SNAPSHOT-jar-with-dependencies.jar")\
.jar(f"gs://{BUCKET}/resources/mysql-connector-java-8.0.16.jar")\
.file(f'{SCRIPT_PATH}/jobs/{CONFIG_NAME}')\
.arguments(**arguments)\
.build_job()

test_nd_executor = DataprocExecutor(test_nd_job, session)

In [22]:
test_nd_executor.submit_job(run_async=False)

Job with id api_ai4ops_ingest_from_nd_1567572118 was submitted to the cluster ai4ops
Job STATUS was set to PENDING at 2019-09-04 04:41:15
Job STATUS was set to SETUP_DONE at 2019-09-04 04:41:17
      Yarn APP /home/jovyan/work/data/poc/spark/ingest/nd_history_ingest_batch.py with STATUS ACCEPTED has PROGRESS 0
      Yarn APP /home/jovyan/work/data/poc/spark/ingest/nd_history_ingest_batch.py with STATUS RUNNING has PROGRESS 10
Canceling job: api_ai4ops_ingest_from_nd_1567572118


<a id="ingest1"></a>
## Start ND Ingest Part 1
[back to Table Of Contents](#tc)

In [None]:
# CONFIG_NAME='job_part_kohls_nd_08_ALL_01.json'
CONFIG_NAME='job_part_kohls_nd_PEAK_ALL_01.json'

DURATION = '5' # seconds
POOL_SIZE = '1'
TIMEOUT = '1440' # minutes
WRITE_FORMAT = 'csv'
SLICE_SIZE = '20000'

In [None]:
arguments = {'--token_file_gcs_path':f'gs://{BUCKET}/resources/kohls_nd.txt',\
             '--res_path':RESOURCES,\
             '--duration':DURATION,\
             '--pool_size':POOL_SIZE,\
             '--timeout':TIMEOUT,\
             '--write_format':WRITE_FORMAT,\
             '--slice_size':SLICE_SIZE
            }
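
The ingest cells that follow reuse one `arguments` dict and overwrite `--output_file_pattern_path` and `--tasks_file_path` for each job. A sketch of one way to avoid that shared mutation is to derive a fresh dict per job; `ingest_arguments` is a hypothetical helper, and the values in the example are illustrative only.

```python
def ingest_arguments(base, bucket, job_name, config_name):
    # Copy the shared arguments so each job gets its own dict.
    args = dict(base)
    args['--output_file_pattern_path'] = f'gs://{bucket}/nd_history/{job_name}'
    args['--tasks_file_path'] = config_name
    return args


# Example with illustrative values.
base_arguments = {'--duration': '5', '--pool_size': '1'}
part1_arguments = ingest_arguments(
    base_arguments, 'example-bucket', 'example_job_1', 'example_config.json')
```

Whether the builder copies the dict at `build_job()` time is not stated in the notebook, so separate dicts are the conservative choice.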

In [None]:
TIMESTAMP = int(datetime.now().timestamp())

part1_nd_job_name = f"api_ai4ops_ingest_from_nd_{TIMESTAMP}"

arguments['--output_file_pattern_path'] = f'gs://{BUCKET}/nd_history/{part1_nd_job_name}'
arguments['--tasks_file_path'] = CONFIG_NAME

part1_nd_job = builder.job_file(f'{SCRIPT_PATH}/nd_history_ingest_batch.py')\
.job_id(part1_nd_job_name)\
.py_file(f'{SCRIPT_PATH}/apigee_ingest_utils.py')\
.py_file(f'{SCRIPT_PATH}/ai4ops_db.py')\
.py_file(f'{SCRIPT_PATH}/apigee_history_ingest.py')\
.py_file(f'{SCRIPT_PATH}/yarn_logging.py')\
.py_file(f'{SCRIPT_PATH}/nd_ingest.py')\
.jar(f"gs://{BUCKET}/resources/spark.http.apigee-1.0-SNAPSHOT-jar-with-dependencies.jar")\
.jar(f"gs://{BUCKET}/resources/mysql-connector-java-8.0.16.jar")\
.file(f'{SCRIPT_PATH}/jobs/{CONFIG_NAME}')\
.arguments(**arguments)\
.build_job()


part1_nd_executor = DataprocExecutor(part1_nd_job, session)

<a id="ingest2"></a>
## Start ND Ingest Part 2
[back to Table Of Contents](#tc)

In [None]:
# CONFIG_NAME='job_part_kohls_nd_08_ALL_02.json'
CONFIG_NAME='job_part_kohls_nd_PEAK_ALL_02.json'

In [None]:
TIMESTAMP = int(datetime.now().timestamp())

part2_nd_job_name = f"api_ai4ops_ingest_from_nd_{TIMESTAMP}"

arguments['--output_file_pattern_path'] = f'gs://{BUCKET}/nd_history/{part2_nd_job_name}'
arguments['--tasks_file_path'] = CONFIG_NAME

part2_nd_job = builder.job_file(f'{SCRIPT_PATH}/nd_history_ingest_batch.py')\
.job_id(part2_nd_job_name)\
.py_file(f'{SCRIPT_PATH}/apigee_ingest_utils.py')\
.py_file(f'{SCRIPT_PATH}/ai4ops_db.py')\
.py_file(f'{SCRIPT_PATH}/apigee_history_ingest.py')\
.py_file(f'{SCRIPT_PATH}/yarn_logging.py')\
.py_file(f'{SCRIPT_PATH}/nd_ingest.py')\
.jar(f"gs://{BUCKET}/resources/spark.http.apigee-1.0-SNAPSHOT-jar-with-dependencies.jar")\
.jar(f"gs://{BUCKET}/resources/mysql-connector-java-8.0.16.jar")\
.file(f'{SCRIPT_PATH}/jobs/{CONFIG_NAME}')\
.arguments(**arguments)\
.build_job()


part2_nd_executor = DataprocExecutor(part2_nd_job, session)

<a id="ingest3"></a>
## Start ND Ingest Transactions Part 1
[back to Table Of Contents](#tc)

In [None]:
# CONFIG_NAME='job_part_kohls_nd_08_ALL_TX_01.json'
CONFIG_NAME='job_part_kohls_nd_PEAK_ALL_TX_01.json'

In [None]:
TIMESTAMP = int(datetime.now().timestamp())

part1_tx_nd_job_name = f"api_ai4ops_ingest_from_nd_{TIMESTAMP}"

arguments['--output_file_pattern_path'] = f'gs://{BUCKET}/nd_history/{part1_tx_nd_job_name}'
arguments['--tasks_file_path'] = CONFIG_NAME

part1_tx_nd_job = builder.job_file(f'{SCRIPT_PATH}/nd_history_ingest_batch.py')\
.job_id(part1_tx_nd_job_name)\
.py_file(f'{SCRIPT_PATH}/apigee_ingest_utils.py')\
.py_file(f'{SCRIPT_PATH}/ai4ops_db.py')\
.py_file(f'{SCRIPT_PATH}/apigee_history_ingest.py')\
.py_file(f'{SCRIPT_PATH}/yarn_logging.py')\
.py_file(f'{SCRIPT_PATH}/nd_ingest.py')\
.jar(f"gs://{BUCKET}/resources/spark.http.apigee-1.0-SNAPSHOT-jar-with-dependencies.jar")\
.jar(f"gs://{BUCKET}/resources/mysql-connector-java-8.0.16.jar")\
.file(f'{SCRIPT_PATH}/jobs/{CONFIG_NAME}')\
.arguments(**arguments)\
.build_job()


part1_tx_nd_executor = DataprocExecutor(part1_tx_nd_job, session)

<a id="ingest4"></a>
## Start ND Ingest Transactions Part 2
[back to Table Of Contents](#tc)

In [None]:
# CONFIG_NAME='job_part_kohls_nd_08_ALL_TX_02.json'
CONFIG_NAME='job_part_kohls_nd_PEAK_ALL_TX_02.json'

In [None]:
TIMESTAMP = int(datetime.now().timestamp())

part2_tx_nd_job_name = f"api_ai4ops_ingest_from_nd_{TIMESTAMP}"

arguments['--output_file_pattern_path'] = f'gs://{BUCKET}/nd_history/{part2_tx_nd_job_name}'
arguments['--tasks_file_path'] = CONFIG_NAME

part2_tx_nd_job = builder.job_file(f'{SCRIPT_PATH}/nd_history_ingest_batch.py')\
.job_id(part2_tx_nd_job_name)\
.py_file(f'{SCRIPT_PATH}/apigee_ingest_utils.py')\
.py_file(f'{SCRIPT_PATH}/ai4ops_db.py')\
.py_file(f'{SCRIPT_PATH}/apigee_history_ingest.py')\
.py_file(f'{SCRIPT_PATH}/yarn_logging.py')\
.py_file(f'{SCRIPT_PATH}/nd_ingest.py')\
.jar(f"gs://{BUCKET}/resources/spark.http.apigee-1.0-SNAPSHOT-jar-with-dependencies.jar")\
.jar(f"gs://{BUCKET}/resources/mysql-connector-java-8.0.16.jar")\
.file(f'{SCRIPT_PATH}/jobs/{CONFIG_NAME}')\
.arguments(**arguments)\
.build_job()


part2_tx_nd_executor = DataprocExecutor(part2_tx_nd_job, session)

In [None]:
part1_nd_executor.submit_job(run_async=True)

In [None]:
part2_nd_executor.submit_job(run_async=True)

In [None]:
part1_tx_nd_executor.submit_job(run_async=True)

In [None]:
part2_tx_nd_executor.submit_job(run_async=True)

In [None]:
sleep(60)
state1 = part1_nd_executor.get_job_state()
state2 = part2_nd_executor.get_job_state()
state3 = part1_tx_nd_executor.get_job_state()
state4 = part2_tx_nd_executor.get_job_state()

print('State 1: {}'.format(state1))
print('State 2: {}'.format(state2))
print('State 3: {}'.format(state3))
print('State 4: {}'.format(state4))

if state1 not in ['DONE', 'RUNNING']:
    raise RuntimeError('Previous workflow step failed')

if state2 not in ['DONE', 'RUNNING']:
    raise RuntimeError('Previous workflow step failed')

if state3 not in ['DONE', 'RUNNING']:
    raise RuntimeError('Previous workflow step failed')

if state4 not in ['DONE', 'RUNNING']:
    raise RuntimeError('Previous workflow step failed')
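
The four identical checks could be folded into one helper that also reports which job is in an unexpected state. This is only a sketch: `ensure_states` and the job labels are hypothetical, and the allowed states are the ones used in the checks above.

```python
ALLOWED_STATES = ('DONE', 'RUNNING')


def ensure_states(states, allowed=ALLOWED_STATES):
    # Collect every job whose state is outside the allowed set.
    failed = {name: s for name, s in states.items() if s not in allowed}
    if failed:
        raise RuntimeError(f'Previous workflow step failed: {failed}')
    return states


# Example with illustrative states.
checked = ensure_states({'part1': 'DONE', 'part2': 'RUNNING'})
```

In the notebook this would be called with the four values returned by `get_job_state()`.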

In [None]:
nd_ingest_transition = {
    "INGEST_JOB_1": f"{part1_nd_job_name}",
    "INGEST_JOB_2": f"{part2_nd_job_name}",
    "INGEST_TX_JOB_1": f"{part1_tx_nd_job_name}",
    "INGEST_TX_JOB_2": f"{part2_tx_nd_job_name}",
    "INGEST_TIMESTAMP": f"{int(datetime.now().timestamp())}",
    "INGEST_BUCKET": f"{BUCKET}",
    "INGEST_OUTPUT_JOB_1": f"gs://{BUCKET}/nd_history/{part1_nd_job_name}/chunk*",
    "INGEST_OUTPUT_JOB_2": f"gs://{BUCKET}/nd_history/{part2_nd_job_name}/chunk*",
    "INGEST_OUTPUT_TX_JOB_1": f"gs://{BUCKET}/nd_history/{part1_tx_nd_job_name}/chunk*",
    "INGEST_OUTPUT_TX_JOB_2": f"gs://{BUCKET}/nd_history/{part2_tx_nd_job_name}/chunk*",
    "INGEST_STATE_JOB_1": f"{state1}",
    "INGEST_STATE_JOB_2": f"{state2}",
    "INGEST_STATE_TX_JOB_1": f"{state3}",
    "INGEST_STATE_TX_JOB_2": f"{state4}"
}

with open('api_nd_transition_ingest.json', 'w') as file:
     file.write(json.dumps(nd_ingest_transition)) 

<a id="augmentation"></a>
## Start ND Augmentation
[back to Table Of Contents](#tc)

In [None]:
# August
# SOURCE_PART_1='gs://ai4ops-main-storage-bucket/nd_history/ai4ops_ingest_from_net_diagnostics_1567190234/chunk*'
# SOURCE_PART_2='gs://ai4ops-main-storage-bucket/nd_history/ai4ops_ingest_from_net_diagnostics_1567190231/chunk*'
# SOURCE_PART_3='gs://ai4ops-main-storage-bucket/nd_history/ai4ops_ingest_from_net_diagnostics_1567190227/chunk*'
# SOURCE_PART_4='gs://ai4ops-main-storage-bucket/nd_history/ai4ops_ingest_from_net_diagnostics_1567190222/chunk*'

# # Peak
# SOURCE_PART_1='gs://ai4ops-main-storage-bucket/nd_history/ai4ops_ingest_from_net_diagnostics_1567289986/chunk*'
# SOURCE_PART_2='gs://ai4ops-main-storage-bucket/nd_history/ai4ops_ingest_from_net_diagnostics_1567289983/chunk*'
# SOURCE_PART_3='gs://ai4ops-main-storage-bucket/nd_history/ai4ops_ingest_from_net_diagnostics_1567289969/chunk*'
# SOURCE_PART_4='gs://ai4ops-main-storage-bucket/nd_history/ai4ops_ingest_from_net_diagnostics_1567289961/chunk*'


nd_ingest_transition = get_transition('api_nd_transition_ingest.json')
print(nd_ingest_transition)
INGEST_OUTPUT_JOB_1 = nd_ingest_transition.get('INGEST_OUTPUT_JOB_1', '')
INGEST_OUTPUT_JOB_2 = nd_ingest_transition.get('INGEST_OUTPUT_JOB_2', '')
INGEST_OUTPUT_TX_JOB_1 = nd_ingest_transition.get('INGEST_OUTPUT_TX_JOB_1', '')
INGEST_OUTPUT_TX_JOB_2 = nd_ingest_transition.get('INGEST_OUTPUT_TX_JOB_2', '')
INPUT_PATH = f'{INGEST_OUTPUT_JOB_1},{INGEST_OUTPUT_JOB_2},{INGEST_OUTPUT_TX_JOB_1},{INGEST_OUTPUT_TX_JOB_2}'
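
Because each `.get(..., '')` falls back to an empty string, a missing key leaves an empty entry in the comma-separated `INPUT_PATH`. A small sketch that drops empty entries before joining is shown below; `join_input_paths` is a hypothetical helper, and whether the augmentation script tolerates empty entries is not stated in the notebook.

```python
def join_input_paths(paths):
    # Keep only non-empty paths and join them with commas.
    return ','.join(p for p in paths if p)


# Example with illustrative paths; the empty entry is skipped.
example_input_path = join_input_paths(
    ['gs://example-bucket/a/chunk*', '', 'gs://example-bucket/b/chunk*'])
```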



DATA_START_DATE = '2018-11-09T04:21:00Z'
DATA_END_DATE = '2018-12-10T04:21:00Z'

DATA_BASE_PATH = 'nd_history'

ND_HIST_BASE_PATH = f'gs://{BUCKET}/{DATA_BASE_PATH}'




In [None]:
builder = DataprocJobBuilder()
nd_aug_job_name = "api_ai4ops_nd_augmentation_{}".format(int(datetime.now().timestamp()))

AUGMENTATION_OUT=f'{ND_HIST_BASE_PATH}/augmented/{nd_aug_job_name}'

arguments = {"--input_data_path":INPUT_PATH,\
        "--output_data_path":AUGMENTATION_OUT, \
        "--start_date":DATA_START_DATE,"--end_date":DATA_END_DATE}

augmentation_job = builder.task_script(f'{SCRIPT_PATH}/nd_augmentation.py')\
.job_id(nd_aug_job_name)\
.py_file(f'{SCRIPT_PATH}/apigee_ingest_utils.py')\
.py_file(f'{SCRIPT_PATH}/ai4ops_db.py')\
.py_file(f'{SCRIPT_PATH}/yarn_logging.py')\
.arguments(**arguments)\
.build_job()

nd_aug_executor = DataprocExecutor(augmentation_job, session)

In [None]:
aug_res = nd_aug_executor.submit_job(run_async=True)

In [None]:
sleep(60)
state = nd_aug_executor.get_job_state()

print('State : {}'.format(state))
if state not in ['DONE', 'RUNNING']:
    raise RuntimeError('Previous workflow step failed')
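
The check above samples the job state once after a fixed 60-second sleep, so a job that is still `RUNNING` passes even though its output may not be complete. A sketch of a polling loop that waits for a terminal state follows. `wait_for_job` is hypothetical, the set of terminal states is an assumption, and `get_state` stands for any zero-argument callable such as the executor's `get_job_state`.

```python
import time

# Assumed terminal states; adjust to what the executor actually reports.
TERMINAL_STATES = {'DONE', 'ERROR', 'CANCELLED'}


def wait_for_job(get_state, delay=60, tries=6, sleep=time.sleep):
    # Poll up to `tries` times, sleeping `delay` seconds between polls.
    state = get_state()
    for _ in range(tries - 1):
        if state in TERMINAL_STATES:
            break
        sleep(delay)
        state = get_state()
    return state


# Example with a fake state source and a no-op sleep.
fake_states = iter(['PENDING', 'RUNNING', 'DONE'])
final_state = wait_for_job(lambda: next(fake_states), sleep=lambda seconds: None)
```

In the notebook this might be called as `wait_for_job(nd_aug_executor.get_job_state)`, followed by the same `RuntimeError` check on the returned state.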


In [None]:
transition_nd_augmentation = {
    "AUGMENTATION_JOB": nd_aug_job_name,
    "AUGMENTATION_OUTPUT": AUGMENTATION_OUT,
    "AUGMENTATION_STATE": state
}

with open('api_transition_nd_augmentation.json', 'w') as file:
     file.write(json.dumps(transition_nd_augmentation)) 

<a id="rmduplicates"></a>
## Remove Duplicates
[back to Table Of Contents](#tc)


In [None]:
builder = DataprocJobBuilder()

transition = get_transition('api_transition_nd_augmentation.json')
print(transition)
AUGMENTATION_OUT = transition.get('AUGMENTATION_OUTPUT', '')

In [None]:
dedup_job_name = "api_remove_duplicates_{}".format(int(datetime.now().timestamp()))

DEDUPLICATION_OUT=f'{BASE_PATH}/no_duplicates/{dedup_job_name}'

arguments = {"--input_data_path":f"{AUGMENTATION_OUT}/chunk*",\
        "--output_data_path":DEDUPLICATION_OUT}

deduplication_job = builder.task_script('remove_duplicates.py')\
.job_id(dedup_job_name)\
.py_file(f'{SCRIPT_PATH}/apigee_ingest_utils.py')\
.py_file(f'{SCRIPT_PATH}/ai4ops_db.py')\
.py_file(f'{SCRIPT_PATH}/yarn_logging.py')\
.arguments(**arguments)\
.build_job()

dedup_executor = DataprocExecutor(deduplication_job, session)

In [None]:
dedup_res = dedup_executor.submit_job(run_async=True)

In [None]:
sleep(60)
state = dedup_executor.get_job_state()

print('State : {}'.format(state))
if state not in ['DONE', 'RUNNING']:
    raise RuntimeError('Previous workflow step failed')

In [None]:
deduplication_transition = {
    "REMOVE_DUPLICATES_JOB": dedup_job_name,
    "REMOVE_DUPLICATES_OUTPUT": DEDUPLICATION_OUT,
    "REMOVE_DUPLICATES_STATE": state
}

with open('api_transition_remove_duplicates.json', 'w') as file:
     file.write(json.dumps(deduplication_transition)) 

<a id="todb"></a>
## Push Metrics to DB
[back to Table Of Contents](#tc)

### Aggregation ND Metrics to DB

In [None]:
transition = get_transition('api_transition_remove_duplicates.json')
print(transition)
DEDUPLICATION_OUT = transition.get('REMOVE_DUPLICATES_OUTPUT', '')

In [None]:
DB_SECRET="kohls_db.txt"
METRIC_FILTER = '%-agg'

In [None]:
builder = DataprocJobBuilder()

save_to_db_job_name = "api_ai4ops_push_nd_metrics_to_mysql_agg_{}".format(int(datetime.now().timestamp()))

arguments = {"--metrics_path": f"{DEDUPLICATION_OUT}/chunk*",\
            "--db_credentials_gcs_file_path" : f"gs://{BUCKET}/resources/{DB_SECRET}", \
            "--res_path" : RESOURCES, \
            "--start_from":DATA_START_DATE,\
            "--metric_filter": METRIC_FILTER
            }

save_to_db_job = builder.task_script(f'{SCRIPT_PATH}/nd_to_mysql.py')\
.job_id(save_to_db_job_name)\
.py_file(f'{SCRIPT_PATH}/apigee_ingest_utils.py')\
.py_file(f'{SCRIPT_PATH}/ai4ops_db.py')\
.py_file(f'{SCRIPT_PATH}/yarn_logging.py')\
.py_file(f'{SCRIPT_PATH}/augmentation.py')\
.jar(f'gs://{BUCKET}/resources/mysql-connector-java-8.0.16.jar')\
.arguments(**arguments)\
.build_job()

save_to_db_executor = DataprocExecutor(save_to_db_job, session)

In [None]:
save_res = save_to_db_executor.submit_job(run_async=False)

In [None]:
sleep(60)
state = save_to_db_executor.get_job_state()

print('State : {}'.format(state))
if state not in ['DONE', 'RUNNING']:
    raise RuntimeError('Previous workflow step failed')


In [None]:
transition_push_nd_to_db = {
    "PUSH_TO_DB_JOB": save_to_db_job_name,
    "PUSH_TO_DB_STATE": state
}

with open('api_transition_push_nd_to_db_agg.json', 'w') as file:
     file.write(json.dumps(transition_push_nd_to_db)) 

### Transaction ND Metrics to DB

In [None]:
METRIC_FILTER = '%-trans'

In [None]:
save_to_db_job_name = "api_ai4ops_push_nd_metrics_to_mysql_tx_{}".format(int(datetime.now().timestamp()))

arguments = {"--metrics_path": f"{DEDUPLICATION_OUT}/chunk*",\
            "--db_credentials_gcs_file_path" : f"gs://{BUCKET}/resources/{DB_SECRET}", \
            "--res_path" : RESOURCES, \
            "--start_from":DATA_START_DATE,\
            "--metric_filter": METRIC_FILTER
            }

save_to_db_job = builder.task_script(f'{SCRIPT_PATH}/nd_to_mysql.py')\
.job_id(save_to_db_job_name)\
.py_file(f'{SCRIPT_PATH}/apigee_ingest_utils.py')\
.py_file(f'{SCRIPT_PATH}/ai4ops_db.py')\
.py_file(f'{SCRIPT_PATH}/yarn_logging.py')\
.py_file(f'{SCRIPT_PATH}/augmentation.py')\
.jar(f'gs://{BUCKET}/resources/mysql-connector-java-8.0.16.jar')\
.arguments(**arguments)\
.build_job()

save_to_db_executor = DataprocExecutor(save_to_db_job, session)

In [None]:
save_res = save_to_db_executor.submit_job(run_async=False)

In [None]:
sleep(60)
state = save_to_db_executor.get_job_state()

print('State : {}'.format(state))
if state not in ['DONE', 'RUNNING']:
    raise RuntimeError('Previous workflow step failed')


In [None]:
transition_push_nd_to_db = {
    "PUSH_TO_DB_JOB": save_to_db_job_name,
    "PUSH_TO_DB_STATE": state
}

with open('api_transition_push_nd_to_db_tx.json', 'w') as file:
     file.write(json.dumps(transition_push_nd_to_db)) 

<a id="signature"></a>
## Prepare ND LGBM Signature
[back to Table Of Contents](#tc)

In [None]:
AI_PLATFORM_REGION = 'us-central1'
AI_PLATFORM_MODEL_BASE_VERSION = 'v1'
CLUSTER = 'ai4ops'

OUTPUT_PATH = 'nd_models/lgbm/input'
USE_POWER_TRANSFORMER = True
WITH_CALENDAR_FEATURES = False
SUFFIX = ""
SCALER_NAME = ""

# DATA_CONFIG = 'lgbm_signature_nd_peak_august_cpu.json'
DATA_CONFIG = 'lgbm_signature_nd_peak_august_mb.json'

In [None]:
transition = get_transition('api_transition_remove_duplicates.json')
print(transition)
DEDUPLICATION_OUT = transition.get('REMOVE_DUPLICATES_OUTPUT', '')

In [None]:
builder = DataprocJobBuilder()

TIMESTAMP=int(datetime.now().timestamp())
sign_job_name = "api_ai4ops_nd_lgbm_signature_{}_{}".format(SUFFIX, TIMESTAMP)

SIGNATURE_OUT = f"{OUTPUT_PATH}/{sign_job_name}"

arguments = {"--input_data_path":DEDUPLICATION_OUT,\
             "--config": DATA_CONFIG,\
             "--output_bucket":BUCKET,\
             "--output_bucket_project": PROJECT,\
            "--output_bucket_path":SIGNATURE_OUT,\
             "--workflow_id":str(TIMESTAMP),\
             "--with_calendar_features": WITH_CALENDAR_FEATURES, \
            "--with_power_transform": USE_POWER_TRANSFORMER \
            }

if SCALER_NAME:
    arguments['--scaler_name'] = SCALER_NAME

signature_job = builder.job_file(f'{SCRIPT_PATH}/partial_signature_lgbm.py')\
.job_id(sign_job_name)\
.py_file(f'{SCRIPT_PATH}/apigee_ingest_utils.py')\
.py_file(f'{SCRIPT_PATH}/ai4ops_db.py')\
.py_file(f'{SCRIPT_PATH}/yarn_logging.py')\
.file(f'{SCRIPT_PATH}/jobs/{DATA_CONFIG}')\
.file(SCALER_NAME)\
.arguments(**arguments)\
.build_job()



signature_executor = DataprocExecutor(signature_job, session)
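
Job names in this notebook are assembled from prefixes, suffixes, and timestamps, so a stray character can easily slip in. The sketch below validates a job ID before submission, assuming the constraint documented for Dataproc job references (letters, digits, underscores, and hyphens only, at most 100 characters). `is_valid_job_id` is a hypothetical helper.

```python
import re

# Assumed Dataproc job ID constraint: [A-Za-z0-9_-], 1 to 100 characters.
JOB_ID_PATTERN = re.compile(r'[A-Za-z0-9_-]{1,100}')


def is_valid_job_id(job_id):
    # fullmatch requires the whole string to satisfy the pattern.
    return JOB_ID_PATTERN.fullmatch(job_id) is not None


# Example with illustrative IDs.
good_id = is_valid_job_id('api_ai4ops_nd_lgbm_signature__1567572118')
bad_id = is_valid_job_id('api_ai4ops_nd_lgbm_signature_$_1567572118')
```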

In [None]:
signature_res = signature_executor.submit_job(run_async=True)

In [None]:
sleep(60)
state = signature_executor.get_job_state()

print('State : {}'.format(state))
if state not in ['DONE', 'RUNNING']:
    raise RuntimeError('Previous workflow step failed')

In [None]:
transition_signature = {
    "SIGNATURE_JOB_ID": sign_job_name,
    "SIGNATURE_TIMESTAMP": TIMESTAMP,
    "SIGNATURE_BUCKET": BUCKET,
    "SIGNATURE_TRAIN": f"{SIGNATURE_OUT}/LGBM-TRAIN-{TIMESTAMP}.csv",
    "SIGNATURE_VAL": f"{SIGNATURE_OUT}/LGBM-VAL-{TIMESTAMP}.csv",
    "SIGNATURE_TEST": f"{SIGNATURE_OUT}/LGBM-TEST-{TIMESTAMP}.csv",
    "SIGNATURE_SCALER": f"{SIGNATURE_OUT}/LGBM-SCL-{TIMESTAMP}.pkl",
    "SIGNATURE_DROP_KEYS": f"{SIGNATURE_OUT}/LGBM-DROP-KEYS-{TIMESTAMP}.txt",
    "SIGNATURE_STATE": state
}

print(transition_signature)

with open('api_transition_nd_signature.json', 'w') as file:
     file.write(json.dumps(transition_signature)) 

<a id="hptuning"></a>
## Train and Tune LGBM Model
[back to Table Of Contents](#tc)

In [None]:
SCRIPT_PATH = f"{PROJECT_PATH}/models/gcp/lightgbm"
signature_transition = get_transition('api_transition_nd_signature.json')
print(signature_transition)
state = signature_transition.get('SIGNATURE_STATE', '')
print('State: {}'.format(state))
if state not in ['DONE']:
    raise RuntimeError('Previous workflow step failed')


SIGNATURE_TRAIN = signature_transition.get('SIGNATURE_TRAIN', '')
SIGNATURE_VAL = signature_transition.get('SIGNATURE_VAL', '')
SIGNATURE_TEST = signature_transition.get('SIGNATURE_TEST', '')
SIGNATURE_SCALER = signature_transition.get('SIGNATURE_SCALER', '')
CATEGORICAL_COLUMNS = 'metric_class'
EXCLUDED_COLUMNS =  'time,metric,metric_id'

TUNING_CONFIG_FILE='hptuning_config_nd.yaml'
TRAIN_JOB_SUFFIX = 'nd_peak_august_m'
WAIT_DELAY='60'
WAIT_TRIES='6'
SCALE_TIER="custom"

In [None]:
import yaml  # the PyYAML package is imported as `yaml`

TIMESTAMP=int(datetime.now().timestamp())
TUNING_JOB_NAME=f"api_ai4ops_tuning_lgbm_{TRAIN_JOB_SUFFIX}_{TIMESTAMP}"
JOB_DIR=f"gs://{BUCKET}/nd_models/lgbm/models/lightgbm/{TUNING_JOB_NAME}"
ERR_LOG_PATH_GS=f"nd_models/lgbm/models/lightgbm/{TUNING_JOB_NAME}/output"
TRAINED_MODEL_PATH_GS = f"nd_models/lgbm/models/lightgbm/{TUNING_JOB_NAME}/model"

training_input = {
  "scaleTier": SCALE_TIER,
  "masterType":"large_model",\
  "workerType":"large_model",\
  "parameterServerType":"large_model",\
  "workerCount":"4",\
  "parameterServerCount":"3",\
  "masterConfig": {
    "imageUri": "gcr.io/kohls-kos-cicd/ai4ops_lgbm_image"
  },
  "region": AI_PLATFORM_REGION,
  "jobDir": JOB_DIR
}

args = {
    '--is_hyperparameters_tuning': True,\
    '--bucket_id': BUCKET, \
  '--train_data_path_gs': SIGNATURE_TRAIN, \
  '--val_data_path_gs': SIGNATURE_VAL, \
  '--err_log_path_gs': ERR_LOG_PATH_GS, \
  '--trained_model_path_gs': TRAINED_MODEL_PATH_GS, \
  '--boosting_type': "gbdt", \
  '--num_leaves': 7, \
  '--learning_rate': 0.0215553547770489, \
  '--subsample_for_bin': 200000,\
  '--objective': "huber", \
  '--eval_metric': "mae", \
  '--obj_penalty': 1, \
  '--metrics': "l1,l2", \
  '--min_split_gain': 0.00021777905410443098, \
  '--min_child_weight': 15.29904104261325, \
  '--min_child_samples': 171, \
  '--subsample': 0.7424239139415899, \
  '--subsample_freq': 57, \
  '--colsample_bytree': 0.4449652059931909, \
  '--n_jobs': -1, \
  '--early_stopping_rounds': 10, \
  '--importance_type': "split", \
  '--categorical_feature': CATEGORICAL_COLUMNS, \
  '--target': "var1(t)", \
  '--excluded': EXCLUDED_COLUMNS
}


ai_tuning_job = AIJob(TUNING_JOB_NAME, training_input)


ai_tuning_job.set_args(args)
ai_tuning_job.load_hyperparameters_from_file(TUNING_CONFIG_FILE)

In [None]:
tuning_executor = AIPlatformJobExecutor(session, ai_tuning_job, 60, 60)

response = tuning_executor.submit_train_job()
state = response.state

In [None]:
transition_tuning = {
    "TRAIN_JOB_ID": TUNING_JOB_NAME,
    "TRAIN_JOB_DIR": JOB_DIR,
    "TRAIN_STATE": state,
    "TRAINED_MODEL": f"{JOB_DIR}/model",
    "IS_TUNING":True
}

print(transition_tuning)

with open('api_transition_tuning.json', 'w') as file:
     file.write(json.dumps(transition_tuning)) 

<a id="deployment"></a>
## Trained Model Deployment
See AI Platform Models https://console.cloud.google.com/mlengine/models
<br/>[back to Table Of Contents](#tc)

In [None]:
SCRIPT_PATH = f"{PROJECT_PATH}/models/gcp/lightgbm"
train_out = get_transition('api_transition_tuning.json')

TRAIN_JOB_ID = train_out.get('TRAIN_JOB_ID', '')
TRAIN_JOB_DIR = train_out.get('TRAIN_JOB_DIR', '')
TRAINED_MODEL = train_out.get('TRAINED_MODEL', '')
IS_TUNING = train_out.get('IS_TUNING', '')

In [None]:
SIGNATURE_SCALER = signature_transition.get('SIGNATURE_SCALER', '')
SIGNATURE_DROP_KEYS = signature_transition.get('SIGNATURE_DROP_KEYS', '')
SIGNATURE_BUCKET = signature_transition.get('SIGNATURE_BUCKET', '')

DEPLOYMENT_PATH = 'deployment'

MODEL_NAME = AI_PALTFORM_MODEL_NAME
VERSION_NAME = AI_PLATFORM_MODEL_BASE_VERSION
OBJECTIVE_VALUE_IS_MAXIMUM_NEEDED = False

In [None]:
deploy_job_input = {
  'pythonVersion': "3.5", \
  'deploymentUri': TRAINED_MODEL,\
  'packageUris': [f'{STAGING_DIR}/{PREDICTOR_PACKAGE}'], \
  'autoScaling':{'minNodes':1},
  'runtimeVersion': '1.13',
  'predictionClass': 'custom_predictor.LGBMPredictor'
}
custom_predictor_setup_path = f"{SCRIPT_PATH}/setup.py"

deploy_job = AIJob("", deploy_input = deploy_job_input, hp_tununing=True)

artefacts_map = {
    SIGNATURE_SCALER:'scaler.pkl',\
    SIGNATURE_DROP_KEYS:'drop_keys.txt'
}

artefacts_path = f'{DEPLOYMENT_PATH}/{MODEL_NAME}_{VERSION_NAME}'

In [None]:
deploy_executor = AIPlatformJobExecutor(session, deploy_job, 20,5)
deploy_executor.submit_deploy_model_job(MODEL_NAME, VERSION_NAME, artefacts_path, artefacts_map, TRAIN_JOB_ID, custom_predictor_setup_path=f"{SCRIPT_PATH}/setup.py")


In [None]:
transition_deployment = {
    "MODEL_NAME": MODEL_NAME,
    "VERSION_NAME": VERSION_NAME,
    "MODEL_DIR": TRAINED_MODEL,
    "STAGING_DIR": STAGING_DIR,
    "DEPLOYMENT": f"{DEPLOYMENT_PATH}/{MODEL_NAME}_{VERSION_NAME}",
    "SCALER": f"{DEPLOYMENT_PATH}/{MODEL_NAME}_{VERSION_NAME}/scaler.pkl",
    "DROP_KEYS": f"{DEPLOYMENT_PATH}/{MODEL_NAME}_{VERSION_NAME}/drop_keys.txt"
}

print(transition_deployment)

with open('api_transition_deployment.json', 'w') as file:
     file.write(json.dumps(transition_deployment)) 

<a id="coldstart"></a>
## Cold Start Prediction before Anomaly Detection
[back to Table Of Contents](#tc)

In [None]:
SCRIPT_PATH = f'{PROJECT_PATH}/spark/ingest'
DURATION = '10000'
CONFIG_NAME = 'lgbm_moving_anomalies_detection_ext_sparse_4m_20190827.json'
POOL_SIZE = '4' 
CHUNK_SIZE = '10'
COLD_START_FROM = '-1h'
COLD_START_TO= '0m'
COLD_START_STEP = '60m'
COLD_START_STEP_DELAY = '0m'
METRIC_DB_TABLE = 'metric_rt'
PREDICTION_DB_TABLE = 'prediction_rt_synthetic'
DEPLOYMENT_PATH="deployment"

In [None]:
builder = DataprocJobBuilder()

TIMESTAMP=int(datetime.now().timestamp())
COLD_START_JOB_NAME = f"api_ai4ops_cold_start_prediction_{TIMESTAMP}"

arguments = {
             '--gs_bucket':BUCKET,\
             '--gs_base_deployment_path':DEPLOYMENT_PATH,\
             '--tasks_file_path':CONFIG_NAME,\
             '--db_credentials_file_gcs_path': f"gs://{BUCKET}/resources/{DB_SECRET}",\
             '--res_path':RESOURCES,\
             '--duration':DURATION,\
             '--pool_size':POOL_SIZE,\
             '--project_id':PROJECT,\
             '--metric_db_table':METRIC_DB_TABLE,\
             '--prediction_db_table':PREDICTION_DB_TABLE,\
             '--cold_start_from':COLD_START_FROM,\
             '--cold_start_to':COLD_START_TO,\
             '--cold_start_step':COLD_START_STEP,\
             '--cold_start_step_delay':COLD_START_STEP_DELAY,\
             '--chunk_size':CHUNK_SIZE
            }
            

cold_start_job = builder.job_file(f'{SCRIPT_PATH}/cold_start_prediction.py')\
.job_id(COLD_START_JOB_NAME)\
.py_file(f'{SCRIPT_PATH}/apigee_ingest_utils.py')\
.py_file(f'{SCRIPT_PATH}/ai4ops_db.py')\
.py_file(f'{SCRIPT_PATH}/yarn_logging.py')\
.py_file(f'{SCRIPT_PATH}/apigee_history_ingest.py')\
.py_file(f'{SCRIPT_PATH}/signature_lgbm.py')\
.py_file(f'{SCRIPT_PATH}/anomaly_detection.py')\
.py_file(f'{SCRIPT_PATH}/apigee_streaming_alerts.py')\
.py_file(f'{SCRIPT_PATH}/apigee_streaming_moving_anomalies.py')\
.file(f'{SCRIPT_PATH}/jobs/{CONFIG_NAME}')\
.jar(f"gs://{BUCKET}/resources/spark.http.apigee-1.0-SNAPSHOT-jar-with-dependencies.jar")\
.jar(f"gs://{BUCKET}/resources/mysql-connector-java-8.0.16.jar")\
.arguments(**arguments)\
.property('spark.executor.memory','6G')\
.property('spark.num.executors','4')\
.build_job()


cold_start_executor = DataprocExecutor(cold_start_job, session)

In [None]:
cold_start_res = cold_start_executor.submit_job(run_async=True)

In [None]:
sleep(60)
state = cold_start_executor.get_job_state()

print('State : {}'.format(state))
if state not in ['DONE', 'RUNNING']:
    raise RuntimeError('Previous workflow step failed')

<a id="prediction"></a>
## Batch Prediction
<br/>[back to Table Of Contents](#tc)

In [None]:
SCRIPT_PATH = f"{PROJECT_PATH}/models/gcp/lightgbm/ai_platform_predictions"
deployment = get_transition('api_transition_deployment.json')
AI_PALTFORM_MODEL_NAME = deployment.get('MODEL_NAME', '')
AI_PALTFORM_MODEL_VERSION = deployment.get('VERSION_NAME', '')

In [None]:
TIMESTAMP=int(datetime.now().timestamp())
BASE_PATH="apigee_history/apigee/metrics/history"
OUTPUT_DATA_PATH=f"{BASE_PATH}/lgbm-batch-predicted/{TIMESTAMP}"
MAX_PARALLEL_REQUESTS=4

EXCLUDED_INPUT_COLUMNS="time,metric_val,metric,var1(t)"
PREDICTED_COLUMN_NAME="predicted"
OUTPUT_COLUMNS_MAPPING="metric_val=metric,time,var1(t)=value"

PREDICT_JOB_ID=f"api_ai4ops_batch_prediction_lgbm_{TIMESTAMP}"


arguments = {"--project_id":PROJECT,\
             "--bucket_name": BUCKET,\
             "--model_name":AI_PALTFORM_MODEL_NAME,\
             "--version_name": AI_PALTFORM_MODEL_VERSION,\
            "--input_data_file":SIGNATURE_TEST,\
             "--output_data_path":OUTPUT_DATA_PATH,\
             "--excluded_input_columns": EXCLUDED_INPUT_COLUMNS, \
            "--predicted_column_name": PREDICTED_COLUMN_NAME, \
            "--output_columns_mapping": OUTPUT_COLUMNS_MAPPING, \
            "--samples_count_in_chunk": 800, \
            "--max_parallel_requests": MAX_PARALLEL_REQUESTS,
             "--drop_by_nan_columns" : ""
            }

predict_job = builder.job_file(f'{SCRIPT_PATH}/batch_predictions.py')\
.job_id(PREDICT_JOB_ID)\
.arguments(**arguments)\
.build_job()

predict_executor = DataprocExecutor(predict_job, session)
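
`OUTPUT_COLUMNS_MAPPING` packs several column renames into one string. The sketch below shows one plausible reading of that format, where `source=target` renames a column and a bare name keeps it unchanged. This interpretation is an assumption; how `batch_predictions.py` actually parses the value is not shown in the notebook, and `parse_columns_mapping` is a hypothetical helper.

```python
def parse_columns_mapping(mapping):
    # Split "a=b,c" into {"a": "b", "c": "c"} (assumed semantics).
    result = {}
    for item in mapping.split(','):
        item = item.strip()
        if not item:
            continue
        source, sep, target = item.partition('=')
        result[source] = target if sep else source
    return result


# Example using the mapping string from the cell above.
parsed_mapping = parse_columns_mapping('metric_val=metric,time,var1(t)=value')
```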


In [None]:
prediction_res = predict_executor.submit_job(run_async=True)

In [None]:
sleep(60)
state = predict_executor.get_job_state()

print('State : {}'.format(state))
if state not in ['DONE', 'RUNNING']:
    raise RuntimeError('Previous workflow step failed')

In [None]:
transition_prediction = {
    "PREDICTION_JOB_ID": PREDICT_JOB_ID,
    "PREDICTION_TIMESTAMP": TIMESTAMP,
    "PREDICTION_BUCKET": BUCKET,
    "PREDICTION_OUTPUT": OUTPUT_DATA_PATH,
    "PREDICTION_STATE": state,
    "PREDICTION_EXCLUDED_INPUT_COLUMNS": EXCLUDED_INPUT_COLUMNS,
    "PREDICTION_PREDICTED_COLUMN_NAME": PREDICTED_COLUMN_NAME,
    "PREDICTION_OUTPUT_COLUMNS_MAPPING": OUTPUT_COLUMNS_MAPPING
}

print(transition_prediction)

with open('api_transition_prediction.json', 'w') as file:
     file.write(json.dumps(transition_prediction)) 

<a id="detection"></a>
## Batch Anomaly Detection for Test
<br/>[back to Table Of Contents](#tc)

In [None]:
PROJECT_PATH = '/home/jovyan/work/data/poc'
SCRIPT_PATH = f"{PROJECT_PATH}/spark/ingest"
prediction = get_transition('api_transition_prediction.json')

state = prediction.get('PREDICTION_STATE', '')
print('State: {}'.format(state))
if state not in ['DONE']:
    raise RuntimeError('Previous workflow step failed')

PREDICTION_BUCKET = prediction.get('PREDICTION_BUCKET', '')
PREDICTION_OUTPUT = prediction.get('PREDICTION_OUTPUT', '')

THRESHOLD = 5
USE_INVERSE_TRANSFORM = False

ANOMALIES_BASE_VERSION = 'ND MB Peak & August'
ANALYTICS_PATH = 'apigee_history/apigee/metrics/history/lgbm-analytics'

INVERSE_TRANSFORMER_PATH = (transition_deployment.get('SCALER', '') if USE_INVERSE_TRANSFORM
                            else '')
STAT_DEPLOYMENT_PATH = "deployment/{}_{}".format(transition_deployment.get('MODEL_NAME', ''),
                                                 transition_deployment.get('VERSION_NAME', ''))
if not INVERSE_TRANSFORMER_PATH:
    STAT_FILE = "stat.csv"
else:
    STAT_FILE = "inverse_stat.csv"


In [None]:
TIMESTAMP=int(datetime.now().timestamp())

DETECTION_JOB_ID=f"api_ai4ops_anomaly_detection_{TIMESTAMP}"

VERSION=f"Ver.{TIMESTAMP}: {ANOMALIES_BASE_VERSION} THRE{THRESHOLD}"
DB_SECRET="kohls_db.txt"
RESOURCES="/opt/dataproc/.resources"
PREDICTION_PATH=f"gs://{PREDICTION_BUCKET}/{PREDICTION_OUTPUT}"
ANOMALY_NEIGHBORHOOD_SIZE = 1

# Note: f-strings interpolate with {NAME}; a leading "$" would end up in the path literally.
arguments = {
    "--db_credentials_gcs_file_path": f"gs://{BUCKET}/resources/{DB_SECRET}",
    "--res_path": RESOURCES,
    "--predictions_path": PREDICTION_PATH,
    "--inverse_transformer_name": GCPHelper.get_file_name(INVERSE_TRANSFORMER_PATH),
    "--anomaly_threshold": THRESHOLD,
    "--anomaly_neighborhood_size": ANOMALY_NEIGHBORHOOD_SIZE,
    "--version": VERSION,
    "--output_analytics_project": PROJECT,
    "--output_analytics_bucket": BUCKET,
    "--output_analytics_path": f"{ANALYTICS_PATH}/{TIMESTAMP}",
    "--output_stat_project": PROJECT,
    "--output_stat_bucket": BUCKET,
    "--output_stat_path": STAT_DEPLOYMENT_PATH,
    "--output_stat_file": STAT_FILE
}

detect_job = builder.job_file(f'{SCRIPT_PATH}/anomaly_detection.py')\
.job_id(DETECTION_JOB_ID)\
.py_file(f'{SCRIPT_PATH}/apigee_ingest_utils.py')\
.py_file(f'{SCRIPT_PATH}/ai4ops_db.py')\
.py_file(f'{SCRIPT_PATH}/yarn_logging.py')\
.py_file(f'{SCRIPT_PATH}/yarn_plotter.py')\
.file(INVERSE_TRANSFORMER_PATH)\
.jars(f"gs://{BUCKET}/resources/mysql-connector-java-8.0.16.jar")\
.arguments(**arguments)\
.build_job()

detect_executor = DataprocExecutor(detect_job, session)

In [None]:
detect_res = detect_executor.submit_job(run_async=True)

In [None]:
sleep(60)
state = detect_executor.get_job_state()

print('State : {}'.format(state))
if state not in ['DONE', 'RUNNING']:
    raise RuntimeError('Anomaly detection job failed with state {}'.format(state))

In [None]:
transition_anomalies = {
    "ANOMALIES_JOB_ID": DETECTION_JOB_ID,
    "ANOMALIES_TIMESTAMP": TIMESTAMP,
    "ANOMALIES_ANALYTICS": f"{ANALYTICS_PATH}/{TIMESTAMP}",  # GCS path passed as --output_analytics_path
    "ANOMALIES_STATE": state,
    "ANOMALIES_VERSION": VERSION,
    "STAT_PATH": f"gs://{BUCKET}/{STAT_DEPLOYMENT_PATH}/{STAT_FILE}"
}

print(transition_anomalies)

with open('api_transition_anomalies.json', 'w') as file:
     file.write(json.dumps(transition_anomalies)) 

In [None]:
from urllib.parse import quote
from IPython.display import HTML, display

transition_anomalies = get_transition('api_transition_anomalies.json')
ANOMALIES_VERSION = transition_anomalies.get('ANOMALIES_VERSION', '')
ANOMALIES_EXAMPLE_METRIC = 'cncui_blue-available_memory_mb-agg'

ANOMALIES_ANALYTICS = transition_anomalies.get('ANOMALIES_ANALYTICS', '')
# Percent-encode the version: it contains spaces, ':' and '&', which would otherwise break the query string.
ANOMALIES_VERSION_ENC = quote(ANOMALIES_VERSION, safe='')
GRAFANA_REF = ('<a id="grafana"></a><h2>Grafana Anomalies Dashboard</h2>'
               '<a href="http://ai4ops-grafana-0:8080/d/1rJbPKnzWk1/ai4ops-versioned-anomalies?orgId=1&'
               'var-anomaly_metric_name={}&var-anomaly_version={}">{}</a>').format(ANOMALIES_EXAMPLE_METRIC,
                                                                                   ANOMALIES_VERSION_ENC,
                                                                                   ANOMALIES_VERSION)
display(HTML(GRAFANA_REF))


<a id="analytics"></a>
## Analytics
[back to Table Of Contents](#tc)

In [None]:
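import os
import shutil

# Start from an empty local folder; ignore_errors covers the first run, when it does not exist yet.
shutil.rmtree('anomalies_analytics', ignore_errors=True)
os.makedirs('anomalies_analytics', exist_ok=True)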

GCPHelper.download_folder_from_gs(BUCKET, ANOMALIES_ANALYTICS, 'anomalies_analytics')

In [None]:
from IPython.display import FileLink, FileLinks
FileLinks('anomalies_analytics')

<a id="streaming"></a>
## Streaming
[back to Table Of Contents](#tc)

In [None]:
builder = DataprocJobBuilder()
session = Session(BUCKET, REGION, CLUSTER, PROJECT)

In [None]:
CONFIG_PATH = ""

DB_SECRET="kohls_db.txt"
ND_SECRET="kohls_nd.txt"
BATCH_DURATION = ""

arguments = {
    '--token_file_gcs_path': f'gs://{BUCKET}/resources/{ND_SECRET}',
    '--db_credentials_file_gcs_path': f'gs://{BUCKET}/resources/{DB_SECRET}',
    '--res_path': RESOURCES,
    '--duration': DURATION,
    '--pool_size': POOL_SIZE,
    '--batch_duration': BATCH_DURATION,
    '--metric_db_table': 'metric_rt'
}

In [None]:
TIMESTAMP=int(datetime.now().timestamp())


STREAMING_JOB_ID=f"ai4ops_streaming_nd_ingest_{TIMESTAMP}"

streaming_job = builder.job_file(f'{SCRIPT_PATH}/nd_streaming_ingest.py')\
.job_id(STREAMING_JOB_ID)\
.py_file(f'{SCRIPT_PATH}/apigee_ingest_utils.py')\
.py_file(f'{SCRIPT_PATH}/ai4ops_db.py')\
.py_file(f'{SCRIPT_PATH}/yarn_logging.py')\
.py_file(f'{SCRIPT_PATH}/apigee_history_ingest.py')\
.py_file(f'{SCRIPT_PATH}/nd_ingest.py')\
.file(CONFIG_PATH)\
.jars(f"gs://{BUCKET}/resources/mysql-connector-java-8.0.16.jar")\
.jar(f"gs://{BUCKET}/resources/spark.http.apigee-1.0-SNAPSHOT-jar-with-dependencies.jar")\
.max_failures(-1)\
.arguments(**arguments)\
.build_job()

streaming_executor = DataprocExecutor(streaming_job, session)

In [None]:
streaming_res = streaming_executor.submit_job(run_async=True)

In [None]:
sleep(60)
state = streaming_executor.get_job_state()

print('State : {}'.format(state))
if state not in ['DONE', 'RUNNING']:
    raise RuntimeError('Streaming job failed with state {}'.format(state))

<a id="gcs"></a>
## Push Notebook to GCS Bucket
[back to Table Of Contents](#tc)

In [None]:
from IPython.display import Javascript, display

script = '''
require(["base/js/namespace"],function(Jupyter) {
    Jupyter.notebook.save_checkpoint();
});
'''

def notebook_save():
    # A Javascript object only runs when it is displayed; creating it inside a function is not enough.
    display(Javascript(script))
    print('This notebook has been saved')
notebook_save()

In [None]:
GS_NOTEBOOKS_PATH = 'ai4ops-source/ai4ops-jupyter-ds-01'
BUCKET = 'ai4ops-main-storage-bucket'
PROJECT = 'kohls-kos-cicd'
upload_file_to_gs(PROJECT, BUCKET, './api_net_diagnostics_ingest.ipynb', GS_NOTEBOOKS_PATH)