### JOBS for Data Drift Detection

Create, run and monitor a JOB 

In this notebook we show how to create, run, and monitor a JOB.

The code is packed in a tar.gz file and saved in Object Storage.

* migrated to Python 3.8

In [1]:
import os
import ads

from ads.jobs import DataScienceJob
from ads.jobs import ScriptRuntime
from ads.jobs import Job

from ads import set_auth

In [2]:
print(ads.__version__)

2.8.2


In [3]:
compartment_id = os.environ['NB_SESSION_COMPARTMENT_OCID']
project_id = os.environ['PROJECT_OCID']

set_auth(auth='resource_principal')

In [4]:
# check what is available for fast start
DataScienceJob.fast_launch_shapes()

[{
   "core_count": 1,
   "managed_egress_support": "SUPPORTED",
   "memory_in_gbs": 15,
   "name": "VM.STANDARD2.1_C1_M15GB_SUPPORTED",
   "shape_name": "VM.Standard2.1",
   "shape_series": "INTEL_SKYLAKE"
 },
 {
   "core_count": 4,
   "managed_egress_support": "SUPPORTED",
   "memory_in_gbs": 60,
   "name": "VM.STANDARD2.4_C4_M60GB_SUPPORTED",
   "shape_name": "VM.Standard2.4",
   "shape_series": "INTEL_SKYLAKE"
 }]
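The listing above can also be filtered programmatically. The following is a minimal sketch, assuming the shape entries have been copied into plain dictionaries; the helper `pick_shape` is hypothetical and not part of ADS, and the sketch makes no assumption about the object type that `fast_launch_shapes()` actually returns.

```python
# Entries copied by hand from the output above (only the fields used here).
shapes = [
    {"core_count": 1, "memory_in_gbs": 15, "shape_name": "VM.Standard2.1"},
    {"core_count": 4, "memory_in_gbs": 60, "shape_name": "VM.Standard2.4"},
]


def pick_shape(shapes, min_cores=1, min_memory_gbs=0):
    """Return the name of the smallest shape that meets the requirements, or None."""
    candidates = [
        s for s in shapes
        if s["core_count"] >= min_cores and s["memory_in_gbs"] >= min_memory_gbs
    ]
    if not candidates:
        return None
    # "Smallest" here means fewest cores, then least memory.
    best = min(candidates, key=lambda s: (s["core_count"], s["memory_in_gbs"]))
    return best["shape_name"]


print(pick_shape(shapes, min_cores=2))
```

Picking the smallest shape that satisfies the requirements is only one possible policy; it keeps cost down for short test runs like the one in this notebook.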

#### Specify the infrastructure for the JOB run

In [5]:
# Specify the requested infrastructure: VM shape and logging.
# The network configuration is taken from the notebook session.

# a shape from the fast launch list above
SHAPE_NAME = "VM.Standard2.4"
LOG_GROUP_ID = "ocid1.loggroup.oc1.eu-milan-1.amaaaaaangencdya37xpdas7cenw3thhfetpb5qe75ymyymoo2b4w42pbrsq"
LOG_ID = "ocid1.log.oc1.eu-milan-1.amaaaaaangencdyaspdct6j6xl4umonzqwvvhrysal7lcxi2gcj6vt7doqaa"

# you need to provide the OCID for LogGroup and Log
infrastructure = (
    DataScienceJob()
    .with_shape_name(SHAPE_NAME)
    .with_log_group_id(LOG_GROUP_ID)
    .with_log_id(LOG_ID)
)
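Because the Log Group and Log OCIDs are long and easy to swap, a quick sanity check before building the infrastructure can help. The sketch below relies only on the documented OCID layout `ocid1.<resource type>.<realm>.[region][.future use].<unique id>`; the helper `ocid_resource_type` is hypothetical and does not call any OCI service.

```python
# Values copied from the cell above.
LOG_GROUP_ID = "ocid1.loggroup.oc1.eu-milan-1.amaaaaaangencdya37xpdas7cenw3thhfetpb5qe75ymyymoo2b4w42pbrsq"
LOG_ID = "ocid1.log.oc1.eu-milan-1.amaaaaaangencdyaspdct6j6xl4umonzqwvvhrysal7lcxi2gcj6vt7doqaa"


def ocid_resource_type(ocid: str) -> str:
    """Return the resource type segment of an OCID, or raise ValueError."""
    parts = ocid.split(".")
    # A well-formed OCID has at least: version, type, realm, region, unique id.
    if len(parts) < 5 or parts[0] != "ocid1" or not parts[-1]:
        raise ValueError(f"not a valid OCID: {ocid!r}")
    return parts[1]


# Fail early if the two identifiers were swapped or mistyped.
assert ocid_resource_type(LOG_GROUP_ID) == "loggroup"
assert ocid_resource_type(LOG_ID) == "log"
print("log OCIDs look consistent")
```

This only checks the syntax of the identifiers; it does not verify that the resources exist or that the notebook session is allowed to write to them.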

#### Specify the runtime

In [6]:
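#
# all the Python code is packed in a tar.gz archive, saved in an Object Storage bucket
# URL format: oci://<bucket>@<namespace>/<object>, e.g. oci://drift_input@frqap2zhtzbe/drift.tar.gz
# (this test run uses test.tar.gz from the WORKSHOP bucket)
#

# specify the runtime and conda env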
runtime = (
    ScriptRuntime()
    .with_source("oci://WORKSHOP@frqap2zhtzbe/test.tar.gz")
    .with_service_conda("generalml_p37_cpu_v1")
    .with_environment_variable(JOB_RUN_ENTRYPOINT="test.py")
)
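The runtime above expects a tar.gz archive containing the script named in `JOB_RUN_ENTRYPOINT`. As a minimal sketch of how such an archive could be built with the standard library, the example below packs a throwaway `test.py` into `test.tar.gz` in a temporary directory. The helper `pack_sources`, the directory layout, and the script content are assumptions for illustration; uploading the archive to Object Storage is not shown.

```python
import os
import tarfile
import tempfile


def pack_sources(src_dir: str, archive_path: str) -> list:
    """Pack the contents of src_dir into a gzip-compressed tar archive.

    Entries are stored relative to src_dir, so a file src_dir/test.py
    ends up at the archive root as test.py.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in sorted(os.listdir(src_dir)):
            tar.add(os.path.join(src_dir, name), arcname=name)
    # Re-open the archive to report what was actually written.
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    os.makedirs(src)
    # Placeholder entrypoint; the real test.py is not part of this notebook.
    with open(os.path.join(src, "test.py"), "w") as f:
        f.write('print("JOB starting...")\n')
    names = pack_sources(src, os.path.join(tmp, "test.tar.gz"))
    print(names)
```

Storing entries relative to the source directory keeps the entrypoint at the archive root, which is the layout assumed here for `JOB_RUN_ENTRYPOINT="test.py"`.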

#### Specify the JOB

In [7]:
# specify the JOB
job = (
    Job(name="job_test")
    .with_infrastructure(infrastructure)
    .with_runtime(runtime)
)

#### Create the JOB definition

In [8]:
# create the JOB
job.create()

kind: job
spec:
  id: ocid1.datasciencejob.oc1.eu-milan-1.amaaaaaangencdyaxjohphcyvy72dde4dus7qrtqtz3jhnb5aaggdjkbphfa
  infrastructure:
    kind: infrastructure
    spec:
      blockStorageSize: 500
      compartmentId: ocid1.compartment.oc1..aaaaaaaag2cpni5qj6li5ny6ehuahhepbpveopobooayqfeudqygdtfe6h3a
      displayName: job_test
      jobInfrastructureType: STANDALONE
      jobType: DEFAULT
      logGroupId: ocid1.loggroup.oc1.eu-milan-1.amaaaaaangencdya37xpdas7cenw3thhfetpb5qe75ymyymoo2b4w42pbrsq
      logId: ocid1.log.oc1.eu-milan-1.amaaaaaangencdyaspdct6j6xl4umonzqwvvhrysal7lcxi2gcj6vt7doqaa
      projectId: ocid1.datascienceproject.oc1.eu-milan-1.amaaaaaangencdyageryq6wvsxw6rjdjwagoym3h7hnncszqqnq34g3aakoq
      shapeName: VM.Standard2.4
      subnetId: ocid1.subnet.oc1.eu-milan-1.aaaaaaaajiptbm2u4svnhnnk7uvb7owx7iii2fqb52n2oz7ura43mizniskq
    type: dataScienceJob
  name: job_test
  runtime:
    kind: runtime
    spec:
      conda:
        slug: generalml_p37_cpu_v1
        type

#### Run the JOB

In [9]:
# run
job_run = job.run()

#### Attach and display the log

In [10]:
# watch and stream the job run outputs
job_run.watch()

Job OCID: ocid1.datasciencejob.oc1.eu-milan-1.amaaaaaangencdyaxjohphcyvy72dde4dus7qrtqtz3jhnb5aaggdjkbphfa
Job Run OCID: ocid1.datasciencejobrun.oc1.eu-milan-1.amaaaaaangencdyabpu5cl4ujxvp72d2lkkknq7kbrmlr2xc53fu4ue6liiq
2023-03-22 14:13:10 - Job Run ACCEPTED
2023-03-22 14:13:16 - Job Run ACCEPTED, Infrastructure provisioning.
2023-03-22 14:14:16 - Job Run ACCEPTED, Infrastructure provisioned.
2023-03-22 14:14:38 - Job Run ACCEPTED, Job run bootstrap starting.
2023-03-22 14:17:25 - Job Run ACCEPTED, Job run bootstrap complete. Artifact execution starting.
2023-03-22 14:17:28 - Job Run IN_PROGRESS, Job run artifact execution in progress.
2023-03-22 14:17:22 - JOB starting...
2023-03-22 14:17:22 - 
2023-03-22 14:17:23 - 
2023-03-22 14:17:23 - JOB ending.
2023-03-22 14:17:23 - 
2023-03-22 14:17:45 - Job Run SUCCEEDED, Job run artifact execution succeeded. Infrastructure de-provisioning.


kind: jobRun
spec:
  id: ocid1.datasciencejobrun.oc1.eu-milan-1.amaaaaaangencdyabpu5cl4ujxvp72d2lkkknq7kbrmlr2xc53fu4ue6liiq
  infrastructure:
    kind: infrastructure
    spec:
      blockStorageSize: 500
      compartmentId: ocid1.compartment.oc1..aaaaaaaag2cpni5qj6li5ny6ehuahhepbpveopobooayqfeudqygdtfe6h3a
      displayName: job_test-run-2023-03-22-14:13.06
      jobInfrastructureType: STANDALONE
      jobType: DEFAULT
      logGroupId: ocid1.loggroup.oc1.eu-milan-1.amaaaaaangencdya37xpdas7cenw3thhfetpb5qe75ymyymoo2b4w42pbrsq
      logId: ocid1.log.oc1.eu-milan-1.amaaaaaangencdyaspdct6j6xl4umonzqwvvhrysal7lcxi2gcj6vt7doqaa
      projectId: ocid1.datascienceproject.oc1.eu-milan-1.amaaaaaangencdyageryq6wvsxw6rjdjwagoym3h7hnncszqqnq34g3aakoq
      shapeName: VM.Standard2.4
      subnetId: ocid1.subnet.oc1.eu-milan-1.aaaaaaaajiptbm2u4svnhnnk7uvb7owx7iii2fqb52n2oz7ura43mizniskq
    type: dataScienceJob
  name: job_test-run-2023-03-22-14:13.06
  runtime:
    kind: runtime
    spec:
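The streamed log lines above follow a `timestamp - message` pattern, which makes it easy to extract the final state and the total wall-clock time of a run. The sketch below works on a few lines copied from the output; the helpers `parse_line`, `elapsed_seconds`, and `final_state` are hypothetical, and the line format is assumed to match what `watch()` printed in this run.

```python
from datetime import datetime

# A subset of the lines streamed by job_run.watch() above.
log_lines = [
    "2023-03-22 14:13:10 - Job Run ACCEPTED",
    "2023-03-22 14:14:16 - Job Run ACCEPTED, Infrastructure provisioned.",
    "2023-03-22 14:17:28 - Job Run IN_PROGRESS, Job run artifact execution in progress.",
    "2023-03-22 14:17:45 - Job Run SUCCEEDED, Job run artifact execution succeeded. Infrastructure de-provisioning.",
]


def parse_line(line):
    """Split a log line into (datetime, message)."""
    stamp, _, message = line.partition(" - ")
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), message


def elapsed_seconds(lines):
    """Seconds between the earliest and latest timestamp in the lines."""
    times = [parse_line(line)[0] for line in lines]
    return (max(times) - min(times)).total_seconds()


def final_state(lines):
    """Lifecycle state from the last 'Job Run <STATE>' line, or None."""
    for line in reversed(lines):
        _, message = parse_line(line)
        if message.startswith("Job Run "):
            return message[len("Job Run "):].split(",")[0]
    return None


print(final_state(log_lines), elapsed_seconds(log_lines))
```

Using the minimum and maximum timestamps rather than the first and last lines matters because, as the output above shows, lines written by the script itself can carry timestamps slightly earlier than the surrounding lifecycle messages.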
      