Description
Describe the bug
I have scheduled model-quality-monitoring jobs with hourly frequency. The ground-truth-merge job completes without issues, but the model-quality-monitoring job exits with code 1 and this error:
Exception in thread "main" com.amazonaws.sagemaker.dataanalyzer.exception.Error: Exception while invoking preprocessor script.
I have included print statements in my preprocessor.py script, and I presume the script was never run because nothing was printed to my CloudWatch logs. So I think the container (or processing job) failed even before the script was run. What is stopping the container from using the script, and how do I fix this error?
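In case it helps, here is a sketch of how the failure details for the latest execution of the schedule can be pulled with boto3 (the schedule name is the one from the code further down; the API calls are standard SageMaker ones, nothing custom):
import boto3

sm = boto3.client("sagemaker")

# latest execution of the monitoring schedule
latest = sm.list_monitoring_executions(
    MonitoringScheduleName="ch-mqm-sched-try34-sor",
    SortBy="ScheduledTime",
    SortOrder="Descending",
    MaxResults=1,
)["MonitoringExecutionSummaries"][0]

print(latest.get("MonitoringExecutionStatus"))
print(latest.get("FailureReason"))

# the underlying processing job also carries an ExitMessage
job_name = latest["ProcessingJobArn"].split("/")[-1]
job = sm.describe_processing_job(ProcessingJobName=job_name)
print(job.get("FailureReason"), job.get("ExitMessage"))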
To reproduce
I have created a model-quality-monitoring schedule as described in this notebook from the sagemaker-examples repo, with a few changes to suit my use case. I have a regression problem (the target variable is continuous); here are all the code, relevant files, and screenshots.
Here's the code I am using to schedule the monitoring job and create an alarm in CloudWatch:
## code is almost entirely from the example notebook in the sagemaker-examples repository linked above.
import boto3
from sagemaker import get_execution_role, Session
from sagemaker.model_monitor import ModelQualityMonitor
from sagemaker.model_monitor import EndpointInput, CronExpressionGenerator
# create clients
sm_boto_client = boto3.client('sagemaker')
sm_session = Session()
cw_boto_client = boto3.Session().client("cloudwatch")
# initiating relevant variables
endpoint_name = 'ch-mqm-ep-try31'
try_num = 34
# make a Monitoring schedule name
ch_monitor_schedule_name = f"ch-mqm-sched-try{try_num}-sor"
# Create an EndpointInput instance
endpointInput = EndpointInput(
    endpoint_name=endpoint_name,
    destination="/opt/ml/processing/input_data",
    inference_attribute='data',
)
# create a ModelQualityMonitor instance
ch_mqm_try34_sor = ModelQualityMonitor(
    role=get_execution_role(),
    instance_count=1,
    instance_type="ml.m5.xlarge",
    volume_size_in_gb=30,
    max_runtime_in_seconds=1800,
    sagemaker_session=sm_session,
)
# create the schedule now
mon_sched_suffix = f"ch-mqm-ep-try{try_num}-sor"
ch_mqm_try34_sor.create_monitoring_schedule(
    monitor_schedule_name=ch_monitor_schedule_name,
    endpoint_input=endpointInput,
    record_preprocessor_script=preprocessor_s3_uri,
    output_s3_uri=f's3://api-trial/{mon_sched_suffix}_monitoring_schedule_outputs',
    problem_type="Regression",
    ground_truth_input=f's3://api-trial/{endpoint_name}_groundtruths',
    constraints='s3://api-trial/ch-mqm-ep-25-5_mqm_artifacts/constraints.json',
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True,
)
# creating the CloudWatch alarm now
alarm_name = f"MODEL_QUALITY_R2_try{try_num}_sor"
alarm_desc = (
    "Trigger a CloudWatch alarm when the r2 score drifts away from the baseline constraints"
)
model_quality_r2_drift_threshold = (
    0.7238122  ## Setting this threshold purposefully low to see the alarm quickly.
)
metric_name = "r2"
namespace = f"aws/sagemaker/Endpoints/model-metrics-try{try_num}-sor"
# put the alarm in CloudWatch now
cw_boto_client.put_metric_alarm(
    AlarmName=alarm_name,
    AlarmDescription=alarm_desc,
    ActionsEnabled=True,
    MetricName=metric_name,
    Namespace=namespace,
    Statistic="Average",
    Dimensions=[
        {"Name": "Endpoint", "Value": endpoint_name},
        {"Name": "MonitoringSchedule", "Value": ch_monitor_schedule_name},
    ],
    Period=600,
    EvaluationPeriods=1,
    DatapointsToAlarm=1,
    Threshold=model_quality_r2_drift_threshold,
    ComparisonOperator="LessThanOrEqualToThreshold",
    TreatMissingData="breaching",
)
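(For completeness: preprocessor_s3_uri above points to the preprocessor.py script I uploaded to S3, shown in the screenshot under "Additional context". Roughly like the sketch below; the bucket is the same api-trial bucket as the outputs, but the key/prefix here is only illustrative, not my exact path.)
import boto3

s3_boto_client = boto3.client("s3")
bucket = "api-trial"                          # same bucket as the schedule outputs above
key = "ch-mqm-preprocessor/preprocessor.py"   # illustrative prefix, not my exact path
s3_boto_client.upload_file("preprocessor.py", bucket, key)
preprocessor_s3_uri = f"s3://{bucket}/{key}"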
Here's my "preprocessor.py" file:
from json import loads as json_loads
from base64 import b64decode as base64_b64decode
def preprocess_handler(inference_record):
    # input_data = json_loads(inference_record.endpoint_input.data)
    # input_data = {f"feature{str(i).zfill(2)}": val for i, val in enumerate(input_data)}
    print(inference_record)
    output_data = json_loads(inference_record.endpoint_output.data)
    output_data = base64_b64decode(output_data.encode('utf-8')).decode('utf-8')
    output_data = {"prediction0": output_data}
    print(output_data, '\n')
    return {**output_data}
Here are a couple of lines from the merged ground truth file:
{"eventVersion":"0","groundTruthData":{"data":170300.0,"encoding":"CSV"},"captureData":{"endpointOutput":{"data":"MjAyNTkyLjA=","encoding":"BASE64","mode":"OUTPUT","observedContentType":"text/html; charset=utf-8"}},"eventMetadata":{"eventId":"0faaafcf-5b6c-4dd9-8831-f734eed9aaa0","inferenceId":"60b48ef7-4609-4322-ac5d-e72ccd5d9492","inferenceTime":"2021-09-26T06:40:40Z"}}
{"eventVersion":"0","groundTruthData":{"data":152600.0,"encoding":"CSV"},"captureData":{"endpointOutput":{"data":"MTQ5NzU2LjA=","encoding":"BASE64","mode":"OUTPUT","observedContentType":"text/html; charset=utf-8"}},"eventMetadata":{"eventId":"7400f3e7-0b00-47c2-b39e-efd9cc155a45","inferenceId":"907f0ada-a61b-476e-8a6c-a6d284a249da","inferenceTime":"2021-09-26T06:40:42Z"}}
Expected behavior
The expected behavior is that the job uses the preprocessor.py script I provided while creating the schedule, calculates the R2 (R^2) metric, and updates the alarm in CloudWatch accordingly.
Screenshots or logs
Here's the screenshot of both the jobs:

Here's the screenshot of log of the failed model-quality-monitoring job:

Please tell me if any other logs are needed.
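(For reference, here is a sketch of how the processing-job logs can be pulled with boto3, assuming the default /aws/sagemaker/ProcessingJobs log group that SageMaker uses for processing jobs:)
import boto3

logs_boto_client = boto3.client("logs")
log_group = "/aws/sagemaker/ProcessingJobs"

# look at the most recent streams, which include the monitoring job's containers
streams = logs_boto_client.describe_log_streams(
    logGroupName=log_group,
    orderBy="LastEventTime",
    descending=True,
    limit=5,
)["logStreams"]

for stream in streams:
    events = logs_boto_client.get_log_events(
        logGroupName=log_group,
        logStreamName=stream["logStreamName"],
        limit=50,
    )["events"]
    for event in events:
        print(event["message"])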
System information
A description of your system. Please provide:
(I am doing all of this from SageMaker Studio: scheduling the jobs and running all the scripts. I am also not using any custom Docker images.)
- SageMaker Python SDK version: 2.59.3 (on sagemaker studio)
- Framework name (eg. PyTorch) or algorithm (eg. KMeans): sagemaker-model-monitor-analyzer (for a scheduled model-quality-monitoring job)
- Framework version: (I don't know the version of the container, it had no docker tag)
- Python version: 3.7.10 (default, Jun 4 2021, 14:48:32) (from sys.version on Studio)
- CPU or GPU: CPU
- Custom Docker image (Y/N): N
Additional context
- Since the error is about the preprocessing script not being run, here it is in S3, uploaded with the code I have shown above:

- I have also tried another monitoring schedule with my script named preprocessing.py, and that generated the same error too.
- In the job details on the SageMaker console, you can see the script being fetched for the scheduled monitoring job here:

How do I fix this error?