# Save the Model and the Evaluation Data

To use this model and the evaluation data from other locations, such as other notebooks or the model server, upload them to S3-compatible storage.

## Install the required packages and define the upload functions

In [None]:
!pip install boto3==1.35.59 botocore==1.35.59 python-dotenv==1.0.1

Name: boto3
Version: 1.35.59
Summary: The AWS SDK for Python
Home-page: https://github.com/boto/boto3
Author: Amazon Web Services
Author-email: 
License: Apache License 2.0
Location: /opt/conda/lib/python3.10/site-packages
Requires: botocore, jmespath, s3transfer
Required-by: 
---
Name: botocore
Version: 1.35.59
Summary: Low-level, data-driven core of boto 3.
Home-page: https://github.com/boto/botocore
Author: Amazon Web Services
Author-email: 
License: Apache License 2.0
Location: /opt/conda/lib/python3.10/site-packages
Requires: jmespath, python-dateutil, urllib3
Required-by: boto3, s3transfer
---
Name: python-dotenv
Version: 1.0.1
Summary: Read key-value pairs from a .env file and set them as environment variables
Home-page: https://github.com/theskumar/python-dotenv
Author: Saurabh Kumar
Author-email: me+github@saurabh-kumar.com
License: BSD-3-Clause
Location: /opt/conda/lib/python3.10/site-packages
Requires: 
Required-by: 


In [None]:
import os
import boto3
import botocore
from dotenv import load_dotenv

# Load the .env file
load_dotenv()

aws_access_key_id = os.environ.get('AWS_ACCESS_KEY_ID')
aws_secret_access_key = os.environ.get('AWS_SECRET_ACCESS_KEY')
region_name = os.environ.get('AWS_REGION')
bucket_name = os.environ.get('AWS_S3_BUCKET')

if not all([aws_access_key_id, aws_secret_access_key, region_name, bucket_name]):
    raise ValueError("One or more data connection variables are empty.  "
                     "Please check your data connection to an S3 bucket.")

session = boto3.session.Session(aws_access_key_id=aws_access_key_id,
                                aws_secret_access_key=aws_secret_access_key,
                                region_name=region_name)

s3_resource = session.resource(
    's3',
    config=botocore.client.Config(signature_version='s3v4')
)

bucket = s3_resource.Bucket(bucket_name)


def upload_file_to_s3(local_file, s3_key):
    print(f"{local_file} -> {s3_key}")
    bucket.upload_file(local_file, s3_key)

def upload_directory_to_s3(local_directory, s3_prefix):
    num_files = 0
    for root, dirs, files in os.walk(local_directory):
        for filename in files:
            file_path = os.path.join(root, filename)
            relative_path = os.path.relpath(file_path, local_directory)
            # S3 keys always use "/" as the separator, whatever the local OS uses
            s3_key = "/".join([s3_prefix] + relative_path.split(os.sep))
            print(f"{file_path} -> {s3_key}")
            bucket.upload_file(file_path, s3_key)
            num_files += 1
    return num_files


def list_objects(prefix):
    # Use a name that does not shadow the built-in filter()
    matching_objects = bucket.objects.filter(Prefix=prefix)
    for obj in matching_objects:
        print(obj.key)
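
To preview which object keys `upload_directory_to_s3` would create, without contacting the bucket, the directory walk can be separated from the upload. The following is a minimal sketch, not part of the original notebook: `plan_s3_keys` is a hypothetical helper, and the temporary directory only mimics the `models/fraud/1/model.onnx` layout used in this tutorial.

```python
import os
import posixpath
import tempfile


def plan_s3_keys(local_directory, s3_prefix):
    """Return (local_path, s3_key) pairs without uploading anything.

    Hypothetical helper: it mirrors the walk in upload_directory_to_s3
    but only computes the keys.
    """
    planned = []
    for root, _dirs, files in os.walk(local_directory):
        for filename in sorted(files):
            file_path = os.path.join(root, filename)
            relative_path = os.path.relpath(file_path, local_directory)
            # Build the key with "/" regardless of the local path separator.
            key = posixpath.join(s3_prefix, *relative_path.split(os.sep))
            planned.append((file_path, key))
    return planned


# Demonstrate on a throwaway directory that mimics the tutorial layout.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = os.path.join(tmp, "fraud", "1")
    os.makedirs(model_dir)
    with open(os.path.join(model_dir, "model.onnx"), "wb") as f:
        f.write(b"placeholder")
    planned_keys = [key for _path, key in plan_s3_keys(tmp, "models")]

print(planned_keys)  # ['models/fraud/1/model.onnx']
```

Running a dry run like this before the real upload makes it easier to spot an unexpected prefix or a stray file.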

## Verify the upload

Run the `list_objects` function to list the objects stored under the `models` prefix in your S3 bucket. As a best practice, to avoid mixing up model files, keep only one model and its required files in a given prefix or directory. This practice allows you to download and serve a directory with all the files that a model requires.

If this is the first time you are running the code, this cell produces no output.

If you've already uploaded your model, you should see this output: `models/fraud/1/model.onnx`


In [None]:
list_objects("models")

If you've already uploaded your artifacts, you should see this output: `artifact/scaler.pkl` and `artifact/test_data.pkl`

In [None]:
list_objects("artifact")

## Upload and check again

In [None]:
# Compress files: models/fraud/1/model.onnx, artifact/scaler.pkl and artifact/test_data.pkl using Python's zipfile module
import zipfile
import os

# Define the files to be compressed
files_to_compress = [
    "models/fraud/1/model.onnx",
    "artifact/scaler.pkl",
    "artifact/test_data.pkl"
]

# Define the name of the compressed file
compressed_file_name = "models/evaluation_kit.zip"

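# Create a zip file and add the files to it.
# ZIP_DEFLATED enables compression; the default (ZIP_STORED) only stores the files.
with zipfile.ZipFile(compressed_file_name, 'w', compression=zipfile.ZIP_DEFLATED) as zipf: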
    for file in files_to_compress:
        zipf.write(file)

# Verify the compressed file
if os.path.exists(compressed_file_name):
    print(f"Files compressed successfully. Compressed file: {compressed_file_name}")
    # Upload the compressed file to S3
    upload_file_to_s3(compressed_file_name, compressed_file_name)
else:
    print("Failed to compress files.")
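
Checking that the archive exists does not confirm that it contains every expected file. The following is a minimal sketch of a stricter check, assuming the archive was written with Python's `zipfile` module as above. `verify_archive` is a hypothetical helper, and the demonstration builds its own small archive in a temporary directory rather than reading `models/evaluation_kit.zip`.

```python
import os
import tempfile
import zipfile


def verify_archive(archive_path, expected_members):
    """Return the expected members that are missing from the archive."""
    with zipfile.ZipFile(archive_path) as zipf:
        # testzip() returns the name of the first corrupt member, or None.
        corrupt = zipf.testzip()
        if corrupt is not None:
            raise ValueError(f"Corrupt archive member: {corrupt}")
        present = set(zipf.namelist())
    return [name for name in expected_members if name not in present]


# Self-contained demonstration: the archive deliberately omits one file.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "scaler.pkl")
    with open(sample, "wb") as f:
        f.write(b"placeholder")
    archive = os.path.join(tmp, "evaluation_kit.zip")
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zipf:
        zipf.write(sample, arcname="artifact/scaler.pkl")
    missing = verify_archive(
        archive, ["artifact/scaler.pkl", "artifact/test_data.pkl"]
    )

print(missing)  # ['artifact/test_data.pkl']
```

In this notebook, the equivalent call would be `verify_archive(compressed_file_name, files_to_compress)`, which should return an empty list before you upload.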

Use the `upload_directory_to_s3` function to upload the `models` and `artifact` folders recursively:

In [None]:
local_models_directory = "models"

if not os.path.isdir(local_models_directory):
    raise ValueError(f"The directory '{local_models_directory}' does not exist.  "
                     "Did you finish training the model in the previous notebook?")

num_files = upload_directory_to_s3(local_models_directory, local_models_directory)

if num_files == 0:
    raise ValueError("No files uploaded.  Did you finish training and "
                     "saving the model to the \"models\" directory?  "
                     "Check for \"models/fraud/1/model.onnx\"")

local_artifacts_directory = "artifact"

if not os.path.isdir(local_artifacts_directory):
    raise ValueError(f"The directory '{local_artifacts_directory}' does not exist.  "
                     "Did you finish training the model in the previous notebook?")

num_files = upload_directory_to_s3(local_artifacts_directory, local_artifacts_directory)

if num_files == 0:
    raise ValueError("No files uploaded.  Did you finish training and "
                     "saving the artifacts to the \"artifact\" directory?")


To confirm this worked, run the `list_objects` function again:

In [None]:
list_objects("models")
list_objects("artifact")
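
Reading the printed keys works for a handful of files, but a programmatic comparison scales better. The following is a minimal sketch, not part of the original notebook: `find_missing_keys` is a hypothetical helper that compares the keys a local directory should produce against a collection of keys found in the bucket. The demonstration passes a hard-coded key list instead of querying S3.

```python
import os
import tempfile


def find_missing_keys(local_directory, s3_prefix, remote_keys):
    """Return the keys expected from local_directory that are not in remote_keys."""
    expected = set()
    for root, _dirs, files in os.walk(local_directory):
        for filename in files:
            relative = os.path.relpath(os.path.join(root, filename), local_directory)
            # Same key layout as upload_directory_to_s3: prefix + relative path.
            expected.add("/".join([s3_prefix] + relative.split(os.sep)))
    return sorted(expected - set(remote_keys))


# Demonstration with a fake "remote" listing that lacks test_data.pkl.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("scaler.pkl", "test_data.pkl"):
        with open(os.path.join(tmp, name), "wb") as f:
            f.write(b"placeholder")
    missing_keys = find_missing_keys(tmp, "artifact", ["artifact/scaler.pkl"])

print(missing_keys)  # ['artifact/test_data.pkl']
```

In this notebook, the remote keys could be collected with `[obj.key for obj in bucket.objects.filter(Prefix="artifact")]`. An empty result means every local file has a matching object in the bucket.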

### Next Step

Now that you've saved the model to S3 storage, you can use the same data connection to serve the model as an API.
