# AIM410 Workshop Setup

This notebook handles all the setup requirements for the AIM410 workshop on efficient AI model customization using Amazon SageMaker AI.

## What this notebook does:

- Installs all required Python dependencies
- Sets up SageMaker session and execution role
- Sets the names of the pre-deployed SageMaker AI endpoints and retrieves the MLflow Tracking Server ARN
- Clones and prepares the Spectrum repository for fine-tuning
- Points to the AWS Glue database and the S3 location used for Athena query output
- Stores the shared variables for use in the later workshop notebooks

Run this notebook before starting any of the other workshop notebooks.


## 1. Install Dependencies


In [None]:
%pip install -r requirements.txt -U

## 2. SageMaker Session Setup


In [None]:
import sagemaker
import boto3

# Initialize SageMaker session
sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()
region = sagemaker_session.boto_session.region_name
account_id = boto3.client('sts').get_caller_identity()['Account']
bucket_name = sagemaker_session.default_bucket()

print(f"SageMaker role: {role}")
print(f"Default bucket: {bucket_name}")
print(f"Region: {region}")
print(f"Account ID: {account_id}")

## 3. SageMaker AI Model Deployment

Deploy the model to a SageMaker AI endpoint so that you can query it in the later notebooks.

<div class="alert alert-block alert-info">
  <b>Important:</b> If you're running this at an AWS event, we have pre-deployed a SageMaker endpoint for you in the AWS account you're using. <b>You can skip to the next step!</b>
</div>

If you're running this on your own AWS account or you want to see how to deploy a Hugging Face model on a SageMaker endpoint, please refer to the `optional/1-vanilla-model-deployment.ipynb` notebook in the provided repository.

In [None]:
VANILLA_ENDPOINT_NAME = "vanilla-qwen3"
CUSTOM_ENDPOINT_NAME = "custom-qwen3"
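
Before moving on, it can help to confirm that both endpoints are actually in service. The following is a minimal sketch, not part of the workshop repository: `endpoint_statuses` is a hypothetical helper, and it assumes the client passed in is a `boto3` SageMaker client with permission to call `describe_endpoint`.

```python
def endpoint_statuses(sm_client, endpoint_names):
    """Map each endpoint name to its EndpointStatus (for example "InService").

    Hypothetical helper: `sm_client` is assumed to be boto3.client("sagemaker").
    A missing endpoint makes describe_endpoint raise, which is left to propagate.
    """
    return {
        name: sm_client.describe_endpoint(EndpointName=name)["EndpointStatus"]
        for name in endpoint_names
    }


# Usage in this notebook (requires AWS credentials):
# import boto3
# sm_client = boto3.client("sagemaker", region_name=region)
# print(endpoint_statuses(sm_client, [VANILLA_ENDPOINT_NAME, CUSTOM_ENDPOINT_NAME]))
```

Any status other than `InService` suggests the endpoint is still being created or has failed, in which case later notebooks that invoke it would not work yet.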

We have also pre-provisioned an MLflow Tracking Server for you:

In [None]:
# Import only the helper used here instead of a wildcard import
from utils.mlflow_tracking import get_mlflow_arn

MLFLOW_TRACKING_URI = get_mlflow_arn()
MLFLOW_TRACKING_URI

## 4. Spectrum Repository Setup

Clone the Spectrum repository for spectrum-aware fine-tuning.


In [None]:
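import os

spectrum_clone_folder = os.path.expanduser("~/spectrum")
spectrum_git_dir = os.path.join(spectrum_clone_folder, ".git")

# Clone the Spectrum repository unless a previous run already did.
# A failing `!` shell command does not raise a Python exception,
# so check the result on disk instead of wrapping it in try/except.
if os.path.isdir(spectrum_git_dir):
    print("✓ Spectrum repository already present, skipping clone")
else:
    !git clone https://github.com/QuixiAI/spectrum.git {spectrum_clone_folder}
    if os.path.isdir(spectrum_git_dir):
        print("✓ Spectrum repository cloned successfully")
    else:
        print("⚠️ Clone failed; check network access and the repository URL")

# Create spectrum-layer directory
!mkdir -p ./scripts/spectrum-layer/
print("✓ Spectrum layer directory created")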

## 5. Databases and tables

<div class="alert alert-block alert-info">
  If you're running this at an AWS event, the tables are already deployed in AWS Glue and the cell below only points to them. In your own AWS account, create an AWS Glue database first, then set its name below.
</div>

In [None]:
db_name = "retail"
s3_output = f"s3://{bucket_name}/athena-query-output/"
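
To check that the database and the query output location work together, you could run a small query through Athena. The sketch below is an illustration under assumptions rather than workshop code: `run_athena_query` is a hypothetical helper, the client is assumed to be a `boto3` Athena client, and only the first page of results is returned.

```python
import time


def run_athena_query(athena_client, sql, database, output_location, poll_seconds=1.0):
    """Run a query and return the first page of rows as lists of strings.

    Hypothetical helper: `athena_client` is assumed to be boto3.client("athena").
    For SELECT queries the first returned row holds the column names.
    """
    query_id = athena_client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    # Poll until Athena reports a terminal state.
    while True:
        execution = athena_client.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        if status["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if status["State"] != "SUCCEEDED":
        reason = status.get("StateChangeReason", "no reason given")
        raise RuntimeError(f"Athena query {status['State']}: {reason}")

    result = athena_client.get_query_results(QueryExecutionId=query_id)
    return [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]


# Usage in this notebook (requires AWS credentials):
# import boto3
# athena = boto3.client("athena", region_name=region)
# print(run_athena_query(athena, "SHOW TABLES", db_name, s3_output))
```

If the query fails, the error message usually points to either a missing Glue database or insufficient permissions on the S3 output location.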

## 6. Store required variables across sessions


In [None]:
%store VANILLA_ENDPOINT_NAME CUSTOM_ENDPOINT_NAME MLFLOW_TRACKING_URI account_id bucket_name region db_name s3_output
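
Later notebooks can bring these values back by running `%store -r` in a cell of their own. As a rough sketch, a check like the one below could then confirm that nothing is missing. `REQUIRED_VARS` and `missing_variables` are hypothetical names introduced for illustration, mirroring the variables stored above.

```python
# Names mirror the variables passed to %store in this notebook.
REQUIRED_VARS = (
    "VANILLA_ENDPOINT_NAME",
    "CUSTOM_ENDPOINT_NAME",
    "MLFLOW_TRACKING_URI",
    "account_id",
    "bucket_name",
    "region",
    "db_name",
    "s3_output",
)


def missing_variables(namespace, required=REQUIRED_VARS):
    """Return the required names that are not defined in `namespace`."""
    return [name for name in required if name not in namespace]


# In a later notebook, after running `%store -r` in a separate cell:
# missing = missing_variables(globals())
# if missing:
#     raise RuntimeError(f"Run the setup notebook first; missing: {missing}")
```

Failing early with a clear message is easier to diagnose than a `NameError` partway through a fine-tuning or evaluation cell.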

## Setup Complete

✅ **Workshop setup is now complete!**

### What was configured:

- Python dependencies installed
- SageMaker session initialized
- Endpoint names set and MLflow Tracking Server ARN retrieved
- Spectrum repository cloned
- Glue database name and Athena query output location set
- Shared variables stored for the later notebooks

### Next steps:

1. **Notebook 1**: Evaluate base Qwen3 model performance
2. **Notebook 2**: Perform spectrum-aware fine-tuning
3. **Notebook 3**: Evaluate fine-tuned model performance
4. **Notebook 4**: Build autonomous agents with Strands

You can now proceed with the workshop notebooks in order.
