# Deploy Sentence Highlighter Model to Amazon SageMaker

This notebook demonstrates the process of deploying a sentence highlighting model to Amazon SageMaker. This model can be used to identify and highlight important sentences in a given text.

We will cover the following steps:
1.  **Setup**: Importing the required libraries and configuring logging.
2.  **SageMaker IAM Role**: Creating or retrieving an IAM role that lets SageMaker access AWS resources.
3.  **Model Packaging**: Bundling the model artifact, inference script, and dependencies into a `model.tar.gz` file.
4.  **Model Deployment**: Defining the functions that upload the packaged model to S3 and deploy it as a SageMaker endpoint.
5.  **Run SageMaker Deployment**: Running packaging, role setup, and deployment end to end.
6.  **Testing and Cleanup**: Invoking the deployed endpoint, then deleting it to stop charges.

## 1. Setup

In [None]:
import json  # used by create_sagemaker_role
import logging
import os
import shutil
import tarfile
import time
from datetime import datetime

import boto3
import sagemaker
from sagemaker.deserializers import JSONDeserializer
from sagemaker.pytorch import PyTorchModel
from sagemaker.serializers import JSONSerializer
# from sagemaker import get_execution_role  # common alternative; this notebook creates its own role instead

In [None]:
# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

logger.info("Libraries imported and logging configured.")

## 2. SageMaker IAM Role

Amazon SageMaker needs permissions to access other AWS services, such as S3 (for model artifacts) and CloudWatch (for logs). We define a function `create_sagemaker_role` that either creates a new IAM role with the necessary permissions or fetches an existing one.

**Permissions required:**
*   `AmazonSageMakerFullAccess`: Allows SageMaker to manage training jobs, endpoints, etc.
*   `AmazonS3FullAccess`: Allows SageMaker to read model data from S3 and write output. (Note: In a production environment, you should scope this down to specific buckets).
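
As a starting point for scoping down S3 access, the sketch below builds a policy document restricted to one bucket and key prefix. The helper name `build_scoped_s3_policy` and the bucket name are hypothetical and not part of this notebook, and the action list is an assumption about what model hosting needs, so check it against your own workload.

```python
import json


def build_scoped_s3_policy(bucket_name, key_prefix):
    """Return an IAM policy document limited to one bucket and key prefix.

    Hypothetical helper for illustration; adjust the actions to your needs.
    """
    prefix = key_prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level access, only under the given prefix
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/{prefix}/*",
            },
            {
                # Listing is a bucket-level action, narrowed by a prefix condition
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket_name}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    }


# Placeholder bucket name; the prefix matches the default s3_prefix used later
policy = build_scoped_s3_policy("my-example-bucket", "semantic-highlighter-script/model")
print(json.dumps(policy, indent=2))
```

A document like this could be attached as an inline policy with `iam.put_role_policy(RoleName=..., PolicyName=..., PolicyDocument=json.dumps(policy))` in place of `AmazonS3FullAccess`.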

In [None]:
def create_sagemaker_role(role_name='SageMakerExecutionRoleScript'):
    """Create or get a SageMaker execution role with necessary permissions."""
    iam = boto3.client('iam')
    account_id = boto3.client('sts').get_caller_identity()['Account']
    role_arn = f'arn:aws:iam::{account_id}:role/{role_name}'

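    try:
        logger.info(f"Checking for existing role: {role_name}")
        # Use the ARN reported by IAM rather than the hand-built one, which would be
        # wrong for a role created with a path. Policies on an existing role are not checked.
        role_arn = iam.get_role(RoleName=role_name)['Role']['Arn']
        logger.info(f"Role {role_name} already exists. Using ARN: {role_arn}")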
    except iam.exceptions.NoSuchEntityException:
        logger.info(f"Role {role_name} not found. Creating new role...")
        assume_role_policy_document = {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {"Service": "sagemaker.amazonaws.com"},
                    "Action": "sts:AssumeRole"
                }
            ]
        }
        try:
            role_response = iam.create_role(
                RoleName=role_name,
                AssumeRolePolicyDocument=json.dumps(assume_role_policy_document)
            )
            role_arn = role_response['Role']['Arn']
            logger.info(f"Created new role: {role_name} with ARN: {role_arn}")

            policies = [
                'arn:aws:iam::aws:policy/AmazonSageMakerFullAccess',
                'arn:aws:iam::aws:policy/AmazonS3FullAccess' # Consider more restrictive policies for production
            ]
            for policy_arn_to_attach in policies:
                iam.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn_to_attach)
                logger.info(f"Attached policy {policy_arn_to_attach} to role {role_name}")
            
            logger.info("Waiting 10 seconds for role to be fully available...")
            time.sleep(10) # IAM changes can take a moment to propagate
        except Exception as e:
            logger.error(f"Failed to create role or attach policies: {str(e)}")
            raise
    except Exception as e:
        logger.error(f"Failed to get SageMaker role '{role_name}': {str(e)}")
        raise
    return role_arn

# Example of how to call it (optional: can be called later in the main execution block)
# try:
#     sagemaker_role_arn = create_sagemaker_role()
#     logger.info(f"Using SageMaker Role ARN: {sagemaker_role_arn}")
# except Exception as e:
#     logger.error(f"Role creation failed: {e}")

## 3. Model Packaging

To deploy a model to SageMaker, we need to package it in a specific format. This typically involves creating a `model.tar.gz` file containing:
*   The trained model artifact (e.g., `model.pt` or `pytorch_model.bin`).
*   An inference script (commonly named `inference.py`) that SageMaker will use to load the model and make predictions. This script must define specific functions like `model_fn`, `input_fn`, `predict_fn`, and `output_fn`.
*   A `requirements.txt` file listing any dependencies that need to be installed in the SageMaker serving container.

The `prepare_model_package` function below automates this process. You will need to provide paths to your model file, your `inference.py` script, and your `requirements.txt` file.
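
Before uploading, it can help to confirm that the archive has the layout described above. The following is a minimal sketch using only the standard library; the helper names `list_package_members` and `check_package_layout` are hypothetical, and the demo builds a throwaway archive from empty placeholder files rather than a real model.

```python
import os
import tarfile
import tempfile


def list_package_members(tarball_path):
    """Return the sorted file paths stored in a model.tar.gz archive."""
    with tarfile.open(tarball_path, "r:gz") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())


def check_package_layout(members, model_filename, script_name="inference.py"):
    """Return a list of problems; an empty list means the layout looks right."""
    problems = []
    if model_filename not in members:
        problems.append(f"{model_filename} missing from archive root")
    for name in (script_name, "requirements.txt"):
        if f"code/{name}" not in members:
            problems.append(f"code/{name} missing")
    return problems


# Demo on a throwaway archive built from empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    code_dir = os.path.join(tmp, "code")
    os.makedirs(code_dir)
    placeholder_paths = (
        os.path.join(tmp, "model.pt"),
        os.path.join(code_dir, "inference.py"),
        os.path.join(code_dir, "requirements.txt"),
    )
    for path in placeholder_paths:
        open(path, "w").close()
    demo_tarball = os.path.join(tmp, "model.tar.gz")
    with tarfile.open(demo_tarball, "w:gz") as tar:
        tar.add(os.path.join(tmp, "model.pt"), arcname="model.pt")
        tar.add(code_dir, arcname="code")
    demo_members = list_package_members(demo_tarball)
    demo_problems = check_package_layout(demo_members, "model.pt")

print(demo_members)
print(demo_problems)
```

For a real package, pass the path returned by `prepare_model_package` and the base name of your model file.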

In [None]:
def prepare_model_package(source_model_path, inference_script_path, requirements_path, model_dir="model_package"):
    """Prepare model package for deployment."""
    try:
        logger.info("Preparing model package...")
        
        # Clean up existing model directory
        if os.path.exists(model_dir):
            shutil.rmtree(model_dir)
        os.makedirs(model_dir)
        
        logger.info(f"Using source model path: {source_model_path}")
        if not os.path.exists(source_model_path):
            raise FileNotFoundError(f"Model file not found at {source_model_path}")
        
        # Create code directory for inference script
        code_dir = os.path.join(model_dir, "code")
        os.makedirs(code_dir, exist_ok=True)
        
        # Copy inference script to code directory
        logger.info(f"Using inference script path: {inference_script_path}")
        if not os.path.exists(inference_script_path):
            raise FileNotFoundError(f"Inference script not found at {inference_script_path}")
        shutil.copy(inference_script_path, code_dir)
        
        # Copy requirements.txt to code directory
        logger.info(f"Using requirements path: {requirements_path}")
        if not os.path.exists(requirements_path):
            raise FileNotFoundError(f"requirements.txt not found at {requirements_path}")
        shutil.copy(requirements_path, code_dir)
        
        # SageMaker unpacks model.tar.gz into the model directory, so the model file
        # is added at the archive root, straight from source_model_path.
        # The arcname passed to tar.add sets its path inside the archive.
        target_model_filename = os.path.basename(source_model_path)

        # Create model.tar.gz
        output_tarball = "model.tar.gz"
        if os.path.exists(output_tarball):
            os.remove(output_tarball)
            
        logger.info(f"Creating {output_tarball}...")
        with tarfile.open(output_tarball, "w:gz") as tar:
            # Add model file to the root of the tarball
            tar.add(source_model_path, arcname=target_model_filename)
            # Add the code directory (containing inference.py and requirements.txt)
            tar.add(code_dir, arcname="code")
        
        logger.info(f"Model package prepared successfully as {output_tarball}")
        return output_tarball
        
    except Exception as e:
        logger.error(f"Error preparing model package: {str(e)}")
        raise

# Example usage (paths will need to be defined by the user)
# try:
#     # Define these paths according to your project structure
#     model_file = "path/to/your/opensearch-semantic-highlighter-v1.pt" 
#     inference_py = "path/to/your/inference.py"
#     requirements_txt = "path/to/your/requirements.txt"
#
#     if not (os.path.exists(model_file) and os.path.exists(inference_py) and os.path.exists(requirements_txt)):
#         logger.warning("One or more required files (model, inference script, requirements) not found. Skipping example packaging.")
#     else:
#         model_package_tarball = prepare_model_package(
#             source_model_path=model_file,
#             inference_script_path=inference_py,
#             requirements_path=requirements_txt
#         )
#         logger.info(f"Model package created: {model_package_tarball}")
#
# except FileNotFoundError as fnf:
#     logger.error(f"Packaging failed due to missing file: {fnf}")
# except Exception as e:
#     logger.error(f"Packaging example failed: {e}")

## 4. Model Deployment to SageMaker

Once the model package (`model.tar.gz`) is ready and the IAM role is set up, we can deploy the model to a SageMaker endpoint.

This involves:
1.  **Generating a unique endpoint name**: To avoid conflicts with existing endpoints.
2.  **Uploading the model package to S3**: SageMaker loads models from S3.
3.  **Creating a SageMaker `PyTorchModel` object**: This object points to the model data in S3 and specifies the inference script, framework versions, and IAM role.
4.  **Deploying the model**: This provisions the necessary infrastructure (e.g., EC2 instances) and deploys the model container. This step can take several minutes.

The `get_endpoint_name` function creates a unique name, and `deploy_sagemaker_model` handles the deployment.
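
Endpoint names are constrained by SageMaker. The sketch below is a variant of the naming step that also validates the result. It assumes the documented limit of 63 characters made up of alphanumerics and hyphens, with no leading or trailing hyphen. `make_endpoint_name` is a hypothetical helper, and it uses a UTC timestamp so that names sort consistently across machines.

```python
import re
from datetime import datetime, timezone

# Assumed constraint: alphanumerics and hyphens, no leading or trailing hyphen
ENDPOINT_NAME_PATTERN = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9])*")
MAX_ENDPOINT_NAME_LENGTH = 63


def make_endpoint_name(prefix="semantic-highlighter", now=None):
    """Build '<prefix>-<UTC timestamp>' and reject names SageMaker would refuse."""
    now = now or datetime.now(timezone.utc)
    name = f"{prefix}-{now.strftime('%Y%m%d-%H%M%S')}"
    too_long = len(name) > MAX_ENDPOINT_NAME_LENGTH
    if too_long or not ENDPOINT_NAME_PATTERN.fullmatch(name):
        raise ValueError(f"Invalid endpoint name: {name!r}")
    return name


# A fixed timestamp keeps the demo deterministic
example_name = make_endpoint_name(now=datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc))
print(example_name)
```

Validating locally surfaces a bad prefix immediately instead of after the model upload.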

In [None]:
def get_endpoint_name():
    """Generate a unique endpoint name with timestamp."""
    timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    return f"semantic-highlighter-{timestamp}"

In [None]:
def deploy_sagemaker_model(model_package_tarball, sagemaker_role_arn, 
                                 entry_point_script='inference.py', # Name of the script inside code/ within model.tar.gz
                                 framework_version='2.0', # Specify a recent, valid PyTorch version
                                 py_version='py310', # Specify a compatible Python version 
                                 instance_type='ml.m5.large', # Choose an appropriate instance type
                                 initial_instance_count=1,
                                 s3_bucket=None, # Optional: specify bucket, else uses default
                                 s3_prefix='semantic-highlighter-script/model'):
    """Deploys the packaged model to a SageMaker endpoint."""
    try:
        endpoint_name = get_endpoint_name()
        logger.info(f"Using endpoint name: {endpoint_name}")
        
        session = sagemaker.Session()
        if not s3_bucket:
            s3_bucket = session.default_bucket()
        logger.info(f"Using S3 bucket: {s3_bucket} and prefix: {s3_prefix}")

        logger.info(f"Uploading {model_package_tarball} to S3...")
        model_s3_uri = session.upload_data(path=model_package_tarball, bucket=s3_bucket, key_prefix=s3_prefix)
        logger.info(f"Model uploaded to S3: {model_s3_uri}")
        
        logger.info("Creating SageMaker PyTorchModel...")
        # source_dir is not needed if entry_point and requirements.txt are inside the tarball's code/ directory.
        # The entry_point argument should be the name of your inference script, e.g., 'inference.py'.
        model = PyTorchModel(
            model_data=model_s3_uri,
            role=sagemaker_role_arn,
            entry_point=entry_point_script, 
            framework_version=framework_version,
            py_version=py_version,
            sagemaker_session=session
        )
        logger.info("SageMaker PyTorchModel created successfully.")
        
        logger.info(f"Starting endpoint deployment for {endpoint_name} on {instance_type}. This may take 5-10 minutes...")
        start_time = time.time()
        
        predictor = model.deploy(
            initial_instance_count=initial_instance_count,
            instance_type=instance_type,
            endpoint_name=endpoint_name,
            serializer=JSONSerializer(),
            deserializer=JSONDeserializer(),
            wait=True # Set to False if you don't want to wait for completion
        )
        
        end_time = time.time()
        logger.info(f"Endpoint {endpoint_name} deployed successfully in {end_time - start_time:.2f} seconds.")
        
        # Save endpoint name for later use (e.g. testing, cleanup)
        endpoint_file = 'sagemaker_endpoint_name.txt'
        with open(endpoint_file, 'w') as f:
            f.write(endpoint_name)
        logger.info(f"Endpoint name saved to {endpoint_file}")
            
        return predictor, endpoint_name
    except Exception as e:
        logger.error(f"Deployment failed: {str(e)}")
        # Consider adding cleanup for the S3 object if deployment fails mid-way
        raise

# Example usage (to be called in the main execution block)
# try:
#     # Assume model_package_tarball and sagemaker_role_arn are available
#     if 'model_package_tarball' in locals() and 'sagemaker_role_arn' in locals():
#         logger.info("Proceeding with deployment example...")
#         deployed_predictor, deployed_endpoint_name = deploy_sagemaker_model(
#             model_package_tarball=model_package_tarball, 
#             sagemaker_role_arn=sagemaker_role_arn,
#             instance_type='ml.t2.medium' # Use a cost-effective instance for testing
#         )
#         logger.info(f"Endpoint '{deployed_endpoint_name}' is now active.")
#     else:
#         logger.warning("Skipping deployment example as model package or role ARN is not defined.")
# except Exception as e:
#     logger.error(f"Deployment example failed: {e}")

### Prerequisites: Acquiring the Model Artifact

Before you can run the deployment, you need the sentence highlighting model artifact. The example script refers to `opensearch-semantic-highlighter-v1.pt`.

**How to obtain the model:**
*   **Pre-trained model:** If you are using a pre-trained `opensearch-semantic-highlighter-v1.pt`, download it from the repository or distribution that publishes it. This notebook does not bundle the file or link to a download location.
*   **Training your own:** If you have trained your own sentence highlighting model, save it in PyTorch's `.pt` format, for example with `torch.save(model, 'your_model.pt')` or `torch.jit.save(scripted_model, 'your_model.pt')`. The `model_fn` in your `inference.py` must load the file with the matching call (`torch.load` or `torch.jit.load`).

**For this notebook, you must:**
1.  Obtain the `opensearch-semantic-highlighter-v1.pt` file (or your own equivalent `.pt` model file).
2.  Update the `source_model_path` variable in the "Run SageMaker Deployment" section (Step 5) to point to the correct location of your model file on your local system.

### A Note on `inference.py`

The deployment process relies on an `inference.py` script, which tells SageMaker how to load your model and make predictions. A template `inference.py` is provided in the same directory as this notebook (`docs/source/examples/inference.py`).

**Key components of `inference.py`:**
*   **`model_fn(model_dir)`**: Loads your trained model. It expects the model artifacts (like `.pt` file, `config.json`, `vocab.txt`) to be in `model_dir` after SageMaker unpacks `model.tar.gz`.
*   **`input_fn(request_body, request_content_type)`**: Deserializes incoming request data.
*   **`predict_fn(input_data, model)`**: Performs inference using the loaded model and processed input. **You will likely need to customize this function based on your specific model's sentence highlighting logic.**
*   **`output_fn(prediction, response_content_type)`**: Serializes the prediction result to send back in the HTTP response.
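
To make the four handlers concrete, here is a minimal sketch of what such a script could look like. It is not the provided template. It assumes the artifact is a TorchScript file named `opensearch-semantic-highlighter-v1.pt`, that requests are JSON objects with a `text` field, and that the model can be called on the text directly. All three are assumptions to adapt to your model. The response shape with a `highlights` key is likewise hypothetical.

```python
import json
import os


def model_fn(model_dir):
    """Load the model from the directory where SageMaker unpacked model.tar.gz."""
    import torch  # imported lazily so the other handlers can be tested without PyTorch

    # ASSUMPTION: a TorchScript artifact with this file name
    model_path = os.path.join(model_dir, "opensearch-semantic-highlighter-v1.pt")
    model = torch.jit.load(model_path, map_location="cpu")
    model.eval()
    return model


def input_fn(request_body, request_content_type):
    """Deserialize a JSON request into a dict."""
    if request_content_type != "application/json":
        raise ValueError(f"Unsupported content type: {request_content_type}")
    if isinstance(request_body, (bytes, bytearray)):
        request_body = request_body.decode("utf-8")
    payload = json.loads(request_body)
    if "text" not in payload:  # hypothetical request schema
        raise ValueError("Request JSON must contain a 'text' field")
    return payload


def predict_fn(input_data, model):
    """Placeholder: replace with your model's sentence highlighting logic."""
    return {"highlights": model(input_data["text"])}


def output_fn(prediction, response_content_type):
    """Serialize the prediction as JSON."""
    if response_content_type != "application/json":
        raise ValueError(f"Unsupported response type: {response_content_type}")
    return json.dumps(prediction)


# Local smoke test with a stand-in callable instead of a real model
fake_model = lambda text: [text.split(". ")[0]]
parsed = input_fn(b'{"text": "First sentence. Second sentence."}', "application/json")
demo_response = output_fn(predict_fn(parsed, fake_model), "application/json")
print(demo_response)
```

The smoke test at the bottom is only for local experimentation and should be left out of the script you package.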

**Before running the deployment:**
1.  **Review `docs/source/examples/inference.py`**.
2.  **Customize `predict_fn`** and potentially `model_fn` if your model loading or inference logic is different from the template.
3.  Set the `inference_script_path` variable in the 'Run SageMaker Deployment' section below to your finalized `inference.py` script, for example `docs/source/examples/inference.py`. The configuration cell contains only a placeholder path, which you must replace.

### `requirements.txt` for the SageMaker Environment

The `inference.py` script might depend on specific Python packages (e.g., `torch`, `transformers`). You need to list these in a `requirements.txt` file, which will be included in the `model.tar.gz` package. SageMaker will use this file to install the necessary dependencies in the serving container.

A template `requirements.txt` is provided in the same directory as this notebook (`docs/source/examples/requirements.txt`).

**Before running the deployment:**
1.  **Review `docs/source/examples/requirements.txt`**.
2.  **Add or modify any dependencies** to match those required by your `inference.py` script and model.
3.  Set the `requirements_path` variable in the 'Run SageMaker Deployment' section below to your finalized `requirements.txt` file, for example `docs/source/examples/requirements.txt`. The configuration cell contains only a placeholder path, which you must replace.

## 5. Run SageMaker Deployment

Now we'll bring it all together. The cell below orchestrates the steps:
1.  **Define Paths**: You **must** update the `source_model_path`, `inference_script_path`, and `requirements_path` variables to point to your files.
2.  **Prepare Model Package**: Calls `prepare_model_package` to create `model.tar.gz`.
3.  **Create/Get SageMaker Role**: Calls `create_sagemaker_role` to ensure you have the necessary IAM permissions.
4.  **Deploy Model**: Calls `deploy_sagemaker_model` to deploy the packaged model to a SageMaker endpoint.

**Important Considerations:**
*   **File Paths**: Ensure the paths to your model (`.pt` file), `inference.py`, and `requirements.txt` are correct. For this example, it's assumed these files are located relative to the notebook, but you can use absolute paths.
*   **Instance Type**: The `instance_type` in `deploy_sagemaker_model` (e.g., `ml.m5.large`, `ml.g4dn.xlarge`) should be chosen based on your model's requirements (CPU/GPU, memory, etc.) and budget. GPU instances like `g4dn` are suitable for larger transformer models if your `inference.py` leverages the GPU. For CPU-based inference or smaller models, `m5` instances can be more cost-effective.
*   **AWS Permissions**: Ensure the AWS credentials used by `boto3` (implicitly configured via AWS CLI, environment variables, or IAM roles for SageMaker notebook instances) have permissions to:
    *   Create and manage IAM roles (if the role doesn't exist).
    *   List, create, and put objects in S3.
    *   Create and manage SageMaker endpoints and models.
*   **Region**: The SageMaker deployment will occur in the AWS region configured for your `boto3` session. You can explicitly set this if needed, e.g., `boto3.setup_default_session(region_name='us-west-2')`.

In [None]:
# --- Configuration: User needs to set these paths ---
# Option 1: Place your files in the same directory as this notebook
# and uncomment the lines below.
# current_dir = os.getcwd() 
# source_model_path = os.path.join(current_dir, "opensearch-semantic-highlighter-v1.pt")
# inference_script_path = os.path.join(current_dir, "inference.py")
# requirements_path = os.path.join(current_dir, "requirements.txt")

# Option 2: Specify absolute or relative paths directly.
# PLEASE UPDATE THESE PATHS
source_model_path = "path/to/your/opensearch-semantic-highlighter-v1.pt"
inference_script_path = "path/to/your/inference.py"  # e.g. docs/source/examples/inference.py
requirements_path = "path/to/your/requirements.txt"  # e.g. docs/source/examples/requirements.txt
    
# Choose your desired SageMaker instance type
# For CPU-only: 'ml.m5.large', 'ml.c5.large', etc.
# For GPU-enabled: 'ml.g4dn.xlarge', 'ml.g5.xlarge', etc. (ensure your inference.py uses the GPU)
sagemaker_instance_type = 'ml.m5.large' 
# --- End Configuration ---

# Variables to store results from steps
model_package_archive = None
sagemaker_role = None
deployed_endpoint = None
predictor_instance = None

try:
    # Validate every input path up front. This notebook does not generate any of
    # these files, so all three must exist before packaging starts.
    required_files = {
        "source_model_path": source_model_path,
        "inference_script_path": inference_script_path,
        "requirements_path": requirements_path,
    }
    for variable_name, file_path in required_files.items():
        if not os.path.exists(file_path):
            logger.error(f"File not found for '{variable_name}': {file_path}")
            raise FileNotFoundError(f"{variable_name} does not exist: {file_path}")

    logger.info("---- Step 1: Preparing Model Package ----")
    model_package_archive = prepare_model_package(
        source_model_path=source_model_path,
        inference_script_path=inference_script_path, # Path to user's or template script
        requirements_path=requirements_path     # Path to user's or template requirements
    )
    logger.info(f"Model package created: {model_package_archive}")

    logger.info("---- Step 2: Creating/Getting SageMaker IAM Role ----")
    sagemaker_role = create_sagemaker_role(role_name='MySageMakerExecutionRoleSH') # Use a specific role name
    logger.info(f"Using SageMaker Role ARN: {sagemaker_role}")

    logger.info("---- Step 3: Deploying Model to SageMaker ----")
    predictor_instance, deployed_endpoint = deploy_sagemaker_model(
        model_package_tarball=model_package_archive, 
        sagemaker_role_arn=sagemaker_role,
        instance_type=sagemaker_instance_type 
        # entry_point, framework_version, py_version use defaults from function definition
    )
    
    logger.info("----------------------------------------------------")
    logger.info("SageMaker Endpoint Deployed Successfully!")
    logger.info(f"Endpoint Name: {deployed_endpoint}")
    logger.info("Access your endpoint using the 'predictor_instance' object or AWS SDK.")
    logger.info("----------------------------------------------------")
    
except FileNotFoundError as fnf_error:
    logger.error(f"Deployment process failed due to a missing file: {fnf_error}")
    logger.error("Please check the paths provided and ensure all required files exist.")
except Exception as main_error:
    logger.error(f"An error occurred during the deployment process: {main_error}", exc_info=True)
    # Consider adding cleanup for partially created resources if applicable, e.g. S3 objects or roles.

## 6. Testing and Cleanup (Important)

### Testing the Endpoint
You can test your deployed endpoint using the `predictor_instance` object returned by `deploy_sagemaker_model`. Uncomment the lines below to run the test:

```python
# Example payload - this depends on your model's input_fn in inference.py
# For a sentence highlighter, it might expect a JSON object with a text field.
# payload = {"text": "This is a sample sentence. This is another one."}
# try:
#     if predictor_instance:
#         response = predictor_instance.predict(payload)
#         logger.info(f"Prediction response: {response}")
#     else:
#         logger.warning("Predictor instance is not available. Deployment might have failed.")
# except Exception as e:
#     logger.error(f"Error during prediction: {e}")
```
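
If the predictor object is no longer available (for example, after a kernel restart), the endpoint can also be called through the AWS SDK. The sketch below wraps the `invoke_endpoint` call of the `sagemaker-runtime` client. `invoke_highlighter` is a hypothetical helper, and the fake client exists only so the example runs without AWS access; its echoed response says nothing about what your model returns.

```python
import io
import json


def invoke_highlighter(runtime_client, endpoint_name, payload):
    """Send a JSON payload to the endpoint and return the decoded JSON response."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=json.dumps(payload),
    )
    # The response body is a stream; read it fully before decoding
    return json.loads(response["Body"].read().decode("utf-8"))


class _FakeRuntimeClient:
    """Stand-in for boto3.client('sagemaker-runtime') so the sketch runs offline."""

    def invoke_endpoint(self, **kwargs):
        echoed = {"endpoint": kwargs["EndpointName"], "request": json.loads(kwargs["Body"])}
        return {"Body": io.BytesIO(json.dumps(echoed).encode("utf-8"))}


result = invoke_highlighter(
    _FakeRuntimeClient(),
    "semantic-highlighter-demo",  # placeholder endpoint name
    {"text": "A sample sentence."},
)
print(result)
```

Against a real endpoint, pass `boto3.client("sagemaker-runtime")` as the client and read the endpoint name back from `sagemaker_endpoint_name.txt`, which `deploy_sagemaker_model` writes after a successful deployment.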

### Cleaning Up
SageMaker endpoints incur costs while they are running. **It's crucial to delete the endpoint when you are done to avoid ongoing charges.**

```python
# --- IMPORTANT: Delete the endpoint to avoid charges ---
# try:
#     if predictor_instance:
#         endpoint_to_delete = predictor_instance.endpoint_name
#         predictor_instance.delete_endpoint() # Deletes the endpoint and endpoint configuration
#         logger.info(f"Endpoint '{endpoint_to_delete}' deleted successfully.")
#
#         # Optionally, delete the model from SageMaker. The predictor has no
#         # model_name attribute, so set model_name yourself (for example, copy it
#         # from the Models page of the SageMaker console).
#         # model_name = "<your-model-name>"
#         # sagemaker_session = sagemaker.Session()
#         # sagemaker_session.delete_model(model_name)
#         # logger.info(f"Model '{model_name}' deleted successfully.")
#
#         # Clean up the S3 objects (optional). model_s3_uri is local to
#         # deploy_sagemaker_model, so set it here to the URI from the
#         # "Model uploaded to S3" log line.
#         # model_s3_uri = "s3://<bucket>/<prefix>/model.tar.gz"
#         # s3_uri_parts = model_s3_uri.replace("s3://", "").split("/")
#         # bucket_name = s3_uri_parts[0]
#         # key_prefix = "/".join(s3_uri_parts[1:])
#         # s3 = boto3.resource('s3')
#         # bucket = s3.Bucket(bucket_name)
#         # bucket.objects.filter(Prefix=key_prefix).delete()
#         # logger.info(f"Deleted model artifacts from S3: s3://{bucket_name}/{key_prefix}")
#
#         # Remove the locally created model.tar.gz and endpoint name file
#         if os.path.exists('model.tar.gz'):
#             os.remove('model.tar.gz')
#         if os.path.exists('sagemaker_endpoint_name.txt'):
#             os.remove('sagemaker_endpoint_name.txt')
#
#     elif deployed_endpoint: # If predictor_instance is None but we have the name
#         logger.warning(f"Predictor instance not found. Attempting to delete endpoint '{deployed_endpoint}' by name.")
#         sm_client = boto3.client('sagemaker')
#         sm_client.delete_endpoint(EndpointName=deployed_endpoint)
#         logger.info(f"Endpoint '{deployed_endpoint}' deletion initiated.")
#         # Note: Deleting endpoint config and model would require more logic if only name is available
#     else:
#         logger.info("No active endpoint or endpoint name found to delete.")
#
# except Exception as e:
#     logger.error(f"Error during cleanup: {e}")
```
**Note:** The cleanup script above for S3 and SageMaker models is more advanced. The most critical part is `predictor_instance.delete_endpoint()`. For full cleanup, you would also delete the SageMaker Model and the S3 artifacts if they are no longer needed.
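
The optional S3 cleanup needs the bucket and key of the uploaded archive. A minimal sketch, assuming you have the `s3://` URI that `deploy_sagemaker_model` logs after the upload; `split_s3_uri` is a hypothetical helper and the URI below is a placeholder.

```python
from urllib.parse import urlparse


def split_s3_uri(s3_uri):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {s3_uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


# Placeholder URI; substitute the one from the "Model uploaded to S3" log line
bucket_name, object_key = split_s3_uri(
    "s3://my-example-bucket/semantic-highlighter-script/model/model.tar.gz"
)
print(bucket_name, object_key)
```

With the two parts in hand, a single object can be removed with `boto3.client("s3").delete_object(Bucket=bucket_name, Key=object_key)`.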