# Deploy Pre-trained Summarization Model to SageMaker

This notebook deploys the pre-trained `facebook/bart-large-cnn` summarization model to a SageMaker real-time endpoint. No training is required.

In [None]:
# Install required libraries
!pip install --upgrade pip
!pip install boto3 sagemaker
!pip install transformers

In [None]:
# Import required libraries
import boto3
import sagemaker
import json
from sagemaker.huggingface import HuggingFaceModel

# Initialize SageMaker session
session = sagemaker.Session()
role = sagemaker.get_execution_role()

print(f"SageMaker role: {role}")
print(f"SageMaker session bucket: {session.default_bucket()}")

In [None]:
# Define the endpoint name - IMPORTANT: Use the same name expected by your React app
endpoint_name = "summarizer-endpoint"

# Choose instance type - m5.xlarge is a good balance of performance and cost
# You can also try ml.t3.medium for free tier compatibility, but it might be slower
instance_type = "ml.m5.xlarge"  # or "ml.t3.medium" for free tier

print(f"Using instance type: {instance_type}")
print(f"Using endpoint name: {endpoint_name}")

In [None]:
# Configure the pre-trained Hugging Face model
hub = {
    'HF_MODEL_ID': 'facebook/bart-large-cnn',  # model_id from hf.co/models
    'HF_TASK': 'summarization'                 # NLP task you want to use
}

# Create HuggingFaceModel
huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version='4.17.0',
    pytorch_version='1.10.2',
    py_version='py38',
    entry_point='inference.py',  # Custom inference script
    source_dir='../backend/',    # Path to inference.py file
)

In [None]:
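# Check whether the endpoint already exists and remove it before redeploying
from botocore.exceptions import ClientError

sm_client = boto3.client('sagemaker')

try:
    sm_client.describe_endpoint(EndpointName=endpoint_name)
    print(f"Endpoint {endpoint_name} already exists. Deleting it before deploying new model...")
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    # Block until deletion has finished; a fixed sleep may return too early
    print("Waiting for endpoint to be deleted...")
    sm_client.get_waiter('endpoint_deleted').wait(EndpointName=endpoint_name)
except ClientError as e:
    if 'Could not find endpoint' in str(e):
        print(f"Endpoint {endpoint_name} does not exist yet. Will create it.")
    else:
        print(f"Error checking endpoint: {e}")

# deploy() typically names the endpoint configuration after the endpoint, so a
# configuration left over from an earlier deployment can make the next deploy fail
try:
    sm_client.delete_endpoint_config(EndpointConfigName=endpoint_name)
    print(f"Deleted leftover endpoint configuration {endpoint_name}.")
except ClientError:
    pass  # No leftover configuration to remove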

In [None]:
# Deploy the model
print(f"Deploying model to endpoint: {endpoint_name}")
print("This may take 5-10 minutes...")

try:
    predictor = huggingface_model.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
        endpoint_name=endpoint_name
    )
    print(f"Model deployed successfully to endpoint: {endpoint_name}")
except Exception as e:
    print(f"Error deploying model: {e}")
    print("\nTroubleshooting steps:")
    print("1. Check your AWS account limits in the AWS Console")
    print("2. Verify that you have sufficient permissions")
    print("3. Check if there are existing endpoints with the same name")
    print("4. Try using a different instance type")
    raise  # Re-raise so later cells do not run against a missing endpoint

In [None]:
# Test the endpoint with a sample text
import json  # Already imported above; repeated so this cell runs on its own

sample_text = """
The Chrysler Building, the famous art deco New York skyscraper, will be sold for a small fraction of its previous sales price. The deal, first reported by The Real Deal, was for $150 million, according to a source familiar with the deal. Mubadala, an Abu Dhabi investment fund, purchased 90% of the building for $800 million in 2008. Real estate firm Tishman Speyer had owned the other 10%. The buyer is RFR Holding, a New York real estate company. Officials with Tishman and RFR did not immediately respond to a request for comments. It's unclear when the deal will close. The building sold fairly quickly after being publicly placed on the market only two months ago. The sale was handled by CBRE Group. The incentive to sell the building at such a huge loss was due to the soaring rent the owners pay to Cooper Union, a New York college, for the land under the building. The rent is rising from $7.75 million last year to $32.5 million this year to $41 million in 2028. Meantime, rents in the building itself are not rising nearly that fast. While the building is an iconic landmark in the New York skyline, it is competing against newer office towers with large floor plans that are preferred by many tenants. The Chrysler Building was briefly the world's tallest, before it was surpassed by the Empire State Building, which was completed the following year.
"""

try:
    # Create a SageMaker runtime client for prediction
    runtime_client = boto3.client('sagemaker-runtime')
    
    # Send test data to the endpoint
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType='application/json',
        Body=json.dumps({'text': sample_text})
    )
    
    # Process the response
    result = json.loads(response['Body'].read().decode())
    
    print("Generated summary:")
    print(result['summary'])
except Exception as e:
    print(f"Error during inference: {e}")
    print("This could be due to an issue with the endpoint setup.")
    print("Check the CloudWatch logs for the endpoint for more details.")

## Conclusion

You've now deployed a pre-trained summarization model to AWS SageMaker without having to train anything yourself. This endpoint will work with your existing React application as long as:

1. The endpoint name matches what's expected in your React app's configuration
2. The request and response JSON match what the app sends and expects (in the test above, `{"text": ...}` in and `{"summary": ...}` out)
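
The second condition depends on the custom `inference.py`, which this notebook does not show. As a rough sketch only, handlers that produce the `{"text": ...}` to `{"summary": ...}` contract used in the test cell could look like the following. `input_fn`, `predict_fn`, and `output_fn` are the override hooks of the SageMaker Hugging Face Inference Toolkit. The bodies are assumptions, as is the premise that the loaded model is a `transformers` summarization pipeline returning `[{"summary_text": ...}]`.

```python
import json


def input_fn(input_data, content_type):
    """Parse the request body and return the text to summarize."""
    if content_type != "application/json":
        raise ValueError(f"Unsupported content type: {content_type}")
    payload = json.loads(input_data)  # accepts str or bytes
    return payload["text"]


def predict_fn(data, model):
    """Run the model and reshape its output into the app's response format.

    Assumption: `model` behaves like a summarization pipeline and returns
    a list such as [{"summary_text": "..."}].
    """
    outputs = model(data)
    return {"summary": outputs[0]["summary_text"]}


def output_fn(prediction, accept):
    """Serialize the prediction as JSON for the response body."""
    return json.dumps(prediction)
```

If the real `inference.py` uses different keys, the React app and the test cell above need to use those keys instead.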

Remember: The endpoint incurs charges for as long as it is active, whether or not it receives requests. To minimize costs, delete the endpoint when you're not using it.
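
One way to spot endpoints that are still running is to list those with status `InService`. The helper below is a minimal sketch: `in_service_endpoint_names` is a hypothetical name, and it assumes a boto3 SageMaker client whose `list_endpoints` call accepts `StatusEquals` and `NextToken` and returns `Endpoints` entries with an `EndpointName`.

```python
def in_service_endpoint_names(sm_client):
    """Return the names of all endpoints that are currently in service."""
    names = []
    kwargs = {"StatusEquals": "InService"}
    while True:
        page = sm_client.list_endpoints(**kwargs)
        names.extend(ep["EndpointName"] for ep in page.get("Endpoints", []))
        token = page.get("NextToken")
        if not token:
            return names
        # More results remain; request the next page
        kwargs["NextToken"] = token


# Usage in this notebook (requires AWS credentials):
# print(in_service_endpoint_names(boto3.client("sagemaker")))
```

The client is passed in as an argument so the helper can be reused with the `sm_client` created earlier.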

In [None]:
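# Uncomment and run this cell to delete the endpoint when you're done
# sm_client.delete_endpoint(EndpointName=endpoint_name)
# Also remove the endpoint configuration, assuming it shares the endpoint's name
# sm_client.delete_endpoint_config(EndpointConfigName=endpoint_name)
# print(f"Deletion of endpoint {endpoint_name} requested")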