# Building Your First Campaign

This notebook walks you through building a movie recommendation model from the MovieLens dataset. The goal is to recommend movies that are relevant to a particular user.

The data comes from the MovieLens project. You can learn more about the data and its potential uses by doing a web search during any of the waiting periods in the cells below.

## How to Use the Notebook

Code is broken up into cells like the one below. There's a triangular `Run` button at the top of this page that you can click to execute each cell and move on to the next, or you can press `Shift` + `Enter` while in the cell to execute it and move on to the next one.

While a cell is executing, the marker beside it shows an `*`. Once all the code in the cell has finished executing, the `*` is replaced by a number indicating the order in which the cell completed.


Follow the instructions below and execute the cells to get started with Amazon Personalize.

## A Brief Security Overview
Cloud security at AWS is the highest priority. As an AWS customer, you benefit from a data center and network architecture that is built to meet the requirements of the most security-sensitive organizations.

Security is a shared responsibility between AWS and you. [The shared responsibility model](http://aws.amazon.com/compliance/shared-responsibility-model/) describes this as security of the cloud and security in the cloud:

* **Security of the cloud –** AWS is responsible for protecting the infrastructure that runs AWS services in the AWS Cloud. AWS also provides you with services that you can use securely. Amazon Personalize uses data encryption to protect your data. For more information see [Data encryption](https://docs.aws.amazon.com/personalize/latest/dg/data-encryption.html). Third-party auditors regularly test and verify the effectiveness of our security as part of the [AWS Compliance Programs](http://aws.amazon.com/compliance/programs/). To learn about the compliance programs that apply to Amazon Personalize, see [AWS Services in Scope by Compliance Program](http://aws.amazon.com/compliance/services-in-scope/).


* **Security in the cloud –** Your responsibility is determined by the AWS service that you use. You are also responsible for other factors including the sensitivity of your data, your company’s requirements, and applicable laws and regulations.

After running your first campaign, make sure to check out [Security Best Practices](3.Best_Practices-Clientside.ipynb) in Part 3 of this workshop.


## Imports 

Python ships with a broad collection of libraries. We need to import some of those, along with installed packages such as [boto3](https://aws.amazon.com/sdk-for-python/) (the AWS SDK for Python) and [Pandas](https://pandas.pydata.org/)/[NumPy](https://numpy.org/), which are core data science tools.

In [1]:
# Imports
import boto3
import json
import numpy as np
import pandas as pd
import time
import os


In [None]:

!conda install -y -c conda-forge unzip

Next, validate that your environment can communicate successfully with Amazon Personalize. The lines below do just that.

In [2]:
# os.environ['AWS_DEFAULT_REGION'] = 'us-east-2'  # or your preferred region

# Configure the SDK to Personalize:
# personalize = boto3.client('personalize')
# personalize_runtime = boto3.client('personalize-runtime')

# Set your region
os.environ['AWS_DEFAULT_REGION'] = 'us-east-2'  # This matches the region you specified in SSO config

# Create a session with your SSO profile
session = boto3.Session(profile_name='767398024897_engineering')

# Create your clients using the session
personalize = session.client('personalize')
personalize_runtime = session.client('personalize-runtime')
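
Creating a client typically does not contact the service, so a credential or region problem may only surface on the first real request. The sketch below is one way to confirm connectivity early. It assumes a hypothetical helper name, `check_personalize_access`, and uses the read-only `list_dataset_groups` call as a cheap probe.

```python
def check_personalize_access(client):
    """Return True if a cheap read-only call succeeds, otherwise False.

    `client` is expected to behave like a boto3 Personalize client.
    The helper name is illustrative and not part of the workshop.
    """
    try:
        response = client.list_dataset_groups(maxResults=1)
    except Exception as error:  # credential, region, or permission problems
        print("Could not reach Amazon Personalize: {}".format(error))
        return False
    status_code = response["ResponseMetadata"]["HTTPStatusCode"]
    print("Connected. HTTP status: {}".format(status_code))
    return True
```

In this notebook it could be called as `check_personalize_access(personalize)` right after the clients are created.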


## Configure the data

Data is imported into Amazon Personalize through Amazon S3. Below we specify a bucket that you created within AWS for the purposes of this exercise.

The `bucket` variable must be set to the name of the bucket that you created earlier in the CloudFormation steps; this should be in a text file from your earlier work. The `filename` does not need to be changed.

### Bucket and Data Output Location
We are using the `personalize-s3-bucket` parameter stored in SSM during deployment of the CloudFormation template.

In [6]:
# Configure the SDK to SSM:
ssm = session.client('ssm')
personalizes3bucket = ssm.get_parameter(Name='/cloudformation/personalize-s3-bucket', WithDecryption=False)
bucket = personalizes3bucket['Parameter']['Value']
filename = "movie-lens-100k.csv"

### Download, Prepare, and Upload Training Data

You do not yet have the MovieLens data available locally. Execute the lines below to download the latest copy and take a quick look at it.

#### Download and Explore the Dataset

In [None]:
!wget -N https://files.grouplens.org/datasets/movielens/ml-100k.zip
!unzip -o ml-100k.zip


--2025-05-30 11:15:35--  https://files.grouplens.org/datasets/movielens/ml-100k.zip
Resolving files.grouplens.org (files.grouplens.org)... 128.101.65.152
Connecting to files.grouplens.org (files.grouplens.org)|128.101.65.152|:443... connected.
HTTP request sent, awaiting response... 304 Not Modified
File ‘ml-100k.zip’ not modified on server. Omitting download.

Archive:  ml-100k.zip
  inflating: ml-100k/allbut.pl       
  inflating: ml-100k/mku.sh          
  inflating: ml-100k/README          
  inflating: ml-100k/u.data          
  inflating: ml-100k/u.genre         
  inflating: ml-100k/u.info          
  inflating: ml-100k/u.item          
  inflating: ml-100k/u.occupation    
  inflating: ml-100k/u.user          
  inflating: ml-100k/u1.base         
  inflating: ml-100k/u1.test         
  inflating: ml-100k/u2.base         
  inflating: ml-100k/u2.test         
  inflating: ml-100k/u3.base         
  inflating: ml-100k/u3.test         
  inflating: ml-100k/u4.base         
  infl


In [16]:
data = pd.read_csv('./ml-100k/u.data', sep='\t', names=['USER_ID', 'ITEM_ID', 'RATING', 'TIMESTAMP'])
pd.set_option('display.max_rows', 5)


In [17]:
data



Unnamed: 0,USER_ID,ITEM_ID,RATING,TIMESTAMP
0,196,242,3,881250949
1,186,302,3,891717742
...,...,...,...,...
99998,13,225,2,882399156
99999,12,203,3,879959583


#### Prepare and Upload Data

As you can see, the data contains a user ID, item ID, rating, and timestamp.

We are now going to remove the interactions with low ratings, and then drop the Rating column before we build our model.

Once that is done, we save the result as a new CSV file and upload it to S3.

All of that is done by executing the lines in the cell below.

In [18]:
data = data[data['RATING'] > 3]                  # Keep only movies rated higher than 3 out of 5.
data = data[['USER_ID', 'ITEM_ID', 'TIMESTAMP']] # select columns that match the columns in the schema below
data.to_csv(filename, index=False)
session.resource('s3').Bucket(bucket).Object(filename).upload_file(filename)

### Create Schema

A core component of how Personalize understands your data is the schema defined below. This configuration tells the service how to digest the data provided in your CSV file. Note that the columns and types align with the file you created above.

In [19]:
schema = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {
            "name": "USER_ID",
            "type": "string"
        },
        {
            "name": "ITEM_ID",
            "type": "string"
        },
        {
            "name": "TIMESTAMP",
            "type": "long"
        }
    ],
    "version": "1.0"
}

create_schema_response = personalize.create_schema(
    name = "personalize-demo-schema",
    schema = json.dumps(schema)
)

schema_arn = create_schema_response['schemaArn']
print(json.dumps(create_schema_response, indent=2))

{
  "schemaArn": "arn:aws:personalize:us-east-2:767398024897:schema/personalize-demo-schema",
  "ResponseMetadata": {
    "RequestId": "2d868bfb-9dca-4f6c-ba4b-818ad5027449",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "date": "Fri, 30 May 2025 18:21:07 GMT",
      "content-type": "application/x-amz-json-1.1",
      "content-length": "89",
      "connection": "keep-alive",
      "x-amzn-requestid": "2d868bfb-9dca-4f6c-ba4b-818ad5027449",
      "strict-transport-security": "max-age=47304000; includeSubDomains",
      "x-frame-options": "DENY",
      "cache-control": "no-cache",
      "x-content-type-options": "nosniff"
    },
    "RetryAttempts": 0
  }
}
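
The alignment between the CSV columns and the schema fields can be checked locally before importing. The following is a minimal sketch using only the standard library; `compare_header_to_schema` is a hypothetical helper, and it compares column names only (not types or order).

```python
import csv


def compare_header_to_schema(csv_path, schema):
    """Compare a CSV header row with the field names of a schema dict.

    Returns (missing, unexpected): schema fields absent from the CSV,
    and CSV columns that the schema does not declare.
    """
    expected = [field["name"] for field in schema["fields"]]
    with open(csv_path, newline="") as handle:
        header = next(csv.reader(handle), [])
    missing = [name for name in expected if name not in header]
    unexpected = [name for name in header if name not in expected]
    return missing, unexpected
```

With the objects from this notebook, `compare_header_to_schema(filename, schema)` would be expected to return two empty lists, since the `RATING` column was dropped before the file was written.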


### Create and Wait for Dataset Group

The largest grouping in Personalize is a Dataset Group. It isolates your data, event trackers, solutions, and campaigns, grouping together resources that share a common collection of data. Feel free to alter the name below if you'd like.

#### Create Dataset Group

In [20]:
create_dataset_group_response = personalize.create_dataset_group(
    name = "personalize-launch-demo"
)

dataset_group_arn = create_dataset_group_response['datasetGroupArn']
print(json.dumps(create_dataset_group_response, indent=2))

{
  "datasetGroupArn": "arn:aws:personalize:us-east-2:767398024897:dataset-group/personalize-launch-demo",
  "ResponseMetadata": {
    "RequestId": "5f10dea1-c2be-46b2-94f8-2fe3b82ea8ed",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "date": "Fri, 30 May 2025 18:21:25 GMT",
      "content-type": "application/x-amz-json-1.1",
      "content-length": "102",
      "connection": "keep-alive",
      "x-amzn-requestid": "5f10dea1-c2be-46b2-94f8-2fe3b82ea8ed",
      "strict-transport-security": "max-age=47304000; includeSubDomains",
      "x-frame-options": "DENY",
      "cache-control": "no-cache",
      "x-content-type-options": "nosniff"
    },
    "RetryAttempts": 0
  }
}


#### Wait for Dataset Group to Have ACTIVE Status

Before we can use the Dataset Group in any of the steps below, it must be active. Execute the cell below and wait for it to show `ACTIVE`.

In [21]:
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_dataset_group_response = personalize.describe_dataset_group(
        datasetGroupArn = dataset_group_arn
    )
    status = describe_dataset_group_response["datasetGroup"]["status"]
    print("DatasetGroup: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

DatasetGroup: ACTIVE
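
The same polling pattern recurs for every resource in this notebook. As a sketch of how it could be factored out, the hypothetical helper below (`wait_until_ready` is not part of the workshop) polls a status callable and, unlike the loop above, raises on failure or timeout instead of falling through silently.

```python
import time


def wait_until_ready(describe, max_seconds=3 * 60 * 60, poll_seconds=60,
                     sleep=time.sleep):
    """Poll `describe()` until it reports a terminal status.

    `describe` is any zero-argument callable returning the current status
    string. `sleep` is injectable so the loop can be exercised quickly.
    """
    deadline = time.time() + max_seconds
    while time.time() < deadline:
        status = describe()
        print("Status: {}".format(status))
        if status == "ACTIVE":
            return status
        if status == "CREATE FAILED":
            raise RuntimeError("Resource creation failed")
        sleep(poll_seconds)
    raise TimeoutError(
        "Resource was not ACTIVE within {} seconds".format(max_seconds)
    )
```

For the dataset group, the callable could be `lambda: personalize.describe_dataset_group(datasetGroupArn=dataset_group_arn)["datasetGroup"]["status"]`.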


#### Create and Wait for Dataset

After the group, the next thing to create is the actual datasets. In this example we will create only one, for the interactions data. Execute the cells below to create it.

In [22]:
dataset_type = "INTERACTIONS"
create_dataset_response = personalize.create_dataset(
    name = "personalize-launch-interactions",
    datasetType = dataset_type,
    datasetGroupArn = dataset_group_arn,
    schemaArn = schema_arn
)

dataset_arn = create_dataset_response['datasetArn']
print(json.dumps(create_dataset_response, indent=2))

{
  "datasetArn": "arn:aws:personalize:us-east-2:767398024897:dataset/personalize-launch-demo/INTERACTIONS",
  "ResponseMetadata": {
    "RequestId": "0352ba17-5699-4fd1-b6e9-7e25ba6b7da9",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "date": "Fri, 30 May 2025 18:21:46 GMT",
      "content-type": "application/x-amz-json-1.1",
      "content-length": "104",
      "connection": "keep-alive",
      "x-amzn-requestid": "0352ba17-5699-4fd1-b6e9-7e25ba6b7da9",
      "strict-transport-security": "max-age=47304000; includeSubDomains",
      "x-frame-options": "DENY",
      "cache-control": "no-cache",
      "x-content-type-options": "nosniff"
    },
    "RetryAttempts": 0
  }
}


#### Wait for Dataset to Have ACTIVE Status

Before we can use the Dataset in any of the steps below, it must be active. Execute the cell below and wait for it to show `ACTIVE`.

In [23]:
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_dataset_response = personalize.describe_dataset(
        datasetArn = dataset_arn
    )
    status = describe_dataset_response["dataset"]["status"]
    print("Dataset: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(30)

Dataset: CREATE PENDING
Dataset: ACTIVE


#### S3 Bucket access to Personalize

Amazon Personalize needs to be able to read the contents of the S3 bucket created during deployment of the CloudFormation template. As with the bucket name, we read a parameter stored in SSM: `personalize-iam-role-arn`, which refers to the role and policy created during that deployment.

In [25]:
personalizeiamrolearn = ssm.get_parameter(Name='/cloudformation/personalize-iam-role-arn', WithDecryption=False)
personalizeiamrolearn = personalizeiamrolearn['Parameter']['Value']

## Import the data

Earlier you created the DatasetGroup and Dataset to house your information. Now you will execute an import job that loads the data from S3 into Amazon Personalize so it can be used to build your model.

#### Create Dataset Import Job

In [27]:
create_dataset_import_job_response = personalize.create_dataset_import_job(
    jobName = "personalize-demo-import1",
    datasetArn = dataset_arn,
    dataSource = {
        "dataLocation": "s3://{}/{}".format(bucket, filename)
    },
    roleArn = personalizeiamrolearn
)

dataset_import_job_arn = create_dataset_import_job_response['datasetImportJobArn']
print(json.dumps(create_dataset_import_job_response, indent=2))

{
  "datasetImportJobArn": "arn:aws:personalize:us-east-2:767398024897:dataset-import-job/personalize-demo-import1",
  "ResponseMetadata": {
    "RequestId": "564f882c-7fd8-4048-82f6-b12ee8ead30e",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "date": "Fri, 30 May 2025 18:28:08 GMT",
      "content-type": "application/x-amz-json-1.1",
      "content-length": "112",
      "connection": "keep-alive",
      "x-amzn-requestid": "564f882c-7fd8-4048-82f6-b12ee8ead30e",
      "strict-transport-security": "max-age=47304000; includeSubDomains",
      "x-frame-options": "DENY",
      "cache-control": "no-cache",
      "x-content-type-options": "nosniff"
    },
    "RetryAttempts": 0
  }
}


#### Wait for Dataset Import Job to Have ACTIVE Status

It can take a while before the import job completes. Please wait until you see that it is `ACTIVE` below.

In [28]:
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_dataset_import_job_response = personalize.describe_dataset_import_job(
        datasetImportJobArn = dataset_import_job_arn
    )
    status = describe_dataset_import_job_response["datasetImportJob"]['status']
    print("DatasetImportJob: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

DatasetImportJob: CREATE PENDING
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: ACTIVE
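
If the job ends in `CREATE FAILED` instead, the describe response usually carries an explanation. The sketch below assumes the `failureReason` field of the `datasetImportJob` payload; `describe_failure` is a hypothetical helper name.

```python
def describe_failure(resource):
    """Return the failure reason for a describe payload, or None.

    `resource` is the inner dict of a describe response, for example
    describe_dataset_import_job_response["datasetImportJob"].
    """
    if resource.get("status") != "CREATE FAILED":
        return None
    # Fall back to a placeholder when the service gives no reason.
    return resource.get("failureReason", "no failureReason reported")
```

For example, `describe_failure(describe_dataset_import_job_response["datasetImportJob"])` returns `None` for the successful run shown above.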


## Create the Solution and Version

In Amazon Personalize a trained model is called a Solution. Each Solution can have many specific versions, each reflecting the data that was available when the model was trained.

To begin, we will list all the recipes that are supported. A recipe is an algorithm that has not yet been trained on your data. After listing them, you'll select one and use it to build your model.

### Select Recipe

In [29]:
list_recipes_response = personalize.list_recipes()
list_recipes_response

{'recipes': [{'name': 'aws-ecomm-customers-who-viewed-x-also-viewed',
   'recipeArn': 'arn:aws:personalize:::recipe/aws-ecomm-customers-who-viewed-x-also-viewed',
   'status': 'ACTIVE',
   'creationDateTime': datetime.datetime(2019, 6, 9, 17, 0, tzinfo=tzlocal()),
   'lastUpdatedDateTime': datetime.datetime(2024, 6, 19, 8, 19, 55, 369000, tzinfo=tzlocal()),
   'domain': 'ECOMMERCE'},
  {'name': 'aws-ecomm-frequently-bought-together',
   'recipeArn': 'arn:aws:personalize:::recipe/aws-ecomm-frequently-bought-together',
   'status': 'ACTIVE',
   'creationDateTime': datetime.datetime(2019, 6, 9, 17, 0, tzinfo=tzlocal()),
   'lastUpdatedDateTime': datetime.datetime(2024, 6, 19, 8, 19, 55, 369000, tzinfo=tzlocal()),
   'domain': 'ECOMMERCE'},
  {'name': 'aws-ecomm-popular-items-by-purchases',
   'recipeArn': 'arn:aws:personalize:::recipe/aws-ecomm-popular-items-by-purchases',
   'status': 'ACTIVE',
   'creationDateTime': datetime.datetime(2019, 6, 9, 17, 0, tzinfo=tzlocal()),
   'lastUpdated

#### User Personalization
The [User-Personalization](https://docs.aws.amazon.com/personalize/latest/dg/native-recipe-new-item-USER_PERSONALIZATION.html) (aws-user-personalization) recipe is optimized for all USER_PERSONALIZATION recommendation scenarios. When recommending items, it uses automatic item exploration.

With automatic exploration, Amazon Personalize automatically tests different item recommendations, learns from how users interact with these recommended items, and boosts recommendations for items that drive better engagement and conversion. This improves item discovery and engagement when you have a fast-changing catalog, or when new items, such as news articles or promotions, are more relevant to users when fresh.

You can balance how much to explore (where items with less interactions data or relevance are recommended more frequently) against how much to exploit (where recommendations are based on what we know or relevance). Amazon Personalize automatically adjusts future recommendations based on implicit user feedback.

First, select the recipe by finding the ARN in the list of recipes above.

In [30]:
recipe_arn = "arn:aws:personalize:::recipe/aws-user-personalization" # aws-user-personalization selected for demo purposes
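
Rather than copying the ARN by hand, it can be looked up by name in the listing. This is a minimal sketch with a hypothetical helper, `find_recipe_arn`; it searches only the entries it is given, and since `list_recipes` results may be paginated, a name missing from one page is not necessarily unsupported.

```python
def find_recipe_arn(recipes, name):
    """Return the ARN of the recipe called `name`.

    `recipes` is a list of dicts shaped like list_recipes_response["recipes"].
    Raises LookupError when the name is not present in that list.
    """
    for recipe in recipes:
        if recipe["name"] == name:
            return recipe["recipeArn"]
    raise LookupError("Recipe not found in this listing: {}".format(name))
```

With the response above, the call would look like `find_recipe_arn(list_recipes_response["recipes"], "aws-user-personalization")`.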

### Create and Wait for Solution

First you will create the solution with the API, then you will create a version. It will take several minutes to train the model and thus create your version of a solution. Once it gets started and you are seeing the in progress notifications it is a good time to take a break, grab a coffee, etc.

#### Create Solution

In [31]:
create_solution_response = personalize.create_solution(
    name = "personalize-demo-soln-user-personalization",
    datasetGroupArn = dataset_group_arn,
    recipeArn = recipe_arn
)

solution_arn = create_solution_response['solutionArn']
print(json.dumps(create_solution_response, indent=2))

{
  "solutionArn": "arn:aws:personalize:us-east-2:767398024897:solution/personalize-demo-soln-user-personalization",
  "ResponseMetadata": {
    "RequestId": "5313fb8d-fe8f-4770-be43-31dd523a94a1",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "date": "Fri, 30 May 2025 18:33:38 GMT",
      "content-type": "application/x-amz-json-1.1",
      "content-length": "112",
      "connection": "keep-alive",
      "x-amzn-requestid": "5313fb8d-fe8f-4770-be43-31dd523a94a1",
      "strict-transport-security": "max-age=47304000; includeSubDomains",
      "x-frame-options": "DENY",
      "cache-control": "no-cache",
      "x-content-type-options": "nosniff"
    },
    "RetryAttempts": 0
  }
}


#### Create Solution Version

In [32]:
create_solution_version_response = personalize.create_solution_version(
    solutionArn = solution_arn
)

solution_version_arn = create_solution_version_response['solutionVersionArn']
print(json.dumps(create_solution_version_response, indent=2))

{
  "solutionVersionArn": "arn:aws:personalize:us-east-2:767398024897:solution/personalize-demo-soln-user-personalization/fe4d1c7a",
  "ResponseMetadata": {
    "RequestId": "fabe2d37-0839-4a23-a6c6-7a47eef5454b",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "date": "Fri, 30 May 2025 18:33:44 GMT",
      "content-type": "application/x-amz-json-1.1",
      "content-length": "128",
      "connection": "keep-alive",
      "x-amzn-requestid": "fabe2d37-0839-4a23-a6c6-7a47eef5454b",
      "strict-transport-security": "max-age=47304000; includeSubDomains",
      "x-frame-options": "DENY",
      "cache-control": "no-cache",
      "x-content-type-options": "nosniff"
    },
    "RetryAttempts": 0
  }
}


#### Wait for Solution Version to Have ACTIVE Status

This will take approximately 40-50 minutes.

In [33]:
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_solution_version_response = personalize.describe_solution_version(
        solutionVersionArn = solution_version_arn
    )
    status = describe_solution_version_response["solutionVersion"]["status"]
    print("SolutionVersion: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

SolutionVersion: CREATE PENDING
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: ACTIVE


#### Get Metrics of Solution Version

Now that your solution and version exist, you can obtain the metrics to judge the model's performance. These metrics are not particularly good because this is a small demo dataset, but with larger, more complex datasets you should see improvements.

In [34]:
get_solution_metrics_response = personalize.get_solution_metrics(
    solutionVersionArn = solution_version_arn
)

print(json.dumps(get_solution_metrics_response, indent=2))

{
  "solutionVersionArn": "arn:aws:personalize:us-east-2:767398024897:solution/personalize-demo-soln-user-personalization/fe4d1c7a",
  "metrics": {
    "coverage": 0.2887,
    "mean_reciprocal_rank_at_25": 0.1007,
    "normalized_discounted_cumulative_gain_at_10": 0.1205,
    "normalized_discounted_cumulative_gain_at_25": 0.1638,
    "normalized_discounted_cumulative_gain_at_5": 0.0787,
    "precision_at_10": 0.0292,
    "precision_at_25": 0.0207,
    "precision_at_5": 0.027
  },
  "ResponseMetadata": {
    "RequestId": "56b4a338-0874-47c1-9588-31c5dafaa9f1",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "date": "Fri, 30 May 2025 18:48:18 GMT",
      "content-type": "application/x-amz-json-1.1",
      "content-length": "425",
      "connection": "keep-alive",
      "x-amzn-requestid": "56b4a338-0874-47c1-9588-31c5dafaa9f1",
      "strict-transport-security": "max-age=47304000; includeSubDomains",
      "x-frame-options": "DENY",
      "cache-control": "no-cache",
      "x-conte

We recommend reading [the documentation](https://docs.aws.amazon.com/personalize/latest/dg/working-with-training-metrics.html) to understand the metrics, but we have also copied parts of the documentation below for convenience.

You need to understand the following terms regarding evaluation in Personalize:

- *Relevant recommendation* refers to a recommendation that matches a value in the testing data for the particular user.
- *Rank* refers to the position of a recommended item in the list of recommendations. Position 1 (the top of the list) is presumed to be the most relevant to the user.
- *Query* refers to the internal equivalent of a GetRecommendations call.

The metrics produced by Personalize are:

- coverage: The proportion of unique recommended items from all queries out of the total number of unique items in the training data (includes both the Items and Interactions datasets).
- mean_reciprocal_rank_at_25: The [mean of the reciprocal ranks](https://en.wikipedia.org/wiki/Mean_reciprocal_rank) of the first relevant recommendation out of the top 25 recommendations over all queries. This metric is appropriate if you're interested in the single highest ranked recommendation.
- normalized_discounted_cumulative_gain_at_K: Discounted gain assumes that recommendations lower on a list of recommendations are less relevant than higher recommendations. Therefore, each recommendation is discounted (given a lower weight) by a factor dependent on its position. To produce the [cumulative discounted gain](https://en.wikipedia.org/wiki/Discounted_cumulative_gain) (DCG) at K, each relevant discounted recommendation in the top K recommendations is summed together. The normalized discounted cumulative gain (NDCG) is the DCG divided by the ideal DCG such that NDCG is between 0 and 1. (The ideal DCG is where the top K recommendations are sorted by relevance.) Amazon Personalize uses a weighting factor of 1/log(1 + position), where the top of the list is position 1. This metric rewards relevant items that appear near the top of the list, because the top of a list usually draws more attention.
- precision_at_K: The number of relevant recommendations out of the top K recommendations divided by K. This metric rewards precise recommendation of the relevant items.
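
To make the definitions above concrete, here is a small sketch that computes the per-query quantities for one toy recommendation list. It is an illustration under stated assumptions, not the service's internal implementation: relevance is treated as binary, the item IDs are made up, and the mean reciprocal rank would be the average of the reciprocal rank over all queries.

```python
import math


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    return sum(1 for item in recommended[:k] if item in relevant) / k


def reciprocal_rank_at_k(recommended, relevant, k):
    """1 / position of the first relevant item in the top k, else 0."""
    for position, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def ndcg_at_k(recommended, relevant, k):
    """DCG with weight 1/log(1 + position), divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log(1 + position)
        for position, item in enumerate(recommended[:k], start=1)
        if item in relevant
    )
    # Ideal ordering places every relevant item at the top of the list.
    ideal_hits = min(k, len(relevant))
    ideal = sum(1.0 / math.log(1 + p) for p in range(1, ideal_hits + 1))
    return dcg / ideal if ideal else 0.0


recommended = ["a", "b", "c", "d", "e"]  # hypothetical ranked item IDs
relevant = {"b", "e"}                    # hypothetical held-out interactions

print(precision_at_k(recommended, relevant, 5))        # 0.4
print(reciprocal_rank_at_k(recommended, relevant, 5))  # 0.5
print(ndcg_at_k(recommended, relevant, 5))             # roughly 0.62
```

Because NDCG is a ratio of two sums that use the same logarithm, the base of the logarithm does not affect the result.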

## Create and Wait for the Campaign

Now that you have a working solution version you will need to create a campaign to use it with your applications. A campaign is a hosted solution version; an endpoint which you can query for recommendations. Pricing is set by estimating throughput capacity (requests from users for personalization per second). When deploying a campaign, you set a minimum transactions per second (TPS) value (`minProvisionedTPS`). This service, like many within AWS, will automatically scale based on demand, but if latency is critical, you may want to provision ahead for larger demand. For this demo, the minimum throughput threshold is set to 1. For more information, see the [pricing](https://aws.amazon.com/personalize/pricing/) page.

As mentioned above, the user-personalization recipe used for our solution supports automatic exploration of "cold" items. You can control how much exploration is performed when creating your campaign. The `itemExplorationConfig` data type supports `explorationWeight` and `explorationItemAgeCutOff` parameters. Exploration weight determines how frequently recommendations include items with less interactions data or relevance. The closer the value is to 1.0, the more exploration. At zero, no exploration occurs and recommendations are based on current data (relevance). Exploration item age cut-off determines items to be explored based on time frame since latest interaction. Provide the maximum item age, in days since the latest interaction, to define the scope of item exploration. The larger the value, the more items are considered during exploration. For our campaign below, we'll specify an exploration weight of 0.5.
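
The two exploration parameters can be assembled into a `campaignConfig` dictionary programmatically. The sketch below uses a hypothetical helper, `build_campaign_config`, and passes values as strings, matching how the exploration weight is supplied in this notebook. The range check reflects the 0 to 1 scale described above; the positive-age check is an assumption.

```python
def build_campaign_config(exploration_weight, item_age_cutoff_days=None):
    """Build a campaignConfig dict with item exploration settings.

    exploration_weight: 0 (no exploration) up to 1 (most exploration).
    item_age_cutoff_days: optional maximum item age, in days since the
    latest interaction, that defines the scope of exploration.
    """
    if not 0.0 <= exploration_weight <= 1.0:
        raise ValueError("exploration_weight must be between 0 and 1")
    exploration = {"explorationWeight": str(exploration_weight)}
    if item_age_cutoff_days is not None:
        if item_age_cutoff_days <= 0:
            raise ValueError("item_age_cutoff_days must be positive")
        exploration["explorationItemAgeCutOff"] = str(item_age_cutoff_days)
    return {"itemExplorationConfig": exploration}
```

Calling `build_campaign_config(0.5)` produces the same structure that is passed as `campaignConfig` when the campaign is created.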

#### Create Campaign

In [35]:
create_campaign_response = personalize.create_campaign(
    name = "personalize-demo-camp",
    solutionVersionArn = solution_version_arn,
    minProvisionedTPS = 1,
    campaignConfig = {
        "itemExplorationConfig": {
            "explorationWeight": "0.5"
        }
    }
)

campaign_arn = create_campaign_response['campaignArn']
print(json.dumps(create_campaign_response, indent=2))

{
  "campaignArn": "arn:aws:personalize:us-east-2:767398024897:campaign/personalize-demo-camp",
  "ResponseMetadata": {
    "RequestId": "2a5a9513-614d-45eb-9ba7-cd6d801f3179",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "date": "Fri, 30 May 2025 18:48:18 GMT",
      "content-type": "application/x-amz-json-1.1",
      "content-length": "91",
      "connection": "keep-alive",
      "x-amzn-requestid": "2a5a9513-614d-45eb-9ba7-cd6d801f3179",
      "strict-transport-security": "max-age=47304000; includeSubDomains",
      "x-frame-options": "DENY",
      "cache-control": "no-cache",
      "x-content-type-options": "nosniff"
    },
    "RetryAttempts": 0
  }
}


#### Wait for Campaign to Have ACTIVE Status

This should take about 10 minutes.

In [36]:
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_campaign_response = personalize.describe_campaign(
        campaignArn = campaign_arn
    )
    status = describe_campaign_response["campaign"]["status"]
    print("Campaign: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: ACTIVE


## Get Sample Recommendations

After the campaign is active you are ready to get recommendations. First we need to select a random user from the collection. Then we will create a few helper functions for getting movie information to show for recommendations instead of just IDs.

In [37]:
# Getting a random user:
user_id, item_id, _ = data.sample().values[0]
print("USER: {}".format(user_id))

USER: 592


In [38]:
# First load items into memory
items = pd.read_csv('./ml-100k/u.item', sep='|', usecols=[0,1], encoding='latin-1', names=['ITEM_ID', 'TITLE'], index_col='ITEM_ID')

def get_movie_title(movie_id):
    """
    Takes in an item ID, returns the matching title
    """
    # Look up by ITEM_ID label (the index) rather than by row position
    return items.loc[int(movie_id), 'TITLE']


#### Call GetRecommendations

Using the user that you obtained above, the lines below will get recommendations for you and return the list of movies that are recommended.


In [39]:
get_recommendations_response = personalize_runtime.get_recommendations(
    campaignArn = campaign_arn,
    userId = str(user_id),
)
# Update DF rendering
pd.set_option('display.max_rows', 30)

print("Recommendations for user: ", user_id)

item_list = get_recommendations_response['itemList']

recommendation_list = []

for item in item_list:
    title = get_movie_title(item['itemId'])
    recommendation_list.append(title)
    
recommendations_df = pd.DataFrame(recommendation_list, columns = ['OriginalRecs'])
recommendations_df

Recommendations for user:  592


Unnamed: 0,OriginalRecs
0,Apt Pupil (1998)
1,Ulee's Gold (1997)
2,"Apostle, The (1997)"
3,Chasing Amy (1997)
4,Boogie Nights (1997)
5,Amistad (1997)
6,Gattaca (1997)
7,Cop Land (1997)
8,"Ice Storm, The (1997)"
9,As Good As It Gets (1997)
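
Each entry in `itemList` can carry more than the item ID. As a sketch, the hypothetical helper below (`summarize_recommendations`) pairs each recommendation with its rank, title, and score when one is present. It assumes item IDs are integer-like strings, as in this dataset, and uses a placeholder title for IDs missing from the lookup.

```python
def summarize_recommendations(item_list, titles):
    """Turn a GetRecommendations itemList into rank/title/score rows.

    `titles` maps integer item IDs to movie titles.
    """
    rows = []
    for rank, item in enumerate(item_list, start=1):
        item_id = int(item["itemId"])
        rows.append({
            "rank": rank,
            "item_id": item_id,
            "title": titles.get(item_id, "<unknown>"),
            "score": item.get("score"),  # None when no score is returned
        })
    return rows
```

In this notebook the lookup could be built with `items['TITLE'].to_dict()`, and the resulting rows passed to `pd.DataFrame` for display.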


## Review

Using the code above you have successfully trained a deep learning model to generate movie recommendations based on prior user behavior. Think about other types of problems where this kind of data is available and what it might look like to build a system like this to offer those recommendations.

Now you are ready to move on to the next notebook, `2.View_Campaign_And_Interactions.ipynb`.



## Notes for the Next Notebook:

There are a few values you will need for the next notebook, execute the cells below to store them so they can be copied and pasted into the next part of the exercise.

In [40]:
%store campaign_arn

Stored 'campaign_arn' (str)




In [41]:
%store dataset_group_arn

Stored 'dataset_group_arn' (str)




In [42]:
%store solution_version_arn

Stored 'solution_version_arn' (str)




In [43]:
%store solution_arn

Stored 'solution_arn' (str)




In [44]:
%store dataset_arn

Stored 'dataset_arn' (str)





In [46]:
%store schema_arn

Stored 'schema_arn' (str)




In [47]:
%store bucket

Stored 'bucket' (str)




In [48]:
%store filename

Stored 'filename' (str)




In [49]:
%store recommendations_df

Stored 'recommendations_df' (DataFrame)




In [50]:
%store user_id

Stored 'user_id' (int64)


