# Retail Demo Store - Personalization Workshop

Welcome to the Retail Demo Store Personalization Workshop. In this module we're going to be adding three core personalization features powered by [Amazon Personalize](https://aws.amazon.com/personalize/): related product recommendations on the product detail page, personalized recommendations on the Retail Demo Store homepage, and personalized ranking of items on the featured product page. This will allow us to give our users targeted recommendations based on their activity.

Recommended Time: 2 Hours

## Setup

To get started, we need to perform a bit of setup. Walk through each of the following steps to configure your environment to interact with the Amazon Personalize Service.

### Import Dependencies and Setup Boto3 Python Clients

Throughout this workshop we will need access to some common libraries and clients for connecting to AWS services.

In [1]:
# Import Dependencies

import boto3
import json
import pandas as pd
import time
import requests
import csv

from random import randint
from botocore.exceptions import ClientError

# Setup Clients

personalize = boto3.client('personalize')
personalize_runtime = boto3.client('personalize-runtime')

servicediscovery = boto3.client('servicediscovery')
ssm = boto3.client('ssm')

### Configure Bucket and Data Output Location

We will be configuring some variables that will store the location of our source data. When the Retail Demo Store stack was deployed in this account, an S3 bucket was created for you and the name of this bucket was stored in Systems Manager Parameter Store. Using the Boto3 client we can get the name of this bucket for use within our Notebook.

In [2]:
response = ssm.get_parameter(
    Name='retaildemostore-stack-bucket'
)

bucket = response['Parameter']['Value']     # Do Not Change
items_filename = "items.csv"                # Do Not Change
users_filename = "users.csv"                # Do Not Change
interactions_filename = "interactions.csv"  # Do Not Change

print('Bucket: {}'.format(bucket))

Bucket: retaildemostore2-base-13wx5tftqdf5j-b-stackbucket-yr29l9ipgd1s


## Get, Prepare, and Upload User, Product, and Interaction Data

Amazon Personalize provides predefined recipes, based on common use cases, for training models. A recipe is a machine learning algorithm that you use with settings, or hyperparameters, and the data you provide to train an Amazon Personalize model. The data you provide to train a model are organized into separate datasets by the type of data being provided. A collection of datasets are organized into a dataset group. The three dataset types supported by Personalize are items, users, and interactions. Depending on the recipe type you choose, a different combination of dataset types are required. For all recipe types, an interactions dataset is required. Interactions represent how users interact with items. For example, viewing a product, watching a video, listening to a recording, or reading an article. For this workshop, we will be using a recipe that supports all three dataset types.

When we deployed the Retail Demo Store, it was deployed with an initial seed of fictitious User and Product data. We will use this data to train three models in the Amazon Personalize service which will be used to serve product recommendations, related items,  and to rerank product lists for our users. The User and Product data can be accessed from the Retail Demo Store's Users and Products microservices, respectively. We will access our data through microservice data APIs, process the data, and upload them as CSVs to S3. Once our datasets are in S3, we can import them into the Amazon Personalize service.

Let's get started.

### Get Products Service Instance

We will be pulling our Product data from the Products Service that was deployed in Amazon Elastic Container Service as part of the Retail Demo Store. To connect to this service we will use [AWS Cloud Map](https://aws.amazon.com/cloud-map/)'s Service Discovery to discover an instance of the Product Service, and then connect directly to that service instances to access our data.

In [3]:
response = servicediscovery.discover_instances(
    NamespaceName='retaildemostore.local',
    ServiceName='products',
    MaxResults=1,
    HealthStatus='HEALTHY'
)

products_service_instance = response['Instances'][0]['Attributes']['AWS_INSTANCE_IPV4']
print('Products Service Instance IP: {}'.format(products_service_instance))

Products Service Instance IP: 10.215.20.30


#### Download and Explore the Products Dataset

In [4]:
response = requests.get('http://{}/products/all'.format(products_service_instance))
products = response.json()
products_df = pd.DataFrame(products)
pd.set_option('display.max_rows', 5)

products_df

Unnamed: 0,id,url,sk,name,category,style,description,price,image,featured
0,36,http://d1jmev7kvuszlh.cloudfront.net/#/product/36,,Exercise Headphones,electronics,headphones,These stylishly red ear buds wrap securely aro...,19.99,5.jpg,
1,49,http://d1jmev7kvuszlh.cloudfront.net/#/product/49,,Light Brown Leather Lace-Up Boot,footwear,boot,Sturdy enough for the outdoors yet stylish to ...,89.95,11.jpg,
...,...,...,...,...,...,...,...,...,...,...
54,43,http://d1jmev7kvuszlh.cloudfront.net/#/product/43,,Beauty Mask,beauty,grooming,Remove dirt and revitalize skin with this blac...,9.99,9.jpg,
55,14,http://d1jmev7kvuszlh.cloudfront.net/#/product/14,,Coffee Gift Package,housewares,consumable,Mug and Coffee gift set combination package.,39.99,2.jpg,


#### Prepare and Upload Data

When training models in Amazon Personalize, we can provide meta data about our items. For this workshop we will add each product's category and style to the item dataset. The product's unique identifier is required. Then we will rename the columns in our dataset to match our schema (defined later) and those expected by Personalize. Finally, we will save our dataset as a CSV and copy it to our S3 bucket.

In [5]:
products_dataset_df = products_df[['id','category','style']]
products_dataset_df = products_dataset_df.rename(columns = {'id':'ITEM_ID','category':'CATEGORY','style':'STYLE'}) 

products_dataset_df.to_csv(items_filename, index=False)
boto3.Session().resource('s3').Bucket(bucket).Object(items_filename).upload_file(items_filename)

### Get Users Service Instance

We will be pulling our User data from the Users Service that is deployed as part of the Retail Demo Store. To connect to this service we will use Service Discovery to discover an instance of the User Service, and then connect directly to that service instance to access our data.

In [6]:
response = servicediscovery.discover_instances(
    NamespaceName='retaildemostore.local',
    ServiceName='users',
    MaxResults=1,
    HealthStatus='HEALTHY'
)

users_service_instance = response['Instances'][0]['Attributes']['AWS_INSTANCE_IPV4']
print('Users Service Instance IP: {}'.format(users_service_instance))

Users Service Instance IP: 10.215.10.70


#### Download and Explore the Users Dataset

In [7]:
response = requests.get('http://{}/users/all?count=5000'.format(users_service_instance))
users = response.json()
users_df = pd.DataFrame(users)
pd.set_option('display.max_rows', 5)

users_df

Unnamed: 0,id,username,email,first_name,last_name,addresses,age,gender,persona
0,0,user0,sherry.diaz@example.com,Sherry,Diaz,"[{'first_name': 'Sherry', 'last_name': 'Diaz',...",34,F,electronics_beauty
1,1,user1,joseph.murray@example.com,Joseph,Murray,"[{'first_name': 'Joseph', 'last_name': 'Murray...",33,M,footwear_outdoors
...,...,...,...,...,...,...,...,...,...
4998,4998,user4998,john.morales@example.com,John,Morales,"[{'first_name': 'John', 'last_name': 'Morales'...",46,M,jewelry_accessories
4999,4999,user4999,nathan.chen@example.com,Nathan,Chen,"[{'first_name': 'Nathan', 'last_name': 'Chen',...",31,M,footwear_outdoors


#### Prepare and Upload Data

Similar to the items dataset we created above, we can provide metadata on our users when training models in Personalize. For this workshop we will include each user's age and gender. As before, we will name the columns to match our schema, save the data as a CSV, and upload to our S3 bucket.

In [8]:
users_dataset_df = users_df[['id','age','gender']]
users_dataset_df = users_dataset_df.rename(columns = {'id':'USER_ID','age':'AGE','gender':'GENDER'}) 

users_dataset_df.to_csv(users_filename, index=False)
boto3.Session().resource('s3').Bucket(bucket).Object(users_filename).upload_file(users_filename)

### Create User-Items Interactions Dataset

To mimic user behavior, we will be generating a new dataset that represents user interactions with items. To make the interactions more realistic, we will use the pre-defined shopper persona for each user to generate event types for products matching that persona.

In [9]:
# Minimum number of interactions to generate
min_interactions = 500000

# Percentages of each event type to generate
product_added_percent = .08
cart_viewed_percent = .05
checkout_started_percent = .02
order_completed_percent = .01

# Count of interactions generated for each event type
product_viewed_count = 0
product_added_count = 0
cart_viewed_count = 0
checkout_started_count = 0
order_completed_count = 0

# How many days in the past (from now) to start generating interactions
days_back = 90

next_timestamp = int(time.time()) - (days_back * 24 * 60 * 60)
seconds_increment = int((time.time() - next_timestamp) / min_interactions)

assert seconds_increment > 0, "Increase days_back or reduce min_interactions"

print("Days back: " + str(days_back))
print("Starting timestamp: " + str(next_timestamp))
print("Seconds increment: " + str(seconds_increment))

f = csv.writer(open("interactions.csv", "w"))
f.writerow(["ITEM_ID", "USER_ID", "EVENT_TYPE", "TIMESTAMP"])

print("Generating Interactions...")
interactions = 0

while interactions <= min_interactions:
    
    # Pick a Random User
    user = users[randint(0, len(users)-1)]
    
    # Pick a Random Product
    product = products[randint(0, len(products)-1)]
    
    # Determine categories from user's persona
    persona = user['persona']
    preferred_categories = persona.split('_')
    
    # If Product Category matches a Persona Category, Record Interaction
    if product['category'] in preferred_categories:
        this_timestamp = next_timestamp + randint(0, seconds_increment)

        f.writerow([int(product['id']),
                    int(user['id']), 
                    'ProductViewed',
                    this_timestamp])
        
        next_timestamp += seconds_increment
        product_viewed_count += 1
        interactions += 1

        if product_added_count < int(product_viewed_count * product_added_percent):
            this_timestamp += randint(0, int(seconds_increment / 2))
            f.writerow([int(product['id']),
                        int(user['id']), 
                        'ProductAdded',
                        this_timestamp])
            interactions += 1
            product_added_count += 1

        if cart_viewed_count < int(product_viewed_count * cart_viewed_percent):
            this_timestamp += randint(0, int(seconds_increment / 2))
            f.writerow([int(product['id']),
                        int(user['id']), 
                        'CartViewed',
                        this_timestamp])
            interactions += 1
            cart_viewed_count += 1

        if checkout_started_count < int(product_viewed_count * checkout_started_percent):
            this_timestamp += randint(0, int(seconds_increment / 2))
            f.writerow([int(product['id']),
                        int(user['id']), 
                        'CheckoutStarted',
                        this_timestamp])
            interactions += 1
            checkout_started_count += 1

        if order_completed_count < int(product_viewed_count * order_completed_percent):
            this_timestamp += randint(0, int(seconds_increment / 2))
            f.writerow([int(product['id']),
                        int(user['id']), 
                        'OrderCompleted',
                        this_timestamp])
            interactions += 1
            order_completed_count += 1

    
print("Done")
print("Total interactions: " + str(interactions))
print("Total product viewed: " + str(product_viewed_count))
print("Total product added: " + str(product_added_count))
print("Total cart viewed: " + str(cart_viewed_count))
print("Total checkout started: " + str(checkout_started_count))
print("Total order completed: " + str(order_completed_count))


Days back: 90
Starting timestamp: 1592366620
Seconds increment: 15
Generating Interactions...
Done
Total interactions: 500002
Total product viewed: 431038
Total product added: 34483
Total cart viewed: 21551
Total checkout started: 8620
Total order completed: 4310


#### Open and Explore the Interactions Dataset

In [11]:
interactions_df.EVENT_TYPE.unique()

array(['ProductViewed', 'ProductAdded', 'CartViewed', 'CheckoutStarted',
       'OrderCompleted'], dtype=object)

In [10]:
interactions_df = pd.read_csv(interactions_filename)
interactions_df

Unnamed: 0,ITEM_ID,USER_ID,EVENT_TYPE,TIMESTAMP
0,56,4558,ProductViewed,1592366622
1,8,473,ProductViewed,1592366642
...,...,...,...,...
499930,56,4405,ProductViewed,1598831271
499931,28,4883,ProductViewed,1598831282


#### Prepare and Upload Data

In [12]:
boto3.Session().resource('s3').Bucket(bucket).Object(interactions_filename).upload_file(interactions_filename)

## Configure Amazon Personalize

Now that we've prepared our three datasets and uploaded them to S3 we'll need to configure the Amazon Personalize service to understand our data so that it can be used to train models for generating recommendations.

### Create Schemas for Datasets

Amazon Personalize requires a schema for each dataset so it can map the columns in our CSVs to fields for model training. Each schema is declared in JSON using the [Apache Avro](https://avro.apache.org/) format.

Let's define and create schemas in Personalize for our datasets.

#### Items Datsaset Schema

In [13]:
items_schema = {
    "type": "record",
    "name": "Items",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {
            "name": "ITEM_ID",
            "type": "string"
        },
        {
            "name": "CATEGORY",
            "type": "string",
            "categorical": True,
        },
        {
            "name": "STYLE",
            "type": "string",
            "categorical": True,
        }
    ],
    "version": "1.0"
}

create_schema_response = personalize.create_schema(
    name = "retaildemostore-schema-items",
    schema = json.dumps(items_schema)
)

items_schema_arn = create_schema_response['schemaArn']
print(json.dumps(create_schema_response, indent=2))

{
  "schemaArn": "arn:aws:personalize:us-east-1:029498593638:schema/retaildemostore-schema-items",
  "ResponseMetadata": {
    "RequestId": "f91242c5-4104-4dca-835a-0ac224ee3e28",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 15 Sep 2020 04:07:42 GMT",
      "x-amzn-requestid": "f91242c5-4104-4dca-835a-0ac224ee3e28",
      "content-length": "94",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Users Dataset Schema

In [14]:
users_schema = {
    "type": "record",
    "name": "Users",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {
            "name": "USER_ID",
            "type": "string"
        },
        {
            "name": "AGE",
            "type": "int"
        },
        {
            "name": "GENDER",
            "type": "string",
            "categorical": True,
        }
    ],
    "version": "1.0"
}

create_schema_response = personalize.create_schema(
    name = "retaildemostore-schema-users",
    schema = json.dumps(users_schema)
)

users_schema_arn = create_schema_response['schemaArn']
print(json.dumps(create_schema_response, indent=2))

{
  "schemaArn": "arn:aws:personalize:us-east-1:029498593638:schema/retaildemostore-schema-users",
  "ResponseMetadata": {
    "RequestId": "3db9bb47-dcd5-421a-ab2d-4e4c9ce4820f",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 15 Sep 2020 04:07:45 GMT",
      "x-amzn-requestid": "3db9bb47-dcd5-421a-ab2d-4e4c9ce4820f",
      "content-length": "94",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Interactions Dataset Schema

In [15]:
interactions_schema = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {
            "name": "ITEM_ID",
            "type": "string"
        },
        {
            "name": "USER_ID",
            "type": "string"
        },
        {
            "name": "EVENT_TYPE",
            "type": "string"
        },
        {
            "name": "TIMESTAMP",
            "type": "long"
        }
    ],
    "version": "1.0"
}

create_schema_response = personalize.create_schema(
    name = "retaildemostore-schema-interactions",
    schema = json.dumps(interactions_schema)
)

interactions_schema_arn = create_schema_response['schemaArn']
print(json.dumps(create_schema_response, indent=2))

{
  "schemaArn": "arn:aws:personalize:us-east-1:029498593638:schema/retaildemostore-schema-interactions",
  "ResponseMetadata": {
    "RequestId": "f6d2c698-383f-4a3b-8414-015e2287ce58",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 15 Sep 2020 04:08:03 GMT",
      "x-amzn-requestid": "f6d2c698-383f-4a3b-8414-015e2287ce58",
      "content-length": "101",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


### Create and Wait for Dataset Group

Next we need to create the dataset group that will contain our three datasets.

#### Create Dataset Group

In [16]:
create_dataset_group_response = personalize.create_dataset_group(
    name = "retaildemostore"
)

dataset_group_arn = create_dataset_group_response['datasetGroupArn']
print(json.dumps(create_dataset_group_response, indent=2))

{
  "datasetGroupArn": "arn:aws:personalize:us-east-1:029498593638:dataset-group/retaildemostore",
  "ResponseMetadata": {
    "RequestId": "bfec4ff3-8800-48b0-bb28-2a9cbecce142",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 15 Sep 2020 04:08:08 GMT",
      "x-amzn-requestid": "bfec4ff3-8800-48b0-bb28-2a9cbecce142",
      "content-length": "94",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Wait for Dataset Group to Have ACTIVE Status

In [17]:
status = None
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_dataset_group_response = personalize.describe_dataset_group(
        datasetGroupArn = dataset_group_arn
    )
    status = describe_dataset_group_response["datasetGroup"]["status"]
    print("DatasetGroup: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(15)

DatasetGroup: CREATE PENDING
DatasetGroup: ACTIVE


### Create Items Dataset

Next we will create the datasets in Personalize for our three dataset types. Let's start with the items dataset.

In [18]:
dataset_type = "ITEMS"
create_dataset_response = personalize.create_dataset(
    name = "retaildemostore-dataset-items",
    datasetType = dataset_type,
    datasetGroupArn = dataset_group_arn,
    schemaArn = items_schema_arn
)

items_dataset_arn = create_dataset_response['datasetArn']
print(json.dumps(create_dataset_response, indent=2))

{
  "datasetArn": "arn:aws:personalize:us-east-1:029498593638:dataset/retaildemostore/ITEMS",
  "ResponseMetadata": {
    "RequestId": "2e7316d3-7d71-401a-991e-f3cd55a51052",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 15 Sep 2020 04:12:07 GMT",
      "x-amzn-requestid": "2e7316d3-7d71-401a-991e-f3cd55a51052",
      "content-length": "89",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


### Create Users Dataset

In [19]:
dataset_type = "USERS"
create_dataset_response = personalize.create_dataset(
    name = "retaildemostore-dataset-users",
    datasetType = dataset_type,
    datasetGroupArn = dataset_group_arn,
    schemaArn = users_schema_arn
)

users_dataset_arn = create_dataset_response['datasetArn']
print(json.dumps(create_dataset_response, indent=2))

{
  "datasetArn": "arn:aws:personalize:us-east-1:029498593638:dataset/retaildemostore/USERS",
  "ResponseMetadata": {
    "RequestId": "1949783b-e9b0-451d-8fc5-e2fbf90f3eaf",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 15 Sep 2020 04:12:47 GMT",
      "x-amzn-requestid": "1949783b-e9b0-451d-8fc5-e2fbf90f3eaf",
      "content-length": "89",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


### Create Interactions Dataset

In [20]:
dataset_type = "INTERACTIONS"
create_dataset_response = personalize.create_dataset(
    name = "retaildemostore-dataset-interactions",
    datasetType = dataset_type,
    datasetGroupArn = dataset_group_arn,
    schemaArn = interactions_schema_arn
)

interactions_dataset_arn = create_dataset_response['datasetArn']
print(json.dumps(create_dataset_response, indent=2))

{
  "datasetArn": "arn:aws:personalize:us-east-1:029498593638:dataset/retaildemostore/INTERACTIONS",
  "ResponseMetadata": {
    "RequestId": "5676be61-a27d-448a-932b-7c02ffdb6263",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 15 Sep 2020 04:12:51 GMT",
      "x-amzn-requestid": "5676be61-a27d-448a-932b-7c02ffdb6263",
      "content-length": "96",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


## Import Datasets to Personalize

Up to this point we have generated CSVs containing data for our users, items, and interactions and staged them in an S3 bucket. We also created schemas in Personalize that define the columns in our CSVs. Then we created a datset group and three datasets in Personalize that will receive our data. In the following steps we will create import jobs with Personalize that will import the datasets from our S3 bucket into the service.

### Setup Permissions

By default, the Personalize service does not have permission to acccess the data we uploaded into the S3 bucket in our account. In order to grant access to the  Personalize service to read our CSVs, we need to set a Bucket Policy and create an IAM role that the Amazon Personalize service will assume.

#### Attach policy to S3 bucket

In [21]:
s3 = boto3.client("s3")

policy = {
    "Version": "2012-10-17",
    "Id": "PersonalizeS3BucketAccessPolicy",
    "Statement": [
        {
            "Sid": "PersonalizeS3BucketAccessPolicy",
            "Effect": "Allow",
            "Principal": {
                "Service": "personalize.amazonaws.com"
            },
            "Action": [
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::{}".format(bucket),
                "arn:aws:s3:::{}/*".format(bucket)
            ]
        }
    ]
}

s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy));

#### Create S3 Read Only Access Role

In [22]:
iam = boto3.client("iam")

role_name = "RetailDemoStorePersonalizeS3Role"
assume_role_policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Service": "personalize.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
        }
    ]
}

create_role_response = iam.create_role(
    RoleName = role_name,
    AssumeRolePolicyDocument = json.dumps(assume_role_policy_document)
);

iam.attach_role_policy(
    RoleName = role_name,
    PolicyArn = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
);

role_arn = create_role_response["Role"]["Arn"]
print('IAM Role: {}'.format(role_arn))
# Pause to allow role to fully persist
time.sleep(10)

IAM Role: arn:aws:iam::029498593638:role/RetailDemoStorePersonalizeS3Role


### Create Import Jobs

With the permissions in place to allow Personalize to access our CSV files, let's create three import jobs to import each file into its respective dataset. Each import job can take several minutes to complete so we'll create all three and then wait for them all to complete.

#### Create Items Dataset Import Job

In [23]:
items_create_dataset_import_job_response = personalize.create_dataset_import_job(
    jobName = "retaildemostore-dataset-items-import-job",
    datasetArn = items_dataset_arn,
    dataSource = {
        "dataLocation": "s3://{}/{}".format(bucket, items_filename)
    },
    roleArn = role_arn
)

items_dataset_import_job_arn = items_create_dataset_import_job_response['datasetImportJobArn']
print(json.dumps(items_create_dataset_import_job_response, indent=2))

{
  "datasetImportJobArn": "arn:aws:personalize:us-east-1:029498593638:dataset-import-job/retaildemostore-dataset-items-import-job",
  "ResponseMetadata": {
    "RequestId": "3729c25a-7a2d-44c3-8162-3737355ea65f",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 15 Sep 2020 04:13:57 GMT",
      "x-amzn-requestid": "3729c25a-7a2d-44c3-8162-3737355ea65f",
      "content-length": "128",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Create Users Dataset Import Job

In [24]:
users_create_dataset_import_job_response = personalize.create_dataset_import_job(
    jobName = "retaildemostore-dataset-users-import-job",
    datasetArn = users_dataset_arn,
    dataSource = {
        "dataLocation": "s3://{}/{}".format(bucket, users_filename)
    },
    roleArn = role_arn
)

users_dataset_import_job_arn = users_create_dataset_import_job_response['datasetImportJobArn']
print(json.dumps(users_create_dataset_import_job_response, indent=2))

{
  "datasetImportJobArn": "arn:aws:personalize:us-east-1:029498593638:dataset-import-job/retaildemostore-dataset-users-import-job",
  "ResponseMetadata": {
    "RequestId": "8c74bd2c-5e43-4bbd-ae9e-5e057162d8a5",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 15 Sep 2020 04:14:01 GMT",
      "x-amzn-requestid": "8c74bd2c-5e43-4bbd-ae9e-5e057162d8a5",
      "content-length": "128",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Create Interactions Dataset Import Job

In [25]:
interactions_create_dataset_import_job_response = personalize.create_dataset_import_job(
    jobName = "retaildemostore-dataset-interactions-import-job",
    datasetArn = interactions_dataset_arn,
    dataSource = {
        "dataLocation": "s3://{}/{}".format(bucket, interactions_filename)
    },
    roleArn = role_arn
)

interactions_dataset_import_job_arn = interactions_create_dataset_import_job_response['datasetImportJobArn']
print(json.dumps(interactions_create_dataset_import_job_response, indent=2))

{
  "datasetImportJobArn": "arn:aws:personalize:us-east-1:029498593638:dataset-import-job/retaildemostore-dataset-interactions-import-job",
  "ResponseMetadata": {
    "RequestId": "47dba9f1-9dec-42ed-8f4b-b4bfce62f805",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 15 Sep 2020 04:14:16 GMT",
      "x-amzn-requestid": "47dba9f1-9dec-42ed-8f4b-b4bfce62f805",
      "content-length": "135",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


### Wait for Import Jobs to Complete

It will take 10-15 minutes for the import jobs to complete, while you're waiting you can learn more about Datasets and Schemas here: https://docs.aws.amazon.com/personalize/latest/dg/how-it-works-dataset-schema.html

We will wait for all three jobs to finish.

#### Wait for Items Import Job to Complete

In [26]:
status = None
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_dataset_import_job_response = personalize.describe_dataset_import_job(
        datasetImportJobArn = items_dataset_import_job_arn
    )
    status = describe_dataset_import_job_response["datasetImportJob"]['status']
    print("DatasetImportJob: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: ACTIVE


#### Wait for Users Import Job to Complete

Since we submitted all three import jobs at the same time, it's likely that the users import job is already complete. Let's check to be sure.

In [None]:
status = None
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_dataset_import_job_response = personalize.describe_dataset_import_job(
        datasetImportJobArn = users_dataset_import_job_arn
    )
    status = describe_dataset_import_job_response["datasetImportJob"]['status']
    print("DatasetImportJob: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

#### Wait for Interactions Import Job to Complete

Since we submitted all three import jobs at the same time, it's likely that the interactions import job is already complete. Let's check to be sure.

In [27]:
status = None
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_dataset_import_job_response = personalize.describe_dataset_import_job(
        datasetImportJobArn = interactions_dataset_import_job_arn
    )
    status = describe_dataset_import_job_response["datasetImportJob"]['status']
    print("DatasetImportJob: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

DatasetImportJob: ACTIVE


## Create Solutions

With our three datasets imported into our dataset group, we can now turn to training models. As a reminder, we will be training three models in this workshop to support three different personalization use-cases. One model will be used to make related product recommendations on the product detail view/page, another model will be used to make personalized product recommendations to users on the homepage, and the last model will be used to rerank product lists on the category and featured products page. In Amazon Personalize, training a model involves creating a Solution and Solution Version. So when we are finished we will have three solutions and a solution version for each solution. 

When creating a solution, you provide your dataset group and the recipe for training. Let's declare the recipes that we will need for our solutions.

### List Recipes

First, let's list all available recipes.

In [28]:
list_recipes_response = personalize.list_recipes()
list_recipes_response

{'recipes': [{'name': 'aws-hrnn',
   'recipeArn': 'arn:aws:personalize:::recipe/aws-hrnn',
   'status': 'ACTIVE',
   'creationDateTime': datetime.datetime(2019, 6, 10, 0, 0, tzinfo=tzlocal()),
   'lastUpdatedDateTime': datetime.datetime(2020, 8, 25, 16, 57, 0, 148000, tzinfo=tzlocal())},
  {'name': 'aws-hrnn-coldstart',
   'recipeArn': 'arn:aws:personalize:::recipe/aws-hrnn-coldstart',
   'status': 'ACTIVE',
   'creationDateTime': datetime.datetime(2019, 6, 10, 0, 0, tzinfo=tzlocal()),
   'lastUpdatedDateTime': datetime.datetime(2020, 8, 25, 16, 57, 0, 148000, tzinfo=tzlocal())},
  {'name': 'aws-hrnn-metadata',
   'recipeArn': 'arn:aws:personalize:::recipe/aws-hrnn-metadata',
   'status': 'ACTIVE',
   'creationDateTime': datetime.datetime(2019, 6, 10, 0, 0, tzinfo=tzlocal()),
   'lastUpdatedDateTime': datetime.datetime(2020, 8, 25, 16, 57, 0, 148000, tzinfo=tzlocal())},
  {'name': 'aws-personalized-ranking',
   'recipeArn': 'arn:aws:personalize:::recipe/aws-personalized-ranking',
   's

As you can see above, there are several recipes to choose from. Let's declare the recipes for each Solution.

#### Declare Personalize Recipe for Related Products

On the product detail page we want to display related products so we'll create a campaign using the [SIMS](https://docs.aws.amazon.com/personalize/latest/dg/native-recipe-sims.html) recipe.

> The Item-to-item similarities (SIMS) recipe is based on the concept of collaborative filtering. A SIMS model leverages user-item interaction data to recommend items similar to a given item. In the absence of sufficient user behavior data for an item, this recipe recommends popular items.

In [29]:
related_recipe_arn = "arn:aws:personalize:::recipe/aws-sims"

#### Declare Personalize Recipe for Product Recommendations

Since we are providing metadata for users and items, we will be using the [HRNN-Metadata](https://docs.aws.amazon.com/personalize/latest/dg/native-recipe-hrnn-metadata.html) recipe for our product recommendations solution.

> The HRNN-Metadata recipe predicts the items that a user will interact with. It is similar to the HRNN recipe, with additional features derived from contextual, user, and item metadata (from Interactions, Users, and Items datasets, respectively). HRNN-Metadata provides accuracy benefits over non-metadata models when high quality metadata is available. Using this recipe might require longer training times.

In [30]:
recommend_recipe_arn = "arn:aws:personalize:::recipe/aws-hrnn-metadata"

#### Declare Personalize Recipe for Personalized Ranking

In use-cases where we have a curated list of products, we can use the [Personalized-Ranking](https://docs.aws.amazon.com/personalize/latest/dg/native-recipe-search.html) recipe to reorder the products for the current user.

> The Personalized-Ranking recipe generates personalized rankings. A personalized ranking is a list of recommended items that are re-ranked for a specific user.

In [31]:
ranking_recipe_arn = "arn:aws:personalize:::recipe/aws-personalized-ranking"

### Create Solutions and Solution Versions

With our recipes defined, we can now create our solutions and solution versions.

#### Create Related Products Solution

In [32]:
create_solution_response = personalize.create_solution(
    name = "retaildemostore-related-products",
    datasetGroupArn = dataset_group_arn,
    recipeArn = related_recipe_arn,
    eventType = "ProductViewed"
)

related_solution_arn = create_solution_response['solutionArn']
print(json.dumps(create_solution_response, indent=2))

{
  "solutionArn": "arn:aws:personalize:us-east-1:029498593638:solution/retaildemostore-related-products",
  "ResponseMetadata": {
    "RequestId": "bad8445d-adc6-4dec-bb1c-37b599212b9e",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 15 Sep 2020 04:36:52 GMT",
      "x-amzn-requestid": "bad8445d-adc6-4dec-bb1c-37b599212b9e",
      "content-length": "102",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Create Related Products Solution Version

In [33]:
create_solution_version_response = personalize.create_solution_version(
    solutionArn = related_solution_arn
)

related_solution_version_arn = create_solution_version_response['solutionVersionArn']
print(json.dumps(create_solution_version_response, indent=2))

{
  "solutionVersionArn": "arn:aws:personalize:us-east-1:029498593638:solution/retaildemostore-related-products/888d3f38",
  "ResponseMetadata": {
    "RequestId": "93e6839c-0858-4218-b445-5d6b9f42b530",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 15 Sep 2020 04:36:53 GMT",
      "x-amzn-requestid": "93e6839c-0858-4218-b445-5d6b9f42b530",
      "content-length": "118",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Create Product Recommendation Solution

In [34]:
create_solution_response = personalize.create_solution(
    name = "retaildemostore-product-personalization",
    datasetGroupArn = dataset_group_arn,
    recipeArn = recommend_recipe_arn,
    eventType = "ProductViewed"
)

recommend_solution_arn = create_solution_response['solutionArn']
print(json.dumps(create_solution_response, indent=2))

{
  "solutionArn": "arn:aws:personalize:us-east-1:029498593638:solution/retaildemostore-product-personalization",
  "ResponseMetadata": {
    "RequestId": "2fa2e6c4-854d-490d-bd3b-2b278a245a83",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 15 Sep 2020 04:36:53 GMT",
      "x-amzn-requestid": "2fa2e6c4-854d-490d-bd3b-2b278a245a83",
      "content-length": "109",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Create Product Recommendation Solution Version

In [35]:
create_solution_version_response = personalize.create_solution_version(
    solutionArn = recommend_solution_arn
)

recommend_solution_version_arn = create_solution_version_response['solutionVersionArn']
print(json.dumps(create_solution_version_response, indent=2))

{
  "solutionVersionArn": "arn:aws:personalize:us-east-1:029498593638:solution/retaildemostore-product-personalization/07717f58",
  "ResponseMetadata": {
    "RequestId": "d85e280c-2c28-4cf2-a5df-4c0b97f96567",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 15 Sep 2020 04:36:55 GMT",
      "x-amzn-requestid": "d85e280c-2c28-4cf2-a5df-4c0b97f96567",
      "content-length": "125",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Create Personalized Ranking Solution

In [36]:
create_solution_response = personalize.create_solution(
    name = "retaildemostore-personalized-ranking",
    datasetGroupArn = dataset_group_arn,
    recipeArn = ranking_recipe_arn,
    eventType = "ProductViewed"    
)

ranking_solution_arn = create_solution_response['solutionArn']
print(json.dumps(create_solution_response, indent=2))

{
  "solutionArn": "arn:aws:personalize:us-east-1:029498593638:solution/retaildemostore-personalized-ranking",
  "ResponseMetadata": {
    "RequestId": "63dbfdb2-51f3-41aa-9ed9-8acb969f2347",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 15 Sep 2020 04:36:55 GMT",
      "x-amzn-requestid": "63dbfdb2-51f3-41aa-9ed9-8acb969f2347",
      "content-length": "106",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Create Personalized Ranking Solution Version

In [37]:
create_solution_version_response = personalize.create_solution_version(
    solutionArn = ranking_solution_arn
)

ranking_solution_version_arn = create_solution_version_response['solutionVersionArn']
print(json.dumps(create_solution_version_response, indent=2))

{
  "solutionVersionArn": "arn:aws:personalize:us-east-1:029498593638:solution/retaildemostore-personalized-ranking/2f27227e",
  "ResponseMetadata": {
    "RequestId": "8ef0c98b-ede6-407c-8c9e-af2f7c6ec1d0",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 15 Sep 2020 04:36:57 GMT",
      "x-amzn-requestid": "8ef0c98b-ede6-407c-8c9e-af2f7c6ec1d0",
      "content-length": "122",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


### Wait for Solution Versions to Complete

It can take 40-60 minutes for all solution versions to be created. During this process a model is being trained and tested with the data contained within your datasets. The duration of training jobs can increase based on the size of the dataset, training parameters and using AutoML vs. manually selecting a recipe. We submitted requests for all three solutions and versions at once so they are trained in parallel and then below we will wait for all three to finish.

While you are waiting for this process to complete you can learn more about solutions here: https://docs.aws.amazon.com/personalize/latest/dg/training-deploying-solutions.html

#### Wait for Related Products Solution Version to Have ACTIVE Status

In [None]:
status = None
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_solution_version_response = personalize.describe_solution_version(
        solutionVersionArn = related_solution_version_arn
    )
    status = describe_solution_version_response["solutionVersion"]["status"]
    print("SolutionVersion: {}".format(status))
    if status == 'CREATE FAILED':
        print(describe_solution_version_response["solutionVersion"]["failureReason"])
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

SolutionVersion: CREATE PENDING
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: ACTIVE


#### Wait for Product Recommendation Solution Version to Have ACTIVE Status

Since we created the solution versions at the same time, they were being created in parallel. Therefore, it's likely that our product recommendations and personalized ranking solution versions are already complete. Let's check to make sure.

In [None]:
status = None
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_solution_version_response = personalize.describe_solution_version(
        solutionVersionArn = recommend_solution_version_arn
    )
    status = describe_solution_version_response["solutionVersion"]["status"]
    print("SolutionVersion: {}".format(status))
    if status == 'CREATE FAILED':
        print(describe_solution_version_response["solutionVersion"]["failureReason"])

    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: ACTIVE


#### Wait for Personalized Ranking Solution Version to Have ACTIVE Status

In [None]:
status = None
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_solution_version_response = personalize.describe_solution_version(
        solutionVersionArn = ranking_solution_version_arn
    )
    status = describe_solution_version_response["solutionVersion"]["status"]
    print("SolutionVersion: {}".format(status))
    if status == 'CREATE FAILED':
        print(describe_solution_version_response["solutionVersion"]["failureReason"])
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: ACTIVE


### Evaluate Offline Metrics for Solution Versions

Amazon Personalize provides [offline metrics](https://docs.aws.amazon.com/personalize/latest/dg/working-with-training-metrics.html#working-with-training-metrics-metrics) that allow you to evaluate the performance of the solution version before you deploy the model in your application. Metrics can also be used to view the effects of modifying a Solution's hyperparameters or to compare the metrics between solutions that use the same training data but created with different recipes.

Let's retrieve the metrics for the solution versions we just created.

#### Related Products Metrics

In [51]:
get_solution_metrics_response = personalize.get_solution_metrics(
    solutionVersionArn = related_solution_version_arn
)

print(json.dumps(get_solution_metrics_response, indent=2))

{
  "solutionVersionArn": "arn:aws:personalize:us-east-1:029498593638:solution/retaildemostore-related-products/888d3f38",
  "metrics": {
    "coverage": 1.0,
    "mean_reciprocal_rank_at_25": 0.6626,
    "normalized_discounted_cumulative_gain_at_10": 0.6143,
    "normalized_discounted_cumulative_gain_at_25": 0.7094,
    "normalized_discounted_cumulative_gain_at_5": 0.4682,
    "precision_at_10": 0.459,
    "precision_at_25": 0.2372,
    "precision_at_5": 0.4655
  },
  "ResponseMetadata": {
    "RequestId": "3e537892-60dc-4114-8a65-9cdbfeb52993",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Wed, 16 Sep 2020 01:49:45 GMT",
      "x-amzn-requestid": "3e537892-60dc-4114-8a65-9cdbfeb52993",
      "content-length": "412",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Product Recommendations Metrics

In [None]:
get_solution_metrics_response = personalize.get_solution_metrics(
    solutionVersionArn = recommend_solution_version_arn
)

print(json.dumps(get_solution_metrics_response, indent=2))

{
  "solutionVersionArn": "arn:aws:personalize:us-east-1:029498593638:solution/retaildemostore-product-personalization/07717f58",
  "metrics": {
    "coverage": 0.9825,
    "mean_reciprocal_rank_at_25": 0.6399,
    "normalized_discounted_cumulative_gain_at_10": 0.5801,
    "normalized_discounted_cumulative_gain_at_25": 0.7044,
    "normalized_discounted_cumulative_gain_at_5": 0.4545,
    "precision_at_10": 0.4614,
    "precision_at_25": 0.2595,
    "precision_at_5": 0.4524
  },
  "ResponseMetadata": {
    "RequestId": "f7eaa953-3d00-44da-8a29-3672064fcfcc",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 15 Sep 2020 05:30:05 GMT",
      "x-amzn-requestid": "f7eaa953-3d00-44da-8a29-3672064fcfcc",
      "content-length": "423",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Personalized Ranking Metrics

In [None]:
get_solution_metrics_response = personalize.get_solution_metrics(
    solutionVersionArn = ranking_solution_version_arn
)

print(json.dumps(get_solution_metrics_response, indent=2))

{
  "solutionVersionArn": "arn:aws:personalize:us-east-1:029498593638:solution/retaildemostore-personalized-ranking/2f27227e",
  "metrics": {
    "coverage": 0.4386,
    "mean_reciprocal_rank_at_25": 0.3263,
    "normalized_discounted_cumulative_gain_at_10": 0.3065,
    "normalized_discounted_cumulative_gain_at_25": 0.3467,
    "normalized_discounted_cumulative_gain_at_5": 0.2298,
    "precision_at_10": 0.2308,
    "precision_at_25": 0.116,
    "precision_at_5": 0.2228
  },
  "ResponseMetadata": {
    "RequestId": "24f90b58-2472-49f9-af32-ff9a90701a43",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 15 Sep 2020 05:30:05 GMT",
      "x-amzn-requestid": "24f90b58-2472-49f9-af32-ff9a90701a43",
      "content-length": "419",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


## Create Campaigns

Once we're satisfied with our solution versions, we need to create Campaigns for each solution version. When creating a campaign you specify the minimum transactions per second (`minProvisionedTPS`) that you expect to make against the service for this campaign. Personalize will automatically scale the inference endpoint up and down for the campaign to match demand but will never scale below `minProvisionedTPS`.

Let's create campaigns for our three solution versions with each set at `minProvisionedTPS` of 1.

#### Create Related Products Campaign

In [52]:
create_campaign_response = personalize.create_campaign(
    name = "retaildemostore-related-products",
    solutionVersionArn = related_solution_version_arn,
    minProvisionedTPS = 1
)

related_campaign_arn = create_campaign_response['campaignArn']
print(json.dumps(create_campaign_response, indent=2))

{
  "campaignArn": "arn:aws:personalize:us-east-1:029498593638:campaign/retaildemostore-related-products",
  "ResponseMetadata": {
    "RequestId": "73a62d44-33b0-4372-8aef-a909f9468d41",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Wed, 16 Sep 2020 01:50:39 GMT",
      "x-amzn-requestid": "73a62d44-33b0-4372-8aef-a909f9468d41",
      "content-length": "102",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Create Product Recommendation Campaign

In [53]:
create_campaign_response = personalize.create_campaign(
    name = "retaildemostore-product-personalization",
    solutionVersionArn = recommend_solution_version_arn,
    minProvisionedTPS = 1
)

recommend_campaign_arn = create_campaign_response['campaignArn']
print(json.dumps(create_campaign_response, indent=2))

{
  "campaignArn": "arn:aws:personalize:us-east-1:029498593638:campaign/retaildemostore-product-personalization",
  "ResponseMetadata": {
    "RequestId": "333df408-c097-4c47-b275-fd368207f1f2",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Wed, 16 Sep 2020 01:50:41 GMT",
      "x-amzn-requestid": "333df408-c097-4c47-b275-fd368207f1f2",
      "content-length": "109",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Create Personalized Ranking Campaign

In [54]:
create_campaign_response = personalize.create_campaign(
    name = "retaildemostore-personalized-ranking",
    solutionVersionArn = ranking_solution_version_arn,
    minProvisionedTPS = 1
)

ranking_campaign_arn = create_campaign_response['campaignArn']
print(json.dumps(create_campaign_response, indent=2))

{
  "campaignArn": "arn:aws:personalize:us-east-1:029498593638:campaign/retaildemostore-personalized-ranking",
  "ResponseMetadata": {
    "RequestId": "f903e0e2-6385-434a-965e-a96f3965cd93",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Wed, 16 Sep 2020 01:50:42 GMT",
      "x-amzn-requestid": "f903e0e2-6385-434a-965e-a96f3965cd93",
      "content-length": "106",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Wait for Related Products Campaign to Have ACTIVE Status

It can take 20-30 minutes for the campaigns to be fully created. 

While you are waiting for this to complete you can learn more about campaigns here: https://docs.aws.amazon.com/personalize/latest/dg/campaigns.html

In [55]:
status = None
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_campaign_response = personalize.describe_campaign(
        campaignArn = related_campaign_arn
    )
    status = describe_campaign_response["campaign"]["status"]
    print("Campaign: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

Campaign: ACTIVE


#### Wait for Product Recommendation Campaign to Have ACTIVE Status

Since we created our campaigns at the same time, they were being built in parallel. Therefore, it's likely that the product recommendation and personalized ranking campaigns are already active. Let's check to make sure.

In [56]:
status = None
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_campaign_response = personalize.describe_campaign(
        campaignArn = recommend_campaign_arn
    )
    status = describe_campaign_response["campaign"]["status"]
    print("Campaign: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

Campaign: ACTIVE


#### Wait for Personalized Ranking Campaign to Have ACTIVE Status

In [57]:
status = None
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_campaign_response = personalize.describe_campaign(
        campaignArn = ranking_campaign_arn
    )
    status = describe_campaign_response["campaign"]["status"]
    print("Campaign: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

Campaign: ACTIVE


## Test Campaigns

Now that our campaigns have been fully created, let's test each campaign and evaluate the results.

### Test Related Product Recommendations Campaign

Let's test the recommendations made by the related items/products campaign by selecting a product from the Retail Demo Store's Products microservice and requesting related item recommendations for that product.

#### Select a Product

We'll just pick a random product for simplicity. Feel free to change the `productId` below and execute the following cells with a different product to get a sense for how the recommendations change.

In [58]:
productId = 22

response = requests.get('http://{}/products/id/{}'.format(products_service_instance, productId))
product = response.json()
print(json.dumps(product, indent=4, sort_keys=True))

{
    "category": "housewares",
    "description": "Mug and Coffee gift set combination package.",
    "id": "22",
    "image": "3.jpg",
    "name": "Coffee Gift Package",
    "price": 39.99,
    "sk": "",
    "style": "consumable",
    "url": "http://d1jmev7kvuszlh.cloudfront.net/#/product/22"
}


#### Get Related Product Recommendations for Product

Now let's call Amazon Personalize to get related item/product recommendations for our product from the related item campaign.

In [59]:
get_recommendations_response = personalize_runtime.get_recommendations(
    campaignArn = related_campaign_arn,
    itemId = str(productId),
    numResults = 10
)

item_list = get_recommendations_response['itemList']

In [60]:
print(json.dumps(item_list, indent=4))

[
    {
        "itemId": "2"
    },
    {
        "itemId": "50"
    },
    {
        "itemId": "51"
    },
    {
        "itemId": "26"
    },
    {
        "itemId": "34"
    },
    {
        "itemId": "42"
    },
    {
        "itemId": "38"
    },
    {
        "itemId": "14"
    },
    {
        "itemId": "30"
    },
    {
        "itemId": "18"
    }
]


Since the `itemId`'s in the above response don't tell us much about the products being recommended, let's get detailed information for each item ID from the Products microservice.

In [61]:
for item in item_list:
    response = requests.get('http://{}/products/id/{}'.format(products_service_instance, item['itemId']))
    print(json.dumps(response.json(), indent = 4))

{
    "id": "2",
    "url": "http://d1jmev7kvuszlh.cloudfront.net/#/product/2",
    "sk": "",
    "name": "Striped Shirt",
    "category": "apparel",
    "style": "shirt",
    "description": "A classic look for the summer season.",
    "price": 9.99,
    "image": "1.jpg",
    "featured": "true"
}
{
    "id": "50",
    "url": "http://d1jmev7kvuszlh.cloudfront.net/#/product/50",
    "sk": "",
    "name": "Blue Wind Breaker Jacket",
    "category": "apparel",
    "style": "jacket",
    "description": "Stay warm and comfortable on those cool fall evenings.",
    "price": 79.95,
    "image": "25.jpg"
}
{
    "id": "51",
    "url": "http://d1jmev7kvuszlh.cloudfront.net/#/product/51",
    "sk": "",
    "name": "Ceramic Mixing Bowls",
    "category": "housewares",
    "style": "bowls",
    "description": "Add some style to your kitchen with this artistically designed mixing bowls.",
    "price": 49.95,
    "image": "7.jpg"
}
{
    "id": "26",
    "url": "http://d1jmev7kvuszlh.cloudfront.net/#/

### Test Product Recommendations Campaign

Let's test the recommendations made by the product recommendations campaign by selecting a user from the Retail Demo Store's Users microservice and requesting item recommendations for that user.

#### Select a User

We'll just pick a random user for simplicity. Feel free to change the `userId` below and execute the following cells with a different user to get a sense for how the recommendations change.

In [62]:
userId = 10

response = requests.get('http://{}/users/id/{}'.format(users_service_instance, userId))
user = response.json()
persona = user['persona']

In [63]:
print(json.dumps(user, indent=4, sort_keys=True))

{
    "addresses": [
        {
            "address1": "2875 Holly River Suite 600",
            "address2": "",
            "city": "Lake Laura",
            "country": "US",
            "default": true,
            "first_name": "Robert",
            "last_name": "Williams",
            "state": "SD",
            "zipcode": "57093"
        }
    ],
    "age": 28,
    "email": "robert.williams@example.com",
    "first_name": "Robert",
    "gender": "M",
    "id": "10",
    "last_name": "Williams",
    "persona": "jewelry_accessories",
    "username": "user10"
}


**Take note of the `persona` value for the user above. We should see recommendations for products consistent with this persona.**

In [64]:
print('Shopper persona for user {} is {}'.format(userId, persona))

Shopper persona for user 10 is jewelry_accessories


#### Get Product Recommendations for User

Now let's call Amazon Personalize to get recommendations for our user from the product recommendations campaign.

In [65]:
get_recommendations_response = personalize_runtime.get_recommendations(
    campaignArn = recommend_campaign_arn,
    userId = str(userId),
    numResults = 10
)

item_list = get_recommendations_response['itemList']

In [66]:
print(json.dumps(item_list, indent=4))

[
    {
        "itemId": "55",
        "score": 0.0942199
    },
    {
        "itemId": "1",
        "score": 0.0923305
    },
    {
        "itemId": "39",
        "score": 0.0867814
    },
    {
        "itemId": "41",
        "score": 0.0794307
    },
    {
        "itemId": "47",
        "score": 0.0730218
    },
    {
        "itemId": "15",
        "score": 0.0719396
    },
    {
        "itemId": "33",
        "score": 0.0704805
    },
    {
        "itemId": "25",
        "score": 0.0688857
    },
    {
        "itemId": "31",
        "score": 0.0679565
    },
    {
        "itemId": "9",
        "score": 0.0650193
    }
]


Since the `itemId`'s in the above response don't tell us much about the products being recommended, let's retrieve product details from the Products microservice.

In [67]:
print('User persona: ' + persona)

for item in item_list:
    response = requests.get('http://{}/products/id/{}'.format(products_service_instance, item['itemId']))
    print(json.dumps(response.json(), indent = 4))

User persona: jewelry_accessories
{
    "id": "55",
    "url": "http://d1jmev7kvuszlh.cloudfront.net/#/product/55",
    "sk": "",
    "name": "Pink Leather Purse",
    "category": "accessories",
    "style": "bag",
    "description": "Conveniently sized to carry all your essentials in style, pink tassle included.",
    "price": 89.99,
    "image": "9.jpg"
}
{
    "id": "1",
    "url": "http://d1jmev7kvuszlh.cloudfront.net/#/product/1",
    "sk": "",
    "name": "Black Leather Backpack",
    "category": "accessories",
    "style": "bag",
    "description": "Our handmade leather backpack will look great at the office or out on the town.",
    "price": 109.99,
    "image": "1.jpg",
    "featured": "true"
}
{
    "id": "39",
    "url": "http://d1jmev7kvuszlh.cloudfront.net/#/product/39",
    "sk": "",
    "name": "Turquoise Womens Necklace",
    "category": "jewelry",
    "style": "necklace",
    "description": "This attractive turquoise pendant necklace will set off any outfit.",
    "pri

Are the recommended products consistent with the persona? Note that this is a rather contrived example using a limited amount of generated interaction data without model parameter tuning. The purpose is to give you hands on experience building models and retrieving inferences from Amazon Personalize. 

### Test Personalized Ranking Campaign

Next let's evaluate the results of the personalized ranking campaign. As a reminder, given a list of items and a user, this campaign will rerank the items based on the preferences of the user. For the Retail Demo Store, we will use this campaign to rerank the products listed for each category and the featured products list.

#### Get Featured Products List

First let's get the list of featured products from the Products microservice.

In [68]:
response = requests.get('http://{}/products/featured'.format(products_service_instance))
featured_products = response.json()
print(json.dumps(featured_products, indent = 4))

[
    {
        "id": "2",
        "url": "http://d1jmev7kvuszlh.cloudfront.net/#/product/2",
        "sk": "",
        "name": "Striped Shirt",
        "category": "apparel",
        "style": "shirt",
        "description": "A classic look for the summer season.",
        "price": 9.99,
        "image": "1.jpg",
        "featured": "true"
    },
    {
        "id": "1",
        "url": "http://d1jmev7kvuszlh.cloudfront.net/#/product/1",
        "sk": "",
        "name": "Black Leather Backpack",
        "category": "accessories",
        "style": "bag",
        "description": "Our handmade leather backpack will look great at the office or out on the town.",
        "price": 109.99,
        "image": "1.jpg",
        "featured": "true"
    },
    {
        "id": "6",
        "url": "http://d1jmev7kvuszlh.cloudfront.net/#/product/6",
        "sk": "",
        "name": "Coffee Gift Package",
        "category": "housewares",
        "style": "consumable",
        "description": "Mug and Cof

#### ReRank Featured Products

Using the featured products list just retrieved, first we'll create a list of item IDs that we want to rerank for a specific user. This reranking will allow us to provide ranked products based on the user's behavior. These behaviors should be consistent the same persona that was mentioned above.

In [69]:
unranked_product_ids = []

for product in featured_products:
    unranked_product_ids.append(product['id'])
    
print(unranked_product_ids)

['2', '1', '6', '5', '4', '3']


In [70]:
response = personalize_runtime.get_personalized_ranking(
    campaignArn=ranking_campaign_arn,
    inputList=unranked_product_ids,
    userId=str(userId)
)
print(json.dumps(response['personalizedRanking'], indent = 4))

[
    {
        "itemId": "1",
        "score": 0.9975692
    },
    {
        "itemId": "5",
        "score": 0.0007871
    },
    {
        "itemId": "2",
        "score": 0.0004842
    },
    {
        "itemId": "3",
        "score": 0.0004737
    },
    {
        "itemId": "6",
        "score": 0.0003825
    },
    {
        "itemId": "4",
        "score": 0.0003034
    }
]


Are the reranked results different than the original results from the Search service? Experiment with a different `userId` in the cells above to see how the item ranking changes.

## Enable Campaigns in Retail Demo Store Recommendations Service

Now that we've tested our campaigns and can get related product, product recommendations, and reranked items for our users, we need to enable the campaigns in the Retail Demo Store's Recommendations service. The Recommendations service is called by the Retail Demo Store Web UI when a signed in user visits a page with personalized content capabilities (home page, product detail page, and category page). The Recommendations service checks Systems Manager Parameters values to determine the Personalize campaign ARNs to use for each of our three personalization use-cases.

Let's set the campaign ARNs for our campaigns in the expected parameter names.

### Update SSM Parameter To Enable Related Products

In [71]:
response = ssm.put_parameter(
    Name='retaildemostore-related-products-campaign-arn',
    Description='Retail Demo Store Related Products Campaign Arn Parameter',
    Value='{}'.format(related_campaign_arn),
    Type='String',
    Overwrite=True
)

### Update SSM Parameter To Enable Product Recommendations

In [72]:
response = ssm.put_parameter(
    Name='retaildemostore-product-recommendation-campaign-arn',
    Description='Retail Demo Store Product Recommendation Campaign Arn Parameter',
    Value='{}'.format(recommend_campaign_arn),
    Type='String',
    Overwrite=True
)

### Update SSM Parameter To Enable Search Personalization

In [73]:
response = ssm.put_parameter(
    Name='retaildemostore-personalized-ranking-campaign-arn',
    Description='Retail Demo Store Personalized Ranking Campaign Arn Parameter',
    Value='{}'.format(ranking_campaign_arn),
    Type='String',
    Overwrite=True
)

## Evaluate Personalization in Retail Demo Store's Web UI

Now that you've enabled each personalization feature by setting the respective campaign ARN, you can test these personalization features through the Web App UI. If you haven't already opened a browser window/tab to the Retail Demo Store Web UI, navigate to the CloudFormation console in this AWS account and check the Outputs section of the stack used to launch the Retail Demo Store. Make sure you're checking the base stack and not the nested stacks that were created. In the Outputs section look for the output named: WebURL and browse to the URL provided.

![CloudFormation Outputs](../images/cfn-webui-outputs.png)

If you haven't already created a user account in your Retail Demo Store instance, let's create one now. Once you've accessed the Retail Demo Store Web UI, you can logon and create a new account. Click on the Sign In button and then the "**No account? Create account**" link to create an account. Follow the prompts and enter the required data. You will need to provide a valid mobile phone number in order to receive an SMS message with the confirmation code to validate your account.

Once you've created and validated your account, Click on the Sign In button again and login with the account you created.

### Emulate Shopper

To confirm product recommendations are personalized, you can emulate a different user. Click on your username in the top right-corner and then select **Profile**. Use the Switch User drop-down to select a different user. In the drop down you will see a user's First and Last Name and a Persona that matches that user's behavior. Product recommendations should match the persona of the user you've selected.

![Emulate Shopper](../images/retaildemostore-emulate.png)

### Viewing Related Product Recommendations

With the user emulation saved, browse to a product detail page and evaluate the products listed in the **What other items do customers view related to this product?** section. Do they appear consistent with the product being displayed? 

![Related Product Recommendations](./images/retaildemostore-related-products.png)

### Viewing Product Recommendations

With the user emulation saved, browse to the Retail Demo Store home page and evaluate the products listed in the **Inspired by your shopping trends** section (towards bottom of page). Do they appear consistent with the shopping persona you're emulating? 

![Personalized Product Recommendations](./images/retaildemostore-product-recs.png)

Note that if the section is titled **Featured**, this indicates that either you are not signed in as a user or the campaign ARN is not set as the appropriate SSM parameter.

### Personalized Ranking

Finally, let's evaluate the personalizated ranking use-case. With a user emulated, browse to the featured product category list by clicking on "Featured" from the Retail Demo Store home page.

![Personalized Product Ranking](./images/retaildemostore-personalized-ranking.png)

## Event Tracking

Up to this point we have trained and deployed three Amazon Personalize campaigns based on historical data that we generated in this workshop. This allows us to make related product, user recommendations, and rerank product lists based on already observed behavior of our users. However, user intent often changes in real-time such that what products the user is interested in now may be different than what they were interested in a week ago, a day ago, or even a few minutes ago. Making recommendations that keep up with evolving user intent is one of the more difficult challenges with personalization. Fortunately, Amazon Personalize has a mechanism for this exact issue.

Amazon Personalize supports the ability to send real-time user events (i.e. clickstream) data into the service. Personalize uses this event data to improve recommendations. It will also save these events and automatically include them when solutions for the same dataset group are re-created (i.e. model retraining).

The Retail Demo Store's Web UI already has logic to send events such as 'ProductViewed', 'ProductAdded', and 'OrderCompleted' to a Personalize Event Tracker. All we need to do is create an event tracker in Personalize, set the tracking Id for the tracker in an SSM parameter, and rebuild the Web UI service to pick up the change.

### Create Personalize Event Tracker

Let's start by creating an event tracker for our dataset group.

In [74]:
event_tracker_response = personalize.create_event_tracker(
    datasetGroupArn=dataset_group_arn,
    name='retaildemostore-event-tracker'
)

event_tracker_arn = event_tracker_response['eventTrackerArn']
event_tracking_id = event_tracker_response['trackingId']

print('Event Tracker ARN: ' + event_tracker_arn)
print('Event Tracking ID: ' + event_tracking_id)

Event Tracker ARN: arn:aws:personalize:us-east-1:029498593638:event-tracker/700b2f9d
Event Tracking ID: fdb6a7da-b86d-4736-a125-822310e12e90


### Wait for Event Tracker Status to Become ACTIVE

The event tracker should take a minute or so to become active.

In [75]:
status = None
max_time = time.time() + 60*60 # 1 hours
while time.time() < max_time:
    describe_event_tracker_response = personalize.describe_event_tracker(
        eventTrackerArn = event_tracker_arn
    )
    status = describe_event_tracker_response["eventTracker"]["status"]
    print("EventTracker: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(15)

EventTracker: CREATE PENDING
EventTracker: CREATE PENDING
EventTracker: ACTIVE


### Update SSM Parameter To Enable Event Tracking

The Retail Demo Store's Web UI service just needs a Personalize Event Tracking Id to be able to send events to Personalize. The CodeBuild configuration for the Web UI service will pull the event tracking ID from an SSM parameter. 

Let's set our tracking ID in an SSM parameter.

In [76]:
response = ssm.put_parameter(
    Name='retaildemostore-personalize-event-tracker-id',
    Description='Retail Demo Store Personalize Event Tracker ID Parameter',
    Value='{}'.format(event_tracking_id),
    Type='String',
    Overwrite=True
)

### Trigger Web UI Service Release

Next let's trigger a new release of the Retail Demo Store's Web UI service so that it will pick up our SSM parameter change.

In the AWS console, browse to the AWS Code Pipeline service. Find the pipeline with **WebUIService** in the name. Click on the pipeline name.

![AWS CodePipeline](./images/retaildemostore-codepipeline.png)

#### Trigger Release

To manually trigger a new release, click the **Release change** button, click the **Release** button on the popup dialog window, and then wait for the pipeline to build and deploy.

![AWS CodePipeline Release](./images/retaildemostore-codepipeline-release.png)

### Verify Event Tracking

Return to your web browser tab/window where the Retail Demo Store Web UI is loaded. There are a couple ways to verify that events are being sent to the Event Tracker. First, you can use your browser's developer tools to monitor the network calls made by the Retail Demo Store Web UI when you're browsing to product detail pages, adding items to carts, and completing orders. The other way you can verify that events are being received by the event tracker is in CloudWatch metrics for Personalize.

1. **Reload the app by refreshing your browser.**
2. In the Retail Demo Store Web app, view product detail pages, add items to your cart, complete an order.
3. Verify that the Web UI is making "events" calls to the Personalize Event Tracker.
4. In the AWS console, browse to CloudWatch and then Metrics.

![Personalize CloudWatch Metrics](./images/retaildemostore-eventtracker-cw.png)



## Workshop Complete

Congratulations! You have completed the Retail Demo Store Personalization Workshop.

### Cleanup

If you launched the Retail Demo Store in your personal AWS account **AND** you're done with all workshops, you can follow the [Personalize workshop cleanup](./1.2-Personalize-Cleanup.ipynb) notebook to delete all of the Amazon Personalize resources created by this workshop. **IMPORTANT: since the Personalize resources were created by this notebook and not CloudFormation, deleting the CloudFormation stack for the Retail Demo Store will not remove the Personalize resources. You MUST run the [Personalize workshop cleanup](./1.2-Personalize-Cleanup.ipynb) notebook or manually clean up these resources.**

If you are participating in an AWS managed event such as a workshop and using an AWS provided temporary account, you can skip the cleanup workshop unless otherwise instructed.