# Deploying Campaigns and Filters<a class="anchor" id="top"></a>

In this notebook, you will deploy and interact with campaigns in Amazon Personalize.

1. [Introduction](#intro)
1. [Create campaigns](#create)
1. [Interact with campaigns](#interact)
1. [Batch recommendations](#batch)
1. [Wrap up](#wrapup)

## Introduction <a class="anchor" id="intro"></a>
[Back to top](#top)

At this point, you should have several solutions and at least one solution version for each. Once a solution version is created, you can get recommendations from it and get a feel for its overall behavior.

This notebook starts off by deploying each of the solution versions from the previous notebook into individual campaigns. Once they are active, there are resources for querying the recommendations, and helper functions to digest the output into something more human-readable. 

As you work with your customer on Amazon Personalize, you can modify the helper functions to fit the structure of their data input files so that the additional rendering keeps working.

To get started, once again, we need to import libraries, load values from previous notebooks, and load the SDK.

In [1]:
import time
from time import sleep
import json
from datetime import datetime
import uuid

import boto3
import pandas as pd

In [2]:
%store -r

In [3]:
personalize = boto3.client('personalize')
personalize_runtime = boto3.client('personalize-runtime')

# Establish a connection to Personalize's event streaming
personalize_events = boto3.client(service_name='personalize-events')

## Create campaigns <a class="anchor" id="create"></a>
[Back to top](#top)

A campaign is a hosted solution version: an endpoint that you can query for recommendations. Pricing is based on throughput capacity (user requests for personalization per second). When deploying a campaign, you set a minimum provisioned transactions per second (TPS) value. This service, like many within AWS, will automatically scale based on demand, but if latency is critical, you may want to provision ahead for larger demand. For this POC and demo, all minimum throughput thresholds are set to 1. For more information, see the [pricing page](https://aws.amazon.com/personalize/pricing/).
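If you later decide to provision ahead for larger demand, the minimum TPS of an existing campaign can be changed without recreating it. The following is a minimal sketch, assuming the boto3 `update_campaign` call; the helper name `set_min_provisioned_tps` and the check that the value is at least 1 are assumptions made for this example, not part of the notebook.

```python
def set_min_provisioned_tps(client, campaign_arn, min_tps):
    """Request a new minimum provisioned TPS for an existing campaign.

    `client` is expected to behave like boto3.client('personalize').
    Hypothetical helper: the lower bound of 1 is an assumption based on
    the value used throughout this demo.
    """
    if min_tps < 1:
        raise ValueError("min_tps must be at least 1")

    response = client.update_campaign(
        campaignArn=campaign_arn,
        minProvisionedTPS=min_tps,
    )
    # The update is asynchronous; the campaign keeps serving while it is applied.
    return response["campaignArn"]
```

A call would look like `set_min_provisioned_tps(personalize, userpersonalization_campaign_arn, 5)` once the campaign is active.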

Let's start deploying the campaigns.

### HRNN

Deploy a campaign for your HRNN solution version. It can take around 10 minutes to deploy a campaign. Normally, we would use a while loop to poll until the task is completed. However, the loop would block other cells from executing, and the goal here is to create multiple campaigns. So we will set up the while loop for all of the campaigns further down in the notebook. There, you will also find instructions for viewing the progress in the AWS console.

In [4]:
userpersonalization_create_campaign_response = personalize.create_campaign(
    name = "personalize-poc-hrnn",
    solutionVersionArn = userpersonalization_solution_version_arn,
    minProvisionedTPS = 1
)

userpersonalization_campaign_arn = userpersonalization_create_campaign_response['campaignArn']
print(json.dumps(userpersonalization_create_campaign_response, indent=2))

{
  "campaignArn": "arn:aws:personalize:us-east-1:059124553121:campaign/personalize-poc-hrnn",
  "ResponseMetadata": {
    "RequestId": "6ae29a70-3c49-47c2-aa44-db3d8efdf3e0",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Thu, 27 Aug 2020 20:40:52 GMT",
      "x-amzn-requestid": "6ae29a70-3c49-47c2-aa44-db3d8efdf3e0",
      "content-length": "90",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


### SIMS

Deploy a campaign for your SIMS solution version. As with the HRNN campaign above, deployment can take around 10 minutes, and we defer polling to the status loop further down in the notebook so that this cell does not block the others.

In [5]:
sims_create_campaign_response = personalize.create_campaign(
    name = "personalize-poc-SIMS",
    solutionVersionArn = sims_solution_version_arn,
    minProvisionedTPS = 1
)

sims_campaign_arn = sims_create_campaign_response['campaignArn']
print(json.dumps(sims_create_campaign_response, indent=2))

{
  "campaignArn": "arn:aws:personalize:us-east-1:059124553121:campaign/personalize-poc-SIMS",
  "ResponseMetadata": {
    "RequestId": "b937ad22-b7b4-404b-8b37-bfca43f3e0ac",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Thu, 27 Aug 2020 20:40:55 GMT",
      "x-amzn-requestid": "b937ad22-b7b4-404b-8b37-bfca43f3e0ac",
      "content-length": "90",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


### Personalized Ranking

Deploy a campaign for your personalized ranking solution version. As with the HRNN campaign above, deployment can take around 10 minutes, and we defer polling to the status loop further down in the notebook so that this cell does not block the others.

In [6]:
rerank_create_campaign_response = personalize.create_campaign(
    name = "personalize-poc-rerank",
    solutionVersionArn = rerank_solution_version_arn,
    minProvisionedTPS = 1
)

rerank_campaign_arn = rerank_create_campaign_response['campaignArn']
print(json.dumps(rerank_create_campaign_response, indent=2))

{
  "campaignArn": "arn:aws:personalize:us-east-1:059124553121:campaign/personalize-poc-rerank",
  "ResponseMetadata": {
    "RequestId": "500a9191-f0ee-42cd-bfd6-6d2f1d1fa8b7",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Thu, 27 Aug 2020 20:40:57 GMT",
      "x-amzn-requestid": "500a9191-f0ee-42cd-bfd6-6d2f1d1fa8b7",
      "content-length": "92",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


### View campaign creation status

As promised, here is how to view the status updates in the console:

* In another browser tab you should already have the AWS Console up from opening this notebook instance. 
* Switch to that tab and search at the top for the service `Personalize`, then go to that service page. 
* Click `View dataset groups`.
* Click the name of your dataset group, most likely something with POC in the name.
* Click `Campaigns`.
* You will now see a list of all of the campaigns you created above, including a column with the status of the campaign. Once it is `Active`, your campaign is ready to be queried.

Or simply run the cell below to keep track of the campaign creation status.

In [14]:
in_progress_campaigns = [
    userpersonalization_campaign_arn,
    sims_campaign_arn,
    rerank_campaign_arn
]

max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    for campaign_arn in list(in_progress_campaigns):  # iterate over a copy, because the list is modified below
        version_response = personalize.describe_campaign(
            campaignArn = campaign_arn
        )
        status = version_response["campaign"]["status"]
        
        if status == "ACTIVE":
            print("Build succeeded for {}".format(campaign_arn))
            in_progress_campaigns.remove(campaign_arn)
        elif status == "CREATE FAILED":
            print("Build failed for {}".format(campaign_arn))
            in_progress_campaigns.remove(campaign_arn)
    
    if len(in_progress_campaigns) <= 0:
        break
    else:
        print("At least one campaign build is still in progress")
        
    time.sleep(60)

Build succeeded for arn:aws:personalize:us-east-1:059124553121:campaign/personalize-poc-hrnn
Build succeeded for arn:aws:personalize:us-east-1:059124553121:campaign/personalize-poc-rerank
At least one campaign build is still in progress
Build succeeded for arn:aws:personalize:us-east-1:059124553121:campaign/personalize-poc-SIMS
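The polling logic can also be wrapped in a reusable function. This is a minimal sketch, not part of the original notebook: the name `wait_for_campaigns` and its parameters are hypothetical, and `client` is assumed to behave like `boto3.client('personalize')`. It tracks pending ARNs in a set and iterates over a snapshot, so removing finished campaigns never skips an entry.

```python
import time


def wait_for_campaigns(client, campaign_arns, poll_seconds=60, timeout_seconds=3 * 60 * 60):
    """Poll describe_campaign until each campaign is ACTIVE or CREATE FAILED.

    Returns a dict mapping ARN -> final status. ARNs that are still
    pending when the timeout expires are left out of the result.
    """
    pending = set(campaign_arns)
    final_status = {}
    deadline = time.time() + timeout_seconds

    while pending and time.time() < deadline:
        # sorted() builds a new list, so discarding from `pending` is safe here.
        for arn in sorted(pending):
            status = client.describe_campaign(campaignArn=arn)["campaign"]["status"]
            if status in ("ACTIVE", "CREATE FAILED"):
                final_status[arn] = status
                pending.discard(arn)
        if pending:
            time.sleep(poll_seconds)

    return final_status
```

With the variables from this notebook, a call would look like `wait_for_campaigns(personalize, [userpersonalization_campaign_arn, sims_campaign_arn, rerank_campaign_arn])`.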


## Create Filters <a class="anchor" id="interact"></a>
[Back to top](#top)

Now that all campaigns are deployed and active, we can create filters. Filters can be created for both Items and Events. A few common use cases for filters in Video On Demand are:

* Categorical filters based on item metadata: often your item metadata will have information about the title such as Genre, Keyword, Year, or Decade. Filtering on these can provide recommendations within that category, such as action movies.
* Event filters: you may want to include or exclude items based on a user's past events, for example moving a title from a "suggestions to watch" recommendation to a "watch again" recommendation.

Let's look at the item metadata and user interactions to get an idea of what types of filters we can create.

In [9]:
# Create a dataframe for the items by reading in the correct source CSV
items_df = pd.read_csv(data_dir + '/item-meta.csv', sep=',', index_col=0)
interactions_df = pd.read_csv(data_dir + '/interactions.csv', sep=',', index_col=0)

# Render some sample data
items_df.head(10)
#interactions_df.head(10)


ITEM_ID,GENRE
1,Adventure|Animation|Children|Comedy|Fantasy
2,Adventure|Children|Fantasy
3,Comedy|Romance
4,Comedy|Drama|Romance
5,Comedy
6,Action|Crime|Thriller
7,Comedy|Romance
8,Adventure|Children
9,Action
10,Action|Adventure|Thriller


Next, we determine which genres to filter on, and for that we need a list of all genres. First we get all the unique values of the GENRE column, then split each string on `|` if it contains one. Every resulting value is added to one long list, which is converted to a set to remove duplicates. That set is then turned back into a list so that we can iterate over it and call the create filter API.

In [16]:
unique_genre_field_values = items_df['GENRE'].unique()

genre_val_list = []

def process_for_bar_char(val, val_list):
    if '|' in val:
        values = val.split('|')
        for item in values:
            val_list.append(item)
    else:
        val_list.append(val)
    return val_list
    

for val in unique_genre_field_values:
    genre_val_list = process_for_bar_char(val, genre_val_list)

genres_to_filter = list(set(genre_val_list))
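The same split-and-deduplicate step can be sketched more compactly. The helper name `unique_pipe_values` is hypothetical, and the sample values are taken from the table above. It relies on `str.split` returning a one-element list when the separator is absent, so no special case is needed for single-genre titles.

```python
def unique_pipe_values(values, sep="|"):
    """Return the sorted unique tokens found in `sep`-delimited strings."""
    tokens = set()
    for value in values:
        # "Comedy".split("|") yields ["Comedy"], so both cases are handled alike.
        tokens.update(value.split(sep))
    return sorted(tokens)


sample = ["Comedy|Romance", "Action", "Action|Crime|Thriller"]
print(unique_pipe_values(sample))
# ['Action', 'Comedy', 'Crime', 'Romance', 'Thriller']
```

Applied to the notebook's data, this would be `unique_pipe_values(items_df['GENRE'].unique())`, assuming the column holds only strings.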

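We have good categorical information about the genres, so let's create some filters based on genre, for example for Action, Comedy, Horror, Sci-Fi, and Children. The cell below loops over every genre in `genres_to_filter`, so with many genres it can run into service limits, as the error output shows. Note: the default limit for filters is 10.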

In [17]:
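for genre in genres_to_filter:
    # Substitute the genre into the expression; without .format() the
    # literal text "{genre}" would be sent to the service.
    createfilter_response = personalize.create_filter(
        name=genre,
        datasetGroupArn=dataset_group_arn,
        filterExpression='INCLUDE ItemID WHERE Items.GENRE IN ("{}")'.format(genre)
    )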


LimitExceededException: An error occurred (LimitExceededException) when calling the CreateFilter operation: More than 5 resources with PENDING or IN_PROGRESS status. Please try again later.
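One way to avoid this error is to cap how many filters are being created at once. The sketch below is an illustration under assumptions, not the notebook's method: `create_filters_throttled` and its parameters are hypothetical names, `client` is assumed to behave like `boto3.client('personalize')`, and the cap of 5 is taken from the error message above. It uses `describe_filter` to check whether earlier filters have finished before creating more.

```python
import time


def create_filters_throttled(client, dataset_group_arn, expressions,
                             max_in_flight=5, poll_seconds=10):
    """Create one filter per (name, expression) pair in `expressions`.

    Never keeps more than `max_in_flight` creations pending at once.
    Returns a dict mapping filter name -> filter ARN.
    """
    in_flight = []  # ARNs whose creation has not finished yet
    created = {}

    for name, expression in expressions.items():
        # Wait until at least one earlier creation has finished.
        while len(in_flight) >= max_in_flight:
            in_flight = [
                arn for arn in in_flight
                if client.describe_filter(filterArn=arn)["filter"]["status"]
                not in ("ACTIVE", "CREATE FAILED")
            ]
            if len(in_flight) >= max_in_flight:
                time.sleep(poll_seconds)

        response = client.create_filter(
            name=name,
            datasetGroupArn=dataset_group_arn,
            filterExpression=expression,
        )
        created[name] = response["filterArn"]
        in_flight.append(response["filterArn"])

    return created
```

The `expressions` argument could be built from the genre list, for example `{g: 'INCLUDE ItemID WHERE Items.GENRE IN ("{}")'.format(g) for g in genres_to_filter}`. The overall filter quota still applies, so a long genre list may need trimming first.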

Let's also create two event filters, one for watched and one for unwatched content.

In [29]:
# 'watched': recommend only items the user has already watched
createwatchedfilter_response = personalize.create_filter(name='watched',
    datasetGroupArn=dataset_group_arn,
    filterExpression='INCLUDE ItemID WHERE Interactions.event_type IN ("watch")'
    )

# 'unwatched': leave out items the user has already watched
createunwatchedfilter_response = personalize.create_filter(name='unwatched',
    datasetGroupArn=dataset_group_arn,
    filterExpression='EXCLUDE ItemID WHERE Interactions.event_type IN ("watch")'
    )
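Once a filter is active, it is applied at query time by passing its ARN to the runtime client. The following is a minimal sketch, assuming the `get_recommendations` call of `boto3.client('personalize-runtime')` with its `filterArn` parameter; the helper name `recommend_with_filter` is hypothetical.

```python
def recommend_with_filter(runtime_client, campaign_arn, user_id,
                          filter_arn=None, num_results=10):
    """Return recommended item IDs for a user, optionally applying a filter."""
    params = {
        "campaignArn": campaign_arn,
        "userId": str(user_id),  # the API expects string identifiers
        "numResults": num_results,
    }
    if filter_arn is not None:
        params["filterArn"] = filter_arn

    response = runtime_client.get_recommendations(**params)
    return [item["itemId"] for item in response["itemList"]]
```

For example, `recommend_with_filter(personalize_runtime, userpersonalization_campaign_arn, user_id, createwatchedfilter_response['filterArn'])` would query the user-personalization campaign with the event filter created above, where `user_id` is any user from the interactions data.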


In [47]:
%store sims_campaign_arn
%store userpersonalization_campaign_arn
%store rerank_campaign_arn


Stored 'sims_campaign_arn' (str)
Stored 'userpersonalization_campaign_arn' (str)
Stored 'rerank_campaign_arn' (str)
