# Creating and Evaluating Solutions 

To recap from the first notebook:

For the most part, each algorithm in Amazon Personalize is aimed at a different task:

1. HRNN & HRNN-Metadata - Personalization
1. HRNN Coldstart - Personalization that promotes new content
1. Personalized-Ranking - Takes a collection of items and then orders them in probable order of interest using an HRNN-like approach.
1. SIMS (Similar Items) - Given one item, which other items users also interact with.
1. Popularity-Count - Which items are most popular. If HRNN or HRNN-Metadata does not have an answer for the user you query, this is what is returned by default.


No matter the use case, the algorithms all learn from user-item-interaction data, which is defined by 3 core attributes:

1. UserID - User who interacted
1. ItemID - Item the user interacted with
1. Timestamp - When did this interaction occur
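
As a rough illustration of these three attributes, the sketch below writes a few made-up interactions to CSV text. The column names `USER_ID`, `ITEM_ID`, and `TIMESTAMP` follow the Personalize interactions schema, with timestamps given in Unix epoch seconds. The sample IDs and the helper names `to_epoch_seconds` and `interactions_to_csv` are assumptions made for this example and are not part of the earlier notebooks.

```python
import csv
import io
from datetime import datetime, timezone


def to_epoch_seconds(dt):
    """Convert a timezone-aware datetime to whole Unix epoch seconds."""
    return int(dt.timestamp())


# Hypothetical sample interactions: (user, item, when it happened).
sample_interactions = [
    ("user_1", "item_42", datetime(2020, 1, 12, 19, 30, tzinfo=timezone.utc)),
    ("user_1", "item_7", datetime(2020, 1, 12, 19, 35, tzinfo=timezone.utc)),
    ("user_2", "item_42", datetime(2020, 1, 12, 20, 0, tzinfo=timezone.utc)),
]


def interactions_to_csv(rows):
    """Render (user_id, item_id, datetime) tuples as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["USER_ID", "ITEM_ID", "TIMESTAMP"])
    for user_id, item_id, when in rows:
        writer.writerow([user_id, item_id, to_epoch_seconds(when)])
    return buffer.getvalue()


print(interactions_to_csv(sample_interactions))
```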

We also support event types and event values defined by:

1. Event Type - Categorical label of an event (browse, purchased, rated, etc.).
1. Event Value - A numeric value corresponding to the event type. Generally speaking, we look to normalize the values between 0 and 1 across the event types. So if there are three phases to complete a transaction (clicked, added-to-cart, and purchased), the event_value for each phase would be 0.33, 0.66, and 1.0 respectively.
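
The three-phase example can be sketched as a small helper that spreads an ordered list of event types evenly over the range up to 1.0. The function name `stage_event_values` is hypothetical, and even spacing is only one possible normalization. Because this version rounds to two decimals, the middle phase comes out as 0.67 where the text truncates to 0.66.

```python
def stage_event_values(stages):
    """Assign evenly spaced values in (0, 1] to an ordered list of event types."""
    if not stages:
        raise ValueError("at least one stage is required")
    count = len(stages)
    # The last stage always maps to 1.0; earlier stages get proportional values.
    return {stage: round((index + 1) / count, 2) for index, stage in enumerate(stages)}


funnel = ["clicked", "added-to-cart", "purchased"]
event_values = stage_event_values(funnel)
print(event_values)
```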

In this particular exercise we will ignore event_type and event_value. They can come in handy later but are skipped for the initial POC.

The previous notebooks covered:

1. Selecting a dataset.
1. Preparing interactions data for Personalize.
1. Preparing item or user metadata for Personalize [Optional].
1. Creating a Dataset Group.
1. Creating and importing data into an Interactions dataset within the dataset group.
1. Creating and importing data into the metadata datasets [Optional].


## Creating Solutions

This notebook will cover creating the following solutions:

1. HRNN
1. SIMS
1. Personalized-Ranking

After that, the metrics will be explained, and another notebook will showcase how to interact with the Solutions once they are deployed into a Campaign.

The first step is to reload the imports and the stored variables from the previous notebooks.

In [1]:
import boto3
from time import sleep
import subprocess
import pandas as pd
import json
import time
import pprint
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter
import matplotlib.dates as mdates
from datetime import datetime

In [2]:
%store -r

In [4]:
# Configure the SDK to Personalize:
personalize = boto3.client('personalize')
personalize_runtime = boto3.client('personalize-runtime')

In Amazon Personalize a trained model is called a Solution. Each Solution can have many specific versions, each relating to the volume of data that was available when the model was trained.

To begin we will list all the recipes that are supported. A recipe is an algorithm that has not been trained on your data yet. After listing, you'll select one and use that to build your model.


In [5]:
list_recipes_response = personalize.list_recipes()
list_recipes_response

{'recipes': [{'name': 'aws-hrnn',
   'recipeArn': 'arn:aws:personalize:::recipe/aws-hrnn',
   'status': 'ACTIVE',
   'creationDateTime': datetime.datetime(2019, 6, 10, 0, 0, tzinfo=tzlocal()),
   'lastUpdatedDateTime': datetime.datetime(2019, 6, 20, 0, 39, 17, 65000, tzinfo=tzlocal())},
  {'name': 'aws-hrnn-coldstart',
   'recipeArn': 'arn:aws:personalize:::recipe/aws-hrnn-coldstart',
   'status': 'ACTIVE',
   'creationDateTime': datetime.datetime(2019, 6, 10, 0, 0, tzinfo=tzlocal()),
   'lastUpdatedDateTime': datetime.datetime(2019, 6, 20, 0, 39, 17, 64000, tzinfo=tzlocal())},
  {'name': 'aws-hrnn-metadata',
   'recipeArn': 'arn:aws:personalize:::recipe/aws-hrnn-metadata',
   'status': 'ACTIVE',
   'creationDateTime': datetime.datetime(2019, 6, 10, 0, 0, tzinfo=tzlocal()),
   'lastUpdatedDateTime': datetime.datetime(2019, 6, 20, 0, 39, 17, 64000, tzinfo=tzlocal())},
  {'name': 'aws-personalized-ranking',
   'recipeArn': 'arn:aws:personalize:::recipe/aws-personalized-ranking',
   'stat

That is just a JSON representation of all of the algorithms that we have mentioned already. 
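
Since the full response is long, a compact view of just the names and ARNs can be easier to scan. The sketch below is one way to do that. `summarize_recipes` is a hypothetical helper, and `sample_response` is a trimmed stand-in built from the entries shown above so the example runs without AWS access.

```python
def summarize_recipes(response):
    """Return (name, recipeArn) pairs from a list_recipes-style response dict."""
    return [
        (recipe["name"], recipe["recipeArn"])
        for recipe in response.get("recipes", [])
    ]


# Trimmed stand-in for list_recipes_response, limited to fields used here.
sample_response = {
    "recipes": [
        {"name": "aws-hrnn",
         "recipeArn": "arn:aws:personalize:::recipe/aws-hrnn",
         "status": "ACTIVE"},
        {"name": "aws-hrnn-coldstart",
         "recipeArn": "arn:aws:personalize:::recipe/aws-hrnn-coldstart",
         "status": "ACTIVE"},
    ]
}

for name, arn in summarize_recipes(sample_response):
    print(f"{name:<28} {arn}")
```

In the notebook itself you would pass the real response instead: `summarize_recipes(list_recipes_response)`.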

Next we will select a particular algorithm then build a model with it.

### HRNN


HRNN is one of the more advanced recommendation models that you can use, and it allows for things like real-time updates of recommendations based on user behavior. It also tends to outperform other approaches like collaborative filtering. We will kick this job off first as it takes the longest to complete.

#### Select Recipe

In [6]:
HRNN_recipe_arn = "arn:aws:personalize:::recipe/aws-hrnn"

#### Create and Wait for Solution
First you will create the solution with the API, then you will create a version. It will take several minutes to train the model and thus create your version of a solution. Once it gets started and you are seeing the in-progress notifications, it is a good time to take a break, grab a coffee, etc.

Note that the solution itself is just a label, a kind of identifier. You'll also need to create a version, which is the actual trained model.

In [10]:
hrnn_create_solution_response = personalize.create_solution(
    name = "personalize-poc-hrnn",
    datasetGroupArn = dataset_group_arn,
    recipeArn = HRNN_recipe_arn
)

hrnn_solution_arn = hrnn_create_solution_response['solutionArn']
print(json.dumps(hrnn_create_solution_response, indent=2))

{
  "solutionArn": "arn:aws:personalize:us-east-1:059124553121:solution/personalize-poc-hrnn",
  "ResponseMetadata": {
    "RequestId": "1d2424db-7ad0-47ef-b305-91d64d7263ac",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sun, 12 Jan 2020 19:34:51 GMT",
      "x-amzn-requestid": "1d2424db-7ad0-47ef-b305-91d64d7263ac",
      "content-length": "90",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Create the Solution Version

This process will actually take a while to complete, upwards of 25 minutes. Normally there would be a while loop that polls until the task is completed. However, that loop would block other cells from executing, and the goal here is to create many models and deploy them quickly. Instructions for viewing the progress in the browser appear below. After creating all of the solution versions, go there and watch for updates.

In [12]:
hrnn_create_solution_version_response = personalize.create_solution_version(
    solutionArn = hrnn_solution_arn
)

In [13]:
hrnn_solution_version_arn = hrnn_create_solution_version_response['solutionVersionArn']
print(json.dumps(hrnn_create_solution_version_response, indent=2))

{
  "solutionVersionArn": "arn:aws:personalize:us-east-1:059124553121:solution/personalize-poc-hrnn/2ef6b9c1",
  "ResponseMetadata": {
    "RequestId": "757ea8fd-19a0-44de-bc43-23174dd783f5",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sun, 12 Jan 2020 19:37:48 GMT",
      "x-amzn-requestid": "757ea8fd-19a0-44de-bc43-23174dd783f5",
      "content-length": "106",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}
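
If you would rather block until training finishes, the polling loop this notebook skips could look roughly like the sketch below. It relies on the `describe_solution_version` call of the Personalize client and its `solutionVersion.status` field. The helper name `wait_for_solution_version` and the default poll interval and timeout are assumptions chosen for this example, not values prescribed by the service.

```python
import time


def wait_for_solution_version(client, solution_version_arn,
                              poll_seconds=60, max_wait_seconds=3 * 60 * 60):
    """Poll describe_solution_version until the status is ACTIVE or CREATE FAILED.

    Returns the final status string, or raises TimeoutError if neither
    terminal status is reached within max_wait_seconds.
    """
    deadline = time.time() + max_wait_seconds
    while True:
        response = client.describe_solution_version(
            solutionVersionArn=solution_version_arn
        )
        status = response["solutionVersion"]["status"]
        print(f"SolutionVersion status: {status}")
        if status in ("ACTIVE", "CREATE FAILED"):
            return status
        if time.time() >= deadline:
            raise TimeoutError(f"Gave up waiting; last status was {status}")
        time.sleep(poll_seconds)
```

With the variables from this notebook, a call would be `wait_for_solution_version(personalize, hrnn_solution_version_arn)`. Keep in mind that the cell blocks until training ends, which is exactly why the notebook leaves polling out while several solutions are being created.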
