# Amazon Personalize Workshop

This workshop walks through the following steps:

1. Create a schema describing the dataset, using Personalize-reserved keywords for user ids, item ids, etc.
2. Create a dataset group that contains datasets used for building the model and for predicting: user-item interactions (aka “who liked what”), users and items. The last two are optional, as we will see in the example below.
3. Send data to Personalize.
4. Create a solution, i.e. select a recommendation recipe and train it on the dataset group.
5. Create a campaign, i.e. deploy the trained solution, and query it for recommendations.
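
Steps 4 and 5 in the list above do not appear as code in this part of the notebook. A minimal sketch of how they might look with boto3 follows, assuming `personalize` is a `boto3.client("personalize")` and that the request shapes match the current boto3 API rather than the preview CLI models downloaded further down. The helper names (`wait_until_active`, `train_and_deploy`), the `-solution` and `-campaign` name suffixes, and `minProvisionedTPS=1` are illustrative assumptions, not part of the workshop.

```python
import time


def wait_until_active(describe, key, poll_seconds=60, max_polls=120):
    """Poll a describe_* call until the resource reports status ACTIVE.

    `describe` is a zero-argument callable returning the describe response,
    `key` is its top-level key (e.g. "solutionVersion" or "campaign").
    """
    for _ in range(max_polls):
        status = describe()[key]["status"]
        if status == "ACTIVE":
            return
        if status == "CREATE FAILED":
            raise RuntimeError(f"{key} creation failed")
        time.sleep(poll_seconds)
    raise TimeoutError(f"{key} did not become ACTIVE in time")


def train_and_deploy(personalize, name, dataset_group_arn, recipe_arn,
                     poll_seconds=60):
    """Step 4 (solution + solution version) and step 5 (campaign)."""
    # Step 4a: pick a recipe for the dataset group.
    solution_arn = personalize.create_solution(
        name=f"{name}-solution",          # hypothetical naming convention
        datasetGroupArn=dataset_group_arn,
        recipeArn=recipe_arn,
    )["solutionArn"]

    # Step 4b: training happens when a solution version is created.
    version_arn = personalize.create_solution_version(
        solutionArn=solution_arn
    )["solutionVersionArn"]
    wait_until_active(
        lambda: personalize.describe_solution_version(
            solutionVersionArn=version_arn),
        "solutionVersion", poll_seconds)

    # Step 5: deploy the trained version behind a campaign.
    campaign_arn = personalize.create_campaign(
        name=f"{name}-campaign",          # hypothetical naming convention
        solutionVersionArn=version_arn,
        minProvisionedTPS=1,              # assumed value for a demo workload
    )["campaignArn"]
    wait_until_active(
        lambda: personalize.describe_campaign(campaignArn=campaign_arn),
        "campaign", poll_seconds)
    return campaign_arn
```

Training and deployment are asynchronous, which is why the sketch polls after each long-running call. Taking the client as an argument keeps the functions easy to try against a stub before pointing them at a real account.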

In [1]:
!pip install awscli botocore boto3 --upgrade 

Requirement already up-to-date: awscli in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (1.16.125)
Requirement already up-to-date: botocore in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (1.12.115)
Requirement already up-to-date: boto3 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (1.9.115)
Requirement not upgraded as not directly required: PyYAML<=3.13,>=3.10 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from awscli) (3.12)
Requirement not upgraded as not directly required: rsa<=3.5.0,>=3.1.2 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from awscli) (3.4.2)
Requirement not upgraded as not directly required: docutils>=0.10 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from awscli) (0.14)
Requirement not upgraded as not directly required: colorama<=0.3.9,>=0.2.5 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from awscli) (0.3.9)

In [11]:
import pandas
import boto3
from sagemaker import get_execution_role
from sklearn.utils import shuffle
import sagemaker

In [3]:
!wget -N https://s3-us-west-2.amazonaws.com/personalize-cli-json-models/personalize.json
!wget -N https://s3-us-west-2.amazonaws.com/personalize-cli-json-models/personalize-runtime.json
!aws configure add-model --service-model file://`pwd`/personalize.json --service-name personalize
!aws configure add-model --service-model file://`pwd`/personalize-runtime.json --service-name personalize-runtime

--2019-03-18 06:07:08--  https://s3-us-west-2.amazonaws.com/personalize-cli-json-models/personalize.json
Resolving s3-us-west-2.amazonaws.com (s3-us-west-2.amazonaws.com)... 52.218.216.232
Connecting to s3-us-west-2.amazonaws.com (s3-us-west-2.amazonaws.com)|52.218.216.232|:443... connected.
HTTP request sent, awaiting response... 304 Not Modified
File ‘personalize.json’ not modified on server. Omitting download.

--2019-03-18 06:07:09--  https://s3-us-west-2.amazonaws.com/personalize-cli-json-models/personalize-runtime.json
Resolving s3-us-west-2.amazonaws.com (s3-us-west-2.amazonaws.com)... 52.218.216.232
Connecting to s3-us-west-2.amazonaws.com (s3-us-west-2.amazonaws.com)|52.218.216.232|:443... connected.
HTTP request sent, awaiting response... 304 Not Modified
File ‘personalize-runtime.json’ not modified on server. Omitting download.



In [5]:
!wget -N http://files.grouplens.org/datasets/movielens/ml-100k.zip
!unzip -o ml-100k.zip

--2019-03-18 06:08:38--  http://files.grouplens.org/datasets/movielens/ml-100k.zip
Resolving files.grouplens.org (files.grouplens.org)... 128.101.34.235
Connecting to files.grouplens.org (files.grouplens.org)|128.101.34.235|:80... connected.
HTTP request sent, awaiting response... 304 Not Modified
File ‘ml-100k.zip’ not modified on server. Omitting download.

Archive:  ml-100k.zip
  inflating: ml-100k/allbut.pl       
  inflating: ml-100k/mku.sh          
  inflating: ml-100k/README          
  inflating: ml-100k/u.data          
  inflating: ml-100k/u.genre         
  inflating: ml-100k/u.info          
  inflating: ml-100k/u.item          
  inflating: ml-100k/u.occupation    
  inflating: ml-100k/u.user          
  inflating: ml-100k/u1.base         
  inflating: ml-100k/u1.test         
  inflating: ml-100k/u2.base         
  inflating: ml-100k/u2.test         
  inflating: ml-100k/u3.base         
  inflating: ml-100k/u3.test         
  inflating: ml-100k/u4.base         
  inflat



## Schema

```json
{
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "ITEM_ID", "type": "string"},
        {"name": "USER_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"}
    ],
    "version": "1.0"
}
```
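
The schema above has to be registered with Personalize before a dataset can reference it. A minimal sketch of that call, assuming `personalize` is a `boto3.client("personalize")` and that the current boto3 `create_schema` request shape applies; `register_schema` and the schema name passed to it are hypothetical.

```python
import json

# Same Avro schema as shown above, as a Python dict.
INTERACTIONS_SCHEMA = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "ITEM_ID", "type": "string"},
        {"name": "USER_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},
    ],
    "version": "1.0",
}


def register_schema(personalize, name, schema=INTERACTIONS_SCHEMA):
    """Register the schema and return its ARN.

    The API takes the schema as a JSON string, not as a dict.
    """
    response = personalize.create_schema(name=name, schema=json.dumps(schema))
    return response["schemaArn"]


# Usage (requires AWS credentials, so left commented out):
# import boto3
# schema_arn = register_schema(boto3.client("personalize"), "DEMO-schema")
```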

In [18]:
role = get_execution_role()
print("IAM SERVICE ROLE", role)
sess = sagemaker.Session()
bucket = sess.default_bucket()
prefix = 'amazon-personalize'

INFO:sagemaker:Created S3 bucket: sagemaker-us-east-1-194989662172


IAM SERVICE ROLE arn:aws:iam::194989662172:role/service-role/AmazonSageMaker-ExecutionRole-20171218T174555


## Prepare and Upload Dataset

In [17]:
data = pandas.read_csv('./ml-100k/u.data', sep='\t', names=['USER_ID', 'ITEM_ID', 'RATING', 'TIMESTAMP'])
pandas.set_option('display.max_rows', 5)
print(data)
filename = "processed.csv"
data = data[data['RATING'] > 3.6]                # keep only ratings above 3.6, i.e. 4 and 5 (ratings are integers from 1 to 5)
data = data[['USER_ID', 'ITEM_ID', 'TIMESTAMP']] # select the columns that match the fields of the Interactions schema
data.to_csv(filename, index=False)
boto3.Session().resource('s3').Bucket(bucket).Object(prefix+'/'+filename).upload_file(filename)


       USER_ID  ITEM_ID  RATING  TIMESTAMP
0          196      242       3  881250949
1          186      302       3  891717742
...        ...      ...     ...        ...
99998       13      225       2  882399156
99999       12      203       3  879959583

[100000 rows x 4 columns]
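
The column selection in the cell above matters because the CSV header is expected to line up with the schema fields. A quick check before uploading might look like the following sketch; `compare_header` is a hypothetical helper written for illustration, and it assumes the file has a header row and the schema is available as a dict.

```python
import csv
import io

# Field list copied from the Interactions schema in this notebook.
SCHEMA = {
    "fields": [
        {"name": "ITEM_ID", "type": "string"},
        {"name": "USER_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},
    ]
}


def compare_header(csv_text, schema):
    """Return (missing, extra) column names of a CSV relative to a schema."""
    header = next(csv.reader(io.StringIO(csv_text)))
    expected = [field["name"] for field in schema["fields"]]
    missing = [name for name in expected if name not in header]
    extra = [name for name in header if name not in expected]
    return missing, extra


# A file that still carries RATING and lacks TIMESTAMP is flagged on both counts.
print(compare_header("USER_ID,ITEM_ID,RATING\n196,242,3\n", SCHEMA))
```

To check the file written by the notebook, the text of `processed.csv` can be passed in, e.g. `compare_header(open("processed.csv").read(), SCHEMA)`.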


## Create Dataset Group
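
The cells that created the dataset group are not part of this export. As a hedged sketch only, the following shows how a dataset group, an interactions dataset and an import job might be created with boto3, assuming `personalize` is a `boto3.client("personalize")`, the current boto3 request shapes, and a `role_arn` that Personalize is allowed to assume with read access to the bucket. The function name and the resource name suffixes are hypothetical.

```python
import time


def import_interactions(personalize, name, schema_arn, s3_uri, role_arn,
                        poll_seconds=15):
    """Create a dataset group, add an Interactions dataset, start an import."""
    group_arn = personalize.create_dataset_group(
        name=f"{name}-dataset-group"      # hypothetical naming convention
    )["datasetGroupArn"]

    # Datasets can only be added once the group is ACTIVE.
    while True:
        status = personalize.describe_dataset_group(
            datasetGroupArn=group_arn)["datasetGroup"]["status"]
        if status == "ACTIVE":
            break
        if status == "CREATE FAILED":
            raise RuntimeError("dataset group creation failed")
        time.sleep(poll_seconds)

    dataset_arn = personalize.create_dataset(
        name=f"{name}-interactions",      # hypothetical naming convention
        schemaArn=schema_arn,
        datasetGroupArn=group_arn,
        datasetType="Interactions",
    )["datasetArn"]

    # The import job copies the CSV from S3 into the dataset asynchronously.
    job_arn = personalize.create_dataset_import_job(
        jobName=f"{name}-import",         # hypothetical naming convention
        datasetArn=dataset_arn,
        dataSource={"dataLocation": s3_uri},
        roleArn=role_arn,
    )["datasetImportJobArn"]
    return group_arn, dataset_arn, job_arn


# Usage with the variables defined earlier in the notebook (not run here):
# import boto3
# import_interactions(boto3.client("personalize"), "DEMO", schema_arn,
#                     f"s3://{bucket}/{prefix}/{filename}", role)
```

The import job itself also takes time to finish; its status can be polled with `describe_dataset_import_job` before a solution is trained.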

In [27]:

!aws personalize list-campaigns
!aws personalize describe-campaign --campaign-arn $CAMPAIGN_ARN

{
    "campaigns": [
        {
            "name": "DEMO-campaign",
            "campaignArn": "arn:aws:personalize:us-east-1:194989662172:campaign/DEMO-campaign",
            "status": "ACTIVE",
            "creationDateTime": 1552507038.214,
            "lastUpdatedDateTime": 1552508841.248
        },
        {
            "name": "donnieMovies-campaign",
            "campaignArn": "arn:aws:personalize:us-east-1:194989662172:campaign/donnieMovies-campaign",
            "status": "ACTIVE",
            "creationDateTime": 1552902398.737,
            "lastUpdatedDateTime": 1552902790.966
        }
    ]
}
{
    "campaign": {
        "name": "donnieMovies-campaign",
        "updateMode": "MANUAL",
        "campaignArn": "arn:aws:personalize:us-east-1:194989662172:campaign/donnieMovies-campaign",
        "solutionArn": "arn:aws:personalize:us-east-1:194989662172:solution/donnieMovie-solution",
        "solutionVersionArn": "arn:aws:personalize:us-east-1:194989662172:solution/donnieMovie-s
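
The `get-recommendations` CLI call in the next cell can also be made from Python. The sketch below assumes `runtime` is a `boto3.client("personalize-runtime")` and that the response has the `itemList` / `itemId` shape shown in the CLI output. `recommend_item_ids` and `parse_titles` are hypothetical helpers; the second one relies on the MovieLens `u.item` layout, where fields are separated by `|` and the first two are the movie id and the title.

```python
def recommend_item_ids(runtime, campaign_arn, user_id, num_results=25):
    """Return the recommended item ids for a user as a list of strings."""
    response = runtime.get_recommendations(
        campaignArn=campaign_arn,
        userId=str(user_id),      # ids are strings in the Interactions schema
        numResults=num_results,
    )
    return [item["itemId"] for item in response["itemList"]]


def parse_titles(lines):
    """Map movie id -> title from lines in the u.item format."""
    titles = {}
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) >= 2:
            titles[fields[0]] = fields[1]
    return titles


# Usage (needs AWS credentials and the unzipped dataset, so not run here):
# import boto3
# runtime = boto3.client("personalize-runtime")
# with open("ml-100k/u.item", encoding="ISO-8859-1") as f:
#     titles = parse_titles(f)
# for item_id in recommend_item_ids(runtime, CAMPAIGN_ARN, USER_ID):
#     print(item_id, titles.get(item_id, "<unknown>"))
```

Mapping ids back to titles makes the recommendation list easier to sanity-check by eye than the raw ids returned by the campaign.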

In [43]:
CAMPAIGN_ARN="XXX"
USER_ID="13"
!aws personalize-runtime get-recommendations --campaign-arn $CAMPAIGN_ARN --user-id $USER_ID 

{
    "itemList": [
        {
            "itemId": "328"
        },
        {
            "itemId": "650"
        },
        {
            "itemId": "1395"
        },
        {
            "itemId": "451"
        },
        {
            "itemId": "1524"
        },
        {
            "itemId": "391"
        },
        {
            "itemId": "194"
        },
        {
            "itemId": "395"
        },
        {
            "itemId": "105"
        },
        {
            "itemId": "884"
        },
        {
            "itemId": "52"
        },
        {
            "itemId": "723"
        },
        {
            "itemId": "951"
        },
        {
            "itemId": "1064"
        },
        {
            "itemId": "608"
        },
        {
            "itemId": "1128"
        },
        {
            "itemId": "929"
        },
        {
            "itemId": "367"
        },
        {
            "itemId": "155"