# Creating Clickstream Data

Recommendations become more context-aware and accurate when they include user behavior such as clicks. This notebook assumes you have fully deployed the Personalens project, both the application and the Personalize service, in your AWS account.

## Getting Started

Before running the rest of this notebook, fill out the variables below:

In [16]:
# Obtain these from the DB Notebook
campaignArn = "arn:aws:personalize:us-east-1:059124553121:campaign/Dj-campaign"


In [31]:
# Imports
import boto3

import json
import numpy as np
import pandas as pd
import time

!wget -N https://s3-us-west-2.amazonaws.com/personalize-cli-json-models/personalize.json
!wget -N https://s3-us-west-2.amazonaws.com/personalize-cli-json-models/personalize-runtime.json
!wget -N https://s3-us-west-2.amazonaws.com/personalize-cli-json-models/personalize-events.json
!aws configure add-model --service-model file://`pwd`/personalize.json --service-name personalize
!aws configure add-model --service-model file://`pwd`/personalize-runtime.json --service-name personalize-runtime
!aws configure add-model --service-name personalize-events --service-model file://personalize-events.json

personalize = boto3.client(service_name='personalize', endpoint_url='https://personalize.us-east-1.amazonaws.com')
personalize_runtime = boto3.client(service_name='personalize-runtime', endpoint_url='https://personalize-runtime.us-east-1.amazonaws.com')



--2019-02-15 21:40:14--  https://s3-us-west-2.amazonaws.com/personalize-cli-json-models/personalize.json
Resolving s3-us-west-2.amazonaws.com (s3-us-west-2.amazonaws.com)... 52.218.245.216
Connecting to s3-us-west-2.amazonaws.com (s3-us-west-2.amazonaws.com)|52.218.245.216|:443... connected.
HTTP request sent, awaiting response... 304 Not Modified
File ‘personalize.json’ not modified on server. Omitting download.

--2019-02-15 21:40:15--  https://s3-us-west-2.amazonaws.com/personalize-cli-json-models/personalize-runtime.json
Resolving s3-us-west-2.amazonaws.com (s3-us-west-2.amazonaws.com)... 52.218.245.216
Connecting to s3-us-west-2.amazonaws.com (s3-us-west-2.amazonaws.com)|52.218.245.216|:443... connected.
HTTP request sent, awaiting response... 304 Not Modified
File ‘personalize-runtime.json’ not modified on server. Omitting download.

--2019-02-15 21:40:15--  https://s3-us-west-2.amazonaws.com/personalize-cli-json-models/personalize-events.json
Resolving s3-us-west-2.amazonaws.com

In [18]:
# Imports for Django and the movielens models

import json
import datetime
import django
django.setup()

from movielens.models import User
from movielens.models import Item
from movielens.models import UserData

from movielens.views import stream_event

## Creating an IAM Role

Follow the instructions at the link below, then save the role ARN in the `roleArn` variable.

https://docs.aws.amazon.com/personalize/latest/dg/getting-started.html#gs-create-role-with-permissions

In [5]:
roleArn = "arn:aws:iam::059124553121:role/PersonalizeRole"

## Setting the Dataset Group for the Clickstream Data

Obtain your `datasetGroupArn` from the previous notebook, `Personalize_Sample_With_DB.ipynb`, and define it below.

In [6]:
personalize = boto3.client('personalize')

datasetGroupArn = "arn:aws:personalize:us-east-1:059124553121:dataset-group/personalens-dataset-group"

## Getting a Tracking ID

A tracking ID associates an event with a dataset group and authorizes you to send data to Amazon Personalize. You generate a tracking ID by calling the CreateEventTracker API, supplying the IAM role that you created in the Creating an IAM Role section above.

In [7]:
# Create the event tracker using the dataset group ARN and role ARN defined above:
response = personalize.create_event_tracker(
    name="MovieClickTracker",
    datasetGroupArn=datasetGroupArn,
    roleArn=roleArn
)
print(response['eventTrackerArn'])
print(response['trackingId'])

arn:aws:personalize:us-east-1:059124553121:event-tracker/a2344bdf
4a7e200e-9de4-46ee-acb0-4494371d5670


In [8]:
# Save the Tracker ID and ARN:
eventTrackerArn = "arn:aws:personalize:us-east-1:059124553121:event-tracker/a2344bdf"
trackingId = "4a7e200e-9de4-46ee-acb0-4494371d5670"

The schema for the event data is predefined. You can learn more here: https://docs.aws.amazon.com/personalize/latest/dg/recording-events.html

The schema itself:

```JSON
{
    "schema": {
        "name": "event-interactions-schema",
        "schemaArn": "arn:aws:personalize:us-west-2:acct-id:schema/event-interactions-schema",
        "schema": {
          "type": "record",
          "name": "Interactions",
          "namespace": "com.amazonaws.personalize.schema",
          "fields": [
            {
              "name": "user_id",
              "type": "string"
            },
            {
              "name": "session_id",
              "type": "string"
            },
            {
              "name": "timestamp",
              "type": "long"
            },
            {
              "name": "event_type",
              "type": "string"
            },
            {
              "name": "item_id",
              "type": "string"
            },
            {
              "name": "event_value",
              "type": "string"
            }
          ],
          "version": "1.0"
        }
    }
}
```
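
To make the schema concrete, here is a minimal sketch of a single record that carries every field listed above, plus a small check that the types line up. The helper names `make_event` and `validate_event`, the `"click"` event type, the `"1"` event value, and the use of Unix epoch seconds for `timestamp` are assumptions made for illustration; they are not taken from the Personalens project.

```python
import time

# Field names and types copied from the schema above ("long" maps to int in Python).
SCHEMA_FIELDS = {
    "user_id": str,
    "session_id": str,
    "timestamp": int,
    "event_type": str,
    "item_id": str,
    "event_value": str,
}


def make_event(user_id, session_id, item_id,
               event_type="click", event_value="1", timestamp=None):
    """Build one interaction record with every field the schema lists.

    The default event_type and event_value are illustrative assumptions.
    """
    return {
        "user_id": str(user_id),
        "session_id": str(session_id),
        # Assumption: timestamps are Unix epoch seconds.
        "timestamp": int(time.time()) if timestamp is None else int(timestamp),
        "event_type": event_type,
        "item_id": str(item_id),
        "event_value": str(event_value),
    }


def validate_event(event):
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    for name, expected in SCHEMA_FIELDS.items():
        if name not in event:
            problems.append("missing field: " + name)
        elif not isinstance(event[name], expected):
            problems.append("wrong type for " + name)
    return problems


example = make_event(user_id=1, session_id=1, item_id=42, timestamp=1550266814)
print(example)
print(validate_event(example))
```

Note that the IDs are strings in the schema, so numeric primary keys need to be converted before they are sent.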

## Patching Up Users

When the users were first saved, no session ID was specified for them. The schema above requires one, so we generate and save it now. For simplicity we set it equal to each user's primary key; in a production system the session ID would change from visit to visit and could signal intent.

In [9]:
users = User.objects.all()
for user in users:
    user.session = user.user_id
    user.save()
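
For comparison, a production system would more likely issue a new session ID per visit. The following is a minimal sketch of that idea, kept separate from the Django models; `new_session_id` and `VisitSessions` are hypothetical names introduced here, and the in-memory dictionary stands in for whatever session store the application actually uses.

```python
import uuid


def new_session_id():
    """Return a fresh random identifier for one visit (hypothetical helper)."""
    return str(uuid.uuid4())


class VisitSessions:
    """Keep one session ID per user until the visit is explicitly ended."""

    def __init__(self):
        self._active = {}

    def session_for(self, user_id):
        # Reuse the current visit's ID, or start a new visit if none is active.
        if user_id not in self._active:
            self._active[user_id] = new_session_id()
        return self._active[user_id]

    def end_visit(self, user_id):
        # Forget the ID so the next call to session_for starts a new session.
        self._active.pop(user_id, None)


sessions = VisitSessions()
first_visit = sessions.session_for(1)
sessions.end_visit(1)
second_visit = sessions.session_for(1)
print(first_visit != second_visit)
```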

## Creating Seed Data

Now that the dataset group can receive events, you need to supply a few initial interactions to get the dataset ready for model training. Below, a collection of users will "click" on random movies, and those events are streamed into the dataset.

Before continuing, take the `trackingId` from above and, using Cloud9, save it to the settings.py file in the personalens subfolder.
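
The implementation of `stream_event` lives in the application and is not shown in this notebook. As a rough illustration of why it needs the tracking ID, the sketch below assembles the kind of request a click event might be sent with. The function `build_put_events_kwargs` is hypothetical, the argument layout is modeled on the generally available PutEvents request and may differ from the preview service model downloaded earlier, and the keys inside `properties` are an assumption.

```python
import datetime
import json
import uuid


def build_put_events_kwargs(tracking_id, user_id, session_id, item_id,
                            event_type="click", event_value="1"):
    """Assemble keyword arguments shaped like a PutEvents request.

    Assumption: this layout follows the generally available API, not
    necessarily the preview model or the project's own stream_event.
    """
    return {
        "trackingId": tracking_id,
        "userId": str(user_id),
        "sessionId": str(session_id),
        "eventList": [
            {
                "eventId": str(uuid.uuid4()),
                "eventType": event_type,
                # properties is a JSON string; these key names are assumed.
                "properties": json.dumps(
                    {"itemId": str(item_id), "eventValue": str(event_value)}
                ),
                "sentAt": datetime.datetime.now(datetime.timezone.utc),
            }
        ],
    }


kwargs = build_put_events_kwargs(
    tracking_id="4a7e200e-9de4-46ee-acb0-4494371d5670",
    user_id=1,
    session_id=1,
    item_id=42,
)
print(json.dumps(kwargs, indent=2, default=str))
```

Building the arguments separately from the network call makes the payload easy to inspect before any event is sent.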

In [None]:
import django
django.setup()

from movielens.models import User
from movielens.models import Item
from movielens.models import UserData

from movielens.views import stream_event

In [24]:
# First let us define the first 10 users that we are going to generate data for

users = User.objects.all()[:10]

# Now for each user we want them to click a collection of 100 random movies

for user in users:
    for x in range(100):
        movie = Item.objects.order_by('?').first()
        stream_event(user=user, movie=movie)


   

Clickstream Status:  True
Streaming event:  1 Robocop 3 (1993)
Clickstream Status:  True
Streaming event:  1 Next Step, The (1995)
Clickstream Status:  True
Streaming event:  1 Davy Crockett, King of the Wild Frontier (1955)
Clickstream Status:  True
Streaming event:  1 Dead Poets Society (1989)
Clickstream Status:  True
Streaming event:  1 NeverEnding Story III, The (1994)
Clickstream Status:  True
Streaming event:  1 Courage Under Fire (1996)
Clickstream Status:  True
Streaming event:  1 Maverick (1994)
Clickstream Status:  True
Streaming event:  1 Brady Bunch Movie, The (1995)
Clickstream Status:  True
Streaming event:  1 Blink (1994)
Clickstream Status:  True
Streaming event:  1 Phat Beach (1996)
Clickstream Status:  True
Streaming event:  1 Gold Diggers: The Secret of Bear Mountain (1995)
Clickstream Status:  True
Streaming event:  1 Red Corner (1997)
Clickstream Status:  True
Streaming event:  1 Murder, My Sweet (1944)
Clickstream Status:  True
Streaming event:  1 Fox and the Hou

Clickstream Status:  True
Streaming event:  2 Being Human (1993)
Clickstream Status:  True
Streaming event:  2 White Balloon, The (1995)
Clickstream Status:  True
Streaming event:  2 Cyclo (1995)
Clickstream Status:  True
Streaming event:  2 Man Who Would Be King, The (1975)
Clickstream Status:  True
Streaming event:  2 Manhattan (1979)
Clickstream Status:  True
Streaming event:  2 Carpool (1996)
Clickstream Status:  True
Streaming event:  2 Escape from New York (1981)
Clickstream Status:  True
Streaming event:  2 Life with Mikey (1993)
Clickstream Status:  True
Streaming event:  2 Night on Earth (1991)
Clickstream Status:  True
Streaming event:  2 Legal Deceit (1997)
Clickstream Status:  True
Streaming event:  2 G.I. Jane (1997)
Clickstream Status:  True
Streaming event:  2 Caro Diario (Dear Diary) (1994)
Clickstream Status:  True
Streaming event:  2 Meet John Doe (1941)
Clickstream Status:  True
Streaming event:  2 Blood & Wine (1997)
Clickstream Status:  True
Streaming event:  2 Ami

Clickstream Status:  True
Streaming event:  3 Shadow Conspiracy (1997)
Clickstream Status:  True
Streaming event:  3 Kaspar Hauser (1993)
Clickstream Status:  True
Streaming event:  3 Ace Ventura: When Nature Calls (1995)
Clickstream Status:  True
Streaming event:  3 Renaissance Man (1994)
Clickstream Status:  True
Streaming event:  3 North (1994)
Clickstream Status:  True
Streaming event:  3 Sixth Man, The (1997)
Clickstream Status:  True
Streaming event:  3 Great Day in Harlem, A (1994)
Clickstream Status:  True
Streaming event:  3 Addicted to Love (1997)
Clickstream Status:  True
Streaming event:  3 Wonderful, Horrible Life of Leni Riefenstahl, The (1993)
Clickstream Status:  True
Streaming event:  3 Swan Princess, The (1994)
Clickstream Status:  True
Streaming event:  3 Pallbearer, The (1996)
Clickstream Status:  True
Streaming event:  3 Hunted, The (1995)
Clickstream Status:  True
Streaming event:  3 Woman in Question, The (1950)
Clickstream Status:  True
Streaming event:  3 Quiz 

Clickstream Status:  True
Streaming event:  4 Savage Nights (Nuits fauves, Les) (1992)
Clickstream Status:  True
Streaming event:  4 Ruling Class, The (1972)
Clickstream Status:  True
Streaming event:  4 Three Wishes (1995)
Clickstream Status:  True
Streaming event:  4 Kaspar Hauser (1993)
Clickstream Status:  True
Streaming event:  4 Heavy Metal (1981)
Clickstream Status:  True
Streaming event:  4 Tank Girl (1995)
Clickstream Status:  True
Streaming event:  4 Vertigo (1958)
Clickstream Status:  True
Streaming event:  4 Picture Perfect (1997)
Clickstream Status:  True
Streaming event:  4 Boys of St. Vincent, The (1993)
Clickstream Status:  True
Streaming event:  4 Nico Icon (1995)
Clickstream Status:  True
Streaming event:  4 Tigrero: A Film That Was Never Made (1994)
Clickstream Status:  True
Streaming event:  4 Jackal, The (1997)
Clickstream Status:  True
Streaming event:  4 Truth or Consequences, N.M. (1997)
Clickstream Status:  True
Streaming event:  4 2 Days in the Valley (1996)
C

Clickstream Status:  True
Streaming event:  6 Priest (1994)
Clickstream Status:  True
Streaming event:  6 Three Musketeers, The (1993)
Clickstream Status:  True
Streaming event:  6 Ed's Next Move (1996)
Clickstream Status:  True
Streaming event:  6 Boxing Helena (1993)
Clickstream Status:  True
Streaming event:  6 Coneheads (1993)
Clickstream Status:  True
Streaming event:  6 Foxfire (1996)
Clickstream Status:  True
Streaming event:  6 Wooden Man's Bride, The (Wu Kui) (1994)
Clickstream Status:  True
Streaming event:  6 Graduate, The (1967)
Clickstream Status:  True
Streaming event:  6 Funeral, The (1996)
Clickstream Status:  True
Streaming event:  6 Schizopolis (1996)
Clickstream Status:  True
Streaming event:  6 Shooter, The (1995)
Clickstream Status:  True
Streaming event:  6 I Can't Sleep (J'ai pas sommeil) (1994)
Clickstream Status:  True
Streaming event:  6 Family Thing, A (1996)
Clickstream Status:  True
Streaming event:  6 Once Upon a Time... When We Were Colored (1995)
Clickst

Clickstream Status:  True
Streaming event:  7 House Party 3 (1994)
Clickstream Status:  True
Streaming event:  7 Theodore Rex (1995)
Clickstream Status:  True
Streaming event:  7 Price Above Rubies, A (1998)
Clickstream Status:  True
Streaming event:  7 Inspector General, The (1949)
Clickstream Status:  True
Streaming event:  7 Dangerous Beauty (1998)
Clickstream Status:  True
Streaming event:  7 Ill Gotten Gains (1997)
Clickstream Status:  True
Streaming event:  7 Story of Xinghua, The (1993)
Clickstream Status:  True
Streaming event:  7 Shall We Dance? (1996)
Clickstream Status:  True
Streaming event:  7 Dumb & Dumber (1994)
Clickstream Status:  True
Streaming event:  7 Fly Away Home (1996)
Clickstream Status:  True
Streaming event:  7 Eat Drink Man Woman (1994)
Clickstream Status:  True
Streaming event:  7 Foreign Correspondent (1940)
Clickstream Status:  True
Streaming event:  7 Mrs. Winterbourne (1996)
Clickstream Status:  True
Streaming event:  7 English Patient, The (1996)
Click

Clickstream Status:  True
Streaming event:  8 Maya Lin: A Strong Clear Vision (1994)
Clickstream Status:  True
Streaming event:  8 Poetic Justice (1993)
Clickstream Status:  True
Streaming event:  8 Wolf (1994)
Clickstream Status:  True
Streaming event:  8 Thin Man, The (1934)
Clickstream Status:  True
Streaming event:  8 Joy Luck Club, The (1993)
Clickstream Status:  True
Streaming event:  8 Ghost and Mrs. Muir, The (1947)
Clickstream Status:  True
Streaming event:  8 Angus (1995)
Clickstream Status:  True
Streaming event:  8 Devil in a Blue Dress (1995)
Clickstream Status:  True
Streaming event:  8 Butcher Boy, The (1998)
Clickstream Status:  True
Streaming event:  8 Shallow Grave (1994)
Clickstream Status:  True
Streaming event:  8 Antonia's Line (1995)
Clickstream Status:  True
Streaming event:  8 With Honors (1994)
Clickstream Status:  True
Streaming event:  8 Sophie's Choice (1982)
Clickstream Status:  True
Streaming event:  8 I Like It Like That (1994)
Clickstream Status:  True


Clickstream Status:  True
Streaming event:  9 Mad City (1997)
Clickstream Status:  True
Streaming event:  9 Anastasia (1997)
Clickstream Status:  True
Streaming event:  9 Apple Dumpling Gang, The (1975)
Clickstream Status:  True
Streaming event:  9 Fried Green Tomatoes (1991)
Clickstream Status:  True
Streaming event:  9 Madness of King George, The (1994)
Clickstream Status:  True
Streaming event:  9 Women, The (1939)
Clickstream Status:  True
Streaming event:  9 Back to the Future (1985)
Clickstream Status:  True
Streaming event:  9 Blue Angel, The (Blaue Engel, Der) (1930)
Clickstream Status:  True
Streaming event:  9 Chairman of the Board (1998)
Clickstream Status:  True
Streaming event:  9 Judge Dredd (1995)
Clickstream Status:  True
Streaming event:  9 Anastasia (1997)
Clickstream Status:  True
Streaming event:  9 Sleepover (1995)
Clickstream Status:  True
Streaming event:  9 Getting Away With Murder (1996)
Clickstream Status:  True
Streaming event:  9 Love Jones (1997)
Clickstrea

## Generating the Solution

Just like in your original setup notebook, the following cells can take a while to run. Be patient; they should complete in 30-40 minutes.

In [26]:
recipe_list = [
    "arn:aws:personalize:::recipe/awspersonalizehrnnmodel",
    "arn:aws:personalize:::recipe/awspersonalizedeepfmmodel",
    "arn:aws:personalize:::recipe/awspersonalizesimsmodel",
    "arn:aws:personalize:::recipe/awspersonalizeffnnmodel",
    "arn:aws:personalize:::recipe/popularity-baseline"
]

recipe_arn = recipe_list[0]

create_solution_response = personalize.create_solution(
    name = "Dj-movielens-clickst",
    datasetGroupArn = datasetGroupArn,
    recipeArn = recipe_arn,
    minProvisionedTPS = 1
)

solution_arn = create_solution_response['solutionArn']
print(json.dumps(create_solution_response, indent=2))

{
  "solutionArn": "arn:aws:personalize:us-east-1:059124553121:solution/Dj-movielens-clickst",
  "ResponseMetadata": {
    "RequestId": "ee1da46c-4882-4089-9959-f99d4605d260",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Fri, 15 Feb 2019 20:17:58 GMT",
      "x-amzn-requestid": "ee1da46c-4882-4089-9959-f99d4605d260",
      "content-length": "90",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


In [27]:
status = None
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_solution_response = personalize.describe_solution(
        solutionArn = solution_arn
    )
    status = describe_solution_response["solution"]["status"]
    print("Solution: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

Solution: CREATE PENDING
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE IN_PROGRESS
Solution: CREATE I

## Create and Wait for Campaign

In [29]:
create_campaign_response = personalize.create_campaign(
    name = "Dj-click-campaign",
    solutionArn = solution_arn,
    updateMode = "MANUAL"
)

campaign_arn = create_campaign_response['campaignArn']
print(json.dumps(create_campaign_response, indent=2))

{
  "campaignArn": "arn:aws:personalize:us-east-1:059124553121:campaign/Dj-click-campaign",
  "ResponseMetadata": {
    "RequestId": "6f3941cf-9693-4d31-b87b-e1c074eed6eb",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Fri, 15 Feb 2019 21:24:23 GMT",
      "x-amzn-requestid": "6f3941cf-9693-4d31-b87b-e1c074eed6eb",
      "content-length": "87",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


In [30]:


status = None
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_campaign_response = personalize.describe_campaign(
        campaignArn = campaign_arn
    )
    status = describe_campaign_response["campaign"]["status"]
    print("Campaign: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

Campaign: CREATE PENDING
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: ACTIVE
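
The solution and campaign waits above follow the same polling pattern, so they could share one helper. This is a minimal sketch under that assumption; `wait_for_status` is a hypothetical name, and the status strings and timings are copied from the loops above. The `sleep` parameter exists only so the demonstration can run without real delays.

```python
import time


def wait_for_status(describe, terminal=("ACTIVE", "CREATE FAILED"),
                    interval=60, timeout=3 * 60 * 60, sleep=time.sleep):
    """Call describe() until it returns a terminal status or the timeout passes.

    describe is any zero-argument callable that returns the current status
    string, for example:
        lambda: personalize.describe_campaign(
            campaignArn=campaign_arn)["campaign"]["status"]
    """
    deadline = time.time() + timeout
    status = None
    while time.time() < deadline:
        status = describe()
        print("Status: {}".format(status))
        if status in terminal:
            break
        sleep(interval)
    return status


# Demonstration with canned statuses instead of a real AWS call.
fake_statuses = iter(["CREATE PENDING", "CREATE IN_PROGRESS", "ACTIVE"])
final = wait_for_status(lambda: next(fake_statuses), sleep=lambda seconds: None)
print(final)
```

The helper returns the last status it saw, so the caller can tell a failure or a timeout apart from a successful creation.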


## Exporting to the Application

Now that you have a new campaign ARN, update it in your settings.py file as well. After updating, refresh the page and the application will serve recommendations from the new solution.