# Finding similar items + HPO

This notebook follows a structure similar to the temporal holdout notebook, but uses aws-sims instead of HRNN. We look at the similar items it finds and check whether hyperparameter optimization (HPO) improves the results.

In [2]:
import boto3, os
import json
import numpy as np
import pandas as pd
import time
from botocore.exceptions import ClientError

In [3]:
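# random 5-digit suffix; slicing str(np.random.uniform()) can yield fewer or non-digit characters
suffix = "{:05d}".format(int(np.random.randint(100000)))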

In [4]:
bucket = "demo-sims-" + suffix  # replace with the name of your S3 bucket
filename = "DEMO-sims.csv"

In [5]:
!aws s3 mb s3://{bucket}

make_bucket: demo-sims-41593


In [3]:
personalize = boto3.client(service_name='personalize')
personalize_runtime = boto3.client(service_name='personalize-runtime')

# Download and process the data

In [7]:
!curl -O http://files.grouplens.org/datasets/movielens/ml-1m.zip
!unzip -o ml-1m.zip

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 5778k  100 5778k    0     0  8243k      0 --:--:-- --:--:-- --:--:-- 8243k
Archive:  ml-1m.zip
  inflating: ml-1m/movies.dat        
  inflating: ml-1m/ratings.dat       
  inflating: ml-1m/README            
  inflating: ml-1m/users.dat         


In [8]:
data = pd.read_csv('./ml-1m/ratings.dat', sep='::', names=['USER_ID','ITEM_ID','RATING','TIMESTAMP'], engine='python')  # multi-character separators need the python engine
pd.set_option('display.max_rows', 5)
data


Unnamed: 0,USER_ID,ITEM_ID,RATING,TIMESTAMP
0,1,1193,5,978300760
1,1,661,3,978302109
...,...,...,...,...
1000207,6040,1096,4,956715648
1000208,6040,1097,4,956715569


In [9]:
# data = data[data['RATING'] > 3.6]  # Use all data to predict view recommendations
data = data[['USER_ID', 'ITEM_ID', 'TIMESTAMP']] # select columns that match the columns in the schema below
print('unique users %d; unique items %d'%(
    len(data['USER_ID'].unique()), len(data['ITEM_ID'].unique())))

unique users 6040; unique items 3706


## Upload the data

In [10]:
data.to_csv(filename, index=False)
boto3.Session().resource('s3').Bucket(bucket).Object(filename).upload_file(filename)

USER_ID,ITEM_ID,TIMESTAMP
1,1193,978300760
1,661,978302109
1,914,978301968
1,3408,978300275
1,2355,978824291
1,1197,978302268
1,1287,978302039
1,2804,978300719
1,594,978302268


# Create a schema

In [11]:
schema = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {
            "name": "USER_ID",
            "type": "string"
        },
        {
            "name": "ITEM_ID",
            "type": "string"
        },
        {
            "name": "TIMESTAMP",
            "type": "long"
        }
    ],
    "version": "1.0"
}

create_schema_response = personalize.create_schema(
    name = "DEMO-sims-schema-"+suffix,
    schema = json.dumps(schema)
)

schema_arn = create_schema_response['schemaArn']
print(json.dumps(create_schema_response, indent=2))

{
  "schemaArn": "arn:aws:personalize:us-east-1:261294318658:schema/DEMO-sims-schema-41593",
  "ResponseMetadata": {
    "RequestId": "af9a8275-23d6-445a-afa6-98bad364a812",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sat, 01 Jun 2019 02:42:21 GMT",
      "x-amzn-requestid": "af9a8275-23d6-445a-afa6-98bad364a812",
      "content-length": "88",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}
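
The CSV written earlier is expected to carry the same column names as the fields declared in the schema. A quick local check before uploading can catch a mismatch early. The following is a minimal sketch: `check_header` is a hypothetical helper (not part of boto3), and it compares field names only, not types.

```python
import csv
import io

def check_header(csv_text, schema):
    """Return (missing, extra) column names, comparing a CSV header to the schema fields."""
    header = next(csv.reader(io.StringIO(csv_text)))
    expected = [field["name"] for field in schema["fields"]]
    missing = [name for name in expected if name not in header]
    extra = [name for name in header if name not in expected]
    return missing, extra

# Same schema as in the cell above, repeated so the sketch is self-contained.
schema = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "USER_ID", "type": "string"},
        {"name": "ITEM_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},
    ],
    "version": "1.0",
}

# First rows of the uploaded file, as shown earlier in the notebook.
sample = "USER_ID,ITEM_ID,TIMESTAMP\n1,1193,978300760\n1,661,978302109\n"
missing, extra = check_header(sample, schema)
print("missing:", missing, "extra:", extra)
```

Both lists are empty when the header and the schema agree; a leftover `RATING` column, for instance, would show up under `extra`.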


## Datasets and dataset groups

### Create a dataset group

In [12]:
create_dataset_group_response = personalize.create_dataset_group(
    name = "DEMO-sims-dataset-group-"+suffix
)

dataset_group_arn = create_dataset_group_response['datasetGroupArn']
print(json.dumps(create_dataset_group_response, indent=2))

{
  "datasetGroupArn": "arn:aws:personalize:us-east-1:261294318658:dataset-group/DEMO-sims-dataset-group-41593",
  "ResponseMetadata": {
    "RequestId": "1bb461dc-0cf8-42e5-b9aa-31e71743e548",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sat, 01 Jun 2019 02:42:27 GMT",
      "x-amzn-requestid": "1bb461dc-0cf8-42e5-b9aa-31e71743e548",
      "content-length": "108",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


In [13]:
status = None
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_dataset_group_response = personalize.describe_dataset_group(
        datasetGroupArn = dataset_group_arn
    )
    status = describe_dataset_group_response["datasetGroup"]["status"]
    print("DatasetGroup: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(20)

DatasetGroup: ACTIVE
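
The same wait loop recurs throughout this notebook for dataset groups, import jobs, solution versions and campaigns. One way to factor it out is sketched below. `wait_for_status` is a hypothetical helper introduced here for illustration; it takes any zero-argument function that returns the current status string, so it can be exercised with a stand-in instead of a real AWS call.

```python
import time

def wait_for_status(get_status, label, poll_seconds=20, timeout_seconds=3 * 60 * 60,
                    terminal=("ACTIVE", "CREATE FAILED"), sleep=time.sleep):
    """Poll get_status() until it returns a terminal status or the timeout expires."""
    deadline = time.time() + timeout_seconds
    status = None
    while time.time() < deadline:
        status = get_status()
        print("{}: {}".format(label, status))
        if status in terminal:
            break
        sleep(poll_seconds)
    return status

# Stand-in for a describe_* call so the sketch runs without AWS credentials.
fake_statuses = iter(["CREATE PENDING", "CREATE IN_PROGRESS", "ACTIVE"])
final = wait_for_status(lambda: next(fake_statuses), "DatasetGroup",
                        poll_seconds=0, sleep=lambda seconds: None)
```

With the real client, the first argument would be something like `lambda: personalize.describe_dataset_group(datasetGroupArn=dataset_group_arn)["datasetGroup"]["status"]`.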


### Create a dataset of type "Interactions"

In [14]:
dataset_type = "INTERACTIONS"
create_dataset_response = personalize.create_dataset(
    datasetType = dataset_type,
    datasetGroupArn = dataset_group_arn,
    schemaArn = schema_arn,
    name = "DEMO-sims-dataset-"+suffix
)

dataset_arn = create_dataset_response['datasetArn']
print(json.dumps(create_dataset_response, indent=2))

{
  "datasetArn": "arn:aws:personalize:us-east-1:261294318658:dataset/DEMO-sims-dataset-group-41593/INTERACTIONS",
  "ResponseMetadata": {
    "RequestId": "e4b79afb-d339-42af-a404-4df350c3d23f",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sat, 01 Jun 2019 02:42:36 GMT",
      "x-amzn-requestid": "e4b79afb-d339-42af-a404-4df350c3d23f",
      "content-length": "110",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


## S3 bucket permissions for Personalize access

### Attach a policy to the S3 bucket

In [15]:
s3 = boto3.client("s3")

policy = {
    "Version": "2012-10-17",
    "Id": "PersonalizeS3BucketAccessPolicy",
    "Statement": [
        {
            "Sid": "PersonalizeS3BucketAccessPolicy",
            "Effect": "Allow",
            "Principal": {
                "Service": "personalize.amazonaws.com"
            },
            "Action": [
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::{}".format(bucket),
                "arn:aws:s3:::{}/*".format(bucket)
            ]
        }
    ]
}

s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy));

### Create an S3 read-only access role

In [17]:
iam = boto3.client("iam")

role_name = "PersonalizeS3Role-"+suffix
assume_role_policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Service": "personalize.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
        }
    ]
}
try:
    create_role_response = iam.create_role(
        RoleName = role_name,
        AssumeRolePolicyDocument = json.dumps(assume_role_policy_document)
    )

    iam.attach_role_policy(
        RoleName = role_name,
        PolicyArn = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
    )

    role_arn = create_role_response["Role"]["Arn"]
except ClientError as e:
    # reuse the role if it already exists from a previous run
    if e.response['Error']['Code'] == 'EntityAlreadyExists':
        role_arn = iam.get_role(RoleName=role_name)['Role']['Arn']
    else:
        raise
# a newly created role can take a little while to become usable
time.sleep(45)
print(role_arn)

arn:aws:iam::261294318658:role/PersonalizeS3Role-41593


## Dataset import jobs

In [19]:
create_dataset_import_job_response = personalize.create_dataset_import_job(
    jobName = "DEMO-sims-dataset-import-job-"+suffix,
    datasetArn = dataset_arn,
    dataSource = {
        "dataLocation": "s3://{}/{}".format(bucket, filename)
    },
    roleArn = role_arn
)

dataset_import_job_arn = create_dataset_import_job_response['datasetImportJobArn']
print(json.dumps(create_dataset_import_job_response, indent=2))

{
  "datasetImportJobArn": "arn:aws:personalize:us-east-1:261294318658:dataset-import-job/DEMO-sims-dataset-import-job-41593",
  "ResponseMetadata": {
    "RequestId": "a8cd2d04-4b91-4205-8df9-b715c52e5a2e",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sat, 01 Jun 2019 02:42:58 GMT",
      "x-amzn-requestid": "a8cd2d04-4b91-4205-8df9-b715c52e5a2e",
      "content-length": "122",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


### Wait for the dataset import job and the dataset import job run to reach ACTIVE status

In [20]:
status = None
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_dataset_import_job_response = personalize.describe_dataset_import_job(
        datasetImportJobArn = dataset_import_job_arn
    )
    
    dataset_import_job = describe_dataset_import_job_response["datasetImportJob"]
    if "latestDatasetImportJobRun" not in dataset_import_job:
        status = dataset_import_job["status"]
        print("DatasetImportJob: {}".format(status))
    else:
        status = dataset_import_job["latestDatasetImportJobRun"]["status"]
        print("LatestDatasetImportJobRun: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: CREATE IN_PROGRESS
DatasetImportJob: ACTIVE


# Create a solution

In [4]:
recipe_list = personalize.list_recipes()
for recipe in recipe_list['recipes']:
    print(recipe['recipeArn'])

arn:aws:personalize:::recipe/aws-hrnn
arn:aws:personalize:::recipe/aws-hrnn-coldstart
arn:aws:personalize:::recipe/aws-hrnn-metadata
arn:aws:personalize:::recipe/aws-personalized-ranking
arn:aws:personalize:::recipe/aws-popularity-count
arn:aws:personalize:::recipe/aws-sims
arn:aws:personalize:::recipe/aws-user-personalization


There are many recipes for different scenarios. In this example we only have interaction data, so we will choose one of the basic recipes.

| Feasible? | Recipe | Description 
|-------- | -------- |:------------
| Y | aws-popularity-count | Computes item popularity from the number of events for that item in the user-item interactions dataset.
| Y | aws-hrnn | Predicts the items a user will interact with. A hierarchical recurrent neural network that can model the temporal order of user-item interactions.
| N (requires metadata) | aws-hrnn-metadata | Predicts the items a user will interact with. HRNN with additional features derived from contextual metadata (user-item interactions metadata), user metadata (users dataset) and item metadata (items dataset).
| N (for bandits; requires metadata) | aws-hrnn-coldstart | Predicts the items a user will interact with. HRNN-metadata with personalized exploration of new items.
| N (for item-based queries) | aws-sims | Computes items similar to a given item based on co-occurrence of the item in the same user's history in the user-item interactions dataset.
| N (for re-ranking a short list) | aws-personalized-ranking | Re-ranks a list of items for a user. Trained on the user-item interactions dataset.


We (or AutoML) can run all of these basic recipes and pick the best-performing model based on the internal metrics. We recommend making comparisons, especially against the popularity baseline, to see how much the metrics improve with personalization. In this demo, however, we pick a single recipe, aws-sims, to illustrate the sanity checks.
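
To build intuition for what an item-to-item recipe computes from co-occurrence, the sketch below scores item pairs by how many users interacted with both, normalized by each item's popularity (cosine similarity over user sets). This is an illustrative assumption, not the actual aws-sims implementation, whose internals are not described in this notebook. The toy interactions and the `similar_items` helper are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def similar_items(interactions, item_id, k=3):
    """Rank other items by cosine similarity between their user sets and item_id's user set."""
    users_by_item = defaultdict(set)
    for user, item in interactions:
        users_by_item[item].add(user)

    target_users = users_by_item[item_id]
    scores = {}
    for other, users in users_by_item.items():
        if other == item_id:
            continue
        overlap = len(target_users & users)
        if overlap:
            scores[other] = overlap / sqrt(len(target_users) * len(users))
    # Highest score first; ties are broken by item id for a stable order.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))[:k]

# Toy (user, item) pairs standing in for the USER_ID / ITEM_ID columns.
interactions = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "A"), ("u3", "C"),
    ("u4", "D"),
]
for item, score in similar_items(interactions, "A"):
    print(item, round(score, 3))
```

Item `D` never shares a user with `A`, so it does not appear in the ranking at all.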

In [23]:
recipe_arn = "arn:aws:personalize:::recipe/aws-sims"

In [24]:
create_solution_response = personalize.create_solution(
    name = "DEMO-sims-solution-"+suffix,
    datasetGroupArn = dataset_group_arn,
    recipeArn = recipe_arn,
)

solution_arn = create_solution_response['solutionArn']
print(json.dumps(create_solution_response, indent=2))

{
  "solutionArn": "arn:aws:personalize:us-east-1:261294318658:solution/DEMO-sims-solution-41593",
  "ResponseMetadata": {
    "RequestId": "86338585-ef1f-403d-927a-0bf07c4e50a4",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sat, 01 Jun 2019 03:12:27 GMT",
      "x-amzn-requestid": "86338585-ef1f-403d-927a-0bf07c4e50a4",
      "content-length": "94",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


In [25]:
create_solution_version_response = personalize.create_solution_version(
    solutionArn = solution_arn
)

solution_version_arn = create_solution_version_response['solutionVersionArn']
print(json.dumps(create_solution_version_response, indent=2))

{
  "solutionVersionArn": "arn:aws:personalize:us-east-1:261294318658:solution/DEMO-sims-solution-41593/5548c91f",
  "ResponseMetadata": {
    "RequestId": "be363337-237b-4056-9a23-8a0cb0df3a94",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sat, 01 Jun 2019 03:12:29 GMT",
      "x-amzn-requestid": "be363337-237b-4056-9a23-8a0cb0df3a94",
      "content-length": "110",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


### Wait for the solution version to reach ACTIVE status

In [26]:
status = None
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_solution_version_response = personalize.describe_solution_version(
        solutionVersionArn = solution_version_arn
    )
    status = describe_solution_version_response["solutionVersion"]["status"]
    print("SolutionVersion: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_P

### Get the solution metrics

In [28]:
get_metrics_response = personalize.get_solution_metrics(
    solutionVersionArn = solution_version_arn
)

print(json.dumps(get_metrics_response, indent=2))

{
  "solutionVersionArn": "arn:aws:personalize:us-east-1:261294318658:solution/DEMO-sims-solution-41593/5548c91f",
  "metrics": {
    "coverage": 0.4893,
    "mean_reciprocal_rank_at_25": 0.0037,
    "normalized_discounted_cumulative_gain_at_10": 0.0051,
    "normalized_discounted_cumulative_gain_at_25": 0.0107,
    "normalized_discounted_cumulative_gain_at_5": 0.0034,
    "precision_at_10": 0.001,
    "precision_at_25": 0.0013,
    "precision_at_5": 0.001
  },
  "ResponseMetadata": {
    "RequestId": "20edf0f0-6ea2-41e1-b63e-30a9c5468960",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sat, 01 Jun 2019 04:05:36 GMT",
      "x-amzn-requestid": "20edf0f0-6ea2-41e1-b63e-30a9c5468960",
      "content-length": "406",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}
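
The metric names in this response follow standard ranking definitions. As a rough illustration, the sketch below computes precision, reciprocal rank and NDCG at k for a single ranked list with binary relevance. The exact evaluation protocol behind the numbers above (for example, how held-out interactions are chosen and how per-user values are averaged) is not shown in this notebook, so the toy data and function names here are assumptions.

```python
from math import log2

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k

def reciprocal_rank_at_k(ranked, relevant, k):
    """1 / position of the first relevant item in the top k, or 0 if there is none."""
    for position, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0

def ndcg_at_k(ranked, relevant, k):
    """Discounted gain of the top-k list divided by the gain of an ideal ordering."""
    dcg = sum(1.0 / log2(position + 1)
              for position, item in enumerate(ranked[:k], start=1)
              if item in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / log2(position + 1) for position in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0

ranked = ["a", "b", "c", "d", "e"]   # hypothetical recommendations, best first
relevant = {"c", "e"}                # hypothetical held-out items
print(precision_at_k(ranked, relevant, 5))
print(reciprocal_rank_at_k(ranked, relevant, 5))
print(ndcg_at_k(ranked, relevant, 5))
```

Averaging the reciprocal rank over many lists gives a mean reciprocal rank, which is how a name such as `mean_reciprocal_rank_at_25` is usually read.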


# Create a campaign and wait for it

In [29]:
create_campaign_response = personalize.create_campaign(
    name = "DEMO-sims-campaign-"+suffix,
    solutionVersionArn = solution_version_arn,
    minProvisionedTPS = 2,
)

campaign_arn = create_campaign_response['campaignArn']
print(json.dumps(create_campaign_response, indent=2))

{
  "campaignArn": "arn:aws:personalize:us-east-1:261294318658:campaign/DEMO-sims-campaign-41593",
  "ResponseMetadata": {
    "RequestId": "b1b79a7c-196d-4ca7-8c6c-8e029f7d266a",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sat, 01 Jun 2019 04:05:50 GMT",
      "x-amzn-requestid": "b1b79a7c-196d-4ca7-8c6c-8e029f7d266a",
      "content-length": "94",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


### Wait for the campaign to reach ACTIVE status

In [30]:
status = None
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_campaign_response = personalize.describe_campaign(
        campaignArn = campaign_arn
    )
    status = describe_campaign_response["campaign"]["status"]
    print("Campaign: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

Campaign: CREATE PENDING
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: ACTIVE


## To ease interpretation, let's look at some items

In [33]:
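# multi-character separators need the python engine; latin-1 handles the accented characters in movie titles
movies = pd.read_csv('./ml-1m/movies.dat', sep='::', names=['ITEM_ID','title','genre'], engine='python', encoding='latin-1')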


In [35]:
movies=movies.set_index('ITEM_ID')

In [36]:
movies.head()

Unnamed: 0_level_0,title,genre
ITEM_ID,Unnamed: 1_level_1,Unnamed: 2_level_1
1,Toy Story (1995),Animation|Children's|Comedy
2,Jumanji (1995),Adventure|Children's|Fantasy
3,Grumpier Old Men (1995),Comedy|Romance
4,Waiting to Exhale (1995),Comedy|Drama
5,Father of the Bride Part II (1995),Comedy


### Pick a few items and check whether the items found generally belong to similar genres

Note that the model did not use this metadata (genre) for training; this is a sanity check to see whether the model has discovered similar items that "make sense".

In [50]:
rec_response = personalize_runtime.get_recommendations(
        campaignArn = campaign_arn,
        itemId = str(5)
    )
rec_items = [int(x['itemId']) for x in rec_response['itemList']]

In [51]:
movies.loc[rec_items[:5]]

Unnamed: 0_level_0,title,genre
ITEM_ID,Unnamed: 1_level_1,Unnamed: 2_level_1
243,Gordy (1995),Comedy
2350,Heart Condition (1990),Comedy
3313,Class Reunion (1982),Comedy
626,"Thin Line Between Love and Hate, A (1996)",Comedy
1822,Meet the Deedles (1998),Children's|Comedy


In [52]:
rec_response = personalize_runtime.get_recommendations(
        campaignArn = campaign_arn,
        itemId = str(2)
    )
rec_items = [int(x['itemId']) for x in rec_response['itemList']]

In [53]:
movies.loc[rec_items[:5]]

Unnamed: 0_level_0,title,genre
ITEM_ID,Unnamed: 1_level_1,Unnamed: 2_level_1
56,Kids of the Round Table (1995),Adventure|Children's|Fantasy
2079,Kidnapped (1960),Children's|Drama
1520,Commandments (1997),Romance
146,"Amazing Panda Adventure, The (1995)",Adventure|Children's
626,"Thin Line Between Love and Hate, A (1996)",Comedy
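
The sanity check above can be made a little more concrete by scoring the genre overlap between the query item and each recommendation. The sketch below uses the Jaccard overlap of the `|`-separated genre strings, with the genres copied from the two outputs above. The choice of Jaccard and the helper names are assumptions made for illustration.

```python
def genre_overlap(query_genres, rec_genres):
    """Jaccard overlap between two '|'-separated genre strings."""
    query, rec = set(query_genres.split("|")), set(rec_genres.split("|"))
    return len(query & rec) / len(query | rec)

def mean_overlap(query_genres, recs):
    """Average genre overlap between a query item and its recommendations."""
    return sum(genre_overlap(query_genres, genres) for genres in recs) / len(recs)

# Genres of the top-5 recommendations, copied from the outputs above.
recs_for_item_5 = ["Comedy", "Comedy", "Comedy", "Comedy", "Children's|Comedy"]
recs_for_item_2 = ["Adventure|Children's|Fantasy", "Children's|Drama", "Romance",
                   "Adventure|Children's", "Comedy"]

# Item 5 is "Father of the Bride Part II" (Comedy); item 2 is "Jumanji".
print(round(mean_overlap("Comedy", recs_for_item_5), 2))                        # 0.9
print(round(mean_overlap("Adventure|Children's|Fantasy", recs_for_item_2), 2))  # 0.38
```

By this measure the recommendations for item 5 match its genre closely, while those for item 2 are more mixed.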


## Run HPO

We now check whether HPO improves the results.

In [57]:
create_solution_response = personalize.create_solution(
    name = "DEMO-sims-solution-hpo-"+suffix,
    datasetGroupArn = dataset_group_arn,
    recipeArn = recipe_arn,
    performHPO = True,
    solutionConfig={
        'hpoConfig': {
            'hpoResourceConfig': {
                  'maxNumberOfTrainingJobs': '40',
                  'maxParallelTrainingJobs': '10'
              }
        }
    }
)

solution_arn = create_solution_response['solutionArn']
print(json.dumps(create_solution_response, indent=2))

{
  "solutionArn": "arn:aws:personalize:us-east-1:261294318658:solution/DEMO-sims-solution-hpo-41593",
  "ResponseMetadata": {
    "RequestId": "7aebbfcc-83f7-497f-8d1a-e7ff4b1d5a38",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sat, 01 Jun 2019 04:54:32 GMT",
      "x-amzn-requestid": "7aebbfcc-83f7-497f-8d1a-e7ff4b1d5a38",
      "content-length": "98",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


In [58]:
create_solution_version_response = personalize.create_solution_version(
    solutionArn = solution_arn
)

solution_version_arn = create_solution_version_response['solutionVersionArn']
print(json.dumps(create_solution_version_response, indent=2))

{
  "solutionVersionArn": "arn:aws:personalize:us-east-1:261294318658:solution/DEMO-sims-solution-hpo-41593/18b522a8",
  "ResponseMetadata": {
    "RequestId": "37bd77a1-cf9c-4bb6-8e37-540c89c653f8",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sat, 01 Jun 2019 04:54:52 GMT",
      "x-amzn-requestid": "37bd77a1-cf9c-4bb6-8e37-540c89c653f8",
      "content-length": "114",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


In [59]:
status = None
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_solution_version_response = personalize.describe_solution_version(
        solutionVersionArn = solution_version_arn
    )
    status = describe_solution_version_response["solutionVersion"]["status"]
    print("SolutionVersion: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_P

In [60]:
get_metrics_response = personalize.get_solution_metrics(
    solutionVersionArn = solution_version_arn
)

print(json.dumps(get_metrics_response, indent=2))

{
  "solutionVersionArn": "arn:aws:personalize:us-east-1:261294318658:solution/DEMO-sims-solution-hpo-41593/18b522a8",
  "metrics": {
    "coverage": 0.4929,
    "mean_reciprocal_rank_at_25": 0.004,
    "normalized_discounted_cumulative_gain_at_10": 0.0057,
    "normalized_discounted_cumulative_gain_at_25": 0.0116,
    "normalized_discounted_cumulative_gain_at_5": 0.004,
    "precision_at_10": 0.001,
    "precision_at_25": 0.0014,
    "precision_at_5": 0.001
  },
  "ResponseMetadata": {
    "RequestId": "cd2365ff-eb40-4497-a822-041880435d4a",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sat, 01 Jun 2019 05:51:33 GMT",
      "x-amzn-requestid": "cd2365ff-eb40-4497-a822-041880435d4a",
      "content-length": "408",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}
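
To see what HPO changed, the two metric sets can be compared side by side. The sketch below copies the values from the two `get_solution_metrics` outputs in this notebook; `compare_metrics` is a hypothetical helper. Because the reported values are rounded to four decimals, the relative changes are coarse, and a single run says nothing about statistical significance.

```python
# Values copied from the two get_solution_metrics responses in this notebook.
baseline = {
    "coverage": 0.4893,
    "mean_reciprocal_rank_at_25": 0.0037,
    "normalized_discounted_cumulative_gain_at_10": 0.0051,
    "normalized_discounted_cumulative_gain_at_25": 0.0107,
    "normalized_discounted_cumulative_gain_at_5": 0.0034,
    "precision_at_10": 0.001,
    "precision_at_25": 0.0013,
    "precision_at_5": 0.001,
}
hpo = {
    "coverage": 0.4929,
    "mean_reciprocal_rank_at_25": 0.004,
    "normalized_discounted_cumulative_gain_at_10": 0.0057,
    "normalized_discounted_cumulative_gain_at_25": 0.0116,
    "normalized_discounted_cumulative_gain_at_5": 0.004,
    "precision_at_10": 0.001,
    "precision_at_25": 0.0014,
    "precision_at_5": 0.001,
}

def compare_metrics(before, after):
    """Return (name, before, after, absolute change, relative change) per metric."""
    rows = []
    for name in sorted(before):
        delta = after[name] - before[name]
        relative = delta / before[name] if before[name] else float("nan")
        rows.append((name, before[name], after[name], delta, relative))
    return rows

for name, old, new, delta, relative in compare_metrics(baseline, hpo):
    print("{:<45} {:.4f} -> {:.4f} ({:+.1%})".format(name, old, new, relative))
```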


In [61]:
movies.head()

Unnamed: 0_level_0,title,genre
ITEM_ID,Unnamed: 1_level_1,Unnamed: 2_level_1
1,Toy Story (1995),Animation|Children's|Comedy
2,Jumanji (1995),Adventure|Children's|Fantasy
3,Grumpier Old Men (1995),Comedy|Romance
4,Waiting to Exhale (1995),Comedy|Drama
5,Father of the Bride Part II (1995),Comedy


In [62]:
rec_response = personalize_runtime.get_recommendations(
        campaignArn = campaign_arn,
        itemId = str(5)
    )
rec_items = [int(x['itemId']) for x in rec_response['itemList']]
movies.loc[rec_items[:5]]

Unnamed: 0_level_0,title,genre
ITEM_ID,Unnamed: 1_level_1,Unnamed: 2_level_1
243,Gordy (1995),Comedy
2350,Heart Condition (1990),Comedy
3313,Class Reunion (1982),Comedy
626,"Thin Line Between Love and Hate, A (1996)",Comedy
1822,Meet the Deedles (1998),Children's|Comedy


In [63]:
rec_response = personalize_runtime.get_recommendations(
        campaignArn = campaign_arn,
        itemId = str(2)
    )
rec_items = [int(x['itemId']) for x in rec_response['itemList']]
movies.loc[rec_items[:5]]

Unnamed: 0_level_0,title,genre
ITEM_ID,Unnamed: 1_level_1,Unnamed: 2_level_1
56,Kids of the Round Table (1995),Adventure|Children's|Fantasy
2079,Kidnapped (1960),Children's|Drama
1520,Commandments (1997),Romance
146,"Amazing Panda Adventure, The (1995)",Adventure|Children's
626,"Thin Line Between Love and Hate, A (1996)",Comedy


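The HPO solution version scores slightly higher on the offline metrics (for example, coverage 0.4929 vs. 0.4893 and NDCG@10 0.0057 vs. 0.0051). The recommendations above, however, are identical to the earlier ones, because `campaign_arn` still serves the first solution version; the campaign has to be pointed at the HPO solution version before item-level similarity can be compared.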
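
One way to serve the HPO solution version from the existing campaign is the `update_campaign` call of the Personalize client. The sketch below wraps it in a hypothetical helper, `serve_solution_version`, and runs it against a stand-in client so that it executes without AWS credentials. The ARNs are the ones printed earlier in this notebook.

```python
def serve_solution_version(personalize_client, campaign_arn, solution_version_arn):
    """Point an existing campaign at another solution version and return the campaign ARN."""
    response = personalize_client.update_campaign(
        campaignArn=campaign_arn,
        solutionVersionArn=solution_version_arn,
    )
    return response["campaignArn"]

class FakePersonalize:
    """Stand-in for boto3.client('personalize'); records calls instead of contacting AWS."""
    def __init__(self):
        self.calls = []

    def update_campaign(self, **kwargs):
        self.calls.append(kwargs)
        return {"campaignArn": kwargs["campaignArn"]}

client = FakePersonalize()
updated_arn = serve_solution_version(
    client,
    "arn:aws:personalize:us-east-1:261294318658:campaign/DEMO-sims-campaign-41593",
    "arn:aws:personalize:us-east-1:261294318658:solution/DEMO-sims-solution-hpo-41593/18b522a8",
)
print(updated_arn)
```

With the real client, `personalize` would be passed in place of the stand-in. The update is asynchronous, so the campaign should be polled until the update has finished before the recommendations are queried again.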