# [Module 3.2] Checking Personalize Solution Evaluation Metrics

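This notebook builds on the dataset group and datasets created in Module 1 and covers the following tasks.

* Selecting a recipe (algorithm) and creating a solution
* Creating a solution version
* Getting the solution evaluation metrics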



In [1]:
# Imports
import boto3
import json
import numpy as np
import pandas as pd
import time


Next, confirm that your environment can communicate with Amazon Personalize successfully.

In [2]:
# Configure the SDK to Personalize:
personalize = boto3.client('personalize')

The code cell below loads the shared variables that were saved in the previous notebooks.

In [3]:
%store -r
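If an earlier notebook was skipped, `%store -r` restores nothing for it and the later cells fail with a `NameError`. The following is a minimal sketch of a pre-flight check, assuming the variable names used later in this notebook; the helper `missing_variables` is hypothetical and not part of IPython. The demo runs against a sample dictionary so that it works outside a notebook.

```python
# Variable names this notebook expects %store -r to have restored.
REQUIRED = [
    'popularity_solution_version_arn',
    'user_personalization_solution_version_arn',
    'hrnn_solution_version_arn',
    'hrnn_meta_solution_version_arn',
    'hrnn_coldstart_solution_version_arn',
    'sims_solution_version_arn',
    'ranking_solution_version_arn',
]

def missing_variables(names, namespace):
    """Return the names that are not defined in the given namespace (a dict)."""
    return [name for name in names if name not in namespace]

# Demo with a sample namespace in which the last variable is absent.
sample = {name: 'placeholder-arn' for name in REQUIRED[:-1]}
missing = missing_variables(REQUIRED, sample)
print(missing)  # ['ranking_solution_version_arn']
```

In the notebook itself, the call would be `missing_variables(REQUIRED, globals())`; an empty list means every ARN was restored.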

Define a suffix that appends a random number to the names of the objects to be created.

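## Getting Solution Evaluation Metrics

In this part, we look at the evaluation metrics that Amazon Personalize provides for a solution by default.
To generate these metrics, Amazon Personalize randomly sets aside the interaction data of about 10% of the users for testing.

The image below shows how Amazon Personalize splits the data. Suppose there are 10 users with 10 interactions each (each circle represents one interaction), ordered from oldest to newest by timestamp. Amazon Personalize uses all of the interaction data from 90% of the users (blue circles) to train the solution version and holds out the remaining 10% of the users for evaluation. For each user in that remaining 10%, 90% of their interaction data (green circles) is used as input to the trained model. The remaining 10% of their data (orange circles) is compared against the recommendations the model produces and is used to compute the evaluation metrics.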



![personalize metrics](static/imgs/personalize_metrics.png)
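The split described above can be imitated in a few lines. This is only an illustrative sketch, not Amazon Personalize's actual implementation: the function `split_for_evaluation`, its parameters, and the choice to hold out each evaluation user's newest interactions are assumptions made for the example.

```python
import random

def split_for_evaluation(interactions_by_user, user_frac=0.1, item_frac=0.1, seed=0):
    """Split {user_id: [(timestamp, item_id), ...]} into three parts:
    training data, model input for held-out users, and ground truth."""
    rng = random.Random(seed)
    users = sorted(interactions_by_user)
    rng.shuffle(users)

    # Hold out roughly `user_frac` of the users for evaluation.
    n_eval = max(1, round(len(users) * user_frac))
    eval_users, train_users = users[:n_eval], users[n_eval:]

    # Training users contribute all of their interactions (blue circles).
    train = {u: sorted(interactions_by_user[u]) for u in train_users}

    model_input, ground_truth = {}, {}
    for u in eval_users:
        ordered = sorted(interactions_by_user[u])  # oldest first
        n_truth = max(1, round(len(ordered) * item_frac))
        model_input[u] = ordered[:-n_truth]   # green circles
        ground_truth[u] = ordered[-n_truth:]  # orange circles (assumed: newest)
    return train, model_input, ground_truth

# 10 users with 10 interactions each, as in the figure.
data = {f"user{u}": [(t, f"item{t}") for t in range(10)] for u in range(10)}
train, model_input, ground_truth = split_for_evaluation(data)
print(len(train), len(model_input), len(ground_truth))  # 9 1 1
```

With 10 users and 10 interactions each, 9 users go entirely to training, and the single evaluation user contributes 9 interactions as model input and 1 as ground truth.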

The [definitions of the solution evaluation metrics](https://docs.aws.amazon.com/personalize/latest/dg/working-with-training-metrics.html)
are available in the developer guide. The bottom of the page [Solution evaluation metrics example](http://francescopochetti.com/recommend-expedia-hotels-with-amazon-personalize-the-magic-of-hierarchical-rnns/) also has a more intuitive illustration.
 <br>
Examples of reciprocal_rank_at_5, normalized_discounted_cumulative_gain_at_5, and precision_at_5 are shown below.
* Example
    * Suppose we returned a list of 5 recommendations and the 2nd and 5th matched the actual data. This can be written compactly as [0,1,0,0,1].
        * reciprocal_rank_at_5
            * 1/2 (0.5) # only the earliest matching position counts
        * normalized_discounted_cumulative_gain_at_5
            * (1/log(1+2) + 1/log(1+5)) / (1/log(1+1) + 1/log(1+2)) = 0.6241 # the denominator is the score of the ideal ordering [1,1,0,0,0]
        * precision_at_5
            * 2/5 (0.4)
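The three worked values above can be reproduced with a short sketch. The function names are hypothetical, and the NDCG normalization assumes, as in the example, that the ideal ordering places the same number of hits at the top of the list; this is not necessarily the exact computation Amazon Personalize runs internally.

```python
import math

def reciprocal_rank(hits):
    """1 / rank of the first relevant item, or 0.0 if nothing matched."""
    for rank, hit in enumerate(hits, start=1):
        if hit:
            return 1.0 / rank
    return 0.0

def precision_at_k(hits, k):
    """Fraction of the top-k recommendations that are relevant."""
    return sum(hits[:k]) / k

def ndcg_at_k(hits, k):
    """DCG of the list divided by the DCG of the ideal ordering."""
    def dcg(values):
        return sum(v / math.log2(1 + rank)
                   for rank, v in enumerate(values, start=1))
    top_k = hits[:k]
    ideal = sorted(top_k, reverse=True)  # all hits moved to the front
    best = dcg(ideal)
    return dcg(top_k) / best if best else 0.0

hits = [0, 1, 0, 0, 1]  # 2nd and 5th recommendations matched
print(reciprocal_rank(hits))         # 0.5
print(round(ndcg_at_k(hits, 5), 4))  # 0.6241
print(precision_at_k(hits, 5))       # 0.4
```

The logarithm base cancels in the NDCG ratio, so base 2 here gives the same result as any other base.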



#### To obtain more detailed, custom evaluation metrics, we will create campaigns and run separate tests with the test data that was split off earlier.



In [4]:
# Collects one row of metrics per recipe; converted to a DataFrame in the summary.
metrics = []

def build_metric_matrix(solution, response):
    """Append the metrics from a get_solution_metrics response as one row."""
    m = response['metrics']
    metrics.append([solution,
                    m['coverage'],
                    m['mean_reciprocal_rank_at_25'],
                    m['normalized_discounted_cumulative_gain_at_5'],
                    m['normalized_discounted_cumulative_gain_at_10'],
                    m['normalized_discounted_cumulative_gain_at_25'],
                    m['precision_at_5'],
                    m['precision_at_10'],
                    m['precision_at_25']])
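The per-recipe cells that follow all repeat the same call. As an alternative, the rows could be collected in a single loop; the sketch below is one possible shape for that, not the notebook's own method. `collect_metrics`, `METRIC_KEYS`, and `FakePersonalize` are hypothetical names, the ARN is a placeholder, and the 0.5 values are dummy data so the example runs without AWS credentials.

```python
METRIC_KEYS = [
    'coverage',
    'mean_reciprocal_rank_at_25',
    'normalized_discounted_cumulative_gain_at_5',
    'normalized_discounted_cumulative_gain_at_10',
    'normalized_discounted_cumulative_gain_at_25',
    'precision_at_5',
    'precision_at_10',
    'precision_at_25',
]

def collect_metrics(client, version_arns):
    """Return one row [recipe, metric values...] per solution version.

    client: an object exposing get_solution_metrics(solutionVersionArn=...),
            such as boto3.client('personalize').
    version_arns: {recipe_name: solution_version_arn}
    """
    rows = []
    for recipe, arn in version_arns.items():
        response = client.get_solution_metrics(solutionVersionArn=arn)
        # .get() yields None instead of raising if a metric is absent.
        rows.append([recipe] + [response['metrics'].get(k) for k in METRIC_KEYS])
    return rows

class FakePersonalize:
    """Offline stand-in for the real client; returns dummy metric values."""
    def get_solution_metrics(self, solutionVersionArn):
        return {'solutionVersionArn': solutionVersionArn,
                'metrics': {key: 0.5 for key in METRIC_KEYS}}

rows = collect_metrics(FakePersonalize(), {'demo': 'arn:placeholder'})
print(rows[0][0], len(rows[0]))  # demo 9
```

In the notebook, the real `personalize` client and a dictionary such as `{'popularity': popularity_solution_version_arn, ...}` would take the place of the stand-ins.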

#### Metrics: Popularity

In [5]:
get_solution_metrics_response = personalize.get_solution_metrics(
    solutionVersionArn = popularity_solution_version_arn 
)

print(json.dumps(get_solution_metrics_response, indent=2))
build_metric_matrix('popularity',get_solution_metrics_response)


{
  "solutionVersionArn": "arn:aws:personalize:ap-northeast-2:057716757052:solution/Movielens-popularity-91891/51d69553",
  "metrics": {
    "coverage": 0.0136,
    "mean_reciprocal_rank_at_25": 0.0559,
    "normalized_discounted_cumulative_gain_at_10": 0.067,
    "normalized_discounted_cumulative_gain_at_25": 0.1113,
    "normalized_discounted_cumulative_gain_at_5": 0.0395,
    "precision_at_10": 0.02,
    "precision_at_25": 0.0182,
    "precision_at_5": 0.0163
  },
  "ResponseMetadata": {
    "RequestId": "ce843d46-3002-4326-95a7-94195256c5a4",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sun, 30 Aug 2020 10:40:46 GMT",
      "x-amzn-requestid": "ce843d46-3002-4326-95a7-94195256c5a4",
      "content-length": "412",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Metrics: User-Personalization

In [6]:
get_solution_metrics_response = personalize.get_solution_metrics(
    solutionVersionArn = user_personalization_solution_version_arn
)

print(json.dumps(get_solution_metrics_response, indent=2))

build_metric_matrix('user_personalization',get_solution_metrics_response)

{
  "solutionVersionArn": "arn:aws:personalize:ap-northeast-2:057716757052:solution/Movielens-user-personalization-91891/5ad2fc30",
  "metrics": {
    "coverage": 0.2832,
    "mean_reciprocal_rank_at_25": 0.2378,
    "normalized_discounted_cumulative_gain_at_10": 0.2313,
    "normalized_discounted_cumulative_gain_at_25": 0.3247,
    "normalized_discounted_cumulative_gain_at_5": 0.1836,
    "precision_at_10": 0.0799,
    "precision_at_25": 0.0627,
    "precision_at_5": 0.0958
  },
  "ResponseMetadata": {
    "RequestId": "61226ca8-8cb8-4689-b505-13e4f961d2b9",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sun, 30 Aug 2020 10:40:47 GMT",
      "x-amzn-requestid": "61226ca8-8cb8-4689-b505-13e4f961d2b9",
      "content-length": "425",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Metrics: HRNN

In [7]:
get_solution_metrics_response = personalize.get_solution_metrics(
    solutionVersionArn = hrnn_solution_version_arn
)

print(json.dumps(get_solution_metrics_response, indent=2))

build_metric_matrix('hrnn',get_solution_metrics_response)

{
  "solutionVersionArn": "arn:aws:personalize:ap-northeast-2:057716757052:solution/Movielens-hrnn-91891/47b38681",
  "metrics": {
    "coverage": 0.6355,
    "mean_reciprocal_rank_at_25": 0.2351,
    "normalized_discounted_cumulative_gain_at_10": 0.2334,
    "normalized_discounted_cumulative_gain_at_25": 0.3181,
    "normalized_discounted_cumulative_gain_at_5": 0.1914,
    "precision_at_10": 0.0749,
    "precision_at_25": 0.0568,
    "precision_at_5": 0.0926
  },
  "ResponseMetadata": {
    "RequestId": "b5b4e241-3857-4417-bc91-8b393f513146",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sun, 30 Aug 2020 10:40:47 GMT",
      "x-amzn-requestid": "b5b4e241-3857-4417-bc91-8b393f513146",
      "content-length": "409",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Metrics: HRNN-Meta

In [8]:
get_solution_metrics_response = personalize.get_solution_metrics(
    solutionVersionArn = hrnn_meta_solution_version_arn
)

print(json.dumps(get_solution_metrics_response, indent=2))
build_metric_matrix('hrnn_meta',get_solution_metrics_response)

{
  "solutionVersionArn": "arn:aws:personalize:ap-northeast-2:057716757052:solution/Movielens-aws-hrnn-metadata-91891/adaee5f0",
  "metrics": {
    "coverage": 0.5724,
    "mean_reciprocal_rank_at_25": 0.2234,
    "normalized_discounted_cumulative_gain_at_10": 0.2224,
    "normalized_discounted_cumulative_gain_at_25": 0.3076,
    "normalized_discounted_cumulative_gain_at_5": 0.1674,
    "precision_at_10": 0.077,
    "precision_at_25": 0.0588,
    "precision_at_5": 0.0862
  },
  "ResponseMetadata": {
    "RequestId": "fac30a88-8885-4030-8b21-67e435d2f3ea",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sun, 30 Aug 2020 10:40:47 GMT",
      "x-amzn-requestid": "fac30a88-8885-4030-8b21-67e435d2f3ea",
      "content-length": "421",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Metrics: HRNN-Coldstart

In [9]:
get_solution_metrics_response = personalize.get_solution_metrics(
    solutionVersionArn = hrnn_coldstart_solution_version_arn
)

print(json.dumps(get_solution_metrics_response, indent=2))
build_metric_matrix('hrnn_coldstart',get_solution_metrics_response)

{
  "solutionVersionArn": "arn:aws:personalize:ap-northeast-2:057716757052:solution/Movielens-hrnn-coldstart-91891/5afbfa75",
  "metrics": {
    "coverage": 0.2363,
    "mean_reciprocal_rank_at_25": 0.0026,
    "normalized_discounted_cumulative_gain_at_10": 0.0039,
    "normalized_discounted_cumulative_gain_at_25": 0.0066,
    "normalized_discounted_cumulative_gain_at_5": 0.0026,
    "precision_at_10": 0.001,
    "precision_at_25": 0.0008,
    "precision_at_5": 0.001
  },
  "ResponseMetadata": {
    "RequestId": "ed17b0d7-b78d-4a51-a9b2-fb1ff2c090a9",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sun, 30 Aug 2020 10:40:47 GMT",
      "x-amzn-requestid": "ed17b0d7-b78d-4a51-a9b2-fb1ff2c090a9",
      "content-length": "417",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Metrics: SIMS

In [10]:
get_solution_metrics_response = personalize.get_solution_metrics(
    solutionVersionArn = sims_solution_version_arn
)

print(json.dumps(get_solution_metrics_response, indent=2))
build_metric_matrix('sims',get_solution_metrics_response)

{
  "solutionVersionArn": "arn:aws:personalize:ap-northeast-2:057716757052:solution/Movielens-sims-91891/84632efc",
  "metrics": {
    "coverage": 0.7622,
    "mean_reciprocal_rank_at_25": 0.1809,
    "normalized_discounted_cumulative_gain_at_10": 0.1824,
    "normalized_discounted_cumulative_gain_at_25": 0.2514,
    "normalized_discounted_cumulative_gain_at_5": 0.1505,
    "precision_at_10": 0.0533,
    "precision_at_25": 0.0407,
    "precision_at_5": 0.0686
  },
  "ResponseMetadata": {
    "RequestId": "5cbb9564-cb0b-4de0-b4ba-3d74396c49ba",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sun, 30 Aug 2020 10:40:46 GMT",
      "x-amzn-requestid": "5cbb9564-cb0b-4de0-b4ba-3d74396c49ba",
      "content-length": "409",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Metrics: Ranking

In [11]:
get_solution_metrics_response = personalize.get_solution_metrics(
    solutionVersionArn = ranking_solution_version_arn 
)

print(json.dumps(get_solution_metrics_response, indent=2))
build_metric_matrix('ranking',get_solution_metrics_response)

{
  "solutionVersionArn": "arn:aws:personalize:ap-northeast-2:057716757052:solution/Movielens-ranking-91891/04a98ddc",
  "metrics": {
    "coverage": 0.0136,
    "mean_reciprocal_rank_at_25": 0.0908,
    "normalized_discounted_cumulative_gain_at_10": 0.1187,
    "normalized_discounted_cumulative_gain_at_25": 0.1415,
    "normalized_discounted_cumulative_gain_at_5": 0.0949,
    "precision_at_10": 0.0238,
    "precision_at_25": 0.0145,
    "precision_at_5": 0.0299
  },
  "ResponseMetadata": {
    "RequestId": "e187e9bb-8a84-427a-b202-ca903bf38a80",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Sun, 30 Aug 2020 10:40:47 GMT",
      "x-amzn-requestid": "e187e9bb-8a84-427a-b202-ca903bf38a80",
      "content-length": "412",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


## Summary Metrics

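Recipes fall into three broad types. Let's look at each in turn.
#### 1. USER_PERSONALIZATION Recipes
- There are five: popularity, user_personalization, hrnn, hrnn_meta, and hrnn_coldstart.
- popularity is mainly used as a baseline recipe. It has the lowest coverage in this group, and on the ranking metrics it scores below every recipe in the group except hrnn_coldstart.
- user_personalization performs best overall: it leads on mrr@25, ndcg@25, and all three precision metrics, while hrnn is slightly ahead on ndcg@5 and ndcg@10. Its exploration_weight defaults to 0.3. It achieves this even though the dataset contains almost no cold-start items.
- hrnn_coldstart scores low, which looks normal given that the dataset contains almost no cold-start items.

#### 2. RELATED_ITEMS Recipes
- sims belongs to this category and has the highest coverage of all the recipes (0.7622).

#### 3. PERSONALIZED_RANKING Recipes
- ranking belongs to this category.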


![Fig.3.2.metric_summary.png](static/imgs/Fig.3.2.metric_summary.png)

In [12]:
recipe_metrics=pd.DataFrame(metrics,columns=['recipe','coverage','mrr@25','ndcg@5','ndcg@10','ndcg@25','p@5','p@10','p@25'])

recipe_metrics

|   | recipe | coverage | mrr@25 | ndcg@5 | ndcg@10 | ndcg@25 | p@5 | p@10 | p@25 |
|---|---|---|---|---|---|---|---|---|---|
| 0 | popularity | 0.0136 | 0.0559 | 0.0395 | 0.067 | 0.1113 | 0.0163 | 0.02 | 0.0182 |
| 1 | user_personalization | 0.2832 | 0.2378 | 0.1836 | 0.2313 | 0.3247 | 0.0958 | 0.0799 | 0.0627 |
| 2 | hrnn | 0.6355 | 0.2351 | 0.1914 | 0.2334 | 0.3181 | 0.0926 | 0.0749 | 0.0568 |
| 3 | hrnn_meta | 0.5724 | 0.2234 | 0.1674 | 0.2224 | 0.3076 | 0.0862 | 0.077 | 0.0588 |
| 4 | hrnn_coldstart | 0.2363 | 0.0026 | 0.0026 | 0.0039 | 0.0066 | 0.001 | 0.001 | 0.0008 |
| 5 | sims | 0.7622 | 0.1809 | 0.1505 | 0.1824 | 0.2514 | 0.0686 | 0.0533 | 0.0407 |
| 6 | ranking | 0.0136 | 0.0908 | 0.0949 | 0.1187 | 0.1415 | 0.0299 | 0.0238 | 0.0145 |
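To see at a glance which recipe leads on each metric, the table can be scanned column by column. The sketch below uses only the standard library and copies the values from the table above; the names `columns`, `rows`, and `best` are chosen for this example.

```python
columns = ['coverage', 'mrr@25', 'ndcg@5', 'ndcg@10', 'ndcg@25', 'p@5', 'p@10', 'p@25']

# Values copied from the summary table, one list per recipe.
rows = {
    'popularity':           [0.0136, 0.0559, 0.0395, 0.067,  0.1113, 0.0163, 0.02,   0.0182],
    'user_personalization': [0.2832, 0.2378, 0.1836, 0.2313, 0.3247, 0.0958, 0.0799, 0.0627],
    'hrnn':                 [0.6355, 0.2351, 0.1914, 0.2334, 0.3181, 0.0926, 0.0749, 0.0568],
    'hrnn_meta':            [0.5724, 0.2234, 0.1674, 0.2224, 0.3076, 0.0862, 0.077,  0.0588],
    'hrnn_coldstart':       [0.2363, 0.0026, 0.0026, 0.0039, 0.0066, 0.001,  0.001,  0.0008],
    'sims':                 [0.7622, 0.1809, 0.1505, 0.1824, 0.2514, 0.0686, 0.0533, 0.0407],
    'ranking':              [0.0136, 0.0908, 0.0949, 0.1187, 0.1415, 0.0299, 0.0238, 0.0145],
}

# For each metric, pick the recipe with the highest value in that column.
best = {col: max(rows, key=lambda recipe: rows[recipe][i])
        for i, col in enumerate(columns)}

for col, recipe in best.items():
    print(f"{col:>8}: {recipe}")
```

With the `recipe_metrics` DataFrame from the cell above, `recipe_metrics.set_index('recipe').idxmax()` should give the same per-column winners. Keep in mind that the three recipe types answer different questions, so comparing across types is only a rough guide.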


In [14]:
%store user_personalization_solution_version_arn
%store user_personalization_solution_arn


%store hrnn_solution_version_arn
%store hrnn_solution_arn

%store hrnn_meta_solution_version_arn
%store hrnn_meta_solution_arn

%store hrnn_coldstart_solution_version_arn
%store hrnn_coldstart_solution_arn

%store sims_solution_version_arn
%store sims_solution_arn


Stored 'user_personalization_solution_version_arn' (str)
Stored 'user_personalization_solution_arn' (str)
Stored 'hrnn_solution_version_arn' (str)
Stored 'hrnn_solution_arn' (str)
Stored 'hrnn_meta_solution_version_arn' (str)
Stored 'hrnn_meta_solution_arn' (str)
Stored 'hrnn_coldstart_solution_version_arn' (str)
Stored 'hrnn_coldstart_solution_arn' (str)
Stored 'sims_solution_version_arn' (str)
Stored 'sims_solution_arn' (str)
