## Association-based analysis
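- Association rule analysis: an analysis method that discovers **association rules between items** (products bought together, X -> Y) within a series of transactions or events
- Unsupervised learning
- `Market Basket Analysis`
    - Analyzes customers' purchase records to measure the association (conditional probability) between products
    - Identifies the products a customer is likely to buy together with a given product
    - e.g., finding that customers who buy diapers are also likely to buy beer, then placing diapers and beer close together to raise sales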
    

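### Concepts of association rule analysis
- Criteria and metrics for deciding which products are bought together
- Support: the probability that A (the antecedent) and B (the consequent) occur together
    - The closer to 1, the more often A and B occur together
    - $Support(A -> B)$ and $Support(B -> A)$ cannot be distinguished: both equal $P(A \cap B)$
- Confidence: the conditional probability of choosing B given that A was chosen, $P(B|A)$
    - The closer to 1, the more strongly choosing A implies choosing B
    - $Confidence(A -> B) \neq Confidence(B -> A)$ in general
- Lift: the ratio of the confidence of A -> B to the overall proportion of transactions that contain B
    - Indicates how strongly the occurrences of A and B are correlated; range $[0, \infty)$
    - Lift($A -> B$) = $\frac{confidence(A -> B)}{support(B)}$ = $\frac{support(A \cap B)}{support(A) \times support(B)}$
    - Lift($A -> B$) = 1 : A and B are independent
    - Lift($A -> B$) > 1 : complementary (A and B are positively correlated)
    - Lift($A -> B$) < 1 : substitutes (A and B are negatively correlated)
- Leverage: the difference between the observed probability of A and B occurring together and the value $P(A) \times P(B)$ expected if they were independent
    - $Leverage(A -> B) = support(A \cap B) - support(A) \times support(B)$
    - Range: [-1, 1]; 0 indicates independence
- Conviction: compensates for a weakness of confidence, which cannot tell a real relationship from one that arises by chance when A and B are independent
    - A product may simply sell so often that it is frequently bought together with anything, so a rule with that product as its consequent can show high confidence without any real relationship
    - Conviction($A -> B$) = $\frac{P(\lnot B)}{P(\lnot B|A)}$ = $\frac{1 - support(B)}{1 - confidence(A -> B)}$
        - Denominator, the probability of buying A but not B: $P(\lnot B|A) = 1 - P(B|A) = 1 - confidence(A -> B)$
            - Numerator, the same quantity when A and B are independent: $1 - P(B|A) = 1 - \frac{P(A) \times P(B)}{P(A)} = 1 - P(B)$
        - Range: $[0, \infty]$; 1 indicates independence, and values greater than 1 are desirable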
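
The five metrics above can be checked by hand on a small example. The following is a minimal sketch in plain Python; the transactions and the helper names `support` and `metrics` are made up for illustration and are not part of mlxtend.

```python
# Toy transactions (hypothetical data, for illustration only)
transactions = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"beer", "milk"},
]

def support(items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def metrics(a, b):
    """Compute the five rule metrics for the rule a -> b."""
    s_a, s_b = support(a), support(b)
    s_ab = support(set(a) | set(b))  # both sides appear in the same transaction
    confidence = s_ab / s_a
    # Conviction is infinite when the rule never fails (confidence == 1)
    conviction = float("inf") if confidence == 1 else (1 - s_b) / (1 - confidence)
    return {
        "support": s_ab,
        "confidence": confidence,
        "lift": confidence / s_b,
        "leverage": s_ab - s_a * s_b,
        "conviction": conviction,
    }

m_ab = metrics({"diapers"}, {"beer"})  # diapers -> beer
m_ba = metrics({"beer"}, {"diapers"})  # beer -> diapers
print(m_ab)
print(m_ba)
```

With this toy data, confidence is 1.0 for diapers -> beer but 0.5 for beer -> diapers, while support, lift (1.25), and leverage are the same in both directions.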

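### Association-based recommendation
- Movies rated 4.0 or higher
- Movies with support of at least 0.1 (frequent movies)
- Extract the association rules with lift of at least 1, sorted by lift in descending order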

In [1]:
import numpy as np
import pandas as pd

In [2]:
u_cols = ['user_id', 'age', 'sex', 'occupation', 'zip_code']
users = pd.read_csv('/Users/jun/Library/Mobile Documents/com~apple~CloudDocs/Github/ai _recommendation _system/data/u.user', sep='|', names=u_cols, encoding='latin-1')

In [3]:
i_cols = ['movie_id', 'title', 'release date', 'video release date', 'IMDB URL', 'unknown', 
          'Action', 'Adventure', 'Animation', 'Children\'s', 'Comedy', 'Crime', 'Documentary', 
          'Drama', 'Fantasy', 'Film-Noir', 'Horror', 'Musical', 'Mystery', 'Romance', 'Sci-Fi', 
          'Thriller', 'War', 'Western']
movies = pd.read_csv('/Users/jun/Library/Mobile Documents/com~apple~CloudDocs/Github/ai _recommendation _system/data/u.item', sep='|', names=i_cols, encoding='latin-1')
movies.head()

Unnamed: 0,movie_id,title,release date,video release date,IMDB URL,unknown,Action,Adventure,Animation,Children's,...,Fantasy,Film-Noir,Horror,Musical,Mystery,Romance,Sci-Fi,Thriller,War,Western
0,1,Toy Story (1995),01-Jan-1995,,http://us.imdb.com/M/title-exact?Toy%20Story%2...,0,0,0,1,1,...,0,0,0,0,0,0,0,0,0,0
1,2,GoldenEye (1995),01-Jan-1995,,http://us.imdb.com/M/title-exact?GoldenEye%20(...,0,1,1,0,0,...,0,0,0,0,0,0,0,1,0,0
2,3,Four Rooms (1995),01-Jan-1995,,http://us.imdb.com/M/title-exact?Four%20Rooms%...,0,0,0,0,0,...,0,0,0,0,0,0,0,1,0,0
3,4,Get Shorty (1995),01-Jan-1995,,http://us.imdb.com/M/title-exact?Get%20Shorty%...,0,1,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,5,Copycat (1995),01-Jan-1995,,http://us.imdb.com/M/title-exact?Copycat%20(1995),0,0,0,0,0,...,0,0,0,0,0,0,0,1,0,0


In [4]:
r_cols = ['user_id', 'movie_id', 'rating', 'timestamp']
ratings = pd.read_csv('/Users/jun/Library/Mobile Documents/com~apple~CloudDocs/Github/ai _recommendation _system/data/u.data', sep='\t', names=r_cols, encoding='latin-1')
ratings.head()

Unnamed: 0,user_id,movie_id,rating,timestamp
0,196,242,3,881250949
1,186,302,3,891717742
2,22,377,1,878887116
3,244,51,2,880606923
4,166,346,1,886397596


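### Train/test split
- Split the data into training and test sets at a fixed ratio, stratified by user_id (`stratify=y`), so that each user's ratings are divided in the same proportion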
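
To make the idea of stratifying by user concrete, here is a rough sketch in plain Python that holds out about 25% of each user's ratings. It is not how scikit-learn implements `stratify`; the function name `split_per_user` and the toy rows are assumptions made for illustration.

```python
import random
from collections import defaultdict

def split_per_user(rows, test_size=0.25, seed=42):
    """Hold out roughly `test_size` of each user's rows.

    Each row is a (user_id, movie_id, rating) tuple.
    """
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for row in rows:
        by_user[row[0]].append(row)

    train, test = [], []
    for user_rows in by_user.values():
        rng.shuffle(user_rows)                      # random order within one user
        n_test = round(len(user_rows) * test_size)  # per-user test share
        test.extend(user_rows[:n_test])
        train.extend(user_rows[n_test:])
    return train, test

# Hypothetical toy ratings: user 1 has 8 ratings, user 2 has 4
rows = [(1, m, 5) for m in range(8)] + [(2, m, 3) for m in range(4)]
train, test = split_per_user(rows)
print(len(train), len(test))
```

Because the split is done inside each user's group, every user with enough ratings appears in both sets, which is the property the recommendation step relies on.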

In [5]:
from sklearn.model_selection import train_test_split
x = ratings.copy()
y = ratings['user_id']
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state= 42, stratify=y)

### Training data: user × movie rating matrix

In [6]:
# Build the full user × movie matrix from the training data
user_movie_matrix_org = x_train.pivot(index='user_id', columns='movie_id', values='rating')
user_movie_matrix = user_movie_matrix_org.copy()

In [7]:
user_movie_matrix

user_id,1,2,3,4,5,6,7,8,9,10,...,1671,1672,1673,1674,1676,1677,1679,1680,1681,1682
1,5.0,3.0,4.0,,3.0,5.0,4.0,1.0,5.0,3.0,...,,,,,,,,,,
2,4.0,,,,,,,,,2.0,...,,,,,,,,,,
3,,,,,,,,,,,...,,,,,,,,,,
4,,,,,,,,,,,...,,,,,,,,,,
5,4.0,3.0,,,,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
939,,,,,,,,,,,...,,,,,,,,,,
940,,,,2.0,,,4.0,5.0,3.0,,...,,,,,,,,,,
941,,,,,,,4.0,,,,...,,,,,,,,,,
942,,,,,,,,,,,...,,,,,,,,,,


### Keep only movies rated 4 or higher

In [8]:
# For the mlxtend input: ratings of 4 or higher become 1; ratings below 4 and missing values become 0
user_movie_matrix[user_movie_matrix < 4] = 0
user_movie_matrix[user_movie_matrix.isnull()] = 0
user_movie_matrix[user_movie_matrix >= 4] = 1

user_movie_matrix.head()

user_id,1,2,3,4,5,6,7,8,9,10,...,1671,1672,1673,1674,1676,1677,1679,1680,1681,1682
1,1.0,0.0,1.0,0.0,0.0,1.0,1.0,0.0,1.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [None]:
#!pip install mlxtend -U

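### mlxtend library
- Extracts frequent itemsets whose support is at or above a given threshold
- Derives association rules whose confidence is at or above a given threshold
- Input: a transaction × item table of True/False values
- Output: a DataFrame whose rule metrics include support, confidence, lift, leverage, and conviction
- Provides a tool for converting list-form data into a transaction × item table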
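
The conversion from list-form data to a transaction × item table mentioned in the last bullet can be sketched in plain Python as follows. In mlxtend itself this job is done by `TransactionEncoder` in `mlxtend.preprocessing`; the code below only mimics the idea, and the toy transactions are hypothetical.

```python
# Hypothetical list-form transactions
transactions = [
    ["milk", "bread"],
    ["milk", "diapers", "beer"],
    ["bread", "beer"],
]

# Column order: every distinct item, sorted alphabetically
items = sorted({item for t in transactions for item in t})

# One row per transaction, one True/False value per item
encoded = [[item in t for item in items] for t in transactions]

print(items)
for row in encoded:
    print(row)
```

Wrapping `encoded` in a DataFrame with `items` as the column names would give the True/False table that `apriori` expects as input.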


In [9]:
from mlxtend.frequent_patterns import apriori, association_rules

# Extract frequent itemsets (movies) with support of at least 0.1
freq_movies = apriori(user_movie_matrix, min_support=0.1, use_colnames=True)
freq_movies



Unnamed: 0,support,itemsets
0,0.255567,(1)
1,0.209968,(7)
2,0.125133,(8)
3,0.170732,(9)
4,0.133616,(11)
...,...,...
190,0.117709,"(210, 174)"
191,0.100742,"(174, 318)"
192,0.100742,"(210, 181)"
193,0.114528,"(258, 181)"


In [10]:
# Movies with the highest support
freq_movies.sort_values('support', ascending=False).head()

Unnamed: 0,support,itemsets
12,0.407211,(50)
27,0.323436,(100)
48,0.314952,(181)
77,0.277837,(258)
32,0.272534,(127)


In [11]:
movies[movies.movie_id == 181].title

180    Return of the Jedi (1983)
Name: title, dtype: object

In [12]:
# Compute association rules (shown in descending order of lift)
# min_threshold=1 keeps only rules with lift >= 1
rules = association_rules(freq_movies, metric='lift', min_threshold=1)
rules.sort_values('lift', ascending=False).head()[['antecedents', 'consequents', 'lift']]

Unnamed: 0,antecedents,consequents,lift
156,(172),"(50, 174)",2.559385
153,"(50, 174)",(172),2.559385
128,(204),(172),2.524831
129,(172),(204),2.524831
130,(210),(172),2.444412


In [13]:
movies[movies.movie_id == 172]

Unnamed: 0,movie_id,title,release date,video release date,IMDB URL,unknown,Action,Adventure,Animation,Children's,...,Fantasy,Film-Noir,Horror,Musical,Mystery,Romance,Sci-Fi,Thriller,War,Western
171,172,"Empire Strikes Back, The (1980)",01-Jan-1980,,http://us.imdb.com/M/title-exact?Empire%20Stri...,0,1,1,0,0,...,0,0,0,0,0,1,1,0,1,0


In [14]:
movies[movies.movie_id == 50]

Unnamed: 0,movie_id,title,release date,video release date,IMDB URL,unknown,Action,Adventure,Animation,Children's,...,Fantasy,Film-Noir,Horror,Musical,Mystery,Romance,Sci-Fi,Thriller,War,Western
49,50,Star Wars (1977),01-Jan-1977,,http://us.imdb.com/M/title-exact?Star%20Wars%2...,0,1,1,0,0,...,0,0,0,0,0,1,1,0,1,0


In [15]:
movies[movies.movie_id == 174]

Unnamed: 0,movie_id,title,release date,video release date,IMDB URL,unknown,Action,Adventure,Animation,Children's,...,Fantasy,Film-Noir,Horror,Musical,Mystery,Romance,Sci-Fi,Thriller,War,Western
173,174,Raiders of the Lost Ark (1981),01-Jan-1981,,http://us.imdb.com/M/title-exact?Raiders%20of%...,0,1,1,0,0,...,0,0,0,0,0,0,0,0,0,0


### Extract the list of movies each user rated in the training data

In [16]:
from collections import defaultdict, Counter
pred_user2items = defaultdict(list) # initialize; used later to store the recommended movies for each user

# From the training data, build a {user_id: [movie_id, ...]} dictionary of the movies each user rated, grouped by the user_id column
user_evaluated_movies = x_train.groupby("user_id").agg({"movie_id": list})["movie_id"].to_dict()
user_evaluated_movies

{1: [119,
  205,
  196,
  81,
  243,
  192,
  171,
  191,
  128,
  226,
  169,
  117,
  1,
  104,
  111,
  161,
  231,
  34,
  255,
  82,
  219,
  118,
  198,
  71,
  122,
  185,
  129,
  93,
  245,
  30,
  134,
  87,
  76,
  114,
  83,
  221,
  148,
  138,
  190,
  260,
  47,
  105,
  241,
  213,
  184,
  142,
  24,
  79,
  12,
  32,
  216,
  3,
  251,
  170,
  49,
  159,
  10,
  77,
  236,
  144,
  209,
  175,
  100,
  91,
  95,
  272,
  52,
  220,
  98,
  143,
  67,
  121,
  189,
  25,
  106,
  97,
  256,
  266,
  176,
  46,
  23,
  174,
  36,
  55,
  248,
  180,
  154,
  212,
  45,
  80,
  44,
  99,
  127,
  261,
  13,
  78,
  247,
  54,
  193,
  177,
  253,
  200,
  42,
  90,
  218,
  257,
  238,
  156,
  21,
  173,
  137,
  230,
  29,
  72,
  227,
  160,
  232,
  178,
  70,
  102,
  51,
  84,
  16,
  223,
  150,
  139,
  183,
  140,
  31,
  120,
  53,
  41,
  74,
  5,
  259,
  181,
  115,
  9,
  132,
  252,
  88,
  240,
  268,
  254,
  14,
  43,
  202,
  214,
  258,
  210,
  116,

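- `agg({"movie_id": list})`: for each group, collects the values of the movie_id column into a list, so all the movie_ids a user rated become one list.
- `user_evaluated_movies`: a dictionary holding the list of movies each user rated
- Useful for excluding movies the user has already rated when recommending new ones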
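
The dictionary that the `groupby(...).agg(...)` call produces, and the way it is used to drop already-rated movies, can be mirrored in plain Python. This is only a sketch of the same idea; the `pairs` and `candidates` values are hypothetical stand-ins for rows of `x_train` and for rule consequents.

```python
from collections import defaultdict

# Hypothetical (user_id, movie_id) pairs standing in for rows of x_train
pairs = [(1, 119), (1, 205), (2, 100), (1, 196), (2, 257)]

# Same shape as user_evaluated_movies: {user_id: [movie_id, ...]}
user_evaluated = defaultdict(list)
for user_id, movie_id in pairs:
    user_evaluated[user_id].append(movie_id)

# Drop candidates that user 1 has already rated
candidates = [205, 50, 119, 174]  # hypothetical recommendation candidates
seen = set(user_evaluated[1])     # a set makes membership tests cheap
new_for_user_1 = [m for m in candidates if m not in seen]

print(dict(user_evaluated))
print(new_for_user_1)
```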

In [17]:
x_train.groupby("user_id").agg({"movie_id": list})

user_id,movie_id
1,"[119, 205, 196, 81, 243, 192, 171, 191, 128, 2..."
2,"[100, 257, 303, 315, 313, 281, 279, 307, 237, ..."
3,"[328, 346, 320, 181, 330, 321, 348, 260, 333, ..."
4,"[359, 358, 303, 210, 50, 301, 329, 264, 294, 2..."
5,"[418, 368, 423, 404, 445, 436, 169, 384, 80, 1..."
...,...
939,"[546, 689, 121, 274, 409, 257, 275, 222, 106, ..."
940,"[272, 427, 150, 286, 300, 313, 147, 50, 258, 2..."
941,"[7, 117, 475, 1007, 408, 919, 455, 124, 993, 2..."
942,"[357, 514, 300, 318, 520, 498, 661, 892, 50, 2..."


In [18]:
x_train.groupby("user_id").agg({"movie_id": list})["movie_id"]

user_id
1      [119, 205, 196, 81, 243, 192, 171, 191, 128, 2...
2      [100, 257, 303, 315, 313, 281, 279, 307, 237, ...
3      [328, 346, 320, 181, 330, 321, 348, 260, 333, ...
4      [359, 358, 303, 210, 50, 301, 329, 264, 294, 2...
5      [418, 368, 423, 404, 445, 436, 169, 384, 80, 1...
                             ...                        
939    [546, 689, 121, 274, 409, 257, 275, 222, 106, ...
940    [272, 427, 150, 286, 300, 313, 147, 50, 258, 2...
941    [7, 117, 475, 1007, 408, 919, 455, 124, 993, 2...
942    [357, 514, 300, 318, 520, 498, 661, 892, 50, 2...
943    [58, 403, 825, 1188, 576, 92, 415, 406, 117, 1...
Name: movie_id, Length: 943, dtype: object

In [19]:
# Keep only the training rows with a rating of 4 or higher
movielens_train_high_rating = x_train[x_train.rating >= 4]
movielens_train_high_rating

Unnamed: 0,user_id,movie_id,rating,timestamp
99479,862,177,4,879305016
19586,70,193,4,884149646
75058,666,527,4,880139253
33525,535,168,5,879618385
45393,603,1240,5,891956058
...,...,...,...,...
1183,313,436,4,891029877
43508,439,591,4,882892818
62749,851,273,5,891961663
14759,283,50,5,879297134


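### Recommendation using association rules
- Implements a system that recommends `related movies` for the movies a user rated most recently
- The main goal is to find the rules associated with the movies the user watched most recently and to recommend new movies according to those rules
- Candidate movies are taken from the consequents of the rules, and those the user has not yet rated are added to the recommendation list.
    - For each user, take the 5 most recently rated movies (by timestamp, among those rated 4 or higher): [a, b, c, d, e]
    - Extract the rules whose antecedent contains any of [a, b, c, d, e]
    - Collect the consequent movies of those rules into a list
    - From that list, pick up to 10 of the most frequent movies that the user has not yet rated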

In [20]:
pred_user2items = defaultdict(list)
for user_id, data in movielens_train_high_rating.groupby("user_id"):
    
    # Get the 5 movies the user rated most recently
    input_data = data.sort_values("timestamp")["movie_id"].tolist()[-5:]
    
    # Find the association rules whose antecedent contains at least one of those movies
    matched_flags = rules.antecedents.apply(lambda x: len(set(input_data) & x)) >= 1

    # Collect the consequent movies of the matched rules; they are later ordered by frequency and added to the recommendations if the user has not rated them yet
    consequent_movies = []
    for i, row in rules[matched_flags].sort_values("lift", ascending=False).iterrows():
        consequent_movies.extend(row["consequents"])
    
    # Count how often each movie appears among the consequents
    counter = Counter(consequent_movies)
    for movie_id, movie_cnt in counter.most_common():
        if movie_id not in user_evaluated_movies[user_id]:
            pred_user2items[user_id].append(movie_id)
        # Stop once the recommendation list has 10 movies
        if len(pred_user2items[user_id]) == 10:
            break


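- Matching association rules
    - `matched_flags = rules.antecedents.apply(lambda x: len(set(input_data) & x)) >= 1`: finds the rules in `rules` whose antecedents contain any of the most recently rated movies (`input_data`).
        - `rules.antecedents`: the antecedent of each rule; a rule is selected if its antecedent contains at least one of the user's recent movies
        - `set(input_data) & x`: the intersection of the recent movies and the rule's antecedent; the rule matches if the intersection has one or more elements
- Extracting and ordering the consequent movies
    - The consequent movies are extracted from the matched rules; the consequent holds the movies that the rule predicts.
    - `rules[matched_flags]`: the matched rules, which are then sorted by lift in descending order
        - lift indicates how much more likely the consequent is when the antecedent is present than it is overall
    - Each consequent movie of the sorted rules is appended to the `consequent_movies` list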
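
The matching and counting steps described above can be traced on a handful of rules without pandas. The sketch below uses made-up rules, lift values, and viewing history; `toy_rules`, `recent`, and `already_rated` are hypothetical names, not objects from the notebook.

```python
from collections import Counter

# Hypothetical rules: (antecedents, consequents, lift)
toy_rules = [
    (frozenset({50, 174}), frozenset({172}), 2.56),
    (frozenset({204}), frozenset({172}), 2.52),
    (frozenset({50}), frozenset({181}), 1.90),
    (frozenset({1}), frozenset({50}), 1.40),
]
recent = [50, 7]               # the user's most recently liked movies
already_rated = {50, 7, 181}   # everything the user has rated so far

# A rule matches if its antecedent shares at least one movie with `recent`
matched = [r for r in toy_rules if set(recent) & r[0]]
matched.sort(key=lambda r: r[2], reverse=True)  # highest lift first

# Count consequent movies, then keep unseen ones, most frequent first
counts = Counter(m for r in matched for m in r[1])
recommended = [m for m, _ in counts.most_common() if m not in already_rated][:10]

print(recommended)
```

Here two rules match through movie 50; of their consequents, 181 is dropped because the user has already rated it, so only 172 is recommended.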

In [21]:
pred_user2items

defaultdict(list,
            {2: [7, 12, 56, 127, 98, 174, 181],
             3: [172, 210, 50, 174, 121, 173, 79, 1, 98, 56],
             4: [172, 174, 181, 173, 96, 176, 195, 89, 79, 183],
             7: [98],
             9: [172, 174, 181, 210, 173, 96, 176, 195, 89, 79],
             10: [],
             12: [174],
             14: [174, 181, 96, 89, 79, 183, 257, 168, 22, 56],
             17: [181, 50],
             18: [],
             20: [],
             24: [98, 50, 100, 181],
             25: [172, 174, 56, 79, 127, 100, 12, 64, 210, 1],
             26: [172, 210, 174, 121, 173, 79, 98, 56, 100],
             27: [181, 7, 12, 56, 127, 98, 1, 174],
             29: [174, 56, 98, 50, 181],
             30: [195, 96, 210, 89, 176, 22, 168, 173, 79, 204],
             32: [],
             37: [98, 174, 100, 181],
             41: [181, 100],
             42: [],
             46: [50, 98, 100, 172, 210, 174, 121, 173, 79, 1],
             48: [174, 183, 12, 64, 79, 173, 127,

In [22]:
pred_user2items[2]

[7, 12, 56, 127, 98, 174, 181]

In [23]:
user_movie_matrix_org.head()

user_id,1,2,3,4,5,6,7,8,9,10,...,1671,1672,1673,1674,1676,1677,1679,1680,1681,1682
1,5.0,3.0,4.0,,3.0,5.0,4.0,1.0,5.0,3.0,...,,,,,,,,,,
2,4.0,,,,,,,,,2.0,...,,,,,,,,,,
3,,,,,,,,,,,...,,,,,,,,,,
4,,,,,,,,,,,...,,,,,,,,,,
5,4.0,3.0,,,,,,,,,...,,,,,,,,,,


In [24]:
user_movie_matrix_org[user_movie_matrix_org.index==2]

user_id,1,2,3,4,5,6,7,8,9,10,...,1671,1672,1673,1674,1676,1677,1679,1680,1681,1682
2,4.0,,,,,,,,,2.0,...,,,,,,,,,,


In [25]:
pred_user2items[202]

[174, 50]