### Recommender Systems

#### Content-based Filtering Recommender Systems
- When a user likes a particular item, recommend other items that are similar to it.

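#### Nearest-neighbor Collaborative Filtering
- Uses accumulated user behavior data to predict ratings for items the user has not yet rated.
- User-based: "Customers similar to you also bought these products."
- Item-based: "Other customers who chose this product also bought these products."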

- In general, item-based collaborative filtering tends to be more accurate than user-based collaborative filtering.
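
The item-based idea can be sketched on a tiny example. This is a minimal illustration and not part of the original notebook: the rating matrix, the function names `item_similarity` and `predict_item_based`, and the convention that 0 means "not rated" are all assumptions made for the sketch.

```python
import numpy as np

# Hypothetical user-item rating matrix (rows: users, columns: items).
# 0 is used here to mean "not rated"; all numbers are invented.
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [1.0, 0.0, 5.0, 4.0],
    [0.0, 1.0, 4.0, 5.0],
])

def item_similarity(r):
    """Cosine similarity between the item columns of r."""
    norms = np.linalg.norm(r, axis=0)
    return (r.T @ r) / np.outer(norms, norms)

def predict_item_based(r, sim, user, item):
    """Similarity-weighted average of the ratings the user has already given."""
    rated = r[user] > 0                 # items this user has rated
    weights = sim[item, rated]          # similarity of the target item to each of them
    return float(weights @ r[user, rated] / weights.sum())

item_sim = item_similarity(ratings)
pred = predict_item_based(ratings, item_sim, user=0, item=2)
print(round(pred, 2))
```

User 0 rated items 0 and 1 highly and item 3 poorly. Because item 2 is most similar to item 3, the predicted rating for item 2 comes out low.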

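#### Latent Factor Collaborative Filtering
- Derives "latent factors" from the user-item rating matrix.
- The rating matrix is factorized into a user latent-factor matrix and an item latent-factor matrix. Multiplying the two back together yields predicted ratings for items the user has not yet rated.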
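
The factorize-then-multiply idea can be sketched with plain NumPy. This is one possible illustration using stochastic gradient descent, not a method taken from this notebook: the function `factorize`, its hyperparameters (`k`, `steps`, `lr`, `reg`), and the small rating matrix `R` are all assumptions.

```python
import numpy as np

def factorize(R, k=3, steps=2000, lr=0.01, reg=0.01, seed=0):
    """Approximate R with P @ Q.T via SGD, using only the observed (non-NaN) ratings."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.normal(scale=1.0 / k, size=(n_users, k))   # user latent factors
    Q = rng.normal(scale=1.0 / k, size=(n_items, k))   # item latent factors
    observed = [(u, i) for u in range(n_users) for i in range(n_items)
                if not np.isnan(R[u, i])]
    for _ in range(steps):
        for u, i in observed:
            err = R[u, i] - P[u] @ Q[i]                # error on one observed rating
            p_u = P[u].copy()                          # keep the old value for Q's update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_u - reg * Q[i])
    return P, Q

# Invented rating matrix; NaN marks items the user has not rated.
R = np.array([
    [4.0, np.nan, np.nan, 2.0, np.nan],
    [np.nan, 5.0, np.nan, 3.0, 1.0],
    [np.nan, np.nan, 3.0, 4.0, 4.0],
    [5.0, 2.0, 1.0, 2.0, np.nan],
])
P, Q = factorize(R)
pred = P @ Q.T          # dense matrix: includes predictions for the NaN cells
print(pred.round(2))
```

The cells that were NaN in `R` are filled with predicted ratings in `pred`, while the observed cells should be reproduced closely.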

<br>
<br>

<img src ='https://miro.medium.com/max/998/1*O_GU8xLVlFx8WweIzKNCNw.png'>

###### Image by <a href = 'https://www.kdnuggets.com/2019/11/content-based-recommender-using-natural-language-processing-nlp.html'>Master of Science in Business Analytics Content-based Recommender Using Natural Language Processing (NLP)</a>

### Content-based Filtering Practice - TMDB Movie Data

In [1]:
import pandas as pd
import numpy as np 

In [2]:
movies = pd.read_csv('https://raw.githubusercontent.com/PinkWink/ML_tutorial/master/dataset/tmdb_5000_movies.csv')
print(movies.shape)

(4803, 20)


In [3]:
movies.head()

Unnamed: 0,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline,title,vote_average,vote_count
0,237000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.avatarmovie.com/,19995,"[{""id"": 1463, ""name"": ""culture clash""}, {""id"":...",en,Avatar,"In the 22nd century, a paraplegic Marine is di...",150.437577,"[{""name"": ""Ingenious Film Partners"", ""id"": 289...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2009-12-10,2787965087,162.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Enter the World of Pandora.,Avatar,7.2,11800
1,300000000,"[{""id"": 12, ""name"": ""Adventure""}, {""id"": 14, ""...",http://disney.go.com/disneypictures/pirates/,285,"[{""id"": 270, ""name"": ""ocean""}, {""id"": 726, ""na...",en,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, ha...",139.082615,"[{""name"": ""Walt Disney Pictures"", ""id"": 2}, {""...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2007-05-19,961000000,169.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"At the end of the world, the adventure begins.",Pirates of the Caribbean: At World's End,6.9,4500
2,245000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.sonypictures.com/movies/spectre/,206647,"[{""id"": 470, ""name"": ""spy""}, {""id"": 818, ""name...",en,Spectre,A cryptic message from Bond’s past sends him o...,107.376788,"[{""name"": ""Columbia Pictures"", ""id"": 5}, {""nam...","[{""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""...",2015-10-26,880674609,148.0,"[{""iso_639_1"": ""fr"", ""name"": ""Fran\u00e7ais""},...",Released,A Plan No One Escapes,Spectre,6.3,4466
3,250000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 80, ""nam...",http://www.thedarkknightrises.com/,49026,"[{""id"": 849, ""name"": ""dc comics""}, {""id"": 853,...",en,The Dark Knight Rises,Following the death of District Attorney Harve...,112.31295,"[{""name"": ""Legendary Pictures"", ""id"": 923}, {""...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2012-07-16,1084939099,165.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,The Legend Ends,The Dark Knight Rises,7.6,9106
4,260000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://movies.disney.com/john-carter,49529,"[{""id"": 818, ""name"": ""based on novel""}, {""id"":...",en,John Carter,"John Carter is a war-weary, former military ca...",43.926995,"[{""name"": ""Walt Disney Pictures"", ""id"": 2}]","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2012-03-07,284139100,132.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"Lost in our world, found in another.",John Carter,6.1,2124


In [5]:
cols = ['id', 'title', 'genres', 'vote_average', 'vote_count', 'popularity', 'keywords', 'overview']
movies_df = movies[cols].copy()  # copy so that later column assignments do not trigger SettingWithCopyWarning
movies_df.head()

Unnamed: 0,id,title,genres,vote_average,vote_count,popularity,keywords,overview
0,19995,Avatar,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",7.2,11800,150.437577,"[{""id"": 1463, ""name"": ""culture clash""}, {""id"":...","In the 22nd century, a paraplegic Marine is di..."
1,285,Pirates of the Caribbean: At World's End,"[{""id"": 12, ""name"": ""Adventure""}, {""id"": 14, ""...",6.9,4500,139.082615,"[{""id"": 270, ""name"": ""ocean""}, {""id"": 726, ""na...","Captain Barbossa, long believed to be dead, ha..."
2,206647,Spectre,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",6.3,4466,107.376788,"[{""id"": 470, ""name"": ""spy""}, {""id"": 818, ""name...",A cryptic message from Bond’s past sends him o...
3,49026,The Dark Knight Rises,"[{""id"": 28, ""name"": ""Action""}, {""id"": 80, ""nam...",7.6,9106,112.31295,"[{""id"": 849, ""name"": ""dc comics""}, {""id"": 853,...",Following the death of District Attorney Harve...
4,49529,John Carter,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",6.1,2124,43.926995,"[{""id"": 818, ""name"": ""based on novel""}, {""id"":...","John Carter is a war-weary, former military ca..."


In [7]:
movies_df.genres[0]

'[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}, {"id": 14, "name": "Fantasy"}, {"id": 878, "name": "Science Fiction"}]'

In [8]:
from ast import literal_eval

In [12]:
type(movies_df['genres'][0])
type(movies_df['keywords'][0])

str

In [15]:
movies_df['genres'] = movies_df['genres'].apply(literal_eval)
movies_df['keywords'] = movies_df['keywords'].apply(literal_eval)





In [16]:
movies_df.head()

Unnamed: 0,id,title,genres,vote_average,vote_count,popularity,keywords,overview
0,19995,Avatar,"[{'id': 28, 'name': 'Action'}, {'id': 12, 'nam...",7.2,11800,150.437577,"[{'id': 1463, 'name': 'culture clash'}, {'id':...","In the 22nd century, a paraplegic Marine is di..."
1,285,Pirates of the Caribbean: At World's End,"[{'id': 12, 'name': 'Adventure'}, {'id': 14, '...",6.9,4500,139.082615,"[{'id': 270, 'name': 'ocean'}, {'id': 726, 'na...","Captain Barbossa, long believed to be dead, ha..."
2,206647,Spectre,"[{'id': 28, 'name': 'Action'}, {'id': 12, 'nam...",6.3,4466,107.376788,"[{'id': 470, 'name': 'spy'}, {'id': 818, 'name...",A cryptic message from Bond’s past sends him o...
3,49026,The Dark Knight Rises,"[{'id': 28, 'name': 'Action'}, {'id': 80, 'nam...",7.6,9106,112.31295,"[{'id': 849, 'name': 'dc comics'}, {'id': 853,...",Following the death of District Attorney Harve...
4,49529,John Carter,"[{'id': 28, 'name': 'Action'}, {'id': 12, 'nam...",6.1,2124,43.926995,"[{'id': 818, 'name': 'based on novel'}, {'id':...","John Carter is a war-weary, former military ca..."


In [17]:
movies_df['genres'][0]

[{'id': 28, 'name': 'Action'},
 {'id': 12, 'name': 'Adventure'},
 {'id': 14, 'name': 'Fantasy'},
 {'id': 878, 'name': 'Science Fiction'}]

In [18]:
movies_df['keywords'][0]

[{'id': 1463, 'name': 'culture clash'},
 {'id': 2964, 'name': 'future'},
 {'id': 3386, 'name': 'space war'},
 {'id': 3388, 'name': 'space colony'},
 {'id': 3679, 'name': 'society'},
 {'id': 3801, 'name': 'space travel'},
 {'id': 9685, 'name': 'futuristic'},
 {'id': 9840, 'name': 'romance'},
 {'id': 9882, 'name': 'space'},
 {'id': 9951, 'name': 'alien'},
 {'id': 10148, 'name': 'tribe'},
 {'id': 10158, 'name': 'alien planet'},
 {'id': 10987, 'name': 'cgi'},
 {'id': 11399, 'name': 'marine'},
 {'id': 13065, 'name': 'soldier'},
 {'id': 14643, 'name': 'battle'},
 {'id': 14720, 'name': 'love affair'},
 {'id': 165431, 'name': 'anti war'},
 {'id': 193554, 'name': 'power relations'},
 {'id': 206690, 'name': 'mind and soul'},
 {'id': 209714, 'name': '3d'}]

In [19]:
movies_df['genres'] = movies_df['genres'].apply(lambda x: [d['name'] for d in x])
movies_df['keywords'] = movies_df['keywords'].apply(lambda x: [d['name'] for d in x])



In [20]:
movies_df.head()

Unnamed: 0,id,title,genres,vote_average,vote_count,popularity,keywords,overview
0,19995,Avatar,"[Action, Adventure, Fantasy, Science Fiction]",7.2,11800,150.437577,"[culture clash, future, space war, space colon...","In the 22nd century, a paraplegic Marine is di..."
1,285,Pirates of the Caribbean: At World's End,"[Adventure, Fantasy, Action]",6.9,4500,139.082615,"[ocean, drug abuse, exotic island, east india ...","Captain Barbossa, long believed to be dead, ha..."
2,206647,Spectre,"[Action, Adventure, Crime]",6.3,4466,107.376788,"[spy, based on novel, secret agent, sequel, mi...",A cryptic message from Bond’s past sends him o...
3,49026,The Dark Knight Rises,"[Action, Crime, Drama, Thriller]",7.6,9106,112.31295,"[dc comics, crime fighter, terrorist, secret i...",Following the death of District Attorney Harve...
4,49529,John Carter,"[Action, Adventure, Science Fiction]",6.1,2124,43.926995,"[based on novel, mars, medallion, space travel...","John Carter is a war-weary, former military ca..."
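
The two preprocessing steps above (parsing the string with `literal_eval`, then keeping only each `name`) can be seen on a single value. The string `raw` below is a shortened version of the genres value shown earlier; the variable names are only for illustration.

```python
from ast import literal_eval

# A shortened genres value, stored in the CSV as a plain string.
raw = '[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}]'

parsed = literal_eval(raw)             # str -> list of dicts
names = [d["name"] for d in parsed]    # keep only the genre names

print(type(raw).__name__, type(parsed).__name__, names)
```

Since these strings also look like valid JSON, `json.loads` would probably work here as well, but that is an assumption about the whole column, not something checked in this notebook.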


In [21]:
movies_df['genres_literal'] = movies_df['genres'].apply(lambda x : ' '.join(x))
movies_df.head()



Unnamed: 0,id,title,genres,vote_average,vote_count,popularity,keywords,overview,genres_literal
0,19995,Avatar,"[Action, Adventure, Fantasy, Science Fiction]",7.2,11800,150.437577,"[culture clash, future, space war, space colon...","In the 22nd century, a paraplegic Marine is di...",Action Adventure Fantasy Science Fiction
1,285,Pirates of the Caribbean: At World's End,"[Adventure, Fantasy, Action]",6.9,4500,139.082615,"[ocean, drug abuse, exotic island, east india ...","Captain Barbossa, long believed to be dead, ha...",Adventure Fantasy Action
2,206647,Spectre,"[Action, Adventure, Crime]",6.3,4466,107.376788,"[spy, based on novel, secret agent, sequel, mi...",A cryptic message from Bond’s past sends him o...,Action Adventure Crime
3,49026,The Dark Knight Rises,"[Action, Crime, Drama, Thriller]",7.6,9106,112.31295,"[dc comics, crime fighter, terrorist, secret i...",Following the death of District Attorney Harve...,Action Crime Drama Thriller
4,49529,John Carter,"[Action, Adventure, Science Fiction]",6.1,2124,43.926995,"[based on novel, mars, medallion, space travel...","John Carter is a war-weary, former military ca...",Action Adventure Science Fiction


#### Apply CountVectorizer to the genres converted to strings

In [22]:
from sklearn.feature_extraction.text import CountVectorizer

count_vect = CountVectorizer(min_df=1, ngram_range=(1, 2))  # min_df=1 keeps every term; recent scikit-learn versions reject the integer 0
genre_mat = count_vect.fit_transform(movies_df['genres_literal'])

In [23]:
genre_mat.shape

(4803, 276)
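
To see where the feature columns come from, here is a small sketch of what `ngram_range=(1, 2)` does to two of the genre strings shown above. It assumes the default `CountVectorizer` tokenization (lowercasing, word tokens); the names `docs`, `vect`, and `mat` are only for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "Action Adventure Fantasy Science Fiction",   # Avatar
    "Adventure Fantasy Action",                   # Pirates of the Caribbean: At World's End
]

# Unigrams and bigrams: each single word and each pair of adjacent words is a feature.
vect = CountVectorizer(ngram_range=(1, 2))
mat = vect.fit_transform(docs)

print(mat.shape)
print(sorted(vect.vocabulary_))
```

Word order matters for the bigrams: "fantasy action" and "action adventure" become separate columns even though both documents contain the same three single words.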

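#### Cosine Similarity

**Cosine similarity is one way to measure how similar two vectors are, using the cosine of the angle between them. It is used mainly with text documents and other high-dimensional vector representations.**<br>
**In recommender systems, it is used to compute the similarity between user preferences or between products and content items.**


- Build the user-item matrix: represent the interactions between users and items as a matrix with users as rows and items as columns. The matrix captures user preferences, and a missing value means the user has not interacted with that item.

- Build feature vectors: each user or item is represented as a feature vector of numeric values describing it. Each dimension usually corresponds to one feature. In a movie recommender, for example, genre, cast, and director can serve as features.

- Compute cosine similarity: measure the cosine similarity between every pair of user or item vectors to produce a similarity matrix.

- Select items to recommend: choose the items to recommend from the similarity matrix, typically the top N items with the highest similarity.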

In [24]:
from sklearn.metrics.pairwise import cosine_similarity

genre_sim = cosine_similarity(genre_mat, genre_mat)

In [25]:
genre_sim.shape

(4803, 4803)

In [26]:
genre_sim

array([[1.        , 0.59628479, 0.4472136 , ..., 0.        , 0.        ,
        0.        ],
       [0.59628479, 1.        , 0.4       , ..., 0.        , 0.        ,
        0.        ],
       [0.4472136 , 0.4       , 1.        , ..., 0.        , 0.        ,
        0.        ],
       ...,
       [0.        , 0.        , 0.        , ..., 1.        , 0.        ,
        0.        ],
       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
        0.        ],
       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
        1.        ]])
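
The value 0.59628479 at position `[0, 1]` of the matrix above can be reproduced by hand. The sketch below assumes the unigram and bigram counts that `CountVectorizer(ngram_range=(1, 2))` would produce for the first two movies, with the two vectors written out manually. The helper `cosine` and the column order are choices made for this illustration.

```python
import numpy as np

def cosine(a, b):
    """Dot product divided by the product of the vector lengths."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Columns: action, adventure, fantasy, science, fiction,
#          "action adventure", "adventure fantasy", "fantasy science",
#          "science fiction", "fantasy action"
avatar  = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 0])   # Action Adventure Fantasy Science Fiction
pirates = np.array([1, 1, 1, 0, 0, 0, 1, 0, 0, 1])   # Adventure Fantasy Action

print(round(cosine(avatar, pirates), 8))   # 0.59628479
```

The two movies share four features, and the vector lengths are 3 and the square root of 5, so the similarity is 4 / (3 * sqrt(5)).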

In [29]:
genre_sim_sorted_ind = genre_sim.argsort()[:, ::-1]  # sort each row by similarity in descending order; the values are row indices

In [30]:
genre_sim_sorted_ind

array([[   0, 3494,  813, ..., 3038, 3037, 2401],
       [ 262,    1,  129, ..., 3069, 3067, 2401],
       [   2, 1740, 1542, ..., 3000, 2999, 2401],
       ...,
       [4800, 3809, 1895, ..., 2229, 2230,    0],
       [4802, 1594, 1596, ..., 3204, 3205,    0],
       [4802, 4710, 4521, ..., 3140, 3141,    0]], dtype=int64)
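
How `argsort()[:, ::-1]` turns a similarity matrix into a ranking can be checked on a small matrix. The 3x3 values below are invented for illustration.

```python
import numpy as np

# Invented 3x3 similarity matrix.
sim = np.array([
    [1.0, 0.2, 0.7],
    [0.2, 1.0, 0.5],
    [0.7, 0.5, 1.0],
])

# argsort sorts each row in ascending order and returns indices;
# reversing the columns gives "most similar first".
sorted_ind = sim.argsort()[:, ::-1]
print(sorted_ind)
```

Row 0 becomes `[0, 2, 1]`: the item itself first, then item 2 (0.7), then item 1 (0.2). When similarities are tied, the order among the tied indices carries no meaning. That is presumably why a row of `genre_sim` that is all zeros, such as the second-to-last one above, does not start with its own index.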

In [31]:
#### Function that returns the recommended movies as a DataFrame

def find_sim_movie(df, sorted_ind, title_name, top_n=10):
    title_movie = df[df['title'] == title_name]

    title_index = title_movie.index.values
    similar_indexes = sorted_ind[title_index, :(top_n)]

    print(similar_indexes)
    similar_indexes = similar_indexes.reshape(-1)

    return df.iloc[similar_indexes]


#### Finding movies similar to The Godfather

In [32]:
similar_movies = find_sim_movie(movies_df, genre_sim_sorted_ind, 'The Godfather', 10)
similar_movies[['title', 'vote_average']]

[[2731 1243 3636 1946 2640 4065 1847 4217  883 3866]]


Unnamed: 0,title,vote_average
2731,The Godfather: Part II,8.3
1243,Mean Streets,7.2
3636,Light Sleeper,5.7
1946,The Bad Lieutenant: Port of Call - New Orleans,6.0
2640,Things to Do in Denver When You're Dead,6.7
4065,Mi America,0.0
1847,GoodFellas,8.2
4217,Kids,6.8
883,Catch Me If You Can,7.7
3866,City of God,8.1


#### Setting the weights
- Based on vote_average and vote_count

#### Use the mean rating over all movies as C and the 60th percentile of vote counts as the minimum vote count m

In [33]:
C = movies_df['vote_average'].mean()
m = movies_df['vote_count'].quantile(0.6)

In [34]:
def weighted_vote_average(record):
    v = record['vote_count']
    R = record['vote_average']

    return( (v/(v+m)*R) + (m/(v+m)*C))
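
The function computes the weighted rating `(v / (v + m)) * R + (m / (v + m)) * C`, which pulls movies with few votes toward the overall mean `C`. The sketch below shows the same formula in vectorized form. The values of `C` and `m` here are hypothetical round numbers chosen for easy arithmetic, not the values computed in this notebook, and `weighted_rating` is an illustrative name.

```python
import numpy as np

# Hypothetical values, NOT the notebook's actual C and m.
C = 6.0      # mean rating over all movies
m = 400.0    # minimum vote count

def weighted_rating(v, R, C=C, m=m):
    """(v / (v + m)) * R + (m / (v + m)) * C, vectorized over NumPy arrays."""
    v = np.asarray(v, dtype=float)
    R = np.asarray(R, dtype=float)
    return (v / (v + m)) * R + (m / (v + m)) * C

votes = np.array([1600, 1, 0])        # many votes, one vote, no votes
ratings = np.array([8.0, 8.0, 0.0])
result = weighted_rating(votes, ratings)
print(result.round(3))
```

With 1600 votes, a rating of 8.0 only drops to 7.6. With a single vote, it collapses to about 6.005, and a movie with no votes gets exactly `C`. On a DataFrame, the same expression could be applied directly to the `vote_count` and `vote_average` columns instead of using `apply` row by row.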

In [35]:
movies_df['weighted_vote'] = movies_df.apply(weighted_vote_average,axis=1)
movies_df.head()



Unnamed: 0,id,title,genres,vote_average,vote_count,popularity,keywords,overview,genres_literal,weighted_vote
0,19995,Avatar,"[Action, Adventure, Fantasy, Science Fiction]",7.2,11800,150.437577,"[culture clash, future, space war, space colon...","In the 22nd century, a paraplegic Marine is di...",Action Adventure Fantasy Science Fiction,7.166301
1,285,Pirates of the Caribbean: At World's End,"[Adventure, Fantasy, Action]",6.9,4500,139.082615,"[ocean, drug abuse, exotic island, east india ...","Captain Barbossa, long believed to be dead, ha...",Adventure Fantasy Action,6.838594
2,206647,Spectre,"[Action, Adventure, Crime]",6.3,4466,107.376788,"[spy, based on novel, secret agent, sequel, mi...",A cryptic message from Bond’s past sends him o...,Action Adventure Crime,6.284091
3,49026,The Dark Knight Rises,"[Action, Crime, Drama, Thriller]",7.6,9106,112.31295,"[dc comics, crime fighter, terrorist, secret i...",Following the death of District Attorney Harve...,Action Crime Drama Thriller,7.541095
4,49529,John Carter,"[Action, Adventure, Science Fiction]",6.1,2124,43.926995,"[based on novel, mars, medallion, space travel...","John Carter is a war-weary, former military ca...",Action Adventure Science Fiction,6.098838


In [36]:
movies_df[movies_df['vote_count']<10]

Unnamed: 0,id,title,genres,vote_average,vote_count,popularity,keywords,overview,genres_literal,weighted_vote
463,161795,Déjà Vu,"[Romance, Drama]",8.0,1,0.605645,"[love, american, pin, stranger, ruby]",L.A. shop owner Dana and Englishman Sean meet ...,Romance Drama,6.097311
492,293644,Top Cat Begins,"[Comedy, Animation]",5.3,9,0.719996,[3d],Top Cat has arrived to charm his way into your...,Comedy Animation,6.073370
1023,7504,Earth,[Drama],6.6,9,1.246883,"[based on novel, war of independence, period d...",It's 1947 and the borderlines between India an...,Drama,6.104224
1039,113464,Inchon,"[Drama, History, War]",6.5,2,0.146783,[],A noisy and absurd re-telling of the great 195...,Drama History War,6.094363
1453,49478,Warriors of Virtue,"[Fantasy, Family, Action]",4.7,9,0.912395,"[american football, mythology, chinese food, k...","A young man, Ryan, suffering from a disability...",Fantasy Family Action,6.059130
...,...,...,...,...,...,...,...,...,...,...
4795,124606,Bang,[Drama],6.0,1,0.918116,"[gang, audition, police fake, homeless, actress]",A young woman in L.A. is having a bad day: she...,Drama,6.091923
4797,67238,Cavite,"[Foreign, Thriller]",7.5,2,0.022173,[],"Adam, a security guard, travels from Californi...",Foreign Thriller,6.099736
4799,72766,Newlyweds,"[Comedy, Romance]",5.9,5,0.642552,[],A newlywed couple's honeymoon is upended by th...,Comedy Romance,6.089611
4800,231617,"Signed, Sealed, Delivered","[Comedy, Drama, Romance, TV Movie]",7.0,6,1.444476,"[date, love at first sight, narration, investi...","""Signed, Sealed, Delivered"" introduces a dedic...",Comedy Drama Romance TV Movie,6.106650


In [37]:
movies_df[['title', 'vote_average', 'weighted_vote', 'vote_count']].sort_values('weighted_vote', ascending=False)[:10]

Unnamed: 0,title,vote_average,weighted_vote,vote_count
1881,The Shawshank Redemption,8.5,8.396052,8205
3337,The Godfather,8.4,8.263591,5893
662,Fight Club,8.3,8.216455,9413
3232,Pulp Fiction,8.3,8.207102,8428
65,The Dark Knight,8.2,8.13693,12002
1818,Schindler's List,8.3,8.126069,4329
3865,Whiplash,8.3,8.123248,4254
809,Forrest Gump,8.2,8.105954,7927
2294,Spirited Away,8.3,8.105867,3840
2731,The Godfather: Part II,8.3,8.079586,3338


In [38]:
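#### Function that returns the recommended movies as a DataFrame, now ordered by weighted rating

def find_sim_movie(df, sorted_ind, title_name, top_n=10):
    title_movie = df[df['title'] == title_name]

    # Row position of the query movie and the indices of its top_n most similar movies
    title_index = title_movie.index.values
    similar_indexes = sorted_ind[title_index, :(top_n)]

    print(similar_indexes)
    similar_indexes = similar_indexes.reshape(-1)

    # Exclude the query movie itself from the candidates
    similar_indexes = similar_indexes[similar_indexes != title_index]

    # Only these top_n candidates are re-ranked, so weighted_vote changes the order,
    # not which movies are returned
    return df.iloc[similar_indexes].sort_values('weighted_vote', ascending=False)[:top_n]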
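
If the weighted rating should influence which movies are selected and not just their order, one option is to re-rank a wider pool of candidates. The variant below is a hypothetical sketch and not the function used in this notebook: the name `find_sim_movie_pooled`, the `pool_factor` parameter, and the toy data are all assumptions.

```python
import numpy as np
import pandas as pd

def find_sim_movie_pooled(df, sorted_ind, title_name, top_n=10, pool_factor=2):
    """Hypothetical variant: re-rank a pool of top_n * pool_factor similar movies."""
    title_index = df.index[df['title'] == title_name].values
    # Take a wider pool of similar movies than will finally be returned.
    pool = sorted_ind[title_index, :top_n * pool_factor].reshape(-1)
    pool = pool[pool != title_index]      # drop the query movie itself
    return df.iloc[pool].sort_values('weighted_vote', ascending=False)[:top_n]

# Toy data, invented for illustration.
toy = pd.DataFrame({
    'title': ['A', 'B', 'C', 'D', 'E'],
    'weighted_vote': [7.0, 5.0, 8.0, 6.0, 9.0],
})
toy_sim = np.array([
    [1.0, 0.9, 0.8, 0.7, 0.1],
    [0.9, 1.0, 0.5, 0.4, 0.1],
    [0.8, 0.5, 1.0, 0.3, 0.1],
    [0.7, 0.4, 0.3, 1.0, 0.1],
    [0.1, 0.1, 0.1, 0.1, 1.0],
])
toy_sorted = toy_sim.argsort()[:, ::-1]

result = find_sim_movie_pooled(toy, toy_sorted, 'A', top_n=2)
print(result['title'].tolist())
```

For movie A, the pool is B, C, and D. The two with the highest `weighted_vote` are returned (C, then D), even though B is the most similar. A larger `pool_factor` trades genre similarity for rating quality.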


#### Finding movies similar to The Godfather, ranked by weighted rating

In [39]:
similar_movies = find_sim_movie(movies_df, genre_sim_sorted_ind, 'The Godfather', 10)
similar_movies[['title', 'vote_average', 'weighted_vote']]

[[2731 1243 3636 1946 2640 4065 1847 4217  883 3866]]


Unnamed: 0,title,vote_average,weighted_vote
2731,The Godfather: Part II,8.3,8.079586
1847,GoodFellas,8.2,7.976937
3866,City of God,8.1,7.759693
883,Catch Me If You Can,7.7,7.557097
1243,Mean Streets,7.2,6.626569
4217,Kids,6.8,6.396368
2640,Things to Do in Denver When You're Dead,6.7,6.205672
4065,Mi America,0.0,6.092172
3636,Light Sleeper,5.7,6.0769
1946,The Bad Lieutenant: Port of Call - New Orleans,6.0,6.049012
