# Content Based Filtering

* **Used Dataset :** https://www.kaggle.com/rounakbanik/the-movies-dataset/data
* **Latest MovieLens Dataset :** https://grouplens.org/datasets/movielens/latest/

**Files :**
  * _movies_metadata_ : Features belong to movies (~45k)
  * _keywords_ : Keywords extracted from plot of the movie
  * _credits_ : Cast and crew information
  * _links_ : TMDB and IMDB IDs of all movies
  * _ratings_ : User-Movie interactions

In [5]:
from IPython.display import display, HTML

def show_as_html(doc):
    display(HTML(doc.to_html()))

In [None]:
%%capture
!pip3 install spacy
!python3 -m spacy download en_core_web_sm

In [1]:
import spacy
import pandas as pd
from numpy import dot
from numpy.linalg import norm
import numpy as np
from tqdm.notebook import tqdm
from scipy.spatial import distance

In [3]:
metadata = pd.read_csv('/home/gsoykan20/Desktop/inzva/week2/Recommender Systems/MovieLens/movies_metadata.csv', low_memory=False)
metadata['overview'] = metadata['overview'].fillna('')

In [6]:
show_as_html(metadata.head())

Unnamed: 0,adult,belongs_to_collection,budget,genres,homepage,id,imdb_id,original_language,original_title,overview,popularity,poster_path,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline,title,video,vote_average,vote_count
0,False,"{'id': 10194, 'name': 'Toy Story Collection', 'poster_path': '/7G9915LfUQ2lVfwMEEhDsn3kT4B.jpg', 'backdrop_path': '/9FBwqcd9IRruEDUrTdcaafOMKUq.jpg'}",30000000,"[{'id': 16, 'name': 'Animation'}, {'id': 35, 'name': 'Comedy'}, {'id': 10751, 'name': 'Family'}]",http://toystory.disney.com/toy-story,862,tt0114709,en,Toy Story,"Led by Woody, Andy's toys live happily in his room until Andy's birthday brings Buzz Lightyear onto the scene. Afraid of losing his place in Andy's heart, Woody plots against Buzz. But when circumstances separate Buzz and Woody from their owner, the duo eventually learns to put aside their differences.",21.946943,/rhIRbceoE9lR4veEXuwCC2wARtG.jpg,"[{'name': 'Pixar Animation Studios', 'id': 3}]","[{'iso_3166_1': 'US', 'name': 'United States of America'}]",1995-10-30,373554033.0,81.0,"[{'iso_639_1': 'en', 'name': 'English'}]",Released,,Toy Story,False,7.7,5415.0
1,False,,65000000,"[{'id': 12, 'name': 'Adventure'}, {'id': 14, 'name': 'Fantasy'}, {'id': 10751, 'name': 'Family'}]",,8844,tt0113497,en,Jumanji,"When siblings Judy and Peter discover an enchanted board game that opens the door to a magical world, they unwittingly invite Alan -- an adult who's been trapped inside the game for 26 years -- into their living room. Alan's only hope for freedom is to finish the game, which proves risky as all three find themselves running from giant rhinoceroses, evil monkeys and other terrifying creatures.",17.015539,/vzmL6fP7aPKNKPRTFnZmiUfciyV.jpg,"[{'name': 'TriStar Pictures', 'id': 559}, {'name': 'Teitler Film', 'id': 2550}, {'name': 'Interscope Communications', 'id': 10201}]","[{'iso_3166_1': 'US', 'name': 'United States of America'}]",1995-12-15,262797249.0,104.0,"[{'iso_639_1': 'en', 'name': 'English'}, {'iso_639_1': 'fr', 'name': 'Français'}]",Released,Roll the dice and unleash the excitement!,Jumanji,False,6.9,2413.0
2,False,"{'id': 119050, 'name': 'Grumpy Old Men Collection', 'poster_path': '/nLvUdqgPgm3F85NMCii9gVFUcet.jpg', 'backdrop_path': '/hypTnLot2z8wpFS7qwsQHW1uV8u.jpg'}",0,"[{'id': 10749, 'name': 'Romance'}, {'id': 35, 'name': 'Comedy'}]",,15602,tt0113228,en,Grumpier Old Men,"A family wedding reignites the ancient feud between next-door neighbors and fishing buddies John and Max. Meanwhile, a sultry Italian divorcée opens a restaurant at the local bait shop, alarming the locals who worry she'll scare the fish away. But she's less interested in seafood than she is in cooking up a hot time with Max.",11.7129,/6ksm1sjKMFLbO7UY2i6G1ju9SML.jpg,"[{'name': 'Warner Bros.', 'id': 6194}, {'name': 'Lancaster Gate', 'id': 19464}]","[{'iso_3166_1': 'US', 'name': 'United States of America'}]",1995-12-22,0.0,101.0,"[{'iso_639_1': 'en', 'name': 'English'}]",Released,Still Yelling. Still Fighting. Still Ready for Love.,Grumpier Old Men,False,6.5,92.0
3,False,,16000000,"[{'id': 35, 'name': 'Comedy'}, {'id': 18, 'name': 'Drama'}, {'id': 10749, 'name': 'Romance'}]",,31357,tt0114885,en,Waiting to Exhale,"Cheated on, mistreated and stepped on, the women are holding their breath, waiting for the elusive ""good man"" to break a string of less-than-stellar lovers. Friends and confidants Vannah, Bernie, Glo and Robin talk it all out, determined to find a better way to breathe.",3.859495,/16XOMpEaLWkrcPqSQqhTmeJuqQl.jpg,"[{'name': 'Twentieth Century Fox Film Corporation', 'id': 306}]","[{'iso_3166_1': 'US', 'name': 'United States of America'}]",1995-12-22,81452156.0,127.0,"[{'iso_639_1': 'en', 'name': 'English'}]",Released,Friends are the people who let you be yourself... and never let you forget it.,Waiting to Exhale,False,6.1,34.0
4,False,"{'id': 96871, 'name': 'Father of the Bride Collection', 'poster_path': '/nts4iOmNnq7GNicycMJ9pSAn204.jpg', 'backdrop_path': '/7qwE57OVZmMJChBpLEbJEmzUydk.jpg'}",0,"[{'id': 35, 'name': 'Comedy'}]",,11862,tt0113041,en,Father of the Bride Part II,"Just when George Banks has recovered from his daughter's wedding, he receives the news that she's pregnant ... and that George's wife, Nina, is expecting too. He was planning on selling their home, but that's a plan that -- like George -- will have to change with the arrival of both a grandchild and a kid of his own.",8.387519,/e64sOI48hQXyru7naBFyssKFxVd.jpg,"[{'name': 'Sandollar Productions', 'id': 5842}, {'name': 'Touchstone Pictures', 'id': 9195}]","[{'iso_3166_1': 'US', 'name': 'United States of America'}]",1995-02-10,76578911.0,106.0,"[{'iso_639_1': 'en', 'name': 'English'}]",Released,Just When His World Is Back To Normal... He's In For The Surprise Of His Life!,Father of the Bride Part II,False,5.7,173.0


In [6]:
pd.DataFrame(metadata[['original_title','overview']].head())

Unnamed: 0,original_title,overview
0,Toy Story,"Led by Woody, Andy's toys live happily in his ..."
1,Jumanji,When siblings Judy and Peter discover an encha...
2,Grumpier Old Men,A family wedding reignites the ancient feud be...
3,Waiting to Exhale,"Cheated on, mistreated and stepped on, the wom..."
4,Father of the Bride Part II,Just when George Banks has recovered from his ...


In [7]:
movie_docs = pd.DataFrame(metadata[['original_title','overview']])

**distance sources:**
- https://www.sequentix.de/gelquest/help/distance_measures.htm
- https://en.wikipedia.org/wiki/Chebyshev_distance

In [8]:
def get_recommendations(title,
                        recommend_k,
                        df,
                        distance_metric='cosine'):
  vector = df[df["original_title"] == title]["overview vectors"].iloc[0]
  shape = vector.shape[0]
  print("vector shape:" + str(shape))
  list_of_vecs = df["overview vectors"].tolist()

  similarities = []
  for b in list_of_vecs:
    try:
      dist = None
      if distance_metric == "cosine":
        #cos_sim = dot(a, b)/(norm(a)*norm(b))
        dist = np.dot(vector, b) / (np.linalg.norm(vector) * np.linalg.norm(b))
      elif distance_metric == "euclidian":
        dist = distance.euclidean(vector, b)
      elif distance_metric == "dice":
        dist = distance.dice(vector, b)
      elif distance_metric == "hamming":
        dist = distance.hamming(vector, b)
      elif distance_metric == "sokalmichener":
        dist = distance.sokalmichener(vector, b)
      elif distance_metric == "sqeuclidean":
        dist = distance.sqeuclidean(vector, b)
      elif distance_metric == "chebyshev":
        dist = distance.chebyshev(vector, b)
      similarities.append(dist)
    except:
      print("exception in similarity metric")
      pass
  indices = np.array(similarities).argsort()[1:recommend_k+1][::-1]

  return [df.iloc[index, 0] for index in indices]

## Step 0 (Baseline): Use Overview and extract movie feature vector with spacy


In [None]:
nlp = spacy.load('en_core_web_sm')

In [19]:
movie_docs_vectors = []

for idx, row in tqdm(metadata.iterrows(), total=len(metadata)):
  doc = nlp(row["overview"])
  movie_docs_vectors.append(doc.vector)

movie_docs["overview vectors"] = movie_docs_vectors

HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=45466.0), HTML(value='')))




In [21]:
with open('./movie_lens_overview_word_vecs.npy', 'wb') as f:
    np.save(f, np.array(movie_docs['overview vectors']), allow_pickle=True)

In [16]:
with open('./movie_lens_overview_word_vecs.npy', 'rb') as f:
    word_vecs = np.load(f, allow_pickle=True)

In [23]:
movie_docs["overview vectors"] = word_vecs
movie_docs.head()

Unnamed: 0,original_title,overview,overview vectors
0,Toy Story,"Led by Woody, Andy's toys live happily in his ...","[0.5870929, 0.022778913, -0.108451754, -0.2917..."
1,Jumanji,When siblings Judy and Peter discover an encha...,"[0.31707045, 0.1843554, -0.0031469888, -0.2288..."
2,Grumpier Old Men,A family wedding reignites the ancient feud be...,"[0.5276116, -0.01129298, -0.110098004, -0.0950..."
3,Waiting to Exhale,"Cheated on, mistreated and stepped on, the wom...","[-0.0035403734, 0.08286476, -0.11184327, -0.27..."
4,Father of the Bride Part II,Just when George Banks has recovered from his ...,"[0.33208913, -0.13714488, -0.0551012, -0.15579..."


In [25]:
get_recommendations("The Departed", 5, movie_docs)

vector shape:96


['Pioneer Woman',
 '티끌모아 로맨스',
 'Under the Boardwalk: The Monopoly Story',
 'Gli occhi freddi della paura',
 'Los cronocrímenes']

In [27]:
get_recommendations("Toy Story", 5, movie_docs)

vector shape:96


['Gli occhi freddi della paura',
 '티끌모아 로맨스',
 'Under the Boardwalk: The Monopoly Story',
 'Pioneer Woman',
 'Los cronocrímenes']

## Step 1: Use Overview and extract movie feature vector from DistilBertForSequenceClassification
**source:** https://jalammar.github.io/a-visual-guide-to-using-bert-for-the-first-time/

In [9]:
from transformers import DistilBertTokenizer, DistilBertModel
import torch

In [10]:
distilbert_tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
distilbert_model = model = DistilBertModel.from_pretrained('distilbert-base-uncased')
distilbert_model.cuda()

Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertModel: ['vocab_projector.bias', 'vocab_transform.weight', 'vocab_layer_norm.weight', 'vocab_transform.bias', 'vocab_projector.weight', 'vocab_layer_norm.bias']
- This IS expected if you are initializing DistilBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


DistilBertModel(
  (embeddings): Embeddings(
    (word_embeddings): Embedding(30522, 768, padding_idx=0)
    (position_embeddings): Embedding(512, 768)
    (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (transformer): Transformer(
    (layer): ModuleList(
      (0): TransformerBlock(
        (attention): MultiHeadSelfAttention(
          (dropout): Dropout(p=0.1, inplace=False)
          (q_lin): Linear(in_features=768, out_features=768, bias=True)
          (k_lin): Linear(in_features=768, out_features=768, bias=True)
          (v_lin): Linear(in_features=768, out_features=768, bias=True)
          (out_lin): Linear(in_features=768, out_features=768, bias=True)
        )
        (sa_layer_norm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (ffn): FFN(
          (dropout): Dropout(p=0.1, inplace=False)
          (lin1): Linear(in_features=768, out_features=3072, bias=True)
          (lin2): Linear(i

In [60]:
def generate_movie_docs_vectors_from_distilbert(save_name="movie_lens_overview_distilbert_vecs.npy"):
  movie_docs_vectors = []
  for idx, row in tqdm(metadata.iterrows(), total=len(metadata)):
    overview =  ' '.join(row["overview"].split()[:512])
    token_inputs = distilbert_tokenizer(overview, return_tensors="pt", max_length=512).to('cuda')
    cls_feature_vec = distilbert_model(**token_inputs).last_hidden_state[0, 0].cpu().detach().numpy()
    movie_docs_vectors.append(cls_feature_vec)

  with open('./' + save_name, 'wb') as f:
    np.save(f, np.array(movie_docs_vectors), allow_pickle=True)

In [13]:
with open('./movie_lens_overview_distilbert_vecs.npy', 'rb') as f:
    word_vecs = np.load(f, allow_pickle=True)

In [9]:
generate_movie_docs_vectors_from_distilbert()

HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=45466.0), HTML(value='')))

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.





ValueError: Wrong number of items passed 768, placement implies 1

In [68]:
def insert_movie_feature_vectors(word_vecs):
    list_word_vecs = []
    for i in range(len(word_vecs)):
        list_word_vecs.append(np.array(word_vecs[i]))
    len(list_word_vecs)
    movie_docs["overview vectors"] = list_word_vecs
    movie_docs.head()

In [None]:
insert_movie_feature_vectors()

In [17]:
get_recommendations("The Departed", 5, movie_docs)

vector shape:768


['Дура', 'Про любоff', 'На семи ветрах', 'Ι-4: Λούφα Και Απαλλαγή', 'Bad']

In [71]:
get_recommendations("The Departed", 5, movie_docs, "euclidian")

vector shape:768


['Donnie Brasco',
 'Private Hell 36',
 'Kounterfeit',
 'The Alibi',
 'The Departed']

In [58]:
get_recommendations("Toy Story", 5, movie_docs)

vector shape:768


['Ganes', 'Buffalo Running', 'Kalteva Torni', 'Bad', 'Endurance']

In [72]:
get_recommendations("Toy Story", 5, movie_docs, "euclidian")

vector shape:768


["Christmas Vacation 2: Cousin Eddie's Island Adventure",
 'Frankenweenie',
 'Toy Story 3',
 'Toy Story 2',
 'Toy Story']

In [59]:
get_recommendations("Happy Feet", 5, movie_docs)

vector shape:768


['Il segno di Venere',
 'Water & Power: A California Heist',
 'Endurance',
 'Downeast',
 'Bad']

In [73]:
get_recommendations("Happy Feet", 5, movie_docs, "euclidian")

vector shape:768


['Monster High: Welcome to Monster High',
 'The Wayshower',
 'The Trumpet Of The Swan',
 'Jack Frost',
 'Happy Feet']

In [88]:
get_recommendations("Whiplash", 5, movie_docs)

vector shape:768


['Дура', 'Про любоff', 'На семи ветрах', 'Kalteva Torni', 'Koroli i kapusta']

In [74]:
get_recommendations("Whiplash", 5, movie_docs, "euclidian")

vector shape:768


['Young Man with a Horn',
 'The Derby Stallion',
 'A Man Called Adam',
 'Sympathy for Delicious',
 'Whiplash']

In [73]:
# "Kill Bill: Vol. 1"
def recommend_for_all_metrics(movie_name):
  for metric in ["chebyshev",
                 "sokalmichener",
                 "sqeuclidean",
                 "hamming",
                 "cosine",
                 "euclidian",
                 "dice"]:
    recommendations = get_recommendations(movie_name,
                                          10,
                                          movie_docs,
                                          metric)
    print(recommendations)

In [19]:
recommend_for_all_metrics("The Matrix")


vector shape:768
['Interstellar', 'Flatland', 'Firebase', 'Lockout', 'Predestination']
vector shape:768
['Salvation Boulevard', 'Scavengers', 'Surrogates', 'Cyborg X', 'The Matrix']
vector shape:768
['Surrogates', 'Predestination', 'Lockout', '7 Below', 'Starship Troopers']
vector shape:768
['Foxtrot', 'Die Farbe', 'The Prospects', 'Festival', 'Alone in the Wilderness']
vector shape:768
['Наша Russia: Яйца судьбы', 'Ι-4: Λούφα Και Απαλλαγή', 'Про любоff', 'На семи ветрах', 'V Boy Idut Odni Stariki']
vector shape:768
['Surrogates', 'Predestination', 'Lockout', '7 Below', 'Starship Troopers']
vector shape:768
['صراع في الوادي', 'Горько!', 'മുംബൈ പോലീസ്', 'พี่ชาย', 'Metsän Tarina']


In [110]:
movie_row = movie_docs[movie_docs["original_title"] == "The Matrix"]
print(movie_row["overview"])

2458    Set in the 22nd century, The Matrix tells the ...
Name: overview, dtype: object


In [111]:
display(HTML(movie_row.to_html()))

Unnamed: 0,original_title,overview,overview vectors
2458,The Matrix,"Set in the 22nd century, The Matrix tells the story of a computer hacker who joins a group of underground insurgents fighting the vast and powerful computers who now rule the earth.","[-0.35084718, -0.24244294, -0.1840358, -0.30705032, -0.15726152, -0.184679, 0.09660693, 0.35997182, 0.16252281, -0.33846983, -0.18280007, -0.14451303, -0.3491705, 0.7385647, 0.23709188, 0.35123006, -0.08983389, 0.28762984, 0.01069252, -0.005493727, -0.01478569, -0.21081218, 0.31973347, 0.57102394, -0.090251975, 0.024521315, -0.07400369, 0.1604857, 0.0006978912, 0.23801477, 0.15782556, -0.23305541, -0.33934858, -0.7046923, 0.3888474, 0.018715596, -0.1590166, 0.0396753, -0.0075969887, 0.14665444, -0.08141522, 0.13906074, -0.12336385, 0.18838067, 0.2794778, -0.43430367, -3.0222368, -0.112657234, -0.48416227, -0.0768291, 0.27748024, 0.13323325, 0.17570113, 0.12851852, 0.5941701, 0.508074, -0.4200568, 0.25307485, -0.11302263, 0.24359815, 0.39850366, 0.18041219, -0.25556263, -0.15677671, -0.12146113, 0.11462549, 0.03206719, 0.18937133, -0.38975123, 0.31198576, -0.60671777, 0.12976122, -0.18699339, -0.08338334, -0.19498147, 0.09981582, 0.045266017, 0.4789428, -0.2761243, -0.025746373, -0.24038945, 0.7126396, 0.5870525, 0.22910309, -0.16905083, 0.031426813, -0.43050867, -0.0044919495, 0.20402199, 0.5396829, -0.42927772, -0.33674845, -0.014269916, 0.5349731, 0.28014466, -0.39612612, 0.049040712, 0.15552673, -0.030129947, 0.6691768, ...]"


In [114]:
for column in metadata.columns:
  print(column)

adult
belongs_to_collection
budget
genres
homepage
id
imdb_id
original_language
original_title
overview
popularity
poster_path
production_companies
production_countries
release_date
revenue
runtime
spoken_languages
status
tagline
title
video
vote_average
vote_count


In [116]:
display(HTML(metadata.head().to_html()))


Unnamed: 0,adult,belongs_to_collection,budget,genres,homepage,id,imdb_id,original_language,original_title,overview,popularity,poster_path,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline,title,video,vote_average,vote_count
0,False,"{'id': 10194, 'name': 'Toy Story Collection', 'poster_path': '/7G9915LfUQ2lVfwMEEhDsn3kT4B.jpg', 'backdrop_path': '/9FBwqcd9IRruEDUrTdcaafOMKUq.jpg'}",30000000,"[{'id': 16, 'name': 'Animation'}, {'id': 35, 'name': 'Comedy'}, {'id': 10751, 'name': 'Family'}]",http://toystory.disney.com/toy-story,862,tt0114709,en,Toy Story,"Led by Woody, Andy's toys live happily in his room until Andy's birthday brings Buzz Lightyear onto the scene. Afraid of losing his place in Andy's heart, Woody plots against Buzz. But when circumstances separate Buzz and Woody from their owner, the duo eventually learns to put aside their differences.",21.946943,/rhIRbceoE9lR4veEXuwCC2wARtG.jpg,"[{'name': 'Pixar Animation Studios', 'id': 3}]","[{'iso_3166_1': 'US', 'name': 'United States of America'}]",1995-10-30,373554033.0,81.0,"[{'iso_639_1': 'en', 'name': 'English'}]",Released,,Toy Story,False,7.7,5415.0
1,False,,65000000,"[{'id': 12, 'name': 'Adventure'}, {'id': 14, 'name': 'Fantasy'}, {'id': 10751, 'name': 'Family'}]",,8844,tt0113497,en,Jumanji,"When siblings Judy and Peter discover an enchanted board game that opens the door to a magical world, they unwittingly invite Alan -- an adult who's been trapped inside the game for 26 years -- into their living room. Alan's only hope for freedom is to finish the game, which proves risky as all three find themselves running from giant rhinoceroses, evil monkeys and other terrifying creatures.",17.015539,/vzmL6fP7aPKNKPRTFnZmiUfciyV.jpg,"[{'name': 'TriStar Pictures', 'id': 559}, {'name': 'Teitler Film', 'id': 2550}, {'name': 'Interscope Communications', 'id': 10201}]","[{'iso_3166_1': 'US', 'name': 'United States of America'}]",1995-12-15,262797249.0,104.0,"[{'iso_639_1': 'en', 'name': 'English'}, {'iso_639_1': 'fr', 'name': 'Français'}]",Released,Roll the dice and unleash the excitement!,Jumanji,False,6.9,2413.0
2,False,"{'id': 119050, 'name': 'Grumpy Old Men Collection', 'poster_path': '/nLvUdqgPgm3F85NMCii9gVFUcet.jpg', 'backdrop_path': '/hypTnLot2z8wpFS7qwsQHW1uV8u.jpg'}",0,"[{'id': 10749, 'name': 'Romance'}, {'id': 35, 'name': 'Comedy'}]",,15602,tt0113228,en,Grumpier Old Men,"A family wedding reignites the ancient feud between next-door neighbors and fishing buddies John and Max. Meanwhile, a sultry Italian divorcée opens a restaurant at the local bait shop, alarming the locals who worry she'll scare the fish away. But she's less interested in seafood than she is in cooking up a hot time with Max.",11.7129,/6ksm1sjKMFLbO7UY2i6G1ju9SML.jpg,"[{'name': 'Warner Bros.', 'id': 6194}, {'name': 'Lancaster Gate', 'id': 19464}]","[{'iso_3166_1': 'US', 'name': 'United States of America'}]",1995-12-22,0.0,101.0,"[{'iso_639_1': 'en', 'name': 'English'}]",Released,Still Yelling. Still Fighting. Still Ready for Love.,Grumpier Old Men,False,6.5,92.0
3,False,,16000000,"[{'id': 35, 'name': 'Comedy'}, {'id': 18, 'name': 'Drama'}, {'id': 10749, 'name': 'Romance'}]",,31357,tt0114885,en,Waiting to Exhale,"Cheated on, mistreated and stepped on, the women are holding their breath, waiting for the elusive ""good man"" to break a string of less-than-stellar lovers. Friends and confidants Vannah, Bernie, Glo and Robin talk it all out, determined to find a better way to breathe.",3.859495,/16XOMpEaLWkrcPqSQqhTmeJuqQl.jpg,"[{'name': 'Twentieth Century Fox Film Corporation', 'id': 306}]","[{'iso_3166_1': 'US', 'name': 'United States of America'}]",1995-12-22,81452156.0,127.0,"[{'iso_639_1': 'en', 'name': 'English'}]",Released,Friends are the people who let you be yourself... and never let you forget it.,Waiting to Exhale,False,6.1,34.0
4,False,"{'id': 96871, 'name': 'Father of the Bride Collection', 'poster_path': '/nts4iOmNnq7GNicycMJ9pSAn204.jpg', 'backdrop_path': '/7qwE57OVZmMJChBpLEbJEmzUydk.jpg'}",0,"[{'id': 35, 'name': 'Comedy'}]",,11862,tt0113041,en,Father of the Bride Part II,"Just when George Banks has recovered from his daughter's wedding, he receives the news that she's pregnant ... and that George's wife, Nina, is expecting too. He was planning on selling their home, but that's a plan that -- like George -- will have to change with the arrival of both a grandchild and a kid of his own.",8.387519,/e64sOI48hQXyru7naBFyssKFxVd.jpg,"[{'name': 'Sandollar Productions', 'id': 5842}, {'name': 'Touchstone Pictures', 'id': 9195}]","[{'iso_3166_1': 'US', 'name': 'United States of America'}]",1995-02-10,76578911.0,106.0,"[{'iso_639_1': 'en', 'name': 'English'}]",Released,Just When His World Is Back To Normal... He's In For The Surprise Of His Life!,Father of the Bride Part II,False,5.7,173.0


## Step 2: Overviews are short and not descriptive enough so Let's use plot summaries to get feature vectors
**source:** https://www.kaggle.com/jrobischon/wikipedia-movie-plots


In [20]:
movie_plots = pd.read_csv('/home/gsoykan20/Desktop/inzva/week2/homework-20210808T093349Z-001/homework/Wikipedia Movie Plots/wiki_movie_plots_deduped.csv', low_memory=False)
# metadata['overview'] = metadata['overview'].fillna('')


In [21]:
display(HTML(movie_plots.head().to_html()))

Unnamed: 0,Release Year,Title,Origin/Ethnicity,Director,Cast,Genre,Wiki Page,Plot
0,1901,Kansas Saloon Smashers,American,Unknown,,unknown,https://en.wikipedia.org/wiki/Kansas_Saloon_Smashers,"A bartender is working at a saloon, serving drinks to customers. After he fills a stereotypically Irish man's bucket with beer, Carrie Nation and her followers burst inside. They assault the Irish man, pulling his hat over his eyes and then dumping the beer over his head. The group then begin wrecking the bar, smashing the fixtures, mirrors, and breaking the cash register. The bartender then sprays seltzer water in Nation's face before a group of policemen appear and order everybody to leave.[1]"
1,1901,Love by the Light of the Moon,American,Unknown,,unknown,https://en.wikipedia.org/wiki/Love_by_the_Light_of_the_Moon,"The moon, painted with a smiling face hangs over a park at night. A young couple walking past a fence learn on a railing and look up. The moon smiles. They embrace, and the moon's smile gets bigger. They then sit down on a bench by a tree. The moon's view is blocked, causing him to frown. In the last scene, the man fans the woman with his hat because the moon has left the sky and is perched over her shoulder to see everything better."
2,1901,The Martyred Presidents,American,Unknown,,unknown,https://en.wikipedia.org/wiki/The_Martyred_Presidents,"The film, just over a minute long, is composed of two shots. In the first, a girl sits at the base of an altar or tomb, her face hidden from the camera. At the center of the altar, a viewing portal displays the portraits of three U.S. Presidents—Abraham Lincoln, James A. Garfield, and William McKinley—each victims of assassination.\r\nIn the second shot, which runs just over eight seconds long, an assassin kneels feet of Lady Justice."
3,1901,"Terrible Teddy, the Grizzly King",American,Unknown,,unknown,"https://en.wikipedia.org/wiki/Terrible_Teddy,_the_Grizzly_King","Lasting just 61 seconds and consisting of two shots, the first shot is set in a wood during winter. The actor representing then vice-president Theodore Roosevelt enthusiastically hurries down a hillside towards a tree in the foreground. He falls once, but rights himself and cocks his rifle. Two other men, bearing signs reading ""His Photographer"" and ""His Press Agent"" respectively, follow him into the shot; the photographer sets up his camera. ""Teddy"" aims his rifle upward at the tree and fells what appears to be a common house cat, which he then proceeds to stab. ""Teddy"" holds his prize aloft, and the press agent takes notes. The second shot is taken in a slightly different part of the wood, on a path. ""Teddy"" rides the path on his horse towards the camera and out to the left of the shot, followed closely by the press agent and photographer, still dutifully holding their signs."
4,1902,Jack and the Beanstalk,American,"George S. Fleming, Edwin S. Porter",,unknown,https://en.wikipedia.org/wiki/Jack_and_the_Beanstalk_(1902_film),"The earliest known adaptation of the classic fairytale, this films shows Jack trading his cow for the beans, his mother forcing him to drop them in the front yard, and beig forced upstairs. As he sleeps, Jack is visited by a fairy who shows him glimpses of what will await him when he ascends the bean stalk. In this version, Jack is the son of a deposed king. When Jack wakes up, he finds the beanstalk has grown and he climbs to the top where he enters the giant's home. The giant finds Jack, who narrowly escapes. The giant chases Jack down the bean stalk, but Jack is able to cut it down before the giant can get to safety. He falls and is killed as Jack celebrates. The fairy then reveals that Jack may return home as a prince."


In [124]:
show_as_html(movie_plots[movie_plots["Title"] == "The Matrix"])


Unnamed: 0,Release Year,Title,Origin/Ethnicity,Director,Cast,Genre,Wiki Page,Plot
13509,1999,The Matrix,American,"Andy Wachowski, Larry Wachowski","Keanu Reeves, Laurence Fishburne, Carrie-Anne Moss, Hugo Weaving, Joe Pantoliano",science fiction,https://en.wikipedia.org/wiki/The_Matrix,"A woman is cornered by police in an abandoned hotel; after overpowering them with superhuman abilities, a group of sinister superhuman grey green-suited Agents leads the police in a rooftop pursuit. She answers a ringing public telephone and vanishes.\r\nComputer programmer Thomas Anderson, living a double life as the hacker ""Neo"", feels something is wrong with the world and is puzzled by repeated online encounters with the cryptic phrase ""the Matrix"". The woman, Trinity, contacts him, saying that a man named Morpheus can explain its meaning; however, the Agents, led by Agent Smith, apprehend Neo and attempt to threaten him into helping them capture the ""terrorist"" Morpheus. Undeterred, Neo meets Morpheus, who offers him a choice between a red pill that will show him the truth about the Matrix, and a blue pill that will return him to his former life. After swallowing the red pill, his reality disintegrates and Neo awakens, naked, weak and hairless, in a liquid-filled pod, one of countless others connected by cables to an elaborate electrical system. He is rescued and brought aboard Morpheus' hovercraft, the Nebuchadnezzar.\r\nAs Neo recuperates, Morpheus explains the truth: in the 21st century, intelligent machines waged war against their human creators. When humans blocked the machines' access to solar energy, the machines retaliated by harvesting the humans' bioelectric power. The Matrix is a shared simulation of the world, in which the minds of the harvested humans are trapped and pacified. All free humans live in Zion, the last refuge in the real world. Morpheus and his crew are a group of rebels who hack into the Matrix to ""unplug"" enslaved humans and recruit them; their understanding of the simulated reality enables them to bend its physical laws, granting them superhuman abilities. Morpheus warns Neo that death within the Matrix also kills the physical body, and that the Agents are powerful sentient programs that eliminate threats to the system. Neo's prowess during virtual combat training lends credence to Morpheus' belief that Neo is ""the One"", an especially powerful human prophesied to free humans and end the war.\r\nThe group enters the Matrix to visit the Oracle, a prophet who predicted the emergence of the One. She implies that Neo is not the One and warns Neo that he will have to choose between Morpheus' life and his own. Before they can leave the Matrix, the group is ambushed by Agents and tactical police alerted by Cypher, a crew member who betrayed Morpheus to Smith in exchange for a comfortable life back in the Matrix. Morpheus allows himself to be captured so Neo and the rest of the crew can escape. Cypher exits the Matrix and murders several crew members as they lie defenseless in the real world. As he prepares to disconnect Neo and Trinity, Tank, a crewman whom he had left for dead, kills him.\r\nIn the Matrix, the Agents interrogate Morpheus to learn his access codes to the mainframe computer in Zion. Tank proposes killing Morpheus to prevent this, but Neo, believing that he is not the One, resolves to return to the Matrix to rescue Morpheus; Trinity insists she accompany him. While rescuing Morpheus, Neo gains confidence in his abilities, performing feats comparable to the Agents'. Morpheus and Trinity exit the Matrix, but Smith ambushes and kills Neo before he can leave. In the real world, machines called Sentinels attack the Nebuchadnezzar. Trinity whispers to Neo that he can't be dead because she loves him and the Oracle told her that she would fall in love with the One. She kisses Neo and he revives with the power to perceive and control the Matrix. He effortlessly defeats Smith and leaves the Matrix just as the ship's electromagnetic pulse weapon disables the attacking Sentinels.\r\nLater, Neo makes a telephone call inside the Matrix, promising the machines that he will show their prisoners ""a world where anything is possible"". He hangs up and flies into the sky."


In [22]:
intersected = pd.DataFrame(columns=movie_docs.columns)
for idx, row in tqdm(movie_docs.iterrows(), total=len(movie_docs)):
    original_title_in_movie_docs = row["original_title"]
    try:
        intersected_row = movie_plots[movie_plots["Title"] == original_title_in_movie_docs]
        intersected = intersected.append(intersected_row, ignore_index=True)
    except:
        print("not match for " + original_title_in_movie_docs)

HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=45466.0), HTML(value='')))




In [54]:
for idx, row in tqdm(movie_docs.iterrows(), total=len(movie_docs)):
    original_title_in_movie_docs = row["original_title"]
    try:
        intersected_row = movie_plots[movie_plots["Title"] == original_title_in_movie_docs]
        movie_docs.iloc[idx]["Plot"] = intersected_row["Plot"].values[0]
    except:
        movie_docs.iloc[idx]["Plot"] = None
        print("not match for " + original_title_in_movie_docs)


HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=45466.0), HTML(value='')))

not match for La Cité des Enfants Perdus
not match for 摇啊摇，摇到外婆桥
not match for Guillaumet, les ailes du courage
not match for Across the Sea of Time
not match for How To Make An American Quilt
not match for Se7en
not match for Guardian Angel
not match for Lamerica
not match for Georgia
not match for Kids of the Round Table
not match for Il postino
not match for Le confessionnal
not match for Two If by Sea
not match for Lawnmower Man 2: Beyond Cyberspace
not match for Gazon maudit
not match for Kicking and Screaming
not match for Les misérables
not match for Nico Icon
not match for بادکنک سفید
not match for Antonia
not match for Once Upon a Time... When We Were Colored
not match for Last Summer in the Hamptons
not match for The Journey of August King
not match for La Haine
not match for Shopping
not match for Heidi Fleiss: Hollywood Madam
not match for Keiner liebt mich
not match for Catwalk
not match for Headless Body in Topless Bar
not match for 紅番區
not match for Le Bonheur est dans l

In [97]:
movie_plots[movie_plots["Title"] == "black sun"]

Unnamed: 0,Release Year,Title,Origin/Ethnicity,Director,Cast,Genre,Wiki Page,Plot


In [104]:
def generate_movie_docs_with_plots_vectors_from_distilbert(mix_overview_and_plot=False,
                                                           save_name="movie_lens_overview_with_plots_distilbert_vecs.npy"):
    def get_cls_vec(text):
            token_inputs = distilbert_tokenizer(text, return_tensors="pt", max_length=512).to('cuda')
            return distilbert_model(**token_inputs).last_hidden_state[0, 0].cpu().detach().numpy()

    movie_docs_vectors = []
    for idx, row in tqdm(movie_docs.iterrows(), total=len(movie_docs)):
        if row["Plot"] != None and len(row["Plot"]) >= len(row["overview"]):
            if mix_overview_and_plot:
                plot_vec = get_cls_vec(row["Plot"])
                overview_vec = get_cls_vec(row["overview"])
                cls_feature_vec = (plot_vec + overview_vec) / 2
            else:
                cls_feature_vec = get_cls_vec(row["Plot"])
            #print("Plot is used for " + str(row["original_title"]))
        else:
            overview = row["overview"]
            cls_feature_vec = get_cls_vec(overview)
        movie_docs_vectors.append(cls_feature_vec)

    with open('./' + save_name, 'wb') as f:
        np.save(f, np.array(movie_docs_vectors), allow_pickle=True)

In [65]:
generate_movie_docs_with_plots_vectors_from_distilbert()


HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=45466.0), HTML(value='')))

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.


Plot is used for Toy Story
Plot is used for Jumanji
Plot is used for Grumpier Old Men
Plot is used for Waiting to Exhale
Plot is used for Father of the Bride Part II
Plot is used for Heat
Plot is used for Sabrina
Plot is used for Tom and Huck
Plot is used for Sudden Death
Plot is used for GoldenEye
Plot is used for The American President
Plot is used for Dracula: Dead and Loving It
Plot is used for Balto
Plot is used for Nixon
Plot is used for Cutthroat Island
Plot is used for Casino
Plot is used for Sense and Sensibility
Plot is used for Four Rooms
Plot is used for Ace Ventura: When Nature Calls
Plot is used for Money Train
Plot is used for Get Shorty
Plot is used for Copycat
Plot is used for Assassins
Plot is used for Powder
Plot is used for Leaving Las Vegas
Plot is used for Othello
Plot is used for Now and Then
Plot is used for Persuasion
Plot is used for Dangerous Minds
Plot is used for Twelve Monkeys
Plot is used for Babe
Plot is used for Dead Man Walking
Plot is used for It Take

In [66]:
with open('./movie_lens_overview_with_plots_distilbert_vecs.npy', 'rb') as f:
    word_vecs = np.load(f, allow_pickle=True)

In [69]:
insert_movie_feature_vectors(word_vecs)

In [74]:
recommend_for_all_metrics("The Matrix")

vector shape:768
['Metropolis', 'Metropolis', 'Android Cop', 'I, Robot', 'Clockstoppers', 'Clockstoppers', 'Death Machine', 'The Terminator', 'They Live', 'The Matrix Revolutions']
vector shape:768
['Suicide Squad', 'Batman', 'Batman', 'Moonraker', 'X-Men: Days of Future Past', 'Star Trek: Insurrection', 'Captain America: The Winter Soldier', 'Star Trek III: The Search for Spock', 'Cyborg X', 'Batman: Mystery of the Batwoman']
vector shape:768
['Flash Gordon', 'Resident Evil: The Final Chapter', 'Prince of Darkness', 'Sleeper', 'Minority Report', 'Terminator 2: Judgment Day', 'Hellraiser: Bloodline', 'Resident Evil: Retribution', 'The Matrix Revolutions', 'The Terminator']
vector shape:768
['Duck Amuck', 'Paranormal Whacktivity', 'Promoción Fantasma', 'Foxtrot', 'Die Farbe', 'The Prospects', 'Festival', 'Ōgon Bat', 'Comedy Central Roast of Pamela Anderson', 'Toy Story']
vector shape:768
['Ganes', 'Бой с Тенью', 'Метель', 'Дура', 'Наша Russia: Яйца судьбы', 'Ι-4: Λούφα Και Απαλλαγή', 'V

In [75]:
recommend_for_all_metrics("Whiplash")

vector shape:768
['Gladiator', 'The Boss', 'The Boss', 'Gun Crazy', 'Redbelt', 'DodgeBall: A True Underdog Story', 'The Lone Wolf Spy Hunt', 'None But the Lonely Heart', 'Whiplash', 'Whiplash']
vector shape:768
['Sat sau ji wong', 'French Roast', 'Tokyo Joe', 'Modern Problems', 'Une Chambre en Ville', 'The Con', "The Man Who Wasn't There", "The Man Who Wasn't There", 'Sonny Boy', 'Save the Date']
vector shape:768
['DodgeBall: A True Underdog Story', 'The Boss', 'The Boss', 'The Big Night', 'Any Which Way You Can', 'They Made Me a Criminal', 'The Prizefighter and the Lady', 'Gladiator', 'Whiplash', 'Whiplash']
vector shape:768
['Foxtrot', 'Die Farbe', 'The Prospects', 'Festival', 'Ōgon Bat', 'Toy Story', 'My Little Eye', 'Touch', 'Whiplash', 'Whiplash']
vector shape:768
['Дура', 'Kalteva Torni', 'Passage de Venus', 'Про любоff', 'Batman: Gotham Knight', 'Buffalo Running', 'На семи ветрах', 'Ι-4: Λούφα Και Απαλλαγή', 'Downeast', 'Koroli i kapusta']
vector shape:768
['DodgeBall: A True Un

In [76]:
recommend_for_all_metrics("Toy Story")

vector shape:768
['Looney Tunes: Back in Action', 'The Adventures of Elmo in Grouchland', "We're Back! A Dinosaur's Story", 'TerrorVision', 'The Care Bears Movie', 'Once Upon a Forest', 'Casper Meets Wendy', 'Elf', 'Toy Story 2', 'Toy Story 3']
vector shape:768
['Big Hero 6', 'Winnie the Pooh', 'Robots', 'Superman IV: The Quest for Peace', 'Batman', 'Batman', 'Batman Forever', 'Pinocchio', 'Pinocchio', 'Pinocchio']
vector shape:768
['Frankenweenie', 'The Jetsons Meet the Flintstones', 'The Angry Birds Movie', 'Jimmy Neutron: Boy Genius', 'Runaway Brain', 'The Adventures of Elmo in Grouchland', 'TerrorVision', 'Elf', 'Toy Story 2', 'Toy Story 3']
vector shape:768
['5 Flights Up', 'Duck Amuck', 'Promoción Fantasma', 'Foxtrot', 'Die Farbe', 'The Prospects', 'Festival', 'Ōgon Bat', 'Comedy Central Roast of Pamela Anderson', 'The Wrong Road']
vector shape:768
['Sunes sommar', 'Io, Emmanuelle', 'Stiller Sturm', 'Семь кабинок', "Les fragments d'Antonin", 'Кентервильское привидение', 'Diario d

In [105]:
generate_movie_docs_with_plots_vectors_from_distilbert(mix_overview_and_plot=True,
                                                       save_name="movie_lens_overview_with_plots_mixed_distilbert_vecs.npy")

HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=45466.0), HTML(value='')))




In [106]:
with open('./movie_lens_overview_with_plots_mixed_distilbert_vecs.npy', 'rb') as f:
    word_vecs = np.load(f, allow_pickle=True)

In [107]:
insert_movie_feature_vectors(word_vecs)

In [108]:
recommend_for_all_metrics("The Matrix")

vector shape:768
['ARQ', '屍者の帝国', 'Antiviral', 'Garm Wars: The Last Druid', 'Starchaser: The Legend of Orin', 'Lockout', 'Prometheus', 'The Dark Knight', 'Interstellar', '9']
vector shape:768
['Transformers Prime Beast Hunters: Predacons Rising', 'Sand Castle', 'Sons of Liberty', 'Trancers 3: Deth Lives', 'Star Wars: The Force Awakens', 'The X Files', 'アベンジャーズ コンフィデンシャル：ブラック・ウィドウ ＆ パニッシャー', 'The Matrix', 'Scavengers', 'Batman: Mystery of the Batwoman']
vector shape:768
['The Terminator', 'Predestination', 'Resident Evil: Apocalypse', 'Babylon A.D.', 'Supernova', 'Garm Wars: The Last Druid', 'Starship Troopers', 'Men in Black', 'Lockout', 'Surrogates']
vector shape:768
['Duck Amuck', 'Paranormal Whacktivity', 'Promoción Fantasma', 'Foxtrot', 'Die Farbe', 'The Prospects', 'Festival', 'Ōgon Bat', 'Comedy Central Roast of Pamela Anderson', 'Toy Story']
vector shape:768
['Сатисфакция', 'Бабло', 'Kalteva Torni', 'Бой с Тенью', 'Дура', 'Наша Russia: Яйца судьбы', 'Ι-4: Λούφα Και Απαλλαγή', 'П

In [109]:
recommend_for_all_metrics("Whiplash")

vector shape:768
['The Girl Next Door', 'Stardust', 'Up in Smoke', "Child's Play", 'Young Man with a Horn', 'The Gambler', 'The Prizefighter and the Lady', 'Wolf', 'The Recruiter', 'Magnificent Obsession']
vector shape:768
['High Art', 'French Roast', 'Save the Date', 'Un petit boulot', 'The Con', 'Hamnstad', 'Sat sau ji wong', 'Len and Company', 'Sonny Boy', 'Heaven Knows What']
vector shape:768
['Purple Rain', 'School of Rock', 'Whiplash', 'Weary River', 'Body and Soul', 'The Derby Stallion', 'Wild Guitar', 'Wolf', 'Body and Soul', 'Young Man with a Horn']
vector shape:768
['Promoción Fantasma', 'Foxtrot', 'Die Farbe', 'The Prospects', 'Festival', 'Ōgon Bat', 'Donnie Brasco', "The Great St. Trinian's Train Robbery", 'Cheyenne Autumn', 'Carson City']
vector shape:768
['Salut cousin !', 'Downeast', 'Бой с Тенью', 'Наша Russia: Яйца судьбы', 'Дура', 'Ι-4: Λούφα Και Απαλλαγή', 'Про любоff', 'Kalteva Torni', 'На семи ветрах', 'Bad']
vector shape:768
['Purple Rain', 'School of Rock', 'Whip

In [110]:
recommend_for_all_metrics("Toy Story")

vector shape:768
["Charlotte's Web", "Charlotte's Web", "Now You See Him, Now You Don't", 'Frankenweenie', 'Soup to Nuts', 'Oliver & Company', 'The Peanuts Movie', 'Trick or Treat', 'Toy Story 3', 'Toy Story 2']
vector shape:768
['Leaves of Grass', 'Cyborg X', 'Muppets Most Wanted', 'Batman & Mr. Freeze: SubZero', 'Monster High', 'Freaked', 'Batman', "Garfield's Halloween Adventure", 'Ein Freund von mir', 'The X Files']
vector shape:768
['Fred Claus', 'Frankenweenie', 'Elf', 'Snoopy, Come Home', 'Dino Time', 'Frankenweenie', 'Stuart Little 2', 'Runaway Brain', 'Toy Story 3', 'Toy Story 2']
vector shape:768
['5 Flights Up', 'Duck Amuck', 'Promoción Fantasma', 'Foxtrot', 'Die Farbe', 'The Prospects', 'Festival', 'Ōgon Bat', 'Comedy Central Roast of Pamela Anderson', 'Oh, God! You Devil']
vector shape:768
['Lille Fridolf Och Jag', 'A Wake in Providence', 'Frau Müller muss weg!', 'A Lovasíjász', 'Путь к себе', '¡Asu Mare! 2', 'Buffalo Running', 'Ι-4: Λούφα Και Απαλλαγή', 'Kalteva Torni', '

In [113]:
recommend_for_all_metrics("Forrest Gump")

vector shape:768
['Swimming Upstream', 'The Stranger', 'The Raven', 'A Family Thing', 'September 30, 1955', 'Saint Ralph', 'Fallen', 'Unbreakable', 'All My Sons', 'Seabiscuit']
vector shape:768
['The Most Dangerous Man in America: Daniel Ellsberg and the Pentagon Papers', 'Far Out Man', 'The Con', 'David and Bathsheba', 'Winter Meeting', 'Man Down', 'Liliomfi', 'The Clown and The Kid', 'Heaven Knows What', 'The World Made Straight']
vector shape:768
['Harvie Krumpet', 'All My Sons', "It's a Dog's Life", 'Sergeant York', 'Malcolm X', 'The Long Gray Line', 'Little Boy', 'My Dog Skip', 'Unbreakable', 'Of Human Hearts']
vector shape:768
['Duck Amuck', 'Promoción Fantasma', 'Foxtrot', 'Die Farbe', 'The Prospects', 'Festival', 'Ōgon Bat', 'Comedy Central Roast of Pamela Anderson', 'Paranormal Whacktivity', 'Toy Story']
vector shape:768
['Two Years at Sea', 'Le Corbeau', 'Justice League vs. Teen Titans', 'The Mayor of Casterbridge', 'Mosquinha', 'Bad', 'На семи ветрах', 'Kalteva Torni', 'Down