# PROJET APPLICATION DE RECOMMANDATION DE FILMS

## PARTIE 1 : KPI

<img src="./images/Djangounchained.webp" width="700" height="500"/>

![](./images/Djangounchained.webp)

#### **title.akas.tsv.gz** : Ce fichier contient des informations sur les différents titres d’un film ou d’une série TV dans différentes régions et langues.

- **titleId** : Identifiant unique du titre.
 
- **ordering** : Un nombre pour identifier de manière unique les lignes pour un titleId donné.
 
- **title** : Le titre localisé.
 
- **region** : La région pour cette version du titre.
 
- **language** : La langue du titre.
 
- **types** : Ensemble énuméré d’attributs pour ce titre alternatif.
 
- **attributes** : Termes supplémentaires pour décrire ce titre alternatif, non énumérés.

- **isOriginalTitle** : 0 si ce n’est pas le titre original ; 1 si c’est le titre original.
	

#### **title.basics.tsv.gz** : Ce fichier contient des informations de base sur les titres.

- **tconst** : Identifiant unique du titre.
 
- **titleType** : Le type/format du titre (par exemple, film, court métrage, série TV, épisode TV, vidéo, etc).
 
- **primaryTitle** : Le titre le plus populaire / le titre utilisé par les réalisateurs sur les supports promotionnels lors de la sortie.

- **originalTitle** : Titre original, dans la langue originale.

- **isAdult** : 0 pour un titre non adulte ; 1 pour un titre adulte.

- **startYear** : Représente l’année de sortie d’un titre. Dans le cas d’une série TV, il s’agit de l’année de début de la série.
 
- **endYear** : Année de fin de la série TV. ‘\N’ pour tous les autres types de titres.
 
- **genres** : Comprend jusqu’à trois genres associés au titre.

#### **title.crew.tsv.gz** : Ce fichier contient des informations sur l’équipe de tournage d’un titre.

- **tconst** : Identifiant unique du titre.
 
- **directors** : Réalisateur(s) du titre donné.
 
- **writers** : Scénariste(s) du titre donné.

#### **title.episode.tsv.gz** : Ce fichier contient des informations sur les épisodes d’une série TV.

- **tconst** : Identifiant de l’épisode.
 
- **parentTconst** : Identifiant de la série TV parente.
 
- **seasonNumber** : Numéro de la saison à laquelle appartient l’épisode.
 
- **episodeNumber** : Numéro de l’épisode dans la série TV.
	
#### **title.principals.tsv.gz** : Ce fichier contient des informations sur les principaux participants d’un titre.

- **tconst** : Identifiant unique du titre.
 
- **ordering** : Un nombre pour identifier de manière unique les lignes pour un titleId donné.
 
- **nconst** : Identifiant unique de la personne.
 
- **category** : La catégorie de travail dans laquelle cette personne était.
 
- **job** : Le titre spécifique du travail si applicable, sinon ‘\N’.

- **characters** : Le nom du personnage joué si applicable, sinon ‘\N’.

#### **title.ratings.tsv.gz** : Ce fichier contient des informations sur les notes d’un titre.

- **tconst** : Identifiant unique du titre.
 
- **averageRating** : Moyenne pondérée de toutes les notes des utilisateurs.
 
- *numVotes** : Nombre de votes que le titre a reçus.

#### **name.basics.tsv.gz** : Ce fichier contient des informations de base sur une personne.

- **nconst** : Identifiant unique de la personne.
 
- **primaryName** : Nom par lequel la personne est le plus souvent créditée.
 
- **birthYear** : En format YYYY.
 
- **deathYear** : En format YYYY si applicable, sinon ‘\N’.
 
- **primaryProfession** : Les 3 principales professions de la personne.
 
- **knownForTitles** : Titres pour lesquels la personne est connue.

In [2]:
import os
import io
import gzip
import random
import secrets
import datetime
import requests
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn import svm
import plotly.io as pio
from sklearn import tree
import plotly.express as px
from bs4 import BeautifulSoup
import matplotlib.pyplot as plt
import plotly.graph_objects as go
from sklearn.cluster import KMeans
from scipy.cluster import hierarchy
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score
from sklearn.metrics import silhouette_score
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import SGDRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import classification_report
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_validate
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import AgglomerativeClustering
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, r2_score
from scipy.spatial.distance import pdist, squareform
from sklearn.ensemble import HistGradientBoostingRegressor
from scipy.cluster.hierarchy import dendrogram, linkage, fcluster
from flask import Flask, request, render_template, session, url_for, redirect
from sklearn.preprocessing import (
    MaxAbsScaler,
    MinMaxScaler,
    Normalizer,
    PowerTransformer,
    QuantileTransformer,
    RobustScaler,
    StandardScaler,
    minmax_scale,
)
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder, TargetEncoder


# Chargement des datasets de type TSV
les fichiers étant mis à jour quotidiennement, nous faisons le choix d'une fonction
 permetant de les importer et de les sauvegarder automatiquement

In [3]:
# scrapping des noms des fichiers
url = "https://datasets.imdbws.com/"

# contournement d'éventuels protection anti-scrapping
navigator = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3"

response = requests.get(url, headers={"User-Agent": navigator})
# response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

# Trouver tous les liens sur la page
liens = soup.find_all("a")

# création de la liste des fichiers
fichiers = []
for lien in liens:
    #  choisir uniquement les liens de type href
    nom_fichier = lien.get("href")
    #  choisir uniquement les liens se terminant par .tsv.gz
    if nom_fichier.endswith(".tsv.gz"):
        # print(nom_fichier)
        fichiers.append(nom_fichier)
# print(f"\n fichiers:\n{fichiers} \n")


In [4]:
# Création du dictionnaire pour stocker les DataFrames
dfs = {}
liste_noms_df = []

for fichier in fichiers:
    # url = f'https://datasets.imdbws.com/{fichier}'
    response = requests.get(fichier)
    gzip_file = gzip.open(io.BytesIO(response.content), "rt", encoding="utf-8")
    df = pd.read_csv(gzip_file, sep="\t", low_memory=False)
    # Chaque DataFrame porte le nom du fichier sans l'extension
    nom_sans_extension = os.path.splitext(os.path.basename(fichier))[0]
    # Suppression du tz et remplacement des . par des _
    nom_sans_extension = os.path.splitext(nom_sans_extension)[
        0].replace(".", "_")
    dfs[nom_sans_extension] = df
    liste_noms_df.append(nom_sans_extension)
    print(f"\nNom du DataFrame : {nom_sans_extension}")



Nom du DataFrame : name_basics

Nom du DataFrame : title_akas

Nom du DataFrame : title_basics

Nom du DataFrame : title_crew

Nom du DataFrame : title_episode

Nom du DataFrame : title_principals

Nom du DataFrame : title_ratings


In [None]:
# sauvegarde des datafdrames sur le disque au format csv
# for nom, df in dfs.items():
#     df.to_csv(f"data_csv/{nom}.csv", index=False)


#### sauvegarde des dataframes au format pickle et copie de chacun accessibles globalement

In [20]:
import pandas as pd
import os
#  séparation des df de dfs
for nom_df in dfs:
    df_nom_df = dfs[nom_df].apply(pd.Series)
    # print(f"\nDataFrame :{nom_df} :\n{df_nom_df.shape}")
    # Sauvegarde du DataFrame en format pickle
    df_nom_df.to_pickle(f"data_pickle/df_{nom_df}.pkl")
    df_nom_df2 = pd.read_pickle(f"data_pickle/df_{nom_df}.pkl")

#chemin des sauvegardes des dataframe au format pickle
dossier = 'data_pickle'

# liste des fichiers dans le dossier
fichiers = os.listdir(dossier)

# on ne garde que les fichiers .pkl
fichiers_pickle = [f for f in fichiers if f.endswith('.pkl')]

# Charge chaque fichier pickle en tant que DataFrame
for fichier in fichiers_pickle:
    # suppression extension .pkl du nom du fichier pour l'utiliser comme nom de variable
    nom_df = fichier[:-4]
    # Charge le DataFrame à partir du fichier pickle
    df = pd.read_pickle(os.path.join(dossier, fichier))
    # Ajoutez le DataFrame à l'espace de noms global
    # par sécu on va travailler sur des copies des dfs
    df2=df.copy()
    # Ajoute le DataFrame à l'espace de noms global avec le suffixe "2"
    #  Attention il ne faut pas que les noms puissent se confondre avec d'autres variables globales
    globals()[nom_df+'2'] = df2


In [None]:
# liste des df accessibles globalement
# df_title_akas2
# df_title_basics2
# df_title_crew2
# df_title_episode2
# df_title_principals2
# df_name_basics2


# Echantillonnage

## traitement title_akas

#### **title.akas.tsv.gz** : Ce fichier contient des informations sur les différents titres d’un film ou d’une série TV dans différentes régions et langues.

- **titleId** : Identifiant unique du titre.
 
- **ordering** : Un nombre pour identifier de manière unique les lignes pour un titleId donné.
 
- **title** : Le titre localisé.
 
- **region** : La région pour cette version du titre.
 
- **language** : La langue du titre.
 
- **types** : Ensemble énuméré d’attributs pour ce titre alternatif.
 
- **attributes** : Termes supplémentaires pour décrire ce titre alternatif, non énumérés.

- **isOriginalTitle** : 0 si ce n’est pas le titre original ; 1 si c’est le titre original.

In [23]:
# par securité on travaille sur un double du df
df_title_akas3=df_title_akas2.copy()


- Traitement de la colonne region on ne garde que les films francais

In [24]:
#  Traitement de la colonne region
#  on ne filtre que les films francais
print(f"\n title_akas: {df_title_akas3.shape} ")
liste_region = df_title_akas3['region'].unique().tolist()
print(f"\nliste_region :{liste_region} ")
filtre_fr = df_title_akas3['region'] == "FR"
df_title_akas_fr = df_title_akas3[filtre_fr]
print(f"\n title_akas: {df_title_akas_fr.shape} ")



 title_akas: (38389856, 8) 

liste_region :['UA', 'DE', 'HU', 'GR', 'RU', 'US', '\\N', 'JP', 'FR', 'RO', 'GB', 'CA', 'PT', 'AU', 'ES', 'FI', 'PL', 'AR', 'RS', 'UY', 'IT', 'BR', 'DK', 'TR', 'XWW', 'XEU', 'SK', 'CZ', 'SE', 'NZ', 'KZ', 'MX', 'NO', 'XYU', 'AT', 'VE', 'CSHH', 'SI', 'SUHH', 'IN', 'CN', 'LT', 'TW', 'NL', 'HR', 'CO', 'IR', 'BG', 'SG', 'BE', 'VN', 'PH', 'DZ', 'CH', 'BF', 'EC', 'XWG', 'HK', 'XSA', 'EE', 'IS', 'PR', 'DDDE', 'IL', 'EG', 'XKO', 'CL', 'IE', 'JM', 'KR', 'PE', 'GE', 'BY', 'BA', 'DO', 'AE', 'PA', 'TH', 'ZA', 'TJ', 'XSI', 'MY', 'LV', 'ID', 'PK', 'BD', 'CU', 'AL', 'BO', 'XAS', 'CR', 'PY', 'GT', 'SV', 'KP', 'UZ', 'BUMM', 'MM', 'YUCS', 'XPI', 'BJ', 'AZ', 'NG', 'CM', 'MA', 'GL', 'MN', 'LI', 'LU', 'MZ', 'MK', 'BM', 'MD', 'ME', 'LB', 'IQ', 'TM', 'TN', 'HT', 'AM', 'LK', 'CG', 'CI', 'SY', 'NP', 'QA', 'TO', 'SN', 'GH', 'JO', 'KG', 'NE', 'GN', 'VDVN', 'TD', 'SO', 'SD', 'MC', 'TT', 'GA', 'BS', 'LY', 'AO', 'KH', 'MR', 'AF', 'MG', 'ML', 'GY', 'CY', 'ET', 'GU', 'SR', 'MT', 'TG', 'PG

- Traitement des colonnes à garder

_Plutot que de supprimer des colonnes il est preferable de garder celles qui nous interessent car  les BDD du site imdb sont mises à jour quotidiennement et peuvent changer de type ou de nom_

In [26]:
# liste colonnes: ['titleId', 'ordering', 'title', 'region', 'language', 'types', 'attributes', 'isOriginalTitle']`
# on garde que les  colonnes suivantes ['titleId', 'title', 'region', 'types']`
df_title_akas_fr = df_title_akas_fr[['titleId', 'title', 'types']]


In [27]:
df_title_akas_fr


Unnamed: 0,titleId,title,types
9,tt0000002,Le clown et ses chiens,imdbDisplay
23,tt0000003,Pauvre Pierrot,imdbDisplay
26,tt0000004,Un bon bock,imdbDisplay
84,tt0000010,La sortie de l'usine Lumière,alternative
86,tt0000010,La sortie des usines Lumière,alternative
...,...,...,...
38389822,tt9916844,Épisode #3.15,\N
38389826,tt9916846,Épisode #3.18,\N
38389833,tt9916848,Épisode #3.17,\N
38389842,tt9916850,Épisode #3.19,\N


- Traitement de la colonne types  
on ne garde que  ['imdbDisplay','tv', 'video','festival']

In [28]:
types_film = df_title_akas_fr["types"].unique().tolist()
print(f"\ntypes_film :\n{types_film}")
# surla liste des types ['imdbDisplay', ['imdbDisplay','alternative', '\\N', 'tv', 'video', 'dvd', 'working', 'festival']
# on ne garde que  ['imdbDisplay','tv', 'video','festival']
types_to_keep = ['imdbDisplay', 'tv', 'video', 'festival']
filtre_type_film = df_title_akas_fr['types'].isin(types_to_keep)
df_title_akas_fr = df_title_akas_fr[filtre_type_film]



types_film :
['imdbDisplay', 'alternative', '\\N', 'tv', 'video', 'dvd', 'working', 'festival']


suppression des \N

In [30]:
# suppression de toutes les colonnes ayant des \N
df_title_akas_fr = df_title_akas_fr.loc[:, ~(df_title_akas_fr == r'\N').any()]


## traitement title_principals

#### **title.principals.tsv.gz** : Ce fichier contient des informations sur les principaux participants d’un titre.

- **tconst** : Identifiant unique du titre.
 
- **ordering** : Un nombre pour identifier de manière unique les lignes pour un titleId donné.
 
- **nconst** : Identifiant unique de la personne.
 
- **category** : La catégorie de travail dans laquelle cette personne était.
 
- **job** : Le titre spécifique du travail si applicable, sinon ‘\N’.

- **characters** : Le nom du personnage joué si applicable, sinon ‘\N’.

In [29]:
# par securité on travaille sur un double du df
df_title_principals3 = df_title_principals2.copy()


traitement choix des colonnes

In [None]:
# liste des colonnes['tconst', 'ordering', 'nconst', 'category', 'job', 'characters']
# Résumé des informations du dataframe
your_dataframe = df_title_principals3
print("#"+"-"*79)
print("valeurs uniques des colonnes:")
for col in your_dataframe.columns:
    print("#"+"#"*20)
    print(f"====colonne====: {col} \n====nb valeur uniques====:{
          your_dataframe[col].nunique()} ")
    if your_dataframe[col].nunique() < 15:
        print(r", ".join(
            f"'{item}'" for item in your_dataframe[col].unique().tolist()))
    else:
        unique_values = your_dataframe[col].unique()[:15]
        print("====15 premiers====: \n" +
              ", ".join(f"'{item}'" for item in unique_values))
print("#"+"-"*79)


In [32]:
df_title_principals3


Unnamed: 0,tconst,ordering,nconst,category,job,characters
0,tt0000001,1,nm1588970,self,\N,"[""Self""]"
1,tt0000001,2,nm0005690,director,\N,\N
2,tt0000001,3,nm0374658,cinematographer,director of photography,\N
3,tt0000002,1,nm0721526,director,\N,\N
4,tt0000002,2,nm1335271,composer,\N,\N
...,...,...,...,...,...,...
59959393,tt9916880,5,nm0584014,director,\N,\N
59959394,tt9916880,6,nm0996406,director,principal director,\N
59959395,tt9916880,7,nm1482639,writer,\N,\N
59959396,tt9916880,8,nm2586970,writer,books,\N


In [33]:
# on ne garde que les colonnes ['tconst', 'nconst', 'category']
col_to_keep = ["tconst", "nconst", "category"]
df_title_principals3 = df_title_principals3[col_to_keep]


In [113]:
df_title_principals3


Unnamed: 0,tconst,nconst,category
0,tt0000001,nm1588970,self
1,tt0000001,nm0005690,director
3,tt0000002,nm0721526,director
5,tt0000003,nm0721526,director
6,tt0000003,nm1770680,producer
...,...,...,...
59941345,tt9916880,nm0584014,director
59941346,tt9916880,nm0996406,director
59941347,tt9916880,nm1482639,writer
59941348,tt9916880,nm2586970,writer


##### traitement de la colonne category

###### liste des categories uniques

-**self** :  faire référence à un film ou à une émission de télévision où les acteurs jouent leur propre rôle.

-**director** : Le réalisateur est la personne qui supervise la production artistique et dramatique d’un film, et qui en dirige le tournage.

-**cinematographer** : Le directeur de la photographie est la personne qui est en charge de la prise de vue, de l’éclairage et de la composition visuelle du film.

-**composer** : Le compositeur crée la musique originale du film.

-**producer** : Le producteur supervise tous les aspects de la production d’un film, de la conception à la distribution.

-**editor** : L’éditeur assemble les images filmées pour créer le produit final.

-**actor** : Un acteur joue un personnage dans un film.

-**actress** : Une actrice joue un personnage dans un film.

-**writer** : Le scénariste écrit le scénario du film.

-**production_designer** : Le chef décorateur est responsable de la conception visuelle globale du film.

-**archive_footage** :  fait référence à un film qui utilise des images d’archives.

-**archive_sound** : fait référence à un film qui utilise des sons d’archives.

In [110]:
# liste des categories uniques 'self', 'director', 'cinematographer', 'composer', 'producer', 'editor', 'actor', 'actress', 'writer', 'production_designer', 'archive_footage', 'archive_sound'

# on ne garde que les ['self', 'director', 'producer', 'editor', 'actor', 'actress', 'writer']
title_principals2


Unnamed: 0,tconst,nconst,category
0,tt0000001,nm1588970,self
1,tt0000001,nm0005690,director
2,tt0000001,nm0374658,cinematographer
3,tt0000002,nm0721526,director
4,tt0000002,nm1335271,composer
...,...,...,...
59941345,tt9916880,nm0584014,director
59941346,tt9916880,nm0996406,director
59941347,tt9916880,nm1482639,writer
59941348,tt9916880,nm2586970,writer


In [111]:
types_category = title_principals2["category"].unique().tolist()

# sur la liste des types  ['imdbDisplay','alternative', '\\N', 'tv', 'video', 'dvd', 'working', 'festival']
# on ne garde que  ['imdbDisplay','tv', 'video','festival']
filtre_to_keep = ['self', 'director', 'producer',
                  'editor', 'actor', 'actress', 'writer']
filtre_type_category = title_principals2['category'].isin(filtre_to_keep)
df_title_principals3 = title_principals2[filtre_type_category]


In [114]:
df_title_principals3
# Résumé des informations du dataframe
your_dataframe = df_title_principals3
print("#"+"-"*79)
print("valeurs uniques des colonnes:")
for col in your_dataframe.columns:
   print("#"+"#"*20)
   print(f"====colonne====: {col} \n====nb valeur uniques====:{your_dataframe[col].nunique()} ")
   if your_dataframe[col].nunique()< 15:
       print(r", ".join(f"'{item}'" for item in your_dataframe[col].unique().tolist()))
   else:
       unique_values = your_dataframe[col].unique()[:15]
       print("====15 premiers====: \n" + ", ".join(f"'{item}'" for item in unique_values))
print("#"+"-"*79)


#-------------------------------------------------------------------------------
valeurs uniques des colonnes:
#####################
====colonne====: tconst 
====nb valeur uniques====:9415528 
====15 premiers====: 
'tt0000001', 'tt0000002', 'tt0000003', 'tt0000004', 'tt0000005', 'tt0000006', 'tt0000007', 'tt0000008', 'tt0000009', 'tt0000010', 'tt0000011', 'tt0000012', 'tt0000013', 'tt0000014', 'tt0000015'
#####################
====colonne====: nconst 
====nb valeur uniques====:4745712 
====15 premiers====: 
'nm1588970', 'nm0005690', 'nm0721526', 'nm1770680', 'nm5442200', 'nm0443482', 'nm0653042', 'nm0249379', 'nm0179163', 'nm0183947', 'nm0374658', 'nm0653028', 'nm0063086', 'nm0183823', 'nm1309758'
#####################
====colonne====: category 
====nb valeur uniques====:7 
'self', 'director', 'producer', 'editor', 'actor', 'actress', 'writer'
#-------------------------------------------------------------------------------
#--------------------------------------------------------------

### traitement title.basics

#### traitement title.basics  : Ce fichier contient des informations de base sur les titres.

- **tconst** : Identifiant unique du titre.
 
- **titleType** : Le type/format du titre (par exemple, film, court métrage, série TV, épisode TV, vidéo, etc).
 
- **primaryTitle** : Le titre le plus populaire / le titre utilisé par les réalisateurs sur les supports promotionnels lors de la sortie.

- **originalTitle** : Titre original, dans la langue originale.

- **isAdult** : 0 pour un titre non adulte ; 1 pour un titre adulte.

- **startYear** : Représente l’année de sortie d’un titre. Dans le cas d’une série TV, il s’agit de l’année de début de la série.
 
- **endYear** : Année de fin de la série TV. ‘\N’ pour tous les autres types de titres.
 
- **genres** : Comprend jusqu’à trois genres associés au titre.


In [301]:
liste_noms_df


['name_basics',
 'title_akas',
 'title_basics',
 'title_crew',
 'title_episode',
 'title_principals',
 'title_ratings']

In [385]:
df_title_basics3=title_basics.copy()


In [386]:
df_title_basics3


Unnamed: 0,tconst,titleType,primaryTitle,originalTitle,isAdult,startYear,endYear,runtimeMinutes,genres
0,"(tt0000001,)","([short],)","(Carmencita,)",Carmencita,0,1894,\N,1,"Documentary,Short"
1,"(tt0000002,)","([short],)","(Le clown et ses chiens,)",Le clown et ses chiens,0,1892,\N,5,"Animation,Short"
2,"(tt0000003,)","([short],)","(Pauvre Pierrot,)",Pauvre Pierrot,0,1892,\N,4,"Animation,Comedy,Romance"
3,"(tt0000004,)","([short],)","(Un bon bock,)",Un bon bock,0,1892,\N,12,"Animation,Short"
4,"(tt0000005,)","([short],)","(Blacksmith Scene,)",Blacksmith Scene,0,1893,\N,1,"Comedy,Short"
...,...,...,...,...,...,...,...,...,...
10458178,"(tt9916848,)","([tvEpisode],)","(Episode #3.17,)",Episode #3.17,0,2009,\N,\N,"Action,Drama,Family"
10458179,"(tt9916850,)","([tvEpisode],)","(Episode #3.19,)",Episode #3.19,0,2010,\N,\N,"Action,Drama,Family"
10458180,"(tt9916852,)","([tvEpisode],)","(Episode #3.20,)",Episode #3.20,0,2010,\N,\N,"Action,Drama,Family"
10458181,"(tt9916856,)","([short],)","(The Wind,)",The Wind,0,2015,\N,27,Short


In [None]:
##------------------------------------------------------------------------------
# valeurs uniques des colonnes:
# #####################
# ====colonne====: tconst
# ====nb valeur uniques====:10458183
# ====15 premiers====:
# 'tt0000001', 'tt0000002', 'tt0000003', 'tt0000004', 'tt0000005', 'tt0000006', 'tt0000007', 'tt0000008', 'tt0000009', 'tt0000010', 'tt0000011', 'tt0000012', 'tt0000013', 'tt0000014', 'tt0000015'
# #####################
# ====colonne====: titleType
# ====nb valeur uniques====:11
# 'short', 'movie', 'tvShort', 'tvMovie', 'tvSeries', 'tvEpisode', 'tvMiniSeries', 'tvSpecial', 'video', 'videoGame', 'tvPilot'
# #####################
# ====colonne====: primaryTitle
# ====nb valeur uniques====:4693969
# ====15 premiers====:
# 'Carmencita', 'Le clown et ses chiens', 'Pauvre Pierrot', 'Un bon bock', 'Blacksmith Scene', 'Chinese Opium Den', 'Corbett and Courtney Before the Kinetograph', 'Edison Kinetoscopic Record of a Sneeze', 'Miss Jerry', 'Leaving the Factory', 'Akrobatisches Potpourri', 'The Arrival of a Train', 'The Photographical Congress Arrives in Lyon', 'The Waterer Watered', 'Autour d'une cabine'
# #####################
# ====colonne====: originalTitle
# ====nb valeur uniques====:4717070
# ====15 premiers====:
# 'Carmencita', 'Le clown et ses chiens', 'Pauvre Pierrot', 'Un bon bock', 'Blacksmith Scene', 'Chinese Opium Den', 'Corbett and Courtney Before the Kinetograph', 'Edison Kinetoscopic Record of a Sneeze', 'Miss Jerry', 'La sortie de l'usine Lumière à Lyon', 'Akrobatisches Potpourri', 'L'arrivée d'un train à La Ciotat', 'Le débarquement du congrès de photographie à Lyon', 'L'arroseur arrosé', 'Autour d'une cabine'
# #####################
# ====colonne====: isAdult
# ====nb valeur uniques====:12
# '0', '1', '2019', '1981', '2020', '2017', '\N', '2023', '2022', '2011', '2014', '2005'
# #####################
# ====colonne====: startYear
# ====nb valeur uniques====:153
# ====15 premiers====:
# '1894', '1892', '1893', '1895', '1896', '1898', '1897', '1900', '1899', '1901', '1902', '1903', '1905', '1904', '1912'
# #####################
# ====colonne====: endYear
# ====nb valeur uniques====:97
# ====15 premiers====:
# '\N', '1947', '1945', '1955', '1949', '1958', '1951', '1950', '1952', '1954', '1957', '1953', '1956', '1967', '1971'
# #####################
# ====colonne====: runtimeMinutes
# ====nb valeur uniques====:940
# ====15 premiers====:
# '1', '5', '4', '12', '45', '2', '\N', '3', '100', '13', '6', '40', '11', '9', '10'
# #####################
# ====colonne====: genres
# ====nb valeur uniques====:2359
# ====15 premiers====:
# 'Documentary,Short', 'Animation,Short', 'Animation,Comedy,Romance', 'Comedy,Short', 'Short', 'Short,Sport', 'Romance', 'Documentary,Short,Sport', 'News,Short', 'News,Short,Sport', 'Comedy,Documentary,Short', 'Drama,Short', 'Fantasy,Short', 'Horror,Short', 'Comedy,Horror,Short'
# #-------------------------------------------------------------------------------


In [45]:
# fonction de traitement des colonnes et des valeurs d'un dataframe
df_title_basics3=df_title_basics2.copy()
my_dataframe=df_title_basics3
# liste des colonnes ['tconst', 'titleType', 'primaryTitle', 'originalTitle', 'isAdult', 'startYear', 'endYear', 'runtimeMinutes', 'genres']

col_to_keep = ['tconst', 'titleType', 'primaryTitle',  'startYear',  'runtimeMinutes', 'genres']

# tableau des valeurs à garder
liste_valeur_to_keep ={
    "titleType":['movie'],
    "genres": ['Action', 'Thriller', 'Adventure', 'Sci-Fi', 'Fantasy', 'Animation', 'War', 'Family', 'Musical', 'Mystery', 'Comedy',  'Drama'],
    "startYear":['2000'],
    "runtimeMinutes":['120','190']
    }
# col_to_keep = ['tconst', 'titleType']



def traitement_dataframe(my_dataframe, colonne_a_garder, liste_valeur_a_garder):
    # on ne garde que les colonnes ['tconst', 'nconst', 'category']
    print(f"\n===col_to_keep=== :\n{colonne_a_garder}")
    print(f"\n==valeur_a_garder=== :\n{liste_valeur_to_keep}")
    my_dataframe = my_dataframe[colonne_a_garder]
    print(f"\n===nom=== : \n{my_dataframe.describe()[0:2]}")
    # on ne garde que les valeurs de la liste liste_valeur_a_garder
    for col in liste_valeur_a_garder:
        # traitement pour ne garder que les films à partir de l'an 2000
        if col=="startYear":
            print(f"\nliste_valeur_a_garder[col] :\n{int(liste_valeur_a_garder[col][0])} \n")
            # Converti la colonne 'startYear' en entier
            my_dataframe[col] = pd.to_numeric(my_dataframe[col], errors='coerce')
            # Gardez uniquement les films sortis à partir de l'année 2000
            my_dataframe = my_dataframe[my_dataframe[col] >= int(liste_valeur_a_garder[col][0])]

        # on ne garde que les films dont la durées est entre 2h et 2h30
        elif col=="runtimeMinutes":
            print(f"\nliste_valeur_a_garder[col][0] :\n{int(liste_valeur_a_garder[col][0])} \n")
            print(f"\nliste_valeur_a_garder[col][1] :\n{int(liste_valeur_a_garder[col][1])} \n")
            # Convertissez la colonne 'runtimeMinutes' en entier
            my_dataframe[col] = pd.to_numeric(my_dataframe[col], errors='coerce')
            # Gardez uniquement les films qui durent entre 120 et 180 minutes
            my_dataframe = my_dataframe[(my_dataframe[col] >= int(liste_valeur_a_garder[col][0])) & (my_dataframe[col] <= int(liste_valeur_a_garder[col][1]))]

        # pour chaque colonne de la liste garde uniquement les valeurs choisies
        else:
            filtre_to_keep= liste_valeur_a_garder[col]
            print(f"\n col {col} liste_valeur_to_keep :\n{filtre_to_keep} \n")
            filtre_type_category = my_dataframe[col].isin(filtre_to_keep)
            my_dataframe = my_dataframe[filtre_type_category]
    # supression des \N des colonnes restantes
    my_dataframe = my_dataframe.loc[:, ~(my_dataframe == r'\N').any()]
    print(f"\n head :\n{my_dataframe.head(5)} \n")
    #  retourne le dateframe traité
    return my_dataframe


df_title_basics3=traitement_dataframe(my_dataframe,col_to_keep,liste_valeur_to_keep)

df_title_basics3



===col_to_keep=== :
['tconst', 'titleType', 'primaryTitle', 'startYear', 'runtimeMinutes', 'genres']

==valeur_a_garder=== :
{'titleType': ['movie'], 'genres': ['Action', 'Thriller', 'Adventure', 'Sci-Fi', 'Fantasy', 'Animation', 'War', 'Family', 'Musical', 'Mystery', 'Comedy', 'Drama'], 'startYear': ['2000'], 'runtimeMinutes': ['120', '190']}

===nom=== : 
          tconst titleType primaryTitle startYear runtimeMinutes    genres
count   10461115  10461115     10461098  10461115       10461115  10461097
unique  10461115        11      4695109       153            940      2359

 col titleType liste_valeur_to_keep :
['movie'] 


 col genres liste_valeur_to_keep :
['Action', 'Thriller', 'Adventure', 'Sci-Fi', 'Fantasy', 'Animation', 'War', 'Family', 'Musical', 'Mystery', 'Comedy', 'Drama'] 


liste_valeur_a_garder[col] :
2000 


liste_valeur_a_garder[col][0] :
120 


liste_valeur_a_garder[col][1] :
190 


 head :
           tconst titleType                primaryTitle  startYear  \
676

Unnamed: 0,tconst,titleType,primaryTitle,startYear,runtimeMinutes,genres
67660,tt0069049,movie,The Other Side of the Wind,2018.0,122.0,Drama
93924,tt0096056,movie,Crime and Punishment,2002.0,126.0,Drama
135392,tt0139500,movie,In Vanda's Room,2000.0,171.0,Drama
142147,tt0146592,movie,Pál Adrienn,2010.0,136.0,Drama
155405,tt0160550,movie,Moscow,2000.0,139.0,Drama
...,...,...,...,...,...,...
10455813,tt9905412,movie,Ottam,2019.0,120.0,Drama
10457525,tt9909086,movie,Pheriaa Come Back,2018.0,137.0,Drama
10458385,tt9911006,movie,Ormma,2019.0,127.0,Drama
10458758,tt9911774,movie,Padmavyuhathile Abhimanyu,2019.0,130.0,Drama


In [None]:

for i, col_keep in enumerate(col_to_keep):
    df = traitement_dataframe(my_dataframe, col_keep, valeur_to_keep_isadult)
    if col_keep==liste_valeur_to_keep[0]:
        filtre_to_keep = liste_valeur_to_keep[1]
        filtre_type_category = df_title_principals3['category'].isin(filtre_to_keep)
        df_title_principals3 = df_title_principals3[filtre_type_category]


In [None]:
['Documentary','Short'], ['Animation','Short'], ['Animation','Comedy','Romance'], ['Comedy,Short'], 'Short', ['Short','Sport'], 'Romance', ['Documentary','Short','Sport'], ['News','Short'], ['News','Short','Sport']


FD

In [305]:
df_title_basics3["tconst"].dtype


dtype('O')

In [303]:
liste_valeurs = []
for i in range(0, df_title_basics3.shape[1]):
    split_values = [item.split(",") for item in df_title_basics3["genres"].values[0:i+10]]
    liste_valeurs.append(split_values[i])
print(f"\nliste_valeurs :{liste_valeurs} \n")



liste_valeurs :[['Documentary', 'Short'], ['Animation', 'Short'], ['Animation', 'Comedy', 'Romance'], ['Animation', 'Short'], ['Comedy', 'Short'], ['Short'], ['Short', 'Sport'], ['Documentary', 'Short'], ['Romance']] 



In [291]:
your_dataframe=df_title_basics3
for col in your_dataframe.columns:
    liste_valeurs = []
    for i in range(0, your_dataframe.shape[1]):
        print(f"\ncol :{col} \n")
        col=str(col)
        print(f"\ncol :{type(col)} {i}\n")
        split_values = [item.split(",")
                        for item in your_dataframe[col].values[0:i+10]]
        liste_valeurs.append(split_values[i])
    print(f"\nliste_valeurs :{liste_valeurs} \n")



col :tconst 


col :<class 'str'> 0



AttributeError: 'tuple' object has no attribute 'split'

In [323]:
# split_values = [item.split(" ") for item in title_basics2["genres"].values[0:5]]
# str_split_values="['"
# str_split_values += "','".join(split_values[1])
# str_split_values=str_split_values.replace(",", "','")
# str_split_values += "']"
# print(f"\nsplit_values :\n{split_values} \n")
# print(f"\nsplit_values :\n{str_split_values} \n")
df_title_basics3 = df_title_basics2.copy()

liste_valeurs=[]
for i in range(0,df_title_basics3.shape[0]):
    split_values = [item.split(" ") for item in df_title_basics3["genres"].values[0:i+10]]
    str_split_values="['"
    str_split_values += "','".join(split_values[i])
    str_split_values=str_split_values.replace(",", "','")
    str_split_values += "']"
    liste_valeurs.append(str_split_values)
    liste_valeurs[i].replace("[\"","[]")
    # print(f"\nsplit_values {i} :\n{split_values} \n")
    # print(f"\nsplit_values {i} :\n{str_split_values} \n")
print(f"\nliste_valeurs :\n{liste_valeurs} \n")
    # print(f"\n{i} :{title_basics2["genres"].values[0:5].split(" ")} \n")


# desc = title_basics2["genres"].describe()
# print(f"Count: {desc['count']}")
# print(f"Unique: {desc['unique']}")
# print(f"Top: {desc['top']}")
# print(f"Freq: {desc['freq']}")


TypeError: 'int' object is not subscriptable

In [1]:
df_title_basics3 = title_basics.copy()


NameError: name 'title_basics' is not defined

In [380]:
# Résumé des informations du dataframe
# your_dataframe = title_basics2["genres"]
liste_valeur_unique = r", ".join(f"'{item}'" for item in df_title_basics3["genres"].unique().tolist())

print(f"\nliste_valeur_unique :\n{liste_valeur_unique} \n")

# for col in your_dataframe.columns:
#    if your_dataframe[col].nunique() < 15:
#        print(r", ".join(f"'{item}'" for item in your_dataframe[col].unique().tolist()))
#    else:
#        unique_values = your_dataframe[col].unique()[:15]
#        print("====15 premiers====: \n" +", ".join(f"'{item}'" for item in unique_values))



liste_valeur_unique :
'Documentary,Short', 'Animation,Short', 'Animation,Comedy,Romance', 'Comedy,Short', 'Short', 'Short,Sport', 'Romance', 'Documentary,Short,Sport', 'News,Short', 'News,Short,Sport', 'Comedy,Documentary,Short', 'Drama,Short', 'Fantasy,Short', 'Horror,Short', 'Comedy,Horror,Short', 'Biography,Short', 'Music,Short', 'Documentary,News,Sport', 'Fantasy,Horror,Short', 'Short,War', 'Crime,Short', 'Short,Western', 'Comedy,Short,Sport', 'Comedy,Fantasy,Horror', 'Biography,Drama,Short', 'Romance,Short', 'Family,Fantasy,Romance', 'Drama,Short,War', 'Drama,Family,Fantasy', 'Adventure,Fantasy,Horror', 'Comedy,Romance,Short', 'Action,Crime,Drama', 'Comedy,Fantasy,Short', 'Animation,Comedy,Fantasy', 'Family,Short', 'Documentary,News,Short', 'Drama,History,Short', 'Action,Drama,Short', 'Crime,Drama,Short', 'Fantasy,Romance,Short', 'Drama,Fantasy,Horror', 'Drama,Horror,Short', 'Drama,Fantasy,Short', 'History,Short', 'Action,Adventure,Comedy', 'Family,Fantasy,Short', 'Action,Adventu

In [None]:
liste_valeur_unique.replace(",","','")
liste_valeur_unique
liste_valeur_unique


In [384]:
len(liste_valeur_unique)


59009

In [311]:
#  traitement des valeurs de la colonne "genres"
listes_genres_uniques=set(['Documentary','Short', 'Animation','Short', 'Animation','Comedy','Romance', 'Comedy','Short', 'Short', 'Short','Sport', 'Romance', 'Documentary','Short','Sport', 'News','Short', 'News','Short','Sport', 'Comedy','Documentary','Short', 'Drama','Short', 'Fantasy','Short', 'Horror','Short', 'Comedy','Horror','Short', 'Biography','Short', 'Music','Short', 'Documentary','News','Sport', 'Fantasy','Horror','Short', 'Short','War', 'Crime','Short', 'Short','Western', 'Comedy','Short','Sport', 'Comedy','Fantasy','Horror', 'Biography','Drama','Short', 'Romance','Short', 'Family','Fantasy','Romance', 'Drama','Short','War', 'Drama','Family','Fantasy', 'Adventure','Fantasy','Horror', 'Comedy','Romance','Short', 'Action','Crime','Drama', 'Comedy','Fantasy','Short', 'Animation','Comedy','Fantasy', 'Family','Short', 'Documentary','News','Short', 'Drama','History','Short', 'Action','Drama','Short', 'Crime','Drama','Short', 'Fantasy','Romance','Short', 'Drama','Fantasy','Horror', 'Drama','Horror','Short', 'Drama','Fantasy','Short', 'History','Short', 'Action','Adventure','Comedy', 'Family','Fantasy','Short', 'Action','Adventure','Crime', 'Adventure','Drama','Short', 'Action','Short', 'Comedy','Music','Short', 'Adventure','Fantasy','Short', 'Comedy','Family','Short', 'Drama','Short','Western', 'Crime','Drama','Family', 'Action','Adventure','Family', 'Crime','Drama','Mystery', 'Animation','Fantasy','Sci-Fi', 'Biography','Crime','Short', 'Adventure','Romance','Short', 'Comedy','Fantasy','Sci-Fi', 'Animation','Comedy','Family', 'Drama','Romance','Short', 'Animation','Family','Fantasy', 'Documentary','History','Short', 'Action','Adventure','Biography', 'Adventure','Fantasy','Sci-Fi', 'Drama', 'Documentary','Drama','Short', 'Drama','Music','Musical', 'Drama','Music','Romance', 'Adventure','Short', 'Drama','History','Romance', 'Comedy','Drama','Short', 'Adventure','Fantasy', 'Action','Short','Western', 'Action','Short','War', 'Animation','Comedy','Short', 'Comedy','Short','Western', 'Fantasy','Sci-Fi','Short', 'Mystery','Short', 'Drama','Mystery','Short', 'Action','Romance','Short', 'Adventure','Drama','Romance', 'Documentary','Short','War', 'Comedy','Drama','Romance', 'Drama','Fantasy','Romance', 'Action','Drama','History', 'Comedy', 'Drama','War', 'Documentary', 'Adventure','Comedy','Drama', 'Crime', 'Drama','Romance', 'Romance','Short','Western', 'Adventure','Drama', 'Animation','Fantasy','Horror', 'Fantasy','Horror','Sci-Fi', 'Drama','Short','Thriller', 'Biography','Drama','Family', 'Musical','Short', 'Drama','History', 'Documentary','Short','Western', 'Short','Thriller','Western', 'Action','Adventure','Drama', 'Drama','Family','Short', 'Adventure','Short','Western', 'War', 'Comedy','Sci-Fi','Short', 'Short','Thriller', 'History','Short','Western', 'Crime','History','Short', 'Biography','Drama','History', 'Sci-Fi', 'Crime','Short','Thriller', 'Crime','Romance','Short', 'Adventure','Biography','Drama', 'Adventure','Horror','Sci-Fi', 'Adventure','Drama','Fantasy', 'Comedy','Crime','Short', 'Drama','Horror','Sci-Fi', 'Biography','Drama', 'Action','Comedy','Short', 'Animation','Fantasy','Short', 'Short','Thriller','War', 'Documentary','War', 'Biography','Drama','Romance', 'History','War', 'Crime','Short','Western', 'Action','Drama','Thriller', 'Crime','Thriller', 'Adventure', 'Drama','Thriller', 'Western', 'Drama','Fantasy', 'Action','Adventure','Short', 'Adventure','Drama','Sci-Fi', 'Comedy','Drama', 'Comedy','Short','War', 'Crime','Drama', 'Comedy','Family','Romance', 'Crime','Drama','Romance', 'Comedy','Fantasy', 'Drama','Sport', 'Biography', 'Adventure','Short','Thriller', 'Action','Adventure', 'Comedy','Crime', 'Adventure','Comedy', 'Documentary','Western', 'Crime','Drama','Horror', 'Adventure','Biography','Western', 'Adventure','Crime', 'Adventure','Drama','History', 'Action','Drama','Romance', 'Crime','Short','Sport', 'Drama','Music','Western', 'Action', 'Fantasy', 'Adventure','Horror', 'Horror', 'Drama','Mystery', 'Adventure','Comedy','Family', 'Crime','Horror','Mystery', 'History','Western', 'Biography','Romance','Short', 'Comedy','Crime','Romance', 'Adventure','Family','Fantasy', 'Thriller', 'Adventure','Mystery','Romance', 'Mystery','Romance','Short', 'Mystery', 'Action','Adventure','Romance', 'Crime','Drama','Western', 'Drama','Western', 'Drama','Mystery','Romance', 'Action','Crime', 'Drama','History','War', 'Adventure','Mystery', 'Documentary','Sport', 'Comedy','Thriller', 'Drama','Sci-Fi', 'Comedy','Romance', 'Drama','Romance','War', 'Comedy','Western', 'Comedy','Family','Fantasy', 'Adventure','Romance', 'Adventure','Short','War', 'Drama','Music','Short', 'Adventure','Crime','Drama', 'Mystery','Short','Thriller', 'Romance','Western', 'Adventure','Comedy','Western', 'Animation','Drama','Short', 'Crime','Mystery', 'Adventure','Drama','Mystery', 'Drama','Horror', 'Mystery','Thriller', 'Adventure','Romance','Western', 'Biography','Crime','Drama', 'Comedy','Drama','Sport', 'Adventure','Crime','Thriller', 'Drama','Sci-Fi','Short', 'Comedy','Short','Thriller', 'Action','Adventure','Sci-Fi', 'Action','Drama', 'Adventure','Comedy','Romance', 'Action','Drama','War', 'Adventure','History', 'Crime','Drama','Thriller', 'Horror','Mystery', 'Biography','Comedy','Drama', 'Action','Biography','Western', 'Adventure','Western', 'Comedy','Crime','Drama', 'Comedy','Sport', 'Drama','Romance','Thriller', 'Horror','Sci-Fi', 'Action','Adventure','War', 'Drama','Romance','Western', 'Action','Comedy','Romance', 'Comedy','Sci-Fi', 'Comedy','Drama','Mystery', 'Comedy','Romance','Western', 'Action','Romance','Western', 'Action','Comedy','Drama', 'Adventure','Romance','Thriller', 'Fantasy','Sci-Fi', 'Adventure','Drama','Western', 'Fantasy','Romance', 'Drama','Thriller','War', 'Sci-Fi','Short', 'History', 'Comedy','Drama','Western', 'Animation','Comedy','Drama', 'Adventure','Comedy','Short', 'Fantasy','Thriller', 'Biography','Western', 'Comedy','Fantasy','Romance', 'Comedy','Mystery', 'Adventure','Family', 'Action','Thriller', 'Drama','Horror','Mystery', 'Drama','History','Western', 'Adventure','War', 'Drama','Romance','Sport', 'Comedy','Drama','Family', 'Adventure','Crime','Romance', 'Comedy','Mystery','Thriller', 'Drama','Fantasy','Sci-Fi', 'Music', 'Mystery','War', 'History','Romance', 'Documentary','History','War', 'Adventure','Comedy','Crime', 'Animation', 'Family','Fantasy', 'Action','War','Western', 'Biography','Drama','War', 'Comedy','Romance','War', 'Crime','Thriller','Western', 'Comedy','War', 'Drama','Family','Romance', 'Drama','Family', 'War','Western', 'Crime','Drama','War', 'Animation','Comedy', 'Adventure','Fantasy','Romance', 'Action','Western', 'Comedy','Drama','War', 'Animation','Short','War', 'Crime','Mystery','Thriller', 'Crime','Drama','History', 'Action','Crime','Thriller', 'Action','Drama','Western', 'Comedy','Crime','Western', 'Crime','Western', 'Romance','Thriller', 'Drama','Horror','War', 'Horror','Mystery','Thriller', 'Adventure','Horror','Romance', 'Action','Adventure','Mystery', 'Drama','History','Horror', 'Drama','Horror','Romance', 'Comedy','Drama','Music', 'Action','Mystery','Thriller', 'Fantasy','Horror','Mystery', 'Adventure','Mystery','Western', 'Action','Adventure','Fantasy', 'Adventure','Fantasy','Mystery', 'Action','Adventure','Thriller', 'Adventure','Drama','War', 'Fantasy','Horror', 'Comedy','Drama','Horror', 'Adventure','Sci-Fi', 'Crime','Horror', 'Drama','Music','War', 'Crime','Drama','Fantasy', 'Action','Adventure','Western', 'Horror','Sci-Fi','Short', 'Animation','Comedy','Mystery', 'Sport', 'Adventure','Documentary', 'Action','Crime','Mystery', 'Animation','Comedy','Sci-Fi', 'Drama','Horror','Thriller', 'Comedy','Drama','Thriller', 'Adventure','Comedy','Fantasy', 'Comedy','Drama','Fantasy', 'Animation','Documentary','Short', 'Fantasy','Mystery', 'Crime','Romance','Thriller', 'Action','Sport','Thriller', 'Mystery','Western', 'Action','Drama','Family', 'Animation','Fantasy','Romance', 'Comedy','Crime','Mystery', 'Comedy','Horror', 'Biography','Documentary', 'Comedy','Family', 'Documentary','Fantasy','Horror', 'Crime','Romance', 'Mystery','Sci-Fi', 'Comedy','Horror','Mystery', 'Adventure','Family','Romance', 'Crime','Drama','Sport', 'Action','Comedy', 'Action','Drama','Mystery', 'Adventure','Crime','Mystery', 'Animation','Comedy','History', 'Adventure','Comedy','Mystery', 'Adventure','Mystery','Thriller', 'Action','Comedy','Sport', 'Documentary','Music','Short', 'Comedy','Romance','Thriller', 'Biography','Music', 'Action','Comedy','Thriller', 'Action','Mystery','Romance', 'Comedy','Romance','Sport', 'Drama','Romance','Sci-Fi', 'Mystery','Romance', 'Action','Drama','Sport', 'Comedy','Mystery','Romance', 'Sport','Western', 'Action','Romance','Sport', 'Animation','Music','Short', 'Adventure','Animation','Family', 'Adventure','War','Western', 'Drama','History','Thriller', 'Comedy','Horror','Sci-Fi', 'Drama','Thriller','Western', 'Comedy','Family','Musical', 'Action','Drama','Sci-Fi', 'Documentary','Drama', 'Adventure','Family','Western', 'Adventure','Drama','Horror', 'Adventure','Romance','War', 'Animation','Family','Short', 'Action','Comedy','Crime', 'Action','History','Romance', 'Biography','History','Western', 'Action','Crime','Romance', 'Comedy','Musical', 'Action','Comedy','War', 'Action','Drama','Fantasy', 'Horror','Thriller', 'Animation','Comedy','Music', 'Adventure','Documentary','Drama', 'Drama','Family','War', 'Adventure','Drama','Thriller', 'Drama','Fantasy','War', 'Documentary','History', 'Action','Comedy','Western', 'Fantasy','Music','Romance', 'Crime','Drama','Film-Noir', 'Horror','Mystery','Sci-Fi', 'Adventure','Mystery','War', 'Musical', 'Crime','Drama','Music', 'Family', 'Drama','Musical', 'Action','Comedy','Family', 'Drama','Musical','Romance', 'Comedy','Musical','Short', 'Crime','Musical','War', 'Action','Drama','Music', 'Crime','Music','Romance', 'Comedy','Music','Romance', 'Comedy','Drama','Musical', 'Mystery','Romance','Thriller', 'Comedy','Musical','Romance', 'Action','Musical','Romance', 'Music','Romance','War', 'Animation','Documentary','Music', 'Adventure','Drama','Music', 'Musical','Romance','Western', 'Comedy','Music', 'Musical','Romance', 'Drama','Mystery','Thriller', 'Adventure','Romance','Sci-Fi', 'Crime','Mystery','Romance', 'Comedy','Musical','Western', 'Documentary','History','Sport', 'Music','Romance','Sport', 'Action','Comedy','Music', 'Mystery','Romance','Western', 'Drama','Music', 'Fantasy','Musical','Short', 'Animation','Comedy','Horror', 'Drama','Musical','Mystery', 'Music','Romance', 'Comedy','Fantasy','Musical', 'Adventure','Fantasy','Musical', 'Biography','Drama','Musical', 'Comedy','Crime','Sci-Fi', 'Comedy','Crime','Family', 'Family','Western', 'Biography','Musical', 'Animation','Family','Musical', 'Drama','Family','Western', 'Action','Adventure','Music', 'Animation','Comedy','Musical', 'Music','Musical','Romance', 'Music','Romance','Short', 'Crime','Drama','Musical', 'Adventure','Animation','Comedy', 'Action','Romance','Thriller', 'Action','Adventure','Musical', 'Comedy','Musical','Sport', 'Action','Mystery', 'Adventure','Comedy','Music', 'Adventure','Musical','Romance', 'Crime','Musical', 'Drama','Family','Sport', 'Drama','Musical','Short', 'Adventure','Drama','Family', 'Drama','Short','Sport', 'Comedy','Horror','Thriller', 'Adventure','Comedy','Musical', 'Crime','Documentary', 'Action','Adventure','Horror', 'Comedy','History','Musical', 'Comedy','Film-Noir', 'Drama','Mystery','Western', 'Animation','Family','Music', 'Mystery','Sport', 'Drama','Fantasy','Mystery', 'Comedy','Crime','Horror', 'Animation','Musical','Short', 'Drama','War','Western', 'Comedy','Crime','Musical', 'Crime','Film-Noir','Romance', 'Animation','Short','Western', 'Adventure','Family','Short', 'Comedy','Fantasy','Music', 'Adventure','Drama','Film-Noir', 'Drama','Mystery','Sci-Fi', 'Adventure','Thriller','Western', 'Adventure','Documentary','Short', 'Musical','Romance','Short', 'Animation','Family','Horror', 'Drama','Film-Noir','Romance', 'History','Musical', 'Comedy','Music','Mystery', 'Action','Drama','Horror', 'Action','Crime','Music', 'Film-Noir','Horror','Sci-Fi', 'Biography','Drama','Music', 'Comedy','Documentary', 'Romance','Sport', 'Adventure','Music','Romance', 'Drama','Sci-Fi','War', 'Drama','Family','Mystery', 'Crime','Music','Mystery', 'Music','Romance','Western', 'Comedy','Drama','History', 'Adventure','Family','Horror', 'Biography','Music','Romance', 'Comedy','Music','Musical', 'Action','Crime','Family', 'Action','Romance', 'Comedy','Family','Mystery', 'Drama','History','Music', 'Comedy','Musical','Mystery', 'Crime','Horror','Romance', 'Comedy','Mystery','Short', 'Drama','Mystery','Sport', 'Crime','Sci-Fi', 'Action','Comedy','Mystery', 'Action','Crime','Western', 'Adventure','Animation','Drama', 'Biography','Comedy', 'Comedy','Family','Music', 'Crime','Music','Musical', 'Drama','Music','Mystery', 'Drama','History','Musical', 'Crime','Musical','Romance', 'Biography','Drama','Western', 'Action','Comedy','Documentary', 'Documentary','Music', 'Adventure','Drama','Musical', 'Music','Musical','Short', 'Action','Adventure','History', 'Animation','Fantasy','Musical', 'Family','Musical','Romance', 'Action','Biography','Drama', 'Comedy','History','Romance', 'Horror','Romance','Sci-Fi', 'Musical','Romance','War', 'Biography','Musical','Short', 'Comedy','Crime','Music', 'Musical','Short','Western', 'Comedy','History','Mystery', 'Adventure','Comedy','Sci-Fi', 'Drama','News','War', 'Family','Musical', 'Musical','Mystery','Romance', 'Family','Sci-Fi', 'Comedy','Crime','Film-Noir', 'Comedy','Family','War', 'Animation','Comedy','Crime', 'Horror','Sci-Fi','Thriller', 'Biography','Musical','Romance', 'Drama','Family','Musical', 'Comedy','Family','Sport', 'Crime','Film-Noir','Mystery', 'Adventure','Family','Musical', 'Adventure','Family','Sci-Fi', 'Comedy','Music','War', 'Action','Drama','Musical', 'Adventure','Crime','Sci-Fi', 'Action','Music','Romance', 'Drama','History','Mystery', 'Adventure','History','Romance', 'History','Musical','Romance', 'Mystery','Romance','Sport', 'Crime','Mystery','Sci-Fi', 'Adventure','Music','Musical', 'Comedy','Thriller','War', 'Animation','Drama','Family', 'Musical','Western', 'Documentary','Musical','Short', 'Crime','Film-Noir','Thriller', 'Comedy','History', 'Drama','Film-Noir','Mystery', 'Comedy','Family','Western', 'Action','Music','Western', 'Adventure','Biography','Romance', 'Comedy','Musical','War', 'Comedy','Music','Sport', 'Biography','War', 'Drama','Fantasy','Musical', 'Musical','Romance','Sport', 'Comedy','Family','Horror', 'Romance','War', 'Romance','Sci-Fi', 'Drama','Film-Noir', 'Action','History', 'Biography','Drama','Sport', 'Animation','Music','Romance', 'Documentary','Drama','News', 'Animation','Short','Sport', 'Crime','Film-Noir', 'Comedy','Crime','Sport', 'Biography','Short','War', 'Biography','Documentary','History', 'Biography','Music','Musical', 'Adventure','Crime','Family', 'Adventure','Comedy','War', 'Crime','Horror','Sci-Fi', 'History','War','Western', 'Adventure','Thriller', 'Drama','Film-Noir','Thriller', 'Action','Family','Horror', 'Musical','Romance','Thriller', 'Animation','Horror','Short', 'Thriller','War', 'Biography','Comedy','Music', 'Drama','Family','Music', 'Action','Music', 'Adventure','Comedy','Horror', 'Drama','Fantasy','History', 'Drama','Fantasy','Music', 'Biography','Documentary','Drama', 'Romance','Thriller','War', 'Sport','Thriller', 'Comedy','Romance','Sci-Fi', 'Comedy','Crime','Thriller', 'Action','Music','Sci-Fi', 'Animation','Family','Romance', 'History','Music','Romance', 'Music','Short','Western', 'Animation','Romance','Short', 'Documentary','Family','Music', 'Action','Comedy','Musical', 'Mystery','Thriller','War', 'Action','Adventure','Animation', 'Comedy','Family','Sci-Fi', 'Drama','Mystery','War', 'Action','Documentary', 'Film-Noir','Mystery','Thriller', 'Comedy','Fantasy','Mystery', 'Horror','Mystery','Romance', 'Action','Crime','Musical', 'Fantasy','Horror','Thriller', 'Biography','History', 'Adventure','Animation','Short', 'Comedy','Drama','Film-Noir', 'Animation','Documentary','Family', 'History','Short','War', 'Action','Adventure','Documentary', 'Family','History','Short', 'Animation','Comedy','Documentary', 'Film-Noir','Mystery','Romance', 'Drama','Musical','Western', 'Comedy','History','Music', 'History','Romance','Sci-Fi', 'Music','Short','War', 'Film-Noir','Horror','Mystery', 'Action','Documentary','Drama', 'Documentary','News', 'Film-Noir','Mystery', 'Fantasy','Music','Short', 'Adventure','Biography','War', 'Film-Noir','Horror','Thriller', 'Action','War', 'Fantasy','Mystery','Short', 'Musical','War', 'Action','Crime','Film-Noir', 'Drama','Horror','Music', 'Film-Noir','Thriller', 'Biography','Drama','Mystery', 'Animation','Documentary','History', 'Crime','Horror','Thriller', 'Horror','Music','Thriller', 'Comedy','Horror','Romance', 'Adventure','Musical','War', 'Film-Noir','Romance','Thriller', 'Adventure','Sci-Fi','War', 'Drama','Fantasy','Thriller', 'Drama','Film-Noir','Horror', 'Mystery','Romance','War', 'Action','Fantasy','Horror', 'Adventure','Comedy','Film-Noir', 'Adventure','Crime','Western', 'Drama','Horror','Western', 'Adventure','Film-Noir','Mystery', 'Adventure','Animation','Musical', 'Adventure','Comedy','Thriller', 'Action','Crime','Horror', 'Action','Sci-Fi', 'Comedy','Music','Western', 'Music','Musical', 'Drama','Musical','War', 'Fantasy','Musical', 'Fantasy','Musical','Romance', 'Talk-Show', 'Adventure','Crime','Film-Noir', 'Comedy','Sci-Fi','War', 'Adventure','Film-Noir','Romance', 'Drama','Film-Noir','Music', 'Adventure','Crime','Horror', 'Drama','Musical','Thriller', 'Adventure','Comedy','History', 'Comedy','Fantasy','History', 'Comedy','Film-Noir','Mystery', 'Family','Game-Show', 'Documentary','Musical', 'Family','Fantasy','Musical', 'Drama','History','Sport', 'Drama','Film-Noir','Sport', 'Action','Comedy','Fantasy', 'Adventure','Comedy','Sport', 'Mystery','Sci-Fi','Thriller', 'Adventure','Family','Mystery', 'Family','Talk-Show', 'Comedy','Family','Reality-TV', 'Game-Show', 'Music','Talk-Show', 'Action','Biography','Crime', 'Biography','Crime','History', 'Action','Drama','Film-Noir', 'Documentary','Family','Short', 'Comedy','History','War', 'Drama','Family','History', 'Adventure','Documentary','Thriller', 'Biography','Comedy','Musical', 'Family','Music', 'Comedy','Family','Talk-Show', 'Comedy','Family','Game-Show', 'Game-Show','Music', 'Drama','Fantasy','Film-Noir', 'Adventure','Animation', 'Action','Crime','Sci-Fi', 'History','Romance','Thriller', 'Crime','Film-Noir','Sport', 'Drama','Sport','Thriller', 'Comedy','Music','Talk-Show', 'Family','Music','Western', 'Sci-Fi','Thriller', 'Drama','Sport','War', 'Animation','Musical', 'Comedy','Game-Show', 'News','Talk-Show', 'Drama','Game-Show','Mystery', 'Family','Romance', 'Action','Family','Sci-Fi', 'Horror','Mystery','Short', 'Fantasy','Music','Musical', 'Biography','History','Music', 'Action','Sci-Fi','Thriller', 'Comedy','Crime','Fantasy', 'Family','Game-Show','Sport', 'Documentary','Family', 'Family','Game-Show','Music', 'Biography','Family','Reality-TV', 'News', 'Comedy','History','Short', 'Music','Western', 'Adventure','Horror','Mystery', 'Film-Noir','Western', 'Biography','Family','Musical', 'Comedy','Drama','Sci-Fi', 'Animation','History','Short', 'Thriller','War','Western', 'Comedy','Reality-TV', 'Documentary','News','Talk-Show', 'Animation','Family', 'Family','Fantasy','Music', 'Drama','Sport','Western', 'Crime','History','Musical', 'Horror','Romance', 'Crime','Family','Mystery', 'Biography','Drama','Film-Noir', 'History','Romance','Western', 'Comedy','Horror','Musical', 'Animation','Fantasy', 'Adventure','Biography','History', 'Adventure','Documentary','Fantasy', 'Animation','Crime','Horror', 'Romance','War','Western', 'Comedy','Talk-Show', 'Action','Documentary','War', 'Action','Short','Sport', 'Adventure','History','War', 'Crime','Drama','Sci-Fi', 'Adventure','Family','History', 'Comedy','Fantasy','Sport', 'Action','Fantasy','Sci-Fi', 'Animation','Drama', 'Comedy','Musical','Sci-Fi', 'History','Romance','War', 'Adventure','Musical', 'Documentary','Horror','Short', 'History','Thriller', 'Crime','Documentary','Short', 'Game-Show','Reality-TV', 'Crime','Sci-Fi','Thriller', 'Action','Family','Short', 'Biography','Documentary','Short', 'Action','Horror','Sci-Fi', 'Film-Noir','Horror', 'Crime','Family', 'Comedy','Documentary','Music', 'Crime','Romance','Western', 'Biography','Drama','Thriller', 'Adventure','Crime','Fantasy', 'Biography','Documentary','Music', 'Family','Music','Talk-Show', 'Crime','War', 'Family','Horror','Sci-Fi', 'Adventure','Fantasy','History', 'Action','Romance','War', 'Adventure','Family','War', 'Fantasy','Musical','Mystery', 'Comedy','Music','Thriller', 'Crime','Fantasy','Horror', 'Horror','Sci-Fi','Western', 'Drama','Sci-Fi','Thriller', 'Biography','Crime','Film-Noir', 'Musical','Thriller', 'Horror','Mystery','Western', 'Documentary','Drama','Romance', 'Biography','Sci-Fi', 'Action','Biography','History', 'Documentary','Romance', 'Action','Horror','Romance', 'Animation','Sci-Fi','Short', 'Documentary','Drama','War', 'Adventure','Sci-Fi','Thriller', 'Horror','Western', 'Documentary','Talk-Show', 'Game-Show','Sport', 'Family','War', 'Documentary','History','News', 'Biography','Fantasy', 'Adventure','Comedy','Documentary', 'Family','Fantasy','History', 'Animation','Documentary', 'Comedy','Sci-Fi','Thriller', 'Action','Horror', 'Adventure','History','Mystery', 'Mystery','Thriller','Western', 'Action','Animation','Drama', 'Animation','Fantasy','Music', 'Family','Fantasy','Horror', 'Musical','Mystery','Thriller', 'Drama','Musical','Sport', 'Action','Fantasy', 'Fantasy','Romance','Sci-Fi', 'Adventure','Animation','Biography', 'Biography','Sport', 'Action','Animation','Short', 'Documentary','Horror', 'Documentary','Thriller', 'Comedy','Documentary','Drama', 'Fantasy','Mystery','Western', 'Comedy','War','Western', 'Adventure','Biography','Comedy', 'Fantasy','Music', 'Biography','Documentary','War', 'Horror','Musical', 'Adventure','Fantasy','War', 'Documentary','Music','News', 'Documentary','History','Music', 'Action','Animation','Comedy', 'Game-Show','Romance', 'Animation','Crime','Family', 'Action','Mystery','Sci-Fi', 'Biography','Crime','Thriller', 'Action','Horror','Mystery', 'Family','Sport', 'Adventure','Animation','Fantasy', 'Adventure','Biography', 'Biography','Drama','Horror', 'Action','Thriller','War', 'Adventure','Crime','History', 'Action','Crime','Fantasy', 'Action','Animation','Crime', 'Animation','Family','Western', 'Action','Animation','Sci-Fi', 'Animation','Family','Sci-Fi', 'Adventure','Biography','Family', 'Adventure','Documentary','Family', 'Adventure','Animation','Sci-Fi', 'Comedy','Music','Sci-Fi', 'Action','Comedy','Horror', 'Animation','Mystery','Short', 'Comedy','Fantasy','Western', 'Action','Sport', 'Comedy','Documentary','Family', 'Action','Family','Fantasy', 'Action','Animation','History', 'Drama','Family','Thriller', 'Action','Comedy','Sci-Fi', 'Crime','Musical','Mystery', 'Adult','Short', 'Fantasy','Sci-Fi','Thriller', 'Biography','History','Romance', 'Adult','Drama', 'Documentary','Drama','History', 'Action','Thriller','Western', 'Biography','Comedy','Documentary', 'Action','History','War', 'Adventure','Musical','Mystery', 'Biography','Comedy','History', 'Animation','History', 'Biography','Horror', 'Drama','Fantasy','Sport', 'Adventure','Biography','Crime', 'Animation','Family','Sport', 'Adult', 'Animation','Drama','Sport', 'Adult','Fantasy','Horror', 'Thriller','Western', 'Musical','Sci-Fi', 'Documentary','Family','Sport', 'Documentary','Music','Thriller', 'Adult','Documentary', 'Adult','Comedy','Drama', 'Adult','Comedy', 'Adventure','Family','Sport', 'Comedy','Documentary','Sport', 'Fantasy','Mystery','Thriller', 'Biography','Documentary','Western', 'Biography','Romance', 'Adult','Adventure','Comedy', 'Action','Animation','Family', 'Adventure','Crime','Musical', 'Action','Adult','Western', 'Documentary','Drama','Music', 'Biography','Crime','Documentary', 'Adult','Horror', 'Adventure','Documentary','Romance', 'Horror','Romance','Thriller', 'Action','Adult','Drama', 'Family','History','Sci-Fi', 'Action','Musical','Thriller', 'Crime','Horror','Musical', 'Animation','Family','Mystery', 'Adventure','Documentary','History', 'Action','Documentary','Sport', 'Family','Fantasy','Mystery', 'Fantasy','Horror','Romance', 'Horror','Thriller','Western', 'Action','Biography','Comedy', 'Adventure','Animation','Crime', 'Animation','Documentary','Drama', 'Reality-TV', 'Family','Game-Show','Mystery', 'Adventure','History','Musical', 'Adult','Drama','Fantasy', 'Horror','Musical','Sci-Fi', 'Animation','Biography','Short', 'Biography','Documentary','Sport', 'Documentary','Drama','Fantasy', 'Animation','Sci-Fi', 'Adult','Drama','Romance', 'Adult','Crime','Horror', 'Biography','Documentary','Family', 'Adult','Comedy','Sci-Fi', 'Animation','Drama','Fantasy', 'Action','Horror','Thriller', 'Documentary','Mystery', 'Comedy','Documentary','Horror', 'Adult','Comedy','Romance', 'Adult','Adventure', 'Adult','Western', 'Music','Sci-Fi', 'Documentary','Family','Musical', 'Adult','Animation','Drama', 'Family','Fantasy','Sci-Fi', 'Documentary','History','Reality-TV', 'Adult','Horror','Thriller', 'Adult','Biography','Documentary', 'Animation','Documentary','Musical', 'Adult','Fantasy', 'Biography','Comedy','Crime', 'Adult','Crime', 'Adult','Crime','Drama', 'Fantasy','Western', 'Action','Sci-Fi','Sport', 'Adult','Drama','Thriller', 'Adult','Comedy','Horror', 'Documentary','Sci-Fi', 'Crime','History', 'Comedy','News', 'Adult','Talk-Show', 'Adult','Comedy','Fantasy', 'Horror','Thriller','War', 'Adult','Drama','Horror', 'Adult','Crime','Mystery', 'Horror','War', 'Adventure','Horror','Thriller', 'Adult','Comedy','Musical', 'Action','Fantasy','Musical', 'Comedy','Mystery','Sci-Fi', 'Documentary','Horror','Music', 'Adult','Romance', 'Biography','History','Talk-Show', 'Family','News','Sport', 'Family','Romance','Sci-Fi', 'Animation','Horror','Mystery', 'Fantasy','Musical','Sci-Fi', 'Adult','Adventure','Drama', 'Comedy','Crime','History', 'Adult','Thriller', 'Action','Animation','Fantasy', 'Biography','Family', 'Comedy','Documentary','Romance', 'Documentary','Mystery','Sci-Fi', 'Family','Musical','Short', 'Adult','Drama','War', 'Biography','History','War', 'Adult','Drama','Musical', 'Crime','Mystery','Western', 'Adult','Mystery', 'Fantasy','Horror','Music', 'Family','Musical','Mystery', 'Adult','Comedy','Western', 'Crime','History','Mystery', 'Adult','Comedy','Crime', 'Documentary','Reality-TV', 'Adventure','Crime','Sport', 'Documentary','Drama','Western', 'Family','Mystery', 'Drama','Sci-Fi','Sport', 'Comedy','Drama','News', 'Animation','Romance','Sci-Fi', 'Adventure','Mystery','Sci-Fi', 'Fantasy','History','Romance', 'Fantasy','Mystery','Sci-Fi', 'Music','News', 'Music','Musical','Sci-Fi', 'Action','Comedy','History', 'Adult','Crime','Romance', 'Action','Adult','Adventure', 'Horror','Music','Short', 'Crime','Horror','Music', 'Action','Crime','Sport', 'Family','Horror','Mystery', 'Comedy','Game-Show','Music', 'Animation','Drama','History', 'Crime','Music','Thriller', 'Action','Romance','Sci-Fi', 'Fantasy','Romance','Thriller', 'Adult','Sci-Fi', 'Comedy','Game-Show','Talk-Show', 'Documentary','Family','History', 'Drama','Music','Sport', 'Action','Fantasy','Romance', 'Comedy','Documentary','Musical', 'Crime','Fantasy','Mystery', 'Drama','Horror','Musical', 'Music','Sci-Fi','Short', 'Crime','Sport', 'Animation','Biography','Drama', 'Adult','Drama','Mystery', 'Adult','Crime','Thriller', 'Animation','Biography','Documentary', 'Comedy','Family','History', 'Adult','Drama','Music', 'Action','Musical', 'Horror','Sport', 'Horror','Music', 'Action','Fantasy','Thriller', 'Fantasy','Mystery','Romance', 'Documentary','Horror','Thriller', 'Animation','Crime','Sci-Fi', 'Adult','Music', 'Action','Crime','Short', 'Adventure','Sci-Fi','Sport', 'Comedy','Horror','Music', 'History','Mystery', 'Biography','Crime', 'Action','Sci-Fi','Western', 'Music','Mystery','Thriller', 'Animation','Drama','War', 'Adult','Fantasy','Sci-Fi', 'Adventure','Animation','Western', 'Crime','Fantasy', 'Action','Crime','War', 'Action','Family','Sport', 'Action','Animation','Music', 'Documentary','Family','Fantasy', 'Action','Animation','Horror', 'Animation','Crime','Documentary', 'Animation','Drama','Horror', 'Crime','Family','Thriller', 'Animation','Drama','Sci-Fi', 'Drama','Music','Sci-Fi', 'Adventure','Sci-Fi','Short', 'Action','Horror','War', 'Animation','Crime','Drama', 'Documentary','Romance','Short', 'Comedy','Reality-TV','Talk-Show', 'Crime','Documentary','Drama', 'Action','Horror','Western', 'Adult','Adventure','Mystery', 'Adventure','Drama','Sport', 'Horror','Music','Mystery', 'Action','History','Thriller', 'Animation','Biography','Comedy', 'Action','Game-Show','Sport', 'Action','Crime','Reality-TV', 'Action','Mystery','Western', 'Action','Biography', 'Biography','Fantasy','Mystery', 'Adult','Comedy','Mystery', 'Comedy','Horror','Western', 'Documentary','History','Romance', 'Horror','Short','Thriller', 'Game-Show','Music','Reality-TV', 'Comedy','Game-Show','News', 'Documentary','Drama','Mystery', 'Sci-Fi','Short','Thriller', 'Action','Animation','Romance', 'Comedy','Horror','Sport', 'Adventure','Family','Thriller', 'Music','Thriller', 'Adventure','Animation','Horror', 'Adult','Drama','Western', 'Comedy','Fantasy','Thriller', 'Crime','Reality-TV', 'Crime','Drama','Game-Show', 'Biography','Documentary','Romance', 'Animation','Drama','Romance', 'Drama','Music','Thriller', 'Crime','Documentary','History', 'Documentary','History','Western', 'Drama','Reality-TV', 'Documentary','History','Mystery', 'Crime','Documentary','Mystery', 'Family','Thriller', 'Biography','Horror','Mystery', 'Action','Musical','Mystery', 'Adult','Documentary','Short', 'Horror','Romance','Western', 'Documentary','Drama','Sci-Fi', 'Documentary','Drama','Sport', 'Animation','Fantasy','Mystery', 'Animation','Drama','Thriller', 'Action','Biography','Documentary', 'Biography','Drama','Fantasy', 'Biography','Comedy','Fantasy', 'Documentary','Fantasy','Short', 'Biography','Comedy','Thriller', 'Adult','Animation','Comedy', 'Adventure','History','Western', 'Adult','Animation','Fantasy', 'Action','Fantasy','History', 'Drama','Talk-Show', 'Comedy','News','Talk-Show', 'Comedy','Documentary','News', 'Comedy','Sci-Fi','Talk-Show', 'Action','Family', 'Romance','Sci-Fi','Thriller', 'Adventure','Documentary','Sport', 'Animation','Drama','Mystery', 'Biography','Documentary','Mystery', 'Documentary','Family','Reality-TV', 'Crime','Horror','Short', 'Adult','History', 'Biography','Comedy','Romance', 'Animation','Comedy','Thriller', 'Family','News', 'Reality-TV','Sport', 'Adventure','Animation','Romance', 'Comedy','Sci-Fi','Western', 'Biography','Thriller', 'Comedy','Game-Show','Sport', 'Sci-Fi','Western', 'Biography','Documentary','Horror', 'Animation','Romance', 'Adult','Drama','Sci-Fi', 'Animation','Sci-Fi','Thriller', 'Biography','Drama','Sci-Fi', 'Family','Short','Sport', 'Adventure','Sci-Fi','Western', 'Action','Adult','Crime', 'Drama','Musical','Sci-Fi', 'Documentary','Romance','War', 'Action','Sci-Fi','War', 'Action','Animation', 'Comedy','Crime','Reality-TV', 'Documentary','Thriller','War', 'Adult','Adventure','Fantasy', 'Drama','Family','Horror', 'Reality-TV','Sci-Fi', 'Action','Sci-Fi','Short', 'Reality-TV','Thriller', 'Documentary','Drama','Thriller', 'Comedy','Family','Thriller', 'Comedy','Documentary','History', 'Action','Comedy','Talk-Show', 'Reality-TV','Romance', 'Comedy','Drama','Talk-Show', 'Adult','Fantasy','Romance', 'Game-Show','Mystery', 'Action','Family','Game-Show', 'Adult','Animation', 'Action','Adult','Fantasy', 'Adult','Fantasy','Mystery', 'Documentary','Horror','Sci-Fi', 'Adult','Biography','Comedy', 'Adult','Romance','Sport', 'Adult','Adventure','Sci-Fi', 'Adult','Mystery','Thriller', 'Adult','Comedy','Music', 'Adult','Sci-Fi','Thriller', 'Adventure','Sport', 'Comedy','Crime','Documentary', 'Adult','Romance','Sci-Fi', 'Adult','Crime','Short', 'Adult','Comedy','War', 'Adult','Horror','Mystery', 'Adult','Horror','Short', 'Adult','Romance','Short', 'Adventure','Family','Music', 'Action','Adult', 'Sci-Fi','Short','Sport', 'Adult','Mystery','Romance', 'Action','Horror','Short', 'Adult','Biography','Drama', 'Action','Adult','War', 'Animation','Biography','History', 'Action','Adult','Sci-Fi', 'Adult','Mystery','Sci-Fi', 'Animation','Thriller', 'Adult','Sport', 'Animation','Family','History', 'Adult','Comedy','History', 'Adult','Adventure','Romance', 'Action','Game-Show','Sci-Fi', 'Family','Horror','Music', 'Adult','Horror','Sci-Fi', 'Adventure','Biography','Documentary', 'Adventure','Documentary','News', 'Crime','News', 'Fantasy','Short','Thriller', 'Biography','History','Short', 'Animation','Sport', 'Comedy','Music','News', 'Family','Music','Short', 'Adult','Animation','Horror', 'Documentary','Mystery','Short', 'Animation','Biography','Family', 'Family','Fantasy','Sport', 'Reality-TV','Short', 'Adult','Animation','Short', 'Crime','Music', 'Biography','Music','Short', 'News','Sport', 'Game-Show','Music','Talk-Show', 'News','Sport','Talk-Show', 'Adult','Comedy','Short', 'Adult','Animation','Sci-Fi', 'Adult','Comedy','Thriller', 'Action','Adult','Comedy', 'Family','Reality-TV','Talk-Show', 'Crime','Documentary','War', 'Documentary','Music','Romance', 'Crime','Documentary','Fantasy', 'Crime','Mystery','Short', 'Drama','Family','Sci-Fi', 'Action','Game-Show','Reality-TV', 'Adult','Comedy','Documentary', 'Animation','Comedy','Western', 'Animation','Drama','Musical', 'Adventure','Music','Sci-Fi', 'Documentary','Drama','Family', 'Animation','Crime','Short', 'Action','Adult','Animation', 'Animation','Short','Thriller', 'Biography','Comedy','Short', 'Adult','Crime','Fantasy', 'Action','Musical','War', 'Adventure','History','Thriller', 'Documentary','Fantasy','Romance', 'Drama','Fantasy','Western', 'Animation','Horror','Sci-Fi', 'Action','Animation','Mystery', 'Comedy','Sport','Talk-Show', 'Game-Show','Short', 'Animation','Comedy','Sport', 'Adult','Adventure','Crime', 'Animation','Music','Sci-Fi', 'Adult','Drama','Sport', 'Adult','Biography', 'Adult','Fantasy','Music', 'Sport','Talk-Show', 'Adult','Reality-TV', 'Comedy','Documentary','Talk-Show', 'Horror','Music','Sci-Fi', 'Game-Show','Talk-Show', 'Action','Game-Show', 'Crime','Fantasy','Musical', 'Documentary','News','War', 'Mystery','Sci-Fi','Short', 'Animation','Family','Game-Show', 'Biography','Talk-Show', 'Short','Talk-Show', 'Documentary','News','Reality-TV', 'Adventure','Romance','Sport', 'Animation','Crime', 'Animation','News', 'Fantasy','History','Horror', 'Adventure','Animation','Game-Show', 'Adventure','Game-Show', 'Crime','Mystery','Sport', 'Biography','Short','Western', 'Action','Family','Thriller', 'Game-Show','Reality-TV','Sport', 'Adventure','Game-Show','Reality-TV', 'Family','Reality-TV', 'Family','Music','Reality-TV', 'Documentary','Reality-TV','Short', 'Adult','Documentary','History', 'Family','Horror', 'Game-Show','Sport','Western', 'Adventure','Crime','Short', 'Comedy','Documentary','Fantasy', 'Fantasy','History', 'Adventure','Crime','Documentary', 'Adult','Comedy','Sport', 'Comedy','Drama','Game-Show', 'Adult','Musical', 'Adult','War', 'Adult','Adventure','Animation', 'History','Music','Short', 'Action','Crime','Documentary', 'Adult','Romance','Thriller', 'Adult','Animation','Romance', 'Documentary','History','Musical', 'Drama','Sci-Fi','Western', 'Family','Sci-Fi','Short', 'Fantasy','War', 'Adult','Documentary','Horror', 'Adult','Drama','History', 'Animation','Documentary','War', 'Adult','Romance','Western', 'Family','Music','Musical', 'Adult','Adventure','Short', 'History','Mystery','Short', 'Documentary','Music','Sport', 'Game-Show','Music','Mystery', 'Adult','Crime','Talk-Show', 'Comedy','Horror','News', 'Documentary','Family','Game-Show', 'Family','Romance','Short', 'Documentary','Horror','Mystery', 'Documentary','Fantasy','Music', 'Biography','Horror','Thriller', 'Family','Sci-Fi','Thriller', 'Action','Reality-TV','Sci-Fi', 'Fantasy','Sci-Fi','Talk-Show', 'Crime','Family','Short', 'Documentary','Sci-Fi','Sport', 'Action','Animation','Sport', 'Crime','Fantasy','Romance', 'Adventure','History','Short', 'Adventure','Reality-TV', 'History','Mystery','Thriller', 'Comedy','Reality-TV','Short', 'Crime','News','Reality-TV', 'Documentary','Music','Reality-TV', 'Comedy','Sport','War', 'Comedy','History','Sci-Fi', 'Biography','Comedy','War', 'Animation','Horror', 'Action','Biography','Thriller', 'Animation','Music', 'Documentary','Fantasy', 'Action','Family','Music', 'Action','Documentary','Short', 'Comedy','History','Sport', 'Comedy','Reality-TV','Romance', 'Music','Reality-TV', 'Music','Short','Sport', 'Mystery','Romance','Sci-Fi', 'Action','Fantasy','Short', 'Documentary','Sci-Fi','Short', 'Game-Show','Horror','Reality-TV', 'Drama','News','Short', 'Music','Mystery', 'Animation','Musical','Romance', 'Crime','Fantasy','Short', 'Family','Game-Show','Talk-Show', 'Music','Reality-TV','Talk-Show', 'Action','Short','Thriller', 'News','Reality-TV', 'Adventure','Mystery','Short', 'Crime','Documentary','Reality-TV', 'Comedy','Musical','Thriller', 'Action','Adventure','Game-Show', 'Biography','Fantasy','Music', 'Animation','Fantasy','History', 'Action','Crime','History', 'Reality-TV','Talk-Show', 'Documentary','Music','War', 'Family','History','Reality-TV', 'Documentary','History','Horror', 'Action','Animation','War', 'Game-Show','Reality-TV','Romance', 'Comedy','Game-Show','Romance', 'Action','Family','Romance', 'Family','History','Musical', 'History','News','Short', 'Mystery','Short','Western', 'Action','Documentary','Reality-TV', 'Music','Sport', 'Documentary','War','Western', 'Comedy','News','Reality-TV', 'Action','Music','Thriller', 'Action','Animation','Biography', 'Comedy','Documentary','Mystery', 'Comedy','Game-Show','Sci-Fi', 'Music','Sci-Fi','Thriller', 'Documentary','Short','Talk-Show', 'Comedy','Documentary','War', 'Action','Comedy','Game-Show', 'Adult','Fantasy','Thriller', 'Documentary','Music','Musical', 'Crime','Game-Show','Mystery', 'Biography','Romance','War', 'Animation','Mystery', 'Family','Game-Show','History', 'Action','Comedy','Reality-TV', 'Drama','History','Sci-Fi', 'Family','Fantasy','Western', 'Action','Adventure','Sport', 'Family','Music','Sport', 'Family','History', 'Documentary','Drama','Musical', 'Documentary','Game-Show','News', 'Adventure','Game-Show','Mystery', 'Adult','Sport','War', 'Action','Documentary','Sci-Fi', 'Adventure','Fantasy','Thriller', 'Documentary','Game-Show','Reality-TV', 'Family','Reality-TV','Sport', 'Documentary','Game-Show','Music', 'Family','News','Talk-Show', 'Adventure','Fantasy','Music', 'Adventure','Family','Game-Show', 'Romance','Short','Thriller', 'Adult','Adventure','Horror', 'Action','Adult','Mystery', 'Family','Mystery','Sport', 'Action','Musical','Western', 'Adult','Drama','Short', 'Fantasy','Horror','Western', 'Biography','Reality-TV', 'Adventure','Fantasy','Western', 'Action','Family','Mystery', 'History','News', 'Game-Show','Reality-TV','Thriller', 'Documentary','Horror','War', 'Horror','Mystery','Reality-TV', 'Biography','History','Sport', 'Family','Game-Show','Reality-TV', 'Crime','Drama','Reality-TV', 'Adventure','Animation','Documentary', 'Animation','Music','Musical', 'Family','Mystery','Short', 'Family','Game-Show','Musical', 'Adventure','Family','Reality-TV', 'Action','Documentary','Fantasy', 'Comedy','News','Sport', 'Romance','Short','War', 'Adventure','Horror','Music', 'History','Sci-Fi', 'Comedy','Game-Show','Short', 'Game-Show','Mystery','Reality-TV', 'History','Music', 'Sci-Fi','Talk-Show', 'Action','Reality-TV', 'Adventure','Crime','War', 'Adult','Biography','Romance', 'Animation','Drama','Music', 'History','Reality-TV', 'Music','News','Talk-Show', 'Action','Animation','Thriller', 'Game-Show','History', 'Documentary','Reality-TV','Thriller', 'Comedy','Game-Show','Mystery', 'Action','Mystery','Short', 'Family','Mystery','Thriller', 'Adult','Documentary','Music', 'Adventure','Animation','Sport', 'Adult','Animation','Mystery', 'Biography','Crime','Horror', 'Documentary','Family','News', 'Adventure','Animation','History', 'Comedy','Music','Reality-TV', 'Documentary','Family','Mystery', 'Action','Adult','Romance', 'Sci-Fi','Short','Western', 'Horror','Reality-TV','Thriller', 'Sci-Fi','Short','War', 'Action','Documentary','Music', 'Drama','News', 'Comedy','Documentary','Reality-TV', 'Action','Adult','Sport', 'Action','Fantasy','Mystery', 'Comedy','History','News', 'Family','Musical','Western', 'Family','Sport','Talk-Show', 'Action','Documentary','Family', 'Adult','Documentary','Romance', 'Adventure','Documentary','War', 'Adult','Comedy','Talk-Show', 'History','Talk-Show', 'Biography','News','Talk-Show', 'Comedy','Documentary','Sci-Fi', 'Music','Musical','Sport', 'Animation','Biography', 'Adult','Animation','Thriller', 'Biography','Documentary','News', 'Drama','News','Talk-Show', 'Documentary','Fantasy','Sci-Fi', 'Horror','Musical','Mystery', 'Family','News','Short', 'Musical','Talk-Show', 'Biography','Reality-TV','Talk-Show', 'Comedy','Documentary','Thriller', 'Crime','Drama','News', 'Musical','News','Reality-TV', 'Musical','Sci-Fi','Thriller', 'Biography','History','Musical', 'Comedy','Horror','Reality-TV', 'Animation','Documentary','Fantasy', 'Horror','Romance','Short', 'Music','Romance','Thriller', 'Action','Adventure','Reality-TV', 'Game-Show','News','Talk-Show', 'Action','Biography','War', 'Music','Mystery','Short', 'History','Horror', 'Adventure','Musical','Short', 'Crime','Documentary','Thriller', 'Adult','Horror','Romance', 'Family','Short','War', 'Biography','Documentary','Musical', 'Family','Short','Western', 'Action','Adult','Thriller', 'Documentary','Family','Western', 'Action','Family','Musical', 'Documentary','Sport','Talk-Show', 'Horror','Short','Western', 'Action','Documentary','History', 'Crime','Documentary','Music', 'Fantasy','Sci-Fi','War', 'Comedy','Game-Show','Musical', 'Adult','Sci-Fi','Short', 'Music','Mystery','Romance', 'Musical','Mystery', 'Adult','Fantasy','Short', 'Action','Documentary','Horror', 'Horror','Reality-TV', 'Family','Reality-TV','Romance', 'Adult','Animation','Crime', 'History','Short','Sport', 'Fantasy','Music','Mystery', 'Game-Show','News', 'Comedy','Game-Show','Reality-TV', 'Family','History','News', 'Reality-TV','Sci-Fi','Short', 'Adult','Adventure','Biography', 'Adventure','Animation','Music', 'Crime','Documentary','Horror', 'Crime','Game-Show', 'Documentary','Family','Romance', 'Documentary','Musical','Romance', 'Game-Show','Musical', 'Action','Animation','Documentary', 'Family','Music','Romance', 'Comedy','Documentary','Western', 'Reality-TV','Romance','Talk-Show', 'Family','Musical','Talk-Show', 'Family','Game-Show','Sci-Fi', 'Musical','Romance','Sci-Fi', 'Adventure','Documentary','Reality-TV', 'Fantasy','Horror','Musical', 'Adult','Reality-TV','Talk-Show', 'Crime','Music','News', 'Action','Fantasy','War', 'Documentary','Mystery','Reality-TV', 'Biography','News','Short', 'Adult','Mystery','War', 'Animation','Family','Talk-Show', 'Crime','Talk-Show', 'Horror','News','Talk-Show', 'Crime','Documentary','Family', 'Adventure','Reality-TV','Short', 'Adventure','Comedy','Game-Show', 'Adventure','Reality-TV','Thriller', 'Drama','Family','Reality-TV', 'Mystery','Reality-TV', 'Documentary','Reality-TV','Sport', 'Comedy','Short','Talk-Show', 'Family','History','War', 'Romance','Sci-Fi','Short', 'Crime','Documentary','News', 'Adult','Fantasy','History', 'Documentary','History','Sci-Fi', 'Comedy','History','Reality-TV', 'Action','Adult','Short', 'Documentary','Game-Show', 'Fantasy','Sport', 'Biography','Fantasy','History', 'Comedy','Drama','Reality-TV', 'Crime','Mystery','Reality-TV', 'Adventure','Crime','Reality-TV', 'Biography','Family','History', 'Drama','Game-Show','Reality-TV', 'Documentary','Drama','Reality-TV', 'Game-Show','Mystery','Short', 'Adventure','Mystery','Sport', 'Sci-Fi','Sport', 'Animation','Mystery','Thriller', 'Adventure','Comedy','Reality-TV', 'Action','Musical','Short', 'Family','Horror','Short', 'Adventure','Horror','Short', 'Animation','Horror','Music', 'Documentary','Music','Talk-Show', 'Game-Show','Music','Musical', 'Music','War', 'Action','Music','Musical', 'Action','Reality-TV','Sport', 'Biography','History','Reality-TV', 'Adventure','Documentary','Mystery', 'Family','Mystery','Sci-Fi', 'Crime','Sci-Fi','Short', 'Fantasy','Horror','War', 'Game-Show','Reality-TV','Short', 'Animation','Documentary','Sci-Fi', 'Documentary','Reality-TV','War', 'Documentary','Musical','War', 'Biography','Comedy','Horror', 'Comedy','Documentary','Game-Show', 'Family','Fantasy','Game-Show', 'Biography','Comedy','Sci-Fi', 'Biography','Drama','Talk-Show', 'Music','Reality-TV','Romance', 'Biography','Mystery', 'Documentary','Family','Sci-Fi', 'Music','News','Sport', 'Fantasy','Reality-TV', 'Adventure','Talk-Show', 'Adventure','Animation','Mystery', 'Mystery','Short','War', 'Comedy','Game-Show','Horror', 'History','Sport', 'Crime','Musical','Short', 'Music','Musical','Talk-Show', 'Mystery','Talk-Show', 'Documentary','Reality-TV','Romance', 'Action','Biography','Sport', 'History','Music','War', 'Biography','Fantasy','War', 'Action','Drama','Reality-TV', 'Crime','Documentary','Western', 'Music','News','Reality-TV', 'Adventure','Fantasy','Sport', 'Biography','Documentary','Talk-Show', 'Action','Music','Short', 'Horror','Musical','Short', 'Adventure','Fantasy','Game-Show', 'Music','Sport','Talk-Show', 'Sci-Fi','War', 'Comedy','Musical','Talk-Show', 'Documentary','History','Thriller', 'Adventure','Comedy','Talk-Show', 'Documentary','Fantasy','Mystery', 'Horror','Sport','Thriller', 'Documentary','Family','Talk-Show', 'Adult','Music','Romance', 'Fantasy','Sci-Fi','Sport', 'Adult','Short','Sport', 'Documentary','Drama','Horror', 'Adventure','Biography','Horror', 'Crime','News','Talk-Show', 'Romance','Talk-Show', 'Adventure','Sport','Talk-Show', 'Short','Sport','Thriller', 'Adventure','Short','Sport', 'Documentary','Reality-TV','Talk-Show', 'Action','Adventure','News', 'Biography','News', 'Fantasy','Musical','Thriller', 'Reality-TV','Western', 'Animation','News','Short', 'Comedy','Reality-TV','Sport', 'Adventure','Music', 'Drama','Family','Game-Show', 'Game-Show','History','Music', 'Biography','News','Reality-TV', 'Drama','Reality-TV','Romance', 'Adult','Documentary','Reality-TV', 'Documentary','Horror','Reality-TV', 'Biography','Documentary','Fantasy', 'Horror','Talk-Show', 'Fantasy','Short','Western', 'Musical','Sci-Fi','Short', 'Animation','War', 'Crime','Reality-TV','Thriller', 'History','News','War', 'Horror','Short','War', 'Fantasy','History','Short', 'Family','Reality-TV','Short', 'Animation','Comedy','War', 'Adult','Music','Short', 'Crime','Family','Romance', 'Action','Fantasy','Game-Show', 'Drama','Reality-TV','Talk-Show', 'Documentary','News','Sci-Fi', 'Animation','History','War', 'History','Horror','Short', 'Drama','Reality-TV','Sport', 'Crime','Fantasy','Sci-Fi', 'Drama','Game-Show', 'Action','History','Mystery', 'Family','Game-Show','Horror', 'Adventure','News', 'Action','Fantasy','Western', 'Animation','Comedy','Game-Show', 'History','News','Talk-Show', 'Fantasy','Music','Sci-Fi', 'Animation','Sport','Thriller', 'Biography','Family','Music', 'Crime','Thriller','War', 'Mystery','Reality-TV','Sci-Fi', 'Drama','Family','Talk-Show', 'Adventure','Game-Show','Sci-Fi', 'Family','History','Sport', 'Animation','Crime','Music', 'Animation','Crime','Mystery', 'News','Reality-TV','Sport', 'Adventure','Reality-TV','Sport', 'News','Sport','War', 'History','Horror','Thriller', 'Adventure','Music','Mystery', 'History','Short','Thriller', 'Family','Mystery','News', 'Comedy','Horror','Talk-Show', 'Biography','Sport','Talk-Show', 'Action','Drama','Game-Show', 'Adventure','Comedy','News', 'News','Reality-TV','Talk-Show', 'Comedy','Mystery','Reality-TV', 'History','Talk-Show','War', 'Comedy','News','Short', 'History','Horror','Mystery', 'Family','Music','News', 'Biography','Mystery','Short', 'Music','News','Short', 'Action','News', 'Documentary','Horror','Romance', 'Adventure','Game-Show','Horror', 'Biography','Documentary','Reality-TV', 'Comedy','History','Talk-Show', 'Animation','Talk-Show', 'Documentary','Fantasy','Musical', 'Documentary','History','Talk-Show', 'Horror','Music','Musical', 'News','Short','Talk-Show', 'Horror','Sci-Fi','Talk-Show', 'Documentary','Fantasy','History', 'Comedy','Musical','News', 'Animation','Documentary','News', 'Sci-Fi','Short','Talk-Show', 'Adventure','Mystery','Reality-TV', 'Biography','Music','Talk-Show', 'Musical','Mystery','Short', 'Documentary','Short','Thriller', 'Reality-TV','Sport','Talk-Show', 'Biography','Drama','Reality-TV', 'Animation','Comedy','News', 'Adult','Reality-TV','Romance', 'Family','Fantasy','War', 'Biography','Drama','News', 'Biography','Sci-Fi','Short', 'Animation','Musical','Sport', 'Musical','Reality-TV', 'Adventure','Biography','Musical', 'Adult','Game-Show', 'Biography','Family','Short', 'Drama','Music','Reality-TV', 'Action','Reality-TV','Romance', 'Family','Reality-TV','Western', 'Adult','Comedy','Reality-TV', 'Horror','Reality-TV','Talk-Show', 'Action','Family','Western', 'nan', 'Animation','Crime','Fantasy', 'Adventure','Fantasy','News', 'Adventure','History','Sci-Fi', 'Documentary','Romance','Sport', 'Action','History','Horror', 'Reality-TV','Sport','Thriller', 'Game-Show','Short','Sport', 'Biography','History','Thriller', 'Animation','Comedy','Talk-Show', 'Fantasy','History','Mystery', 'Comedy','Family','News', 'Mystery','Sci-Fi','Talk-Show', 'Mystery','Reality-TV','Talk-Show', 'Comedy','Game-Show','History', 'Documentary','Drama','Talk-Show', 'Biography','Comedy','Talk-Show', 'Short','Sport','Talk-Show', 'Horror','Mystery','Talk-Show', 'Family','Mystery','Romance', 'Action','News','Sport', 'Fantasy','Horror','Reality-TV', 'Animation','Horror','Thriller', 'Adventure','Reality-TV','Romance', 'Biography','Short','Sport', 'Music','Musical','Reality-TV', 'Animation','Documentary','Romance', 'Documentary','Family','War', 'Romance','Short','Sport', 'History','Music','Talk-Show', 'Animation','Documentary','Mystery', 'Comedy','Reality-TV','Thriller', 'Action','Mystery','War', 'Music','Short','Talk-Show', 'Family','History','Mystery', 'Action','Reality-TV','Short', 'Drama','Music','Talk-Show', 'Crime','Documentary','Talk-Show', 'Biography','Comedy','Game-Show', 'Music','Short','Thriller', 'Animation','Mystery','Sci-Fi', 'Animation','Sci-Fi','War', 'Action','Animation','Game-Show', 'Crime','Music','Short', 'Animation','History','Music', 'Drama','Game-Show','Sport', 'Animation','Fantasy','News', 'Crime','Sport','Thriller', 'Adult','Crime','Sci-Fi', 'Drama','News','Thriller', 'Crime','History','Thriller', 'History','Musical','Talk-Show', 'Biography','Comedy','Family', 'Drama','Musical','Talk-Show', 'Adventure','Reality-TV','Talk-Show', 'Biography','Comedy','Sport', 'Fantasy','Reality-TV','Sci-Fi', 'Fantasy','Game-Show','Horror', 'Animation','Romance','Sport', 'Animation','News','Sci-Fi', 'Horror','Reality-TV','Short', 'Music','Reality-TV','Short', 'Adventure','Biography','Sport', 'Animation','History','Sci-Fi', 'History','Short','Talk-Show', 'Horror','News', 'Family','History','Horror', 'Adventure','Documentary','Talk-Show', 'Biography','Family','Talk-Show', 'Game-Show','Music','Short', 'Game-Show','Reality-TV','Talk-Show', 'Biography','Music','Mystery', 'Adventure','Biography','Short', 'Horror','Musical','Romance', 'Animation','Horror','Romance', 'Crime','Fantasy','History', 'Adventure','History','Horror', 'Reality-TV','Short','Talk-Show', 'Adventure','Horror','Musical', 'Documentary','Horror','News', 'Adventure','Reality-TV','Western', 'History','Romance','Short', 'Fantasy','History','Music', 'Animation','Horror','News', 'Animation','Game-Show', 'Adventure','Documentary','Music', 'Adventure','Animation','Thriller', 'Action','History','Short', 'Fantasy','News','Sci-Fi', 'News','Reality-TV','Short', 'Action','Biography','Musical', 'Action','History','Musical', 'Biography','Documentary','Sci-Fi', 'Adult','Comedy','Game-Show', 'Action','Sport','War', 'Musical','Reality-TV','Short', 'Documentary','Music','Sci-Fi', 'Documentary','Horror','Musical', 'Biography','History','Mystery', 'Biography','Documentary','Thriller', 'Action','Fantasy','Music', 'Action','News','Sci-Fi', 'Short','Sport','Western', 'Adult','Musical','Short', 'Mystery','News','Short', 'Adventure','Documentary','Horror', 'Action','Fantasy','Sport', 'Biography','Thriller','Western', 'Adventure','Music','Short', 'Crime','Documentary','Sport', 'Comedy','Mystery','News', 'Drama','Mystery','News', 'Animation','Game-Show','Talk-Show', 'Comedy','Mystery','Sport', 'Mystery','News', 'Crime','Fantasy','Thriller', 'Comedy','History','Western', 'Drama','Music','News', 'Animation','Biography','Fantasy', 'Documentary','Music','Mystery', 'History','Horror','Romance', 'Crime','Sci-Fi','War', 'Adult','News', 'News','Sci-Fi', 'Drama','Game-Show','Music', 'Documentary','Mystery','Thriller', 'Adventure','Documentary','Game-Show', 'News','Sci-Fi','Short', 'Animation','History','Mystery', 'Family','History','Music', 'Reality-TV','Short','Sport', 'Adventure','Fantasy','Reality-TV', 'Drama','News','Reality-TV', 'Animation','Reality-TV','Talk-Show', 'Documentary','Fantasy','Sport', 'Animation','Short','Talk-Show', 'Comedy','History','Horror', 'History','Reality-TV','Sport', 'Crime','Game-Show','Reality-TV', 'Biography','Horror','Talk-Show', 'Adventure','Musical','Sci-Fi', 'Talk-Show','Western', 'Musical','Sport','Talk-Show', 'Biography','Game-Show','Reality-TV', 'Biography','Reality-TV','Short', 'Game-Show','Horror','Thriller', 'Family','Talk-Show','Thriller', 'Adventure','News','Short', 'Biography','History','News', 'Animation','Biography','Crime', 'History','Musical','Short', 'Family','Reality-TV','Sci-Fi', 'Documentary','Game-Show','History', 'Fantasy','Romance','Sport', 'Action','Comedy','News', 'Fantasy','History','Thriller', 'Adventure','Documentary','Sci-Fi', 'Animation','Fantasy','Thriller', 'Animation','Documentary','Horror', 'Crime','Family','Fantasy', 'History','Reality-TV','Thriller', 'Documentary','Romance','Thriller', 'Family','Music','Mystery', 'Adventure','Drama','Reality-TV', 'Drama','Talk-Show','War', 'Drama','Horror','Sport', 'Animation','History','Romance', 'Action','Musical','Sci-Fi', 'Music','Romance','Sci-Fi', 'Horror','Musical','Thriller', 'Sci-Fi','Thriller','Western', 'Adult','Biography','Talk-Show', 'Family','Romance','Western', 'History','Reality-TV','War', 'Crime','Romance','Sci-Fi', 'Crime','Family','Musical', 'Adventure','Reality-TV','Sci-Fi', 'Documentary','Mystery','Sport', 'Action','Game-Show','History', 'News','Talk-Show','War', 'Animation','Comedy','Reality-TV', 'Action','Biography','Short', 'Action','Fantasy','Talk-Show', 'Comedy','Fantasy','Reality-TV', 'Action','Horror','Music', 'Animation','Mystery','Romance', 'Animation','Sci-Fi','Sport', 'History','Musical','War', 'Musical','Short','Thriller', 'Documentary','Musical','Thriller', 'Biography','History','Horror', 'Adult','Reality-TV','Short', 'Animation','Reality-TV','Short', 'Animation','History','News', 'Adult','Animation','Documentary', 'Documentary','Sport','War', 'Action','History','Sci-Fi', 'Animation','Documentary','Sport', 'Action','Biography','Music', 'Adult','Drama','Reality-TV', 'Biography','Horror','Short', 'Action','Biography','Romance', 'Action','Documentary','News', 'Animation','Fantasy','Reality-TV', 'Game-Show','Music','Sport', 'Romance','Short','Talk-Show', 'Crime','Reality-TV','Talk-Show', 'Adventure','History','Reality-TV', 'Animation','Horror','Musical', 'Drama','History','News', 'Action','Animation','Musical', 'Biography','Short','Talk-Show', 'Biography','Family','News', 'Adventure','History','Music', 'Fantasy','Reality-TV','Romance', 'Comedy','Fantasy','Talk-Show', 'Documentary','Sci-Fi','Talk-Show', 'Action','Biography','Family', 'Drama','Mystery','Reality-TV', 'Animation','Western', 'Adventure','Mystery','Talk-Show', 'Action','Biography','Game-Show', 'Family','Romance','Sport', 'Documentary','Fantasy','Reality-TV', 'Game-Show','Musical','Reality-TV', 'Family','Musical','Reality-TV', 'Musical','Reality-TV','Sport', 'Family','Short','Talk-Show', 'Sci-Fi','Sport','Talk-Show', 'History','Reality-TV','Talk-Show', 'History','Horror','Sci-Fi', 'Musical','Sport', 'Crime','History','War', 'Action','Talk-Show', 'Comedy','Crime','War', 'Biography','Fantasy','Romance', 'Documentary','Reality-TV','Sci-Fi', 'Biography','Music','Sport', 'Action','Game-Show','News', 'Animation','Reality-TV', 'Crime','Horror','Talk-Show', 'Talk-Show','War', 'Music','Romance','Talk-Show', 'Drama','Reality-TV','Short', 'Action','Biography','Reality-TV', 'Drama','Game-Show','Short', 'Animation','Crime','Romance', 'Biography','Comedy','Reality-TV', 'Comedy','News','Sci-Fi', 'Adventure','Documentary','Musical', 'Crime','Game-Show','Thriller', 'History','Sci-Fi','Short', 'Biography','Comedy','Mystery', 'Fantasy','Game-Show', 'Biography','Music','Reality-TV', 'Crime','History','News', 'Game-Show','Mystery','News', 'Action','Sport','Talk-Show', 'Animation','Musical','Sci-Fi', 'Fantasy','Music','Sport', 'Family','News','Reality-TV', 'Action','History','Western', 'Fantasy','Horror','News', 'Adventure','Drama','News', 'Action','Family','Reality-TV', 'Adventure','Documentary','Western', 'Crime','Short','War', 'Action','Drama','News', 'Animation','Biography','War', 'Horror','Short','Talk-Show', 'Biography','Crime','Sci-Fi', 'Crime','News','Thriller', 'News','Short','War', 'Fantasy','Short','War', 'Horror','Short','Sport', 'Adventure','Horror','Western', 'Crime','History','Music', 'Action','Reality-TV','War', 'Animation','Game-Show','Short', 'Animation','Drama','Reality-TV', 'Drama','Family','News', 'Crime','Fantasy','Music', 'Animation','History','Horror', 'Biography','Fantasy','Short', 'Biography','Fantasy','News', 'Biography','History','Sci-Fi', 'Crime','Horror','Sport', 'Game-Show','Horror', 'Crime','History','Sci-Fi', 'Comedy','Sci-Fi','Sport', 'Documentary','Horror','Talk-Show', 'Biography','Mystery','Sci-Fi', 'Adventure','Short','Talk-Show', 'Adventure','Biography','Mystery', 'Short','Talk-Show','Thriller', 'History','Music','Musical', 'Action','Music','Mystery', 'Biography','Short','Thriller', 'Musical','Short','War', 'Fantasy','Short','Sport', 'Adventure','Biography','Talk-Show', 'Drama','Short','Talk-Show', 'Action','Documentary','Musical', 'Adult','Short','Western', 'Sci-Fi','Thriller','War', 'Documentary','News','Western', 'Comedy','Crime','Talk-Show', 'Crime','History','Horror', 'Animation','Drama','Game-Show', 'News','Sci-Fi','Talk-Show', 'Comedy','Romance','Talk-Show', 'Action','Documentary','Romance', 'Fantasy','Talk-Show', 'Adventure','Music','Reality-TV', 'Family','Sci-Fi','Sport', 'Game-Show','Short','Thriller', 'History','Thriller','War', 'Animation','Biography','Music', 'Adult','Crime','Documentary', 'Short','Talk-Show','War', 'Crime','Mystery','News', 'Biography','Fantasy','Musical', 'Action','Family','History', 'Animation','Biography','Mystery', 'Fantasy','History','Musical', 'History','News','Sci-Fi', 'Documentary','Mystery','News', 'Family','Short','Thriller', 'Biography','Family','Fantasy', 'Animation','Game-Show','Reality-TV', 'Mystery','News','Sci-Fi', 'Comedy','Reality-TV','Sci-Fi', 'Animation','Family','News', 'Action','Documentary','Western', 'Family','Fantasy','News', 'Drama','News','Romance', 'History','Music','News', 'Biography','Fantasy','Horror', 'Fantasy','History','War', 'Documentary','Talk-Show','Thriller', 'Action','Short','Talk-Show', 'Animation','Biography','Sci-Fi', 'History','Mystery','Sci-Fi', 'Adventure','Biography','Music', 'Animation','Music','Mystery', 'Crime','Family','Music', 'Fantasy','History','News', 'Biography','Crime','Romance', 'Documentary','News','Romance', 'Documentary','Family','Horror', 'Drama','Fantasy','News', 'Animation','History','Musical', 'Fantasy','News','Short', 'Crime','News','Short', 'Adult','Romance','War', 'Horror','Mystery','News', 'Action','Game-Show','Horror', 'Adventure','Crime','News', 'Biography','Comedy','News', 'Comedy','Thriller','Western', 'Game-Show','Musical','Romance', 'History','Mystery','Reality-TV', 'Music','Mystery','Sci-Fi', 'Adventure','Musical','Western', 'Animation','Music','Sport', 'Animation','Drama','News', 'Biography','Family','Mystery', 'History','Reality-TV','Short', 'Crime','History','Romance', 'Adventure','News','Talk-Show', 'Adult','Animation','Sport', 'Adult','Game-Show','Reality-TV', 'Comedy','News','Romance', 'Documentary','Musical','Reality-TV', 'Fantasy','Sci-Fi','Western', 'Family','Music','Sci-Fi', 'Fantasy','History','Sci-Fi', 'Family','History','Talk-Show', 'Family','Musical','News', 'Comedy','Mystery','Talk-Show', 'Animation','Mystery','Talk-Show', 'Game-Show','Sci-Fi','Short', 'History','Talk-Show','Western', 'Horror','Music','News', 'News','Romance', 'Biography','Mystery','Romance', 'Comedy','Crime','News', 'Comedy','Fantasy','Game-Show', 'Drama','Musical','Reality-TV', 'Comedy','Horror','War', 'Family','Game-Show','News', 'Horror','News','Short', 'History','Music','Sci-Fi', 'Family','Musical','Sport', 'Biography','Horror','Sci-Fi', 'Family','History','Romance', 'Sci-Fi','Sport','Thriller', 'Documentary','Musical','News', 'Documentary','Fantasy','News', 'Documentary','Sci-Fi','Thriller', 'History','Horror','News', 'Music','Reality-TV','Sport', 'Crime','Documentary','Romance', 'Adventure','History','News', 'Animation','Biography','Sport', 'Comedy','Fantasy','News', 'Musical','News','Short', 'Game-Show','Horror','Short', 'Biography','Music','Sci-Fi', 'History','Mystery','Romance', 'Family','Musical','Sci-Fi', 'Fantasy','Thriller','War', 'Musical','Short','Sport', 'Animation','Biography','News', 'Fantasy','Mystery','News', 'Adult','Animation','History', 'Documentary','Musical','Mystery', 'Adult','Adventure','Western', 'Animation','Documentary','Talk-Show', 'Documentary','Sport','Thriller', 'Adventure','Animation','News', 'Family','Game-Show','Romance', 'History','Horror','Music', 'Music','Musical','News', 'Adventure','Animation','Reality-TV', 'Adult','Drama','Game-Show', 'Game-Show','Reality-TV','Sci-Fi', 'Action','Game-Show','Short', 'Family','Horror','Thriller', 'Action','Adult','History', 'Animation','Biography','Musical', 'Crime','Mystery','War', 'Animation','Crime','Thriller', 'Reality-TV','Romance','Short', 'Documentary','Mystery','War', 'History','Horror','War', 'Comedy','History','Thriller', 'History','Music','Mystery', 'Action','Animation','Western', 'Mystery','Short','Sport', 'Action','Horror','Sport', 'Adult','Mystery','Short', 'Fantasy','Thriller','Western', 'News','Reality-TV','Romance', 'Drama','Fantasy','Reality-TV', 'Action','History','Reality-TV', 'Music','Musical','Mystery', 'News','Romance','Talk-Show', 'Animation','Crime','History', 'Horror','Music','Romance', 'Documentary','Sport','Western', 'Biography','Family','Horror', 'Mystery','News','Reality-TV', 'Action','News','Reality-TV', 'Adventure','Music','Western', 'History','Horror','Musical', 'Action','Biography','Mystery', 'Documentary','News','Thriller', 'Short','War','Western', 'Action','History','Music', 'Action','Sport','Western', 'Mystery','Sport','Western', 'Biography','Crime','Mystery', 'Animation','Crime','News', 'Drama','Game-Show','History', 'Documentary','Music','Western', 'Fantasy','Mystery','Reality-TV', 'Adventure','News','Sci-Fi', 'Crime','Drama','Talk-Show', 'Game-Show','Sport','Talk-Show', 'Adult','Short','Thriller', 'Action','Mystery','Reality-TV', 'Action','Family','War', 'Family','Game-Show','Short', 'Comedy','Mystery','Western', 'Fantasy','Reality-TV','Sport', 'Family','History','Thriller', 'Biography','Family','Romance', 'Adventure','News','Reality-TV', 'History','News','Romance', 'Family','Sci-Fi','Talk-Show', 'Adventure','Musical','Reality-TV', 'Animation','Biography','Romance', 'Biography','Crime','Music', 'Crime','Horror','Reality-TV', 'Game-Show','Musical','Mystery', 'Adventure','Sport','Western', 'Animation','Game-Show','Music', 'History','Sci-Fi','Thriller', 'Animation','Family','Reality-TV', 'Action','Biography','Fantasy', 'News','War', 'Adventure','Biography','Reality-TV', 'Drama','Horror','Reality-TV', 'Adventure','Drama','Game-Show', 'History','News','Reality-TV', 'Adult','Romance','Talk-Show', 'Biography','Romance','Thriller', 'Biography','Family','Sport', 'Talk-Show','Thriller', 'Animation','Biography','Horror', 'Mystery','Short','Talk-Show', 'Documentary','Musical','Sci-Fi', 'Adult','Crime','History', 'Mystery','Reality-TV','Short', 'Animation','Crime','Reality-TV', 'Adventure','Biography','Fantasy', 'Biography','Music','News', 'Drama','History','Reality-TV', 'Family','Horror','Romance', 'Biography','Fantasy','Sci-Fi', 'News','Short','Thriller', 'Crime','Musical','Thriller', 'Animation','Musical','News', 'Drama','Reality-TV','Thriller', 'Comedy','Sport','Western', 'Animation','Music','News', 'Action','Adult','Horror', 'Drama','News','Sci-Fi', 'Horror','Sci-Fi','War', 'Action','Adult','Documentary', 'Documentary','Mystery','Romance', 'Documentary','Drama','Game-Show', 'Biography','News','Sci-Fi', 'Drama','Horror','News', 'Musical','Mystery','Sci-Fi', 'Biography','Family','Sci-Fi', 'Crime','Family','History', 'Documentary','Musical','Sport', 'Crime','Documentary','Musical', 'Comedy','Musical','Reality-TV', 'History','News','Sport', 'Music','Musical','Thriller', 'Action','Documentary','Mystery', 'History','Mystery','News', 'Documentary','Fantasy','Thriller', 'Adventure','Music','Thriller', 'Biography','Crime','Sport', 'Action','Horror','Musical', 'News','Short','Western', 'Adventure','Horror','Reality-TV', 'Game-Show','History','Reality-TV', 'Action','News','Short', 'Action','History','Sport', 'Animation','Documentary','Thriller', 'Animation','Family','War', 'Adventure','Biography','Thriller', 'Documentary','Romance','Sci-Fi', 'Adventure','Crime','Music', 'History','Music','Sport', 'Sport','War', 'History','Reality-TV','Sci-Fi', 'Adult','Horror','Music', 'Animation','Horror','War', 'Crime','History','Sport', 'Animation','Fantasy','War', 'Biography','Game-Show', 'Animation','News','Reality-TV', 'Biography','Musical','Reality-TV', 'Family','Fantasy','Reality-TV', 'Musical','News','Talk-Show', 'Animation','History','Sport', 'Family','Sport','Thriller', 'Biography','Horror','Music', 'Adult','Fantasy','Western', 'Biography','Reality-TV','Sport', 'Biography','Crime','Reality-TV', 'History','Sport','Thriller', 'Crime','Music','Sci-Fi', 'Action','Horror','Reality-TV', 'Horror','News','Sci-Fi', 'Biography','Musical','Mystery', 'Comedy','Sport','Thriller', 'Crime','History','Talk-Show', 'Game-Show','History','Talk-Show', 'Crime','Family','News', 'Action','Fantasy','Reality-TV', 'Crime','Family','Horror', 'Crime','Fantasy','War', 'Family','Horror','War', 'Reality-TV','Talk-Show','Western', 'Fantasy','Romance','War', 'Animation','Music','Thriller', 'Fantasy','Game-Show','Reality-TV', 'Action','Biography','News', 'Comedy','Mystery','War', 'Crime','Reality-TV','Short', 'Biography','Thriller','War', 'Reality-TV','Short','War', 'Crime','History','Reality-TV', 'Reality-TV','Short','Thriller', 'Romance','Sci-Fi','Western', 'Comedy','Crime','Game-Show', 'News','Romance','Short', 'Adventure','Sport','Thriller', 'Adult','Short','War', 'Biography','Crime','Western', 'Game-Show','Horror','Mystery', 'Action','Crime','Talk-Show', 'Adventure','Thriller','War', 'Horror','Reality-TV','Sci-Fi', 'Adult','Animation','Talk-Show', 'Biography','Crime','Fantasy', 'Mystery','Reality-TV','Thriller', 'Musical','Reality-TV','Talk-Show'])


In [345]:
df_title_basics3 = title_basics.copy()


In [346]:
# d'apres les stat Netflix, alloCiné, les films les plus regardés sont généralement des films d'action, de comédie, de science-fiction  de fantasy et les drames.
# ‘Action’,‘Comedy’,‘Drama’,‘Fantasy’,‘Sci-Fi’
listes_genres_uniques = list(listes_genres_uniques)
listes_genres_uniques_to_keep = ['Action','Thriller', 'Adventure', 'Sci-Fi', 'Fantasy','Animation', 'War','Family', 'Musical', 'Mystery','Comedy',  'Drama']

# Convertir les genres en listes
df_title_basics3["genres"] = df_title_basics3["genres"].str.split(",")

# "Exploser" la colonne des genres
title_basics_exploded = df_title_basics3.explode("genres")

# Filtrer le DataFrame
title_basics_filtered = title_basics_exploded[title_basics_exploded["genres"].isin( listes_genres_uniques_to_keep)]
title_basics_filtered

# On ne garde que les movies de la colonne titleType
title_basics_filtered3 = title_basics_filtered[title_basics_filtered['titleType'] == (['movie'],)]

# suppression de toutes les colonnes ayant des \N
title_basics_filtered3 = title_basics_filtered3.loc[:, ~(title_basics_filtered3 == r'\N').any()]
# title_basics_filtered3
# aplatissement des tuples des la colonne
df2 = title_basics_filtered3
for col in df2.columns:
    df2[col] = df2[col].apply(
    lambda x: x[0][0]
    if isinstance(x, tuple) and isinstance(x[0], list)
        else
        (x[0] if isinstance(x, tuple) else x))

# supppression des films en doublons dans la colonne tconst car ils appartiennent à plusieurs genres en mm temps
    #  !! au final on garde les doublons car ce qui nous interesse c'est de faire un choix par genre pour recommander des filmss
# df2 = df2.drop_duplicates(subset='tconst', keep='first')




Unnamed: 0,tconst,titleType,primaryTitle,originalTitle,isAdult,startYear,endYear,runtimeMinutes,genres
1,"(tt0000002,)","([short],)","(Le clown et ses chiens,)",Le clown et ses chiens,0,1892,\N,5,Animation
2,"(tt0000003,)","([short],)","(Pauvre Pierrot,)",Pauvre Pierrot,0,1892,\N,4,Animation
2,"(tt0000003,)","([short],)","(Pauvre Pierrot,)",Pauvre Pierrot,0,1892,\N,4,Comedy
3,"(tt0000004,)","([short],)","(Un bon bock,)",Un bon bock,0,1892,\N,12,Animation
4,"(tt0000005,)","([short],)","(Blacksmith Scene,)",Blacksmith Scene,0,1893,\N,1,Comedy
...,...,...,...,...,...,...,...,...,...
10458180,"(tt9916852,)","([tvEpisode],)","(Episode #3.20,)",Episode #3.20,0,2010,\N,\N,Drama
10458180,"(tt9916852,)","([tvEpisode],)","(Episode #3.20,)",Episode #3.20,0,2010,\N,\N,Family
10458182,"(tt9916880,)","([tvEpisode],)","(Horrid Henry Knows It All,)",Horrid Henry Knows It All,0,2014,\N,10,Adventure
10458182,"(tt9916880,)","([tvEpisode],)","(Horrid Henry Knows It All,)",Horrid Henry Knows It All,0,2014,\N,10,Animation


Aprés traitement on ne garde que les colonnes tconst et primaryTitle  genres

In [None]:
colonne_a_garder=['tconst', 'primaryTitle', 'genres']
title_basics_filtered2 = title_basics_filtered2[colonne_a_garder]
title_basics_filtered2


In [349]:
title_basics_filtered2


Unnamed: 0,tconst,titleType,primaryTitle,originalTitle,isAdult,genres
570,"(tt0000574,)","([movie],)","(The Story of the Kelly Gang,)",The Story of the Kelly Gang,0,Action
570,"(tt0000574,)","([movie],)","(The Story of the Kelly Gang,)",The Story of the Kelly Gang,0,Adventure
587,"(tt0000591,)","([movie],)","(The Prodigal Son,)",L'enfant prodigue,0,Drama
610,"(tt0000615,)","([movie],)","(Robbery Under Arms,)",Robbery Under Arms,0,Drama
625,"(tt0000630,)","([movie],)","(Hamlet,)",Amleto,0,Drama
...,...,...,...,...,...,...
10457981,"(tt9916428,)","([movie],)","(The Secret of China,)",Hong xing zhao yao Zhong guo,0,War
10458033,"(tt9916538,)","([movie],)","(Kuambil Lagi Hatiku,)",Kuambil Lagi Hatiku,0,Drama
10458073,"(tt9916620,)","([movie],)","(The Copeland Case,)",The Copeland Case,0,Drama
10458113,"(tt9916706,)","([movie],)","(Dankyavar Danka,)",Dankyavar Danka,0,Comedy


In [314]:
df_title_basics3.head(5)


Unnamed: 0,tconst,titleType,primaryTitle,originalTitle,isAdult,startYear,endYear,runtimeMinutes,genres
0,"(tt0000001,)","([short],)","(Carmencita,)",Carmencita,0,1894,\N,1,"Documentary,Short"
1,"(tt0000002,)","([short],)","(Le clown et ses chiens,)",Le clown et ses chiens,0,1892,\N,5,"Animation,Short"
2,"(tt0000003,)","([short],)","(Pauvre Pierrot,)",Pauvre Pierrot,0,1892,\N,4,"Animation,Comedy,Romance"
3,"(tt0000004,)","([short],)","(Un bon bock,)",Un bon bock,0,1892,\N,12,"Animation,Short"
4,"(tt0000005,)","([short],)","(Blacksmith Scene,)",Blacksmith Scene,0,1893,\N,1,"Comedy,Short"


In [313]:

listes_genres_uniques_to_keep = ['Action',
                                 'Thriller',
                                 'Adventure',
                                 'Sci-Fi',
                                 'Fantasy',
                                 'Animation',
                                 'War',
                                 'Family',
                                 'Musical',
                                 'Mystery',
                                 'Comedy',
                                 'Drama'
                                 ]


['Action',
 'Thriller',
 'Reality-TV',
 'Adventure',
 'Short',
 'Sci-Fi',
 'Fantasy',
 'Animation',
 'Film-Noir',
 'War',
 'Family',
 'Game-Show',
 'Crime',
 'Musical',
 'Mystery',
 'News',
 'Adult',
 'Romance',
 'Documentary',
 'Sport',
 'Comedy',
 'Horror',
 'Talk-Show',
 'nan',
 'Western',
 'Drama',
 'Music',
 'History',
 'Biography']

In [265]:
# Résumé des informations du dataframe
your_dataframe = df_title_basics3
print("#"+"-"*79)
print("valeurs uniques des colonnes:")
for col in your_dataframe.columns:
    liste_valeurs = []
    for i in range(0, your_dataframe.shape[1]):
        split_values = [item.split(",") for item in your_dataframe[col].values[0:i+10]]
        liste_valeurs.append(split_values[i])
    print(f"\nliste_valeurs :{liste_valeurs} \n")
print("#"+"-"*79)
# print(f"\n====shape====: {your_dataframe.shape} \n====list columns==== :\n{your_dataframe.columns.tolist()} ")
# print("#"+"#"*20)
# print(f"====liste des colonnes numeriques====: \n{your_dataframe.select_dtypes(include=[np.number]).columns.tolist()}\n")
# print("#"+"#"*20)
# print(f"====liste des colonnes non numeriques====: \n{your_dataframe.select_dtypes(exclude=[np.number]).columns.tolist()} ")
# print("#"+"#"*20)
# print(f"====Noms des colonnes avec au moins une valeur NA==== : {your_dataframe.columns[your_dataframe.isna().any()].tolist()}")
# print("#"+"#"*20)
# print(f"====Nombre de lignes avec au moins une valeur NA==== : {your_dataframe.isna().any(axis=1).sum()}")
# print("#"+"#"*20)
# print(f"====Colonne avec des na==== :{your_dataframe.isna().sum()} \n")
# print("#"+"#"*20)
# print(f"\ndf ====head==== :\n{your_dataframe.head(2)} \n")
# print("#"+"#"*20)
# print(f"\ndf ====describe==== :\n{your_dataframe.describe()} \n")
# print("#"+"-"*79)


NameError: name 'title_basics2' is not defined

### Traitement title.crew
##### **title.crew.tsv.gz** : Ce fichier contient des informations sur l’équipe de tournage d’un titre.

- **tconst** : Identifiant unique du titre.
 
- **directors** : Réalisateur(s) du titre donné.
 
- **writers** : Scénariste(s) du titre donné.

### traitement ratings
#### **title.ratings.tsv.gz** : Ce fichier contient des informations sur les notes d’un titre.

- **tconst** : Identifiant unique du titre.
 
- **averageRating** : Moyenne pondérée de toutes les notes des utilisateurs.
 
- **numVotes** : Nombre de votes que le titre a reçus.

## Traitement de basics

#### **name.basics.tsv.gz** : Ce fichier contient des informations de base sur une personne.

- **nconst** : Identifiant unique de la personne.
 
- **primaryName** : Nom par lequel la personne est le plus souvent créditée.
 
- **birthYear** : En format YYYY.
 
- **deathYear** : En format YYYY si applicable, sinon ‘\N’.
 
- **primaryProfession** : Les 3 principales professions de la personne.
 
- **knownForTitles** : Titres pour lesquels la personne est connue.

## Fusion des dataset

# inserer au dessus les datasets à traiter

In [None]:
# pour travailler sur du bigdata on peut prendre un échantillon des datasets
# df_sample =df.samples(n=20000, random_state=1)

#  autre possibilite en passant par un chunk
# n = 200000 # taille des chunks
# list_df = [df[i:i+n] for i in range(0, df.shape[0], n)]
# Dans les deux cas, list_df est une liste de DataFrames. Vous pouvez accéder à chaque chunk avec list_df[0], list_df[1], etc.
# Réassembler les chunks : Si vous voulez réassembler les chunks en un seul DataFrame, vous pouvez utiliser pd.concat
# df = pd.concat(list_df)


In [94]:
# sauvegarde des datafdrames sur le disque au format csv
# for nom, df in dfs.items():
#     df.to_csv(f"data_csv/{nom}.csv", index=False)


### Preprocessing dataframes

In [54]:
# # création d'un fichier de synthése des dataframes au format markdown
# import datetime

# # date et heure actuelles
# now = datetime.datetime.now()
# # Formatage date et heure au format 2024-01-01-9h30
# date_time = now.strftime("%Y-%m-%d-%Hh%M")
# # Ajouter la date et l'heure au nom du fichier
# filename = f"synthése_dataframes_{date_time}.md"
# with open(filename, "w", encoding="utf-8") as f:
#     for df in dfs:
#         your_dataframe = dfs[df]
#         f.write(f"\n## DataFrame : {df}\n")
#         f.write(f"- **Shape** : `{your_dataframe.shape}`\n")
#         f.write(
#             f"- **Liste des colonnes** : `{your_dataframe.columns.tolist()}`\n")
#         f.write(
#             f"- **Liste des colonnes numériques** : `{
#                 your_dataframe.select_dtypes(include=[np.number]).columns.tolist()}`\n"
#         )
#         f.write(
#             f"- **Liste des colonnes non numériques** : `{
#                 your_dataframe.select_dtypes(exclude=[np.number]).columns.tolist()}`\n"
#         )
#         f.write(
#             f"- **Colonnes avec des NA** :\n```\n{
#                 your_dataframe.isna().sum()}\n```\n"
#         )
#         f.write(
#             f"- **Noms des colonnes avec au moins une valeur NA** : `{
#                 your_dataframe.columns[your_dataframe.isna().any()].tolist()}`\n"
#         )
#         f.write(
#             f"- **Nombre de lignes avec au moins une valeur NA** : `{
#                 your_dataframe.isna().any(axis=1).sum()}`\n"
#         )
#         headers = " | ".join(your_dataframe.columns.tolist())
#         # Obtenir les lignes
#         rows = [" | ".join(row)
#                 for row in your_dataframe.head(2).astype(str).values]
#         # Écrire les en-têtes et les lignes dans le fichier
#         f.write(f"- **head** :\n```\n{headers}\n{'\n'.join(rows)}\n```\n")


### Suite Analyse du fichier de synthése, choix des colonnes
les colonnes les plus interessantes pour faire des recommendations de films sont:

Ratings : averageRating dans title_ratings pour recommander les films les mieux notés.

Popularité : numVotes dans title_ratings 

Métadonnées du film : genres, startYear, etc. pour recommander des films similaires à ceux que l’utilisateur a aimés par le passé.

Collaborations : title_crew et name_basics, pour recommander des films basés sur les réalisateurs, les scénaristes ou les acteurs 

### Nous allons garder les colonnes suivantes:

1. **name_basics** : 
    - `nconst` : 
    - `primaryName` : Le nom de la personne peut être utile pour identifier les acteurs, réalisateurs, etc. que l'utilisateur aime.
    - `primaryProfession` : La profession principale peut aider à identifier si la personne est un acteur, un réalisateur, etc.
    - `knownForTitles` : Les titres pour lesquels la personne est connue peuvent aider à identifier les films populaires.

2. **title_akas** :
    - `titleId` 
    - `title` 
    - `region` 
3. **title_basics** :
    - `tconst` : Identifiant unique du film, utile pour relier les différents datasets.
    - `primaryTitle` et `originalTitle` 
    - `startYear` 
    - `genres` 
4. **title_crew** :
    - `tconst` : Identifiant unique du film, utile pour relier les différents datasets.
    - `directors` et `writers` : Les réalisateurs et les scénaristes 
5. **title_principals** :
    - `tconst` : Identifiant unique du film, utile pour relier les différents datasets.
    - `nconst` : Identifiant unique de la personne, utile pour relier les personnes aux films.
    - `category` : La catégorie (par exemple, acteur, réalisateur)
6. **title_ratings** :
    - `tconst` : Identifiant unique du film, utile pour relier les différents datasets.
    - `averageRating` et `numVotes` : La note moyenne et le nombre de votes 


In [6]:
# Création d'un nouveau dataframe Liste des colonnes à conserver
import pandas as pd

cols_to_keep = {
    "name_basics": ["nconst", "primaryName", "primaryProfession", "knownForTitles"],
    "title_akas": ["titleId", "title", "region"],
    "title_basics": ["tconst", "primaryTitle", "originalTitle", "startYear", "genres"],
    "title_crew": ["tconst", "directors", "writers"],
    "title_episode": ["tconst", "episodeNumber", "seasonNumber"],
    "title_principals": ["tconst", "nconst", "category"],
    "title_ratings": ["tconst", "averageRating", "numVotes"],
}


# Création du nouveau dataset
# Création du nouveau dataset
dfsd_light = {}
for k, v in cols_to_keep.items():
    print(f"{k} ")
    print(f":{v}")
    dfsd_light[k] = dfs[k][v]


name_basics 
:['nconst', 'primaryName', 'primaryProfession', 'knownForTitles']
title_akas 
:['titleId', 'title', 'region']
title_basics 
:['tconst', 'primaryTitle', 'originalTitle', 'startYear', 'genres']
title_crew 
:['tconst', 'directors', 'writers']
title_episode 
:['tconst', 'episodeNumber', 'seasonNumber']
title_principals 
:['tconst', 'nconst', 'category']
title_ratings 
:['tconst', 'averageRating', 'numVotes']


In [None]:
# dfs_light_df = pd.DataFrame(dfs_light_df)


In [71]:
# controle des noms de colonnes
for df in dfsd_light:
    print(f"\ndf :{df} ")
    your_dataframe = dfs[df]
    print(f"shape: {your_dataframe.shape}\nlist columns :{your_dataframe.columns.tolist()} \nliste des colonnes numeriques: {your_dataframe.select_dtypes(include=[
          np.number]).columns.tolist()} \nliste des colonnes non numeriques: {your_dataframe.select_dtypes(exclude=[np.number]).columns.tolist()}")



df :name_basics 
shape: (13149959, 6)
list columns :['nconst', 'primaryName', 'birthYear', 'deathYear', 'primaryProfession', 'knownForTitles'] 
liste des colonnes numeriques: [] 
liste des colonnes non numeriques: ['nconst', 'primaryName', 'birthYear', 'deathYear', 'primaryProfession', 'knownForTitles']

df :title_akas 
shape: (38367336, 8)
list columns :['titleId', 'ordering', 'title', 'region', 'language', 'types', 'attributes', 'isOriginalTitle'] 
liste des colonnes numeriques: ['ordering'] 
liste des colonnes non numeriques: ['titleId', 'title', 'region', 'language', 'types', 'attributes', 'isOriginalTitle']

df :title_basics 
shape: (10455620, 9)
list columns :['tconst', 'titleType', 'primaryTitle', 'originalTitle', 'isAdult', 'startYear', 'endYear', 'runtimeMinutes', 'genres'] 
liste des colonnes numeriques: [] 
liste des colonnes non numeriques: ['tconst', 'titleType', 'primaryTitle', 'originalTitle', 'isAdult', 'startYear', 'endYear', 'runtimeMinutes', 'genres']

df :title_cre

In [7]:
# Créez une copie explicite de votre DataFrame
df_sauve = dfsd_light["name_basics"].copy()
df = df_sauve.copy()


In [8]:
# Séparez les titres dans 'knownForTitles'
df["knownForTitles"] = df["knownForTitles"].str.split(",")


In [9]:
# Exploser la colonne 'knownForTitles' et la renommer en 'tconst'
df = df.explode("knownForTitles").rename(columns={"knownForTitles": "tconst"})


In [10]:
# Remplacer les '\n' par des valeurs vides
df["tconst"] = df["tconst"].str.replace("\\N", "")


In [11]:
dfsd_light["name_basics"] = df
dfsd_light["name_basics"]["tconst"] = dfsd_light["name_basics"]["tconst"].str.replace(
    "\\N", ""
)


In [12]:
dfsd_light["name_basics"]
# Résumé des informations du dataframe
your_dataframe = dfsd_light["name_basics"]
print(f"\nshape: {your_dataframe.shape} \nlist columns :\n{
      your_dataframe.columns.tolist()} ")
print(f"liste des colonnes numeriques: \n{
      your_dataframe.select_dtypes(include=[np.number]).columns.tolist()}\n")
print(f"liste des colonnes non numeriques: \n{
      your_dataframe.select_dtypes(exclude=[np.number]).columns.tolist()} ")
print(f"Noms des colonnes avec au moins une valeur NA : {
      your_dataframe.columns[your_dataframe.isna().any()].tolist()}")
print(f"Nombre de lignes avec au moins une valeur NA : {
      your_dataframe.isna().any(axis=1).sum()}")
print(f"Colonne avec des na :{your_dataframe.isna().sum()} \n")
print(f"\ndf head :\n{your_dataframe.head(2)} \n")
print(f"\ndf describe :\n{your_dataframe.describe()} \n")



shape: (23010421, 4) 
list columns :
['nconst', 'primaryName', 'primaryProfession', 'tconst'] 
liste des colonnes numeriques: 
[]

liste des colonnes non numeriques: 
['nconst', 'primaryName', 'primaryProfession', 'tconst'] 
Noms des colonnes avec au moins une valeur NA : ['primaryName', 'primaryProfession']
Nombre de lignes avec au moins une valeur NA : 3061205
Colonne avec des na :nconst                     0
primaryName               10
primaryProfession    3061196
tconst                     0
dtype: int64 


df head :
      nconst   primaryName               primaryProfession     tconst
0  nm0000001  Fred Astaire  soundtrack,actor,miscellaneous  tt0050419
0  nm0000001  Fred Astaire  soundtrack,actor,miscellaneous  tt0072308 


df describe :
           nconst  primaryName primaryProfession    tconst
count    23010421     23010411          19949225  23010421
unique   13153024     10154062             21649   1949805
top     nm5907173  David Smith             actor          
freq    

In [None]:
# Merge de name_basics et title_akas
df1 = dfsd_light["name_basics"]
df2 = dfsd_light["title_akas"]

# Merge des dataframes :
df_nconst = df1.merge(df2, how="inner", left_on="tconst", right_on="titleId")


In [62]:
df = df_nconst
print(f"\ncolonnes :\n{df.columns.tolist()} \n")
# Conversion avec des espaces tous les milliers
shape = df.shape
shape_format = tuple("{:,}".format(x).replace(",", " ") for x in shape)
print("Shape avec des espaces tous les milliers :", shape_format)



colonnes :
['nconst', 'primaryName', 'primaryProfession', 'knownForTitles', 'titleId', 'title', 'region'] 

Shape avec des espaces tous les milliers : ('51 517 295', '7')


In [42]:
# merge des df ayant la colonne tconst en commun
df3 = dfsd_light["title_basics"]
df4 = dfsd_light["title_crew"]
df5 = dfsd_light["title_principals"]
df6 = dfsd_light["title_ratings"]
df7 = dfsd_light["title_episode"]
# Merge des dataframes :
df_tconst = df3.merge(df4, how="outer", on="tconst")
df_tconst = df_tconst.merge(df5, how="inner", on="tconst")
df_tconst = df_tconst.merge(df6, how="inner", on="tconst")
df_tconst = df_tconst.merge(df7, how="inner", on="tconst")


In [58]:
df = df_tconst
print(f"\ncolonnes :\n{df.columns.tolist()} \n")
# Conversion avec des espaces tous les milliers
shape = df.shape
shape_format = tuple("{:,}".format(x).replace(",", " ") for x in shape)
print("Shape avec des espaces tous les milliers :", shape_format)



colonnes :
['tconst', 'primaryTitle', 'originalTitle', 'startYear', 'genres', 'directors', 'writers', 'nconst', 'category', 'averageRating', 'numVotes', 'episodeNumber', 'seasonNumber'] 

Shape avec des espaces tous les milliers : ('60 909 999', '13')


In [None]:
# merge des 2 df df_nconst et df_tconst
df_nconst = df1.merge(df2, how="outer", left_on="nconst", right_on="titleId")


In [None]:
# Fusionner tous les dataframes en un seul
dfsd_light_df = pd.concat(dfsd_light.values(), ignore_index=True)

# Afficher les premières lignes du dataframe
print(dfsd_light_df.head())


In [14]:
dfsd_light


{'name_basics':              nconst         primaryName                    primaryProfession  \
 0         nm0000001        Fred Astaire       soundtrack,actor,miscellaneous   
 1         nm0000002       Lauren Bacall                   actress,soundtrack   
 2         nm0000003     Brigitte Bardot  actress,soundtrack,music_department   
 3         nm0000004        John Belushi              actor,soundtrack,writer   
 4         nm0000005      Ingmar Bergman                writer,director,actor   
 ...             ...                 ...                                  ...   
 13149954  nm9993714   Romeo del Rosario  animation_department,art_department   
 13149955  nm9993716       Essias Loberg                                  NaN   
 13149956  nm9993717  Harikrishnan Rajan                      cinematographer   
 13149957  nm9993718         Aayush Nair                      cinematographer   
 13149958  nm9993719          Andre Hill                                  NaN   
 
           

In [None]:
# suppression des colonnes
dfs["title_basics"].drop(
    ["colonne_a_supprimer1", "colonne_a_supprimer2"], axis=1, inplace=True
)


In [None]:
# df=dfs["name_basics"].copy()
# df.explode("knownForTitles")


In [None]:
# Test indépendance entre deux variables quantitatives / Test de corrélation Pearson
# Test d'indépendance entre deux variables qualitatives / Test du Chi²
# Test d'indépendance entre une variable qualitative et une quantitative / Test de Fisher avec l'analyse de la variance (ANOVA)


prétraitement de vos DataFrames :

1. **name_basics** : Vous pouvez convertir les colonnes `birthYear` et `deathYear` en type numérique. De plus, la colonne `primaryProfession` semble contenir plusieurs professions pour une même personne, vous pourriez diviser cette colonne en plusieurs colonnes binaires (une pour chaque profession).

2. **title_akas** : La colonne `isOriginalTitle` pourrait être convertie en booléen. De plus, si vous n'avez pas besoin de toutes les régions ou langues, vous pourriez filtrer ces colonnes pour ne garder que les lignes pertinentes.

3. **title_basics** : Les colonnes `startYear` et `endYear` pourraient être converties en type numérique. La colonne `genres` semble contenir plusieurs genres pour un même titre, vous pourriez diviser cette colonne en plusieurs colonnes binaires (une pour chaque genre).

4. **title_crew** : Les colonnes `directors` et `writers` semblent contenir plusieurs personnes pour un même titre, vous pourriez diviser ces colonnes en plusieurs colonnes (une pour chaque personne).

5. **title_episode** : Les colonnes `seasonNumber` et `episodeNumber` pourraient être converties en type numérique.

6. **title_principals** : La colonne `characters` semble contenir plusieurs personnages pour une même personne, vous pourriez diviser cette colonne en plusieurs colonnes (une pour chaque personnage).

7. **title_ratings** : Les colonnes `averageRating` et `numVotes` pourraient être converties en type numérique.

 gérer les valeurs manquantes, de supprimer les doublons, effectuer une normalisation ou une standardisation des données si nécessaire. 
 fusionner des DataFrames en utilisant les colonnes appropriées comme clés. 


 KPI des DataFrames, possibilités :

1. **name_basics** : Nombre total d'acteurs, nombre d'acteurs par profession, nombre d'acteurs nés chaque année, etc.
2. **title_akas** : Nombre total de titres, nombre de titres par région ou par langue, etc.
3. **title_basics** : Nombre total de titres par type (film, série, etc.), durée moyenne des titres, nombre de titres par genre, etc.
4. **title_crew** : Nombre de réalisateurs et scénaristes uniques, nombre moyen de réalisateurs et scénaristes par titre, etc.
5. **title_episode** : Nombre total d'épisodes, nombre moyen d'épisodes par saison, etc.
6. **title_principals** : Nombre moyen de personnages par titre, nombre de titres par acteur, etc.
7. **title_ratings** : Note moyenne des titres, nombre moyen de votes par titre, etc.

Pour proposer 10 suggestions de films en fonction du choix d'un film d'un utilisateur, vous pouvez utiliser une approche de filtrage collaboratif ou basée sur le contenu. Voici une approche simple basée sur le contenu :

1. **Sélectionnez le film choisi par l'utilisateur** dans votre DataFrame `title_basics`.
2. **Identifiez les genres** de ce film.
3. **Filtrez** votre DataFrame `title_basics` pour ne garder que les films qui partagent au moins un genre avec le film choisi par l'utilisateur.
4. **Classez** ces films en fonction de leur note moyenne dans votre DataFrame `title_ratings` (vous pouvez également prendre en compte le nombre de votes pour éviter les films avec une note moyenne élevée mais un faible nombre de votes).
5. **Sélectionnez les 10 premiers films** de cette liste.


In [95]:
dfs2 = dfs.copy()
dfs3 = dfs.copy()
dfs4 = dfs.copy()
print(f"\ndfs2['title_basics'].shape :\n{dfs2['title_basics'].shape} \n")



dfs2['title_basics'].shape :
(10450471, 9) 



In [None]:
# suppression des na des colonnes concernées
dfs2["title_basics"].dropna(
    subset=["primaryTitle", "originalTitle", "genres"], inplace=True
)
print(f"\ndfs2['title_basics'].shape :\n{dfs2['title_basics'].shape} \n")


In [None]:
# clustering
from sklearn.cluster import KMeans

# Sélectionnez les colonnes pour le clustering
# Remplacez par vos colonnes
data = dfs2["title_basics"][["startYear", "endYear"]]

# Créez l'objet KMeans
# Remplacez 3 par le nombre de clusters que vous voulez
kmeans = KMeans(n_clusters=3)

# Ajustez le modèle aux données
kmeans.fit(data)

# Obtenez les labels de cluster pour chaque observation
labels = kmeans.labels_


In [None]:
# for df in dfs2:
#     # supprime les valeurs manquantes sur les colonnes
#     # dfs2[df].dropna(axis=0, inplace=True)
#     # supprime les valeurs manquantes sur les lignes
#     dfs2[df].dropna(axis=1, inplace=True)
#     print(f"\nshape {df}:\n{dfs2[df].shape} \n")


In [76]:
# suppression des na des colonnes concernées
dfs2["title_basics"].dropna(
    subset=["primaryTitle", "originalTitle", "genres"], inplace=True
)


def regrouper_genres(genre):
    if genre in ["Action", "Adventure", "Thriller", "War"]:
        return "Action et Aventure"
    elif genre in ["Comedy", "Romance", "Family"]:
        return "Comédies"
    elif genre in ["Drama", "Crime", "Mystery"]:
        return "Drames"
    elif genre in ["Horror", "Sci-Fi"]:
        return "Science-Fiction et Fantastique"
    elif genre in ["Biography", "History", "Documentary"]:
        return "Documentaires"
    elif genre in ["Music", "Musical"]:
        return "Musique et Comédies Musicales"
    else:
        return "Autres"


# Appliquer la fonction de regroupement aux genres
dfs2["title_basics"]["genres"] = dfs2["title_basics"]["genres"].apply(
    regrouper_genres)

# Utiliser get_dummies sur la colonne de genres regroupés
dfs2["title_basics"] = dfs2["title_basics"].join(
    dfs2["title_basics"]["genres"].str.get_dummies(",")
)


In [77]:
# Résumé des informations du dataframe
your_dataframe = dfs2["title_basics"]
print(
    f"\nshape: {your_dataframe.shape} \nlist columns :\n{
        your_dataframe.columns.tolist()} "
)
print(
    f"liste des colonnes numeriques: \n{
        your_dataframe.select_dtypes(include=[np.number]).columns.tolist()}\n"
)
print(
    f"liste des colonnes non numeriques: \n{
        your_dataframe.select_dtypes(exclude=[np.number]).columns.tolist()} "
)
print(
    f"Noms des colonnes avec au moins une valeur NA : {
        your_dataframe.columns[your_dataframe.isna().any()].tolist()}"
)
print(
    f"Nombre de lignes avec au moins une valeur NA : {
        your_dataframe.isna().any(axis=1).sum()}"
)
print(f"Colonne avec des na :{your_dataframe.isna().sum()} \n")
print(f"\ndf head :\n{your_dataframe.head(10)} \n")
print(f"\ndf describe :\n{your_dataframe.describe()} \n")



shape: (10450471, 16) 
list columns :
['tconst', 'titleType', 'primaryTitle', 'originalTitle', 'isAdult', 'startYear', 'endYear', 'runtimeMinutes', 'genres', 'Action et Aventure', 'Autres', 'Comédies', 'Documentaires', 'Drames', 'Musique et Comédies Musicales', 'Science-Fiction et Fantastique'] 
liste des colonnes numeriques: 
['startYear', 'endYear', 'Action et Aventure', 'Autres', 'Comédies', 'Documentaires', 'Drames', 'Musique et Comédies Musicales', 'Science-Fiction et Fantastique']

liste des colonnes non numeriques: 
['tconst', 'titleType', 'primaryTitle', 'originalTitle', 'isAdult', 'runtimeMinutes', 'genres'] 
Noms des colonnes avec au moins une valeur NA : ['primaryTitle', 'originalTitle', 'startYear', 'endYear']
Nombre de lignes avec au moins une valeur NA : 10333737
Colonne avec des na :tconst                                   0
titleType                                0
primaryTitle                            17
originalTitle                           17
isAdult           

In [79]:
nombre_de_films_pour_adultes = dfs2["title_basics"][
    dfs2["title_basics"]["isAdult"] == "1"
].shape[0]
print("Nombre de films pour adultes : ", nombre_de_films_pour_adultes)


Nombre de films pour adultes :  331925


In [None]:
# Prétraitement pour title_crew
dfs2["title_crew"] = dfs2["title_crew"].join(
    dfs2["title_crew"]["directors"].str.get_dummies(
        ",").add_prefix("director_")
)
dfs2["title_crew"] = dfs2["title_crew"].join(
    dfs2["title_crew"]["writers"].str.get_dummies(",").add_prefix("writer_")
)


In [None]:
# Prétraitement pour title_episode
dfs2["title_episode"]["seasonNumber"] = pd.to_numeric(
    dfs2["title_episode"]["seasonNumber"], errors="coerce"
)
dfs2["title_episode"]["episodeNumber"] = pd.to_numeric(
    dfs2["title_episode"]["episodeNumber"], errors="coerce"
)


In [None]:
# Prétraitement pour title_principals
dfs2["title_principals"] = dfs2["title_principals"].join(
    dfs2["title_principals"]["characters"].str.get_dummies(
        ",").add_prefix("character_")
)


In [None]:
# Prétraitement pour title_ratings
dfs2["title_ratings"]["averageRating"] = pd.to_numeric(
    dfs2["title_ratings"]["averageRating"], errors="coerce"
)
dfs2["title_ratings"]["numVotes"] = pd.to_numeric(
    dfs2["title_ratings"]["numVotes"], errors="coerce"
)


In [None]:
# Fusion des tables
df_titles = dfs2["title_basics"].merge(
    dfs2["title_crew"], on="tconst", how="left")
df_titles = df_titles.merge(dfs2["title_episode"], on="tconst", how="left")
df_titles = df_titles.merge(dfs2["title_principals"], on="tconst", how="left")
df_titles = df_titles.merge(dfs2["title_ratings"], on="tconst", how="left")

print(df.head())


In [None]:
# KPI pour name_basics
print("Nombre total d'acteurs : ", dfs2["name_basics"]["nconst"].nunique())
print(
    "Nombre d'acteurs par profession : ",
    dfs2["name_basics"]["primaryProfession"].value_counts(),
)

# KPI pour title_basics
print(
    "Nombre total de titres par type : ",
    dfs2["title_basics"]["titleType"].value_counts(),
)
print("Durée moyenne des titres : ",
      dfs2["title_basics"]["runtimeMinutes"].mean())

# KPI pour title_ratings
print("Note moyenne des titres : ",
      dfs2["title_ratings"]["averageRating"].mean())
print("Nombre moyen de votes par titre : ",
      dfs2["title_ratings"]["numVotes"].mean())


# Suggestions de films
def suggest_movies(movie_title):
    # Sélectionnez le film choisi par l'utilisateur
    movie = dfs2["title_basics"][
        dfs2["title_basics"]["primaryTitle"] == movie_title
    ].iloc[0]

    # Identifiez les genres de ce film
    genres = movie["genres"].split(",")

    # Filtrez votre DataFrame pour ne garder que les films qui partagent au moins un genre avec le film choisi
    similar_movies = dfs2["title_basics"][
        dfs2["title_basics"]["genres"].apply(
            lambda x: any(genre in x for genre in genres)
        )
    ]

    # Fusionnez avec title_ratings
    similar_movies = similar_movies.merge(dfs2["title_ratings"], on="tconst")

    # Classez ces films en fonction de leur note moyenne
    similar_movies = similar_movies.sort_values(
        by="averageRating", ascending=False)

    # Sélectionnez les 10 premiers films de cette liste
    top_10_movies = similar_movies.head(10)

    return top_10_movies


# Remplacez par le titre du film choisi par l'utilisateur
movie_title = "Carmencita"
print("Voici 10 suggestions de films basées sur le film ", movie_title, " :")
print(suggest_movies(movie_title))


In [None]:
# # remplacement de la colonne index par titleId  !!attention cela risque de supprimer des doublons
# title_akas_fr.reset_index(drop=True, inplace=True)
# title_akas_fr.set_index("titleId", inplace=True)
# title_akas_fr
