# Exercise 1: Identifying the Best-Performing Model
You are given the outputs of three genre-prediction models. Your job is to use precision, recall, and F1 score to determine which model performs best at predicting a movie's genre.

Download the output file for each of the three models:
- File 1 from Model 1
- File 2 from Model 2
- File 3 from Model 3
Each file contains one row per prediction. Each row holds the movie ID, the predicted genre, the set of actual genres, and a flag for whether the prediction was correct (1 for correct, 0 for incorrect).
A prediction is correct (1) if the predicted genre appears in the set of actual genres.
For example, if a row predicts "Action" and its "actual genres" column contains "['Action', 'Adventure', 'Comedy']", the prediction is correct because "Action" appears in the genre set.
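
The correctness rule can be sketched in a few lines. This is only an illustration: the function name `is_correct` is hypothetical, and it assumes the actual genres are stored as the string form of a Python list, as in the example above.

```python
import ast


def is_correct(predicted, actual_genres):
    """Return 1 if the predicted genre is in the actual genre set, else 0.

    `actual_genres` is assumed to be a string such as "['Action', 'Comedy']".
    """
    genres = ast.literal_eval(actual_genres)  # parse the string into a list
    return int(predicted in genres)


# "Action" is in the set, so the prediction counts as correct.
action_flag = is_correct("Action", "['Action', 'Adventure', 'Comedy']")
# "Horror" is not in the set, so the prediction counts as incorrect.
horror_flag = is_correct("Horror", "['Action', 'Adventure', 'Comedy']")
```
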
For each file, calculate accuracy, precision, recall, and F1 score separately for each of the following genres:
- Drama
- Comedy
- Horror
When computing precision and recall for a specific genre, the "positive" samples are all movies where that genre *does* appear in the "actual genres" column, and the "negative" samples are all movies where it does not.
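
To make the positive/negative definition concrete, here is a minimal sketch that counts the confusion-matrix cells by hand for one genre. The rows are invented toy data, not taken from the model files, and `genre_metrics` is a hypothetical helper name.

```python
# Toy (predicted genre, actual genres) pairs, invented for illustration only.
rows = [
    ("Drama", ["Drama", "Romance"]),  # true positive for Drama
    ("Drama", ["Comedy"]),            # false positive for Drama
    ("Comedy", ["Drama"]),            # false negative for Drama
    ("Horror", ["Horror"]),           # true negative for Drama
    ("Drama", ["Drama"]),             # true positive for Drama
]


def genre_metrics(rows, genre):
    """Compute one-vs-rest metrics for `genre` from (predicted, actual) pairs."""
    tp = fp = fn = tn = 0
    for predicted, actual in rows:
        is_positive = genre in actual            # genre appears in actual genres
        predicted_positive = predicted == genre  # model predicted this genre
        if predicted_positive and is_positive:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif is_positive:
            fn += 1
        else:
            tn += 1
    # Fall back to 0.0 when a denominator is zero (e.g. genre never predicted).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(rows)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


drama = genre_metrics(rows, "Drama")
```

With these toy rows, Drama has 2 true positives, 1 false positive, 1 false negative, and 1 true negative, so precision, recall, and F1 are each 2/3 and accuracy is 3/5.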
Use these metrics to justify which model (1, 2, or 3) produces the best predictions.
List of Genres (some of which will have zero predictions):

['Action',
 'Adventure',
 'Animation',
 'Biography',
 'Comedy',
 'Crime',
 'Documentary',
 'Drama',
 'Family',
 'Fantasy',
 'History',
 'Horror',
 'Music',
 'Musical',
 'Mystery',
 'News',
 'Romance',
 'Sci-Fi',
 'Sport',
 'Thriller',
 'War',
 'Western']

import pandas as pd
import ast
from sklearn.metrics import precision_score, recall_score, f1_score, accuracy_score


df1 = pd.read_csv('prediction_model_01.csv')
df2 = pd.read_csv('prediction_model_02.csv')
df3 = pd.read_csv('prediction_model_03.csv')


def clean(df):
    """Return a copy of df with the 'actual genres' strings parsed into lists."""
    df = df.copy()  # avoid mutating the caller's DataFrame
    df['actual'] = df['actual genres'].apply(ast.literal_eval)  # "['Action', ...]" -> ['Action', ...]
    return df

df1 = clean(df1)
df2 = clean(df2)
df3 = clean(df3)

genres = ['Action',
 'Adventure',
 'Animation',
 'Biography',
 'Comedy',
 'Crime',
 'Documentary',
 'Drama',
 'Family',
 'Fantasy',
 'History',
 'Horror',
 'Music',
 'Musical',
 'Mystery',
 'News',
 'Romance',
 'Sci-Fi',
 'Sport',
 'Thriller',
 'War',
 'Western']

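def compute_metrics(df):
    """Compute one-vs-rest precision, recall, F1, and accuracy for every genre."""
    results = {}
    for g in genres:
        # Positive samples: movies whose actual genre list contains g.
        g_true = df['actual'].apply(lambda lst: g in lst)
        # Positive predictions: rows where the model predicted g.
        g_pred = df['predicted'] == g

        # zero_division=0 reports 0 (without a warning) when a score is undefined,
        # e.g. for genres the model never predicts.
        precision = precision_score(g_true, g_pred, zero_division=0)
        recall = recall_score(g_true, g_pred, zero_division=0)
        f1 = f1_score(g_true, g_pred, zero_division=0)
        accuracy = accuracy_score(g_true, g_pred)
        results[g] = dict(precision=precision, recall=recall, f1=f1, accuracy=accuracy)
    return results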

metrics1 = compute_metrics(df1)
metrics2 = compute_metrics(df2)
metrics3 = compute_metrics(df3)

metrics1


metrics2


metrics3


{'Action': {'precision': 0.0,
  'recall': 0.0,
  'f1': 0.0,
  'accuracy': 0.8169761273209549},
 'Adventure': {'precision': 0.0,
  'recall': 0.0,
  'f1': 0.0,
  'accuracy': 0.9135278514588859},
 'Animation': {'precision': 0.0,
  'recall': 0.0,
  'f1': 0.0,
  'accuracy': 0.9697612732095491},
 'Biography': {'precision': 0.0,
  'recall': 0.0,
  'f1': 0.0,
  'accuracy': 0.9607427055702917},
 'Comedy': {'precision': 0.0,
  'recall': 0.0,
  'f1': 0.0,
  'accuracy': 0.6870026525198939},
 'Crime': {'precision': 0.0,
  'recall': 0.0,
  'f1': 0.0,
  'accuracy': 0.8620689655172413},
 'Documentary': {'precision': 0.0,
  'recall': 0.0,
  'f1': 0.0,
  'accuracy': 0.9681697612732095},
 'Drama': {'precision': 0.4981432360742706,
  'recall': 1.0,
  'f1': 0.6650141643059491,
  'accuracy': 0.4981432360742706},
 'Family': {'precision': 0.0,
  'recall': 0.0,
  'f1': 0.0,
  'accuracy': 0.9517241379310345},
 'Fantasy': {'precision': 0.0,
  'recall': 0.0,
  'f1': 0.0,
  'accuracy': 0.9506631299734748},
 'Histo