## Disney Movie Sentiment Effect
This notebook aims to predict the sentiment of upcoming Disney movies based on character roles and viewer sentiment data. The workflow includes merging datasets, training a machine learning model, and making predictions on new datasets.

The purpose of this experiment is to classify characters and analyze their impact on viewers' emotional responses, using a combination of personality traits, narrative roles, and archetypes. Here are 10 distinct character types that could influence emotional responses:

1. The Hero/Protagonist
Description: Central character driving the story, often evoking empathy or admiration.
Effect: A poorly written or unrelatable protagonist can lead to disengagement.
Examples: Simba (The Lion King), Moana (Moana).
2. The Villain/Antagonist
Description: Opposes the hero, often introducing conflict and tension.
Effect: An overly predictable or weak villain can reduce emotional stakes.
Examples: Scar (The Lion King), Ursula (The Little Mermaid).
3. The Comic Relief
Description: Provides humor and light-hearted moments.
Effect: Excessive or inappropriate humor can disrupt the emotional tone.
Examples: Olaf (Frozen), Mushu (Mulan).
4. The Sidekick
Description: Supports the hero, often offering loyalty and advice.
Effect: A sidekick who is annoying or lacks depth can frustrate viewers.
Examples: Timon and Pumbaa (The Lion King), Dory (Finding Nemo).
5. The Mentor
Description: Offers wisdom or guidance to the hero, often inspiring emotional connections.
Effect: An unconvincing or overly preachy mentor can feel insincere.
Examples: Mufasa (The Lion King), Grandmother Willow (Pocahontas).
6. The Love Interest
Description: Provides romantic tension or emotional stakes for the protagonist.
Effect: A shallow or forced love interest can detract from the story's emotional weight.
Examples: Belle (Beauty and the Beast), Jasmine (Aladdin).
7. The Anti-Hero
Description: A morally complex character who may work against traditional values.
Effect: If poorly developed, they can confuse or alienate viewers.
Examples: Megamind (Megamind), Jack Sparrow (Pirates of the Caribbean).
8. The Tragic Character
Description: Suffers personal loss or misfortune, evoking pity or sadness.
Effect: Overuse of tragedy or melodrama can feel manipulative.
Examples: Bambi's mother (Bambi), Bing Bong (Inside Out).
9. The Villain-Turned-Ally
Description: Starts as an antagonist but changes sides due to personal growth or redemption.
Effect: A poorly justified redemption arc can feel hollow or unsatisfying.
Examples: Stitch (Lilo & Stitch), Kuzco (The Emperor’s New Groove).
10. The Background Ensemble
Description: Minor characters who add flavor but do not drive the main plot.
Effect: Overly distracting or poorly integrated background characters can disrupt immersion.
Examples: The hyenas (The Lion King), The trolls (Frozen).
Additional Subtypes to Consider
The Innocent Child: Evokes purity or vulnerability (e.g., Boo in Monsters, Inc.).
The Animal Companion: Adds charm or relatability (e.g., Maximus in Tangled).
The Trickster: Causes chaos, often with ambiguous motives (e.g., Loki in Marvel movies).


## Experiment Setup

# Problem Definition
Predict Future Disney Movie Sentiment Based on Scripted Character Data.

#### Data Collection
Gather viewer feedback on emotional responses to movies with identified character types.
Collect metadata on characters’ roles, screen time, and development.

#### Feature Engineering
Categorize characters into these 10 types.
Include attributes like personality traits, narrative impact, and audience reactions.

#### Analysis
Use sentiment analysis on reviews or surveys to measure emotional responses.
Apply machine learning models to determine correlations between character types and negative responses.

#### Insights
Identify which types consistently evoke negative reactions.
Determine if combinations of character types exacerbate or mitigate negative effects.
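
The analysis and insight steps above can be sketched as a plain group-by before any model is involved. The snippet below is a minimal illustration on made-up rows; the column names `role` and `score`, the cutoff of 3, and the helper name `negative_share_by_role` are assumptions chosen for this example, not part of the notebook's datasets.

```python
import pandas as pd

def negative_share_by_role(df: pd.DataFrame, negative_max: int = 3) -> pd.Series:
    """Return, per role, the fraction of characters scoring at or below negative_max."""
    is_negative = df["score"] <= negative_max
    return is_negative.groupby(df["role"]).mean().sort_values(ascending=False)

# Hypothetical toy data, invented purely for illustration
toy = pd.DataFrame({
    "role": ["Antagonist", "Antagonist", "Comic Relief", "Comic Relief", "Mentor"],
    "score": [2, 6, 3, 9, 10],
})
shares = negative_share_by_role(toy)
print(shares)
```

A table like this only shows association within the collected sample; it does not by itself establish that a character type causes negative reactions.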

In [67]:
# Step 1: Import Required Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, accuracy_score

# Step 2: Load and Explore Datasets
Load datasets and examine the structure of the data.

In [68]:
# Step 2: Load and Explore Datasets
# Load the viewer sentiment and character roles datasets
viewer_sentiment_path = 'character_sentiment_scores.csv'  # Replace with your file path
character_roles_path = 'character_roles.csv'  # Replace with your file path
viewer_sentiment_data = pd.read_csv(viewer_sentiment_path)
character_roles_data = pd.read_csv(character_roles_path)

# Display the last few rows of each dataset
print("Viewer Sentiment Data:")
print(viewer_sentiment_data.tail())

print("\nCharacter Roles Data:")
print(character_roles_data.tail())

Viewer Sentiment Data:
   Character Name   Movie Title          Role  Sentiment Score
95         Marlin  Finding Nemo   Protagonist                9
96           Dory  Finding Nemo  Comic Relief                9
97           Nemo  Finding Nemo    Supportive                9
98          Bruce  Finding Nemo    Antagonist                6
99          Crush  Finding Nemo  Comic Relief                8

Character Roles Data:
   Character Name          Role
95         Marlin   Protagonist
96           Dory  Comic Relief
97           Nemo    Supportive
98          Bruce    Antagonist
99          Crush  Comic Relief


# Step 3: Merge Datasets
Combine the datasets for training the model.
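
Both files shown above contain a `Role` column, so a merge on `Character Name` alone duplicates it with suffixes. The sketch below illustrates that behavior on a few rows copied from the `tail()` output and shows one possible alternative, merging on both shared columns. Whether that alternative suits the full data is an assumption; it requires the roles to agree across the two files.

```python
import pandas as pd

# Toy stand-ins for the two CSV files (values taken from the tail() output above)
sentiment = pd.DataFrame({
    "Character Name": ["Marlin", "Dory", "Bruce"],
    "Movie Title": ["Finding Nemo"] * 3,
    "Role": ["Protagonist", "Comic Relief", "Antagonist"],
    "Sentiment Score": [9, 9, 6],
})
roles = pd.DataFrame({
    "Character Name": ["Marlin", "Dory", "Bruce"],
    "Role": ["Protagonist", "Comic Relief", "Antagonist"],
})

# Merging on the name alone duplicates the shared 'Role' column as Role_x / Role_y
by_name = pd.merge(sentiment, roles, on="Character Name", how="inner")
print(by_name.columns.tolist())

# Merging on both shared columns keeps a single 'Role' column;
# validate="one_to_one" raises a MergeError if a key pair repeats on either side
by_name_and_role = pd.merge(
    sentiment, roles, on=["Character Name", "Role"], how="inner", validate="one_to_one"
)
print(by_name_and_role.columns.tolist())
```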

In [69]:
# Step 3: Merge Datasets
# Merge the datasets on a common key, such as 'Character Name'.
# Both files contain a 'Role' column, so the result carries it twice as Role_x and Role_y.
merged_data = pd.merge(viewer_sentiment_data, character_roles_data, on='Character Name', how='inner')

# Display the merged dataset
print("\nMerged Data:")
print(merged_data.head())

# Step 5: Data Preprocessing
# Encode categorical features and prepare the data for training
merged_data['Sentiment Label'] = merged_data['Sentiment Score'].map({
    10: "Positive",
    9: "Positive",
    8: "Positive",
    7: "Positive",
    6: "Neutral",
    5: "Neutral",
    4: "Neutral",
    3: "Negative",
    2: "Negative",
    1: "Negative"
})


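# The target is the sentiment label. Drop it and the raw score from the features
# so the model cannot read the answer directly from its inputs.
# (The recorded evaluation output further down lists movie titles as classes,
# so that run evidently used a different label column.)
labels = merged_data['Sentiment Label']
features = merged_data.drop(['Sentiment Score', 'Sentiment Label'], axis=1)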

# Perform one-hot encoding for categorical features
features = pd.get_dummies(features)

print("\nMerged Data:")
print(merged_data)


Merged Data:
   Character Name Movie Title       Role_x  Sentiment Score       Role_y  \
0  Luke Skywalker   Star Wars  Protagonist                9  Protagonist   
1     Darth Vader   Star Wars   Antagonist                8   Antagonist   
2     Leia Organa   Star Wars   Supportive                9   Supportive   
3        Han Solo   Star Wars  Protagonist                9  Protagonist   
4            Yoda   Star Wars       Mentor               10       Mentor   

  Sentiment Label  
0        Positive  
1        Positive  
2        Positive  
3        Positive  
4        Positive  

Merged Data:
    Character Name   Movie Title        Role_x  Sentiment Score        Role_y  \
0   Luke Skywalker     Star Wars   Protagonist                9   Protagonist   
1      Darth Vader     Star Wars    Antagonist                8    Antagonist   
2      Leia Organa     Star Wars    Supportive                9    Supportive   
3         Han Solo     Star Wars   Protagonist                9   Prota
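
The score-to-label mapping used in the preprocessing step can also be written as a binning operation. The sketch below uses `pd.cut` with the same thresholds (1 to 3 Negative, 4 to 6 Neutral, 7 to 10 Positive) on a small hand-made series; the sample scores are illustrative, not taken from the dataset.

```python
import pandas as pd

# Hand-made sample scores, for illustration only
scores = pd.Series([1, 3, 4, 6, 7, 10], name="Sentiment Score")

# Bin edges are right-inclusive: (0, 3] -> Negative, (3, 6] -> Neutral, (6, 10] -> Positive
binned = pd.cut(scores, bins=[0, 3, 6, 10], labels=["Negative", "Neutral", "Positive"])
print(binned.tolist())
```

Scores outside the 1 to 10 range fall outside every bin and come back as `NaN`, which makes unexpected values easy to spot before training.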

In [70]:
# Step 6: Train-Test Split
X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.2, random_state=42)

# Step 7: Train the Model
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)

# Step 8: Evaluate the Model
predictions = model.predict(X_test)
print("\nModel Evaluation:")
print(classification_report(y_test, predictions))
print("Accuracy:", accuracy_score(y_test, predictions))


Model Evaluation:
                       precision    recall  f1-score   support

             Avengers       0.00      0.00      0.00         1
                 Cars       0.00      0.00      0.00         1
                 Coco       1.00      1.00      1.00         1
         Harry Potter       1.00      1.00      1.00         2
           Inside Out       1.00      1.00      1.00         1
             Iron Man       0.00      0.00      0.00         1
                Shrek       1.00      1.00      1.00         1
              Shrek 2       0.00      0.00      0.00         0
            Star Wars       1.00      1.00      1.00         3
              Tangled       0.00      0.00      0.00         0
      The Dark Knight       0.00      0.00      0.00         1
      The Incredibles       1.00      1.00      1.00         1
        The Lion King       1.00      1.00      1.00         2
   The Little Mermaid       1.00      1.00      1.00         1
The Lord of the Rings       1.00   

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
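
The report above contains classes with a support of 0, and the `_warn_prf` lines come from scikit-learn warning that precision or recall is undefined for some classes. The following is a minimal sketch, on synthetic stand-in data, of two options that may help: a stratified split, and the `zero_division` argument of `classification_report`. The data, the variable names, and the three-class setup are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-in data: 60 characters, one categorical role, three balanced classes
roles = rng.choice(["Protagonist", "Antagonist", "Mentor"], size=60)
y = pd.Series(["Negative", "Neutral", "Positive"] * 20)
X = pd.get_dummies(pd.DataFrame({"Role": roles}))

# stratify=y keeps the class proportions the same in the train and test splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
clf = RandomForestClassifier(random_state=42).fit(X_train, y_train)
pred = clf.predict(X_test)

# zero_division=0 reports 0.0 for undefined precision/recall instead of emitting a warning
report = classification_report(y_test, pred, zero_division=0)
print(report)
```

Stratification needs at least two samples per class, so it would not work with a label column in which many classes appear only once.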


In [None]:
# Step 9: Plot Characteristics vs. Sentiment
# Visualize patterns and correlations
import seaborn as sns
plt.figure(figsize=(4, 28))
sns.scatterplot(
    data=merged_data,
    y='Character Name',
    x='Sentiment Score',
    hue='Sentiment Label',  # color by the categorical label so the palette keys match
    palette={'Positive': 'green', 'Negative': 'red', 'Neutral': 'blue'},
    s=100
)
plt.title("Character Viewer Sentiment")
plt.xlabel("Sentiment Score")
plt.ylabel("Character Name")
plt.legend(title="Viewer Sentiment")
plt.tight_layout()
plt.show()

In [72]:

# Step 10: Predict Sentiment for New Data
# Simulate a new dataset based on an upcoming movie script
new_script_data = pd.DataFrame({
    'Character Name': ['Hero', 'Villain', 'Sidekick'],
    'Role': ['Protagonist', 'Antagonist', 'Comic Relief'],
    'Screen Time (Minutes)': [60, 45, 30]
})

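# Preprocess the new dataset
# The merged training frame stores the role as Role_x and Role_y, so mirror 'Role'
# into both names; a plain 'Role' column would match none of the trained columns.
encoded = new_script_data.drop(columns='Role').assign(
    Role_x=new_script_data['Role'],
    Role_y=new_script_data['Role'],
)
encoded = pd.get_dummies(encoded)
# Align to the training columns in one step: columns the model never saw
# (e.g. screen time, new character names) are dropped and missing ones are filled with 0.
encoded = encoded.reindex(columns=features.columns, fill_value=0)

# Predict sentiment for the new script
new_predictions = model.predict(encoded)
new_script_data['Predicted Sentiment'] = new_predictions

# Display the predictions
print("\nPredicted Sentiment for New Characters:")
print(new_script_data)

# Step 11: Predict Overall Movie Sentiment
# Predictions are string labels, so averaging them directly raises a TypeError.
# Compare against 'Positive' first, then take the share of positive characters.
positive_share = (new_script_data['Predicted Sentiment'] == 'Positive').mean()
overall_movie_sentiment = 'Positive' if positive_share > 0.5 else 'Negative'
print("\nPredicted Overall Movie Sentiment:", overall_movie_sentiment)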



Predicted Sentiment for New Characters:
   Character Name_Albus Dumbledore  Character Name_Anger  Character Name_Anna  \
0                                0                     0                    0   
1                                0                     0                    0   
2                                0                     0                    0   

   Character Name_Anton Ego  Character Name_Aragorn  Character Name_Ariel  \
0                         0                       0                     0   
1                         0                       0                     0   
2                         0                       0                     0   

   Character Name_Beast  Character Name_Belle  Character Name_Bruce  \
0                     0                     0                     0   
1                     0                     0                     0   
2                     0                     0                     0   

   Character Name_Bruce Banner  ...  Rol

  new_script_data[col] = 0

TypeError: Could not convert string 'MoanaMoanaMoana' to numeric
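
The `TypeError` above arises because the predictions are strings, and `.mean()` on a string column has no numeric meaning. One way to aggregate string labels is a majority vote; the sketch below weights each character's vote by screen time. The helper name `overall_sentiment`, the weighting by screen time, and the sample values are assumptions for illustration, not something the notebook prescribes.

```python
import pandas as pd

def overall_sentiment(labels: pd.Series, weights: pd.Series) -> str:
    """Return the label with the largest total weight (a weighted majority vote)."""
    totals = weights.groupby(labels).sum()
    return totals.idxmax()

# Hypothetical per-character predictions and screen times, for illustration only
predicted = pd.Series(["Positive", "Negative", "Positive"])
screen_time = pd.Series([60, 45, 30])
overall = overall_sentiment(predicted, screen_time)
print(overall)
```

With equal totals, `idxmax` returns the first label in sorted order, so a real pipeline would probably want an explicit tie-breaking rule.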