# Structure and Plot
A key goal of *Moviegoer* is understanding a film's structure. A movie can be divided into several types of units (acts, sequences, shots), but the most important is the **scene**. A scene is a self-contained unit of action and dialogue, usually taking place at a single location and involving one or more characters.

In this notebook, we'll take a look at some basic information about the film as a whole, then look at scenes we've found and what we can learn from their dialogue. Many of these analyses are image-based, and examples of illustrations are provided in the Readme in the root of this directory. It's strongly recommended to follow along with those illustrations.

In [1]:
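import sys
import spacy  # imported explicitly; otherwise spacy.load below relies on the wildcard imports
sys.path.append('../unifying_features')
sys.path.append('../data_serialization')
from serialization_preprocessing_io import *
from time_reference_io import *
from film_details_io import *
from scene_identification_io import *
from scene_details_io import *
from character_identification_io import *
from character_details_io import *
# Load the English language model. On spaCy 3+, the 'en' shortcut is no longer
# available and a full model name such as 'en_core_web_sm' is needed instead.
nlp = spacy.load('en')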

# Baseline Film Information
First, we'll take a look at the drama *Lost in Translation* (2003). We can view some basic information about the film as a whole.

In [2]:
film = 'lost_in_translation_2003'
display_film_baseline(film)

---------
Technical
---------
Aspect Ratio: 1.84
Average shot duration: 5.11
Average frame brightness: 36
Average frame contrast: 29

--------
Dialogue
--------
Spoken sentences per minute: 15
Words per sentence: 4.13

-------
Emotion
-------
Percentage of Upset facial expressions: 50%
Instances of laughter, per minute: 0.3
One in 1941 words is a profanity.


*Lost in Translation* is a famously quiet movie. As a film about loneliness, it features many dialogue-free scenes of its characters wandering through Tokyo, trying to make sense of a foreign culture. The dialogue that does exist is sparse.

# Two-Character Dialogue Scenes
We've created an algorithm to automatically identify scenes, marking each one by its first and last frame. For now, the algorithm specifically identifies two-character dialogue scenes. These scenes are the basic building blocks of cinema: two characters speaking to each other, with no distractions, purely advancing the plot.

In [3]:
srt_df, subtitle_df, sentence_df, vision_df, face_df = read_pickle(film)
scene_dictionaries = generate_scenes(vision_df, face_df, substantial_minimum=4, anchor_search=6)

## The A/B/A/B pattern
In modern film, two-character dialogue scenes follow a very distinct pattern. Character A speaks, then Character B, then back to A, then to B, etc. We cut back and forth between the two characters.

We look for these two Anchor shots, which are the shots of the two characters and form the A/B/A/B pattern. We can also identify Cutaway shots, which aren't part of the A/B/A/B pattern but are still part of the scene.
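The alternation described above can be sketched in a few lines. This is not the implementation behind `generate_scenes`; it is a minimal illustration that assumes we already have one cluster id per shot, in film order. The function name `find_alternating_runs` and the `min_length` parameter are hypothetical, and unlike the real algorithm this sketch does not tolerate Cutaway shots inside a run.

```python
def find_alternating_runs(shot_clusters, min_length=4):
    """Find runs where two shot-cluster ids strictly alternate (A/B/A/B...).

    shot_clusters: list of cluster ids, one per shot, in film order (assumed input).
    Returns a list of (start_index, end_index, cluster_a, cluster_b) tuples.
    """
    runs = []
    i = 0
    n = len(shot_clusters)
    while i < n - 1:
        a, b = shot_clusters[i], shot_clusters[i + 1]
        if a == b:
            i += 1
            continue
        # Extend the run while each shot matches the shot two positions earlier
        j = i + 2
        while j < n and shot_clusters[j] == shot_clusters[j - 2]:
            j += 1
        if j - i >= min_length:
            runs.append((i, j - 1, a, b))
            i = j  # continue searching after this run
        else:
            i += 1
    return runs


# Illustrative sequence of shot clusters, not taken from the film's data
shots = [5, 207, 66, 207, 66, 207, 9]
print(find_alternating_runs(shots))
```

In this toy sequence, the run covers the five shots alternating between clusters 207 and 66, which would play the role of the two Anchor shots.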

Let's take a look at the first scene the algorithm identified. This is the very first scene in which our main characters, Charlotte and Bob, have a conversation. It doesn't take place until 31 minutes into the film; again, the film is sparse on dialogue.

In [4]:
scene_dictionaries[1]

{'scene_id': 1,
 'first_frame': 1929,
 'last_frame': 2082,
 'scene_duration': 154,
 'left_anchor_shot_cluster': 207,
 'left_anchor_face_cluster': 6.0,
 'matching_left_face_clusters': [],
 'right_anchor_shot_cluster': 66,
 'right_anchor_face_cluster': 17.0,
 'matching_right_face_clusters': [18.0, 12.0],
 'cutaway_shot_clusters': [7, 29, 142]}

In this 2:34 scene, Bob and Charlotte have a conversation in the A/B/A/B format, so we're able to identify the first and last frames in which they speak. We can also discover some Cutaways, which are just two-shots of Bob and Charlotte sitting at the bar.
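The 2:34 figure follows from the scene dictionary if we assume frames are sampled at one per second, which is consistent with a `scene_duration` of 154 but is not stated explicitly here. A small sketch of the conversion, with the hypothetical helper `format_duration`:

```python
def format_duration(frame_count, frames_per_second=1):
    """Convert a frame count to an m:ss string.

    frames_per_second=1 is an assumption inferred from the scene output,
    not a documented setting.
    """
    total_seconds = round(frame_count / frames_per_second)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}:{seconds:02d}"


scene = {'first_frame': 1929, 'last_frame': 2082}
# Both the first and last frame belong to the scene, hence the + 1
frame_count = scene['last_frame'] - scene['first_frame'] + 1
print(frame_count, format_duration(frame_count))  # 154 2:34
```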

We can then turn to the dialogue to try to identify important phrases and bits of conversation. We extract terms that may be important, as well as potentially important pieces of dialogue.
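To give a sense of how TF-IDF surfaces candidate terms, here is a plain-Python sketch that treats each scene's dialogue as one document. It is an illustration only: the notebook's own extraction may use a library vectorizer, stop-word removal, and n-grams, none of which are reproduced here. The names `tokenize` and `top_tfidf_terms`, and the sample lines, are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_tfidf_terms(scene_texts, scene_index, top_n=5):
    """Rank terms in one scene by term frequency * inverse document frequency."""
    docs = [Counter(tokenize(text)) for text in scene_texts]
    n_docs = len(docs)
    # Document frequency: in how many scenes does each term appear?
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(doc.keys())
    target = docs[scene_index]
    total = sum(target.values())
    scores = {
        term: (count / total) * math.log(n_docs / doc_freq[term])
        for term, count in target.items()
    }
    # Highest score first; ties broken alphabetically for stable output
    return sorted(scores, key=lambda term: (-scores[term], term))[:top_n]


# Illustrative dialogue snippets standing in for per-scene text
scenes = [
    "did you buy a porsche yet",
    "when are you leaving",
    "what did you say",
]
print(top_tfidf_terms(scenes, scene_index=0, top_n=4))
```

Words that appear in every scene, such as "you" in this toy corpus, score zero and drop to the bottom, while words unique to the scene rise to the top.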

In [5]:
single_scene_dict = scene_dictionaries[1]
display_scene_dialogue_context(single_scene_dict, subtitle_df, sentence_df, nlp)

-------------------------------
Icebreaker (Conversation Start)
-------------------------------
Thanks.
So, what are you doing here?
A couple of things.


-------------------------
Kicker (Conversation End)
-------------------------
Cheers to that.
Wish I could sleep.
Me, too.


--------------------------------
Possible Important Terms, TF-IDF
--------------------------------
['years', 'porsche', 'yeah', 'doing', 'just']


-------------------------------------
Possible Important Terms, Noun Groups
-------------------------------------
['a Porsche', 'marriage', 'Cheers', 'A couple', 'things']


--------------------------------
Directed Questions and Responses
--------------------------------
So, what are you doing here?
A couple of things.

What are you doing?
My husband's a photographer, so he's here working, and I wasn't doing anything, so I came along.

How long have you been married?
Thank you.

Did you buy a Porsche yet?
You know, I was thinking about buying a Porsche.

What do you

The "Directed Questions" section is particularly informative here. This is the first scene where the characters speak, so they're getting to know each other by asking each other personal questions.

Directed questions are questions that address the second-person "you", so they're generally more informative than other types of questions. For example, compare "Is it cold out?" with "Are you bringing a jacket?": the second question invites a more personal response.
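A rough way to express this rule in code is to keep sentences that end in a question mark and contain a second-person pronoun. The sketch below uses a regular expression rather than the spaCy pipeline, and pairing each question with the sentence that follows it is an assumption about how responses are chosen. The names `is_directed_question` and `pair_questions_with_responses` are hypothetical.

```python
import re

# Second-person forms; \b also matches "you" inside contractions like "you're"
SECOND_PERSON = re.compile(r"\b(you|your|yours|yourself)\b", re.IGNORECASE)


def is_directed_question(sentence):
    """True if the sentence is a question that addresses "you"."""
    text = sentence.strip()
    return text.endswith("?") and bool(SECOND_PERSON.search(text))


def pair_questions_with_responses(sentences):
    """Pair each directed question with the sentence that follows it."""
    pairs = []
    for current, following in zip(sentences, sentences[1:]):
        if is_directed_question(current):
            pairs.append((current, following))
    return pairs


print(is_directed_question("Is it cold out?"))             # False
print(is_directed_question("Are you bringing a jacket?"))  # True

dialogue = ["Thanks.", "So, what are you doing here?", "A couple of things."]
print(pair_questions_with_responses(dialogue))
```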

## Emotional Analysis, at the Scene Level

With scenes identified, we can also look into a scene's conversation speed, as well as the characters' facial expressions. The next scene has Bob and Charlotte reconciling after a fight and realizing their time together is coming to an end.

In [6]:
single_scene_dict = scene_dictionaries[4]
single_scene_dict

{'scene_id': 4,
 'first_frame': 5195,
 'last_frame': 5242,
 'scene_duration': 48,
 'left_anchor_shot_cluster': 30,
 'left_anchor_face_cluster': 6.0,
 'matching_left_face_clusters': [],
 'right_anchor_shot_cluster': 38,
 'right_anchor_face_cluster': 12.0,
 'matching_right_face_clusters': [9.0, 17.0],
 'cutaway_shot_clusters': [10, 29]}

In [7]:
display_scene_dialogue_context(single_scene_dict, subtitle_df, sentence_df, nlp)

-------------------------------
Icebreaker (Conversation Start)
-------------------------------
That was the worst lunch.
So bad.
What kind of restaurant makes you cook your own food?


-------------------------
Kicker (Conversation End)
-------------------------
When are you leaving?
Tomorrow.
I'll miss you.


--------------------------------
Possible Important Terms, TF-IDF
--------------------------------
['makes cook food', 'cook food', 'll miss', 'bad kind', 'bad kind restaurant']


-------------------------------------
Possible Important Terms, Noun Groups
-------------------------------------
['the worst lunch', 'What kind', 'restaurant', 'your own food', 'Tomorrow']


--------------------------------
Directed Questions and Responses
--------------------------------
What kind of restaurant makes you cook your own food?
When are you leaving?

When are you leaving?
Tomorrow.

-------------------------
First-Person Declarations
-------------------------
I'll miss you.

------------

Recall that the film's baseline statistics showed an average conversation speed of 15 sentences per minute. This scene is even slower, clocking in at a barren 8 sentences per minute. The emotional impact comes from the characters' facial expressions as they look at each other in silence.
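Conversation speed is a simple rate: the number of spoken sentences divided by the scene's length in minutes. A minimal sketch, assuming we have a sentence count and a duration in seconds; the helper name `sentences_per_minute` and the example counts are hypothetical, not values read from the film's data.

```python
def sentences_per_minute(sentence_count, duration_seconds):
    """Spoken sentences per minute of screen time."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return sentence_count * 60 / duration_seconds


# Illustrative values: 10 sentences spoken over a two-minute scene
print(sentences_per_minute(10, 120))  # 5.0
```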

Charlotte is sad about their impending departure, and her face registers a negative emotion (labeled "angry" in the output below) in almost 40% of her frames. Bob, played by the notoriously deadpan Bill Murray, has a neutral look on his face for the majority of the scene.

In [8]:
display_scene_emotions(single_scene_dict, face_df)

Left character, with face clusters [6.0] has the primary emotion: angry, in 39% of frames
Right character, with face clusters [12.0, 9.0, 17.0] has the primary emotion: neutral, in 56% of frames
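A summary like the one above can be produced by counting emotion labels across a character's frames and reporting the most common one. The sketch below assumes a list of per-frame labels, with `None` for frames where no face was detected; that input shape, the choice to exclude `None` frames from the percentage, and the name `primary_emotion` are assumptions rather than the behavior of `display_scene_emotions`.

```python
from collections import Counter


def primary_emotion(frame_emotions):
    """Return (most common emotion, share of labeled frames it covers)."""
    counts = Counter(e for e in frame_emotions if e is not None)
    if not counts:
        return None, 0.0
    emotion, count = counts.most_common(1)[0]
    return emotion, count / sum(counts.values())


# Illustrative labels, not taken from the film's data
labels = ['angry', 'angry', 'sad', None, 'neutral', 'angry']
emotion, share = primary_emotion(labels)
print(f"primary emotion: {emotion}, in {share:.0%} of frames")
```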


## Finding Drama: Scene vs. Film Attributes
Finally, we can take a look at a scene from *Plus One* (2019), a romantic comedy.

In [9]:
film = 'plus_one_2019'
display_film_baseline(film)

---------
Technical
---------
Aspect Ratio: 2.39
Average shot duration: 3.73
Average frame brightness: 59
Average frame contrast: 44

--------
Dialogue
--------
Spoken sentences per minute: 32
Words per sentence: 4.0

-------
Emotion
-------
Percentage of Upset facial expressions: 45%
Instances of laughter, per minute: 0.67
One in 117 words is a profanity.


Sharp dialogue is a staple of the romantic comedy: *Plus One* has more than double the sentences per minute of *Lost in Translation*. As a more mainstream rom-com, it's also shot more traditionally. We've identified 18 two-character dialogue scenes, and we'll take a look at the 17th.

In [10]:
srt_df, subtitle_df, sentence_df, vision_df, face_df = read_pickle(film)
scene_dictionaries = generate_scenes(vision_df, face_df, substantial_minimum=4, anchor_search=6)
single_scene_dict = scene_dictionaries[17]
display_scene_dialogue_context(single_scene_dict, subtitle_df, sentence_df, nlp)

-------------------------------
Icebreaker (Conversation Start)
-------------------------------
Uh, does he...
Does he know about me?
Yeah.


-------------------------
Kicker (Conversation End)
-------------------------
Nate, can we just get five minutes?
Um, I should really go back inside.
Bye, Ben.


--------------------------------
Possible Important Terms, TF-IDF
--------------------------------
['right', 'speech', 'love', 'just', 'ben']


-------------------------------------
Possible Important Terms, Noun Groups
-------------------------------------
['a big speech', 'Mm', 'the whole Shaina', 'accounting thing', 'one thing']


--------------------------------
Directed Questions and Responses
--------------------------------
-------------------------
First-Person Declarations
-------------------------
Um, hey, can I...
I just need to say one thing, and I'll...
I'll leave you alone, I swear.
I really can't handle a big speech right now, Ben.
I'm an asshole.
I-I can't do them on my o

This is a tense scene in which Ben pours his heart out to Alice. The scene's rate of profanity is double that of the film overall. We can pay special attention to scenes like these: lots of profanity might indicate lots of drama (and emotional data!), such as a fight or a confession like this one. The same applies to other scene and film attributes, not just profanity, so we should be comparing scenes to the film baseline.
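One way to make this comparison systematic is to divide each scene attribute by the film's baseline value and flag the attributes that deviate strongly. This is a sketch under assumptions: the dictionary keys, the helper names `compare_to_baseline` and `flag_outliers`, the factor-of-two threshold, and the scene values are all illustrative.

```python
def compare_to_baseline(scene_stats, film_stats):
    """Ratio of each scene attribute to the film-wide baseline."""
    ratios = {}
    for name, film_value in film_stats.items():
        # Skip attributes missing from the scene or with a zero baseline
        if name in scene_stats and film_value:
            ratios[name] = scene_stats[name] / film_value
    return ratios


def flag_outliers(ratios, factor=2.0):
    """Attributes at least `factor` times above or below the baseline."""
    return sorted(
        name for name, ratio in ratios.items()
        if ratio >= factor or ratio <= 1 / factor
    )


# Film values follow the baseline output above; scene values are hypothetical
film_stats = {'sentences_per_minute': 32, 'profanity_per_word': 1 / 117}
scene_stats = {'sentences_per_minute': 30, 'profanity_per_word': 2 / 117}

ratios = compare_to_baseline(scene_stats, film_stats)
print(flag_outliers(ratios))
```

With these numbers, only the profanity rate is flagged, since the scene's conversation speed is close to the film's average.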

Also of note, this scene contains a lot of First-Person Declarations, which tell us a lot about the plot.

In [11]:
display_scene_emotions(single_scene_dict, face_df)

Left character, with face clusters [31.0, 11.0, 3.0] has the primary emotion: sad, in 35% of frames
Right character, with face clusters [2.0, 8.0] has the primary emotion: sad, in 53% of frames
