# scene_details
Now that we can automatically identify a few scenes of a film, we can take a closer look at each one. In this notebook, we'll try to identify as many features as we can: the level of drama in the dialogue, the emotions of the characters, the mood of the score, etc.

In [1]:
import sys
from sklearn.feature_extraction.text import TfidfVectorizer
from time_reference_io import *
from film_details_io import *
from collections import Counter
import pandas as pd
from scene_identification_io import *
sys.path.append('../subtitle_features')
from subtitle_dataframes_io import *
sys.path.append('../audio_features')
from audio_processing_io import *

In [2]:
film = 'lost_in_translation_2003'
srt_df, subtitle_df, sentence_df, vision_df, face_df = read_pickle(film)

## Creating Scene-Specific DataFrames
We have a function to create a dictionary of scenes.

In [3]:
scene_dictionaries = generate_scenes(vision_df, face_df, substantial_minimum=4, anchor_search=6)
len(scene_dictionaries)

5

In [4]:
scene_dict = scene_dictionaries[3]
scene_dict

{'scene_id': 3,
 'first_frame': 2489,
 'last_frame': 2533,
 'scene_duration': 45,
 'left_anchor_shot_cluster': 196,
 'left_anchor_face_cluster': 18.0,
 'matching_left_face_clusters': [9.0, 12.0],
 'right_anchor_shot_cluster': 38,
 'right_anchor_face_cluster': 6.0,
 'matching_right_face_clusters': [],
 'cutaway_shot_clusters': [29]}

Each dictionary gives us the first and last frames. We can use these to create new vision- and subtitle-related dataframes containing only data from this scene.
### Vision DataFrames

In [5]:
scene_vision_df = vision_df[scene_dict['first_frame'] - 1:scene_dict['last_frame']]
scene_face_df = face_df[scene_dict['first_frame'] - 1:scene_dict['last_frame']]

### Subtitle DataFrames

In [6]:
scene_duration = scene_dict['last_frame'] + 1 - scene_dict['first_frame']
scene_start_time = frame_to_time(scene_dict['first_frame'])
scene_end_time = frame_to_time(scene_dict['last_frame'] + 1) # add 1 second; scene ends one second after this frame is onscreen
scene_subtitle_df = subtitle_df[
    (subtitle_df['end_time'] > scene_start_time) & (subtitle_df['start_time'] < scene_end_time)].copy()

scene_sentence_indices = []
for x, sub_index_list in enumerate(sentence_df.subtitle_indices.values):
    # keep any sentence built from at least one subtitle in this scene
    if any(sub_index in scene_subtitle_df.index.values for sub_index in sub_index_list):
        scene_sentence_indices.append(x)
# take the contiguous run from the first matching sentence to the last
scene_sentence_df = sentence_df[scene_sentence_indices[0]: scene_sentence_indices[-1] + 1]

In [7]:
scene_subtitle_df.head(3)

| | srt_index | original_text | start_time | end_time | concat_sep_text | separated_flag | laugh | hesitation | speaker | music | parenthetical | el_parenthetical | el_italic | cleaned_text |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 630 | 631 | (GIGGLES)\nHello. | 00:41:30.488000 | 00:41:31.780000 | (GIGGLES) Hello. | 0 | 1 | 0 | | 0 | GIGGLES | 0 | 0 | Hello. |
| 631 | 632 | Hello.\nHow are you? | 00:41:31.865000 | 00:41:33.073000 | Hello. How are you? | 0 | 0 | 0 | | 0 | | 0 | 0 | Hello. How are you? |
| 632 | 633 | Good. How are you? | 00:41:33.158000 | 00:41:34.575000 | Good. How are you? | 0 | 0 | 0 | | 0 | | 0 | 0 | Good. How are you? |


In [8]:
scene_sentence_df.head(3)

| | sentence | subtitle_indices | profanity | self_intro | other_intro | direct_address | conv_boundary | offscreen_speaker | implied_speaker |
|---|---|---|---|---|---|---|---|---|---|
| 721 | Hello. | [630] | 0 | | | | | | |
| 722 | Hello. | [631] | 0 | | | | | | 18.0 |
| 723 | How are you? | [631] | 0 | | | | starter | | 18.0 |


Creating the vision and subtitle DataFrames is available as the function `generate_scene_level_dfs()`.
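
The subtitle filter above keeps any subtitle whose time range overlaps the scene, even if it starts before the scene or ends after it. Here is a minimal sketch of that overlap test on toy data. The rows, the `overlaps()` helper, and the use of `datetime.time` values are assumptions for illustration, not the notebook's actual data or helpers.

```python
from datetime import time

# Hypothetical subtitle rows: (index, start_time, end_time)
subtitles = [
    (0, time(0, 41, 28), time(0, 41, 29)),          # ends before the scene starts
    (1, time(0, 41, 29, 500000), time(0, 41, 31)),  # straddles the scene start
    (2, time(0, 41, 33), time(0, 41, 34)),          # fully inside the scene
    (3, time(0, 42, 15), time(0, 42, 16)),          # starts exactly when the scene ends
]

scene_start = time(0, 41, 30)
scene_end = time(0, 42, 15)


def overlaps(start, end, window_start, window_end):
    """True when the subtitle is on screen at any moment inside the window."""
    return end > window_start and start < window_end


scene_subtitle_indices = [
    index for index, start, end in subtitles
    if overlaps(start, end, scene_start, scene_end)
]
print(scene_subtitle_indices)  # [1, 2]
```

Because both comparisons are strict, a subtitle that only touches the scene boundary (row 3) is left out.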

### Audio File
We don't yet have DataFrames for audio, but we can extract the scene-specific audio from the film's overall audio track.

In [9]:
extract_scene_audio(film, scene_dict)

Extracted audio file: ../extracted_audio/lost_in_translation_2003/2489_2533.wav


## Conversation Vibe
While we can't completely read and understand the dialogue of a film just yet, we can attempt to quantify the general "vibe" of a scene's conversation.
### Cadence
The speed of conversation is an important metric. Witty comedies might have characters launch lightning-fast insults at each other. Introspective dramas or quiet romances might have very little dialogue. We can measure cadence as lines of dialogue per minute: the cell below counts subtitle entries as a stand-in for sentences, and treats each frame as one second of footage. This is available as the function `get_scene_cadence()`.

In [10]:
scene_duration = len(scene_vision_df)
cadence = len(scene_subtitle_df) / (scene_duration / 60)
cadence

21.333333333333332

In [11]:
rounded_cadence = round(cadence)
if rounded_cadence >= 35:
    cadence_description = 'fast'
elif rounded_cadence < 20:
    cadence_description = 'slow'
else:
    cadence_description = 'medium'
print('This scene has a', cadence_description, 'cadence, with a conversation speed of', rounded_cadence, 'sentences per minute.')

This scene has a medium cadence, with a conversation speed of 21 sentences per minute.
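
The two cells above can be folded into a pair of small helpers. This is only a sketch: `cadence_per_minute()` and `cadence_label()` are hypothetical names (not the notebook's `get_scene_cadence()`), and the thresholds of 20 and 35 are copied from the cell above rather than derived from any reference.

```python
def cadence_per_minute(line_count, duration_seconds):
    """Lines of dialogue per minute for a scene of the given length."""
    if duration_seconds <= 0:
        raise ValueError('duration_seconds must be positive')
    return line_count / (duration_seconds / 60)


def cadence_label(cadence, slow_below=20, fast_from=35):
    """Bucket a cadence value using the same cutoffs as the notebook cell."""
    rounded = round(cadence)
    if rounded >= fast_from:
        return 'fast'
    if rounded < slow_below:
        return 'slow'
    return 'medium'


# 16 lines over 45 seconds reproduces the value printed above
cadence = cadence_per_minute(16, 45)
print(round(cadence), cadence_label(cadence))  # 21 medium
```

Guarding against a zero duration avoids a division error for empty scenes.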


### Profanity
Profanity can indicate drama. A heated argument between characters might have much more profanity than the rest of the film. We'll measure this as profanities per word, which is available as the function `get_scene_ppw()`.

In [12]:
space_count = 0
sentence_list = scene_sentence_df.sentence.tolist()
for sentence in sentence_list:
    for character in sentence:
        if character.isspace():
            space_count += 1

word_count = space_count + len(sentence_list)
profanity_count = scene_sentence_df.profanity.sum()

profanity_per_word = profanity_count/word_count

In [13]:
if profanity_per_word == 0:
    print('The scene contains no profanity.')
else:
    print('One in', round(1 / profanity_per_word), 'words is a profanity.')

The scene contains no profanity.
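
As a rough illustration of the same calculation on a scene that does contain profanity, here is a sketch using toy sentences. The helper names and the sample data are made up for this example. Like the cell above, it counts words as whitespace characters plus one per sentence, which assumes single-spaced text.

```python
def count_words(sentences):
    """Whitespace characters plus one per sentence (assumes single spacing)."""
    return sum(sum(ch.isspace() for ch in sentence) + 1 for sentence in sentences)


def profanity_per_word(sentences, profanity_counts):
    """Total profanities divided by total words; 0.0 for an empty scene."""
    word_count = count_words(sentences)
    return sum(profanity_counts) / word_count if word_count else 0.0


def describe(ppw):
    if ppw == 0:
        return 'The scene contains no profanity.'
    return 'One in ' + str(round(1 / ppw)) + ' words is a profanity.'


# Hypothetical scene: three sentences with per-sentence profanity counts
sentences = ['Hello.', 'How are you?', "Damn, it's late."]
profanity_counts = [0, 0, 1]

ppw = profanity_per_word(sentences, profanity_counts)
print(describe(ppw))  # One in 7 words is a profanity.
```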


## Conversation Vibe (print only)
We have a few other ways of getting the feel for a conversation, but they aren't worth making into functions which return a value, either because of their ease of calculation, or their experimental nature.

### Laughter/Hesitations
Laughter and hesitations were already calculated during data serialization, so we just have to count them.

In [14]:
print('There are', scene_subtitle_df.laugh.sum(), 'instances of laughter.')
print('There are', scene_subtitle_df.hesitation.sum(), 'midsentence hesitation interjections.')

There are 2 instances of laughter.
There are 0 midsentence hesitation interjections.


### Icebreaker and Kicker
The beginning and end of a scene can hint at the scene's plot impact. (Why was this scene important?)

`display_scene_start_end()`

In [15]:
scene_sentences = scene_sentence_df.sentence.tolist()
print('-------------------------------')
print('Icebreaker (Conversation Start)')            # first three sentences of scene
print('-------------------------------')
for sentence in scene_sentences[:3]:                # slicing avoids an IndexError in very short scenes
    print(sentence)
print()
print()
print('-------------------------')
print('Kicker (Conversation End)')                  # final three sentences of scene
print('-------------------------')
for sentence in scene_sentences[-3:]:               # slicing avoids an IndexError in very short scenes
    print(sentence)

-------------------------------
Icebreaker (Conversation Start)
-------------------------------
Hello.
Hello.
How are you?


-------------------------
Kicker (Conversation End)
-------------------------
I'll see you later.
Okay.
See you.


### Directed Questions, Declarations, and Direct Addresses
With NLP, we can look for specific patterns in each sentence's dependency parse (here, which word is the subject) to try to find the following:
- Directed Questions: Questions that address "you" (and their possible responses)
- First-Person Declarations: Sentences with "I" as a subject
- Second-Person Direct Addresses: Sentences directly addressing "you"

`display_scene_important_sentences()`

In [16]:
import spacy

nlp = spacy.load('en')  # on spaCy 3+, the 'en' shortcut no longer exists; load 'en_core_web_sm' instead
scene_sentence_doc = nlp((' '.join(scene_sentence_df.sentence.tolist())))
sent_nlp_list = list(scene_sentence_doc.sents)

direct_question_indices = []
for x, sent in enumerate(sent_nlp_list):
    # a directed question ends in '?' and has "you" as a nominal subject
    is_question = sent[-1].text == '?'
    if is_question and any(token.dep_ == 'nsubj' and token.text == 'you' for token in sent):
        direct_question_indices.append(x)       # each index is added once, in scene order


i_indices = []
x = 0
for sent in sent_nlp_list:
    for token in sent:
        if token.dep_ == 'nsubj' and token.text == 'I' and sent[-1].text != '?':
            if x not in i_indices:
                i_indices.append(x)
    x += 1


you_indices = []
x = 0
for sent in sent_nlp_list:
    if sent[-1].text != '?':
        for token in sent:
            if token.dep_ == 'nsubj' and token.text == 'you':
                if x not in you_indices:
                    you_indices.append(x)
    x += 1

In [17]:
print('--------------------------------')
print('Directed Questions and Responses')       # second-person questions directed at "you"
print('--------------------------------')
for index in direct_question_indices:
    print(sent_nlp_list[index])
    if index + 1 < len(sent_nlp_list):          # a question in the final sentence has no response
        print(sent_nlp_list[index + 1])
    print()
print('-------------------------')
print('First-Person Declarations')
print('-------------------------')
for index in i_indices:
    print(sent_nlp_list[index])
print()
print('-----------------------')
print('Second-Person Addresses')
print('-----------------------')
for index in you_indices:
    print(sent_nlp_list[index])


--------------------------------
Directed Questions and Responses
--------------------------------
How are you?
Good.

How are you?
It's a cool pool, isn't it?

How long you staying for?
I'll be in the bar for the rest of the week.

-------------------------
First-Person Declarations
-------------------------
I'll be in the bar for the rest of the week.
I'm going out with some friends later, if you wanna come...
I'll see you later.

-----------------------
Second-Person Addresses
-----------------------
I'm going out with some friends later, if you wanna come...


### TF-IDF and Noun Groups
We can try various methods of finding important phrases or nouns.

`display_scene_important_phrases()`

In [18]:
# tf_idf data preparation
film_doc = sentence_df.sentence.tolist()
scene_doc = scene_sentence_df.sentence.tolist()
del film_doc[scene_sentence_indices[0]: scene_sentence_indices[-1] + 1]
scene_doc_joined = (' '.join(scene_doc))
film_doc_joined = (' '.join(film_doc))
film_scene_doc = [scene_doc_joined, film_doc_joined]

# tf-idf
vectorizer = TfidfVectorizer(use_idf=True, stop_words='english', ngram_range=(1, 3))
idf_transformed = vectorizer.fit_transform(film_scene_doc)
# get_feature_names_out() needs scikit-learn 1.0+; older versions use get_feature_names()
tf_idf_df = pd.DataFrame(idf_transformed[0].T.todense(), index=vectorizer.get_feature_names_out(), columns=["TF-IDF"])
tf_idf_df = tf_idf_df.sort_values('TF-IDF', ascending=False)
tf_idf_terms = list(tf_idf_df.head(5).index)



noun_groups = []
for group in scene_sentence_doc.noun_chunks:
    if group.root.pos_ != 'PRON':
        noun_groups.append(str(group))
ng_count = Counter(noun_groups)
ng_terms = []
for ng in ng_count.most_common(5):
    ng_terms.append(ng[0])

print('--------------------------------')
print('Possible Important Terms, TF-IDF')
print('--------------------------------')
print(tf_idf_terms)
print()
print('-------------------------------------')
print('Possible Important Terms, Noun Groups')
print('-------------------------------------')
print(ng_terms)
print()

--------------------------------
Possible Important Terms, TF-IDF
--------------------------------
['later', 'good', 'll', 'okay', 'hello']

-------------------------------------
Possible Important Terms, Noun Groups
-------------------------------------
['a cool pool', 'any sleep', 'the bar', 'the rest', 'the week']
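
To make the TF-IDF ranking less of a black box, here is a hand-rolled sketch of the same idea on two tiny documents: the scene versus the rest of the film. It is a simplification, not what `TfidfVectorizer` does internally: it scores single words only, skips stop-word removal and normalization, and uses a naive regex tokenizer. The texts and function names are made up. The weight is the term count times a smoothed IDF, `ln((1 + n) / (1 + df)) + 1`.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf_scores(documents, target=0):
    """Score each term of documents[target] as count * smoothed idf."""
    token_lists = [tokenize(doc) for doc in documents]
    n_docs = len(token_lists)
    # document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    term_freq = Counter(token_lists[target])
    return {
        term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
        for term, count in term_freq.items()
    }


scene_text = "Cool pool, isn't it? See you at the pool later."
rest_of_film_text = "See you at the bar. It's late, isn't it?"

scores = tf_idf_scores([scene_text, rest_of_film_text])
# highest score first; ties broken alphabetically
top_terms = sorted(scores, key=lambda term: (-scores[term], term))[:3]
print(top_terms)  # ['pool', 'cool', 'later']
```

Words that also appear in the rest of the film ("see", "you", "the") get the minimum IDF of 1, so words unique to the scene rise to the top.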



## Scene Emotions
By checking each character's emotion in each frame, we can find their primary emotion for the scene.

`display_scene_emotions()`

In [19]:
left_face_clusters = [scene_dict['left_anchor_face_cluster']] + scene_dict['matching_left_face_clusters']
right_face_clusters = [scene_dict['right_anchor_face_cluster']] + scene_dict['matching_right_face_clusters']

# share of frames per emotion for each character, most common emotion first
left_emotions = scene_face_df[scene_face_df.p_face_cluster.isin(left_face_clusters)].p_emotion.value_counts(normalize=True)
right_emotions = scene_face_df[scene_face_df.p_face_cluster.isin(right_face_clusters)].p_emotion.value_counts(normalize=True)

left_emotion_index, left_emotion_percentage = left_emotions.index[0], left_emotions.values[0]
right_emotion_index, right_emotion_percentage = right_emotions.index[0], right_emotions.values[0]

print('Left character, with face clusters', left_face_clusters, 'has the primary emotion:', left_emotion_index +
      ', in ' + '{:.0%}'.format(left_emotion_percentage) + ' of frames')
print('Right character, with face clusters', right_face_clusters, 'has the primary emotion:', right_emotion_index +
      ', in ' + '{:.0%}'.format(right_emotion_percentage) + ' of frames')

Left character, with face clusters [18.0, 9.0, 12.0] has the primary emotion: sad, in 57% of frames
Right character, with face clusters [6.0] has the primary emotion: happy, in 38% of frames
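
The "primary emotion" is simply the most frequent per-frame label among a character's face clusters, reported with its share of that character's frames. Here is a dependency-free sketch of the calculation using `collections.Counter` in place of pandas' `value_counts(normalize=True)`. The frame data and the `primary_emotion()` helper are hypothetical.

```python
from collections import Counter

# Hypothetical per-frame rows: (face cluster, predicted emotion)
frames = [
    (18.0, 'sad'), (9.0, 'sad'), (12.0, 'neutral'), (18.0, 'sad'), (18.0, 'happy'),
    (6.0, 'happy'), (6.0, 'neutral'), (6.0, 'happy'),
]


def primary_emotion(frames, clusters):
    """Most common emotion among the given face clusters and its share of frames."""
    emotions = [emotion for cluster, emotion in frames if cluster in clusters]
    if not emotions:
        return None, 0.0  # the character never appears with a detected face
    emotion, count = Counter(emotions).most_common(1)[0]
    return emotion, count / len(emotions)


left_emotion, left_share = primary_emotion(frames, [18.0, 9.0, 12.0])
right_emotion, right_share = primary_emotion(frames, [6.0])
print(left_emotion, '{:.0%}'.format(left_share))    # sad 60%
print(right_emotion, '{:.0%}'.format(right_share))  # happy 67%
```

Returning `None` for a character with no matching frames avoids the index error that taking the first entry of an empty count would raise.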
