# key_dialogue
With each sentence of the film captured in a dataframe, we can try to pick out the most important, or most information-dense, dialogue.

In [1]:
from subtitle_dataframes_io import *
# 'en' is a spaCy 2.x shortcut; spaCy 3+ needs a full model name such as 'en_core_web_sm'
nlp = spacy.load('en')

In [2]:
# Load the subtitles and build the per-subtitle dataframe
subs = pysrt.open('../subtitles/plus_one_2019.srt')
subtitle_df = generate_base_subtitle_df(subs)
subtitle_df = generate_subtitle_features(subtitle_df)
subtitle_df['cleaned_text'] = subtitle_df['concat_sep_text'].map(clean_line)

# Split the cleaned text into sentences and pair each with its subtitle indices
cleaned_lines = remove_blanks(subtitle_df['cleaned_text'].tolist())
sentences = partition_sentences(cleaned_lines, nlp)
subtitle_indices = tie_sentence_subtitle_indices(sentences, subtitle_df)
sentence_df = pd.DataFrame(list(zip(sentences, subtitle_indices)), columns=['sentence', 'subtitle_indices'])
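
The `subtitle_indices` column presumably records which subtitle lines each sentence came from, since one sentence can span several subtitles. `tie_sentence_subtitle_indices` is not shown in this notebook, so the following is only a sketch of one way such a mapping could be computed, by comparing character offsets. The name `tie_by_offsets` and the way the sample lines are split are illustrative assumptions, not the module's actual code.

```python
def tie_by_offsets(sentences, subtitle_lines):
    """Map each sentence to the indices of the subtitle lines it overlaps.

    Hypothetical helper: assumes the sentences were cut, in order, from the
    subtitle lines joined with single spaces.
    """
    # Character span [start, end) of every subtitle line in the joined text
    spans, pos = [], 0
    for line in subtitle_lines:
        spans.append((pos, pos + len(line)))
        pos += len(line) + 1  # +1 for the joining space
    joined = ' '.join(subtitle_lines)

    result, cursor = [], 0
    for sentence in sentences:
        start = joined.find(sentence, cursor)
        if start == -1:  # sentence text was altered; leave it unmatched
            result.append([])
            continue
        end = start + len(sentence)
        # Keep every subtitle whose span overlaps the sentence span
        result.append([i for i, (s, e) in enumerate(spans) if s < end and start < e])
        cursor = end
    return result


# Made-up subtitle segmentation, for illustration only
toy_subtitles = ["I just need to say one thing,",
                 "and I'll... I'll leave you alone,",
                 "I swear."]
toy_sentences = ["I just need to say one thing, and I'll...",
                 "I'll leave you alone, I swear."]

toy_indices = tie_by_offsets(toy_sentences, toy_subtitles)
print(toy_indices)  # [[0, 1], [1, 2]]
```

Both toy sentences straddle a subtitle boundary, so each is tied to two subtitle indices.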

## First-Person Declarations
We can look for first-person declarations: sentences whose grammatical subject is "I", excluding questions. When a character speaks in this manner, she may be declaring something of personal note.

In [3]:
# Rows 2788-2820 hold the sentences of the scene analysed below
scene_sentence_df = sentence_df[2788:2821].copy()

In [4]:
# Parse the scene as one document so every token gets a dependency label
scene_sentence_doc = nlp(' '.join(scene_sentence_df.sentence.tolist()))
sent_nlp_list = list(scene_sentence_doc.sents)

# Indices of non-question sentences that have "I" as a nominal subject
i_indices = []
for x, sent in enumerate(sent_nlp_list):
    if sent[-1].text != '?' and any(
            token.dep_ == 'nsubj' and token.text == 'I' for token in sent):
        i_indices.append(x)

In [5]:
for index in i_indices:
    print(sent_nlp_list[index])

I just need to say one thing, and I'll...
I'll leave you alone, I swear.
I really can't handle a big speech right now, Ben.
I'm an asshole.
I-I can't do them on my own.
You're not there to... to insult everything that I do.
And the worst part about all this is that you're not there because I hurt you.
I hurt the one person that never deserved it.
And I pushed you away because I'm dumb
and I'm selfish, and fuck me for being too late,
but I love you, Alice.
Um, I should really go back inside.


## Second-Person Addresses
We can also look for second-person addresses: sentences in which one character speaks directly to another and "you" is the grammatical subject. As before, questions are excluded.

In [6]:
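# Indices of non-question sentences that have "you" as a nominal subject.
# The comparison is case-sensitive, so a sentence-initial "You" is not matched.
you_indices = []
for x, sent in enumerate(sent_nlp_list):
    if sent[-1].text != '?' and any(
            token.dep_ == 'nsubj' and token.text == 'you' for token in sent):
        you_indices.append(x)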

In [7]:
for index in you_indices:
    print(sent_nlp_list[index])

Alice, you were right.
And you were right about these weddings.
It's 'cause you're lonely.
It's because you're not there.
And the worst part about all this is that you're not there because I hurt you.
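
The two filters above share the same shape and differ only in the pronoun they look for. Because the second one compares `token.text` with lowercase `'you'`, a sentence that opens with a capitalised "You" is skipped regardless of how it is parsed. A minimal sketch of a shared, optionally case-insensitive helper follows. `Tok` and `subject_indices` are hypothetical names introduced for illustration, and the dependency labels on the toy sentences are hand-assigned rather than parser output. With spaCy, the sentences would come from `doc.sents`, whose tokens expose the same `text` and `dep_` attributes.

```python
from collections import namedtuple

# Hypothetical stand-in for a spaCy Token: only the two attributes used here
Tok = namedtuple('Tok', ['text', 'dep_'])


def subject_indices(sentences, pronoun, ignore_case=True):
    """Return indices of non-question sentences whose subject is `pronoun`."""
    target = pronoun.lower() if ignore_case else pronoun
    hits = []
    for i, sent in enumerate(sentences):
        if not sent or sent[-1].text == '?':
            continue  # skip empty sentences and questions
        for tok in sent:
            text = tok.text.lower() if ignore_case else tok.text
            if tok.dep_ == 'nsubj' and text == target:
                hits.append(i)
                break  # one match is enough; avoids duplicate indices
    return hits


# Hand-labelled toy sentences (labels are assumed, not produced by a parser)
toy = [
    [Tok('You', 'nsubj'), Tok("'re", 'ROOT'), Tok('right', 'acomp'), Tok('.', 'punct')],
    [Tok('Alice', 'npadvmod'), Tok(',', 'punct'), Tok('you', 'nsubj'),
     Tok('were', 'ROOT'), Tok('right', 'acomp'), Tok('.', 'punct')],
    [Tok('Are', 'ROOT'), Tok('you', 'nsubj'), Tok('okay', 'acomp'), Tok('?', 'punct')],
    [Tok('I', 'nsubj'), Tok('hurt', 'ROOT'), Tok('you', 'dobj'), Tok('.', 'punct')],
]

you_case_sensitive = subject_indices(toy, 'you', ignore_case=False)
you_any_case = subject_indices(toy, 'you')
i_any_case = subject_indices(toy, 'I')

print(you_case_sensitive)  # [1]
print(you_any_case)        # [0, 1]
print(i_any_case)          # [3]
```

The question in the third toy sentence is dropped in every call, and the object "you" in the last sentence is ignored because its label is not `nsubj`.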
