# Scene Audio
In `scene_boundary_partitioning.ipynb` (in `unifying_features/`), we demonstrated how to identify individual two-character dialogue scenes. Next, we want to analyze the audio of one of those scenes.

We can extract the film's *entire* audio track, but loading a full-length track into memory is impractical for most audio analyses. Instead, we'll extract only the portion we need: we take the film's saved audio file from `/extracted_audio/`, create a new audio file containing just the scene's audio, and save it to `/extracted_audio/(film_name)/`.

We'll use the ffmpeg-python library, which is a Python wrapper for the ffmpeg suite of audio/video tools.

In [1]:
import os
import sys
import ffmpeg
# Make the project's helper modules in sibling directories importable
sys.path.append('../unifying_features')
sys.path.append('../data_serialization')
from serialization_preprocessing_io import *
from scene_identification_io import *

We'll start by identifying the film's scenes.

In [2]:
film = 'lost_in_translation_2003'
srt_df, subtitle_df, sentence_df, vision_df, face_df = read_pickle(film)
scene_dictionaries = generate_scenes(vision_df, face_df, substantial_minimum=4, anchor_search=6)

In [3]:
scene_dict = scene_dictionaries[1]
scene_dict

{'scene_id': 1,
 'first_frame': 1929,
 'last_frame': 2082,
 'scene_duration': 154,
 'left_anchor_shot_cluster': 207,
 'left_anchor_face_cluster': 6.0,
 'matching_left_face_clusters': [],
 'right_anchor_shot_cluster': 66,
 'right_anchor_face_cluster': 17.0,
 'matching_right_face_clusters': [18.0, 12.0],
 'cutaway_shot_clusters': [7, 29, 142]}
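
The frame fields above are what we'll use to trim the audio, so it helps to see how they could translate into a time range. Note that `scene_duration` (154) matches the inclusive frame count, 2082 - 1929 + 1. Below is a minimal sketch, not the project's own code: `scene_time_range` and its `frames_per_second` parameter are hypothetical, and it assumes frames were sampled at a fixed rate with frame 0 at the start of the film. The rate of 1.0 used in the example is only a placeholder.

```python
def scene_time_range(scene_dict, frames_per_second):
    """Return (start_seconds, duration_seconds) for a scene.

    Assumptions (not confirmed by this notebook): frames are sampled at a
    fixed rate, and frame 0 corresponds to time 0 in the film.
    """
    first = scene_dict['first_frame']
    last = scene_dict['last_frame']
    frame_count = last - first + 1  # inclusive of both end frames
    start_seconds = first / frames_per_second
    duration_seconds = frame_count / frames_per_second
    return start_seconds, duration_seconds


# Example with the scene shown above and a placeholder rate of 1 frame/second
example_scene = {'first_frame': 1929, 'last_frame': 2082}
print(scene_time_range(example_scene, frames_per_second=1.0))  # (1929.0, 154.0)
```

If the frames were sampled at a different rate, only the `frames_per_second` argument would need to change.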

We'll extract audio for this scene and name the output file after its first and last frame numbers.

In [4]:
first = str(scene_dict['first_frame'])
last = str(scene_dict['last_frame'])

extracted_file_name = os.path.join('../extracted_audio', film, first + '_' + last + '.wav')
extracted_file_name

'../extracted_audio/lost_in_translation_2003/1929_2082.wav'
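
With the output path in hand, the trimming step itself could look roughly like the sketch below. This is an illustration under stated assumptions rather than the notebook's actual implementation: `scene_audio_path` and `extract_audio_segment` are hypothetical helpers, and the start time and duration in seconds are taken as given (how frame numbers map to seconds depends on the frame sampling rate, which is not stated here). The ffmpeg-python calls used are `ffmpeg.input` with the `ss` (seek) and `t` (duration) options, `.output`, and `.run`.

```python
import os


def scene_audio_path(base_dir, film, first_frame, last_frame):
    """Build the per-scene output path, e.g. <base_dir>/<film>/<first>_<last>.wav."""
    file_name = '{}_{}.wav'.format(first_frame, last_frame)
    return os.path.join(base_dir, film, file_name)


def extract_audio_segment(source_path, output_path, start_seconds, duration_seconds):
    """Write one segment of source_path to output_path (hypothetical helper)."""
    # Imported here so the path helper above works without ffmpeg-python installed
    import ffmpeg

    # Create the film's output directory if it does not exist yet
    output_dir = os.path.dirname(output_path)
    if output_dir:
        os.makedirs(output_dir, exist_ok=True)

    (
        ffmpeg
        .input(source_path, ss=start_seconds, t=duration_seconds)  # seek, then limit duration
        .output(output_path)
        .run(overwrite_output=True)  # replace any earlier extraction of this scene
    )
    return output_path


print(scene_audio_path('../extracted_audio', 'lost_in_translation_2003', 1929, 2082))
```

Passing `ss` and `t` as input options means ffmpeg seeks before decoding, so only the scene's portion of the track is read and written, which avoids loading the full-length audio.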