## Using ffmpeg-python to trim videos and audio files

In this module, we will trim videos and audio files using the ffmpeg-python package.

### Case scenarios
**Case 1**: Imagine that you have annotated a video for co-speech manual gestures and speech. Next, you want to extract a video for each annotated gesture. How do you do it?

**Case 2**: You have 60 videos from an experiment. Each video starts 3-5 minutes before the trials start, and you want to trim off the part before the trials. How do you do it?
<br>
<br>

Obviously, manually trimming the videos isn't the most efficient option. But I have good news for you: **You can use Python to automate this tedious task!!** <br>
*Case 2 requires some manual coding of timestamps, and it might be faster to manually trim the videos. So, we'll focus on Case 1 in this module.

***

## Playing with ffmpeg-python

Let's play with the ffmpeg-python package. Here, we will cut off the first 60 seconds (media_start) of the video, so the output starts at the 60-second mark. <br>
Before running the code, make sure to delete the output video file in the output folder if it already exists.
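The clean-up step can also be scripted. Below is a minimal sketch, assuming a hypothetical helper `remove_if_exists` (it is not part of ffmpeg-python) that deletes a leftover output file before you trim again:

```python
import os

def remove_if_exists(path):
    """Delete the file at `path` if it exists; return True if a file was deleted."""
    if os.path.isfile(path):
        os.remove(path)
        return True
    return False

# Hypothetical usage with the output path used in this module
removed = remove_if_exists("output/salma_hayek_trimmed.mp4")
print("Removed old output file" if removed else "No old output file found")
```

As far as I know, `run()` in ffmpeg-python also accepts `overwrite_output=True`, which should make ffmpeg overwrite an existing file instead. Deleting the file explicitly just keeps the step visible.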

In [None]:
import ffmpeg

media_start = 60 # seconds
input_file_path = "input/salma_hayek.mp4"
trimmed_video_file_path = "output/salma_hayek_trimmed.mp4"

# syntax: ffmpeg.input(input_video_path, ss=start_time_in_seconds).output(output_video_path).run()
# loglevel="quiet" is used to hide the ffmpeg logs
# c='copy' is used to copy the video stream without re-encoding
ffmpeg.input(input_file_path, ss=media_start).output(trimmed_video_file_path, loglevel="quiet", c='copy').run()
print("Success! Trimmed video saved at: ", trimmed_video_file_path)

Let's check whether the video was trimmed properly.

In [None]:
info_input = ffmpeg.probe(input_file_path)
info_output = ffmpeg.probe(trimmed_video_file_path)

# print the duration of each video in seconds
print(f"Input video duration: {info_input['format']['duration']}")
print(f"Output video duration: {info_output['format']['duration']}")
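Instead of comparing the two numbers by eye, the check can be automated. The sketch below assumes a hypothetical helper `check_trim` and made-up duration values. It allows a small tolerance because `c='copy'` cuts at keyframes, so the output can be slightly longer or shorter than requested.

```python
def check_trim(input_duration, output_duration, start, end=None, tolerance=1.0):
    """Return True if the output duration is close to the expected duration."""
    # ffprobe reports durations as strings, so convert them to floats first
    input_duration = float(input_duration)
    output_duration = float(output_duration)
    # without an end time, the clip runs from `start` to the end of the input
    expected = (input_duration if end is None else end) - start
    return abs(output_duration - expected) <= tolerance

# Hypothetical durations, in the string form found in the probe results
print(check_trim("185.040000", "125.066000", start=60))  # True
```

With the real files, you could pass `info_input['format']['duration']` and `info_output['format']['duration']` together with `media_start`.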

### <font color = 'orange'>Exercise</font>

Trim the same video from 00:00:30 to 00:01:30 (so the video will be 60 seconds long).
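The exercise gives the times as HH:MM:SS timestamps, while the example above used seconds. Here is a small sketch of the conversion, assuming a hypothetical helper `timestamp_to_seconds` and timestamps that always have all three parts:

```python
def timestamp_to_seconds(timestamp):
    """Convert an "HH:MM:SS" timestamp into seconds."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)

print(timestamp_to_seconds("00:00:30"))  # 30.0
print(timestamp_to_seconds("00:01:30"))  # 90.0
```

ffmpeg itself also understands the HH:MM:SS form, so converting is mainly useful when you want to do arithmetic on the times, such as computing the clip length.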

In [None]:
trimmed_video_file_path = "output/salma_hayek_exercise.mp4"

### write the rest of the code here ###


<details><summary>Click here for tips</summary>

In ffmpeg.input(), we used the **ss** parameter for the start time, and we can use **to** to specify the point in the video up to which we want to keep.

</details>

<details><summary>Click here for solution</summary>

```python
# same output path as in the exercise cell
trimmed_video_file_path = "output/salma_hayek_exercise.mp4"

media_start = 30 # seconds
media_end = 90 # seconds

# keep only the part between media_start and media_end
ffmpeg.input(input_file_path, ss=media_start, to=media_end).output(trimmed_video_file_path, loglevel="quiet", c='copy').run()
```

</details>

***

## Case 1: Extracting videos for each annotated gesture

In this section, we will extract a video clip for each annotated gesture. Each output video will be named in the format "videoname_gesturetype_index.mp4" and stored in the output folder.
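To make the naming format concrete, here is a minimal sketch. The helper `build_clip_name` is hypothetical and only illustrates the format. It drops the extension from the video name so that ".mp4" appears only once, at the end.

```python
import os

def build_clip_name(video_path, gesture_type, index):
    """Build a clip name in the format videoname_gesturetype_index.mp4."""
    # strip the folder and the extension, e.g. "input/salma_hayek.mp4" -> "salma_hayek"
    base_name = os.path.splitext(os.path.basename(video_path))[0]
    return f"{base_name}_{gesture_type}_{index}.mp4"

print(build_clip_name("input/salma_hayek.mp4", "iconic", 1))  # salma_hayek_iconic_1.mp4
```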

Let's start with importing libraries and specifying folders.

In [12]:
### Import libraries
import os
import pandas as pd
import ffmpeg # ffmpeg-python

In [19]:
### Folders
video_folder = "input"
annot_folder = os.path.join(video_folder, "elan")
output_folder = "output"

### Write "extract_video" function

In [33]:
def extract_video(video, df_annot):
    # initiate counter variables for each gesture type
    iconic_counter = 0
    metaphoric_counter = 0
    deictic_counter = 0
    beat_counter = 0
    annot_index = ""

    # get the video file name without its extension, so the output is not named "video.mp4_iconic_1.mp4"
    video_file_name = os.path.splitext(os.path.basename(video))[0]

    # loop through the dataframe and get start and end time for each annotation (row)
    for index, row in df_annot.iterrows():
        # get start and end time
        start_time = row['Begin Time - msec'] / 1000 # convert to seconds
        end_time = row['End Time - msec'] / 1000 # convert to seconds
        # get the annotation text for the gesture_type tier
        annot_text = row['gesture_type']
        
        # check the annotation text and update the counter variable
        if annot_text == "iconic":
            iconic_counter += 1
            annot_index = annot_text + "_" + str(iconic_counter)
        elif annot_text == "metaphoric":
            metaphoric_counter += 1
            annot_index = annot_text + "_" +  str(metaphoric_counter)
        elif annot_text == "deictic":
            deictic_counter += 1
            annot_index = annot_text + "_" +  str(deictic_counter)
        elif annot_text == "beat":
            beat_counter += 1
            annot_index = annot_text + "_" +  str(beat_counter)
        else:
            # skip rows whose gesture_type is empty or not one of the four types above
            continue

        # create the output file name
        output_file_name = video_file_name + "_" + annot_index + ".mp4"
        # create the output file path
        output_file_path = os.path.join(output_folder, output_file_name)
        
        # check if the output file already exists
        if not os.path.exists(output_file_path):
            # trim the video
            ffmpeg.input(video, ss=start_time, to=end_time).output(output_file_path, loglevel="quiet", c='copy').run()

    return iconic_counter, metaphoric_counter, deictic_counter, beat_counter, video_file_name


### Loop through each elan annotation file and run the "extract_video" function
*The names of the video files and annotation files must be identical except for the extension.
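Because of this naming requirement, it can help to check the pairing before extracting anything. Below is a sketch using only the standard library. The helper `find_missing_videos` and the file names are hypothetical.

```python
import os

def find_missing_videos(annot_names, video_names):
    """Return the annotation files that have no video with the same base name."""
    video_stems = {os.path.splitext(name)[0] for name in video_names}
    return [name for name in annot_names
            if os.path.splitext(name)[0] not in video_stems]

# Hypothetical file names
annots = ["salma_hayek.txt", "interview_02.txt"]
videos = ["salma_hayek.mp4"]
print(find_missing_videos(annots, videos))  # ['interview_02.txt']
```

In practice, the two lists could come from `os.listdir()` on the annotation folder and the video folder.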

In [None]:
annot_files = [f for f in os.listdir(annot_folder) if f.endswith('.txt')]
print(annot_files)

for annot_file in annot_files:
    annot_file_path = os.path.join(annot_folder, annot_file)
    # swap only the extension, so a ".txt" elsewhere in the name is left alone
    video_file_path = os.path.join(video_folder, os.path.splitext(annot_file)[0] + ".mp4")

    df_annot = pd.read_csv(annot_file_path, sep="\t")
    iconic, metaphoric, deictic, beat, video_file_name = extract_video(video_file_path, df_annot)
    print(f"Video file: {video_file_name} --> Exported videos for {iconic} iconic, {metaphoric} metaphoric, {deictic} deictic, and {beat} beat gestures.")
