# GoPro Video Concatenation
This notebook uses the `sftk` package to find and concatenate multi-part GoPro video files stored in an S3 bucket. It groups the video parts that belong to the same `DropID` and combines each group into a single MP4 file.
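
To make the grouping step concrete, here is a minimal sketch of how multi-part GoPro files could be grouped before concatenation. It is not the `sftk` implementation. It assumes the parent folder of each S3 key stands in for the `DropID`, and it relies on GoPro's chaptered naming scheme (a two-letter prefix such as `GX` or `GH`, a two-digit chapter number, then a four-digit recording number). The function name `group_gopro_parts` and the example keys are hypothetical.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Chaptered GoPro names look like GX010042.MP4:
# prefix (GX/GH) + 2-digit chapter + 4-digit recording number.
GOPRO_NAME = re.compile(
    r"^(?P<prefix>G[XH])(?P<chapter>\d{2})(?P<recording>\d{4})\.MP4$",
    re.IGNORECASE,
)


def group_gopro_parts(keys):
    """Group S3 keys by (parent folder, recording number), ordered by chapter.

    Assumption: the parent folder of a key identifies the drop.
    Only recordings with more than one part are returned.
    """
    groups = defaultdict(list)
    for key in keys:
        path = PurePosixPath(key)
        match = GOPRO_NAME.match(path.name)
        if match is None:
            continue  # skip anything that is not a GoPro video part
        group_id = (str(path.parent), match["recording"])
        groups[group_id].append((int(match["chapter"]), key))

    # Sort each group by chapter number and drop single-part recordings.
    return {
        group_id: [key for _, key in sorted(parts)]
        for group_id, parts in groups.items()
        if len(parts) > 1
    }


# Hypothetical keys, for illustration only.
example_keys = [
    "media/SURVEY_ID/DROP_A/GX020001.MP4",
    "media/SURVEY_ID/DROP_A/GX010001.MP4",
    "media/SURVEY_ID/DROP_A/GX010002.MP4",
    "media/SURVEY_ID/DROP_A/notes.txt",
]
grouped = group_gopro_parts(example_keys)
```

In this sketch, recording `0001` in `DROP_A` has two chapters and is returned in chapter order, while the single-part recording `0002` and the text file are ignored.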

# Requirements
This section imports the two `sftk` classes the notebook needs: `S3Handler` for interacting with AWS S3 and `VideoProcessor` for processing the videos.

In [None]:
from sftk.s3_handler import S3Handler
from sftk.video_handler import VideoProcessor

In [None]:
# --- Configuration ---
S3_PREFIX = "media/"  # e.g., "media/SURVEY_ID"
GOPRO_PREFIX = "GX"  # Prefix for GoPro video files (e.g., 'GX', 'GH')
DELETE_ORIGINALS = False  # Set to True to delete original video parts after concatenation
TEST_MODE = True  # If True, concatenated files are not uploaded and originals are not deleted (TBC: process only the first drop)
# Note: no trailing commas below; `VERIFY_VIDEOS = False,` would create the truthy tuple (False,)
VERIFY_VIDEOS = False  # If True, verify that each concatenated video matches the expected size and is identical (takes a while)
PARALLEL_DROPS = True  # If False, process one drop at a time
SEQUENTIAL_DOWNLOAD = True  # If False, download files one GoPro at a time (most reliable)
MAX_WORKERS = 4  # Max number of parallel downloads for a single drop

# 1. Initialize S3Handler and VideoProcessor
s3_handler = S3Handler()
processor = VideoProcessor(
    s3_handler,
    prefix=S3_PREFIX,
    gopro_prefix=GOPRO_PREFIX,
    delete_originals=DELETE_ORIGINALS,
    test_mode=TEST_MODE,
    max_workers=MAX_WORKERS,
    verify_videos=VERIFY_VIDEOS,
    parallel_drops=PARALLEL_DROPS,
    sequential_download=SEQUENTIAL_DOWNLOAD,
)


# 2. Preview the movies that will be processed
display(processor.filtered_df.head())

In [None]:
# Process go_pro_files from drops
processor.process_gopro_videos()

This section finds individual files that can be removed because a concatenated version already exists.

In [None]:
# TBC: functionality to be added to processor
# Get individual files that can be removed (concatenated version exists)
files_to_remove = processor.find_already_concatenated_movies_df(size_tolerance=0.01)

# Preview the files that will be removed
for _, row in files_to_remove.iterrows():
    print(f"Safe to remove: {row['Key']} ({row['Size']/1024/1024:.1f}MB)")
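
The `size_tolerance=0.01` argument above suggests a size comparison between the original parts and the concatenated file. The notebook does not state how `find_already_concatenated_movies_df` applies it, so the following is only a sketch under the assumption that the tolerance is a relative fraction of the summed part sizes. The helper name `sizes_match` is hypothetical.

```python
def sizes_match(part_sizes, concatenated_size, size_tolerance=0.01):
    """Return True if the concatenated size is within a relative tolerance
    of the summed part sizes (all sizes in bytes).

    Assumption: the tolerance is a fraction, so 0.01 means 1%.
    """
    expected = sum(part_sizes)
    if expected == 0:
        return False  # nothing to compare against
    return abs(concatenated_size - expected) / expected <= size_tolerance


# Two parts of 4 GB and 2 GB against a concatenated file about 0.17% smaller.
close_enough = sizes_match([4_000_000_000, 2_000_000_000], 5_990_000_000)

# The same parts against a file that is clearly missing data.
too_small = sizes_match([4_000_000_000, 2_000_000_000], 5_500_000_000)
```

A relative tolerance is a natural choice here because container overhead changes slightly when parts are merged, so an exact byte match between the sum of the parts and the output is not expected.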

This section removes the redundant files. The `s3_handler.s3.delete_object` line is commented out for safety. Uncomment it to actually delete the files.

In [None]:
# Remove redundant files
for _, row in files_to_remove.iterrows():
    print(f"Removing: {row['Key']} ({row['Size']/1024/1024:.1f}MB)")
    # s3_handler.s3.delete_object(Bucket=s3_handler.bucket, Key=row['Key'])
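
When many files need removing, one request per object can be slow. As an optional alternative, the sketch below prepares batched payloads in the shape expected by the S3 `DeleteObjects` operation, which accepts up to 1000 keys per request. It assumes `s3_handler.s3` is a boto3 S3 client, which the notebook does not confirm, and the helper name `build_delete_batches` is hypothetical. The actual delete call is left commented out, as in the cell above.

```python
def build_delete_batches(keys, batch_size=1000):
    """Split keys into payloads shaped for the S3 DeleteObjects request.

    S3 accepts at most 1000 keys per DeleteObjects call.
    """
    if not 1 <= batch_size <= 1000:
        raise ValueError("batch_size must be between 1 and 1000")
    keys = list(keys)
    return [
        {"Objects": [{"Key": key} for key in keys[i:i + batch_size]], "Quiet": True}
        for i in range(0, len(keys), batch_size)
    ]


# Hypothetical keys, for illustration only.
example_keys = [f"media/SURVEY_ID/DROP_A/GX01{n:04d}.MP4" for n in range(2500)]
batches = build_delete_batches(example_keys)

for batch in batches:
    print(f"Would remove {len(batch['Objects'])} objects")
    # Assumes a boto3 S3 client; uncomment to actually delete.
    # s3_handler.s3.delete_objects(Bucket=s3_handler.bucket, Delete=batch)
```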