Concatenates videos in Google Drive using FFmpeg and places the result back in Google Drive


Video Concat

With every family video, we get more ambitious. It is always a frantic finish just before we need to play the video.

This project helps join/concatenate videos based on a sequence file. It is triggered by an API call and publishes the final video to Google Drive.

Video processing takes time (it is CPU intensive) and a lot of disk space. On a MacBook Air, you are severely constrained. That's why I picked AWS ECS with Fargate.

I have used Terraform to quickly bring up the infrastructure and destroy it when I am done. It's perfect for infrequent use such as family events.

I assume you have an AWS account and a basic understanding of AWS, Terraform, Docker, etc.

Get Started

  1. Ensure you have the prerequisites:

    1. AWS CLI downloaded and configured (aws configure) with sufficient privileges to create all the infrastructure.
    2. Terraform downloaded and in path.
    3. Docker downloaded. (On a Mac, brew cask install docker worked instead of the regular brew install docker)
  2. Generate a Service Account to access Google Drive:

    1. Follow the instructions here to generate the JSON file and store it as cotainer/credentials.json
    2. Remember to enable the Google Drive API for this service account.
    3. On the output Google Drive folder, remember to add the email of the Service Account as a collaborator with edit privileges.
  3. Create the lambda function:

    1. Run npm install in the lambda folder.
    2. Zip the files to create the package:
          zip -r ../ main.js node_modules package.json
  4. Build the infrastructure

    1. The following commands have to be run in the terraform folder

          cd terraform
    2. Create a my.tfvars file with the following:

          ecr_name                   = "ss-video-concate"
          ecs_cluster_name           = "ss-video-cluster"
          ecs_service_name           = "videoconcat-service"
          ecs_task_definition_family = "VideoConcat"
          docker_image               = ""
          sqs_queue_name             = "video-queue"
          api_path                   = "video-concat"

      Note: You may not have the docker_image just yet. Put in a dummy value (like above) and then update it after running the docker steps.

    3. Run terraform

          terraform init
          terraform plan --var-file="my.tfvars" -out=tfplan
          terraform apply "tfplan"
    4. Note down the output variables from above

          base_url = ""
          queue_url = ""
  5. Create and publish the docker image

    1. Create the docker image locally

          docker build -t video-concat .
    2. Follow the instructions on the ECR page to publish. It will substitute variables correctly.

          aws ecr get-login-password --region ap-south-1 | docker login --username AWS --password-stdin
          docker build -t ss-video-concate .
          docker tag ss-video-concate:latest
          docker push
    3. Copy the URI / ARN for the latest docker image and update docker_image in the my.tfvars file created in step 4.2.

  6. And you are done! If you are stuck, keep repeating steps 4.1-4.3 to get the configuration right.

  7. Test your setup by creating a POST request to your API:

    1. Find the input and output folder IDs. The ID is the part that comes after folders/ in a Google Drive folder link.
    2. Create a sequence file that lists each file name on a separate line. The sequence file must be a plain text file, not a Google Doc, Doc, Docx, or any other rich format. (The script reads it with cat sequence.)
        Video 1.mp4
        Video 2.mp4
    3. Generate the video by calling the API:
        curl --location --request POST '' \
            --header 'Content-Type: application/json' \
            --data-raw '{
                "input_folder": "XXXXXXXXXXTgtj9wZG3HIcb6b1dLLqg",
                "output_folder": "XXXXXXXXXjjLvduXwQM4j5lPHY7_6z6a",
                "sequence_file_name": "sequence",
                "template_file_name": "template",
                "output_file_prefix": "Short-Video-",
    4. Check CloudWatch logs to see if there are any errors in Lambda or ECS. If there are none, you will see your video in the output folder.
  8. When done, you can destroy your infrastructure:

        terraform plan --var-file="my.tfvars" -destroy -out=tfplan
        terraform apply "tfplan"
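For step 7.1, the folder ID can also be pulled out of a link with plain shell parameter expansion. This is a minimal sketch, assuming the usual link shape https://drive.google.com/drive/folders/<ID>; the function name folder_id_from_url and the sample IDs are made up for illustration and are not part of this project.

```shell
# Print the folder ID found in a Google Drive folder link.
# Assumes the link contains "/folders/<ID>", optionally followed by "?..." or "/...".
folder_id_from_url() {
  id=${1##*/folders/}   # drop everything up to and including "/folders/"
  id=${id%%\?*}         # drop a query string such as "?usp=sharing"
  id=${id%%/*}          # drop any trailing path segment
  printf '%s\n' "$id"
}

# Example with a made-up ID:
folder_id_from_url 'https://drive.google.com/drive/folders/abc123?usp=sharing'
```

If the link does not contain /folders/, the function prints whatever is left after trimming, so check the result before using it as input_folder or output_folder.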

How does this work?

Architecture Diagram

  1. The user's POST request is passed to a Lambda function via API Gateway.
  2. The Lambda function posts an event to the SQS queue and sets the ECS service's desired count to 1.
  3. The ECS container reads from the queue in a while loop until there are no messages left.
  4. For each message, the container downloads all the files in the input folder. It expects the sequence file to be present there.
  5. It concatenates the videos using ffmpeg filters.
  6. It pushes the final video to the output folder.
  7. It deletes the message from the queue and checks for another message.
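The loop in steps 3 to 7 can be sketched as follows. This is not the container's actual script: the queue is simulated with a temporary file so the example runs without AWS, and the function names receive_message, process_message, delete_message and drain_queue are hypothetical. In a real container the receive and delete steps would presumably call aws sqs receive-message and aws sqs delete-message.

```shell
# Simulated queue: one message per line in a temp file (stand-in for SQS).
QUEUE_FILE=$(mktemp)
printf '%s\n' 'job-1' 'job-2' > "$QUEUE_FILE"
PROCESSED_COUNT=0

# Stand-in for receiving a message: prints the oldest one, or nothing if empty.
receive_message() {
  head -n 1 "$QUEUE_FILE"
}

# Stand-in for download, ffmpeg concat and upload.
process_message() {
  echo "processing $1"
  PROCESSED_COUNT=$((PROCESSED_COUNT + 1))
}

# Stand-in for deleting the message: removes the first line of the file.
delete_message() {
  tail -n +2 "$QUEUE_FILE" > "$QUEUE_FILE.tmp" && mv "$QUEUE_FILE.tmp" "$QUEUE_FILE"
}

# Keep working until the queue returns no message.
drain_queue() {
  while message=$(receive_message) && [ -n "$message" ]; do
    process_message "$message"
    delete_message   # only after processing, so a failed job stays queued
  done
}

drain_queue
echo "processed $PROCESSED_COUNT message(s)"
rm -f "$QUEUE_FILE"
```

Deleting the message only after the work is done means a job that fails part-way remains on the queue.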

Example Sequence & Templates

Beginning Video with Music.mp4
Mom \& Dad.mp4	Name 1	mandala2.png	#032B60	SN2s.jpg
Relative2.MOV	Name 2	mandala2.png	#8F3A6F	SN14s.jpg	Name 3	mandala1.png	#70094A	SN14s.jpg	Name 4	mandala2.png	#3750A8	SN13s.jpg
Relative5.mp4	Name 5	mandala3.png	#A67761	SN1s.jpg
Relative6.mp4	Name 6	mandala3.png	#A67761	SN1s.jpg
Ending Video with Music.mp4
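The sequence lines above look tab-separated, and the template snippet tests $arg3, so one plausible reading is that each line is split into arg1 to arg5 before a template is picked. The sketch below shows that idea only; the function names parse_line and choose_template are hypothetical, the mapping of fields to arg1..arg5 is an assumption, and fields beyond the fifth (as in the Relative2.MOV line) are simply collected in rest.

```shell
TAB=$(printf '\t')

# Split one sequence line on tabs into arg1..arg5; extra fields go to rest.
parse_line() {
  IFS=$TAB read -r arg1 arg2 arg3 arg4 arg5 rest <<EOF
$1
EOF
}

# Mirror the template's check: no third field means the plain template.
choose_template() {
  parse_line "$1"
  if [ -z "$arg3" ]; then
    echo plain
  else
    echo overlay
  fi
}

# Try it on two lines shaped like the example sequence.
printf 'Beginning Video with Music.mp4\nRelative5.mp4\tName 5\tmandala3.png\t#A67761\tSN1s.jpg\n' |
while IFS= read -r line; do
  printf '%s -> %s\n' "$line" "$(choose_template "$line")"
done
```

Because read -r is used, a backslash in a file name (as in the Mom \& Dad.mp4 line) is kept as written rather than interpreted.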

# arg1: video file, arg2: caption text, arg3 and arg5: overlay images, arg4: padding color.
# "$output" is a placeholder: the original snippet does not show the output file argument.
if [[ -z "$arg3" ]]; then
    # No overlay image given: only normalise size, frame rate and loudness.
    ffmpeg -hide_banner -i "$arg1" -t 10 \
    -c:a aac -c:v libx264 -crf 23 \
    -filter_complex "[0:v]scale=1920x1080:force_original_aspect_ratio=decrease,pad=1920:1080:0:0:color=black, \
    setdar=16/9,setsar=1/1,fps=fps=30,format=yuv420p; \
    [0:a]loudnorm=i=-24:tp=-2:lra=7" \
    "$output"
else
    # Shrink the video, pad it with color $arg4, then add both images and the caption.
    ffmpeg -hide_banner -i "$arg1" -t 10 -i "$arg3" -i "$arg5" \
    -filter_complex "[0:v]scale=1600:900:force_original_aspect_ratio=decrease,pad=1920:1080:320:0:color=$arg4,setdar=16/9,setsar=1/1,fps=fps=30,format=yuv420p[V1]; \
    [V1][1:v]overlay=(overlay_w/2)*-1+40:main_h-(overlay_h/2)-40[V2]; \
    [V2][2:v]overlay=0:450-(overlay_h/2)[V3]; \
    [V3]drawtext=fontfile=./Sanchez-Regular.ttf: text='$arg2': fontcolor=white: fontsize=64: x=(w-text_w)/2: y=(h-90-(text_h/2)); \
    [0:a]loudnorm=i=-24:tp=-2:lra=7" \
    -c:a aac -c:v libx264 -crf 23 \
    "$output"
fi
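The numbers in the overlay template follow from simple layout arithmetic. The sketch below recomputes them, assuming a 16:9 source so that the scaled video fills the whole 1600x900 box; the variable names are illustrative only.

```shell
# Canvas and scaled-video sizes taken from the overlay template.
CANVAS_W=1920
CANVAS_H=1080
VIDEO_W=1600
VIDEO_H=900

# pad=1920:1080:320:0 puts the video 320 px from the left and 0 px from the top,
# which leaves a strip on the left and a strip along the bottom.
LEFT_STRIP=$((CANVAS_W - VIDEO_W))
BOTTOM_STRIP=$((CANVAS_H - VIDEO_H))

# drawtext y=(h-90-(text_h/2)) centers the caption vertically on y = h - 90.
CAPTION_CENTER_Y=$((CANVAS_H - 90))
# Vertical middle of the bottom strip, for comparison.
BOTTOM_STRIP_MIDDLE=$((VIDEO_H + BOTTOM_STRIP / 2))

echo "left strip: ${LEFT_STRIP}px, bottom strip: ${BOTTOM_STRIP}px"
echo "caption center y: $CAPTION_CENTER_Y, bottom strip middle: $BOTTOM_STRIP_MIDDLE"
```

Under that assumption the caption sits in the middle of the bottom strip, and the second image, placed at y = 450 - overlay_h/2, is centered on the vertical middle of the 900 px video area.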

More templates here

Further Reading

I found the following tutorials and articles very helpful while working on this project:

  1. Better Together: Amazon ECS and AWS Lambda | AWS Compute Blog
  2. How to manage Terraform state. A guide to file layout, isolation, and… | by Yevgeniy Brikman | Gruntwork
  3. Copy all files in a folder from Google Drive to AWS S3 (Example)
  4. Docker Images : Part I - Reducing Image Size
  5. Fargate as Batch Service. AWS Fargate can be a useful service for… | by Ava Chen | Aug, 2020 | Medium
  6. Serverless Applications with AWS Lambda and API Gateway | Terraform - HashiCorp Learn
  7. Why does AWS Lambda need to pass ecsTaskExecutionRole to ECS task
  8. EFS and ECS
  9. FFMPEG documentation