transcode and cache short video clips & thumbnails from long-form video for cheap on GCP Functions
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Type Name Latest commit message Commit time
Failed to load latest commit information.

serverless video trimmer & thumbnailer

Trim or transcode video clips from long files on serverless platforms (Lambda and GCF). Used in a video pipeline to generate custom user-specified clips server-side. Especially quick for long files as trimming/transcoding is done with byte-range requests. Repeated requests are served from a storage-backed bucket on S3 or GCS.


The general format of the URL is https://FUNCTION_URL/{operation}/{parameters}/{source_filename}.

The operation is either a video trim or a thumbnail.

Allowed parameters are fast, width, height, start and end:

  • fast (optional) does a straight codec copy (-c copy in ffmpeg), avoiding a slow transcoding step
  • width and height (optional) specifies output width and height in pixels. For thumbnails, a percentage can be specified.
  • start and end (optional) are the start and finish timestamps in seconds.

Example live URLs

Return a thumbnail 200px wide at 50 seconds:,width:200/5_minute_timer.mp4

Return a thumbnail at 50% duration (original video size):

Return a video clip from 10.5 seconds to 20.5 seconds, compressed to 240p:,end:20.5,height:240/5_minute_timer.mp4

Return a video clip from 10.5 seconds to 20.5 seconds (original video size):,end:20.5,fast/5_minute_timer.mp4

Advantages & disadvantages

The pseudo-streaming/byte-range approach to the input file has several key advantages:

  • Not limited by Lambda / GCF limitations on storage capacity, which for longer videos can be easy to hit
  • Trimming LONG videos (>5GB) is much faster, as the entire original file does not need to be downloaded first, only the relevant section
  • The video is being transcoded as it is being downloaded

Of course, serverless also comes with its own advantages:

  • Massive scalability & concurrency
  • No need to provision servers
  • Reduced costs, because functions run only when requested

This approach also comes with disadvantages (some pretty big):

  • The first uncached request will take some time before a response is generated
  • Not all video formats support pseudo/progressive streaming
  • Requires generation of signed URLs as it relies on HTTP byte range requests
  • Transcoding can be slow, as even the largest GCF configurations are very weak for video processing (max 2 vCPU)
  • GCF as of this time does not support caching with a CDN layer in front natively, so a regional GCS bucket is used instead as a asset store
  • 301/302/307 replies for existing assets is another set of handshakes for the client, slightly increasing load times

Possible improvements

  • Switch from HTTP triggers to a pub/sub model
  • Put behind Firebase Hosting and use it as a CDN
  • Improve cold start times
  • Improve ffmpeg start up times with a barebones static build

Technologies used

  • python 3.7
  • Google Cloud Datastore
  • Google Cloud Storage
  • Google Cloud Functions
  • ffmpeg
  • ffprobe
  • Flask
  • terraform


  • From gcloud command line: gcloud functions deploy serverless-ffmpeg-segmentation --runtime python37 --region=asia-northeast1 --trigger-http --entry-point=trim --memory=2048MB
  • Using terraform: run terraform init then terraform apply from the terraform folder. Will set up artifact bucket and automatically zip and create a function video-segmentation.


I ran into issues with the static build of ffmpeg in that it errors with a segfault if doing a overlay filter with a seeked HTTP streaming input. I believe it is related to the timestamp being negative after seeking on a pseudo-streaming input, but more testing is needed. No issues with the build installed through apt.