google/ai_video_dubbing

Context

AI Dubbing lets you create localized videos from a single base video by adding translated voice-overs generated with the Google AI-powered Text-to-Speech API.

Pre-requisites:

  • Google Cloud

  • Google Workspace (Google Spreadsheets)

  • A Google Cloud user with privileges over all the APIs listed in the config (ideally the Owner role), so that the required roles can be granted to the Service Account automatically

  • Latest version of Terraform installed

  • Python version >= 3.8.1 installed

  • Python Virtualenv installed

Roles that will be automatically granted to the service account during the installation process:

"roles/iam.serviceAccountShortTermTokenMinter"

"roles/storage.objectAdmin"

"roles/pubsub.publisher"

Installation Steps:

  1. Open a shell

  2. Clone the git repository

  3. Open a text editor and configure the following installation variables in the file “variables.tf”

    variable "gcp_project" {
      type        = string
      description = "Google Cloud Project ID where the artifacts will be deployed"
      default     = "my-project"
    }

    variable "gcp_region" {
      type        = string
      description = "Google Cloud Region"
      default     = "my-gcp-region"
    }

    variable "ai_dubbing_sa" {
      type        = string
      description = "Service Account for the deployment"
      default     = "ai-dubbing"
    }

    variable "ai_dubbing_bucket_name" {
      type        = string
      description = "GCS bucket used for deployment tasks"
      default     = "my-bucket"
    }

    variable "config_spreadsheet_id" {
      type        = string
      description = "The ID of the config spreadsheet"
      default     = "my-google-sheets-sheet-id"
    }

    # Do not set this value to a high frequency, since executions might overlap
    variable "execution_schedule" {
      type        = string
      description = "The schedule to execute the process (every 30 min default)"
      default     = "*/30 * * * *"
    }

  4. Now execute “terraform apply”

  5. Type “yes” and hit return when the system asks for confirmation

Generated Cloud Artifacts

  • Service Account: if it does not exist, it will be created using the values in the configuration
  • Cloud Scheduler: ai-dubbing-trigger
  • Cloud Functions: generate_tts_file, generate_video_file. Both are triggered by Pub/Sub
  • Cloud Pub/Sub topics: generate_tts_files_trigger, generate_video_file_trigger

Output

Every generated artifact is stored in the supplied GCS bucket under the output/YYYYMMDD folder, where YYYY is the year, MM the month, and DD the day of the processing date.

Audio Files

The generated TTS audio files will be stored as mp3

{campaign}-{topic}-{voice_id}.mp3

Audio url: gs://{gcs_bucket}/output/{YYYYMMDD}/{campaign}-{topic}-{voice_id}.mp3

Video Files

The generated video files will be stored as mp4

{campaign}-{topic}-{voice_id}.mp4

Video url: gs://{gcs_bucket}/output/{YYYYMMDD}/{campaign}-{topic}-{voice_id}.mp4
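The naming pattern above can be sketched in Python as follows. This is a minimal illustration, not the project's own code: the function name `output_uri` is hypothetical, and the lowercasing of the file name is an assumption inferred from the sample final_video_file_url shown in the configuration table of this README.

```python
from datetime import date


def output_uri(gcs_bucket, campaign, topic, voice_id, extension, run_date):
    """Build the expected gs:// location of a generated artifact.

    Hypothetical helper: it mirrors the documented pattern
    gs://{gcs_bucket}/output/{YYYYMMDD}/{campaign}-{topic}-{voice_id}.{ext}
    """
    folder = run_date.strftime("%Y%m%d")  # YYYYMMDD of the processing date
    # Assumption: the file name is lowercased, as in the README sample URL.
    file_name = f"{campaign}-{topic}-{voice_id}.{extension}".lower()
    return f"gs://{gcs_bucket}/output/{folder}/{file_name}"


video_uri = output_uri(
    "videodub_tester", "summer", "outdoor",
    "en-GB-Wavenet-C##FEMALE", "mp4", date(2023, 4, 20),
)
print(video_uri)
```

Passing "mp3" as the extension gives the matching audio location for the same row.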

How to activate

Most of the effort goes into building the first SSML text and adapting its timings to the video. Once that task is mastered, creating further videos is a breeze!

You can use the web-based SSML-Editor for this purpose, and then export each SSML file.
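If you prefer to draft the SSML programmatically, the sketch below shows one possible way to assemble it with pauses that line the speech up with the video. It is only an illustration: the helper name `build_ssml` and its parameters are hypothetical, and it uses only the standard SSML elements `speak`, `break`, and `prosody`.

```python
import xml.etree.ElementTree as ET


def build_ssml(sentences, lead_in_ms=0, rate="medium"):
    """Return an SSML string for a list of (text, pause_after_ms) pairs.

    Hypothetical helper: pauses are expressed as <break time="...ms"/>
    elements, which is how the speech is timed against the video.
    """
    speak = ET.Element("speak")
    if lead_in_ms > 0:
        # Silence before the first sentence.
        ET.SubElement(speak, "break", {"time": f"{lead_in_ms}ms"})
    for text, pause_ms in sentences:
        prosody = ET.SubElement(speak, "prosody", {"rate": rate})
        prosody.text = text
        if pause_ms > 0:
            # Silence after this sentence.
            ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml([("Design what you love", 300)], lead_in_ms=500)
print(ssml)
```

Building the document as an element tree rather than by string concatenation keeps the tags balanced and escapes special characters in the text.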

What’s required for video generation

  • A file containing a base video without music
  • A file containing the music for the video

Configure the input

  1. Create a copy of the configuration spreadsheet

  2. Configure the fields in the sheet “config” following the instructions

| Field Name | Type | Mandatory | Description | Sample Value | Notes |
|---|---|---|---|---|---|
| campaign | Input | Yes | A string to generate the name of the video | summer | |
| topic | Input | Yes | A string to generate the name of the video | outdoor | |
| gcs_bucket | Input | Yes | The bucket where video_file and base_audio_file are located (the service account must be granted access). We recommend using the same gcs_bucket as for the output | videodub_test_input | |
| video_file | Input | Yes | The location of the master video file within the gcs_bucket | input/videos/bumper_base_video.mp4 | |
| base_audio_file | Input | No | The location of the base audio file within the gcs_bucket | input/audios/bumper_base_audio.mp3 | |
| text | Input | Yes | The SSML text to convert to speech | Find your own style in the constantly renewed catalog of the <emphasis level="strong">somewhere.com online shop</emphasis></prosody> Design what you love</prosody> | Check SSML supported syntax |
| voice_id | Input | Yes | The id of the voice to use | en-GB-Wavenet-C##FEMALE | Check voices here |
| millisecond_start_audio | Input | No | Millisecond of the video when the audio must start. This could also be accomplished using TTS | 0 | |
| audio_encoding | Input | Yes | The audio encoding available | MP3 | At the moment only MP3 is supported |
| base_audio_vol_percent | Input | Yes | Modifies the volume of the base audio (whether in the base video or in the base audio file) | 0.6 | |
| final_video_file_url | Output | N/A | The location of the generated video file with the base audio and speech | gs://videodub_tester/output/20230420/summer-outdoor-en-gb-wavenet-c##female.mp4 | |
| status | Output | N/A | The status of the process | Video OK | |
| last_update | Output | N/A | The last time the row was modified by the automatic process | 2023/04/20, 12:25:16 | |
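Before waiting for a scheduled run, it can help to check a row against the rules in the table above. The following is a minimal sketch, not part of the project: `validate_row` is a hypothetical helper, the row is assumed to be a plain dictionary keyed by field name, and the check that millisecond_start_audio is a non-negative integer is an assumption based on its description.

```python
MANDATORY_FIELDS = (
    "campaign", "topic", "gcs_bucket", "video_file",
    "text", "voice_id", "audio_encoding", "base_audio_vol_percent",
)


def _text(row, name):
    """Return the cell as a stripped string, treating None as empty."""
    value = row.get(name)
    return "" if value is None else str(value).strip()


def validate_row(row):
    """Return a list of problems found in one config row (empty if none)."""
    problems = [
        f"missing mandatory field: {name}"
        for name in MANDATORY_FIELDS
        if not _text(row, name)
    ]
    encoding = _text(row, "audio_encoding")
    if encoding and encoding != "MP3":
        # The table notes that only MP3 is supported at the moment.
        problems.append("audio_encoding must be MP3")
    start = _text(row, "millisecond_start_audio")
    if start and not start.isdigit():
        # Assumption: the start offset is a non-negative whole number.
        problems.append("millisecond_start_audio must be a non-negative integer")
    return problems


sample_row = {
    "campaign": "summer",
    "topic": "outdoor",
    "gcs_bucket": "videodub_test_input",
    "video_file": "input/videos/bumper_base_video.mp4",
    "base_audio_file": "input/audios/bumper_base_audio.mp3",
    "text": "<speak>Design what you love</speak>",
    "voice_id": "en-GB-Wavenet-C##FEMALE",
    "millisecond_start_audio": 0,
    "audio_encoding": "MP3",
    "base_audio_vol_percent": 0.6,
}
print(validate_row(sample_row))
```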

Trigger the generation process

Once all the configuration is set in the spreadsheet, the process will run every X minutes, as defined by the execution_schedule.

The “status” column is updated as the process runs. The possible values are:

  • “TTS OK”: audio file generated correctly
  • “Video OK”: video file generated correctly
  • Other value: an error occurred

When all the cells in the status column display “Video OK”, the process has completed successfully.

When every cell displays either “Video OK” or a value other than “TTS OK”, the process has completed, but some rows may have errors.
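The completion rules above can be expressed as a small check over the values of the status column. This is a sketch under stated assumptions: `summarize_status` is a hypothetical helper, and treating an empty cell as "not started yet" rather than as an error is an assumption, since the README does not describe empty cells.

```python
def summarize_status(statuses):
    """Classify status cells according to the rules described above."""
    cleaned = [(s or "").strip() for s in statuses]
    done = [s for s in cleaned if s == "Video OK"]
    # "TTS OK" means the audio exists but the video is still pending.
    # Assumption: an empty cell means the row has not been processed yet.
    pending = [s for s in cleaned if s in ("TTS OK", "")]
    errors = [s for s in cleaned if s not in ("Video OK", "TTS OK", "")]
    return {
        "finished": not pending,  # nothing left for the process to do
        "videos_ok": len(done),
        "pending": len(pending),
        "errors": errors,
    }


print(summarize_status(["Video OK", "TTS OK", "Video OK"]))
```

A result with `finished` set to true and an empty `errors` list corresponds to the case where every row shows “Video OK”.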

Just download the videos from gs://{gcs_bucket}/output/{YYYYMMDD} and make the best use of them.

Note:

For the initial tests, the scheduled execution period might be too long. The recommendation in these kinds of situations is just to disable the schedule and run it on demand. To do that:

  1. Go to the Cloud Scheduler tab in your Google Cloud project
  2. Check the box next to “ai-dubbing-trigger”
  3. Click on “Force Run”
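The same on-demand run can also be triggered from Python instead of the console. The sketch below is an assumption-laden illustration, not part of this project: it relies on the separately installed google-cloud-scheduler client library, assumes the job lives in the region configured as gcp_region, and the helper names `scheduler_job_name` and `force_run` are hypothetical.

```python
JOB_ID = "ai-dubbing-trigger"  # Cloud Scheduler job created by the installation


def scheduler_job_name(project_id, region, job_id=JOB_ID):
    """Build the fully qualified Cloud Scheduler job resource name."""
    return f"projects/{project_id}/locations/{region}/jobs/{job_id}"


def force_run(project_id, region):
    """Run the scheduler job immediately, like the console's force run.

    Requires `pip install google-cloud-scheduler` and credentials that
    are allowed to run Cloud Scheduler jobs in the project.
    """
    # Imported lazily so the name helper works without the library installed.
    from google.cloud import scheduler_v1

    client = scheduler_v1.CloudSchedulerClient()
    return client.run_job(name=scheduler_job_name(project_id, region))


# Hypothetical project and region values, for illustration only.
print(scheduler_job_name("my-project", "my-gcp-region"))
```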
