This project generates a talking-face video from a source image and an input audio clip using the SadTalker framework.
This guide provides complete setup instructions for Windows (Python 3.9, CPU version).
Download and install Python 3.9 (64-bit) from:
👉 https://www.python.org/downloads/release/python-390/
During installation:
- Check “Add Python to PATH”
- Then click Install Now
Verify installation:
python --version
SadTalker depends on FFmpeg for video/audio processing.
- Option A – Install via Winget
winget install ffmpeg
- Option B – Manual installation
Download ZIP from https://ffmpeg.org/download.html
Extract to C:\ffmpeg
Add C:\ffmpeg\bin to System PATH
Verify:
ffmpeg -version
Create a new folder and clone the project:
mkdir E:\Project_SadTalker
cd E:\Project_SadTalker
git clone https://github.com/<your-github-username>/<your-repo-name>.git
cd <your-repo-name>
Create and activate a virtual environment:
py -3.9 -m venv venv
venv\Scripts\activate
Verify the Python version:
python -V
It should show something like:
Python 3.9.x
Change into the sadtalker folder:
cd sadtalker
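Before installing packages, the setup so far can be double-checked with a short script. The sketch below is not part of the repository. The function name check_environment is hypothetical, and it only assumes what this guide states: Python 3.9, FFmpeg on PATH, and an active virtual environment.

```python
import shutil
import sys


def check_environment(required=(3, 9)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []

    # The guide targets Python 3.9, so compare major.minor only.
    if tuple(sys.version_info[:2]) != tuple(required):
        problems.append(
            f"Python {required[0]}.{required[1]} expected, found "
            f"{sys.version_info.major}.{sys.version_info.minor}"
        )

    # shutil.which looks the executable up on PATH, as the shell would.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")

    # Inside a venv, sys.prefix differs from sys.base_prefix.
    if sys.prefix == sys.base_prefix:
        problems.append("no virtual environment appears to be active")

    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Environment looks ready.")
```

Saved under any name (for example check_env.py, a hypothetical file), it can be run with python check_env.py from the activated environment.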
pip install torch==1.12.1+cpu torchvision==0.13.1+cpu torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cpu
python -m pip install basicsr==1.4.2 filterpy==1.4.5 gfpgan==1.3.8 facexlib==0.3.0
pip install -r requirements.txt
Download the pretrained model weights by running the script download_models.py:
python download_models.py
This will create two new folders, checkpoints and gfpgan/weights.
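To confirm that the installation and download steps completed, a small check like the following may help. It is a sketch rather than a script shipped with the project. The helper names are hypothetical, the package list mirrors the pip commands in this guide, and it assumes the two folders are created relative to the current directory.

```python
import importlib.util
from pathlib import Path

# Import names of the packages installed by the pip commands in this guide.
PACKAGES = ["torch", "torchvision", "torchaudio", "basicsr", "filterpy", "gfpgan", "facexlib"]
# Folders that download_models.py is described as creating.
MODEL_DIRS = ["checkpoints", "gfpgan/weights"]


def missing_packages(names):
    """Return the names that cannot be located, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def empty_or_missing_dirs(root, dirs):
    """Return the folders under root that do not exist or are empty."""
    bad = []
    for rel in dirs:
        folder = Path(root) / rel
        if not folder.is_dir() or not any(folder.iterdir()):
            bad.append(rel)
    return bad


if __name__ == "__main__":
    print("Missing packages:", missing_packages(PACKAGES) or "none")
    print("Missing or empty model folders:", empty_or_missing_dirs(".", MODEL_DIRS) or "none")
```

The check only confirms that the folders contain something. It does not verify that every weight file was downloaded completely.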
Two input files are required: an image (the avatar image) and an audio file (spoken audio in .wav format).
To create the avatar image, open the script get_avatar_image.py, replace the OpenAI API key with your own, change the input prompt and output image path as required, and then run:
python get_avatar_image.py
Check the quality and accuracy of the generated avatar image. If you are not satisfied, re-run the script and inspect the new output.
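Rather than pasting the API key into the script, one option is to read it from an environment variable so the key never ends up in version control. The sketch below assumes this approach. load_api_key is a hypothetical helper and not part of get_avatar_image.py, whose internals are not shown in this guide. OPENAI_API_KEY is the variable name the OpenAI Python library reads by default.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise with a clear message."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; run `set {env_var}=<your key>` "
            "in the same terminal before starting the script"
        )
    return key


# Hypothetical usage inside a script such as get_avatar_image.py:
#     api_key = load_api_key()
```

On Windows, set OPENAI_API_KEY=<your key> applies only to the current Command Prompt session, so it has to be run in the same window as the script.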
Open the script run_inference.py, set source_image_path and audio_path as required, and run it with:
python run_inference.py
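Inference on CPU can take a while, so it may be worth validating the two input files before starting it. The following is a minimal sketch, assuming the image is a PNG or JPEG and the audio is an uncompressed PCM .wav file. validate_inputs and the accepted image extensions are assumptions made for illustration and are not taken from run_inference.py.

```python
import wave
from pathlib import Path

# Assumed set of acceptable image extensions; adjust if the project supports more.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def validate_inputs(source_image_path, audio_path):
    """Check that both inputs exist and look usable; return the audio duration in seconds."""
    image = Path(source_image_path)
    audio = Path(audio_path)

    for path in (image, audio):
        if not path.is_file():
            raise FileNotFoundError(f"Input file not found: {path}")

    if image.suffix.lower() not in IMAGE_SUFFIXES:
        raise ValueError(f"Unexpected image type: {image.suffix}")
    if audio.suffix.lower() != ".wav":
        raise ValueError(f"Audio must be a .wav file, got: {audio.suffix}")

    # The stdlib wave module reads PCM WAV only and raises wave.Error otherwise.
    with wave.open(str(audio), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()

    if frames == 0:
        raise ValueError("Audio file contains no samples")
    return frames / rate


# Hypothetical usage with the same values set in run_inference.py:
#     seconds = validate_inputs(source_image_path, audio_path)
#     print(f"Audio length: {seconds:.1f} s")
```

Knowing the audio length up front also gives a rough sense of how long the generated video will be.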