<a href="https://colab.research.google.com/github/ahern88/colab_repo/blob/main/SadTalker.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
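#@title **setup (about 5 minutes)**

# Print the current Python version (initial state)
print('Initial Python version:')
!python3 --version

# Register Python 3.8 and 3.9 as alternatives for python3. The higher priority (2)
# goes to 3.8, so 3.8 is selected when both are installed.
# Note: Colab runtimes usually ship Python 3.10 or newer. If /usr/bin/python3.8 and
# /usr/bin/python3.9 do not exist, both commands fail and the default stays unchanged.
# They are kept in case the SadTalker project still depends on 3.8/3.9.
!update-alternatives --install /usr/local/bin/python3 python3 /usr/bin/python3.8 2
!update-alternatives --install /usr/local/bin/python3 python3 /usr/bin/python3.9 1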

# Verify the current Python version
print('\nPython version after alternatives update:')
!python3 --version

# Update the apt package lists
print('\nUpdating apt packages...')
!apt-get update

# Install software-properties-common
print('Installing software-properties-common...')
!apt install -y software-properties-common

# Remove pip/setuptools/wheel packages that may conflict
print('Removing potentially conflicting pip/setuptools/wheel...')
!sudo dpkg --remove --force-remove-reinstreq python3-pip python3-setuptools python3-wheel || true
# `|| true` keeps the command from failing when these packages are not installed

# Reinstall python3-pip
print('Installing python3-pip...')
!apt-get install -y python3-pip

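print('\nGit clone project and install requirements...')
!git clone https://github.com/OpenTalker/SadTalker.git &> /dev/null
%cd SadTalker
# `!export ...` runs in a throwaway subshell and does not persist,
# so set PYTHONPATH on the kernel process; later `!` commands inherit it.
import os
os.environ['PYTHONPATH'] = '/content/SadTalker:' + os.environ.get('PYTHONPATH', '')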

# --- PyTorch and CUDA compatibility handling ---
# Check the CUDA version of the current Colab environment
print('\nChecking CUDA version in Colab environment...')
!nvidia-smi

# Get Colab's CUDA version (a rough estimate read from the nvidia-smi header)
cuda_version_output = !nvidia-smi
cuda_version = "unknown"
for line in cuda_version_output:
    if "CUDA Version:" in line:
        try:
            version_str = line.split("CUDA Version:")[1].strip().split(" ")[0]
            cuda_version = "cu" + "".join(filter(str.isdigit, version_str))
            break
        except IndexError:
            # Malformed header line: keep "unknown"
            pass

print(f"Detected CUDA version string from nvidia-smi: {cuda_version}")

# Define the PyTorch versions and the matching index URL.
# Note: confirm these against the PyTorch site (https://pytorch.org/get-started/locally/)
# for the CUDA version detected above (for example cu124 or cu118) and update them by hand.
# The earlier failing combination, `torch==2.7.1+cu124` with
# `--index-url https://download.pytorch.org/whl/cu121`, was mismatched: a `+cu124` build
# would normally be expected under `whl/cu124`. `2.7.1+cu124` is also an uncommon
# combination; the site usually offers prebuilt packages such as cu121 or cu118.
# A `+cuXXX` pin can only be used if the PyTorch site actually lists that build.

# Hypothetical fix, assuming the CUDA 12.1 environment commonly seen on Colab.
# Replace these with the versions the PyTorch site actually recommends:
pytorch_version = "2.3.1"  # check the PyTorch site for CUDA 12.1 compatibility
torchvision_version = "0.18.1"  # matches the PyTorch version
torchaudio_version = "2.3.1"  # matches the PyTorch version
pytorch_index_url = "https://download.pytorch.org/whl/cu121"  # CUDA version commonly seen on Colab

# If nvidia-smi shows CUDA 11.8:
# pytorch_version = "2.3.1"
# torchvision_version = "0.18.1"
# torchaudio_version = "2.3.1"
# pytorch_index_url = "https://download.pytorch.org/whl/cu118"

print(f"Attempting to install PyTorch version: {pytorch_version} with index URL: {pytorch_index_url}")

# --- Core fix: run the shell command directly ---
# Build the full pip command, following the form the PyTorch site generates, e.g.
#   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
# The versions carry no `+cuXXX` suffix: pip picks the wheel that fits the Python
# version and CUDA 12.1 from that index, and a suffix such as `+cu124` would restrict
# the lookup. If a suffix is used at all, it must match the index URL.
pip_install_cmd = (
    f"python3 -m pip install "
    f"torch=={pytorch_version} "
    f"torchvision=={torchvision_version} "
    f"torchaudio=={torchaudio_version} "
    f"--index-url {pytorch_index_url}"
)


print(f"Final PyTorch installation command to execute:\n!{pip_install_cmd}")

# Run the PyTorch installation command.
# IPython expands {pip_install_cmd} into the shell command before running it.
!{pip_install_cmd}


# If the installation above succeeded, continue with requirements.txt
print('\nUpdating apt packages again...')
!apt update
print('Installing ffmpeg...')
!apt install -y ffmpeg &> /dev/null

print('Installing other requirements from requirements.txt...')
!python3 -m pip install -r requirements.txt

print('\nSetup complete!')

Initial Python version:
Python 3.11.13
update-alternatives: error: alternative path /usr/bin/python3.8 doesn't exist
update-alternatives: error: alternative path /usr/bin/python3.9 doesn't exist

Python version after alternatives update:
Python 3.11.13

Updating apt packages...
Get:1 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease [1,581 B]
Hit:2 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu jammy InRelease
Get:3 https://ppa.launchpadcontent.net/graphics-drivers/ppa/ubuntu jammy InRelease [24.3 kB]
Get:4 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease [3,632 B]
Get:5 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  Packages [1,802 kB]
Hit:6 https://ppa.launchpadcontent.net/ubuntugis/ppa/ubuntu jammy InRelease
Get:7 https://ppa.launchpadcontent.net/graphics-drivers/ppa/ubuntu jammy/main amd64 Packages [46.4 kB]
Get:8 https://r2u.stat.illinois.edu/ubuntu jammy InRelease [6,555 B]
Get:9 https://
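
The setup cell reads a CUDA tag from `nvidia-smi` but then installs from a hard-coded index URL, so the detected value never influences the install. A minimal sketch of connecting the two follows. The helper names `cuda_tag_from_smi` and `pip_install_command` and the sample header line are hypothetical; the version numbers are the ones used in the cell. The `CUDA Version` field printed by `nvidia-smi` is the highest version the driver supports, so the derived tag is only a starting point, and an index for that tag is not guaranteed to exist on download.pytorch.org.

```python
import re


def cuda_tag_from_smi(lines):
    """Return a tag such as 'cu124' from nvidia-smi output lines, or None."""
    for line in lines:
        match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", line)
        if match:
            return f"cu{match.group(1)}{match.group(2)}"
    return None


def pip_install_command(torch, torchvision, torchaudio, cuda_tag):
    """Build the pip command string for the wheel index named by cuda_tag."""
    index_url = f"https://download.pytorch.org/whl/{cuda_tag}"
    return (
        "python3 -m pip install "
        f"torch=={torch} torchvision=={torchvision} torchaudio=={torchaudio} "
        f"--index-url {index_url}"
    )


# Made-up header line in the layout nvidia-smi prints (illustration only).
sample = ["| NVIDIA-SMI 550.54.15   Driver Version: 550.54.15   CUDA Version: 12.4 |"]

# Fall back to the cell's assumed cu121 index when nothing is detected.
tag = cuda_tag_from_smi(sample) or "cu121"
print(pip_install_command("2.3.1", "0.18.1", "2.3.1", tag))
```

In the notebook, `cuda_version_output` from the setup cell could be passed in place of `sample`. Whether the chosen PyTorch version is published for the resulting index still has to be checked on the PyTorch site.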

In [12]:
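# !pip uninstall -y imageio imageio-ffmpeg Pillow
# Reinstall imageio and its recommended backends
!pip install imageio imageio-ffmpeg Pillow
# SadTalker may need a specific ffmpeg version, so make sure ffmpeg is available in the environment.
# If ffmpeg is not installed on the system, install it as well; Colab usually has it preinstalled.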

Collecting imageio
  Downloading imageio-2.37.0-py3-none-any.whl.metadata (5.2 kB)
Collecting imageio-ffmpeg
  Downloading imageio_ffmpeg-0.6.0-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting Pillow
  Downloading pillow-11.3.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (9.0 kB)
Downloading imageio-2.37.0-py3-none-any.whl (315 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 315.8/315.8 kB 11.9 MB/s eta 0:00:00
Downloading imageio_ffmpeg-0.6.0-py3-none-manylinux2014_x86_64.whl (29.5 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 29.5/29.5 MB 70.9 MB/s eta 0:00:00
Downloading pillow-11.3.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (6.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.6/6.6 MB 122.0 MB/s eta 0:00:00
Installing collected packages: Pillow, imageio-ffmpeg, imageio
ERROR: pip'

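The comments in the cell above ask that ffmpeg be available before inference. One way to check this from Python is sketched below, assuming a system `ffmpeg` on the PATH is preferred and the binary bundled with the `imageio-ffmpeg` package is an acceptable fallback. The helper name `find_ffmpeg` is hypothetical, and the sketch does not check which ffmpeg version SadTalker needs.

```python
import shutil


def find_ffmpeg():
    """Return a path to an ffmpeg executable, or None if none is found."""
    path = shutil.which("ffmpeg")  # system install, e.g. from `apt install ffmpeg`
    if path:
        return path
    try:
        import imageio_ffmpeg  # optional: the pip package installed above
        return imageio_ffmpeg.get_ffmpeg_exe()
    except (ImportError, RuntimeError):
        # Package missing, or it could not locate a usable binary
        return None


ffmpeg_path = find_ffmpeg()
print(ffmpeg_path or "ffmpeg not found")
```
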
In [3]:
#@title **download model (about 1 minute)**
print('Download pre-trained models...')
!bash scripts/download_models.sh

Download pre-trained models...
--2025-07-02 14:45:09--  https://github.com/OpenTalker/SadTalker/releases/download/v0.0.2-rc/mapping_00109-model.pth.tar
Resolving github.com (github.com)... 140.82.121.3
Connecting to github.com (github.com)|140.82.121.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/569518584/ccc415aa-c6f4-47ee-8250-b10bf440ba62?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20250702%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20250702T144509Z&X-Amz-Expires=1800&X-Amz-Signature=cee7539fab5507205c422074b7cd33ee6921c4945ea4290dab4491b280dd600a&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B%20filename%3Dmapping_00109-model.pth.tar&response-content-type=application%2Foctet-stream [following]
--2025-07-02 14:45:09--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/569518584/ccc415aa-c6f4-47ee
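
Because the download log is long, a quick check that the expected files actually arrived can be useful before running inference. The sketch below is one way to do that. The helper name `missing_files` is hypothetical, and the `checkpoints` directory is an assumption about where `scripts/download_models.sh` writes; only the file name `mapping_00109-model.pth.tar` is taken from the log above.

```python
from pathlib import Path


def missing_files(directory, names):
    """Return the names that are absent or empty under `directory`."""
    root = Path(directory)
    missing = []
    for name in names:
        candidate = root / name
        if not candidate.is_file() or candidate.stat().st_size == 0:
            missing.append(name)
    return missing


# File name taken from the download log; the directory is an assumption.
expected = ["mapping_00109-model.pth.tar"]
print(missing_files("checkpoints", expected))
```

An empty list means every listed file exists and is non-empty; it does not verify that a download completed without corruption.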

In [13]:
# This is a standalone Colab code cell
image ='test.jpg'
audio ='test.wav'
source_image = 'examples/' + image
driven_audio = 'examples/' + audio

!python3 inference.py --driven_audio $driven_audio \
           --source_image $source_image \
           --result_dir ./results --enhancer gfpgan

using safetensor as default
3DMM Extraction for source image
landmark Det:: 100% 1/1 [00:00<00:00,  5.58it/s]
3DMM Extraction In Video:: 100% 1/1 [00:00<00:00, 11.73it/s]
mel:: 100% 316/316 [00:00<00:00, 42841.91it/s]
audio2exp:: 100% 32/32 [00:00<00:00, 228.34it/s]
Face Renderer:: 100% 158/158 [01:21<00:00,  1.94it/s]
The generated video is named ./results/2025_07_02_15.14.28/test##test.mp4
face enhancer....
Face Enhancer:: 100% 316/316 [01:40<00:00,  3.13it/s]
The generated video is named ./results/2025_07_02_15.14.28/test##test_enhanced.mp4
The generated video is named: ./results/2025_07_02_15.14.28.mp4


In [9]:
#@title **play movie**
import glob
from IPython.display import HTML
from base64 import b64encode
import os, sys

# get the last from results
mp4_name = sorted(glob.glob('./results/*.mp4'))[-1]

mp4 = open('{}'.format(mp4_name),'rb').read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()

print('Display animation: {}'.format(mp4_name), file=sys.stderr)
display(HTML("""
  <video width=256 controls>
        <source src="%s" type="video/mp4">
  </video>
  """ % data_url))

IndexError: list index out of range
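
The `IndexError` above is what `sorted(glob.glob(...))[-1]` raises when the glob matches nothing, for example if the cell runs before inference has written a video or from a different working directory. A minimal sketch of a guarded lookup follows; the helper name `latest_mp4` is hypothetical, and it keeps the original cell's choice of taking the last file in name order.

```python
import glob
import os


def latest_mp4(results_dir="./results"):
    """Return the last .mp4 in name order, or None when there is none."""
    candidates = sorted(glob.glob(os.path.join(results_dir, "*.mp4")))
    return candidates[-1] if candidates else None


mp4_name = latest_mp4()
if mp4_name is None:
    print("No .mp4 found under ./results; run the inference cell first.")
else:
    print(f"Display animation: {mp4_name}")
```

Name order matches creation order here only because the result files are named after timestamps, as in the inference output.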

In [5]:
#@title **inference for portrait**
image ='test.jpg' #@param {type:"string"}
audio ='test.wav' #@param {type:"string"}
source_image = 'examples/' + image
driven_audio = 'examples/' + audio

!python3.8 inference.py --driven_audio $driven_audio \
           --source_image $source_image \
           --result_dir ./results --still --preprocess full --enhancer gfpgan

/bin/bash: line 1: python3.8: command not found
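
The `command not found` error comes from the hard-coded `python3.8`, which this runtime does not provide (the setup output reports Python 3.11.13). One way to avoid pinning an interpreter name is to build the command around `sys.executable`, sketched below. The helper name `build_inference_cmd` is hypothetical, the flags are the ones used in the cell above, and the sketch assumes the kernel's interpreter is the one the requirements were installed into.

```python
import sys


def build_inference_cmd(source_image, driven_audio, python=sys.executable):
    """Return the portrait-inference command as an argument list."""
    return [
        python, "inference.py",
        "--driven_audio", driven_audio,
        "--source_image", source_image,
        "--result_dir", "./results",
        "--still",
        "--preprocess", "full",
        "--enhancer", "gfpgan",
    ]


cmd = build_inference_cmd("examples/test.jpg", "examples/test.wav")
print(" ".join(cmd))
# To run it from the SadTalker directory:
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list also sidesteps shell quoting problems when file names contain spaces.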


In [None]:
#@title **play movie**
import glob
from IPython.display import HTML
from base64 import b64encode
import os, sys

# get the last from results
mp4_name = sorted(glob.glob('./results/*.mp4'))[-1]

mp4 = open('{}'.format(mp4_name),'rb').read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()

print('Display animation: {}'.format(mp4_name), file=sys.stderr)
display(HTML("""
  <video width=256 controls>
        <source src="%s" type="video/mp4">
  </video>
  """ % data_url))