# Environment setup

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
!nvidia-smi

Tue May  6 01:43:55 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA A100-SXM4-40GB          Off |   00000000:00:04.0 Off |                    0 |
| N/A   29C    P0             43W /  400W |       0MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                

In [None]:
%cd /content/drive/MyDrive/APAI3010-GP
!git clone https://github.com/MooreThreads/Moore-AnimateAnyone.git

/content/drive/MyDrive/APAI3010-GP
fatal: destination path 'Moore-AnimateAnyone' already exists and is not an empty directory.


In [None]:
%cd /content/drive/MyDrive/APAI3010-GP/Moore-AnimateAnyone
!python tools/download_weights.py

/content/drive/MyDrive/APAI3010-GP/Moore-AnimateAnyone
Preparing base stable-diffusion-v1-5 weights...
Preparing image encoder weights...
Preparing DWPose weights...
Preparing vae weights...
Preparing AnimateAnyone weights...


In [None]:
%%capture
!pip install lpips pytorch-fid imageio opencv-python
!pip3 install torch==2.7.0 torchvision torchaudio
!pip install accelerate
!pip install diffusers==0.24.0
!pip install transformers
!pip install av
!pip install omegaconf
!pip install decord
!pip install controlnet_aux
!pip install onnxruntime-gpu
!pip install mlflow
!pip install mediapipe
!pip install xformers

In [None]:
file_path = "/usr/local/lib/python3.11/dist-packages/diffusers/utils/dynamic_modules_utils.py"

with open(file_path, "r") as f:
    lines = f.readlines()

with open(file_path, "w") as f:
    for line in lines:
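        # Remove 'cached_download' from the import statement: newer huggingface_hub
        # releases no longer export it, so diffusers 0.24.0 would otherwise fail on import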
        if "from huggingface_hub" in line and "cached_download" in line:
            line = line.replace(", cached_download", "")
        f.write(line)
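
The cell above hard-codes a Python 3.11 `dist-packages` path, which will break if the Colab runtime changes. As a minimal sketch (the helper names `strip_cached_download` and `find_dynamic_modules_file` are hypothetical, not part of diffusers), the same patch could locate the file through the import system and keep the text rewrite in a small function that is easy to check:

```python
import importlib.util
from pathlib import Path


def strip_cached_download(source: str) -> str:
    """Remove `cached_download` from any `from huggingface_hub import ...` line."""
    patched = []
    for line in source.splitlines(keepends=True):
        if "from huggingface_hub" in line and "cached_download" in line:
            # Handle the name appearing mid-list or at the start of the list.
            line = line.replace(", cached_download", "").replace("cached_download, ", "")
        patched.append(line)
    return "".join(patched)


def find_dynamic_modules_file():
    """Return the path to diffusers/utils/dynamic_modules_utils.py, or None if diffusers is missing."""
    spec = importlib.util.find_spec("diffusers")
    if spec is None or spec.origin is None:
        return None
    # spec.origin points at diffusers/__init__.py
    return Path(spec.origin).parent / "utils" / "dynamic_modules_utils.py"


target = find_dynamic_modules_file()
if target is not None and target.exists():
    target.write_text(strip_cached_download(target.read_text()))
```

Running the patch a second time is harmless, because a file that no longer mentions `cached_download` is written back unchanged.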

In [None]:
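import os
import warnings

# Set CUDA-related environment variables before torch initializes CUDA.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # synchronous kernel launches: easier debugging, slower execution
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"  # reduce allocator fragmentation

import torch

warnings.filterwarnings("ignore")  # silence all Python warnings in this notebook
torch.cuda.empty_cache()  # release cached, unused GPU memory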

# DWPose

## Data Preparation

### UBC Fashion

**About Dataset**

"We introduce a new Fashion dataset containing **500 training and 100 test videos, each containing roughly 350 frames**. Videos from our dataset are of a single human subject and characterized by the high resolution and static camera. Most importantly, clothing and textures are diverse and cover large space of possible appearances. The dataset is publicly released at: https://vision.cs.ubc.ca/datasets/fashion/."

In [None]:
# Download UBC Fashion data
%cd /content/drive/MyDrive/APAI3010-GP/Moore-AnimateAnyone/data/UBC_fashion
!python UBC_fashion_data_crowler.py

/content/drive/MyDrive/APAI3010-GP/Moore-AnimateAnyone/data/UBC_fashion
Traceback (most recent call last):
  File "/content/drive/MyDrive/APAI3010-GP/Moore-AnimateAnyone/data/UBC_fashion/UBC_fashion_data_crowler.py", line 10, in <module>
    os.mkdir('train')
FileExistsError: [Errno 17] File exists: 'train'
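
The traceback above comes from `os.mkdir('train')`, which raises `FileExistsError` when the directory is already present, for example on a second run of the cell. A minimal sketch of an idempotent alternative follows. It is not the crawler's actual code, and `ensure_dir` is a hypothetical helper name:

```python
import os


def ensure_dir(path: str) -> str:
    """Create `path` if needed and return it; do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)
    return path


# Safe to call repeatedly, unlike os.mkdir('train').
# for split in ("train", "test"):
#     ensure_dir(split)
```

With this pattern, re-running the download cell would skip directory creation instead of aborting before any video is fetched.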


### Trending on TikTok

**About Dataset**

- Videos.zip: This file contains **the actual 1000 trending TikTok videos. Each filename corresponds to the id key in the trending.json file.**

- trending.json: The raw scraped dataset. I figured splitting up the dataset resulted in messy errors. For example: a user might have one avatar while posting a video and another while posting the next video. This resulted in multiple users with the same name, id etc. except for the avatar. So I decided to post the raw data and I will show you how to translate this multi-level JSON structure to a single DataFrame in my first Notebook.

In [None]:
!curl -L -o /content/drive/MyDrive/APAI3010-GP/Moore-AnimateAnyone/data/TikTok/tiktokdataset.zip \
  https://www.kaggle.com/api/v1/datasets/download/yasaminjafarian/tiktokdataset

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 45.2G  100 45.2G    0     0   104M      0  0:07:25  0:07:25 --:--:-- 72.0M


### TedTalk dataset

**About the dataset**

To create the TED-talks dataset, 3,035 YouTube videos were downloaded using the query "TED talks". From these candidates, only videos in which the upper body of the person is visible for at least 64 frames and the person's bounding box is at least 384 pixels high were selected. Static videos, and videos in which the person is doing something other than presenting, were manually filtered out.
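
The two automatic selection criteria described above can be sketched as a simple predicate. This is only an illustration under stated assumptions: the function name `keep_clip` and its parameters are hypothetical, and the dataset authors' actual filtering code is not shown in this notebook.

```python
def keep_clip(visible_frames: int, bbox_height_px: int,
              min_frames: int = 64, min_height_px: int = 384) -> bool:
    """Return True if a clip meets both thresholds quoted in the dataset description."""
    return visible_frames >= min_frames and bbox_height_px >= min_height_px


print(keep_clip(visible_frames=120, bbox_height_px=400))  # True
print(keep_clip(visible_frames=40, bbox_height_px=400))   # False: too few frames
```

The manual steps (removing static videos and non-presentation footage) have no equivalent here, since they relied on human judgment.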

## Extract keypoints from raw videos

To match the original OpenPose format used in the referenced paper, **exclude face and foot keypoints when rendering or processing pose skeletons**.

Open `/content/drive/MyDrive/APAI3010-GP/Moore-AnimateAnyone/src/dwpose/__init__.py`, find the `draw_pose(pose, H, W)` function, and remove or comment out the `draw_facepose` call, as in the version below:

```
def draw_pose(pose, H, W):
    bodies = pose['bodies']
    faces = pose['faces']
    hands = pose['hands']
    candidate = bodies['candidate']
    subset = bodies['subset']
    canvas = np.zeros(shape=(H, W, 3), dtype=np.uint8)

    # ✅ Only draw body and hand pose
    canvas = util.draw_bodypose(canvas, candidate, subset)
    canvas = util.draw_handpose(canvas, hands)
    # canvas = util.draw_facepose(canvas, faces)

    return canvas
```

**Custom Modification (2025-04-25): Changed default detect_resolution and image_resolution to 768.**

To increase the resolution of the output pose maps, `detect_resolution` and `image_resolution` now both default to 768 in `DWposeDetector`, so pose images are generated with their shorter side scaled to 768 pixels.

```
def __call__(
    self,
    input_image,
    detect_resolution=768,
    image_resolution=768,
    output_type="pil",
    **kwargs,
):
```
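
To make the "shorter side" behaviour concrete, here is a minimal sketch of the size computation, assuming the detector scales both sides by the same factor so that the shorter one matches the requested resolution. The function `shorter_side_resize` is hypothetical, and the repository's own resize helper may round the result differently (for instance, to a fixed multiple).

```python
def shorter_side_resize(height: int, width: int, resolution: int = 768) -> tuple[int, int]:
    """Scale (height, width) uniformly so that the shorter side equals `resolution`."""
    scale = resolution / min(height, width)
    return round(height * scale), round(width * scale)


# A 940x720 frame (height x width), as in the UBC Fashion videos used here:
print(shorter_side_resize(940, 720))  # (1003, 768)
```

Under this assumption, a portrait frame keeps its aspect ratio and only the width lands exactly on 768.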

In [None]:
%cd /content/drive/MyDrive/APAI3010-GP/Moore-AnimateAnyone
!python tools/extract_dwpose_from_vid.py --video_root /content/drive/MyDrive/APAI3010-GP/Moore-AnimateAnyone/data/UBC_fashion/test_demo

/content/drive/MyDrive/APAI3010-GP/Moore-AnimateAnyone
2025-05-05 01:53:16.262964: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-05-05 01:53:16.281414: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1746409996.303344    7856 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1746409996.309921    7856 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-05-05 01:53:16.332273: I tensorflow/core/platform/cpu_feature_guard.cc:210] This Te

## Extract ref_img from the video

In [None]:
import cv2
from pathlib import Path

video_folder = Path('/content/drive/MyDrive/APAI3010-GP/Moore-AnimateAnyone/configs/inference/ubc/test_demo')
output_folder = video_folder.parent / 'test_demo_ref_img'
output_folder.mkdir(exist_ok=True)

for video_file in video_folder.iterdir():
    if video_file.suffix.lower() in ['.mp4', '.avi', '.mov', '.mkv']:
        cap = cv2.VideoCapture(str(video_file))
        ret, frame = cap.read()  # Always reads the first frame
        cap.release()

        if ret:
            out_name = video_file.stem + '_img.png'
            out_path = output_folder / out_name
            cv2.imwrite(str(out_path), frame)
            print(f"Saved first frame from {video_file.name} as {out_path}, shape: {frame.shape}")
        else:
            print(f"Failed to read the first frame from: {video_file}")

Saved first frame from A1T-Ea-FlQS.mp4 as /content/drive/MyDrive/APAI3010-GP/Moore-AnimateAnyone/configs/inference/ubc/test_demo_ref_img/A1T-Ea-FlQS_img.png, shape: (940, 720, 3)
Saved first frame from A1s1Xh4xEtS.mp4 as /content/drive/MyDrive/APAI3010-GP/Moore-AnimateAnyone/configs/inference/ubc/test_demo_ref_img/A1s1Xh4xEtS_img.png, shape: (940, 720, 3)
Saved first frame from 91iZ9x8NI0S.mp4 as /content/drive/MyDrive/APAI3010-GP/Moore-AnimateAnyone/configs/inference/ubc/test_demo_ref_img/91iZ9x8NI0S_img.png, shape: (940, 720, 3)
Saved first frame from 915AFYiy5HS.mp4 as /content/drive/MyDrive/APAI3010-GP/Moore-AnimateAnyone/configs/inference/ubc/test_demo_ref_img/915AFYiy5HS_img.png, shape: (940, 720, 3)
Saved first frame from 91EfnBTEE2S.mp4 as /content/drive/MyDrive/APAI3010-GP/Moore-AnimateAnyone/configs/inference/ubc/test_demo_ref_img/91EfnBTEE2S_img.png, shape: (940, 720, 3)


# Inference

In [None]:
%cd /content/drive/MyDrive/APAI3010-GP/Moore-AnimateAnyone

!python -m scripts.pose2vid --config ./configs/prompts/animation.yaml -W 720 -H 936 -L 256

/content/drive/MyDrive/APAI3010-GP/Moore-AnimateAnyone
2025-05-06 01:48:36.150636: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-05-06 01:48:36.168095: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1746496116.189486    1958 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1746496116.196007    1958 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-05-06 01:48:36.217822: I tensorflow/core/platform/cpu_feature_guard.cc:210] This Te