
Pose-Driven Algo Inference not working #36

Open · nitinmukesh opened this issue Jul 14, 2024 · 4 comments

nitinmukesh commented Jul 14, 2024

I followed the instructions, but it is not working:

  1. First, download the checkpoints with the '_pose.pth' suffix from Hugging Face.
  2. Edit driver_video and ref_image to your paths in demo_motion_sync.py, then run it.
     I left them as-is, pointing to the sample.
  3. python -u demo_motion_sync.py
     Output: https://youtu.be/1JsPRYPiQso
  4. python -u infer_audio2vid_pose.py [with draw_mouse=True]
     No output produced, no error in console.
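As a sanity check on step 1, a quick sketch that lists which *_pose.pth checkpoints are actually present (the pretrained_weights directory name is my assumption; adjust to wherever the checkpoints were downloaded):

```python
from pathlib import Path

def find_pose_checkpoints(weights_dir="pretrained_weights"):
    """Return the names of checkpoint files ending in _pose.pth, if any."""
    root = Path(weights_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.rglob("*_pose.pth"))

print(find_pose_checkpoints())
```
An empty list here would mean the pose checkpoints were never downloaded, which would explain a silent failure.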

(echomimic) C:\sd\EchoMimic>  python -u demo_motion_sync.py
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
W0000 00:00:1720949138.076016    3284 face_landmarker_graph.cc:174] Sets FaceBlendshapesGraph acceleration to xnnpack by default.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
288

(echomimic) C:\sd\EchoMimic>python -u infer_audio2vid_pose.py
C:\Users\nitin\miniconda3\envs\echomimic\lib\site-packages\diffusers\utils\outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
C:\Users\nitin\miniconda3\envs\echomimic\lib\site-packages\diffusers\utils\outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
video in 24 FPS, audio idx in 50FPS
latents shape:torch.Size([1, 4, 160, 64, 64]), video_length:160

(echomimic) C:\sd\EchoMimic>
  5. python -u infer_audio2vid_pose.py [with draw_mouse=False]
     No output produced, no error in console.
(echomimic) C:\sd\EchoMimic>python -u infer_audio2vid_pose.py
C:\Users\nitin\miniconda3\envs\echomimic\lib\site-packages\diffusers\utils\outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
C:\Users\nitin\miniconda3\envs\echomimic\lib\site-packages\diffusers\utils\outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
video in 24 FPS, audio idx in 50FPS
latents shape:torch.Size([1, 4, 160, 64, 64]), video_length:160

(echomimic) C:\sd\EchoMimic>
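One low-effort diagnostic for a silent exit like this (a sketch; faulthandler is in the Python stdlib) is to enable the fault handler at the top of infer_audio2vid_pose.py, so a hard crash in native code (CUDA, ffmpeg, etc.) still dumps a Python traceback instead of exiting quietly:

```python
import faulthandler
import sys

# Dump Python tracebacks to stderr if the interpreter dies inside native code
# (e.g. a CUDA or ffmpeg call) instead of exiting with no message at all.
faulthandler.enable(file=sys.stderr, all_threads=True)

print("fault handler enabled:", faulthandler.is_enabled())
```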
nitinmukesh mentioned this issue Jul 14, 2024

nitinmukesh (Author) commented:

Here is the pip list:

(echomimic) C:\Users\nitin>pip list

Package                   Version
------------------------- ------------
absl-py                   2.1.0
accelerate                0.32.1
aiofiles                  23.2.1
aiohttp                   3.9.5
aiosignal                 1.3.1
albumentations            1.1.0
altair                    5.3.0
annotated-types           0.7.0
antlr4-python3-runtime    4.9.3
anyio                     4.4.0
asgiref                   3.8.1
asttokens                 2.4.1
async-timeout             4.0.3
attrs                     23.2.0
av                        11.0.0
backcall                  0.2.0
backports.zoneinfo        0.2.1
blinker                   1.8.2
blosc2                    2.0.0
boto3                     1.34.143
botocore                  1.34.143
cachetools                5.3.3
certifi                   2024.7.4
cffi                      1.16.0
charset-normalizer        3.3.2
clean-fid                 0.1.35
click                     8.1.7
colorama                  0.4.6
colorlog                  6.8.2
configobj                 5.0.8
contourpy                 1.1.1
cycler                    0.12.1
Cython                    3.0.10
datasets                  2.20.0
decorator                 4.4.2
decord                    0.6.0
deepdish                  0.3.7
Deprecated                1.2.14
diffusers                 0.24.0
dill                      0.3.8
Django                    4.2.14
dnspython                 2.6.1
docker-pycreds            0.4.0
easydict                  1.13
einops                    0.4.1
email_validator           2.2.0
ete3                      3.1.3
exceptiongroup            1.2.1
executing                 2.0.1
facenet-pytorch           2.5.0
fastapi                   0.111.0
fastapi-cli               0.0.4
ffmpeg-python             0.2.0
ffmpy                     0.3.2
filelock                  3.15.4
flatbuffers               24.3.25
fonttools                 4.53.1
frozenlist                1.4.1
fsspec                    2024.5.0
ftfy                      6.0.3
future                    1.0.0
gitdb                     4.0.11
GitPython                 3.1.43
google-auth               2.32.0
google-auth-oauthlib      1.0.0
gradio                    4.37.2
gradio_client             1.0.2
grpcio                    1.64.1
h11                       0.14.0
h5py                      3.11.0
httpcore                  1.0.5
httptools                 0.6.1
httpx                     0.27.0
huggingface-hub           0.23.4
idna                      3.7
imageio                   2.14.1
imageio-ffmpeg            0.4.7
importlib_metadata        8.0.0
importlib_resources       6.4.0
intel-openmp              2021.4.0
invisible-watermark       0.2.0
ipdb                      0.13.13
ipython                   8.12.3
jax                       0.4.13
jedi                      0.19.1
Jinja2                    3.1.4
jmespath                  1.0.1
joblib                    1.4.2
json-lines                0.5.0
jsonschema                4.23.0
jsonschema-specifications 2023.12.1
kiwisolver                1.4.5
kornia                    0.6.0
lazy_loader               0.4
lpips                     0.1.4
Markdown                  3.6
markdown-it-py            3.0.0
MarkupSafe                2.1.5
matplotlib                3.7.5
matplotlib-inline         0.1.7
mdurl                     0.1.2
mediapipe                 0.10.11
mkl                       2021.4.0
ml-dtypes                 0.2.0
moviepy                   1.0.3
mpmath                    1.3.0
msgpack                   1.0.8
multidict                 6.0.5
multiprocess              0.70.16
networkx                  3.1
nltk                      3.8.1
numexpr                   2.8.6
numpy                     1.24.4
oauthlib                  3.2.2
omegaconf                 2.3.0
opencv-contrib-python     4.10.0.84
opencv-python             4.2.0.34
opencv-python-headless    4.10.0.84
opt-einsum                3.3.0
orderedset                2.0.3
orjson                    3.10.6
packaging                 24.1
pandas                    2.0.3
parso                     0.8.4
pickleshare               0.7.5
Pillow                    9.0.1
pip                       24.0
pkgutil_resolve_name      1.3.10
platformdirs              4.2.2
proglog                   0.1.10
progressbar               2.5
prompt_toolkit            3.0.47
protobuf                  3.20.3
psutil                    6.0.0
pudb                      2019.2
pure-eval                 0.2.2
py-cpuinfo                9.0.0
pyarrow                   16.1.0
pyarrow-hotfix            0.6
pyasn1                    0.6.0
pyasn1_modules            0.4.0
pycparser                 2.22
pydantic                  2.8.2
pydantic_core             2.20.1
pydeck                    0.9.1
pyDeprecate               0.3.1
pydub                     0.25.1
Pygments                  2.18.0
pymongo                   4.8.0
pyparsing                 3.1.2
python-dateutil           2.9.0.post0
python-dotenv             1.0.1
python-magic              0.4.27
python-multipart          0.0.9
pytorch-fid               0.3.0
pytorch-lightning         1.5.9
pytz                      2024.1
PyWavelets                1.4.1
PyYAML                    6.0.1
qudida                    0.0.4
referencing               0.35.1
regex                     2024.5.15
requests                  2.32.3
requests-oauthlib         2.0.0
rich                      13.7.1
rouge_score               0.1.2
rpds-py                   0.19.0
rsa                       4.9
ruff                      0.5.1
s3transfer                0.10.2
safetensors               0.4.3
scikit-image              0.20.0
scikit-learn              1.3.2
scipy                     1.9.1
semantic-version          2.10.0
sentry-sdk                2.9.0
setproctitle              1.3.3
setuptools                59.5.0
shellingham               1.5.4
simplejson                3.19.2
six                       1.16.0
smmap                     5.0.1
sniffio                   1.3.1
sounddevice               0.4.7
sqlparse                  0.5.0
stack-data                0.6.3
starlette                 0.37.2
streamlit                 1.36.0
sympy                     1.13.0
tables                    3.8.0
tbb                       2021.13.0
tenacity                  8.5.0
tensorboard               2.14.0
tensorboard-data-server   0.7.2
tensorboardX              2.4.1
test_tube                 0.7.5
threadpoolctl             3.5.0
tifffile                  2023.7.10
timm                      1.0.7
tokenizers                0.19.1
toml                      0.10.2
tomli                     2.0.1
tomlkit                   0.12.0
toolz                     0.12.1
torch                     2.2.2+cu121
torch-fidelity            0.3.0
torchaudio                2.2.2
torchmetrics              0.6.0
torchtyping               0.1.4
torchvision               0.17.2+cu121
tornado                   6.4.1
tqdm                      4.66.4
traitlets                 5.14.3
transformers              4.42.4
typeguard                 4.3.0
typer                     0.12.3
typing_extensions         4.12.2
tzdata                    2024.1
ujson                     5.10.0
urllib3                   2.2.2
urwid                     2.6.15
uvicorn                   0.30.1
wandb                     0.17.4
watchdog                  4.0.1
watchfiles                0.22.0
wcwidth                   0.2.13
websockets                11.0.3
Werkzeug                  3.0.3
wheel                     0.43.0
wrapt                     1.16.0
xxhash                    3.4.1
yacs                      0.1.8
yarl                      1.9.4
zipp                      3.19.2

JoeFannie (Contributor) commented:

Could you please add more logs to the code? It seems that the inference does not run at all, and the information provided is not sufficient to find the bug.
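For example, something like this around each inference call would surface where it stops (a sketch using the stdlib logging module; run_step and the step names are placeholders, not part of the repo):

```python
import logging
import sys

# Log to stdout with timestamps so progress is visible even if the process dies.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
    stream=sys.stdout,
)

def run_step(name, fn, *args, **kwargs):
    """Run one pipeline step, logging entry, exit, and any traceback."""
    logging.info("starting: %s", name)
    try:
        result = fn(*args, **kwargs)
    except Exception:
        logging.exception("failed: %s", name)  # prints the full traceback
        raise
    logging.info("finished: %s", name)
    return result

# usage (hypothetical step standing in for a real pipeline call):
total = run_step("sum frames", sum, [1, 2, 3])
print(total)
```
The logging.exception call is the important part: it records the full traceback, which a bare logging.error message does not.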

nitinmukesh (Author) commented Jul 15, 2024

> Could you please add more logs to the code? It seems that the inference does not run at all, and the information provided is not sufficient to find the bug.

Okay, here you go (I used Claude to add the logging).

Log


(echomimic) C:\sd\EchoMimic>  python -u infer_audio2vid_pose.py
C:\Users\nitin\miniconda3\envs\echomimic\lib\site-packages\diffusers\utils\outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
C:\Users\nitin\miniconda3\envs\echomimic\lib\site-packages\diffusers\utils\outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
2024-07-15 14:44:19,964 - INFO - 1. Starting main function
2024-07-15 14:44:19,964 - INFO - 2. Arguments parsed
2024-07-15 14:44:19,967 - INFO - 3. Config loaded
2024-07-15 14:44:19,967 - INFO - 4. Weight dtype set to torch.float16
2024-07-15 14:44:21,626 - INFO - 5. Device set to cuda
2024-07-15 14:44:21,637 - INFO - 6. Inference config loaded
2024-07-15 14:44:21,637 - INFO - 7. Starting model initialization
2024-07-15 14:44:21,637 - INFO - 8. Initializing VAE
2024-07-15 14:44:22,045 - INFO - 9. VAE initialized
2024-07-15 14:44:22,045 - INFO - 10. Initializing Reference UNet
2024-07-15 14:44:31,577 - INFO - 11. Reference UNet initialized
2024-07-15 14:44:31,577 - INFO - 12. Initializing Denoising UNet
2024-07-15 14:44:31,577 - INFO - loaded temporal unet's pretrained weights from pretrained_weights\sd-image-variations-diffusers\unet ...
2024-07-15 14:44:38,743 - INFO - Load motion module params from pretrained_weights\motion_module_pose.pth
2024-07-15 14:44:41,520 - INFO - Loaded 453.20928M-parameter motion module
2024-07-15 14:44:47,617 - INFO - 13. Denoising UNet initialized
2024-07-15 14:44:47,617 - INFO - 14. Initializing Face Locator
2024-07-15 14:44:47,695 - INFO - 15. Face Locator initialized
2024-07-15 14:44:47,695 - INFO - 16. Initializing Visualizer
2024-07-15 14:44:47,695 - INFO - 17. Visualizer initialized
2024-07-15 14:44:47,695 - INFO - 18. Loading Audio Processor
2024-07-15 14:44:48,140 - INFO - 19. Audio Processor loaded
2024-07-15 14:44:48,140 - INFO - 20. Initializing Face Detector
2024-07-15 14:44:48,160 - INFO - 21. Face Detector initialized
2024-07-15 14:44:48,160 - INFO - 23. Model initialization completed
2024-07-15 14:44:48,171 - INFO - 24. Scheduler initialized
2024-07-15 14:44:48,171 - INFO - 25. Creating pipeline
2024-07-15 14:44:48,181 - INFO - 26. Pipeline created
2024-07-15 14:44:48,181 - INFO - 28. Save directory created: output\20240715\1444--seed_420-512x512
2024-07-15 14:44:48,181 - INFO - 29. Processing reference image: ./assets/test_pose_demo/d.jpg
2024-07-15 14:44:48,181 - INFO - 30. Audio path: ./assets/test_pose_demo_audios/movie_0_clip_0.wav, Pose directory: ./assets/test_pose_demo_pose
2024-07-15 14:44:48,181 - INFO - 31. Generator seed set: 420
2024-07-15 14:44:48,181 - INFO - 32. Reference name: d, Audio name: movie_0_clip_0, FPS: 24
2024-07-15 14:44:48,191 - INFO - 33. Reference image loaded
2024-07-15 14:44:48,191 - INFO - 34. Starting face_locator process
2024-07-15 14:44:48,381 - INFO - 35. Face mask tensor created
2024-07-15 14:44:48,381 - INFO - 36. Starting pipeline processing
video in 24 FPS, audio idx in 50FPS
2024-07-15 14:44:48,952 - WARNING - C:\sd\EchoMimic\src\pipelines\pipeline_echo_mimic_pose.py:446: FutureWarning: Accessing config attribute `in_channels` directly via 'EchoUNet3DConditionModel' object attribute is deprecated. Please access 'in_channels' over 'EchoUNet3DConditionModel's config object instead, e.g. 'unet.config.in_channels'.
  num_channels_latents = self.denoising_unet.in_channels

latents shape:torch.Size([1, 4, 160, 64, 64]), video_length:160
2024-07-15 14:44:49,333 - WARNING - C:\Users\nitin\miniconda3\envs\echomimic\lib\site-packages\diffusers\models\attention_processor.py:1231: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
  hidden_states = F.scaled_dot_product_attention(

2024-07-15 14:44:49,375 - ERROR - 41. Error during video processing: [WinError 6] The handle is invalid
2024-07-15 14:44:49,380 - INFO - 42. Main function completed
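For context on the [WinError 6] itself: on Windows this OSError typically comes from a child process inheriting an invalid standard handle (both moviepy and save_videos_grid shell out to ffmpeg). Whether that is the cause here is a guess, but the sketch below shows the usual safe way to launch a subprocess when the standard handles may be closed:

```python
import subprocess
import sys

# On Windows, spawning a child process without valid std handles can raise
# OSError: [WinError 6] The handle is invalid. Redirecting all three handles
# explicitly avoids relying on inherited handles.
result = subprocess.run(
    [sys.executable, "-c", "print('ok')"],
    stdin=subprocess.DEVNULL,
    stdout=subprocess.PIPE,
    stderr=subprocess.DEVNULL,
    text=True,
)
print(result.stdout.strip())  # -> ok
```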

Code


import argparse
import os
import random
from datetime import datetime
from pathlib import Path
from typing import List

import av
import cv2
import numpy as np
import torch
import torchvision
from diffusers import AutoencoderKL, DDIMScheduler
from diffusers.pipelines.stable_diffusion import StableDiffusionPipeline
from einops import repeat
from omegaconf import OmegaConf
from PIL import Image
from torchvision import transforms
from transformers import CLIPVisionModelWithProjection

from src.models.unet_2d_condition import UNet2DConditionModel
from src.models.unet_3d_echo import EchoUNet3DConditionModel
from src.models.whisper.audio2feature import load_audio_model
from src.pipelines.pipeline_echo_mimic_pose import AudioPose2VideoPipeline
from src.utils.util import get_fps, read_frames, save_videos_grid, crop_and_pad
import sys
from src.models.face_locator import FaceLocator
from moviepy.editor import VideoFileClip, AudioFileClip
from facenet_pytorch import MTCNN
from src.utils.draw_utils import FaceMeshVisualizer
import pickle
import logging

# Set up logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s', stream=sys.stdout)

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", type=str, default="./configs/prompts/animation_pose.yaml")
    parser.add_argument("-W", type=int, default=512)
    parser.add_argument("-H", type=int, default=512)
    parser.add_argument("-L", type=int, default=160)
    parser.add_argument("--seed", type=int, default=420)
    parser.add_argument("--facemusk_dilation_ratio", type=float, default=0.1)
    parser.add_argument("--facecrop_dilation_ratio", type=float, default=0.5)

    parser.add_argument("--context_frames", type=int, default=12)
    parser.add_argument("--context_overlap", type=int, default=3)

    parser.add_argument("--cfg", type=float, default=2.5)
    parser.add_argument("--steps", type=int, default=30)
    parser.add_argument("--sample_rate", type=int, default=16000)
    parser.add_argument("--fps", type=int, default=24)
    parser.add_argument("--device", type=str, default="cuda")

    args = parser.parse_args()
    return args

def select_face(det_bboxes, probs):
    ## select the largest face among detections whose prob is above 0.8
    ## box format: xyxy
    filtered_bboxes = []
    for bbox_i in range(len(det_bboxes)):
        if probs[bbox_i] > 0.8:
            filtered_bboxes.append(det_bboxes[bbox_i])
    if len(filtered_bboxes) == 0:
        return None

    sorted_bboxes = sorted(filtered_bboxes, key=lambda x:(x[3]-x[1]) * (x[2] - x[0]), reverse=True)
    return sorted_bboxes[0]


def main():
    logging.info("1. Starting main function")
    args = parse_args()
    logging.info("2. Arguments parsed")

    config = OmegaConf.load(args.config)
    logging.info("3. Config loaded")
    if config.weight_dtype == "fp16":
        weight_dtype = torch.float16
    else:
        weight_dtype = torch.float32
    logging.info(f"4. Weight dtype set to {weight_dtype}")

    device = args.device
    if "cuda" in device and not torch.cuda.is_available():
        device = "cpu"
    logging.info(f"5. Device set to {device}")

    inference_config_path = config.inference_config
    infer_config = OmegaConf.load(inference_config_path)
    logging.info("6. Inference config loaded")

    logging.info("7. Starting model initialization")

    try:
        logging.info("8. Initializing VAE")
        vae = AutoencoderKL.from_pretrained(
            config.pretrained_vae_path,
        ).to(device, dtype=weight_dtype)  # use the resolved device, not a hardcoded "cuda"
        logging.info("9. VAE initialized")

        logging.info("10. Initializing Reference UNet")
        reference_unet = UNet2DConditionModel.from_pretrained(
            config.pretrained_base_model_path,
            subfolder="unet",
        ).to(dtype=weight_dtype, device=device)
        reference_unet.load_state_dict(
            torch.load(config.reference_unet_path, map_location="cpu"),
        )
        logging.info("11. Reference UNet initialized")

        logging.info("12. Initializing Denoising UNet")
        if os.path.exists(config.motion_module_path):
            ### stage1 + stage2
            denoising_unet = EchoUNet3DConditionModel.from_pretrained_2d(
                config.pretrained_base_model_path,
                config.motion_module_path,
                subfolder="unet",
                unet_additional_kwargs=infer_config.unet_additional_kwargs,
            ).to(dtype=weight_dtype, device=device)
        else:
            ### only stage1
            denoising_unet = EchoUNet3DConditionModel.from_pretrained_2d(
                config.pretrained_base_model_path,
                "",
                subfolder="unet",
                unet_additional_kwargs={
                    "use_motion_module": False,
                    "unet_use_temporal_attention": False,
                    "cross_attention_dim": infer_config.unet_additional_kwargs.cross_attention_dim
                }
            ).to(dtype=weight_dtype, device=device)
        denoising_unet.load_state_dict(
            torch.load(config.denoising_unet_path, map_location="cpu"),
            strict=False
        )
        logging.info("13. Denoising UNet initialized")

        logging.info("14. Initializing Face Locator")
        face_locator = FaceLocator(320, conditioning_channels=3, block_out_channels=(16, 32, 96, 256)).to(
            dtype=weight_dtype, device=device
        )
        face_locator.load_state_dict(torch.load(config.face_locator_path, map_location="cpu"))
        logging.info("15. Face Locator initialized")

        logging.info("16. Initializing Visualizer")
        visualizer = FaceMeshVisualizer(draw_iris=False, draw_mouse=False)
        logging.info("17. Visualizer initialized")

        logging.info("18. Loading Audio Processor")
        audio_processor = load_audio_model(model_path=config.audio_model_path, device=device)
        logging.info("19. Audio Processor loaded")

        logging.info("20. Initializing Face Detector")
        face_detector = MTCNN(image_size=320, margin=0, min_face_size=20, thresholds=[0.6, 0.7, 0.7], factor=0.709, post_process=True, device=device)
        logging.info("21. Face Detector initialized")

    except Exception as e:
        logging.error(f"22. Error during model initialization: {str(e)}")
        return

    logging.info("23. Model initialization completed")

    width, height = args.W, args.H
    sched_kwargs = OmegaConf.to_container(infer_config.noise_scheduler_kwargs)
    scheduler = DDIMScheduler(**sched_kwargs)
    logging.info("24. Scheduler initialized")

    try:
        logging.info("25. Creating pipeline")
        pipe = AudioPose2VideoPipeline(
            vae=vae,
            reference_unet=reference_unet,
            denoising_unet=denoising_unet,
            audio_guider=audio_processor,
            face_locator=face_locator,
            scheduler=scheduler,
        )
        pipe = pipe.to(device, dtype=weight_dtype)
        logging.info("26. Pipeline created")
    except Exception as e:
        logging.error(f"27. Error creating pipeline: {str(e)}")
        return

    date_str = datetime.now().strftime("%Y%m%d")
    time_str = datetime.now().strftime("%H%M")
    save_dir_name = f"{time_str}--seed_{args.seed}-{args.W}x{args.H}"
    save_dir = Path(f"output/{date_str}/{save_dir_name}")
    save_dir.mkdir(exist_ok=True, parents=True)
    logging.info(f"28. Save directory created: {save_dir}")

    for ref_image_path in config["test_cases"].keys():
        logging.info(f"29. Processing reference image: {ref_image_path}")
        for file_path in config["test_cases"][ref_image_path]:
            if ".wav" in file_path:
                audio_path = file_path
            else:
                pose_dir = file_path
        logging.info(f"30. Audio path: {audio_path}, Pose directory: {pose_dir}")

        if args.seed is not None and args.seed > -1:
            generator = torch.manual_seed(args.seed)
        else:
            generator = torch.manual_seed(random.randint(100, 1000000))
        logging.info(f"31. Generator seed set: {generator.initial_seed()}")

        ref_name = Path(ref_image_path).stem
        audio_name = Path(audio_path).stem
        final_fps = args.fps
        logging.info(f"32. Reference name: {ref_name}, Audio name: {audio_name}, FPS: {final_fps}")

        ref_image_pil = Image.open(ref_image_path).convert("RGB")
        logging.info("33. Reference image loaded")

        logging.info("34. Starting face_locator process")
        pose_list = []
        for index in range(len(os.listdir(pose_dir))):
            tgt_musk_path = os.path.join(pose_dir, f"{index}.pkl")
            with open(tgt_musk_path, "rb") as f:
                tgt_kpts = pickle.load(f)
            tgt_musk = visualizer.draw_landmarks((args.W, args.H), tgt_kpts)
            tgt_musk_pil = Image.fromarray(np.array(tgt_musk).astype(np.uint8)).convert('RGB')
            pose_list.append(torch.Tensor(np.array(tgt_musk_pil)).to(dtype=weight_dtype, device=device).permute(2,0,1) / 255.0)
        face_mask_tensor = torch.stack(pose_list, dim=1).unsqueeze(0)
        logging.info("35. Face mask tensor created")

        try:
            logging.info("36. Starting pipeline processing")
            video = pipe(
                ref_image_pil,
                audio_path,
                face_mask_tensor,
                width,
                height,
                args.L,
                args.steps,
                args.cfg,
                generator=generator,
                audio_sample_rate=args.sample_rate,
                context_frames=12,
                fps=final_fps,
                context_overlap=3
            ).videos
            logging.info("37. Pipeline processing completed")

            video = torch.cat([video[:, :, :args.L, :, :], face_mask_tensor[:, :, :args.L, :, :].detach().cpu()], dim=-1)
            save_videos_grid(
                video,
                f"{save_dir}/{ref_name}_{audio_name}_{args.H}x{args.W}_{int(args.cfg)}_{time_str}.mp4",
                n_rows=2,
                fps=final_fps,
            )
            logging.info(f"38. Video saved: {save_dir}/{ref_name}_{audio_name}_{args.H}x{args.W}_{int(args.cfg)}_{time_str}.mp4")

            logging.info("39. Adding audio to video")
            video_clip = VideoFileClip(f"{save_dir}/{ref_name}_{audio_name}_{args.H}x{args.W}_{int(args.cfg)}_{time_str}.mp4")
            audio_clip = AudioFileClip(audio_path)
            video_clip = video_clip.set_audio(audio_clip)
            video_clip.write_videofile(f"{save_dir}/{ref_name}_{audio_name}_{args.H}x{args.W}_{int(args.cfg)}_{time_str}_withaudio.mp4", codec="libx264", audio_codec="aac")
            logging.info(f"40. Video with audio saved: {save_dir}/{ref_name}_{audio_name}_{args.H}x{args.W}_{int(args.cfg)}_{time_str}_withaudio.mp4")

        except Exception:
            # logging.exception records the full traceback, unlike logging.error
            logging.exception("41. Error during video processing")

    logging.info("42. Main function completed")


if __name__ == "__main__":
    main()

YangQuGithub commented:

I got similar errors; a new folder named d (containing numerous pkl files) was created in the project root folder:

(echomimic) D:\ai\EchoMimic>python -u demo_motion_sync.py
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
W0000 00:00:1721116243.688493 19664 face_landmarker_graph.cc:174] Sets FaceBlendshapesGraph acceleration to xnnpack by default.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
W0000 00:00:1721116243.698317 2292 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
W0000 00:00:1721116243.705942 2292 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
W0000 00:00:1721116243.713692 6704 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
C:\Users\quyan.conda\envs\echomimic\lib\site-packages\google\protobuf\symbol_database.py:55: UserWarning: SymbolDatabase.GetPrototype() is deprecated. Please use message_factory.GetMessageClass() instead. SymbolDatabase.GetPrototype() will be removed soon.
warnings.warn('SymbolDatabase.GetPrototype() is deprecated. Please '
288
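Those pkl files are the per-frame landmarks that demo_motion_sync.py extracts (the folder is named after the reference image, here "d"). A quick sketch for inspecting one of them (pickle is stdlib; the path and the assumption that each file holds a landmark array are mine):

```python
import pickle

def peek_pkl(path):
    """Load one pickled file and report its type and shape (if it has one)."""
    with open(path, "rb") as f:
        data = pickle.load(f)
    shape = getattr(data, "shape", None)  # numpy arrays have .shape, plain lists do not
    return type(data).__name__, shape

# usage (hypothetical path to one extracted frame):
# print(peek_pkl("d/0.pkl"))
```
If the files load and contain sensible landmark arrays, the motion-sync step worked and the failure is downstream in infer_audio2vid_pose.py.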
