Command line Inference not working with config #63

Closed
nitinmukesh opened this issue Jun 19, 2024 · 0 comments · Fixed by #89
I am specifying the config file, but the script is trying to load an image that is not specified anywhere:

(venv) C:\sd\hallo>python scripts/inference.py --config "C:\sd\hallo\configs\inference\default.yaml"
A matching Triton is not available, some optimizations will not be enabled
Traceback (most recent call last):
  File "C:\sd\hallo\venv\lib\site-packages\xformers\__init__.py", line 55, in _is_triton_available
    from xformers.triton.softmax import softmax as triton_softmax  # noqa
  File "C:\sd\hallo\venv\lib\site-packages\xformers\triton\softmax.py", line 11, in <module>
    import triton
ModuleNotFoundError: No module named 'triton'
INFO:albumentations.check_version:A new version of Albumentations is available: 1.4.9 (you have 1.4.8). Upgrade using: pip install --upgrade albumentations
WARNING:py.warnings:C:\sd\hallo\venv\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py:69: UserWarning: Specified provider 'CUDAExecutionProvider' is not in available provider names.Available providers: 'AzureExecutionProvider, CPUExecutionProvider'
  warnings.warn(

Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\glintr100.onnx recognition ['None', 3, 112, 112] 127.5 127.5
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\scrfd_10g_bnkps.onnx detection [1, 3, '?', '?'] 127.5 128.0
set det-size: (640, 640)
Traceback (most recent call last):
  File "C:\sd\hallo\scripts\inference.py", line 374, in <module>
    inference_process(command_line_args)
  File "C:\sd\hallo\scripts\inference.py", line 162, in inference_process
    source_image_lip_mask = image_processor.preprocess(
  File "C:\sd\hallo\scripts\hallo\datasets\image_processor.py", line 115, in preprocess
    source_image = Image.open(source_image_path)
  File "C:\sd\hallo\venv\lib\site-packages\PIL\Image.py", line 3277, in open
    fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\sd\\hallo\\test_data\\source_images\\6.jpg'
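The config points source_image at .\examples\reference_images\FACE.png, yet the script opens test_data\source_images\6.jpg, a path that appears nowhere in the YAML. This suggests the argparse default for a --source_image flag is taking precedence over the value in the config. A minimal sketch of the likely fix, assuming the script parses flags with argparse and loads the YAML with OmegaConf (the flag names and default path here are inferred from the traceback, not verified against scripts/inference.py):

# Hypothetical sketch: let config values win unless a flag is given explicitly.
import argparse
from omegaconf import OmegaConf

parser = argparse.ArgumentParser()
parser.add_argument("--config", default="configs/inference/default.yaml")
parser.add_argument("--source_image", default=None)   # no hard-coded test path
parser.add_argument("--driving_audio", default=None)
args = parser.parse_args()

config = OmegaConf.load(args.config)
# Fall back to the config only when the flag was not passed on the CLI.
source_image_path = args.source_image or config.source_image
driving_audio_path = args.driving_audio or config.driving_audio

With defaults of None, omitting --source_image on the command line would make the config's value effective instead of a fixed test path.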

default.yaml:

source_image: .\examples\reference_images\FACE.png
driving_audio: .\examples\driving_audios\1.wav

weight_dtype: fp16

data:
  n_motion_frames: 2
  n_sample_frames: 16
  source_image:
    width: 512
    height: 512
  driving_audio:
    sample_rate: 16000
  export_video:
    fps: 25

inference_steps: 40
cfg_scale: 3.5

audio_ckpt_dir: ./pretrained_models/hallo

base_model_path: ./pretrained_models/stable-diffusion-v1-5

motion_module_path: ./pretrained_models/motion_module/mm_sd_v15_v2.ckpt

face_analysis:
  model_path: ./pretrained_models/face_analysis

wav2vec:
  model_path: ./pretrained_models/wav2vec/wav2vec2-base-960h
  features: all

audio_separator:
  model_path: ./pretrained_models/audio_separator/Kim_Vocal_2.onnx

vae:
  model_path: ./pretrained_models/sd-vae-ft-mse

save_path: ./.cache

face_expand_ratio: 1.1
pose_weight: 1.1
face_weight: 1.1
lip_weight: 1.1

unet_additional_kwargs:
  use_inflated_groupnorm: true
  unet_use_cross_frame_attention: false
  unet_use_temporal_attention: false
  use_motion_module: true
  use_audio_module: true
  motion_module_resolutions:
    - 1
    - 2
    - 4
    - 8
  motion_module_mid_block: true
  motion_module_decoder_only: false
  motion_module_type: Vanilla
  motion_module_kwargs:
    num_attention_heads: 8
    num_transformer_block: 1
    attention_block_types:
      - Temporal_Self
      - Temporal_Self
    temporal_position_encoding: true
    temporal_position_encoding_max_len: 32
    temporal_attention_dim_div: 1
  audio_attention_dim: 768
  stack_enable_blocks_name:
    - "up"
    - "down"
    - "mid"
  stack_enable_blocks_depth: [0,1,2,3]
  

enable_zero_snr: true

noise_scheduler_kwargs:
  beta_start: 0.00085
  beta_end: 0.012
  beta_schedule: "linear"
  clip_sample: false
  steps_offset: 1
  ### Zero-SNR params
  prediction_type: "v_prediction"
  rescale_betas_zero_snr: True
  timestep_spacing: "trailing"

sampler: DDIM

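As a quick check that the YAML itself is well formed and the paths it references actually exist, the config can be loaded standalone (a sketch, assuming OmegaConf is used to load it, as the project appears to do):

from pathlib import Path
from omegaconf import OmegaConf

config = OmegaConf.load("configs/inference/default.yaml")
# Print the two user-supplied inputs and whether they resolve on disk.
for key in ("source_image", "driving_audio"):
    p = Path(config[key])
    print(key, "->", p, "| exists:", p.exists())

If both paths print exists: True while inference still opens a different file, the problem is in how the script merges CLI arguments with the config, not in the config itself.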
