Skip to content

Conversation

@yiyixuxu
Copy link
Collaborator

@yiyixuxu yiyixuxu commented May 5, 2024

part of #7776

I refactored the preprocess for both image and video processor

a few notes:

  • we do not need to accept list of list of 4d (the current code does not accept it). I removed the relevant test for list of list of 4d
  • I also think we do not need to accept list of 5d (the video processor should only accept video or a list of videos, and 5d is already a list of videos, I don't think passing list of 5d is needed); however, a list of 5d input works with current code so I make it work here as well (and add notes to it because it is a little bit confusing). I also would like to hear @DN6 ' and @a-r-r-o-w s opinions that if there would be a use case where someone needs to pass a list of 5d tensor/array as inputs - we can deprecate it if there is no use case for it

I refactored the image processor a little bit, too, so it is more aligned with the video processor

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yiyixuxu yiyixuxu requested a review from sayakpaul May 5, 2024 18:41
Comment on lines 41 to 54
def is_valid_image(image):
return isinstance(image, PIL.Image.Image) or isinstance(image, (np.ndarray, torch.Tensor)) and image.ndim in (2, 3)


def is_valid_image_imagelist(images):
# check if the image input is one of the supported formats for image and image list:
# it can be either (1) a 4d pytorch tensor or numpy array, (2) a valid image or (3) list of valid image
if isinstance(images, (np.ndarray, torch.Tensor)) and images.ndim == 4:
return True
elif is_valid_image(images):
return True
elif isinstance(images, list):
return all(is_valid_image(image) for image in images)
return False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very important! Thank you.

Copy link
Member

@sayakpaul sayakpaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a bunch!

@sayakpaul sayakpaul mentioned this pull request May 6, 2024
2 tasks
@DN6
Copy link
Collaborator

DN6 commented May 6, 2024

I also think we do not need to accept list of 5d (the video processor should only accept video or a list of videos, and 5d is already a list of videos, I don't think passing list of 5d is needed); however, a list of 5d input works with current code so I make it work here as well (and add notes to it because it is a little bit confusing).

Agree that list of 5D is not needed.

elif isinstance(video, list) and isinstance(video[0], PIL.Image.Image):
# video processor only accepts video or a list of videos or a batch of videos (5d array/tensors) as inputs,
# while we do accept a list of 5d array/tensors, we concatenate them to a single video batch
if isinstance(video, list) and isinstance(video[0], np.ndarray) and video[0].ndim == 5:
Copy link
Collaborator

@DN6 DN6 May 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think it's okay to not accept a list of 5D video. Perhaps just raise an error if a 5D list is passed here with a message asking to concatenate along the batch dimension?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I throw a warning and deprecated it just to be more safe

Copy link
Collaborator

@DN6 DN6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done 👍🏽

Copy link
Member

@sayakpaul sayakpaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's ship!

@sayakpaul sayakpaul merged commit e2f61d5 into video-processor May 7, 2024
@sayakpaul
Copy link
Member

Thanks a lot, Yiyi!

@sayakpaul sayakpaul deleted the video-processor-yiyi-testing branch May 7, 2024 04:51
sayakpaul added a commit that referenced this pull request May 10, 2024
* introduce videoprocessor.

* fix quality

* address yiyi's feedback

* fix preprocess_video call.

* video_processor -> image_processor

* fix

* fix more.

* quality

* image_processor -> video_processor

* support List[List[PIL.Image.Image]]

* change to video_processor.

* documentation

* Apply suggestions from code review

* changes

* remove print.

* refactor video processor (part # 7776) (#7861)

* update

* update remove deprecate

* Update src/diffusers/video_processor.py

* update

* Apply suggestions from code review

* deprecate list of 5d for video and list of 4d for image + apply other feedbacks

* up

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* add doc.

* tensor2vid -> postprocess_video.

* refactor preprocess with preprocess_video

* set default values.

* empty commit

* more refactoring of prepare_latents in animatediff vid2vid

* checking documentation

* remove documentation for now.

* fix animatediff sdxl

* fix test failure [part of video processor PR] (#7905)

up

* remove preceed_with_frames.

* doc

* fix

* fix

* remove video input as a single-frame video.

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
sayakpaul added a commit that referenced this pull request Dec 23, 2024
* introduce videoprocessor.

* fix quality

* address yiyi's feedback

* fix preprocess_video call.

* video_processor -> image_processor

* fix

* fix more.

* quality

* image_processor -> video_processor

* support List[List[PIL.Image.Image]]

* change to video_processor.

* documentation

* Apply suggestions from code review

* changes

* remove print.

* refactor video processor (part # 7776) (#7861)

* update

* update remove deprecate

* Update src/diffusers/video_processor.py

* update

* Apply suggestions from code review

* deprecate list of 5d for video and list of 4d for image + apply other feedbacks

* up

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* add doc.

* tensor2vid -> postprocess_video.

* refactor preprocess with preprocess_video

* set default values.

* empty commit

* more refactoring of prepare_latents in animatediff vid2vid

* checking documentation

* remove documentation for now.

* fix animatediff sdxl

* fix test failure [part of video processor PR] (#7905)

up

* remove preceed_with_frames.

* doc

* fix

* fix

* remove video input as a single-frame video.

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants