refactor video processor (part # 7776) #7861
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
    def is_valid_image(image):
        return isinstance(image, PIL.Image.Image) or isinstance(image, (np.ndarray, torch.Tensor)) and image.ndim in (2, 3)


    def is_valid_image_imagelist(images):
        # check if the image input is one of the supported formats for image and image list:
        # it can be either (1) a 4d pytorch tensor or numpy array, (2) a valid image or (3) list of valid image
        if isinstance(images, (np.ndarray, torch.Tensor)) and images.ndim == 4:
            return True
        elif is_valid_image(images):
            return True
        elif isinstance(images, list):
            return all(is_valid_image(image) for image in images)
        return False
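To make the accepted input shapes concrete, here is a minimal sketch of the same checks restricted to NumPy arrays. It is an illustration only: the PIL and torch branches from the snippet above are dropped to keep the example dependency-light, and the array shapes are arbitrary assumptions.

```python
import numpy as np


def is_valid_image(image):
    # NumPy-only stand-in: a single image is a 2D (H, W) or 3D (H, W, C) array.
    return isinstance(image, np.ndarray) and image.ndim in (2, 3)


def is_valid_image_imagelist(images):
    # Accept (1) a 4D batch, (2) a single valid image, or (3) a list of valid images.
    if isinstance(images, np.ndarray) and images.ndim == 4:
        return True
    elif is_valid_image(images):
        return True
    elif isinstance(images, list):
        return all(is_valid_image(image) for image in images)
    return False


single = np.zeros((64, 64, 3))             # one image
batch = np.zeros((2, 64, 64, 3))           # 4D batch of images
video_batch = np.zeros((1, 8, 64, 64, 3))  # 5D array, a video-shaped input

results = {
    "single image": is_valid_image_imagelist(single),
    "4D batch": is_valid_image_imagelist(batch),
    "list of images": is_valid_image_imagelist([single, single]),
    "list of 4D batches": is_valid_image_imagelist([batch, batch]),
    "5D array": is_valid_image_imagelist(video_batch),
}
print(results)
```

In this sketch the first three inputs pass, while a list of 4D batches and a 5D array are rejected, because each list element must itself be a 2D or 3D image.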
Very important! Thank you.
sayakpaul left a comment
Thanks a bunch!
Agree that list of 5D is not needed.
    elif isinstance(video, list) and isinstance(video[0], PIL.Image.Image):
    # video processor only accepts video or a list of videos or a batch of videos (5d array/tensors) as inputs,
    # while we do accept a list of 5d array/tensors, we concatenate them to a single video batch
    if isinstance(video, list) and isinstance(video[0], np.ndarray) and video[0].ndim == 5:
Think it's okay to not accept a list of 5D video. Perhaps just raise an error if a 5D list is passed here with a message asking to concatenate along the batch dimension?
I throw a warning and deprecate it instead, just to be safer.
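The warn-and-concatenate behavior discussed in this thread could look roughly like the sketch below. It is not the code from the PR: `merge_video_batches` is a hypothetical helper name, the warning text is made up, and only the NumPy case is covered.

```python
import warnings

import numpy as np


def merge_video_batches(video):
    # Hypothetical helper (not the library's API): accept a list of 5D arrays
    # for backward compatibility, warn, and merge them into one 5D batch.
    if isinstance(video, list) and isinstance(video[0], np.ndarray) and video[0].ndim == 5:
        warnings.warn(
            "Passing a list of 5D arrays is deprecated; "
            "concatenate along the batch dimension and pass a single 5D array.",
            FutureWarning,
        )
        # Axis 0 is assumed to be the batch dimension.
        video = np.concatenate(video, axis=0)
    return video


# Two batches of 8-frame clips with batch sizes 1 and 2 (shapes are arbitrary).
clips = [np.zeros((1, 8, 3, 32, 32)), np.zeros((2, 8, 3, 32, 32))]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    merged = merge_video_batches(clips)

print(merged.shape)
print([w.category.__name__ for w in caught])
```

Callers that already pass a single 5D array go through unchanged and see no warning, which is what makes a deprecation gentler than raising an error right away.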
DN6 left a comment
Nicely done 👍🏽
sayakpaul left a comment
Let's ship!
Thanks a lot, Yiyi!
* introduce videoprocessor.
* fix quality
* address yiyi's feedback
* fix preprocess_video call.
* video_processor -> image_processor
* fix
* fix more.
* quality
* image_processor -> video_processor
* support List[List[PIL.Image.Image]]
* change to video_processor.
* documentation
* Apply suggestions from code review
* changes
* remove print.
* refactor video processor (part # 7776) (#7861)
* update
* update remove deprecate
* Update src/diffusers/video_processor.py
* update
* Apply suggestions from code review
* deprecate list of 5d for video and list of 4d for image + apply other feedbacks
* up
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* add doc.
* tensor2vid -> postprocess_video.
* refactor preprocess with preprocess_video
* set default values.
* empty commit
* more refactoring of prepare_latents in animatediff vid2vid
* checking documentation
* remove documentation for now.
* fix animatediff sdxl
* fix test failure [part of video processor PR] (#7905) up
* remove preceed_with_frames.
* doc
* fix
* fix
* remove video input as a single-frame video.
---------
Co-authored-by: YiYi Xu <yixu310@gmail.com>
part of #7776
I refactored the preprocess for both the image and video processors. A few notes:
* I refactored the image processor a little bit, too, so it is more aligned with the video processor.