Standardize fast pipeline tests with PipelineTestMixin #1526

Merged: 25 commits into main on Dec 6, 2022

Conversation

anton-l (Member) commented Dec 2, 2022

  • Moves some of the common tests (save/load, output consistency, determinism) to a common PipelineTesterMixin
  • Adds dummy pipeline and input generators to pass to the mixin class, and tidies up the rest of the tests.
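The mixin pattern described above can be sketched in miniature. All names below other than PipelineTesterMixin are hypothetical stand-ins; the real mixin lives in tests/test_pipelines_common.py and operates on actual diffusers pipelines:

```python
import unittest


class PipelineTesterMixin:
    """Common tests shared by all pipeline test classes (sketch)."""

    def get_dummy_components(self):
        raise NotImplementedError("child test classes must provide dummy components")

    def get_dummy_inputs(self):
        raise NotImplementedError("child test classes must provide dummy inputs")

    def test_dict_tuple_outputs_match(self):
        # common test: dict and tuple outputs of the pipeline must agree
        pipe = self.pipeline_class(**self.get_dummy_components())
        out_dict = pipe(**self.get_dummy_inputs(), return_dict=True)
        out_tuple = pipe(**self.get_dummy_inputs(), return_dict=False)
        self.assertEqual(out_dict["images"], out_tuple[0])


class FakePipeline:
    """Illustrative stand-in for a diffusers pipeline."""

    def __init__(self, scale=1.0):
        self.scale = scale

    def __call__(self, sample, return_dict=True):
        images = [v * self.scale for v in sample]
        return {"images": images} if return_dict else (images,)


class FakePipelineFastTests(PipelineTesterMixin, unittest.TestCase):
    # the child class only declares the pipeline and its dummy fixtures;
    # all common tests come from the mixin for free
    pipeline_class = FakePipeline

    def get_dummy_components(self):
        return {"scale": 2.0}

    def get_dummy_inputs(self):
        return {"sample": [1.0, 2.0]}
```

Each new pipeline test file then only supplies fixtures, while save/load, determinism, and output-consistency checks are inherited.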

HuggingFaceDocBuilderDev commented Dec 2, 2022

The documentation is not available anymore as the PR was closed or merged.

max_diff = np.abs(output - output_loaded).max()
self.assertLessEqual(max_diff, 1e-5)

def test_tuple_output(self):
Contributor:

Nice! That's very good for now! In the future we could expand this test to automatically make sure that not only the first element of the tuple matches, but all the others as well.

For now this would e.g. apply to the stable diffusion pipeline.

pcuenca (Member) left a comment:

I like the approach in principle; I think it works to simplify the tests and make them coherent!

patrickvonplaten (Contributor) left a comment:

Design looks nice to me! I just think we could add some more common tests. Note that the more tests we add here, the more "safety" we get for free. Some additional tests we could add:

  • We can for now assume that every pipeline has a num_inference_steps parameter IMO. I'd also add a test that runs every pipeline with 2-3 different num_inference_steps values and makes sure that a) output sizes are always the same, and b) runtime is lower with fewer steps.
  • from_pretrained from the Hub and locally: IMO every pipeline should also define a parameter dummy_components_on_hub in addition to get_common_pipeline_components(). These dummy components could then be used to check that loading from the Hub and saving/loading locally give the same results.
  • Use dummy_components_on_hub and add an EXPECTED_SLICE to create a common test that makes sure that dummy weights on the Hub give expected results. This should help us get rid of a lot of boilerplate fast tests that check for numerical equivalence.
  • The components function can/should also be tested here. E.g., just check that components is the same as get_common_pipeline_components() and also the same as the init signature without the optional arguments.
  • Enabling/disabling the progress bar can be checked.
  • Load/save with safetensors.

=> Contrary to src/diffusers, I'm really happy to go full abstraction mode here, so the more "common" tests we add, the faster we'll be able to add new pipelines going forward.
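The first suggestion above, verifying that output size is invariant across different num_inference_steps values, could look roughly like this. ToyPipeline and check_num_inference_steps are illustrative stand-ins, not diffusers API:

```python
def check_num_inference_steps(pipeline, inputs, step_counts=(2, 3, 5)):
    """Run the pipeline with several step counts; output size must not change."""
    sizes = []
    for steps in step_counts:
        out = pipeline(**inputs, num_inference_steps=steps)
        sizes.append(len(out))
    assert all(s == sizes[0] for s in sizes), "output size changed with num_inference_steps"
    return sizes


class ToyPipeline:
    """Illustrative pipeline: each 'denoising step' rescales the sample,
    so the output length is independent of the step count."""

    def __call__(self, sample, num_inference_steps=10):
        for _ in range(num_inference_steps):
            sample = [v * 0.9 for v in sample]
        return sample
```

The timing half of the suggestion (fewer steps should be faster) would add a wall-clock comparison on top of this shape check.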

patrickvonplaten (Contributor):
Also, some signature tests would be nice; e.g., for now we could maybe "force" every pipeline to have a num_inference_steps input.

anton-l (Member, Author) commented Dec 6, 2022

Added the requested tests, except for:

  • from_pretrained from the Hub + dummy_components_on_hub: will add those in a separate PR after mass-generating the missing checkpoints
  • EXPECTED_SLICE: haven't seen many repeated slices so far, but might revisit this later after updating the rest of the tests
  • Load/save with safetensors: we don't have save_pretrained with safetensors yet, but loading can probably be tested with the same dummy_components_on_hub, where the safetensors weights live in the same repo

output_loaded = pipe_loaded(**inputs)[0]

max_diff = np.abs(output - output_loaded).max()
self.assertLess(max_diff, 3e-3, "The output of the fp16 pipeline changed after saving and loading.")
anton-l (Member, Author):

We might have an opportunity to make fp16 more stable somewhere; an error of about 0.002 even when running the same pipeline twice (without saving) seems suspicious to me.
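For context on the 3e-3 tolerance in the snippet above: a single round trip through float16 already introduces rounding error on the order of the fp16 machine epsilon (about 1e-3), so some drift is expected even before any kernel-level nondeterminism. A quick numpy illustration (not the actual test):

```python
import numpy as np

# values near 1.0, where fp16 spacing is roughly 1e-3
x = np.linspace(0.5, 1.5, 8, dtype=np.float32)

# cast down to float16 and back, as happens implicitly in fp16 pipelines
roundtrip = x.astype(np.float16).astype(np.float32)

max_diff = np.abs(x - roundtrip).max()
# the rounding error is nonzero but comfortably inside the 3e-3 tolerance
assert 0 < max_diff < 3e-3
```

This is why the fp16 save/load test uses a looser threshold than the 1e-5 used for fp32 comparisons.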

Comment on lines +69 to +85
# TODO: address the non-deterministic text encoder (fails for save-load tests)
# torch.manual_seed(0)
# text_encoder_config = RobertaSeriesConfig(
# hidden_size=32,
# project_dim=32,
# intermediate_size=37,
# layer_norm_eps=1e-05,
# num_attention_heads=4,
# num_hidden_layers=5,
# vocab_size=5002,
# )
# text_encoder = RobertaSeriesModelWithTransformation(text_encoder_config)

torch.manual_seed(0)
text_encoder_config = CLIPTextConfig(
bos_token_id=0,
eos_token_id=2,
anton-l (Member, Author):

@patil-suraj the AltDiffusion pipeline produces non-matching outputs if I run it with the same inputs twice. Replacing RobertaSeriesModelWithTransformation with CLIPTextModel helped, so it's probably that.
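The determinism check that surfaced this can be sketched generically: run the same function twice with identically seeded generators and compare the outputs elementwise. This sketch uses numpy's Generator as a stand-in for torch.Generator; the names are illustrative:

```python
import numpy as np


def assert_deterministic(fn, inputs, seed=0, tol=1e-5):
    """Call fn twice with identically seeded generators; outputs must match."""
    out1 = fn(np.random.default_rng(seed), **inputs)
    out2 = fn(np.random.default_rng(seed), **inputs)
    max_diff = np.abs(np.asarray(out1) - np.asarray(out2)).max()
    assert max_diff <= tol, f"non-deterministic output, max diff {max_diff}"
    return max_diff


# a deterministic "pipeline" passes the check with a diff of exactly 0.0
max_diff = assert_deterministic(
    lambda rng, shape: rng.standard_normal(shape), {"shape": (2, 3)}
)
```

A model with an unseeded or stateful component (as suspected of RobertaSeriesModelWithTransformation here) would fail this check even though each individual run looks plausible.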

Contributor:

Hmm, weird. Could you maybe open a separate issue for this? I can look into it :-) We should test with RobertaSeriesConfig here IMO :-)

@anton-l anton-l changed the title [WIP] Standardize fast pipeline tests with PipelineTestMixin Standardize fast pipeline tests with PipelineTestMixin Dec 6, 2022
@anton-l anton-l marked this pull request as ready for review December 6, 2022 16:13
Comment on lines +60 to +67
def test_save_load_local(self):
if torch_device == "mps" and self.pipeline_class in (
DanceDiffusionPipeline,
CycleDiffusionPipeline,
StableDiffusionImg2ImgPipeline,
):
# FIXME: inconsistent outputs on MPS
return
anton-l (Member, Author):

cc @pcuenca I might need your expertise with these, whenever you have time 😅

Contributor:

As discussed offline, let's maybe open an issue here. But it's very nice that we spotted these bugs :-)


torch.backends.cuda.matmul.allow_tf32 = False


-class KarrasVePipelineFastTests(PipelineTesterMixin, unittest.TestCase):
+class KarrasVePipelineFastTests(unittest.TestCase):
Contributor:
I guess this whole class needs an API update, no? Could we maybe also open an issue for this?


@property
def pipeline_class(self) -> Union[Callable, DiffusionPipeline]:
raise NotImplementedError(
Contributor:

very nice! This forces new models to be nicely tested :-)

max_diff = np.abs(output_with_offload - output_without_offload).max()
self.assertLess(max_diff, 1e-5, "XFormers attention should not affect the inference results")

def test_progress_bar(self):
Contributor:

nice!

@@ -9,4 +27,347 @@ class PipelineTesterMixin:
equivalence of dict and tuple outputs, etc.
"""

-    pass
+    # set these parameters to False in the child class if the pipeline does not support the corresponding functionality
+    test_attention_slicing = True
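The opt-out mechanism works by having each common test consult a class-level flag that a child class can override; a minimal self-contained sketch (test and class names are illustrative, following the transformers-style pattern referenced below):

```python
import unittest


class PipelineTesterMixin:
    # default: every child class runs the common test
    test_attention_slicing = True

    def test_attention_slicing_forward_pass(self):
        if not self.test_attention_slicing:
            self.skipTest("attention slicing is not supported by this pipeline")
        # the real test would compare sliced vs. unsliced outputs here
        self.assertTrue(True)


class OptOutPipelineFastTests(PipelineTesterMixin, unittest.TestCase):
    # this pipeline opts out of the common attention-slicing test
    test_attention_slicing = False
```

Opted-out tests show up as skips rather than silently disappearing, so coverage gaps remain visible in the test report.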
Contributor:
very cool to reuse transformers mechanism here!

patrickvonplaten (Contributor) left a comment:

Great work! Let's merge it!

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
@anton-l anton-l merged commit 02d83c9 into main Dec 6, 2022
@anton-l anton-l deleted the pipeline-test-mixins branch December 6, 2022 17:35
tcapelle pushed a commit to tcapelle/diffusers that referenced this pull request Dec 12, 2022

* [WIP] Standardize fast pipeline tests with PipelineTestMixin

* refactor the sd tests a bit

* add more common tests

* add xformers

* add progressbar test

* cleanup

* upd fp16

* CycleDiffusionPipelineFastTests

* DanceDiffusionPipelineFastTests

* AltDiffusionPipelineFastTests

* StableDiffusion2PipelineFastTests

* StableDiffusion2InpaintPipelineFastTests

* StableDiffusionImageVariationPipelineFastTests

* StableDiffusionImg2ImgPipelineFastTests

* StableDiffusionInpaintPipelineFastTests

* remove unused mixins

* quality

* add missing inits

* try to fix mps tests

* fix mps tests

* add mps warmups

* skip for some pipelines

* style

* Update tests/test_pipelines_common.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
sliard pushed a commit to sliard/diffusers that referenced this pull request Dec 21, 2022
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
4 participants