Does the diffusers example support a batch size option? #6

Closed

ericlormul opened this issue Oct 3, 2022 · 4 comments

Comments

@ericlormul

In plain diffusers, making the prompt a list batches the input, but in AITemplate I get the following error when I make the prompt a list of size 2.

```
{'trained_betas'} was not found in config. Values will be initialized to default values.
[18:28:36] ./tmp/CLIPTextModel/model-generated.h:275: Init AITemplate Runtime.
[18:28:37] ./tmp/UNet2DConditionModel/model-generated.h:3262: Init AITemplate Runtime.
[18:28:37] ./tmp/AutoencoderKL/model-generated.h:678: Init AITemplate Runtime.
[18:28:40] ./tmp/CLIPTextModel/model_interface.cu:92: Error: [SetValue] Dimension got value out of bounds; expected value to be in [1, 1], but got 2
Traceback (most recent call last):
  File "examples/05_stable_diffusion/demo.py", line 46, in <module>
    run()
  File "/home/root/miniconda3/envs/ldm/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/root/miniconda3/envs/ldm/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/root/miniconda3/envs/ldm/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/root/miniconda3/envs/ldm/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "examples/05_stable_diffusion/demo.py", line 37, in run
    image = pipe(prompt).images[0]
  File "/home/root/miniconda3/envs/ldm/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/root/repos/AITemplate/examples/05_stable_diffusion/pipeline_stable_diffusion_ait.py", line 247, in __call__
    text_embeddings = self.clip_inference(text_input.input_ids.to(self.device))
  File "/home/root/repos/AITemplate/examples/05_stable_diffusion/pipeline_stable_diffusion_ait.py", line 139, in clip_inference
    exe_module.run_with_tensors(inputs, ys, graph_mode=True)
  File "/home/root/miniconda3/envs/ldm/lib/python3.8/site-packages/aitemplate/compiler/model.py", line 483, in run_with_tensors
    outputs_ait = self.run(
  File "/home/root/miniconda3/envs/ldm/lib/python3.8/site-packages/aitemplate/compiler/model.py", line 438, in run
    return self._run_impl(
  File "/home/root/miniconda3/envs/ldm/lib/python3.8/site-packages/aitemplate/compiler/model.py", line 377, in _run_impl
    self.DLL.AITemplateModelContainerRun(
  File "/home/root/miniconda3/envs/ldm/lib/python3.8/site-packages/aitemplate/compiler/model.py", line 192, in _wrapped_func
    raise RuntimeError(f"Error in function: {method.__name__}")
RuntimeError: Error in function: AITemplateModelContainerRun
```
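
The telling line is the SetValue error from the compiled CLIP module: the modules under ./tmp/ were built with the batch dimension fixed at 1, so a batch of 2 fails the bounds check at runtime. A minimal sketch of the fix direction, assuming AITemplate's IntVar/Tensor frontend (the [1, 8] range and the sequence length of 64 are illustrative assumptions, not the repo's actual compile code):

```python
# Sketch: declare the batch dimension as a dynamic IntVar so the compiled
# module accepts any batch size in the declared range, instead of baking
# in a fixed batch of 1. Values shown are assumptions for illustration.
from aitemplate.frontend import IntVar, Tensor

batch_size = IntVar(values=[1, 8], name="batch_size")  # dynamic batch: 1..8
input_ids = Tensor(
    shape=[batch_size, 64],  # seq_len of 64 is an assumption here
    name="input0",
    dtype="int64",
    is_input=True,
)
```
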
@antinucleon
Contributor

Yes, the batched version is not merged yet. Check here:

https://github.com/terrychenism/AIT_StableDiffusion/tree/main/examples/05_stable_diffusion

@terrychenism
Contributor

batched sd: #8
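
For reference, a hypothetical usage sketch once the batched modules from that PR are compiled (pipe is the StableDiffusionAITPipeline instance from demo.py; the prompt count must match the batch size the modules were built for):

```python
# Assumes modules were compiled for batch size 2.
prompts = [
    "a photo of an astronaut riding a horse on mars",
    "a watercolor painting of a lighthouse at dawn",
]
images = pipe(prompts).images  # expect len(images) == 2
```
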

@ericlormul
Author

Hi, I pulled the latest commit, reinstalled, and recompiled everything, but when I change the prompt variable in demo.py to a list of strings it still gives the above error. What's the correct way to do batched inference with StableDiffusionAITPipeline? Thanks!
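
To illustrate why changing the prompt alone doesn't help: the pipeline tokenizes all prompts into one tensor, and the compiled module then validates the batch dimension against bounds fixed at compile time. A hedged sketch of that check (shapes and bounds below are illustrative; the real check happens inside the compiled module's SetValue):

```python
import torch

# The pipeline tokenizes all prompts into one [batch, seq_len] tensor;
# 77 is CLIP's usual max length and is an assumption here.
input_ids = torch.randint(0, 49408, (2, 77))

# Bounds baked into the module at compile time; demo.py cannot change them.
compiled_lo, compiled_hi = 1, 1
batch = input_ids.shape[0]
if not (compiled_lo <= batch <= compiled_hi):
    raise RuntimeError(
        f"[SetValue] expected value in [{compiled_lo}, {compiled_hi}], got {batch}"
    )
```

So the modules must be rebuilt for the desired batch size before the pipeline will accept a list of 2 prompts.
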

@lileilai

I have met the same problem. Did you find a solution?

asroy pushed a commit to shaojiewang/AITemplate that referenced this issue Nov 10, 2022
* fixed batch_size > 1

* load so file for benchmark
tissue3 pushed a commit to tissue3/AITemplate-1 that referenced this issue Feb 7, 2023
* [runner] unified parallel builder/profiler

* [lint] patched

* [test] recover avg pool2d test

* [task_runner] add comment for ftask_proc, fret_proc

Co-authored-by: Bing Xu <bingxu@fb.com>
evshiron pushed a commit to are-we-gfx1100-yet/AITemplate that referenced this issue Jun 21, 2023