
Mingyuanm/sdxl quantization notebook #9042

Merged
merged 80 commits into from
Apr 30, 2024

Conversation

Victor49152
Collaborator

What does this PR do ?

Update the quantization script and add a tutorial and documentation.

Collection: [Note which collection this PR will affect]

Changelog

  • Add specific line by line info of high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

Jenkins CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

There's no need to comment jenkins on the PR to trigger Jenkins CI.
The GitHub Actions CI will run automatically when the PR is opened.
To run CI on an untrusted fork, a NeMo user with write access must click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

Victor49152 and others added 30 commits April 9, 2024 16:07
Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
…container.

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
@github-actions github-actions bot removed the NLP label Apr 25, 2024
pre-commit-ci bot and others added 3 commits April 25, 2024 21:57
Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
…book' into mingyuanm/sdxl_quantization_notebook
yaoyu-33
yaoyu-33 previously approved these changes Apr 26, 2024
Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
…book' into mingyuanm/sdxl_quantization_notebook
@Victor49152 Victor49152 merged commit be59e08 into main Apr 30, 2024
132 of 246 checks passed
@Victor49152 Victor49152 deleted the mingyuanm/sdxl_quantization_notebook branch April 30, 2024 02:32
suiyoubi pushed a commit that referenced this pull request May 2, 2024
* Move cached embedding devices and dtype for onnx export consistency

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* Add old trt export/inference script, currently not working in latest container.

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* Update intro and why nemo in dev doc

* Categorize tutorials

* Add NeMo TRT inference pipeline and quantization workflow

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add guards to avoid undefined variables

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* minor fix

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update tutorials link

* update index

* Restructure

* Restructure

* Restructure

* Restructure

* Restructure

* Restructure

* Add conversion script from hf sdxl to nemo sdxl

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* Update quantize pipeline to adapt to variable image dimension

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* update sdxl pipeline to be aware of additional emb channels

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add guards for potential local var

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* copyright header

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Restructure

* Restructure

* Update flash attention

* Update flash attention

* Update file paths

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* Fix few structure issue

* Fix migration

* Fix structure

* Fix structure

* Few updates

* Add few more scripts

* Fix scripts

* Fix few things

* Fix tutorial table

* Restructure

* Rename

* Add notebook

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* WIP

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* Documentation

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* Few fixes and moves

* Move sections

* Fix bib

* Refactor files

* Fixes

* Update quantization script

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* Add tutorial and docs

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* Add images

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* Fix

* Fix few issues

* remove scripts

* Update comments

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* Update docs

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* Add links to sdxl quantization tutorial

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* Add link to new tutorial

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unused import

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* Using links to images

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* remove unused imports

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
Co-authored-by: yaoyu-33 <yaoyu.094@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Pablo Garay <palenq@gmail.com>
Signed-off-by: Ao Tang <aot@nvidia.com>
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024