[Inductor] add config for weight prepacking #93811

Closed
wants to merge 1 commit

Conversation

@Valentine233 (Collaborator) commented Feb 1, 2023

Fixes #93495

Mkldnn weight prepacking may lead to a large memory footprint for some models, such as UniXcoder. In such cases, mkldnn weight prepacking needs to be disabled to avoid running out of memory.

This PR adds a config option to enable or disable mkldnn weight prepacking.
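
A minimal usage sketch (not part of the PR itself), assuming the flag ends up as `cpp.weight_prepack` in `torch/_inductor/config.py` as the review below suggests; the exact attribute name in the merged version may differ, and the function below is purely illustrative:

```python
import torch
import torch._inductor.config as inductor_config

# Assumed flag name/location; check torch/_inductor/config.py in your build.
inductor_config.cpp.weight_prepack = False  # disable mkldnn weight prepacking

@torch.compile
def linear(x, w, b):
    return torch.nn.functional.linear(x, w, b)

x = torch.randn(8, 64)
w = torch.randn(128, 64)
b = torch.randn(128)
out = linear(x, w, b)  # compiled on CPU without mkldnn weight prepacking
```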

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @ezyang @soumith @msaroufim @wconstab @ngimel @bdhirsh @mlazos @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @Guobing-Chen @chunyuan-w @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @desertfire

@pytorch-bot bot commented Feb 1, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/93811

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 90cce21:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

torch/_inductor/mkldnn.py (outdated review thread, resolved)
@jgong5 (Collaborator) left a comment

On second thought, the prepack happens outside cpp codegen, so adding a `mkldnn_weight_prepack` at the top level sounds better.

@@ -78,6 +78,9 @@

comment_origin = False

# enable mkldnn weight prepacking to get a better performance; may lead to large memory cost
Suggested change:
- # enable mkldnn weight prepacking to get a better performance; may lead to large memory cost
+ # enable mkldnn weight prepacking to get a better performance; may lead to large memory footprint
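
For context, a hedged reconstruction of what the top-level addition under discussion might look like in `torch/_inductor/config.py`; the attribute name follows jgong5's suggestion above, and the default value is an assumption rather than a quote from the diff:

```python
# torch/_inductor/config.py (top-level placement being discussed)

# enable mkldnn weight prepacking to get a better performance;
# may lead to large memory footprint
mkldnn_weight_prepack = True  # assumed name and default
```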

@Valentine233 added the ciflow/trunk (Trigger trunk jobs on your pull request) label Feb 1, 2023
@Chillee (Contributor) left a comment

Can we move this under the cpp section?

@Valentine233 (Collaborator, Author) commented Feb 1, 2023

> Can we move this under the cpp section?

@Chillee This topic was mentioned by Jiong above. Since the weight prepack happens outside cpp codegen, it may not be appropriate to move it under the cpp section.

@Chillee (Contributor) commented Feb 1, 2023

But it's only relevant for CPU codegen, no?

@jgong5 (Collaborator) commented Feb 1, 2023

> But it's only relevant for CPU codegen, no?

Yes. I'm fine with keeping it inside the cpp section.

@Valentine233 (Collaborator, Author)

> But it's only relevant for CPU codegen, no?

@Chillee OK, changed.
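
A sketch of where the flag plausibly lives after this change, nested under the cpp section of `torch/_inductor/config.py`; the attribute name `weight_prepack` and its default are assumptions, not copied from the merged diff:

```python
# torch/_inductor/config.py (sketch; names assumed)

class cpp:
    # enable mkldnn weight prepacking to get a better performance;
    # may lead to large memory footprint
    weight_prepack = True
```

Keeping the flag under `cpp` groups it with the other CPU-only codegen options, which matches the observation above that prepacking is only relevant for CPU codegen.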

@Valentine233 (Collaborator, Author)

@pytorchbot merge

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status

ragulpr added a commit to ragulpr/pytorch that referenced this pull request Feb 2, 2023
…n-dev-setup

* origin: (898 commits)
  Move dynamo.optimizations.distributed to backends (pytorch#93408)
  Remove cuda 11.6 from nightly (pytorch#93979)
  Refactor dynamo register_backend/BACKENDS (pytorch#93389)
  Remove cuda 11.6 from CI replace with 11.7 (pytorch#93406)
  [Dynamo] Rename `GuardBuilder.guarded_code` -> `check_fn_manager` (pytorch#93934)
  Revert "Remove CUDA 11.6 from nightly builds (pytorch#93404)"
  Revert "[inductor] fix crash issue when input is a view tensor (pytorch#90150)"
  Basic Validation for FSDP `state_dict` transformations of modules with persistent buffers (pytorch#93396)
  Merge Inductor perf smoke test with other inductor CI tests (pytorch#93395)
  [inductor] Don't import torchvision (pytorch#93027)
  [FSDP][3/N] Refactor `summon_full_params` unit tests (pytorch#92298)
  [FSDP][2/N] `_summon_full_params` -> `_unshard_params` (pytorch#92297)
  Remove CUDA 11.6 from nightly builds (pytorch#93404)
  Mark buffers that reuse other buffers (pytorch#93329)
  Refactor to allow reuse of SchedulerNode.allocate (pytorch#93328)
  retire sparse_mask_helper (pytorch#91714)
  update fbgemm third party (pytorch#93907)
  [inductor] fix crash issue when input is a view tensor (pytorch#90150)
  [Inductor] add config for weight prepacking (pytorch#93811)
  Check for none for NNModuleVariable.__module__ (pytorch#93326)
  ...
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[Torch2 CPU] torch._inductor.ir: [WARNING] Using FallbackKernel: aten.cumsum (#93495)
7 participants