Add mkl implementation for exponential on CPU #69967

CaoE · 2021-12-15T07:21:49Z

Description

Add an MKL implementation of exponential on CPU to improve its performance.

Testing

data type: float32
single socket (28cores):

before: torch.Size([10, 128, 10, 124])  0.065 s
        torch.Size([10, 128, 20, 124])  0.130 s
            
after:  torch.Size([10, 128, 10, 124])  5.9e-05 s
        torch.Size([10, 128, 20, 124])  0.000113 s

single core:

before: torch.Size([10, 128, 10, 124])  0.065 s
        torch.Size([10, 128, 20, 124])  0.130 s 

after:  torch.Size([10, 128, 10, 124])  0.00117 s
        torch.Size([10, 128, 20, 124])  0.002347 s
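
For readers who want the ratios, the short sketch below turns the timings above into speedup factors. It is plain arithmetic on the reported numbers; the dictionary layout and key names are only illustrative.

```python
# Timings in seconds, copied from the results above: (before, after).
timings = {
    ("28 cores", "[10, 128, 10, 124]"): (0.065, 5.9e-05),
    ("28 cores", "[10, 128, 20, 124]"): (0.130, 0.000113),
    ("1 core", "[10, 128, 10, 124]"): (0.065, 0.00117),
    ("1 core", "[10, 128, 20, 124]"): (0.130, 0.002347),
}

# Speedup is simply the old time divided by the new time.
speedups = {key: before / after for key, (before, after) in timings.items()}

for (config, shape), factor in speedups.items():
    print(f"{config:8s} {shape}: {factor:,.0f}x faster")
```

On these numbers the speedup is roughly 1,100x on 28 cores and roughly 55x on a single core.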

cc @VitalyFedyunin @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10

pytorch-probot · 2021-12-15T07:21:51Z

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/CaoE/pytorch/blob/f3412187fedaa72f0f4f06a6ac68c10b279e260e/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default

Workflows	Labels (bold enabled)	Status
Triggered Workflows
linux-binary-conda	`ciflow/binaries`, `ciflow/binaries/conda`, `ciflow/default`	✅ triggered
linux-binary-libtorch-cxx11-abi	`ciflow/binaries`, `ciflow/binaries/libtorch`, `ciflow/default`	✅ triggered
linux-binary-libtorch-pre-cxx11	`ciflow/binaries`, `ciflow/binaries/libtorch`, `ciflow/default`	✅ triggered
linux-binary-manywheel	`ciflow/binaries`, `ciflow/binaries/wheel`, `ciflow/default`	✅ triggered
linux-bionic-py3.7-clang9	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/noarch`, `ciflow/trunk`, `ciflow/xla`	✅ triggered
linux-docs	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/docs`, `ciflow/linux`, `ciflow/trunk`	✅ triggered
linux-vulkan-bionic-py3.7-clang9	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/trunk`, `ciflow/vulkan`	✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/default`, `ciflow/linux`, `ciflow/trunk`	✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7-bazel-test	`ciflow/all`, `ciflow/bazel`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/trunk`	✅ triggered
linux-xenial-py3-clang5-mobile-build	`ciflow/all`, `ciflow/default`, `ciflow/linux`, `ciflow/mobile`, `ciflow/trunk`	✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static	`ciflow/all`, `ciflow/default`, `ciflow/linux`, `ciflow/mobile`, `ciflow/trunk`	✅ triggered
linux-xenial-py3.7-clang7-asan	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/sanitizers`, `ciflow/trunk`	✅ triggered
linux-xenial-py3.7-clang7-onnx	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/onnx`, `ciflow/trunk`	✅ triggered
linux-xenial-py3.7-gcc5.4	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/trunk`	✅ triggered
linux-xenial-py3.7-gcc7	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/trunk`	✅ triggered
linux-xenial-py3.7-gcc7-no-ops	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/trunk`	✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single	`ciflow/all`, `ciflow/android`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/trunk`	✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit	`ciflow/all`, `ciflow/android`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/trunk`	✅ triggered
win-vs2019-cpu-py3	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/trunk`, `ciflow/win`	✅ triggered
win-vs2019-cuda11.3-py3	`ciflow/all`, `ciflow/cuda`, `ciflow/default`, `ciflow/trunk`, `ciflow/win`	✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.7-gcc5.4	`ciflow/all`, `ciflow/cpu`, `ciflow/linux`, `ciflow/trunk`	🚫 skipped
docker-builds	`ciflow/all`, `ciflow/trunk`	🚫 skipped
ios-12-5-1-arm64	`ciflow/all`, `ciflow/ios`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
ios-12-5-1-arm64-coreml	`ciflow/all`, `ciflow/ios`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
ios-12-5-1-arm64-custom-ops	`ciflow/all`, `ciflow/ios`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
ios-12-5-1-arm64-full-jit	`ciflow/all`, `ciflow/ios`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
ios-12-5-1-arm64-metal	`ciflow/all`, `ciflow/ios`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
ios-12-5-1-x86-64	`ciflow/all`, `ciflow/ios`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
ios-12-5-1-x86-64-coreml	`ciflow/all`, `ciflow/ios`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
ios-12-5-1-x86-64-full-jit	`ciflow/all`, `ciflow/ios`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.7-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/libtorch`, `ciflow/linux`, `ciflow/trunk`	🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.7-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/libtorch`, `ciflow/linux`, `ciflow/trunk`	🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/linux`, `ciflow/slow`, `ciflow/trunk`	🚫 skipped
linux-bionic-rocm4.5-py3.7	`ciflow/all`, `ciflow/linux`, `ciflow/rocm`, `ciflow/trunk`	🚫 skipped
linux-docs-push	`ciflow/all`, `ciflow/cpu`, `ciflow/linux`, `ciflow/scheduled`	🚫 skipped
linux-xenial-cuda11.3-py3.7-gcc7-no-ops	`ciflow/all`, `ciflow/cuda`, `ciflow/linux`, `ciflow/trunk`	🚫 skipped
macos-10-15-py3-arm64	`ciflow/all`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64	`ciflow/all`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
macos-11-py3-x86-64	`ciflow/all`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
parallelnative-linux-xenial-py3.7-gcc5.4	`ciflow/all`, `ciflow/cpu`, `ciflow/linux`, `ciflow/trunk`	🚫 skipped
periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/libtorch`, `ciflow/linux`, `ciflow/scheduled`	🚫 skipped
periodic-libtorch-linux-xenial-cuda11.1-py3.7-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/libtorch`, `ciflow/linux`, `ciflow/scheduled`	🚫 skipped
periodic-linux-bionic-cuda11.5-py3.7-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/linux`, `ciflow/scheduled`	🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck	`ciflow/all`, `ciflow/cuda`, `ciflow/linux`, `ciflow/scheduled`, `ciflow/slow`, `ciflow/slow-gradcheck`	🚫 skipped
periodic-linux-xenial-cuda11.1-py3.7-gcc7-debug	`ciflow/all`, `ciflow/cuda`, `ciflow/linux`, `ciflow/scheduled`	🚫 skipped
periodic-win-vs2019-cuda11.1-py3	`ciflow/all`, `ciflow/cuda`, `ciflow/scheduled`, `ciflow/win`	🚫 skipped
periodic-win-vs2019-cuda11.5-py3	`ciflow/all`, `ciflow/cuda`, `ciflow/scheduled`, `ciflow/win`	🚫 skipped
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build	`ciflow/all`, `ciflow/android`, `ciflow/cpu`, `ciflow/linux`, `ciflow/trunk`	🚫 skipped

You can add a comment to the PR and tag @pytorchbot with the following commands:

# ciflow rerun, "ciflow/default" will always be added automatically
@pytorchbot ciflow rerun

# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and trigger the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow

For more information, please take a look at the CI Flow Wiki.

jbschlosser · 2021-12-15T22:06:02Z

Hey @mruberry - not sure who the best person to review this internally was - please feel free to reassign as appropriate :)

facebook-github-bot · 2021-12-16T00:31:31Z

🔗 Helpful links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/69967
✖️ Python docs build was skipped
✖️ C++ docs build was skipped
❓Need help or want to give feedback on the CI? Visit our office hours

✅ No Failures (0 Pending)

As of commit 54c8ef6 (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚

This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.

mruberry · 2021-12-16T21:52:36Z

cc @VitalyFedyunin

izaitsevfb · 2022-11-17T22:50:25Z

> https://github.com/facebookresearch/multimodal/blob/afe52dec1987e5a3764539606a55f430994ebbc8/test/modules/losses/test_mdetr_losses.py#L115
> May I know why fixed numbers are used here, since test_exponential_sample in test_distribution.py already uses scipy.stats to test sampling results for exponential?
> The fixed expected_l1_loss = torch.tensor(0.7721) and expected_giou_loss = torch.tensor(1.1768) are not applicable to the new exponential implementation.

Perhaps the test author @ebsmothers can answer that?

…nential implementation of PyTorch (#364)

Summary: pytorch/pytorch#69967 adds an MKL implementation of exponential to PyTorch. Although the new implementation still fits the exponential distribution, it does not produce the same samples as the old one. Change the expected values to accommodate the new exponential implementation of PyTorch.

Pull Request resolved: #364
Reviewed By: pikapecan
Differential Revision: D41430249
Pulled By: ebsmothers
fbshipit-source-id: c3333fd7f4449881a239c6b0c26d22fd63009b68
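
The point that two samplers can both follow the exponential distribution yet return different numbers for the same seed can be illustrated with a small standard-library sketch. The two sampler functions below are hypothetical stand-ins, not the old or new PyTorch kernels. They only show why hard-coded expected values break while statistical checks on mean and variance keep passing.

```python
import math
import random
import statistics


def exp_sampler_a(rng, lam):
    # Inverse-CDF transform: -log(1 - u) / lam with u in [0, 1).
    return -math.log(1.0 - rng.random()) / lam


def exp_sampler_b(rng, lam):
    # Same distribution, different arithmetic: -log(u) / lam.
    # Redraw in the unlikely case u == 0 to avoid log(0).
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return -math.log(u) / lam


def draw(sampler, seed, lam, n):
    rng = random.Random(seed)
    return [sampler(rng, lam) for _ in range(n)]


lam = 2.0
a = draw(exp_sampler_a, seed=0, lam=lam, n=100_000)
b = draw(exp_sampler_b, seed=0, lam=lam, n=100_000)

# Same seed, but the individual samples differ between the two samplers.
same_values = a[:5] == b[:5]

# Both still match the theoretical mean 1/lam and variance 1/lam**2.
mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
var_a, var_b = statistics.pvariance(a), statistics.pvariance(b)
```

A test that compares `mean_a` and `var_a` against `1 / lam` and `1 / lam**2` within a tolerance survives a change of sampler, whereas a test that pins `a[:5]` does not.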

CaoE · 2022-11-23T01:11:47Z

@malfet @izaitsevfb As suggested by @ebsmothers, we changed the expected values to match the exact values from the new exponential implementation (https://github.com/facebookresearch/multimodal/pull/364). Could you please check whether this PR will cause internal breakage?

facebook-github-bot · 2022-11-29T18:29:55Z

@izaitsevfb has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

izaitsevfb · 2022-11-29T18:32:04Z

> @malfet @izaitsevfb As suggested by @ebsmothers, we changed the expected values to match the exact values from the new exponential implementation (https://github.com/facebookresearch/multimodal/pull/364). Could you please check whether this PR will cause internal breakage?

I've manually imported this PR to run the internal tests. For Meta employees, see D41587318

malfet · 2022-11-29T18:36:23Z

@izaitsevfb you need to re-import again, as @CaoE just pushed several commits to the PR after the import.

izaitsevfb · 2022-11-29T18:39:07Z

> @izaitsevfb you need to re-import again, as @CaoE just pushed several commits to the PR after the import.

I think the GH timeline is misleading: the commits were made 2 hours ago, before the import. I'll double check via the code.

UPD: I believe we have the latest version imported.

izaitsevfb · 2022-11-30T18:18:07Z

@CaoE, there are some failures related to the values expected by internal models, which I can't share specifics about. Please don't merge this yet. I'm trying to find a PoC who can provide more context.

izaitsevfb · 2022-11-30T19:20:30Z

@CaoE, I've confirmed with the test owners, it's all good. The tests didn't expect the RNG change, but they can be forward fixed.

For the diff train oncall: please see the details in D41587318.

CaoE · 2022-12-05T08:11:50Z

> @CaoE, I've confirmed with the test owners, it's all good. The tests didn't expect the RNG change, but they can be forward fixed.

Thank you @izaitsevfb. May I know if these tests will be fixed for the new RNG in the near future?

izaitsevfb · 2022-12-06T02:43:54Z

> > @CaoE, I've confirmed with the test owners, it's all good. The tests didn't expect the RNG change, but they can be forward fixed.
>
> Thank you @izaitsevfb. May I know if these tests will be fixed for the new RNG in the near future?

I think there are plans to rewrite the tests to not rely on RNG specifics, but I don't have the details. Please don't worry about that, as I've confirmed that the internal test failures are expected and can be forward-fixed.

CaoE · 2022-12-12T09:38:34Z

> I think there are plans to rewrite the tests to not rely on RNG specifics, but I don't have the specifics. Please don't worry about that, as I've confirmed that the internal test failures are expected and can be forward-fixed.

@izaitsevfb That's great! Can we merge this PR first and then fix the tests?

izaitsevfb · 2022-12-13T00:02:56Z

> > I think there are plans to rewrite the tests to not rely on RNG specifics, but I don't have the specifics. Please don't worry about that, as I've confirmed that the internal test failures are expected and can be forward-fixed.
>
> @izaitsevfb That's great! Can we merge this PR first and then fix the tests?

Yep, that's what I meant, sorry for not being clear.

CaoE · 2022-12-13T09:49:42Z

@pytorchbot merge

pytorchmergebot · 2022-12-13T09:51:19Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

Summary: Due to a change in pytorch's random distribution calculation (see pytorch/pytorch#69967), `torch.multinomial` is producing different values for unit tests. `GenerationUtil` uses multinomial for autoregressive decoding and token prediction. The `test_sample` unit test now tests for output shape instead of exact tokens to mitigate sensitivities to random distributions.

Pull Request resolved: #403

Test Plan:
```
python -m pytest -v tests/utils/test_generate.py
tests/utils/test_generate.py::TestLogitsMask::test_normal PASSED [ 6%]
tests/utils/test_generate.py::TestLogitsMask::test_zero_dims PASSED [ 13%]
tests/utils/test_generate.py::TestLogitsMask::test_in_seq_only PASSED [ 20%]
tests/utils/test_generate.py::TestLogitsMask::test_out_seq_only PASSED [ 26%]
tests/utils/test_generate.py::TestGenerationUtil::test_model_eval_warning PASSED [ 33%]
tests/utils/test_generate.py::TestGenerationUtil::test_sample PASSED [ 40%]
tests/utils/test_generate.py::TestGenerationUtil::test_filter_logits PASSED [ 46%]
tests/utils/test_generate.py::TestLogitsFilterTopK::test_min_tokens_to_keep PASSED [ 53%]
tests/utils/test_generate.py::TestLogitsFilterTopK::test_top_k_invalid PASSED [ 60%]
tests/utils/test_generate.py::TestLogitsFilterTopK::test_default PASSED [ 66%]
tests/utils/test_generate.py::TestLogitsFilterTopK::test_top_k PASSED [ 73%]
tests/utils/test_generate.py::TestLogitsFilterTopP::test_min_tokens_to_keep PASSED [ 80%]
tests/utils/test_generate.py::TestLogitsFilterTopP::test_top_p_invalid PASSED [ 86%]
tests/utils/test_generate.py::TestLogitsFilterTopP::test_default PASSED [ 93%]
tests/utils/test_generate.py::TestLogitsFilterTopP::test_top_p PASSED [100%]
```

Reviewed By: pbontrager
Differential Revision: D43110663
Pulled By: RdoubleA
fbshipit-source-id: d6c8665cb1ea102352e09c4557ff6982ffa40f84
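
As a rough illustration of the approach described in that summary (checking output shape rather than exact tokens), here is a standard-library sketch. `sample_tokens` and `check_sample` are hypothetical helpers invented for this example; they are not the `GenerationUtil` API, and `random.choices` merely stands in for multinomial sampling.

```python
import random


def sample_tokens(probs, num_samples, rng):
    # Hypothetical stand-in for multinomial sampling over token ids.
    token_ids = list(range(len(probs)))
    return rng.choices(token_ids, weights=probs, k=num_samples)


def check_sample(tokens, num_samples, vocab_size):
    # Assert on properties that hold for any correct sampler,
    # not on the exact tokens a particular RNG stream happens to give.
    assert len(tokens) == num_samples
    assert all(0 <= t < vocab_size for t in tokens)


probs = [0.1, 0.2, 0.3, 0.4]
results = []
for seed in (0, 1, 2):
    tokens = sample_tokens(probs, 8, random.Random(seed))
    check_sample(tokens, 8, len(probs))
    results.append(tokens)
```

The check passes for every seed because it depends only on the length and range of the output, so a change in the underlying random number stream should not break it.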

pytorch-probot bot added the ciflow/default label Dec 15, 2021

CaoE force-pushed the exp_optimization2 branch 4 times, most recently from 8074406 to 0dfca39 Compare December 15, 2021 13:51

facebook-github-bot added the cla signed label Dec 15, 2021

pytorchbot added the open source label Dec 15, 2021

jbschlosser requested review from peterbell10, nikitaved and mruberry December 15, 2021 22:05

jbschlosser added the triaged label ("This issue has been looked at by a team member, and triaged and prioritized into an appropriate module") Dec 15, 2021

CaoE force-pushed the exp_optimization2 branch from 0dfca39 to f70c51f Compare December 16, 2021 04:06

mruberry requested review from VitalyFedyunin and removed request for nikitaved, peterbell10 and mruberry December 16, 2021 21:52

CaoE force-pushed the exp_optimization2 branch 3 times, most recently from b6c6ec9 to ea43e5f Compare December 31, 2021 07:41

CaoE force-pushed the exp_optimization2 branch 2 times, most recently from 9fd25c2 to a7abb44 Compare January 10, 2022 02:04

CaoE force-pushed the exp_optimization2 branch 5 times, most recently from f341218 to f75ba5b Compare January 24, 2022 03:01

CaoE force-pushed the exp_optimization2 branch from 093d308 to 3bd3445 Compare November 15, 2022 03:07

CaoE force-pushed the exp_optimization2 branch from 3bd3445 to 92a3cf1 Compare November 22, 2022 07:28

CaoE force-pushed the exp_optimization2 branch from 92a3cf1 to dcf51c6 Compare November 29, 2022 05:16

CaoE force-pushed the exp_optimization2 branch from dcf51c6 to 5091610 Compare December 13, 2022 03:19

pytorchmergebot closed this in eae0f3f Dec 13, 2022

CaoE added 6 commits December 13, 2022 11:17

add mkl implementation for exponential on CPU

8387b77

add mean and var test for exponential

3a38491

remove jit workaround for exponential

84b23c5

remove the vendor check

3c4a70e

add bfloat16 and float16 for test_exponential

ad70c6f

remove vendor check for bernoulli

5091610

RdoubleA mentioned this pull request Feb 2, 2023

[MUGEN] Update GenerateUtil test to remove multinomial dependency facebookresearch/multimodal#403

Closed

malfet mentioned this pull request Mar 11, 2024

Different Dropout behavior on macOS and Linux #121595

Open
