Add AVX512 support in ATen & remove AVX support #56992

Closed
wants to merge 172 commits into from
Changes from 111 commits
Commits
2fc5d65
Rename namespace vec256 to vec
Apr 27, 2021
45cfd55
Disable AVX
imaginary-person Apr 27, 2021
ee07190
vec is the new vec256
imaginary-person Apr 27, 2021
46cb5b0
Fix aot test
imaginary-person Apr 27, 2021
7b69acc
Disable AVX for Bazel builds
imaginary-person Apr 27, 2021
7a56c94
Add AVX512 to aten.bzl later
imaginary-person Apr 27, 2021
66fd585
Fix test_cpp_extensions_aot_no_ninja
imaginary-person Apr 27, 2021
9d07871
Merge branch 'only_vec' of https://github.com/imaginary-person/pytorc…
imaginary-person Apr 27, 2021
66b7c89
Trigger CI to check if builds are using phantom code again
imaginary-person Apr 27, 2021
e96e3ce
Get latest changes from the main repo
imaginary-person Apr 27, 2021
82ae5fa
Undo a change by #56704
imaginary-person Apr 27, 2021
f104e2a
Undo change introduced by 55380
imaginary-person Apr 27, 2021
eebdcde
AVX512 support added
imaginary-person Apr 30, 2021
dd5af26
Fixed a few bugs
imaginary-person May 3, 2021
4c5bd83
[skip ci] I must've been half asleep
imaginary-person May 3, 2021
4749bb9
[skip ci] Used pen & paper to figure out sequence
imaginary-person May 3, 2021
3f0e229
Fix typo. All unit tests pass now
imaginary-person May 4, 2021
9f53242
Copy-paste file from local repo
imaginary-person May 4, 2021
62484ab
Functional simplified
limo1996 May 4, 2021
e312546
functional.h unified & simplified by @Limo1996
imaginary-person May 4, 2021
0694f8d
Fix typo
imaginary-person May 6, 2021
61ef16f
hadd_pd and hsub_pd improved
limo1996 May 8, 2021
ff89feb
Refactoring of `hadd_pd` & `hsub_pd` by @limo1996
imaginary-person May 8, 2021
b42dda4
hadd and hsub implemented with permutes
limo1996 May 10, 2021
ca82027
Revert "hadd and hsub implemented with permutes"
limo1996 May 10, 2021
7c60903
hadd and hsub implemented with permutes
limo1996 May 10, 2021
2c05a84
@limo1996 optimized hadd and hsub with permutes
imaginary-person May 10, 2021
e74f4a8
Merge with latest code in the master repo
imaginary-person May 13, 2021
93c96cc
Fix newlines & Clang compilation errors
May 13, 2021
89bd4a4
Merge branch 'master' of https://github.com/pytorch/pytorch into only…
May 13, 2021
5b67fb3
Added support for some older clang versions
imaginary-person May 13, 2021
2b2184f
Fix include files regression
imaginary-person May 13, 2021
ac4df02
Fix typo
imaginary-person May 14, 2021
e0998ec
Merge latest code from the master repo
imaginary-person May 15, 2021
5c078a4
Merge latest code from the main repo into only_vec
imaginary-person May 15, 2021
4608691
Somehow third party libs had gotten added
imaginary-person May 15, 2021
a08a982
Add new commits from main repo
imaginary-person May 15, 2021
3a6afe3
Get latest code from main repo
imaginary-person May 15, 2021
f610ca3
Update jenkins test script
imaginary-person May 16, 2021
24b13ac
Update jenkins test script
imaginary-person May 16, 2021
0d0cfff
Maybe jenkins script update only works for project members
imaginary-person May 16, 2021
4581bbc
Try adding NO_AVX512 to test.sh
imaginary-person May 16, 2021
608e171
Update gen_dependent_configs()
imaginary-person May 16, 2021
8026cd8
Update test.sh
imaginary-person May 16, 2021
fb49193
Add `CPU capability usage` string for AVX512
imaginary-person May 16, 2021
a4f4f6b
Rollback pytorch_build_definitions.py
imaginary-person May 16, 2021
5b22e26
Just checking if `-mno-avx512f` flag would help here
imaginary-person May 16, 2021
f4995c6
Add include/ATen/cpu/vec/vec512/*.h'
imaginary-person May 16, 2021
52a091e
Fix silent-duplicate-symbol-resolution of ATen symbols
imaginary-person May 17, 2021
79d209a
[skip ci] Skip test_lkj_cholesky_log_prob on MacOS
imaginary-person May 17, 2021
8934692
[skip ci] Get latest commits from the main repo
imaginary-person May 17, 2021
b8fae2c
[skip ci] Unskip skipped test
imaginary-person May 17, 2021
0bddf95
Rename Vectorize to Vectorized
imaginary-person May 18, 2021
c86bef9
Merge latest commits of the main repo's master branch
imaginary-person May 18, 2021
893b739
Try if this compiles with MSVC
imaginary-person May 19, 2021
8f3426e
Fix typo
imaginary-person May 21, 2021
be54f15
Fix hadd_ps() & hsub_ps()
imaginary-person May 21, 2021
28ad4db
Revert hsub_ps & hadd_ps but remove post-hsub/hadd permutes
imaginary-person May 21, 2021
adb894c
Fix BFloat16
imaginary-person May 25, 2021
6b4bccc
BFloat16 completely fixed
imaginary-person May 25, 2021
f368cc9
Make vec512_float.h parallel to vec512_double.h
imaginary-person May 25, 2021
d239f24
Rename `reduction128` to `vectorized_reduction`
imaginary-person May 25, 2021
bb624a9
`Sum` kernel requires `ilp_factor` 2 for AVX512
imaginary-person May 26, 2021
2aa9e9d
Add AVX512 code to quantized kernels.
imaginary-person May 26, 2021
16c08e0
Merge branch 'master' into only_vec
imaginary-person May 26, 2021
6915ed3
Remove whitespace to fix style
imaginary-person May 26, 2021
5199111
[skip ci] Revert change in comment
imaginary-person May 26, 2021
b143709
[skip ci] Revert typo in SoftMax.cu
imaginary-person May 26, 2021
04578d3
[skip ci] Revert typo
imaginary-person May 26, 2021
2f7f14d
[skip ci] Revert typo
imaginary-person May 26, 2021
3f0d604
[skip ci] Revert changes in this file
imaginary-person May 26, 2021
8bfa43f
[skip ci] Remove empty line
imaginary-person May 26, 2021
d2016b8
[skip ci] Remove empty line
imaginary-person May 26, 2021
5104181
Fix compilation error for the ROCm build
imaginary-person May 27, 2021
661efe2
[Temporary] Test AVX512 kernels on CI for more potential failures
imaginary-person May 27, 2021
e05ddb1
Remove trailing space
imaginary-person May 27, 2021
34232fe
Merge branch 'pytorch:master' into only_vec
imaginary-person May 27, 2021
36a0fab
Trigger AVX512 Linux Py3.8 coverage test1, test2 CI
imaginary-person May 27, 2021
0acfba4
avx_mathfun.h uses AVX512 as well now
imaginary-person May 28, 2021
80d83ee
Proactively merge #55202 into this draft
imaginary-person May 28, 2021
f6f16e6
Fix Windows builds
imaginary-person May 28, 2021
c7842ba
Get latest code from the master repo
imaginary-person May 29, 2021
73b1e6f
Revert LibKineto to its original state
imaginary-person May 30, 2021
98f77a6
Revise DistributionTemplates.h
imaginary-person May 30, 2021
d786b6b
Merge branch 'master' of https://github.com/pytorch/pytorch into only…
imaginary-person Jun 1, 2021
aa9821f
[skip ci] Make normal_fill produce produce same results for AVX512 & …
imaginary-person Jun 1, 2021
7e8e779
[skip ci] Prefetch the next cache line
imaginary-person Jun 1, 2021
756371e
[skip ci] Fix quantization kernels
imaginary-person Jun 9, 2021
36556b3
Delete Vec256 functional.h
imaginary-person Jun 14, 2021
76acb5b
[skip ci] [skip-ci] Add AVX512_256
imaginary-person Jun 14, 2021
a06ea95
[skip ci] [skip-ci] Change build settings for MSVC
imaginary-person Jun 14, 2021
187d4b2
Resolve merge conflict
imaginary-person Jun 14, 2021
4cab3d9
Merge branch 'master' of https://github.com/pytorch/pytorch into only…
imaginary-person Jun 14, 2021
6fc557a
Fix auto-merge
imaginary-person Jun 15, 2021
2ceaf6f
Make AVX512_256 default & skip 3 tests
imaginary-person Jun 15, 2021
5cbca05
Merge latest code from the master branch
imaginary-person Jun 15, 2021
9d11f42
Skip 3 tests
imaginary-person Jun 15, 2021
fad9d9c
Windows does not support setenv
imaginary-person Jun 15, 2021
4b23a91
ATEN_CPU_CAPABILITY is set by compute_cpu_capability() lazily
imaginary-person Jun 15, 2021
2883f7c
Fix dispatch
imaginary-person Jun 15, 2021
d2d82c5
Remove AVX512_256
imaginary-person Jun 16, 2021
47515c9
Change variable name
imaginary-person Jun 16, 2021
9f1341e
Merge branch 'master' of https://github.com/pytorch/pytorch into only…
imaginary-person Jun 16, 2021
545223d
SoftMaxKernel was modified recently
imaginary-person Jun 18, 2021
d9ac0c3
Merge branch 'master' of https://github.com/pytorch/pytorch into only…
imaginary-person Jun 18, 2021
c8c5780
Add changes in vec256 to vec512
imaginary-person Jun 18, 2021
2d7b336
Revert normal_fill to use AVX2
imaginary-person Jun 18, 2021
55164ab
SoftMaxKernel was changed recently
imaginary-person Jun 18, 2021
c60f5a1
Fix build flag
imaginary-person Jun 18, 2021
625515c
`test_sum_vs_numpy_cpu_float16` fails with `ilp
imaginary-person Jun 19, 2021
19c94ea
Continue Windows tests through error
imaginary-person Jun 20, 2021
12cf8fc
Disabled AVX512 kernels for sum & nansum, created GHA for pytorch-lin…
imaginary-person Jun 22, 2021
429d8fe
Add missing newline
imaginary-person Jun 22, 2021
632236e
Merge branch 'master' of https://github.com/pytorch/pytorch into only…
imaginary-person Jun 22, 2021
9165f8f
Update pytorch-linux-bionic-py3.8-gcc9-coverage.yml workflow
imaginary-person Jun 22, 2021
f880b43
Fix flake8
imaginary-person Jun 22, 2021
beb1026
Try to fix mypy for Python 3.6
imaginary-person Jun 22, 2021
ac8e281
pytorch-linux-bionic-py3.8-gcc9-coverage.yml should be run in PRs as …
imaginary-person Jun 22, 2021
f55e559
Have 2 test shards for pytorch-linux-bionic-py3.8-gcc9-coverage
imaginary-person Jun 22, 2021
8e78bab
Fix spacing
imaginary-person Jun 22, 2021
eaef540
Skip test_compare_model_outputs_functional_static on Windows
imaginary-person Jun 22, 2021
739445e
`test_compare_model_stub_functional_static` also failed on Windows
imaginary-person Jun 22, 2021
287a211
Don't use AVX512 ATen Quantized kernels on Windows
imaginary-person Jun 22, 2021
355ed41
Merge branch 'master' of https://github.com/pytorch/pytorch into only…
imaginary-person Jun 22, 2021
012a798
Fix whitespace in GridSampler.cu
imaginary-person Jun 22, 2021
6359193
Revise build condition for AVX512_256
imaginary-person Jun 22, 2021
1d1d8eb
Add recent changes from 55202
imaginary-person Jun 22, 2021
2291e20
Remove redundant line
imaginary-person Jun 22, 2021
ef440b4
Merge branch 'master' of https://github.com/pytorch/pytorch into only…
imaginary-person Jun 22, 2021
72be33c
Break vec/functional.h into two files
imaginary-person Jun 22, 2021
e42e312
Merge branch 'master' of https://github.com/pytorch/pytorch into only…
imaginary-person Jun 22, 2021
919d513
Add trailing newline at the end of the file
imaginary-person Jun 22, 2021
3ecb2e6
Merge master
imaginary-person Jun 24, 2021
a6ca2c6
Fix flake8 & build
imaginary-person Jun 25, 2021
db5db2b
Fix style
imaginary-person Jun 25, 2021
f7e4fb2
Fix copy-paste bug
imaginary-person Jun 28, 2021
da38985
Merge branch 'master' of https://github.com/pytorch/pytorch into only…
imaginary-person Jun 28, 2021
ad8939a
Make diff simpler
imaginary-person Jun 28, 2021
04e62ad
[SKIP CI] [SKIP-CI] Had forgotten to change line after removing some …
imaginary-person Jun 30, 2021
503c70b
Remove `CPU_CAPABILITY_AVX2` from vec512_base.h
imaginary-person Jun 30, 2021
977ae7b
Merge clauses in intrinsics.h
imaginary-person Jun 30, 2021
6dccb16
Make duplicate comment concise by referring to the original copy
imaginary-person Jun 30, 2021
ada4606
Make duplicate comment concise by referring to the original copy
imaginary-person Jun 30, 2021
3c94696
Merge branch 'master' of https://github.com/pytorch/pytorch into only…
imaginary-person Jun 30, 2021
ff34f09
Revise comments
imaginary-person Jun 30, 2021
89de464
Remove duplicate code
imaginary-person Jun 30, 2021
5cc5016
SumKernel has been fixed now
imaginary-person Jun 30, 2021
197ca83
Revise comments
imaginary-person Jun 30, 2021
82c5b9d
SumKernel.cpp also has intrinsics now
imaginary-person Jun 30, 2021
97a7773
Get changes from the master branch
imaginary-person Jun 30, 2021
0e5116d
VC2019 toolchain wasn't found, so retriggering build
imaginary-person Jun 30, 2021
1abb401
Merge branch 'pytorch:master' into only_vec
imaginary-person Jun 30, 2021
3fe9fd0
A recent change was made to GHA workflows
imaginary-person Jun 30, 2021
2773bce
Merge master branch
imaginary-person Jul 2, 2021
6a2553b
Merge master branch
imaginary-person Jul 2, 2021
068ad8e
Remove unnecessary intrinsics' casts & change an AVX2 intrinsic
imaginary-person Jul 6, 2021
f6f0032
Fix comments & line-endings
imaginary-person Jul 7, 2021
fcf0a25
Merge branch 'master' of https://github.com/pytorch/pytorch into only…
imaginary-person Jul 7, 2021
5036dca
Add AVX512BW check to CMake
imaginary-person Jul 7, 2021
65ff331
clear whitespace
imaginary-person Jul 7, 2021
79fddfe
Fix MacOS build
imaginary-person Jul 7, 2021
7f03a23
Unified vec_base
imaginary-person Jul 9, 2021
432474d
Merge master
imaginary-person Jul 9, 2021
86d6630
Merge master
imaginary-person Jul 9, 2021
101a6a6
Improve code readability
imaginary-person Jul 9, 2021
c08980e
Improve code readability
imaginary-person Jul 9, 2021
72e8032
Merge branch 'master' of https://github.com/pytorch/pytorch into only…
imaginary-person Jul 9, 2021
914cb7b
Fix whitespace
imaginary-person Jul 9, 2021
d773a77
Fix elif
imaginary-person Jul 9, 2021
cfcd9e2
Merge master
imaginary-person Jul 20, 2021
ecbc8de
Fix Windows build
imaginary-person Jul 20, 2021
6a059f0
Fix Windows build
imaginary-person Jul 20, 2021
4 changes: 2 additions & 2 deletions .circleci/config.yml
@@ -7170,14 +7170,14 @@ workflows:
             - pytorch_linux_bionic_py3_8_gcc9_coverage_build
           build_environment: "pytorch-linux-bionic-py3.8-gcc9-coverage-test1"
           docker_image: "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-py3.8-gcc9"
-          resource_class: large
+          resource_class: xlarge
       - pytorch_linux_test:
           name: pytorch_linux_bionic_py3_8_gcc9_coverage_test2
           requires:
             - pytorch_linux_bionic_py3_8_gcc9_coverage_build
           build_environment: "pytorch-linux-bionic-py3.8-gcc9-coverage-test2"
           docker_image: "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-py3.8-gcc9"
-          resource_class: large
+          resource_class: xlarge
       - pytorch_linux_build:
           name: pytorch_linux_bionic_rocm3_9_py3_6_build
           requires:
6 changes: 3 additions & 3 deletions .jenkins/pytorch/test.sh
@@ -128,10 +128,10 @@ if [[ "$BUILD_ENVIRONMENT" == *asan* ]]; then
     (cd test && ! get_exit_code python -c "import torch; torch._C._crash_if_aten_asan(3)")
 fi

-if [[ "${BUILD_ENVIRONMENT}" == *-NO_AVX-* ]]; then
+if [[ "${BUILD_ENVIRONMENT}" == *-NO_AVX-* ]] || [[ "${BUILD_ENVIRONMENT}" == *-NO_AVX2-* ]]; then
   export ATEN_CPU_CAPABILITY=default
-elif [[ "${BUILD_ENVIRONMENT}" == *-NO_AVX2-* ]]; then
-  export ATEN_CPU_CAPABILITY=avx
+elif [[ "${BUILD_ENVIRONMENT}" == *-NO_AVX512-* ]]; then
+  export ATEN_CPU_CAPABILITY=avx2
 fi

 if [ -n "$IN_PULL_REQUEST" ] && [[ "$BUILD_ENVIRONMENT" != *coverage* ]]; then
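The hunk above only sets the override from CI; something on the ATen side has to read it. A minimal sketch of such a consumer follows. The strings `default` and `avx2` are taken from the hunk, while `avx512` is an assumption. The names `CPUCapability` and `parse_capability` are hypothetical and not taken from the PR.

```cpp
#include <cstring>

// Hypothetical mirror of the capability levels left after this PR:
// AVX is removed, AVX512 is added.
enum class CPUCapability { DEFAULT, AVX2, AVX512 };

// Map an override string (e.g. the value of ATEN_CPU_CAPABILITY) to a
// capability level. A missing or unrecognized value falls back to
// `detected`, i.e. whatever a hardware probe reported. That fallback is
// a choice made for this sketch, not behaviour confirmed by the diff.
CPUCapability parse_capability(const char* value, CPUCapability detected) {
  if (value == nullptr) return detected;
  if (std::strcmp(value, "avx512") == 0) return CPUCapability::AVX512;  // assumed string
  if (std::strcmp(value, "avx2") == 0) return CPUCapability::AVX2;
  if (std::strcmp(value, "default") == 0) return CPUCapability::DEFAULT;
  return detected;  // includes the retired "avx" value
}
```

In this sketch the old `avx` value is no longer recognized, which is why the `NO_AVX2` builds in `test.sh` now export `default` instead.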
2 changes: 1 addition & 1 deletion .jenkins/pytorch/win-test-helpers/test_python.bat
@@ -1,3 +1,3 @@
 call %SCRIPT_HELPERS_DIR%\setup_pytorch_env.bat
-cd test && python run_test.py --exclude-jit-executor --verbose --determine-from="%1" && cd ..
+cd test && python run_test.py --exclude-jit-executor --verbose --continue-through-error --determine-from="%1" && cd ..
 if ERRORLEVEL 1 exit /b 1
3 changes: 1 addition & 2 deletions aten.bzl
@@ -1,9 +1,8 @@
 load("@rules_cc//cc:defs.bzl", "cc_library")

-CPU_CAPABILITY_NAMES = ["DEFAULT", "AVX", "AVX2"]
+CPU_CAPABILITY_NAMES = ["DEFAULT", "AVX2"]
 CAPABILITY_COMPILER_FLAGS = {
     "AVX2": ["-mavx2", "-mfma"],
-    "AVX": ["-mavx"],
     "DEFAULT": [],
 }

2 changes: 1 addition & 1 deletion aten/src/ATen/CMakeLists.txt
@@ -41,7 +41,7 @@ if(NOT BUILD_LITE_INTERPRETER)
 endif()
 EXCLUDE(ATen_CORE_SRCS "${ATen_CORE_SRCS}" ${ATen_CORE_TEST_SRCS})

-file(GLOB base_h "*.h" "detail/*.h" "cpu/*.h" "cpu/vec/vec256/*.h" "cpu/vec/*.h" "quantized/*.h")
+file(GLOB base_h "*.h" "detail/*.h" "cpu/*.h" "cpu/vec/vec512/*.h" "cpu/vec/vec256/*.h" "cpu/vec/*.h" "quantized/*.h")
 file(GLOB base_cpp "*.cpp" "detail/*.cpp" "cpu/*.cpp")
 file(GLOB cuda_h "cuda/*.h" "cuda/detail/*.h" "cuda/*.cuh" "cuda/detail/*.cuh")
 file(GLOB cuda_cpp "cuda/*.cpp" "cuda/detail/*.cpp")
6 changes: 3 additions & 3 deletions aten/src/ATen/Version.cpp
@@ -108,12 +108,12 @@ std::string used_cpu_capability() {
     case native::CPUCapability::DEFAULT:
       ss << "NO AVX";
       break;
-    case native::CPUCapability::AVX:
-      ss << "AVX";
-      break;
     case native::CPUCapability::AVX2:
       ss << "AVX2";
       break;
+    case native::CPUCapability::AVX512:
+      ss << "AVX512";
+      break;
 #endif
     default:
       break;
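For reference, the post-change mapping in this hunk can be read as a small standalone function. This is only a sketch: `CPUCapability` here is a local stand-in for `native::CPUCapability`, and `capability_name` is a hypothetical helper, not a function from the PR.

```cpp
#include <string>

// Local stand-in for native::CPUCapability after the change (assumption:
// only these three levels are relevant on x86 builds).
enum class CPUCapability { DEFAULT, AVX2, AVX512 };

// Return the label that the switch in used_cpu_capability() would print
// for each level after this hunk is applied.
std::string capability_name(CPUCapability cap) {
  switch (cap) {
    case CPUCapability::DEFAULT:
      return "NO AVX";
    case CPUCapability::AVX2:
      return "AVX2";
    case CPUCapability::AVX512:
      return "AVX512";
  }
  return "";  // unreachable for valid enum values
}
```

With `AVX` removed from the enum, any caller still matching on the string "AVX" would need updating; the diff shown here does not say whether such callers exist.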
5 changes: 5 additions & 0 deletions aten/src/ATen/cpu/FlushDenormal.cpp
@@ -1,6 +1,11 @@
#include <ATen/cpu/FlushDenormal.h>

#if defined(CPU_CAPABILITY_AVX512)
#include <ATen/cpu/vec/vec512/intrinsics.h>
#else
#include <ATen/cpu/vec/vec256/intrinsics.h>
#endif

#include <cpuinfo.h>

namespace at { namespace cpu {
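The `#if defined(CPU_CAPABILITY_AVX512)` guard in this hunk selects a header per translation unit at compile time. The same pattern can select a vector width, as in the sketch below. The macro name comes from the hunk; `kVectorBytes` and `lanes` are hypothetical names introduced for illustration and are not ATen API.

```cpp
#include <cstddef>

// Pick the register width from the same macro the hunk tests. The build
// system is assumed to define CPU_CAPABILITY_AVX512 only for translation
// units compiled with AVX512 flags.
#if defined(CPU_CAPABILITY_AVX512)
constexpr std::size_t kVectorBytes = 64;  // 512-bit registers
#else
constexpr std::size_t kVectorBytes = 32;  // 256-bit registers
#endif

// Number of elements of type T that fit in one vector register.
template <typename T>
constexpr std::size_t lanes() {
  return kVectorBytes / sizeof(T);
}

// Holds for either width: a register has twice as many float lanes as
// double lanes.
static_assert(lanes<float>() == 2 * lanes<double>(),
              "float lanes should be double the double lanes");
```

Compiled without the macro this gives 8 float lanes; with it, 16. Because the choice is made by the preprocessor, the same source file can be built once per capability and the right variant chosen at run time.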