[vulkan] Efficient gemm implementation #49609

SS-JIA · 2020-12-18T19:47:54Z

Stack from ghstack:

#49609 [vulkan] Efficient gemm implementation

Differential Revision: D26209677

facebook-github-bot · 2020-12-18T19:48:08Z

💊 CI failures summary and remediations

As of commit 19eb3aa (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚

AshkanAliabadi

Thanks Stephen!

AshkanAliabadi · 2021-02-09T19:44:10Z

aten/src/ATen/native/vulkan/glsl/hw_to_image2d.glsl

+
+  if (all(lessThan(pos, uBlock.size.xyz))) {
+    const int base_x = 2*pos.x;
+    const int base_y = 2*pos.y;


Think in terms of vectors. I am not sure this will perform better on modern scalar GPUs with a SIMT architecture (it shouldn't be any worse), but it should perform better on older VLIW hardware.

By the way, swizzling in shaders is free.

const ivec2 base = 2 * pos.xy;

AshkanAliabadi · 2021-02-09T19:45:49Z

aten/src/ATen/native/vulkan/glsl/hw_to_image2d.glsl

+    const ivec4 index = base + ivec4(0, 1 ,uBlock.orig_size.x, uBlock.orig_size.x+1);
+
+    vec4 outvec = vec4(0,0,0,0);
+    if (base_x < uBlock.orig_size.x && base_y < uBlock.orig_size.y) {


This shader is not performance sensitive if it is just a one-time transformation, but branches are still expensive in shaders. In general, if you can rework the logic to avoid branches, it is better.
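One way to read the suggestion above is to clamp the load index so the fetch is always valid and then multiply the result by a 0/1 mask, instead of guarding the load with an `if`. The following is a minimal CPU-side sketch of that idea in C++, not the shader from this PR. The function name `pack_texel_h2w2` and the row-major input layout are assumptions made for illustration; the element order within a texel follows the `ivec4(0, 1, W, W+1)` offsets in the excerpt.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference: packs one 2x2 block of a row-major
// height x width matrix into a 4-component texel, in the order
// { m[y][x], m[y][x+1], m[y+1][x], m[y+1][x+1] } with x = 2*tx, y = 2*ty.
// Elements outside the matrix become 0 without a data-dependent branch:
// the coordinates are clamped so the load is always in range, and the
// loaded value is multiplied by a 0/1 mask. Assumes finite inputs
// (0 * NaN would still be NaN).
std::array<float, 4> pack_texel_h2w2(
    const std::vector<float>& m, int height, int width, int tx, int ty) {
  assert(height > 0 && width > 0 && tx >= 0 && ty >= 0);
  assert(m.size() == static_cast<std::size_t>(height) * width);

  std::array<float, 4> texel{};
  for (int i = 0; i < 4; ++i) {
    const int x = 2 * tx + (i & 1);   // column offset 0, 1, 0, 1
    const int y = 2 * ty + (i >> 1);  // row offset    0, 0, 1, 1
    const float in_bounds = static_cast<float>((x < width) & (y < height));
    const int cx = std::min(x, width - 1);
    const int cy = std::min(y, height - 1);
    texel[i] = in_bounds * m[static_cast<std::size_t>(cy) * width + cx];
  }
  return texel;
}
```

For a 3x3 input, the texel at `tx = 1, ty = 0` covers columns 2 and 3, so its second and fourth components fall outside the matrix and come back as 0.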

AshkanAliabadi · 2021-02-09T19:48:41Z

aten/src/ATen/native/vulkan/api/Context.h

+      const Shader::Descriptor& shader_descriptor,
+      const Shader::WorkGroup& global_work_group,
+      const Shader::WorkGroup& local_work_group_size,
+      Arguments&&... arguments);


Please delete the old version of this function, which does not take the local work group size explicitly, and keep only this new version. Then pass local_work_group_size (adapter->blah_blah(); I don't remember the name) explicitly at all call sites. We are going to need that flexibility anyway for tuning the local work group size.
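To illustrate why the local size matters at each call site: a Vulkan dispatch is issued in units of work groups, so the group count per axis is the global extent divided by the local size, rounded up. The sketch below assumes a plain three-component `WorkGroup` struct as a stand-in for `Shader::WorkGroup`; `div_up` and `group_count` are hypothetical helper names, not functions from this PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for Shader::WorkGroup; the real type may differ.
struct WorkGroup {
  uint32_t x;
  uint32_t y;
  uint32_t z;
};

// Integer division rounded up. Assumes n + d - 1 does not overflow.
constexpr uint32_t div_up(uint32_t n, uint32_t d) {
  return (n + d - 1u) / d;
}

// Number of work groups to dispatch per axis so that every invocation in
// the global extent is covered by some group of the given local size.
WorkGroup group_count(const WorkGroup& global, const WorkGroup& local) {
  assert(local.x > 0 && local.y > 0 && local.z > 0);
  return WorkGroup{
      div_up(global.x, local.x),
      div_up(global.y, local.y),
      div_up(global.z, local.z),
  };
}
```

Because the count is rounded up, some invocations can fall outside the global extent, which is why the shaders in this PR start with a bounds check such as `all(lessThan(pos, uBlock.size.xyz))`.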

AshkanAliabadi · 2021-02-09T19:49:14Z

aten/src/ATen/native/vulkan/api/Resource.h

+    VK_IMAGE_PACK_NC4HW_3D = 0,
+    VK_IMAGE_PACK_NC4HW_2D = 1,
+    VK_IMAGE_PACK_H2W2 = 2,
+  } VkImagePackFormat;


Do we still need this? Sorry, it may become apparent as I scroll down.

AshkanAliabadi · 2021-02-09T19:50:18Z

aten/src/ATen/native/vulkan/glsl/addmm.glsl

+      vec4 texel1 = texelFetch(uM1, ivec3(k, pos.y, pos.z), 0);
+      vec4 texel2 = texelFetch(uM2, ivec3(pos.x, k, pos.z), 0);
+      sum = fma(texel1.xxzz, texel2.xyxy, sum);
+      sum = fma(texel1.yyww, texel2.zwzw, sum);


Is this a by-product of our new packing?

SS-JIA · 2021-02-10

Yes, the new packing makes use of the entire input texel.
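A small check of what the two `fma` lines compute may help here. If each texel is read as a row-major 2x2 block `{x, y, z, w} = {m00, m01, m10, m11}` (an assumption consistent with the H2W2 packing offsets quoted earlier), then `sum += a.xxzz * b.xyxy` followed by `sum += a.yyww * b.zwzw` is exactly a 2x2 block product accumulated into `sum`. The C++ below mirrors the swizzles with arrays; `Vec4`, `accumulate_block`, and `matmul_2x2` are hypothetical names for this sketch.

```cpp
#include <array>

using Vec4 = std::array<float, 4>;  // components {x, y, z, w}

// Mirrors the shader's two fma lines:
//   sum = fma(a.xxzz, b.xyxy, sum);
//   sum = fma(a.yyww, b.zwzw, sum);
Vec4 accumulate_block(const Vec4& a, const Vec4& b, Vec4 sum) {
  const Vec4 a_xxzz = {a[0], a[0], a[2], a[2]};
  const Vec4 b_xyxy = {b[0], b[1], b[0], b[1]};
  const Vec4 a_yyww = {a[1], a[1], a[3], a[3]};
  const Vec4 b_zwzw = {b[2], b[3], b[2], b[3]};
  for (int i = 0; i < 4; ++i) {
    sum[i] += a_xxzz[i] * b_xyxy[i];
    sum[i] += a_yyww[i] * b_zwzw[i];
  }
  return sum;
}

// Plain 2x2 matrix product on row-major blocks, for comparison.
Vec4 matmul_2x2(const Vec4& a, const Vec4& b) {
  return Vec4{
      a[0] * b[0] + a[1] * b[2],  // c00
      a[0] * b[1] + a[1] * b[3],  // c01
      a[2] * b[0] + a[3] * b[2],  // c10
      a[2] * b[1] + a[3] * b[3],  // c11
  };
}
```

Under that reading, all four components of both input texels contribute to the output texel on every loop iteration, which matches the reply that the new packing uses the entire input texel.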

AshkanAliabadi · 2021-02-09T20:03:32Z

aten/src/ATen/native/vulkan/ops/Packing.cpp

+    },
+    v_src.options()
+  };
+  const struct {


Same comment regarding anonymous structs on GCC.

AshkanAliabadi · 2021-02-09T20:04:11Z

aten/src/ATen/native/vulkan/ops/Packing.cpp

+  };
+
+  uint32_t orig_w = output_sizes[output_sizes.size() - 1];
+  uint32_t orig_h = output_sizes[output_sizes.size() - 2];


const. const everywhere please. I am a const zealot. :)

AshkanAliabadi · 2021-02-09T20:06:37Z

aten/src/ATen/native/vulkan/ops/Packing.cpp

+  return v_src_unpacked;
+}
+
+vTensor unpack_image1x1(vTensor v_src, c10::SmallVector<int64_t, 4u> output_sizes, api::Context* context, api::Command::Buffer& command_buffer) {


Pass all objects larger than two machine words (2 x 64 bits on 64-bit platforms, 2 x 32 bits on 32-bit platforms) by [const] reference. I add a fudge factor of 2 because pointer chasing and dereferencing (which is effectively what references are: syntactic sugar for pointers) has a cost, so passing by reference is best avoided when the cost of passing by value is small.
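The rule of thumb above can be written down as a compile-time check. This is only a sketch of the guideline as stated in the comment: the trait name `kCheapToCopy`, the extra trivially-copyable requirement, and the `Extent2D` example type are assumptions for illustration and do not come from the PR.

```cpp
#include <cstdint>
#include <type_traits>
#include <vector>

// True when T fits in two machine words and copying it is a plain memcpy.
// The trivially-copyable requirement is an added assumption: a small type
// with a non-trivial copy constructor can still be expensive to copy.
template <typename T>
constexpr bool kCheapToCopy =
    std::is_trivially_copyable<T>::value && sizeof(T) <= 2 * sizeof(void*);

// Hypothetical small type: 8 bytes, so it is passed by value.
struct Extent2D {
  uint32_t w;
  uint32_t h;
};

uint32_t area(Extent2D e) {
  return e.w * e.h;
}

// A vector owns heap storage and is not trivially copyable, so it is
// passed by const reference.
int64_t numel(const std::vector<int64_t>& sizes) {
  int64_t n = 1;
  for (const int64_t s : sizes) {
    n *= s;
  }
  return n;
}
```

Applied to the signature under review, this would suggest taking `v_src` and `output_sizes` by const reference, assuming neither type is small and trivially copyable.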

AshkanAliabadi · 2021-02-09T20:09:08Z

aten/src/ATen/native/vulkan/ops/Packing.h

+vTensor pack_image2d_h2w2(vTensor v_src, api::Context* context, api::Command::Buffer& command_buffer);
+vTensor unpack_image2d_h2w2(vTensor v_src, c10::SmallVector<int64_t, 4u> output_sizes, api::Context* context, api::Command::Buffer& command_buffer);
+
+vTensor unpack_image1x1(vTensor v_src, c10::SmallVector<int64_t, 4u> output_sizes, api::Context* context, api::Command::Buffer& command_buffer);


If these functions are only used in a single implementation file, please remove the common header. Reason: software engineering is the art (since it is unfortunately not all science) and science of change management, and the bedrock of managing change is limiting scope. Limiting scope is the single most important tool software engineers have for getting a handle on entropy.

AshkanAliabadi · 2021-02-09T20:10:55Z

aten/src/ATen/test/vulkan_api_test.cpp


  const auto check = almostEqual(out_cpu, out_vulkan.cpu());
  if (!check) {
-    std::cout << "Expected:\n" << out_cpu << std::endl;
-    std::cout << "Got:\n" << out_vulkan.cpu() << std::endl;
+    showRtol(out_cpu, out_vulkan.cpu());


Please use this function in the other places as well.
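For readers without the diff at hand, a helper of this kind typically reports the largest observed difference next to the largest difference the comparison would accept, rather than dumping both tensors. The sketch below works on flat `std::vector<float>` buffers instead of tensors and is not the `showRtol` from this PR: the names `DiffReport`, `compute_diff_report`, and `show_rtol`, the default `rtol` of `1e-5`, and the choice to scale the tolerance by the largest magnitude are all assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iostream>
#include <vector>

struct DiffReport {
  float max_diff;     // largest |expected[i] - actual[i]|
  float max_allowed;  // rtol * largest magnitude seen in either buffer
};

// Hypothetical helper: summarizes how far `actual` is from `expected`.
DiffReport compute_diff_report(
    const std::vector<float>& expected,
    const std::vector<float>& actual,
    float rtol = 1e-5f) {
  assert(expected.size() == actual.size());
  float max_diff = 0.0f;
  float max_value = 0.0f;
  for (std::size_t i = 0; i < expected.size(); ++i) {
    max_diff = std::max(max_diff, std::fabs(expected[i] - actual[i]));
    max_value = std::max(
        {max_value, std::fabs(expected[i]), std::fabs(actual[i])});
  }
  return DiffReport{max_diff, rtol * max_value};
}

// Prints the summary in place of the full expected/actual dump.
void show_rtol(
    const std::vector<float>& expected, const std::vector<float>& actual) {
  const DiffReport r = compute_diff_report(expected, actual);
  std::cout << "Max diff allowed: " << r.max_allowed << '\n';
  std::cout << "Max diff found:   " << r.max_diff << '\n';
}
```

A two-line summary like this stays readable for large outputs, which is a plausible reason to prefer it over printing both tensors at every failing check.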

facebook-github-bot · 2021-02-12T02:26:50Z

@SS-JIA merged this pull request in 6385c13.

Summary: Pull Request resolved: pytorch#49609

Test Plan: Imported from OSS

Reviewed By: AshkanAliabadi

Differential Revision: D26209677

Pulled By: SS-JIA

fbshipit-source-id: 773a944559bf0deb3cf3e233d833220a12f9f2ab

[vulkan] Efficient gemm implementation

3fed5d3

[ghstack-poisoned]

facebook-github-bot added the cla signed label Dec 18, 2020

SS-JIA pushed a commit that referenced this pull request Dec 18, 2020

[vulkan] Efficient gemm implementation

09ef8e4

ghstack-source-id: 0fc0936 Pull Request resolved: #49609

Update on "[vulkan] Efficient gemm implementation"

d0eb653

[ghstack-poisoned]

Update on "[vulkan] Efficient gemm implementation"

25feaec

[ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Dec 19, 2020

[vulkan] Efficient gemm implementation

cc58ee5

ghstack-source-id: 48f4a4b Pull Request resolved: #49609

Update on "[vulkan] Efficient gemm implementation"

09aa7fc

[ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Dec 21, 2020

[vulkan] Efficient gemm implementation

43adc09

ghstack-source-id: b856914 Pull Request resolved: #49609

Update on "[vulkan] Efficient gemm implementation"

422dc2a

[ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Dec 22, 2020

[vulkan] Efficient gemm implementation

38e6265

ghstack-source-id: a974859 Pull Request resolved: #49609

Update on "[vulkan] Efficient gemm implementation"

3f5d3cd

[ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Dec 22, 2020

[vulkan] Efficient gemm implementation

fd6d5e7

ghstack-source-id: accedc6 Pull Request resolved: #49609

Update on "[vulkan] Efficient gemm implementation"

7f1dd01

[ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Dec 22, 2020

[vulkan] Efficient gemm and mean2d implementations

64dcff1

ghstack-source-id: e70be38 Pull Request resolved: #49609

Update on "[vulkan] Efficient gemm implementation"

6c7a11a

[ghstack-poisoned]

Update on "[vulkan] Efficient gemm implementation"

3135de3

[ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Dec 28, 2020

[vulkan] Efficient gemm and mean2d implementations

44ba243

ghstack-source-id: 2d55a26 Pull Request resolved: #49609

Update on "[vulkan] Efficient gemm implementation"

d3e431e

[ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Dec 29, 2020

[vulkan] Efficient gemm and mean2d implementations

2fa7a7e

ghstack-source-id: 19e047b Pull Request resolved: #49609

Update on "[vulkan] Efficient gemm implementation"

2677711

[ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Dec 30, 2020

[vulkan] Efficient gemm and mean2d implementations

f5f724d

ghstack-source-id: ad85e8a Pull Request resolved: #49609

Update on "[vulkan] Efficient gemm implementation"

fc07fcb

[ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Jan 5, 2021

[vulkan] Efficient gemm and mean2d implementations

3a056a7

ghstack-source-id: 550874d Pull Request resolved: #49609

Update on "[vulkan] Efficient gemm implementation"

75b89da

[ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Jan 6, 2021

[vulkan] Efficient gemm and mean2d implementations

9232439

ghstack-source-id: 5291806 Pull Request resolved: #49609

Update on "[vulkan] Efficient gemm implementation"

cc166d8

[ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Jan 7, 2021

[vulkan] Efficient gemm and mean2d implementations

9d77dcd

ghstack-source-id: afff895 Pull Request resolved: #49609

Update on "[vulkan] Efficient gemm implementation"

cc34bf1

[ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Jan 7, 2021

[vulkan] Efficient gemm and mean2d implementations

8165eb7

ghstack-source-id: 9ac8cff Pull Request resolved: #49609

Update on "[vulkan] Efficient gemm implementation"

378a302

[ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Jan 8, 2021

[vulkan] Efficient gemm and mean2d implementations

cd199b1

ghstack-source-id: 5065c40 Pull Request resolved: #49609

Update on "[vulkan] Efficient gemm implementation"

465b333

[ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Feb 2, 2021

[vulkan] Efficient gemm and mean2d implementations

2ba5bc4

ghstack-source-id: 14f2590 Pull Request resolved: #49609

SS-JIA requested a review from AshkanAliabadi February 2, 2021 19:15

Update on "[vulkan] Efficient gemm implementation"

65b3616

[ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Feb 2, 2021

[vulkan] Efficient gemm and mean2d implementations

5a19e45

ghstack-source-id: e4d9e29 Pull Request resolved: #49609

Update on "[vulkan] Efficient gemm implementation"

f4c6820

Differential Revision: [D26209677](https://our.internmc.facebook.com/intern/diff/D26209677) [ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Feb 9, 2021

[vulkan] Efficient gemm and mean2d implementations

a2c5678

ghstack-source-id: e48a05d Pull Request resolved: #49609

Update on "[vulkan] Efficient gemm implementation"

a771e6e

Differential Revision: [D26209677](https://our.internmc.facebook.com/intern/diff/D26209677) [ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Feb 9, 2021

[vulkan] Efficient gemm and mean2d implementations

eea604b

ghstack-source-id: 3dd6957 Pull Request resolved: #49609

AshkanAliabadi approved these changes Feb 9, 2021

SS-JIA mentioned this pull request Feb 10, 2021

[Vulkan] Streamline 2D Tensor packing workflow #52074

Closed

Update on "[vulkan] Efficient gemm implementation"

597eefa

Differential Revision: [D26209677](https://our.internmc.facebook.com/intern/diff/D26209677) [ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Feb 10, 2021

[vulkan] Efficient gemm and mean2d implementations

9c64970

ghstack-source-id: b5f1c5d Pull Request resolved: #49609

Update on "[vulkan] Efficient gemm implementation"

1d4faa8

Differential Revision: [D26209677](https://our.internmc.facebook.com/intern/diff/D26209677) [ghstack-poisoned]

Update on "[vulkan] Efficient gemm implementation"

a194312

Differential Revision: [D26209677](https://our.internmc.facebook.com/intern/diff/D26209677) [ghstack-poisoned]

Update on "[vulkan] Efficient gemm implementation"

19eb3aa

Differential Revision: [D26209677](https://our.internmc.facebook.com/intern/diff/D26209677) [ghstack-poisoned]

SS-JIA pushed a commit that referenced this pull request Feb 11, 2021

[vulkan] Efficient gemm and mean2d implementations

b8ede16

ghstack-source-id: 2bea274 Pull Request resolved: #49609

facebook-github-bot closed this in 6385c13 Feb 11, 2021

facebook-github-bot added the Merged label Feb 12, 2021

facebook-github-bot deleted the gh/SS-JIA/15/head branch February 15, 2021 15:18
