Make JIT not assume that the device is CUDA. #54238
Conversation
💊 CI failures summary (as of commit 70087ff, Dr. CI): 💚 Looks good so far! There are no failures yet.
Force-pushed from 70953c2 to 11d3e80
@ezyang Could you please help review? Thanks.
  struct ArgumentInfo {
    friend struct ArgumentSpec;
-   using plain_data_type = uint32_t;
+   using plain_data_type = uint64_t;
Huh, what's going on with this?
Ah, the struct size increased.
@@ -61,6 +55,8 @@ struct ArgumentInfo {
  int device_ : 8; // NOTE: this needs to be signed because we use -1 to
This comment is out of date now.
I will update the comment accordingly.
- uint32_t total_dims; // all TensorInfoPODs are in CompleteArgumentSpec's
+ unsigned dev_type : 16;
+ unsigned
+     total_dims : 16; // all TensorInfoPODs are in CompleteArgumentSpec's
This type is narrower than the old type.
Is it large enough to hold the total number of dimensions across all tensor operands?
I will add an overflow check when saving the dimension count to CompleteArgumentInfoPOD::total_dims.
We seem to have both device and dev_type (30 bits in total) while ATen Device only needs 16 bits. Are both necessary?
This seems OK but it should really get a review from the JIT side. @SplitInfinity, do you think you know who could look at this?
Adding @ZolotukhinM @Krovatkin based on GitHub recommendation. I know they work on/used to work on that part of the codebase. If neither of you is the right reviewer, feel free to tag the right person and remove yourselves.
Force-pushed from 3860a26 to d99d232
@ZolotukhinM @Krovatkin Could you comment on this PR? Thanks a lot.
@ezyang & @SplitInfinity
@gujinghui sorry for the delay, I'll take a look tomorrow!!!
@Krovatkin Thanks for your reply. Could you give suggestions on this PR? Thanks.
@chengjunlu please help resolve the conflict.
Force-pushed from d99d232 to 70087ff
The changes have been rebased onto the latest main trunk.
Codecov Report
@@ Coverage Diff @@
## master #54238 +/- ##
=======================================
Coverage 76.47% 76.47%
=======================================
Files 1992 1992
Lines 199858 199866 +8
=======================================
+ Hits 152840 152850 +10
+ Misses 47018 47016 -2
@ezyang @Krovatkin
I just pinged @Krovatkin; if there is still no action in a few days I'll just start landing this.
@ezyang could you help merge this PR? Thanks so much. :)
ArgumentSpec seems to be an overly clever structure, which makes it a bit hard to reason about (at least for me). Since we are sticking with int64_t as the element size, we should be okay modulo a few nitpicks.
  unsigned type_ : 8;
  unsigned dev_type_ : 16;
  unsigned : 16;
ATen Device needs 16 bits for both an index and a device type; here we seem to be using 8 + 16 = 24?
DeviceType and DeviceIndex were narrowed in commit #47023.
I will change the code to align with the new size of at::Device accordingly.
@@ -271,6 +274,10 @@ struct CompleteArgumentSpec {
      }
    }
    // each POD has a running tally of all dimensions including its own
+   TORCH_CHECK(
I hope we won't hit this limit. Practically, it could probably only happen with code-generated models with lots of arguments.
-   arg.device_ = t->is_cuda() ? t->get_device() : -1;
+   at::Device device = t->device();
+   arg.dev_type_ =
+       static_cast<std::underlying_type<DeviceType>::type>(device.type());
This will be a widening (but unsigned) cast, so we should be okay.
@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@Krovatkin merged this pull request in db90533.
Summary: Decouple the JIT argument spec and shape analysis from CUDA.
Pull Request resolved: pytorch#54238
Reviewed By: ngimel
Differential Revision: D28802085
Pulled By: Krovatkin
fbshipit-source-id: 4068c9460cdec2d80733f001ca90ea3f5e6d3a7e
Decouple the JIT argument spec and shape analysis from CUDA.