Enable SPMD + dynamo for inference #5002

Merged (5 commits, May 18, 2023)
Conversation

JackCaoG (Collaborator)

This work was done by @yeounoh and I am trying to land this PR on his behalf. The last attempt was made by @steventk-g in #4862.

Currently the test fails with a Check failed: handle->HasValue() error, so this is still WIP.

JackCaoG (Collaborator, Author)

OK, there are two issues:

  1. The dynamo async function currently fails silently if an exception happens; we need to add rethrow logic.
  2. Dynamo's PjRtComputationClient::PjRtData::Assign(const Data& data) fails with a bad_cast error:
void PjRtComputationClient::PjRtData::Assign(const Data& data) {
  TF_VLOG(3) << "enter assign\n";
  // Reference-form dynamic_cast: throws std::bad_cast if `data` is not
  // actually a PjRtData (e.g. if it is a PjRtShardedData).
  const PjRtData& pjrt_data = dynamic_cast<const PjRtData&>(data);
  if (&pjrt_data != this) {
    buffer = pjrt_data.buffer;
  }
  TF_VLOG(3) << "left assign\n";
}
2023-05-13 00:26:38.471263: I third_party/xla_client/pjrt_computation_client.cc:135] enter assign

E
RuntimeError: std::bad_cast

JackCaoG (Collaborator, Author)

Ah OK, I think I know what the problem is: the result of the dynamo graph is a PjRtShardedData, and we tried to cast it to PjRtData. This might have to do with @jonb377's recent PR that makes most things implicitly replicated. This should be an easy fix; I can work on it next week.

FYI @yeounoh
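
For context, here is a minimal standalone C++ sketch of the failure mode, using hypothetical stand-in classes (FakeData, FakePjRtData, FakePjRtShardedData) rather than the real PjRt types: the reference form of dynamic_cast throws std::bad_cast when the dynamic type does not match, while the pointer form returns nullptr and lets the caller branch on the actual type.

#include <iostream>
#include <typeinfo>

// Hypothetical stand-ins for ComputationClient::Data, PjRtData and
// PjRtShardedData; not the real torch_xla classes.
struct FakeData { virtual ~FakeData() = default; };
struct FakePjRtData : FakeData {};
struct FakePjRtShardedData : FakeData {};

void Assign(const FakeData& data) {
  // Reference form: throws std::bad_cast if `data` is not a FakePjRtData.
  const FakePjRtData& pjrt_data = dynamic_cast<const FakePjRtData&>(data);
  (void)pjrt_data;
  std::cout << "assigned from a FakePjRtData\n";
}

int main() {
  FakePjRtData plain;
  FakePjRtShardedData sharded;
  Assign(plain);  // fine
  try {
    Assign(sharded);  // a sharded result passed where PjRtData is expected
  } catch (const std::bad_cast& e) {
    std::cout << "caught: " << e.what() << "\n";
  }
  // The pointer form returns nullptr instead of throwing, which is one way
  // to detect the actual dynamic type before assigning.
  const FakeData& d = sharded;
  if (dynamic_cast<const FakePjRtData*>(&d) == nullptr) {
    std::cout << "not a FakePjRtData; handle the sharded case separately\n";
  }
  return 0;
}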

@yeounoh yeounoh self-requested a review May 15, 2023 17:31
yeounoh (Collaborator) commented May 15, 2023

> Ah OK, I think I know what the problem is: the result of the dynamo graph is a PjRtShardedData, and we tried to cast it to PjRtData. This might have to do with @jonb377's recent PR that makes most things implicitly replicated. This should be an easy fix; I can work on it next week.
>
> FYI @yeounoh

Thanks @JackCaoG, I am going to merge an output param sharding patch, which might change the code path a bit. Let's chat offline; I can explain further.

// Device will be Virtual device if SPMD is enabled.
torch::lazy::BackendDevice device =
    ShardingUtil::UseVirtualDevice() ? ParseDeviceString("SPMD:0")
                                     : torch_xla::GetCurrentDevice();
JackCaoG (Collaborator, Author)

@yeounoh I am not sure if we should just update GetCurrentDevice, any thoughts? We need to sit down and think about how to surface this virtual device to users soon.

Collaborator

I voted for GetCurrentDevice, as there might be other scenarios where the caller will also need to distinguish SPMD:0 from XLA:0.

JackCaoG (Collaborator, Author)

GetCurrentDevice is used in over 30 places in our code base now, mostly during tracing when the caller is trying to figure out the hardware type. I think it should be fine as long as SPMD:0 can be resolved into the correct hardware type. I would leave that to a separate PR since it touches too much code and might introduce noise.

Collaborator

Got it.
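
To illustrate the "resolve SPMD:0 into the correct hardware type" idea from the thread above, here is a minimal standalone sketch; HwType, ResolveHardwareType and FAKE_PJRT_DEVICE are hypothetical stand-ins, not the actual torch_xla GetCurrentDevice implementation.

#include <cstdlib>
#include <iostream>
#include <string>

// Hypothetical stand-in for a hardware device kind.
enum class HwType { CPU, TPU, GPU };

// Under SPMD the current device is the virtual "SPMD:0", but callers that
// only care about the hardware type still need the underlying device kind.
HwType ResolveHardwareType(const std::string& device_str) {
  if (device_str.rfind("SPMD", 0) == 0) {
    // Resolve the virtual device to whatever the backend actually runs on;
    // here faked via an environment variable purely for illustration.
    const char* hw = std::getenv("FAKE_PJRT_DEVICE");
    return (hw && std::string(hw) == "TPU") ? HwType::TPU : HwType::CPU;
  }
  if (device_str.rfind("TPU", 0) == 0) return HwType::TPU;
  if (device_str.rfind("GPU", 0) == 0) return HwType::GPU;
  return HwType::CPU;
}

int main() {
  std::cout << static_cast<int>(ResolveHardwareType("SPMD:0")) << "\n";
  std::cout << static_cast<int>(ResolveHardwareType("TPU:0")) << "\n";
  return 0;
}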

@JackCaoG JackCaoG marked this pull request as ready for review May 16, 2023 18:20
@JackCaoG JackCaoG changed the title [WIP] Enable SPMD + dynamo for inference Enable SPMD + dynamo for inference May 16, 2023
JackCaoG (Collaborator, Author)

I think this one is ready for review. I will add more test cases (input data sharding, which I am not sure works yet) and features in the next PR.

alanwaketan (Collaborator) left a comment

Generally, LGTM.

@@ -590,9 +593,21 @@ XLAGraphExecutor::ExecuteComputationWithBarrier(
    torch::lazy::BackendDataPtr handle =
        WrapXlaData(xla::ComputationClient::Get()->CreateDataPlaceholder(
            device.toString(), std::move(shape)));
    // if SPMD is enabled, we assume all output will be replicated
    if (ShardingUtil::UseVirtualDevice()) {
Collaborator

Why do we now start adding this for the dynamo path? Don't we need this for the LTC path?

Collaborator

Looks like this patch is dynamo exclusive... Should we hint this somewhere?

JackCaoG (Collaborator, Author)

The lazy code path already has this logic; in fact, I copied this logic from the lazy code path lol

alanwaketan (Collaborator) commented May 16, 2023

I smell an opportunity to merge the two code paths further, but let's do it in a follow-up.
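
To illustrate the replicate-by-default assumption discussed above, here is a minimal standalone sketch with hypothetical stand-in types (Placeholder, CreateOutputPlaceholders); it is not the real placeholder/WrapDataShards code, just the shape of the branch: under the virtual device each output placeholder is treated as replicated sharded data on SPMD:0, otherwise a plain per-device placeholder is used.

#include <iostream>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an output data placeholder; not a torch_xla type.
struct Placeholder {
  std::string device;
  bool replicated_sharded = false;
};
using PlaceholderPtr = std::shared_ptr<Placeholder>;

std::vector<PlaceholderPtr> CreateOutputPlaceholders(size_t num_outputs,
                                                     bool use_virtual_device) {
  std::vector<PlaceholderPtr> out;
  out.reserve(num_outputs);
  for (size_t i = 0; i < num_outputs; ++i) {
    auto p = std::make_shared<Placeholder>();
    if (use_virtual_device) {
      // SPMD path: the PR assumes every output is replicated, so the
      // placeholder is wrapped as (replicated) sharded data on SPMD:0.
      p->device = "SPMD:0";
      p->replicated_sharded = true;
    } else {
      // Non-SPMD path: a plain per-device data placeholder.
      p->device = "TPU:0";
    }
    out.push_back(p);
  }
  return out;
}

int main() {
  for (const auto& p :
       CreateOutputPlaceholders(2, /*use_virtual_device=*/true)) {
    std::cout << p->device << " replicated=" << p->replicated_sharded << "\n";
  }
  return 0;
}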

torch_xla/csrc/xla_graph_executor.cpp (outdated review thread; resolved)
@@ -608,6 +623,9 @@ XLAGraphExecutor::ExecuteComputationWithBarrier(
      if (auto xla_tensor_ptr = bridge::TryGetXlaTensor(ivalue.toTensor())) {
        dataptr = xla_tensor_ptr->GetXlaData();
      } else {
        XLA_CHECK(device.type() != (int8_t)XlaDeviceType::SPMD)
Collaborator

What's this XLA_CHECK for?

JackCaoG (Collaborator, Author)

Hmm, not sure, I copied this from @yeounoh's diff. @yeounoh any idea?

yeounoh (Collaborator)

It's not needed; it's more of a sanity check I probably added to ensure that this doesn't happen. Basically, we want to make sure that the SPMD device type is always on the backend (device data).
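
For illustration, a minimal standalone sketch of this kind of sanity check, using hypothetical stand-in names (XlaDeviceTypeLike, CheckNotSpmd) rather than the real XLA_CHECK macro: a fresh, non-XLA input should never carry the virtual SPMD device type.

#include <cstdint>
#include <iostream>
#include <stdexcept>

// Hypothetical stand-in for the device-type enum; not the real XlaDeviceType.
enum class XlaDeviceTypeLike : int8_t { CPU = 0, TPU = 1, SPMD = 2 };

// When falling through to creating backend data from a plain (non-XLA)
// tensor, the device about to be used must not be the virtual SPMD device,
// since SPMD should only appear on existing backend device data.
void CheckNotSpmd(XlaDeviceTypeLike device_type) {
  if (device_type == XlaDeviceTypeLike::SPMD) {
    throw std::runtime_error(
        "Unexpected virtual device (SPMD) for a fresh (non-XLA) input; "
        "SPMD is only valid on backend device data");
  }
}

int main() {
  CheckNotSpmd(XlaDeviceTypeLike::TPU);  // fine
  try {
    CheckNotSpmd(XlaDeviceTypeLike::SPMD);  // sanity check fires
  } catch (const std::runtime_error& e) {
    std::cout << "check failed: " << e.what() << "\n";
  }
  return 0;
}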

@JackCaoG JackCaoG requested a review from alanwaketan May 17, 2023 00:05
jonb377 (Collaborator) left a comment

LGTM, thanks Jack

Comment on lines +22 to +24
# Add an additional 1x1 layer at the end to ensure the final layer
# is not sharded.
self.fc3 = nn.Linear(1, 1)
Collaborator

Is this due to the lack of output sharding propagation?

JackCaoG (Collaborator, Author)

Yeah, in this PR I tried to keep the outputs replicated. We can expand this after the output sharding PR is ready.

yeounoh (Collaborator) commented May 17, 2023

> I think this one is ready for review. I will add more test cases (input data sharding, which I am not sure works yet) and features in the next PR.

Input sharding should (used to) work if the sharded input is used for the torch compilation. Let me know. Will take a pass on the changes now as well, thanks.

@@ -590,6 +593,15 @@ XLAGraphExecutor::ExecuteComputationWithBarrier(
    torch::lazy::BackendDataPtr handle =
        WrapXlaData(xla::ComputationClient::Get()->CreateDataPlaceholder(
Collaborator

If it's the SPMD virtual device, then we should always use a PjRtShardedData handle.

JackCaoG (Collaborator, Author)

Hmm, isn't the logic below that calls WrapDataShards enough? This code path is shared between the SPMD and non-SPMD cases.
