run prefetch prog on server #9593

Merged
merged 7 commits into PaddlePaddle:develop on Apr 8, 2018

Conversation

@Yancey1989 (Contributor) commented Apr 3, 2018

Fixed #9577
tasks: #9211


VLOG(3) << "RequestPrefetch Process in";
executor_->Run(*program_, scope_, blkid_, false, false);

Contributor

Process runs in a separate thread, so executor_ may be accessed from different threads at the same time; I don't know whether this is safe.

Member

The problem is that prefetch and optimize may happen at the same time; they will both access the lookup_table parameter.

I think the final solution may be to run table optimization in a separate thread as well, and have the prefetch thread and the update thread contend for the same lock.

Currently, we run the update operators within the optimize block, so we should find a way to avoid the conflict.
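
For illustration, a minimal sketch of the locking idea above, using a toy table rather than the real lookup_table parameter (ToyLookupTable and its methods are made up for this example; the actual operators are not shown):

#include <cstdint>
#include <map>
#include <mutex>
#include <vector>

// A toy lookup table guarded by a single mutex, so a prefetch thread and an
// update (optimize) thread never touch the rows concurrently. Illustrative
// only; not the PaddlePaddle API.
class ToyLookupTable {
 public:
  // Prefetch thread: read the rows for the requested ids.
  std::vector<std::vector<float>> Prefetch(const std::vector<int64_t>& ids) {
    std::lock_guard<std::mutex> lock(mu_);  // serializes with Update()
    std::vector<std::vector<float>> out;
    for (auto id : ids) out.push_back(rows_[id]);
    return out;
  }

  // Update thread: apply an SGD-style update to one row.
  void Update(int64_t id, const std::vector<float>& grad, float lr) {
    std::lock_guard<std::mutex> lock(mu_);  // serializes with Prefetch()
    auto& row = rows_[id];
    if (row.size() < grad.size()) row.resize(grad.size(), 0.f);
    for (size_t i = 0; i < grad.size(); ++i) row[i] -= lr * grad[i];
  }

 private:
  std::mutex mu_;
  std::map<int64_t, std::vector<float>> rows_;
};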

Contributor Author

> The problem is that the prefetch and optimize may happen at the same time

Maybe not: in the current process, the prefetch request happens before sending gradients, and there is a SEND BARRIER to make sure that the optimize step happens after the prefetch request.

Contributor Author (@Yancey1989, Apr 3, 2018)

But I think the current implementation is not thread safe, because we use one scope to create the output variable; if there are two or more prefetch requests, the output variable would be replaced and the serialize function would fail.
Maybe a way to solve this is to use a different scope to create the output var.

Member

We can create a sub_scope here to store the output variable.

Contributor Author (@Yancey1989, Apr 3, 2018)

But NewScope may also not be thread safe...
Maybe another way is to create multiple output vars with different suffixes, such as out_trainer0 and out_trainer1, in the distributed transpiler.

Member

After discussing with @Yancey1989, we decided to use NewScope to run each Process.
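
For reference, a rough sketch of what running each Process in its own scope could look like, assuming the framework::Scope API of the time (NewScope, Var, DeleteScope); the handler name and arguments are illustrative:

#include <string>
#include "paddle/fluid/framework/executor.h"
#include "paddle/fluid/framework/program_desc.h"
#include "paddle/fluid/framework/scope.h"

// Illustrative handler: each prefetch request runs in its own child scope.
void HandlePrefetch(framework::Scope* global_scope,
                    framework::Executor* executor,
                    framework::ProgramDesc* program, int block_id,
                    const std::string& out_var_name) {
  // Concurrent requests create their output variables in different child
  // scopes, so they cannot overwrite each other; reads of the shared table
  // still resolve through the parent scope.
  framework::Scope* local_scope = &global_scope->NewScope();
  local_scope->Var(out_var_name);  // output lives only in this request's scope
  executor->Run(*program, local_scope, block_id,
                false /*create_local_scope*/, false /*create_vars*/);
  // ... serialize the output variable back to the client here ...
  global_scope->DeleteScope(local_scope);  // release per-request variables
}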

Contributor Author

Done.

@jacquesqiao mentioned this pull request Apr 3, 2018
@jacquesqiao added this to In progress in distributed lookup table Apr 3, 2018

void InitTensorsInScope(framework::Scope &scope, platform::CPUPlace &place) {
auto w_var = scope.Var("w");
auto w = w_var->GetMutable<framework::LoDTensor>();

Member

w should be a SelectedRows, not a LoDTensor.
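
For example, the initialization could look roughly like this (a sketch against the framework::SelectedRows interface; the shapes and row ids are illustrative):

auto w_var = scope.Var("w");
auto w = w_var->GetMutable<framework::SelectedRows>();
// Row ids held on this server; ids and shape are illustrative.
auto rows = w->mutable_rows();
for (int64_t i = 0; i < 10; ++i) rows->push_back(i);
auto w_value = w->mutable_value();
w_value->Resize(framework::make_ddim({10, 32}));  // 10 rows of width 32
w_value->mutable_data<float>(place);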

Contributor Author

Done.

return block;
}

void InitTensorsInScope(framework::Scope &scope, platform::CPUPlace &place) {

Member

We should split InitTensorsInScope into InitTensorsInClientScope and InitTensorsInServerScope.
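
A possible shape for the split, using the names suggested above (the bodies are illustrative, not the PR's actual code):

void InitTensorsInClientScope(framework::Scope &scope,
                              platform::CPUPlace &place) {
  // The client scope only needs the input ids that will be sent out.
  auto ids_var = scope.Var("ids");
  auto ids = ids_var->GetMutable<framework::LoDTensor>();
  ids->Resize(framework::make_ddim({5, 1}));
  ids->mutable_data<int64_t>(place);
}

void InitTensorsInServerScope(framework::Scope &scope,
                              platform::CPUPlace &place) {
  // The server scope holds the lookup table parameter the prefetch block reads.
  auto w_var = scope.Var("w");
  auto w = w_var->GetMutable<framework::SelectedRows>();
  w->mutable_value()->Resize(framework::make_ddim({10, 32}));
  w->mutable_value()->mutable_data<float>(place);
}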

Contributor Author

Done.


VLOG(3) << "RequestPrefetch Process in";
executor_->Run(*program_, scope_, blkid_, false, false);

Member

We need to deserialize the request into the current scope before running the prefetch block.
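
Sketched out, the intended order of operations might look like this; DeserializeRequestToScope and RequestBytes are hypothetical stand-ins for the actual gRPC deserialization helper and wire type:

// Sketch only; the names below are placeholders, not the real API.
void ProcessPrefetch(const RequestBytes &req, framework::Scope *local_scope,
                     framework::Executor *executor,
                     framework::ProgramDesc *program, int block_id) {
  // 1. Materialize the request's input variable (the ids to look up) in the
  //    current scope before the block runs.
  DeserializeRequestToScope(req, local_scope);
  // 2. Only then execute the prefetch block, which reads that variable.
  executor->Run(*program, local_scope, block_id,
                false /*create_local_scope*/, false /*create_vars*/);
}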

Contributor Author

Done.

auto* var = local_scope->FindVar(var_name);
InitializeVariable(var, var_desc->GetType());

executor_->Run(*program_, local_scope, blkid_, false, false);

Contributor

If executor_ is a member of RequestPrefetch, it will be created every time a request is sent to the server, which is expensive; we could make it a member of the server instance instead.

Contributor

We can also prepare it before running.

Contributor Author (@Yancey1989, Apr 8, 2018)

The type of executor_ is a pointer, so maybe we would not create it for every request.

> Also can prepare it before run

A good idea; I will do that.
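
For illustration, preparing once and reusing the prepared context per request might look roughly like this, assuming the Executor::Prepare / RunPreparedContext pair the executor exposed around this time (the wrapper class is made up):

#include <memory>

// Illustrative wrapper: parse the prefetch block once at server start-up,
// then only execute the already-prepared ops on each request.
class PrefetchHandler {
 public:
  PrefetchHandler(framework::Executor* executor,
                  framework::ProgramDesc* program, int block_id)
      : executor_(executor),
        prepared_(executor->Prepare(*program, block_id)) {}

  void Handle(framework::Scope* local_scope) {
    executor_->RunPreparedContext(prepared_.get(), local_scope,
                                  false /*create_local_scope*/,
                                  false /*create_vars*/);
  }

 private:
  framework::Executor* executor_;
  std::unique_ptr<framework::ExecutorPrepareContext> prepared_;
};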

Contributor Author

Done.

@@ -67,6 +67,8 @@ message VariableMessage {
bytes serialized = 8;
// selected_rows data
bytes rows = 9;
// prefetch var name

Contributor

Lookup table block execution output variable name.

Contributor Author

Done.

Contributor

Seems it was not updated?

Contributor Author

Sorry... updated per the comments.

detail::RPCClient client;
client.AsyncPrefetchVariable("127.0.0.1:8889", ctx, scope, in_var_name,
out_var_name);
client.Wait();

// auto out_var = scope.Var(out_var_name);

Contributor

Please delete this commented-out line.

Contributor Author

Done.

Member (@jacquesqiao) left a comment

LGTM!

@Yancey1989 merged commit be85385 into PaddlePaddle:develop Apr 8, 2018
distributed lookup table automation moved this from In progress to Done Apr 8, 2018
@Yancey1989 deleted the prefech_prog_on_server branch April 8, 2018 09:50