do not copy when deserialize #9209

Closed
wants to merge 5 commits into from

Contributor

@gongweibao gongweibao commented Mar 19, 2018

Fix part of #8638

@gongweibao gongweibao changed the title from "Parse protobuf mannually." to "do not copy when deserialize" Mar 19, 2018
Contributor

@helinwang helinwang left a comment

Amazing!

if (!input->GetDirectBufferPointer(&data, &size_to_write)) {
  return false;
}
// TODO(gongwb): don't copy if it's aligned?

Contributor

Maybe we need to copy anyway? I guess the buffer that we copy from will become invalid afterwards.

Contributor Author

ByteBuffer lives in user space and uses Slices to manage non-contiguous memory.
If our data sits in a single slice, we can use it directly, but that requires serializing the send buffer manually, which is dangerous.
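
The single-slice idea can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `SliceView`, `ContiguousView`, and `MakeContiguous` are hypothetical stand-ins and do not use the gRPC API. It assumes the caller keeps the slice memory alive and unmodified for as long as the returned pointer is used, which is exactly the lifetime concern raised in this thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a slice: a non-owning view of some bytes.
struct SliceView {
  const uint8_t* data;
  size_t size;
};

// A contiguous view of the whole payload.
struct ContiguousView {
  const uint8_t* data;  // points into a slice or into the scratch buffer
  size_t size;
  bool copied;          // true when the bytes were gathered into scratch
};

// With exactly one slice, the slice memory is returned directly (no copy).
// Otherwise all slices are gathered into `scratch`. The returned pointer is
// valid only while the slices (or `scratch`) stay alive and unmodified.
ContiguousView MakeContiguous(const std::vector<SliceView>& slices,
                              std::vector<uint8_t>* scratch) {
  if (slices.size() == 1) {
    return {slices[0].data, slices[0].size, false};
  }
  size_t total = 0;
  for (const SliceView& s : slices) total += s.size;
  scratch->resize(total);
  size_t offset = 0;
  for (const SliceView& s : slices) {
    if (s.size > 0) std::memcpy(scratch->data() + offset, s.data, s.size);
    offset += s.size;
  }
  return {scratch->data(), total, true};
}
```

The `copied` flag lets a caller tell whether it is holding borrowed slice memory or its own scratch buffer, which matters when deciding how long the pointer may be kept.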

Contributor

Sorry, maybe I did not make it clear. I thought we should not use the underlying buffer, since the buffer belongs to protobuf and its content could change?

Contributor Author

This is also a guess:
https://github.com/gongweibao/Paddle/blob/optsend/paddle/fluid/operators/detail/grpc_service.h#L65

It seems the memory can be retained. Also, the data behind a bytes field is just a single block of memory.

Contributor Author

I changed this comment to "Can we avoid copy?".

Contributor

ok, thanks!

auto* slr = var->GetMutable<framework::SelectedRows>();
int64_t* rows_data = slr->mutable_rows()->data();

// copy rows CPU data, GPU data will be copied lazly

Contributor

lazly -> lazily

Contributor Author

Done. Thanks.

varmsg.SerializeToString(&str);

// message bytebuffer
::grpc::Slice slices_2[1];

Contributor

Could you test with a larger buffer? A buffer of size 1 is perhaps not general enough. Or even better, test in a loop, like for i := 1; i < 100; i++.
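
A loop-style test along these lines might look like the sketch below. It is only an illustration under stated assumptions: `SplitIntoSlices`, `ReadRows`, and `RoundTripAllSizes` are hypothetical helpers, and `std::string` chunks stand in for gRPC slices, so nothing here exercises the real `ByteBuffer` API. The point is that each pass uses a different row count and a different slice layout, so values regularly straddle slice boundaries.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: cut `bytes` into chunks of at most `chunk` bytes,
// imitating a byte buffer made of several slices.
std::vector<std::string> SplitIntoSlices(const std::string& bytes,
                                         size_t chunk) {
  if (chunk == 0) chunk = 1;  // avoid an endless loop on a zero chunk size
  std::vector<std::string> slices;
  for (size_t pos = 0; pos < bytes.size(); pos += chunk) {
    slices.push_back(bytes.substr(pos, chunk));
  }
  return slices;
}

// Hypothetical reader: gather n int64 values from the slices. A value may
// straddle a slice boundary, so the slices are joined before decoding.
std::vector<int64_t> ReadRows(const std::vector<std::string>& slices,
                              size_t n) {
  std::string joined;
  for (const std::string& s : slices) joined += s;
  if (joined.size() < n * sizeof(int64_t)) return {};  // truncated input
  std::vector<int64_t> rows(n);
  if (n > 0) std::memcpy(rows.data(), joined.data(), n * sizeof(int64_t));
  return rows;
}

// Round-trips i rows through a buffer cut into i-byte slices, for i in
// [1, 100). Returns false on the first mismatch.
bool RoundTripAllSizes() {
  for (size_t i = 1; i < 100; ++i) {
    std::vector<int64_t> rows(i);
    for (size_t k = 0; k < i; ++k) rows[k] = static_cast<int64_t>(k * 7 + i);
    std::string bytes(reinterpret_cast<const char*>(rows.data()),
                      rows.size() * sizeof(int64_t));
    std::vector<std::string> slices = SplitIntoSlices(bytes, i);
    if (ReadRows(slices, i) != rows) return false;
  }
  return true;
}
```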

Contributor Author

There are two encodings for a repeated field, packed and not packed, and protobuf may pick packed automatically even if you don't specify it.

  • not packed: <tag,data> <tag,data>....
  • packed: <tag,len><data, data,data..>...

This code only guarantees that we can parse the manually serialized buffer and the standard buffer.

The byte buffer may have many slices already.
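
To make the two layouts concrete, here is a small sketch that encodes the same repeated varint field both ways. It follows the protobuf encoding documentation, where a non-packed field is written as one <tag, value> record per element and a packed field as a single <tag, length> header followed by all values. It does not use the protobuf library, and the helper names (`AppendVarint`, `MakeTag`, `EncodeUnpacked`, `EncodePacked`) are made up for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Appends `value` as a protobuf base-128 varint.
void AppendVarint(uint64_t value, std::string* out) {
  while (value >= 0x80) {
    out->push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out->push_back(static_cast<char>(value));
}

// Wire types from the protobuf encoding documentation.
const uint32_t kVarint = 0;
const uint32_t kLengthDelimited = 2;

// A tag combines the field number with the wire type.
uint32_t MakeTag(uint32_t field_number, uint32_t wire_type) {
  return (field_number << 3) | wire_type;
}

// Not packed: one <tag, value> record per element.
std::string EncodeUnpacked(uint32_t field,
                           const std::vector<uint64_t>& values) {
  std::string out;
  for (uint64_t v : values) {
    AppendVarint(MakeTag(field, kVarint), &out);
    AppendVarint(v, &out);
  }
  return out;
}

// Packed: a single <tag, length> header followed by all values.
std::string EncodePacked(uint32_t field,
                         const std::vector<uint64_t>& values) {
  std::string payload;
  for (uint64_t v : values) AppendVarint(v, &payload);
  std::string out;
  AppendVarint(MakeTag(field, kLengthDelimited), &out);
  AppendVarint(payload.size(), &out);
  out += payload;
  return out;
}
```

For field 6 with values 3, 270, 86942, the packed form is the 8 bytes 32 06 03 8E 02 9E A7 05, while the non-packed form repeats the tag 30 before each value. A hand-written parser has to accept both, since it cannot assume which one the sender used.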

return false;
}
// TODO(gongwb): don't copy if it's aligned?
memcpy(reinterpret_cast<void*>(p), data, size_to_write);

Contributor

can use memory::Copy here

Contributor Author

Done.

return true;
}

int TensorResponse::Parse(::grpc::ByteBuffer& byte_buffer,

Contributor

We don't actually need a TensorResponse class. Just put the functions in proto_encoder_helper.h (or .cc) and put this content in DeserializeFromByteBuffer.

Contributor Author

This class is needed to parse the tensor response.

Contributor

@typhoonzero typhoonzero left a comment

LGTM except for two minor comments.

namespace operators {
namespace detail {

class TensorResponse {

Contributor

It should be called VariableResponse or VariableMessageParser, because the response contains more than just tensors.

Contributor

The file name should also be changed I suppose.

Contributor Author

Done.

int Parse(const ::grpc::ByteBuffer& byte_buffer,
const platform::DeviceContext& dev_ctx);

// should call parse first.

Contributor

The interface comments are too terse and do not explain what the function does.

Contributor Author

Done.

@gongweibao
Contributor Author

Moved to #9271, which is the whole PR.

@gongweibao gongweibao closed this Mar 21, 2018
@gongweibao gongweibao deleted the optsend2 branch January 17, 2021 07:41