Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Rust API for Storage Channel (#77) #137

Conversation

michael-kernel-sanders
Copy link
Contributor

@michael-kernel-sanders michael-kernel-sanders commented Jul 8, 2019

This change is Reviewable

@@ -318,6 +320,13 @@ grpc::Status OakNode::ProcessModuleInvocation(grpc::GenericServerContext* contex
channels_[LOGGING_CHANNEL_HANDLE] = std::move(logging_channel);
LOG(INFO) << "Created logging channel";

// TODO: Move to initialization method.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect this to end up being a new type of (pseudo-)node:

oneof node_type {
LogNode log_node = 2;
GrpcServerNode grpc_server_node = 3;
WebAssemblyNode web_assembly_node = 4;
}

}

message RollbackResponse {
}

// Wrapper message used to send requests via the StorageChannel interface.
message StorageOperationRequest {
// The name of the storage on which to perform the operation.
// The StorageChannel generates a name-based (version 5) UUID from this value.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow this, is the UUID deterministically generated from the name? What purpose does it serve?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated comment.

@@ -132,3 +137,65 @@ pub extern "C" fn oak_handle_grpc_call() {
}
});
}

pub fn execute_storage_operation(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest moving the storage-related stuff in its own module to start with (so it can be invoked as oak::storage::execute_storage_operation or similar, unless previously imported)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@googlebot
Copy link

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@googlebot
Copy link

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

@michael-kernel-sanders michael-kernel-sanders force-pushed the mks_storage_module_api branch 2 times, most recently from 6ffd8b8 to 06226e7 Compare August 13, 2019 13:36
@michael-kernel-sanders michael-kernel-sanders force-pushed the mks_storage_module_api branch 2 times, most recently from acc5323 to 5ec2cdc Compare August 22, 2019 10:10
Copy link
Collaborator

@tiziano88 tiziano88 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, adding @daviddrysdale to look at the channel stuff more closely

STORAGE_READ = 4;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think STORAGE_IN and STORAGE_OUT make more sense here, both for consistency, and also so they don't suggest that one is for reading and the other for writing data to storage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -183,7 +186,15 @@ static void RunModule(wabt::interp::Environment* env, wabt::interp::DefinedModul
}
}

OakNode::OakNode() : Service() {}
OakNode::OakNode() : Service() {
std::unique_ptr<ChannelHalf> storage_read_channel =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be around

oak/oak/server/oak_node.cc

Lines 188 to 263 in 7d75624

std::unique_ptr<OakNode> OakNode::Create(const std::string& module) {
LOG(INFO) << "Creating Oak Node";
std::unique_ptr<OakNode> node = absl::WrapUnique(new OakNode());
node->InitEnvironment(&node->env_);
LOG(INFO) << "Host func count: " << node->env_.GetFuncCount();
wabt::Errors errors;
LOG(INFO) << "Reading module";
wabt::Result result = ReadModule(module, &node->env_, &errors, &node->module_);
if (!wabt::Succeeded(result)) {
LOG(WARNING) << "Could not read module: " << result;
LOG(WARNING) << "Errors: " << wabt::FormatErrorsToString(errors, wabt::Location::Type::Binary);
return nullptr;
}
LOG(INFO) << "Reading module done";
if (!CheckModuleExports(&node->env_, node->module_)) {
LOG(WARNING) << "Failed to validate module";
return nullptr;
}
// Create a logging channel for the module.
{
std::shared_ptr<MessageChannel> channel = std::make_shared<MessageChannel>();
node->channel_halves_[ChannelHandle::LOGGING] =
absl::make_unique<MessageChannelWriteHalf>(channel);
LOG(INFO) << "Created logging channel " << ChannelHandle::LOGGING;
// Spawn a thread that reads and logs messages on this channel forever.
std::thread t([channel] {
std::unique_ptr<MessageChannelReadHalf> read_chan =
absl::make_unique<MessageChannelReadHalf>(channel);
while (true) {
ReadResult result = read_chan->BlockingRead(INT_MAX);
if (result.required_size > 0) {
LOG(ERROR) << "Message size too large: " << result.required_size;
return;
}
LOG(INFO) << "LOG: " << std::string(result.data->data(), result.data->size());
}
});
// TODO: join() instead when we have node termination
t.detach();
}
// Create the channels needed for gRPC interactions.
{
// Incoming request channel: keep the write half in C++, but map the read
// half to a well-known channel handle.
std::shared_ptr<MessageChannel> channel = std::make_shared<MessageChannel>();
node->channel_halves_[ChannelHandle::GRPC_IN] =
absl::make_unique<MessageChannelReadHalf>(channel);
node->req_half_ = absl::make_unique<MessageChannelWriteHalf>(channel);
LOG(INFO) << "Created gRPC input channel: " << ChannelHandle::GRPC_IN;
}
{
// Outgoing response channel: keep the read half in C++, but map the write
// half to a well-known channel handle.
std::shared_ptr<MessageChannel> channel = std::make_shared<MessageChannel>();
node->channel_halves_[ChannelHandle::GRPC_OUT] =
absl::make_unique<MessageChannelWriteHalf>(channel);
node->rsp_half_ = absl::make_unique<MessageChannelReadHalf>(channel);
LOG(INFO) << "Created gRPC output channel: " << ChannelHandle::GRPC_IN;
}
// Spin up a per-node Wasm thread to run forever; the Node object must
// outlast this thread.
LOG(INFO) << "Executing module oak_main";
std::thread t([&node] { RunModule(&node->env_, node->module_); });
// TODO: join() instead when we have node termination
t.detach();
LOG(INFO) << "Started module execution thread";
return node;
}

?

cc @daviddrysdale

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, keeping all of the well-known channel creation stuff together makes sense, either here or in CreateNode (and I'd probably go for the latter because we don't want to start a logging thread until we're sure the Node instance is good to go).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

namespace oak {

// A channel implementation that reads from a StorageService.
class StorageReadChannel final : public ChannelHalf {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this extend

class MessageChannel : public ChannelHalf {
instead?

BTW, @daviddrysdale maybe channel_halves_ should keep track of MessageChannel instance rather than ChannelHalf? I guess we expect all channels to provide async methods for reading / writing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment we've got a single (per-Node) numbering space for channel handles, regardless of direction, so channel_halves_ maps to a channel half rather than a channel. (To put it another way: when we get as far as creating channels at runtime, the creator will get two numbers to refer to the new channel, one for read and one for write.)

More generally, I'm idly wondering if the whole storage channel thing could be replaced with a couple of MessageChannel instances that connect to some C++ code (i.e. a pseudo-node, to use the terminology from manager.proto).

But I'm not yet ramped up enough on the storage design to know if that's a sensible or a silly idea…

@@ -183,7 +186,15 @@ static void RunModule(wabt::interp::Environment* env, wabt::interp::DefinedModul
}
}

OakNode::OakNode() : Service() {}
OakNode::OakNode() : Service() {
std::unique_ptr<ChannelHalf> storage_read_channel =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, keeping all of the well-known channel creation stuff together makes sense, either here or in CreateNode (and I'd probably go for the latter because we don't want to start a logging thread until we're sure the Node instance is good to go).


private:
// The serialized StorageOperationResponse message set by WriteResponseData.
std::string operation_response_data_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this class be replaced by an instance of MessageChannel ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably follow up with a refactoring... will discuss offline.

StorageManager() {}

std::unique_ptr<Message> ReadResponseData(uint32_t size) {
absl::Span<const char> response_data = operation_response_view_.subspan(0, size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it's aligned more with the previous stream-based channels, rather than the current message-based approach?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, @michael-kernel-sanders do you plan to fix this in this PR or to follow up later? If the latter, perhaps add a TODO somewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added TODO.

STORAGE_READ = 4;
STORAGE_WRITE = 5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update README.md to describe the new well-known channel handles?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

message RollbackResponse {
}

// Wrapper message used to send requests via the StorageChannel interface.
message StorageOperationRequest {
// The name of the storage on which to perform the operation.
// The StorageChannel generates a deterministic name-based (version 5) UUID
// from this value and the Oak Module to use as the storage identifier.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I didn't really follow this, but that might be lack of background)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also still not entirely sure what this is used for or whether it's necessary / nice to have (and why).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what sense does the StorageChannel generate something from this value and the Oak Module? Do you mean the name of the module? or the hash of the code? or something else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified.

namespace oak {

// A channel implementation that reads from a StorageService.
class StorageReadChannel final : public ChannelHalf {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment we've got a single (per-Node) numbering space for channel handles, regardless of direction, so channel_halves_ maps to a channel half rather than a channel. (To put it another way: when we get as far as creating channels at runtime, the creator will get two numbers to refer to the new channel, one for read and one for write.)

More generally, I'm idly wondering if the whole storage channel thing could be replaced with a couple of MessageChannel instances that connect to some C++ code (i.e. a pseudo-node, to use the terminology from manager.proto).

But I'm not yet ramped up enough on the storage design to know if that's a sensible or a silly idea…

STORAGE_IN = 4;
STORAGE_OUT = 5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update README.md to describe the new well-known channel handles?

@@ -77,7 +77,9 @@ message RollbackResponse {
message StorageOperationRequest {
// The name of the storage on which to perform the operation.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this is quite vague, what is a "storage" and what defines its name? Maybe this should point to some high level comment? Or this proto message should contain more information on how the storage works? e.g. "a storage backend is subdivided into different namespaces, whose values are UUIDs derived from a user-provided string... if multiple mutually distrusting clients end up choosing the same name then $X happens" etc. . Maybe this is already present in the code, so you can just add a link to it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attempted to clarify.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separated the storage channel and service messages.

@michael-kernel-sanders michael-kernel-sanders force-pushed the mks_storage_module_api branch 2 times, most recently from 411a11d to 43d3731 Compare August 30, 2019 11:33
Copy link
Collaborator

@tiziano88 tiziano88 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Michael and sorry for the delay!

@michael-kernel-sanders michael-kernel-sanders merged commit 9574991 into project-oak:master Sep 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants