Refactor Executor and OpGraph #540
Conversation
dali/common.h (Outdated)

    case DALIOpType::SUPPORT:
      return "SUPPORT";
    default:
      return "INVALID OP TYPE";
I'd prefer something that doesn't contain spaces and is obviously wrong - like "<INVALID>". Motivation for not having spaces is easier parsing, should it ever be necessary.
Done
dali/pipeline/executor/executor.h (Outdated)

    workspace_owner_t op_data;

    void Resize(int support, int cpu, int mixed, int gpu) {
      std::get<static_cast<int>(DALIOpType::SUPPORT)>(op_data).resize(support);
This hurts my eyes, really. If it were not an enum class, the same line would look like this:

    std::get<DALI_OP_SUPPORT>(op_data).resize(support)
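One compromise is to keep the enum class and hide the cast behind a small accessor, so the call site reads almost like the non-enum-class version. This is only a sketch — `get_op_data`, the element types, and the `workspace_owner_t` definition below are placeholders, not DALI's actual code:

```cpp
#include <cstddef>
#include <tuple>
#include <vector>

// Placeholder enum mirroring the PR's op types.
enum class DALIOpType : int { SUPPORT = 0, CPU = 1, MIXED = 2, GPU = 3 };

// Hypothetical helper: performs the static_cast once, in one place.
template <DALIOpType op_type, typename Tuple>
auto& get_op_data(Tuple& t) {
  return std::get<static_cast<std::size_t>(op_type)>(t);
}

// Mock element types; the real tuple holds workspace data, not ints.
using workspace_owner_t =
    std::tuple<std::vector<int>, std::vector<int>, std::vector<int>, std::vector<int>>;

void Resize(workspace_owner_t& op_data, int support, int cpu, int mixed, int gpu) {
  // Call sites no longer spell out static_cast<int>(...).
  get_op_data<DALIOpType::SUPPORT>(op_data).resize(support);
  get_op_data<DALIOpType::CPU>(op_data).resize(cpu);
  get_op_data<DALIOpType::MIXED>(op_data).resize(mixed);
  get_op_data<DALIOpType::GPU>(op_data).resize(gpu);
}
```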
    template <typename Backend>
    bool IsPinned(HostWorkspace::output_t<Backend> &t) {
      bool is_pinned = true;
I see a contradiction here. On one hand, you assume that pinned is the default, because an empty workspace is implicitly pinned. On the other hand, you implement an all_of predicate here, making it not-so-default after all (it takes just one non-pinned tensor in the workspace to mark it as non-pinned).
The thing is, currently we pin the memory for the whole batch, and I wanted to be able to call SetPinned(any_of_workspace_output_types, true) in a consistent manner, instead of having to write

    device_output->set_pinned(...)

and

    for (auto t : host_output)
      t->set_pinned(...)
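To illustrate the trade-off being debated, here is a minimal, self-contained sketch of an all_of-style IsPinned and a batch-wide SetPinned. It uses a mock tensor type, not DALI's actual workspace classes:

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Mock tensor with a pinned flag; the real DALI tensor types differ.
struct MockTensor {
  bool pinned = false;
  void set_pinned(bool p) { pinned = p; }
  bool is_pinned() const { return pinned; }
};

using host_output_t = std::vector<std::shared_ptr<MockTensor>>;

// The all_of predicate under discussion: an empty output is implicitly
// pinned, and a single non-pinned tensor makes the whole output non-pinned.
bool IsPinned(const host_output_t& out) {
  return std::all_of(out.begin(), out.end(),
                     [](const std::shared_ptr<MockTensor>& t) { return t->is_pinned(); });
}

// The uniform SetPinned the author wants: one call covers the whole batch,
// instead of a per-output-type special case.
void SetPinned(host_output_t& out, bool pinned) {
  for (auto& t : out) t->set_pinned(pinned);
}
```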
dali/pipeline/executor/executor.h (Outdated)

    mixed_op_data.clear();
    gpu_op_data.clear();
    support_op_data.clear();
    std::get<0>(op_data).clear();
Shouldn't those be enum values?
They are, in a later PR; I forgot to propagate it here.
So maybe it's a good idea to add it here before merging.
dali/pipeline/executor/executor.h (Outdated)

    // We instantiate the operation of adding the input only for parent op_type and device
    // that are specifically allowed
    template <DALIOpType op_type, DALIOpType producer_type, DALITensorDevice device>
    en_if_t<allows_op_input<op_type>(producer_type) && allows_tensor_input<op_type>(device)> add_input(
I suppose that en_if_t means enable_if_t. I'd rather splurge on having the extra characters :)
...or make it if_t ;)
Done
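For reference, the shorthand discussed above amounts to a thin alias over std::enable_if_t. The name and the example overload are illustrative, not DALI's actual header:

```cpp
#include <type_traits>

// Hypothetical alias in the spirit of the "...or make it if_t" suggestion.
template <bool B, typename T = void>
using if_t = std::enable_if_t<B, T>;

// Example SFINAE use: this overload exists only for integral types,
// analogous to how add_input is constrained in the PR.
template <typename T>
if_t<std::is_integral<T>::value, int> twice(T x) {
  return static_cast<int>(x) * 2;
}
```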
    // std::tuple<storage_gen_t<0>, storage_gen_t<1>, storage_gen_t<2>, storage_gen_t<3>,
    //            storage_gen_t<4>, storage_gen_t<5>, storage_gen_t<6>, storage_gen_t<7>>;

    using storage_owner_t =
I had to read through a lot of code to grasp what this "storage owner" is. Consider renaming it to something less generic, e.g. WorkspaceDataStore.
Sure. I'm happy for good type name suggestions. :)
I went with tensor_data_store_t, as it is more about covering the TensorNodes in the graph. Also, I want to differentiate it from workspace_store_t.
    DALI_ENFORCE(device == DALITensorDevice::CPU, "Only CPU outputs allowed");
    // Allocate `batch_size` Tensors for this op's
    // results and add them to the workspace.
    storage_gen_t<GetStorageIndex(DALIOpType::CPU, device)> output(batch_size, nullptr);
This deserves an alias:

    template <DALIOpType op_type, DALITensorDevice device>
    using WorkspaceStorage = storage_gen_t<GetStorageIndex(op_type, device)>;
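A compilable sketch of that alias, using stand-in definitions (the index formula, the element type, and the mock storage_gen_t here are assumptions, not DALI's real ones):

```cpp
#include <cstddef>
#include <vector>

// Placeholder enums mirroring the PR's types.
enum class DALIOpType : int { SUPPORT = 0, CPU = 1, MIXED = 2, GPU = 3 };
enum class DALITensorDevice : int { CPU = 0, GPU = 1 };

// Illustrative index mapping: one storage slot per (op type, device) pair.
constexpr int GetStorageIndex(DALIOpType op, DALITensorDevice dev) {
  return static_cast<int>(op) * 2 + static_cast<int>(dev);
}

// Mock storage generator; the real storage_gen_t selects a concrete
// container type per index. Here every slot is a vector of null handles.
template <int idx>
using storage_gen_t = std::vector<void*>;

// The reviewer's suggested alias: call sites no longer spell out
// GetStorageIndex(...) at each declaration.
template <DALIOpType op_type, DALITensorDevice device>
using WorkspaceStorage = storage_gen_t<GetStorageIndex(op_type, device)>;
```

With this, the quoted line becomes `WorkspaceStorage<DALIOpType::CPU, device> output(batch_size, nullptr);`.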
dali/pipeline/graph/op_graph.h (Outdated)

    std::vector<OpNode> op_nodes_;
    std::vector<TensorNode> tensor_nodes_;
    std::vector<std::vector<OpNodeId>> op_paritions_;
Suggested change:

    - std::vector<std::vector<OpNodeId>> op_paritions_;
    + std::vector<std::vector<OpNodeId>> op_partitions_;

Amazing what code completion can propagate...
Done :)
dali/pipeline/graph/graph_descr.cc (Outdated)

    void CheckOpConstraints(const OpSpec &spec) {
      const OpSchema &schema = SchemaRegistry::GetSchema(spec.name());

      bool allows_multiple_inputs = schema.AllowsMultipleInputSets();
Suggested change:

    - bool allows_multiple_inputs = schema.AllowsMultipleInputSets();
    + bool allows_multiple_input_sets = schema.AllowsMultipleInputSets();
Done
@@ -0,0 +1,150 @@

    // Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
TODO: Format this document after changing DALIOpType to OpType.
Done
Build 665966
dali/pipeline/executor/executor.h (Outdated)

    #include "dali/pipeline/util/event_pool.h"
    #include "dali/pipeline/util/stream_pool.h"
    #include "dali/pipeline/util/thread_pool.h"

    (two empty lines)
no need for those two empty lines
Done
dali/pipeline/executor/executor.h (Outdated)

    template <>
    inline void Executor::SetupStreamsAndEvents<OpType::MIXED>(MixedWorkspace &ws,
                                                               const OpGraph &graph,
indent
Done
dali/pipeline/executor/executor.h (Outdated)

    template <>
    inline void Executor::SetupStreamsAndEvents<OpType::GPU>(DeviceWorkspace &ws,
                                                             const OpGraph &graph,
indent
Done
    namespace dali {

    std::vector<tensor_data_store_t> CreateBackingStorageForTensorNodes(const OpGraph &op_graph,
                                                                        int batch_size) {
indent
Done
    namespace dali {

    std::vector<tensor_data_store_t> CreateBackingStorageForTensorNodes(const OpGraph &op_graph,
                                                                        int batch_size);
indent
Done
    void CheckArgumentInputConstraints(const OpGraph& op_graph, const OpNode& op) {
      static const auto allows_argument_input = ArgumentInputConstraints();
      bool arg_in_allowed = allows_argument_input[static_cast<int>(op.op_type)];
      if (!arg_in_allowed) {
Make sure that arg inputs are allowed for mixed ops as well.
I will do so in a follow-up PR; I don't want to introduce changes that can break something unless it is to fix a bug.
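The constraint-table pattern discussed here can be sketched as a constexpr array indexed by op type. The names and flag values below are illustrative assumptions, not DALI's actual table (the PR's later commits do enable argument inputs for mixed ops):

```cpp
#include <array>

// Placeholder op-type enum, in the PR's post-rename spelling (OpType).
enum class OpType : int { SUPPORT = 0, CPU = 1, MIXED = 2, GPU = 3 };

// Hypothetical constraint table: one flag per op type saying whether
// that op may take argument inputs. Values here are illustrative only.
constexpr std::array<bool, 4> ArgumentInputConstraints() {
  return {{false,   // SUPPORT: no argument inputs
           true,    // CPU
           true,    // MIXED (enabled by the follow-up change)
           true}};  // GPU
}

// Lookup in the style of CheckArgumentInputConstraints above.
bool AllowsArgumentInput(OpType op) {
  return ArgumentInputConstraints()[static_cast<int>(op)];
}
```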
Build: 666123 (it's stuck due to not enough free workers).
Check some constraints on OpGraph separately from Executor processing the OpGraph
Add static traits for OpGraph constraints
Unify OpNodes processing for different OpType
Executor "owns" buffer for corresponding TensorNodes, using data factory and related types

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
Build: 670584
Check some constraints on OpGraph separately from Executor processing the OpGraph
Add static traits for OpGraph constraints
Unify OpNodes processing for different OpType
Executor "owns" buffer for corresponding TensorNodes, using data factory and related types
Allow for Argument Inputs in Mixed Ops

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
next: #551