Merged
Changes from all commits · 85 commits
2d1445c
fixed no of outputs
Dec 2, 2019
36a0b61
Some minor changes
Dec 2, 2019
e65d66f
initialize only pipelined tensors
Dec 2, 2019
71d0cad
initialize only pipelined tensors
Dec 2, 2019
cec9962
Merge branch 'shrestha/prefetch_right_inputs' of https://github.com/t…
Dec 2, 2019
8e429a5
GetPrefetchedTensors
Dec 3, 2019
2bf99a4
Added test
Dec 3, 2019
ac9e32d
removed test
Dec 3, 2019
2db801d
refactor pipeline
Dec 4, 2019
9c0e03f
Shared data keeps track of prefetched input indexes
Dec 4, 2019
8dfc795
Working state
Dec 4, 2019
1fe8f83
Merge remote-tracking branch 'origin/master' into shrestha/prefetch_r…
Dec 4, 2019
5464b7a
bazel fix
Dec 4, 2019
5233e69
Added test
Dec 4, 2019
1e2dd34
Merge remote-tracking branch 'origin/master' into shrestha/prefetch_r…
Dec 4, 2019
94cc450
Changed tests. Put pipelined io tensors together to avoid unnecessary…
Dec 5, 2019
8bc22bd
refactored
Dec 5, 2019
33d4431
Indexes utilities
Dec 5, 2019
1c1aaff
Fix test
Dec 6, 2019
2e51f44
Merge remote-tracking branch 'origin/master' into shrestha/prefetch_r…
Dec 10, 2019
958f62e
renamed the files
Dec 11, 2019
bfef9c0
Fixed Prefetch Tests
Dec 11, 2019
065db00
fixed tests
Dec 11, 2019
30fec16
Removed couts
Dec 12, 2019
5b0bd52
minor
Dec 12, 2019
bface40
fixed tests
Dec 12, 2019
d7a735f
added log
Dec 12, 2019
fa301de
Added logs
Dec 12, 2019
366c78f
Added prefetch test
Dec 12, 2019
d06de1f
Merge remote-tracking branch 'origin/master' into shrestha/prefetch_r…
Dec 12, 2019
f052c6d
Format, removed extended file
Dec 12, 2019
120621c
minor
Dec 12, 2019
7fc9507
fixed test
Dec 12, 2019
601c135
Merge remote-tracking branch 'origin/master' into shrestha/prefetch_r…
Dec 12, 2019
6d4c037
FindComplement modified
Dec 12, 2019
31620bb
Apply suggestions from code review
Dec 13, 2019
1915493
incorporate review comments
Dec 13, 2019
0593f29
Merge branch 'shrestha/prefetch_right_inputs' of https://github.com/t…
Dec 13, 2019
a4f9e1f
addressed review comments
Dec 13, 2019
8536844
remove test-prefetch-2
Dec 13, 2019
bf8e918
renamed the vars for indexes relative to pipelined indexes
Dec 13, 2019
b2300b7
examples
Dec 13, 2019
5222150
fixed hang seen when disable deassign
Dec 14, 2019
f6baec4
Extended TM to store variable shared name
Dec 15, 2019
fae2335
added test
Dec 15, 2019
09612d8
Merge remote-tracking branch 'origin/master' into shrestha/prefetch_r…
Dec 15, 2019
b1120c1
fix axpy var test
Dec 15, 2019
06c93ef
fixed var tests
Dec 15, 2019
76bcd8e
Merge remote-tracking branch 'origin/shrestha/prefetch_right_inputs' …
Dec 15, 2019
f89a28e
Fixed axpy pipelined py
Dec 15, 2019
0c8f2d2
Merge remote-tracking branch 'origin/master' into shrestha/tm_get_sha…
Dec 16, 2019
25613b6
Read only required outputs
Dec 16, 2019
ff18bfd
Read only required outputs
Dec 16, 2019
6ffc443
Merge branch 'shrestha/var_in_compute' of https://github.com/tensorfl…
Dec 16, 2019
3e3d887
Var uses Parallel Executor
Dec 17, 2019
02888d3
Implemented IOTensorsReadyForExec
Dec 17, 2019
ca63606
Sync for output tensors
Dec 17, 2019
472b8d8
Fixed output
Dec 17, 2019
e932e11
Merge branch 'shrestha/var_in_compute' of https://github.com/tensorfl…
Dec 17, 2019
34de806
Merge branch 'master' into shrestha/var_in_compute
Dec 17, 2019
06efb09
Merge branch 'master' into shrestha/var_in_compute
Dec 17, 2019
5576a47
For non var build
Dec 17, 2019
b130917
Solved build and fix
Dec 17, 2019
f821bac
Fix test_flib
Dec 17, 2019
53c1c38
Merge remote-tracking branch 'origin/master' into shrestha/var_in_com…
Dec 18, 2019
1af0efa
Removed Print function
Dec 18, 2019
fe5d3e8
Added Traces to Encap. Some clean up
Dec 18, 2019
878572a
Added comments, clean up, etc
Dec 18, 2019
85882bd
Removed ngraph-var in tracked_variable.cc
Dec 18, 2019
8fc9205
Merge remote-tracking branch 'origin/master' into shrestha/var_in_com…
Dec 18, 2019
983eb7b
ngraph_tracked_variable.cc changes
Dec 18, 2019
c992027
added traces
Dec 19, 2019
5ae567a
fix build
Dec 19, 2019
ab33837
Var Rewrite pass calls EnterPrefetchInCatalog, fixed header guard, ten…
Dec 19, 2019
f8ae937
small fix
Dec 19, 2019
85f9a39
incorporate review comments
Dec 19, 2019
a5f9ff2
fixed path for axpy pipelined for test_ngtf.py
Dec 19, 2019
b639f9d
Added more specific tracing for prefetched
Dec 19, 2019
cbcc036
Merge branch 'master' into shrestha/var_in_compute
Dec 19, 2019
0c40739
incorporate review comments
Dec 20, 2019
c042564
Merge branch 'shrestha/var_in_compute' of https://github.com/tensorfl…
Dec 20, 2019
4b21cc1
minor
Dec 20, 2019
36f4bec
removed print vector. added lambda
Dec 20, 2019
bf3b846
fix test_utils.py
Dec 20, 2019
8e346e8
write prefetch traces
Dec 20, 2019
2 changes: 2 additions & 0 deletions bazel/BUILD
@@ -48,6 +48,7 @@ cc_library(
"ngraph_bridge/ngraph_tensor_manager.h",
"ngraph_bridge/ngraph_timer.h",
"ngraph_bridge/ngraph_utils.h",
"ngraph_bridge/ngraph_var.h",
"ngraph_bridge/ngraph_version_utils.h",
"ngraph_bridge/tf_deadness_analysis.h",
"ngraph_bridge/tf_graphcycles.h",
@@ -92,6 +93,7 @@ cc_library(
"ngraph_bridge/ngraph_tensor_manager.cc",
"ngraph_bridge/ngraph_tracked_variable.cc",
"ngraph_bridge/ngraph_utils.cc",
"ngraph_bridge/ngraph_var.cc",
"ngraph_bridge/tf_deadness_analysis.cc",
"ngraph_bridge/tf_graphcycles.cc",
"ngraph_bridge/ops/ngraph_ops.cc",
2 changes: 1 addition & 1 deletion ngraph_bridge/CMakeLists.txt
@@ -57,6 +57,7 @@ set(SRC
ngraph_rewrite_pass.cc
ngraph_tensor_manager.cc
ngraph_tracked_variable.cc
ngraph_var.cc
ngraph_utils.cc
tf_graphcycles.cc
tf_deadness_analysis.cc
@@ -86,7 +87,6 @@ if(NGRAPH_TF_ENABLE_VARIABLES_AND_OPTIMIZERS)
list(APPEND SRC enable_variable_ops/ngraph_tracked_variable.cc)

# new files
list(APPEND SRC enable_variable_ops/ngraph_var.cc)
list(APPEND SRC enable_variable_ops/ngraph_assign_op.cc)
list(APPEND SRC enable_variable_ops/ngraph_enter_in_catalog.cc)
list(APPEND SRC enable_variable_ops/ngraph_remove_ngraphassigns.cc)
4 changes: 2 additions & 2 deletions ngraph_bridge/enable_variable_ops/ngraph_assign_op.cc
@@ -25,11 +25,11 @@
#include "ngraph/event_tracing.hpp"
#include "ngraph/runtime/backend.hpp"

#include "ngraph_bridge/enable_variable_ops/ngraph_var.h"
#include "ngraph_bridge/ngraph_catalog.h"
#include "ngraph_bridge/ngraph_freshness_tracker.h"
#include "ngraph_bridge/ngraph_timer.h"
#include "ngraph_bridge/ngraph_utils.h"
#include "ngraph_bridge/ngraph_var.h"

using namespace std;
namespace ng = ngraph;
@@ -83,7 +83,7 @@ class NGraphAssignOp : public OpKernel {

void Compute(OpKernelContext* context) override {
std::ostringstream oss;
oss << "Execute: Assign_" << my_instance_id << ": " << name();
oss << "NGAssign::Compute::" << name();
ngraph::Event event_compute(oss.str(), name(), "");

NGRAPH_VLOG(4) << "NGraphAssign:: Compute called for: " << def().name()
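The trace-label change in this file follows the convention used across the PR's other ops ("NGAssign::Compute::", "NGVariable::Compute::", "NGEncap::Compute::"): the label is the op type plus the node name. Below is a minimal sketch of the event lifecycle, assuming only the ngraph::Event API visible in this diff (construct with a label, Stop(), write_trace()); the kernel body is a placeholder.

#include <sstream>
#include <string>

#include "ngraph/event_tracing.hpp"

// Sketch of the "<OpType>::Compute::<node name>" tracing pattern; the real
// kernels do their work between construction and Stop().
void TracedCompute(const std::string& node_name) {
  std::ostringstream oss;
  oss << "NGAssign::Compute::" << node_name;  // op type + node name
  ngraph::Event event_compute(oss.str(), node_name, "");

  // ... the op's actual work would run here ...

  event_compute.Stop();  // stop the timer before writing the trace
  ngraph::Event::write_trace(event_compute);
}

Stopping the event before write_trace matters; one of the hunks below adds a previously missing event_compute.Stop() in ngraph_tracked_variable.cc for exactly this reason.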
15 changes: 6 additions & 9 deletions ngraph_bridge/enable_variable_ops/ngraph_enter_in_catalog.cc
@@ -160,15 +160,12 @@ Status EnterInCatalog(Graph* graph, int graph_id) {
}
}

// are there indexes that need copy
if (op_index_to_copy.size() > 0) {
try {
NGraphCatalog::AddToEncapOutputCopyIndexesMap(graph_id, node->name(),
op_index_to_copy);
} catch (const std::exception& exp) {
return errors::Internal(
"Caught exception while entering in catalog: ", exp.what(), "\n");
}
try {
NGraphCatalog::AddToEncapOutputCopyIndexesMap(graph_id, node->name(),
op_index_to_copy);
} catch (const std::exception& exp) {
return errors::Internal("Caught exception while entering in catalog: ",
exp.what(), "\n");
}

} // end of node is type NGraphEncapsulate
8 changes: 8 additions & 0 deletions ngraph_bridge/enable_variable_ops/ngraph_rewrite_pass.cc
@@ -30,6 +30,7 @@
#include "ngraph_bridge/ngraph_cluster_manager.h"
#include "ngraph_bridge/ngraph_deassign_clusters.h"
#include "ngraph_bridge/ngraph_encapsulate_clusters.h"
#include "ngraph_bridge/ngraph_enter_prefetch_in_catalog.h"
#include "ngraph_bridge/ngraph_mark_for_clustering.h"
#include "ngraph_bridge/ngraph_rewrite_for_tracking.h"
#include "ngraph_bridge/ngraph_utils.h"
@@ -255,6 +256,13 @@ class NGraphEncapsulationPass : public NGraphRewritePass {
"Graph with NGraphAssigns Optimized/Removed");
}

// 8. Enter Prefetch in catalog then.
TF_RETURN_IF_ERROR(EnterPrefetchInCatalog(options.graph->get(), idx));
if (DumpCatalogedGraphs()) {
DumpGraphs(options, idx, "prefetch-cataloged",
"Graph with Prefetched Inputs Entered in Catalog");
}

return Status::OK();
}

5 changes: 3 additions & 2 deletions ngraph_bridge/enable_variable_ops/ngraph_tracked_variable.cc
@@ -23,11 +23,11 @@
#include "ngraph/event_tracing.hpp"
#include "ngraph/runtime/backend.hpp"

#include "ngraph_bridge/enable_variable_ops/ngraph_var.h"
#include "ngraph_bridge/ngraph_backend_manager.h"
#include "ngraph_bridge/ngraph_catalog.h"
#include "ngraph_bridge/ngraph_freshness_tracker.h"
#include "ngraph_bridge/ngraph_utils.h"
#include "ngraph_bridge/ngraph_var.h"

using namespace std;
namespace ng = ngraph;
@@ -119,7 +119,7 @@ void NGraphVariableOp::Compute(OpKernelContext* ctx) {
<< " ,backend_name " << ng_backend_name_;

std::ostringstream oss;
oss << "NGraphVariable: " << my_instance_id << ": " << name();
oss << "NGVariable::Compute::" << name();
ngraph::Event event_compute(oss.str(), name(), "");

bool log_copies = false;
@@ -250,6 +250,7 @@ void NGraphVariableOp::Compute(OpKernelContext* ctx) {
ctx->record_persistent_memory_allocation(var->tensor()->AllocatedBytes());
}
var->Unref();
event_compute.Stop();
ngraph::Event::write_trace(event_compute);
}

@@ -26,12 +26,12 @@

#include "ngraph/runtime/backend.hpp"

#include "ngraph_bridge/enable_variable_ops/ngraph_var.h"
#include "ngraph_bridge/ngraph_backend_manager.h"
#include "ngraph_bridge/ngraph_catalog.h"
#include "ngraph_bridge/ngraph_freshness_tracker.h"
#include "ngraph_bridge/ngraph_timer.h"
#include "ngraph_bridge/ngraph_utils.h"
#include "ngraph_bridge/ngraph_var.h"

using namespace std;
namespace ng = ngraph;
@@ -24,10 +24,10 @@

#include "ngraph/event_tracing.hpp"

#include "ngraph_bridge/enable_variable_ops/ngraph_var.h"
#include "ngraph_bridge/enable_variable_ops/ngraph_variable_update_ng_tensor_op.h"
#include "ngraph_bridge/ngraph_timer.h"
#include "ngraph_bridge/ngraph_utils.h"
#include "ngraph_bridge/ngraph_var.h"

using namespace std;
namespace ng = ngraph;
@@ -67,6 +67,7 @@ NGraphVariableUpdateNGTensorOp::~NGraphVariableUpdateNGTensorOp() {
void NGraphVariableUpdateNGTensorOp::Compute(OpKernelContext* context) {
std::ostringstream oss;
// Start event tracing
oss << "NGVariableUpdateNGTensor::Compute::" << name();
ngraph::Event event_compute(oss.str(), name(), "");
bool log_copies = false;
OP_REQUIRES_OK(context,
2 changes: 1 addition & 1 deletion ngraph_bridge/ngraph_encapsulate_impl.cc
@@ -45,8 +45,8 @@
#include "ngraph_bridge/ngraph_timer.h"
#include "ngraph_bridge/ngraph_utils.h"

#include "ngraph_bridge/ngraph_var.h"
#if defined(NGRAPH_TF_ENABLE_VARIABLES_AND_OPTIMIZERS)
#include "ngraph_bridge/enable_variable_ops/ngraph_var.h"
#include "ngraph_bridge/ngraph_catalog.h"
#endif

79 changes: 49 additions & 30 deletions ngraph_bridge/ngraph_encapsulate_op.cc
@@ -49,9 +49,9 @@
#include "ngraph_bridge/ngraph_prefetch_shared_data.h"
#include "ngraph_bridge/ngraph_timer.h"
#include "ngraph_bridge/ngraph_utils.h"
#include "ngraph_bridge/ngraph_var.h"

#if defined(NGRAPH_TF_ENABLE_VARIABLES_AND_OPTIMIZERS)
#include "ngraph_bridge/enable_variable_ops/ngraph_var.h"
#include "ngraph_bridge/ngraph_catalog.h"
#endif

@@ -88,13 +88,8 @@ NGraphEncapsulateOp::NGraphEncapsulateOp(OpKernelConstruction* ctx)
ctx, backend != nullptr,
errors::Internal("Cannot get the backend object for BE: ", be_name));

// If we have the VARIABLE capture on then we can't use the
// parallel executor until that support is added.
#if !defined(NGRAPH_TF_ENABLE_VARIABLES_AND_OPTIMIZERS)
// If backend executable can create tensors we use parallel executor
m_use_parallel_executor = backend->executable_can_create_tensors();
#else
m_use_parallel_executor = false;
#endif
Contributor review comment: note to self: important section here.

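The hunk above removes the NGRAPH_TF_ENABLE_VARIABLES_AND_OPTIMIZERS gate, so the executor choice now depends only on whether the backend can create tensors, plus the NGRAPH_TF_USE_LEGACY_EXECUTOR override just below. A self-contained sketch of that selection follows; the Backend type is a stand-in, and the assumption that the environment variable forces the legacy path is inferred from its name.

#include <cstdlib>
#include <iostream>

// Stand-in for the bridge's backend handle; the real check is
// backend->executable_can_create_tensors(), as in the hunk above.
struct Backend {
  bool executable_can_create_tensors() const { return true; }
};

// Prefer the parallel executor when the backend supports it, unless the
// (assumed) legacy override is set in the environment.
bool UseParallelExecutor(const Backend& backend) {
  bool use_parallel = backend.executable_can_create_tensors();
  if (std::getenv("NGRAPH_TF_USE_LEGACY_EXECUTOR") != nullptr) {
    use_parallel = false;  // debugging/testing escape hatch
  }
  return use_parallel;
}

int main() {
  Backend backend;
  std::cout << (UseParallelExecutor(backend) ? "parallel" : "legacy") << "\n";
}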

// Override the switch for debugging/testing
if (std::getenv("NGRAPH_TF_USE_LEGACY_EXECUTOR") != nullptr) {
@@ -402,7 +397,7 @@ NGraphEncapsulateOp::~NGraphEncapsulateOp() {
// OpKernel::Compute
//---------------------------------------------------------------------------
void NGraphEncapsulateOp::Compute(OpKernelContext* ctx) {
ngraph::Event event_compute("Compute", "", "");
ngraph::Event event_compute("NGEncap::Compute::" + name(), name(), "");

if (m_use_parallel_executor) {
NGRAPH_VLOG(1) << "NGraphEncapsulateOp::Compute: Using Parallel Executor";
@@ -459,6 +454,7 @@ void NGraphEncapsulateOp::ComputeUsingParallelExecutor(OpKernelContext* ctx) {
m_parallel_executor->GetTensorPipelineDepth()));

// Get Tensor Manager and some error checking
ngraph::Event event_prepare_ng_tensors("Prepare NG In/Out Tensors", "", "");
auto tensor_manager = m_parallel_executor->GetTensorManager();
int num_of_inputs = tensor_manager->GetNumberOfInputs();
int num_of_outputs = tensor_manager->GetNumberOfOutputs();
@@ -499,14 +495,18 @@ void NGraphEncapsulateOp::ComputeUsingParallelExecutor(OpKernelContext* ctx) {
vector<shared_ptr<ng::runtime::Tensor>> ng_inputs(num_of_inputs);
vector<shared_ptr<ng::runtime::Tensor>> ng_outputs(num_of_outputs);

// All inputs and outputs are pipelined.
// Of all these pipelined inputs some are prefetched
// TODO: Fit in variables
ng_inputs = get<1>(pipelined_io_tensors);
ng_outputs = get<2>(pipelined_io_tensors);
// Prepare NG Input Output Tensors
// Assemble Variable tensors and pipelined tensors to ng_input and ng_outputs
OP_REQUIRES_OK(ctx, GetIOTensorsReadyForExecution(
ctx, tensor_manager, get<1>(pipelined_io_tensors),
get<2>(pipelined_io_tensors), ng_inputs, ng_outputs));
event_prepare_ng_tensors.Stop();
ngraph::Event::write_trace(event_prepare_ng_tensors);

// And execute
ngraph::Event event_execute_graph("Execute Graph", "", "");
ngraph::Event event_execute_graph(
"Execute Graph Pipeline Indx" + to_string(current_iter_pipeline_depth),
"", "");

BackendManager::LockBackend(m_parallel_executor->GetOpBackendName());
NGRAPH_VLOG(4) << "NGraphEncapsulateOp::Compute call starting for cluster "
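Per the new comments, GetIOTensorsReadyForExecution assembles the full ng_inputs/ng_outputs vectors from two sources: the pipelined tensors checked out for this iteration and the variable-backed tensors tracked by the tensor manager. A rough, self-contained sketch of that assembly follows; all types and names below are simplified stand-ins, and only the two-source idea comes from the diff.

#include <memory>
#include <utility>
#include <vector>

struct DeviceTensor {};  // stand-in for ng::runtime::Tensor
using TensorPtr = std::shared_ptr<DeviceTensor>;

// Fill ng_inputs: pipelined tensors go to their designated indexes, and
// variable-backed tensors fill theirs; outputs are assembled the same way.
void AssembleInputs(const std::vector<TensorPtr>& pipelined_inputs,
                    const std::vector<int>& pipelined_indexes,
                    const std::vector<std::pair<int, TensorPtr>>& var_inputs,
                    std::vector<TensorPtr>& ng_inputs) {
  for (size_t i = 0; i < pipelined_indexes.size(); ++i) {
    ng_inputs[pipelined_indexes[i]] = pipelined_inputs[i];
  }
  for (const auto& index_and_tensor : var_inputs) {
    ng_inputs[index_and_tensor.first] = index_and_tensor.second;
  }
}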
@@ -540,12 +540,14 @@ void NGraphEncapsulateOp::ComputeUsingParallelExecutor(OpKernelContext* ctx) {
ngraph::Event::write_trace(event_execute_graph);

// Now prepare the output
ngraph::Event event_copy_output_tensor("Copy Output Tensor", "", "");
// Allocate TF Tensors
NGRAPH_VLOG(4) << "NGraphEncapsulateOp::Compute Allocating TF Output Tensors "
<< m_parallel_executor->GetNgraphClusterId();

std::vector<std::unique_ptr<ngraph::Event>> output_copy_events;
ngraph::Event event_prepare_tf_output_tensors("Prepare TF Output Tensor", "",
"");
vector<Tensor*> tf_output_tensors;
for (auto i = 0; i < ng_exec->get_results().size(); i++) {
std::unique_ptr<ngraph::Event> event_copy_prep(
new ngraph::Event("Copy Prep", "", ""));
auto ng_element = ng_exec->get_results()[i];
auto ng_shape = ng_element->get_shape();
auto ng_element_type = ng_element->get_element_type();
@@ -558,7 +560,7 @@ void NGraphEncapsulateOp::ComputeUsingParallelExecutor(OpKernelContext* ctx) {
TensorShape tf_shape(dims);
Tensor* tf_output_tensor = nullptr;
OP_REQUIRES_OK(ctx, ctx->allocate_output(i, tf_shape, &tf_output_tensor));

tf_output_tensors.push_back(tf_output_tensor);
// Make sure the nGraph-inferred element type agrees with what TensorFlow
// expected.
ng::element::Type expected_elem_type;
@@ -569,28 +571,45 @@ void NGraphEncapsulateOp::ComputeUsingParallelExecutor(OpKernelContext* ctx) {
ctx, ng_element_type == expected_elem_type,
errors::Internal("Element type inferred by nGraph does not match "
"the element type expected by TensorFlow"));
event_copy_prep->Stop();
output_copy_events.push_back(std::move(event_copy_prep));
}

// Now copy the nGraph Tensor to Host Tensor
std::unique_ptr<ngraph::Event> event_copy_d2h(
new ngraph::Event("Device to Host Copy", "", ""));
void* dst_ptr = DMAHelper::base(tf_output_tensor);
// Copy Tensors that are required
NGRAPH_VLOG(4) << "NGraphEncapsulateOp::Compute Read NG Output Tensors "
<< m_parallel_executor->GetNgraphClusterId();

ng_outputs[i]->read(
dst_ptr, ng_outputs[i]->get_element_count() * ng_element_type.size());
std::vector<std::unique_ptr<ngraph::Event>> output_copy_events;

auto output_indexes_to_be_copied =
tensor_manager->GetOutputIndexesThatNeedCopy();
for (auto output_index : output_indexes_to_be_copied) {
// Copy the nGraph Tensor to Host Tensor
std::unique_ptr<ngraph::Event> event_copy_d2h(new ngraph::Event(
"D2H_Output_" + std::to_string(output_index), "", ""));
void* dst_ptr = (void*)DMAHelper::base(tf_output_tensors[output_index]);
ng_outputs[output_index]->read(
dst_ptr, ng_outputs[output_index]->get_element_count() *
ng_outputs[output_index]->get_element_type().size());
Contributor review comment: note to self: instead of calling read on everything, call read only on output_indexes_to_be_copied

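That is what the rewritten loop does: it iterates over tensor_manager->GetOutputIndexesThatNeedCopy() instead of over every result, so outputs that stay on-device are never read back. A self-contained sketch of the pattern follows; FakeDeviceTensor stands in for ng::runtime::Tensor, and only read(dst, nbytes) mirrors the real call.

#include <cstddef>
#include <cstring>
#include <vector>

// Fake device tensor; read(dst, n) mirrors ng::runtime::Tensor::read.
struct FakeDeviceTensor {
  std::vector<char> bytes;
  void read(void* dst, std::size_t n) const {
    std::memcpy(dst, bytes.data(), n);
  }
};

// Copy back only the outputs that something downstream actually consumes.
void CopyRequiredOutputs(const std::vector<FakeDeviceTensor>& ng_outputs,
                         const std::vector<int>& indexes_that_need_copy,
                         std::vector<std::vector<char>>& tf_outputs) {
  for (int index : indexes_that_need_copy) {
    tf_outputs[index].resize(ng_outputs[index].bytes.size());
    ng_outputs[index].read(tf_outputs[index].data(), tf_outputs[index].size());
  }
}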
event_copy_d2h->Stop();
output_copy_events.push_back(std::move(event_copy_d2h));
}

for (auto& next : output_copy_events) {
ngraph::Event::write_trace(*next.get());
}
event_prepare_tf_output_tensors.Stop();
ngraph::Event::write_trace(event_prepare_tf_output_tensors);

event_copy_output_tensor.Stop();
ngraph::Event::write_trace(event_copy_output_tensor);
// Synch Var Output Tensors as required
NGRAPH_VLOG(4)
<< "NGraphEncapsulateOp::Compute Sync NG Output Variable Tensors "
<< m_parallel_executor->GetNgraphClusterId();
ngraph::Event event_update_ngvar_tensors("Update NGVar Tensors", "", "");
OP_REQUIRES_OK(ctx, SyncOutputVarTensors(ctx, tensor_manager));
event_update_ngvar_tensors.Stop();
ngraph::Event::write_trace(event_update_ngvar_tensors);

// Now return them to the cache
NGRAPH_VLOG(4) << "NGraphEncapsulateOp::Returning Tensors "
<< m_parallel_executor->GetNgraphClusterId();
ngraph::Event event_return_tensor("Return Tensor", "", "");
pipelined_tensor_store->return_tensors(current_iter_pipeline_depth);

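The checkout/return pairing in this function (pipelined tensors fetched for current_iter_pipeline_depth at the top, pipelined_tensor_store->return_tensors(current_iter_pipeline_depth) at the end) amounts to a pool of pipeline slots. A simplified, thread-safe sketch of that lifecycle follows; only the return_tensors name and the slot-index idea come from the diff, and the internals are assumed.

#include <mutex>
#include <queue>

// Simplified stand-in for the pipelined tensor store: a pool of slot indexes.
class PipelinedSlotStore {
 public:
  explicit PipelinedSlotStore(int depth) {
    for (int i = 0; i < depth; ++i) free_slots_.push(i);
  }

  // Check out a free slot; returns -1 when all slots are in flight
  // (the real store may block or fail differently).
  int Checkout() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (free_slots_.empty()) return -1;
    int slot = free_slots_.front();
    free_slots_.pop();
    return slot;
  }

  // Name borrowed from the diff: hand the slot back once Compute finishes.
  void return_tensors(int slot) {
    std::lock_guard<std::mutex> lock(mutex_);
    free_slots_.push(slot);
  }

 private:
  std::mutex mutex_;
  std::queue<int> free_slots_;
};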