
Branch 197583446 #19470

Merged 61 commits on May 25, 2018

Commits
f0ee72b
Make the quantize_and_dequantize op use the full quantized range when…
tensorflower-gardener May 19, 2018
66b50ab
[XLA] Regression test for missing virtual destructor.
cdleary May 19, 2018
f3ae367
Delete unused and buggy code.
tensorflower-gardener May 19, 2018
92cdc99
Add 'src_graph' argument to gradients_impl._GradientsHelper.
skye May 19, 2018
81168f2
Add a method to list op names in an ApiDefMap.
tensorflower-gardener May 19, 2018
e34dcfe
Fix compile error due to missing default case in switch statement.
tensorflower-gardener May 19, 2018
9a11178
[XLA] Consistently apply gpu-max-kernel-unroll-factor = 1 in HloTestB…
May 19, 2018
c76b815
Rollforward of CL 197167501, without enabling CUDNN_FFT_TILING_FORWAR…
tensorflower-gardener May 20, 2018
85a19f6
Fixed Pi cross compilation
petewarden May 20, 2018
3a297e6
[XLA] Fix memory leak in ScopedShapedBuffer's move-assignment operator.
May 20, 2018
6b374e9
Extend optimization of Slice operator to StridedSlice.
tensorflower-gardener May 21, 2018
158528e
Enhance error reporting.
tensorflower-gardener May 21, 2018
0f192f9
Automated g4 rollback of changelist 197226707
tensorflower-gardener May 21, 2018
a0e4081
Add a kernel usable as a GEBP inner loop for an LLVM IR GEMM
May 21, 2018
fdd06c5
Optimize multiplications by constants in more cases.
benoitsteiner May 21, 2018
7e5d24d
Turn on dead branch elimination, shape optimization, and remapping by…
benoitsteiner May 21, 2018
7914340
Always enter the handle graph before calling convert_to_tensor in res…
alextp May 21, 2018
0cc2aef
[TF:XLA] Delete cumulative_total_size to simplify the DFS scheduler.
dimvar May 21, 2018
1f10b08
Allow using DNN to only train the embeddings and using the tree model…
tensorflower-gardener May 21, 2018
564fcf1
Disable flaky batch_dataset_op_test
jsimsa May 21, 2018
b28938c
Remove object-based checkpointing probes from Python 3 tf.train.Saver…
allenlavoie May 21, 2018
c734063
Extract out a MatrixMatrixBlockPanelEmitter::Dimensions struct; NFC
May 21, 2018
2746bc3
Internal Change.
May 21, 2018
a4afd46
Optimize more reductions
benoitsteiner May 21, 2018
d67a994
Expose partition_strategy option in embedding_lookup_unique
tensorflower-gardener May 21, 2018
4d03411
Add arithmetic optimizer stage that removes LogicalNot that takes a c…
tensorflower-gardener May 21, 2018
148790a
Support a better interface for the single option case in combinations…
isaprykin May 21, 2018
433bb8e
Improve error message in tensor.cc when IsAligned() test fails
tensorflower-gardener May 21, 2018
753cc5b
Fixes issue with gradient tape when asking for the gradient of an int…
alextp May 21, 2018
0ad7f20
Ensure that saving/restoring iterator in CheckpointInputPipelineHook …
saxenasaurabh May 21, 2018
006e293
Supports initializing an Interpreter with a direct ByteBuffer of nati…
tensorflower-gardener May 22, 2018
c3587c5
Improves documentation of labels and logits arguments in hinge loss m…
petrosmol May 22, 2018
31ca159
Make the quantize_and_dequantize op use the full quantized range when…
tensorflower-gardener May 22, 2018
d913a24
[XLA] Two minor style-guide fixups.
May 22, 2018
b113981
Introduce an option to allocate CUDA unified memory
smit-hinsu May 22, 2018
7c3cd08
Split generated_examples test into multiple test targets
angerson May 22, 2018
1d5c44c
Adds support for specifying a discovery_service_url (via either a par…
May 22, 2018
e4dcf28
Improvements to util/nest.py and data/util/nest.py
May 22, 2018
45e4c1c
s/tfe.GradientTape/tf.GradientTape/
asimshankar May 22, 2018
bbc8fe7
Internal Change
May 22, 2018
065436d
Internal Change
mkuperst May 22, 2018
c0bf28e
Update scan benchmarks to have a range of 16K-128K iterations. As of …
tensorflower-gardener May 22, 2018
eab53f2
[XLA:GPU] Implement trivial (one-replica) cross-replica-sum on XLA:GPU.
May 22, 2018
0289932
Enable tpu.rewrite to work on XLA CPU/GPU backends.
tensorflower-gardener May 22, 2018
6e6dcf4
Unifiy the cuda toolchain definition of gcc/nvcc and cuda-clang.
tensorflower-gardener May 22, 2018
3d428b1
Automated g4 rollback of changelist 197487461
tensorflower-gardener May 22, 2018
f5cb20e
convert Pow op into something that is more recognizable, so we can ha…
tensorflower-gardener May 22, 2018
a5f13f7
batch_util.h is generally useful so moved to util/ from kernels/ wher…
tensorflower-gardener May 22, 2018
e727f74
internal change
tensorflower-gardener May 22, 2018
291f349
[XLA] Optimize ShapeTree<T>
tensorflower-gardener May 22, 2018
2e4a4b4
[XLA:TF] Run buildifier on llvm.BUILD
d0k May 22, 2018
96f4fef
Automated g4 rollback of changelist 197527651
tensorflower-gardener May 22, 2018
62a0eef
Fix a couple of broken links in the Swift For TensorFlow page.
tensorflower-gardener May 22, 2018
8f5a71e
Update calls to addPassesToEmitFile
tensorflower-gardener May 22, 2018
dabc133
Contributing guidelines, style guide and README updates
tensorflower-gardener May 22, 2018
55711f7
[TF:XLA] Fix xla_interpreter_device build
d0k May 22, 2018
5e7d41d
Special case the 'dict' call, which trips other mechanisms for built-…
May 22, 2018
13e7f92
Make init_scope preserve the inner device stack when lifting into a g…
akshayka May 22, 2018
1b97609
Merge commit for internal changes
ankurtaly May 22, 2018
349c26f
fixed DirectSessionWithTrackingAllocTest CostMo…
ankurtaly May 23, 2018
ba0ed27
updated based on CL 197644290
ankurtaly May 23, 2018
14 changes: 11 additions & 3 deletions tensorflow/c/eager/tape.h
@@ -104,6 +104,10 @@ class VSpace {
       gtl::ArraySlice<Gradient*> output_gradients,
       std::vector<Gradient*>* result) const = 0;
 
+  // Marks the following gradient as a result so it's not consumed by backward
+  // functions.
+  virtual void MarkAsResult(Gradient* gradient) const = 0;
+
   // Deletes the input tensor.
   virtual void DeleteGradient(Gradient* gradient) const = 0;
 
@@ -356,8 +360,7 @@ BackpropInitialState<BackwardFunction> PrepareBackprop(
       count_it->second++;
     } else {
       result.tensor_usage_counts[it] = 1;
-      if (sources_set.find(it) == sources_set.end() &&
-          tensor_tape.find(it) != tensor_tape.end()) {
+      if (tensor_tape.find(it) != tensor_tape.end()) {
         tensor_stack.push_back(it);
       }
     }
@@ -522,10 +525,15 @@ Status GradientTape<Gradient, BackwardFunction>::ComputeGradient(
         }
       } else {
         any_gradient_nonzero = true;
-        out_gradients.push_back(vspace.AggregateGradients(grad_it->second));
+        auto new_gradients = vspace.AggregateGradients(grad_it->second);
+        if (sources_set.find(grad_it->first) == sources_set.end()) {
+          gradients.erase(grad_it);
+        } else {
+          grad_it->second.clear();
+          grad_it->second.push_back(new_gradients);
+          vspace.MarkAsResult(new_gradients);
+        }
+        out_gradients.push_back(new_gradients);
       }
     }
     std::vector<Gradient*> in_gradients;
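The tape.h changes above cache the aggregated gradient for requested source tensors back into the gradients map and mark it as a result, so backward functions neither recompute nor free it, while gradients for intermediate tensors are erased once consumed. A minimal self-contained sketch of that ownership pattern, using toy stand-in types (not the real `VSpace`/`GradientTape` API):

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Toy stand-in for the tape's gradient bookkeeping (hypothetical types; the
// real code is templated over Gradient and goes through a VSpace interface).
struct Gradient {
  double value;
  bool marked_as_result = false;
};

// Mimics VSpace::AggregateGradients: sums a list into one new gradient.
Gradient* Aggregate(const std::vector<Gradient*>& grads) {
  double sum = 0;
  for (const Gradient* g : grads) sum += g->value;
  return new Gradient{sum};
}

// The pattern from the diff: if the tensor is NOT a requested source, its map
// entry is erased once consumed; if it IS a source, the aggregated gradient is
// cached back into the map and marked so backward functions don't free it.
// (This toy leaks the pre-aggregation gradients; the real tape deletes them.)
Gradient* ConsumeOrKeep(int tensor_id, const std::set<int>& sources,
                        std::map<int, std::vector<Gradient*>>* gradients) {
  auto it = gradients->find(tensor_id);
  Gradient* aggregated = Aggregate(it->second);
  if (sources.count(tensor_id) == 0) {
    gradients->erase(it);  // intermediate tensor: entry no longer needed
  } else {
    it->second.clear();
    it->second.push_back(aggregated);     // keep the result alive in the map
    aggregated->marked_as_result = true;  // backward fns must not consume it
  }
  return aggregated;
}
```

For a source tensor the same pointer ends up both in `out_gradients` and in the map, which is why the marking step matters: without it, the backward pass would treat the cached result as a disposable intermediate.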
3 changes: 2 additions & 1 deletion tensorflow/compiler/aot/embedded_protocol_buffers.cc
@@ -82,7 +82,8 @@ static StatusOr<string> CodegenModule(llvm::TargetMachine* target_machine,
   llvm::legacy::PassManager codegen_passes;
 
   if (target_machine->addPassesToEmitFile(
-          codegen_passes, ostream, llvm::TargetMachine::CGFT_ObjectFile)) {
+          codegen_passes, ostream, nullptr,
+          llvm::TargetMachine::CGFT_ObjectFile)) {
     return xla::InternalError(
         "Could not create pass pipeline to generate object file");
   }
9 changes: 5 additions & 4 deletions tensorflow/compiler/jit/xla_interpreter_device.cc
@@ -48,10 +48,11 @@ Status XlaInterpreterDeviceFactory::CreateDevices(
   registration.compile_resource_ops = true;
 
   std::unique_ptr<XlaDevice> device;
-  TF_RETURN_IF_ERROR(XlaDevice::Create("Interpreter", DEVICE_XLA_INTERPRETER, 0,
-                                       DEVICE_INTERPRETER_XLA_JIT, options,
-                                       name_prefix, registration,
-                                       /*transfer_as_literal=*/false, &device));
+  TF_RETURN_IF_ERROR(XlaDevice::Create(
+      "Interpreter", DEVICE_XLA_INTERPRETER, 0, DEVICE_INTERPRETER_XLA_JIT,
+      options, name_prefix, registration,
+      /*transfer_as_literal=*/false,
+      /*shape_representation_fn=*/{}, &device));
   devices->push_back(device.release());
   return Status::OK();
 }
2 changes: 2 additions & 0 deletions tensorflow/compiler/tests/BUILD
@@ -196,9 +196,11 @@ tf_xla_py_test(
     name = "oom_test",
     size = "medium",
     srcs = ["oom_test.py"],
+    # TODO(b/80081500): Re-enable on GPU. Disabled on 2018-05-21.
     disabled_backends = [
         "cpu",
         "cpu_ondemand",
+        "gpu",
     ],
     tags = [
         # Allocates very large amounts of memory and does not work under TSAN.
4 changes: 2 additions & 2 deletions tensorflow/compiler/tests/randomized_tests.cc
@@ -619,8 +619,8 @@ std::vector<int64> OpTest::ImageDims(TensorFormat format, int batch,
         dims.push_back(dim);
       }
       break;
-    case FORMAT_NCHW_VECT_C:
-      LOG(FATAL) << "FORMAT_NCHW_VECT_C not supported.";
+    default:
+      LOG(FATAL) << "Tensor format " << ToString(format) << " not supported.";
   }
   return dims;
 }
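The randomized_tests.cc change above replaces a hard-coded `case` for one unsupported format with a `default` clause that names whatever value actually arrived, so any format added later fails loudly instead of falling through silently. A small sketch of the resulting message shape, with a toy enum and `ToString` (hypothetical values, not TensorFlow's real `TensorFormat`):

```cpp
#include <string>

// Toy format enum and stringifier standing in for TensorFlow's TensorFormat
// and ToString(); names here are illustrative only.
enum TensorFormat { FORMAT_NHWC, FORMAT_NCHW, FORMAT_NCHW_VECT_C };

std::string ToString(TensorFormat f) {
  switch (f) {
    case FORMAT_NHWC: return "NHWC";
    case FORMAT_NCHW: return "NCHW";
    case FORMAT_NCHW_VECT_C: return "NCHW_VECT_C";
  }
  return "UNKNOWN";
}

// What the default clause in the diff would log (via LOG(FATAL)) for any
// format the function does not handle.
std::string UnsupportedMessage(TensorFormat f) {
  return "Tensor format " + ToString(f) + " not supported.";
}
```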
22 changes: 22 additions & 0 deletions tensorflow/compiler/tf2xla/cc/BUILD
@@ -27,3 +27,25 @@ cc_library(
         "//tensorflow/core:protos_all_cc",
     ],
 )
+
+tf_gen_op_wrapper_cc(
+    name = "xla_jit_op_gen",
+    out_ops_file = "ops/xla_jit_op",
+    deps = ["//tensorflow/compiler/jit/ops:xla_ops"],
+)
+
+cc_library(
+    name = "xla_jit_ops",
+    srcs = ["ops/xla_jit_op.cc"],
+    hdrs = ["ops/xla_jit_op.h"],
+    deps = [
+        "//tensorflow/cc:const_op",
+        "//tensorflow/cc:ops",
+        "//tensorflow/cc:scope",
+        "//tensorflow/compiler/jit/ops:xla_ops",
+        "//tensorflow/core:core_cpu",
+        "//tensorflow/core:framework",
+        "//tensorflow/core:lib",
+        "//tensorflow/core:protos_all_cc",
+    ],
+)
1 change: 1 addition & 0 deletions tensorflow/compiler/xla/BUILD
@@ -583,6 +583,7 @@ tf_cc_test(
         ":shape_util",
         ":test",
         ":xla_data_proto",
+        "//tensorflow/core:test",
         "//tensorflow/core:test_main",
     ],
 )
17 changes: 17 additions & 0 deletions tensorflow/compiler/xla/service/BUILD
@@ -761,6 +761,23 @@ cc_library(
     ],
 )
 
+tf_cc_test(
+    name = "shaped_buffer_test",
+    srcs = ["shaped_buffer_test.cc"],
+    deps = [
+        ":cpu_plugin",
+        ":device_memory_allocator",
+        ":platform_util",
+        ":shaped_buffer",
+        "//tensorflow/compiler/xla:shape_util",
+        "//tensorflow/compiler/xla:test",
+        "//tensorflow/compiler/xla:test_helpers",
+        "//tensorflow/compiler/xla/tests:xla_internal_test_main",
+        "//tensorflow/core:ptr_util",
+        "//tensorflow/core:test",
+    ],
+)
+
 cc_library(
     name = "executable",
     srcs = ["executable.cc"],
8 changes: 8 additions & 0 deletions tensorflow/compiler/xla/service/cpu/cpu_options.cc
@@ -22,6 +22,8 @@ namespace {
 const char* const kXlaOptimizeForSizeCpuOption = "xla_cpu_optimize_for_size";
 const char* const kXlaDisableVectorizedReduce = "xla_disable_vectorized_reduce";
 const char* const kLlvmIrDotTilingFactor = "xla_llvm_dot_tiling_factor";
+const char* const kXlaEnableExperimentalLlvmIrGemm =
+    "xla_enable_experimental_llvm_ir_gemm";
 
 }  // namespace
 
@@ -54,6 +56,12 @@ tensorflow::gtl::optional<int64> LlvmIrGemvTilingFactor(
   return tensorflow::gtl::nullopt;
 }
 
+bool EnableExperimentalLlvmIrGemm(const HloModuleConfig& config) {
+  const auto& extra_options_map =
+      config.debug_options().xla_backend_extra_options();
+  return extra_options_map.count(kXlaEnableExperimentalLlvmIrGemm) > 0;
+}
+
 }  // namespace options
 }  // namespace cpu
 }  // namespace xla
1 change: 1 addition & 0 deletions tensorflow/compiler/xla/service/cpu/cpu_options.h
@@ -26,6 +26,7 @@ namespace options {
 
 bool OptimizeForSizeRequested(const HloModuleConfig& config);
 bool VectorizedReduceDisabled(const HloModuleConfig& config);
+bool EnableExperimentalLlvmIrGemm(const HloModuleConfig& config);
 tensorflow::gtl::optional<int64> LlvmIrGemvTilingFactor(
     const HloModuleConfig& config);
 
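The cpu_options diff above gates the experimental LLVM IR GEMM behind an entry in the backend-extra-options map: the flag is on when its key is present, regardless of value. A self-contained sketch of that lookup pattern, with a plain `std::map` standing in for the protobuf map returned by `xla_backend_extra_options()` (stand-in names; not the real XLA types):

```cpp
#include <map>
#include <string>

// Key mirroring the constant added in cpu_options.cc.
const char* const kEnableExperimentalGemm =
    "xla_enable_experimental_llvm_ir_gemm";

// Hypothetical stand-in for the DebugOptions extra-options map: boolean
// backend flags are tested by key presence alone; the mapped value is ignored.
bool EnableExperimentalGemm(
    const std::map<std::string, std::string>& extra_options) {
  return extra_options.count(kEnableExperimentalGemm) > 0;
}
```

Presence-based flags keep the option plumbing generic: new backend switches need no schema change, only an agreed-upon key string.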