Deadlock in MapDataset #10369

Closed
snnn opened this Issue Jun 1, 2017 · 3 comments

@snnn
Contributor

snnn commented Jun 1, 2017

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
    no

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
    Linux Ubuntu 14.04

  • TensorFlow installed from (source or binary): source

  • TensorFlow version (use command below):
    95d90ab

  • Bazel version (if compiling from source):
    0.5.0

  • CUDA/cuDNN version:
    None

  • GPU model and memory:
    None

  • Exact command to reproduce:
    python map_dataset_op_test.py

Describe the problem

The process hangs forever

Source code / logs

Below is from map_dataset_op_test.py:

"""Tests for the experimental input pipeline ops."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np

from tensorflow.contrib.data.python.ops import dataset_ops
from tensorflow.python.framework import constant_op
from tensorflow.python.framework import dtypes
from tensorflow.python.framework import errors
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import data_flow_ops
from tensorflow.python.ops import lookup_ops
from tensorflow.python.ops import math_ops
from tensorflow.python.ops import random_ops
from tensorflow.python.ops import string_ops
from tensorflow.python.ops import variable_scope
from tensorflow.python.platform import test


class MapDatasetTest(test.TestCase):

  def _buildParallelMapDataset(self, components, count, num_threads,
                               output_buffer_size):
    def _map_fn(x, y, z):
      return math_ops.square(x), math_ops.square(y), math_ops.square(z)
    return (dataset_ops.Dataset.from_tensor_slices(components).map(
        _map_fn, num_threads=num_threads, output_buffer_size=output_buffer_size)
            .repeat(count))

  def testParallelMapDataset(self):
    """Test a dataset that maps a TF function across its input elements."""
    # The pipeline is TensorSliceDataset -> ParallelMapDataset(square_3) ->
    # RepeatDataset(count).
    components = [np.arange(7),
                  np.array([[1, 2, 3]]) * np.arange(7)[:, np.newaxis],
                  np.array(37.0) * np.arange(7)]
    count = array_ops.placeholder(dtypes.int64, shape=[])
    num_threads = array_ops.placeholder(dtypes.int32, shape=[])
    output_buffer_size = array_ops.placeholder(dtypes.int64, shape=[])

    dataset = self._buildParallelMapDataset(components, count, num_threads,
                                            output_buffer_size)
    iterator = dataset.make_initializable_iterator()
    init_op = iterator.initializer
    get_next = iterator.get_next()

    self.assertEqual([c.shape[1:] for c in components],
                     [t.shape for t in get_next])

    with self.test_session() as sess:
      def do_test(num_threads_val, output_buffer_size_val):
        # Test single-threaded access to the iterator.
        sess.run(init_op, feed_dict={
            count: 14,
            num_threads: num_threads_val,
            output_buffer_size: output_buffer_size_val})
        for _ in range(14):
          for i in range(7):
            result = sess.run(get_next)
            for component, result_component in zip(components, result):
              self.assertAllEqual(component[i]**2, result_component)
        with self.assertRaises(errors.OutOfRangeError):
          sess.run(get_next)

        # Test multi-threaded access to the same iterator.
        sess.run(init_op, feed_dict={
            count: 18,
            num_threads: num_threads_val,
            output_buffer_size: output_buffer_size_val})
        results = []
        def iterator_thread():
          while True:
            try:
              results.append(sess.run(get_next))
            except errors.OutOfRangeError:
              return
        threads = [self.checkedThread(target=iterator_thread) for _ in range(8)]
        for t in threads:
          t.start()
        for t in threads:
          t.join()

        # `results` will contain the same elements components**2
        # repeated 18 times, but in a non-deterministic order. Sort the
        # results, and assert that each element of components**2 is
        # produced 18 times.
        results.sort(key=lambda x: x[0])
        for i in range(7):
          for j in range(18):
            for component, result_component in zip(components,
                                                   results[i * 18 + j]):
              self.assertAllEqual(component[i]**2, result_component)

      for num_threads_val, output_buffer_size_val in [
          (1, 1), (1, 2), (2, 2), (2, 4), (8, 8), (8, 16)]:
        do_test(num_threads_val, output_buffer_size_val)

if __name__ == "__main__":
  test.main()

bt.txt

@snnn

Contributor

snnn commented Jun 1, 2017

(gdb) thr 14
(gdb) f 2
(gdb) p *mutex
$2 = {__data = {__lock = 2, __count = 0, __owner = 28288, __nusers = 1, __kind = 0, __spins = 0, __elision = 0,
	__list = {__prev = 0x0, __next = 0x0}},

Thread 14 is waiting on a mutex owned by LWP 28288 (Thread 12). Thread 12, in turn, is waiting for output_buffer_ to become non-empty.

@asimshankar

Member

asimshankar commented Jun 1, 2017

@snnn: Could you reduce your sample to something simpler? Among other things, it looks like you're running multiple groups of threads. If you can reduce it to the smallest example that demonstrates the failure, that would be helpful.

CC @mrry

@mrry

Contributor

mrry commented Jun 2, 2017

This is definitely a real bug, which I suspect arises because you have 8 or fewer cores on the machine where you're running the test. The issue is that the current implementation of the IteratorGetNext op is a synchronous OpKernel but it can block an inter-op threadpool thread, and the unblocking action may require the use of another inter-op threadpool thread. The default threadpool size is the number of cores in your machine.
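
To make the failure mode described above concrete, here is a minimal pure-Python sketch (it is not TensorFlow code). A `ThreadPoolExecutor` stands in for the inter-op threadpool, `blocking_get_next` stands in for a synchronous IteratorGetNext kernel, and `producer` stands in for the work that fills the output buffer. All of these names are hypothetical, and the `timeout` is an assumption added only so the sketch reports starvation instead of hanging forever the way the real test does.

```python
import concurrent.futures
import queue


def run_pipeline(pool_size, num_consumers, timeout=0.5):
    """Returns True if every consumer received an element, False otherwise."""
    buffer = queue.Queue()

    def blocking_get_next():
        # Holds a pool thread for the whole wait, like a synchronous kernel.
        return buffer.get(timeout=timeout)

    def producer():
        # The "unblocking action": it also needs a pool thread to run.
        for i in range(num_consumers):
            buffer.put(i)

    with concurrent.futures.ThreadPoolExecutor(max_workers=pool_size) as pool:
        consumers = [pool.submit(blocking_get_next)
                     for _ in range(num_consumers)]
        # Submitted last, so it is queued behind the consumers. If they occupy
        # every thread, it cannot start until one of them times out.
        pool.submit(producer)
        # exception() waits for the future; queue.Empty means a timed-out wait.
        return all(f.exception() is None for f in consumers)


print(run_pipeline(pool_size=8, num_consumers=8))  # consumers fill the pool
print(run_pipeline(pool_size=9, num_consumers=8))  # one spare thread
```

Under these assumptions the first call should report False, because at least one consumer has to time out before the producer can get a thread, and the second should report True, because the spare thread lets the producer run right away.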

I'm working on a fix, but there are two short-term workarounds:

  • Increase the size of the inter-op threadpool when you create the session using tf.ConfigProto. Setting it to (maximum number of concurrent get_next() ops) + 1 (i.e. 9 in this case) should address the deadlock.
  • Reduce the number of concurrent get_next() calls to (number of cores) - 1.
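
A minimal sketch of the two workarounds, assuming the TF 1.x API used in this report. The helper names (`inter_op_pool_size`, `max_safe_get_next_calls`, `make_session_config`) are hypothetical and exist only for illustration; `tf.ConfigProto` and its `inter_op_parallelism_threads` field are the real session options referred to above.

```python
import multiprocessing


def inter_op_pool_size(max_concurrent_get_next):
    """First workaround: one thread more than the concurrent get_next() ops."""
    return max_concurrent_get_next + 1


def max_safe_get_next_calls(num_cores=None):
    """Second workaround: at most (number of cores) - 1 concurrent calls."""
    if num_cores is None:
        num_cores = multiprocessing.cpu_count()
    return max(num_cores - 1, 0)


def make_session_config(max_concurrent_get_next):
    # Imported lazily so the arithmetic helpers above work without TensorFlow.
    import tensorflow as tf  # Assumes a TF 1.x install, as in this report.
    return tf.ConfigProto(
        inter_op_parallelism_threads=inter_op_pool_size(
            max_concurrent_get_next))
```

In the test from this report, which starts 8 iterator threads, the first workaround would amount to opening the session with `self.test_session(config=make_session_config(8))`.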

The true fix will involve rewriting IteratorGetNext as an AsyncOpKernel, which I'm working on now....

@mrry mrry self-assigned this Jun 2, 2017

@yifeif yifeif closed this in 8939b85 Jun 3, 2017

av8ramit added a commit to av8ramit/tensorflow that referenced this issue Jun 6, 2017

[tf.contrib.data] Re-implement IteratorGetNext as an AsyncOpKernel.
This prevents the op from consuming an inter-op thread pool thread
when blocked, and fixes a potential deadlock when many IteratorGetNext
ops are blocked. Fixes tensorflow#10369.

PiperOrigin-RevId: 157878885
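
The idea behind the fix can be sketched in pure Python. This is an illustration of the callback pattern only, not the actual AsyncOpKernel implementation: `AsyncBuffer`, `get_async`, and `run_async_pipeline` are hypothetical names. Instead of blocking a pool thread until an element arrives, a consumer registers a callback and returns its thread to the pool immediately, so even a single-thread pool can serve many pending consumers.

```python
import collections
import concurrent.futures
import threading


class AsyncBuffer:
    """Hands elements to callbacks instead of blocking the calling thread."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = collections.deque()
        self._waiters = collections.deque()

    def get_async(self, done):
        with self._lock:
            if not self._items:
                # Nothing available yet: remember the callback and return.
                self._waiters.append(done)
                return
            item = self._items.popleft()
        done(item)

    def put(self, item):
        with self._lock:
            if not self._waiters:
                self._items.append(item)
                return
            done = self._waiters.popleft()
        # Run the callback outside the lock, on the producer's thread.
        done(item)


def run_async_pipeline(pool_size, num_consumers):
    results = []
    results_lock = threading.Lock()

    def done(item):
        with results_lock:
            results.append(item)

    def producer():
        for i in range(num_consumers):
            buf.put(i)

    buf = AsyncBuffer()
    # Leaving the `with` block waits for all submitted tasks to finish.
    with concurrent.futures.ThreadPoolExecutor(max_workers=pool_size) as pool:
        for _ in range(num_consumers):
            pool.submit(buf.get_async, done)
        pool.submit(producer)
    return sorted(results)


print(run_async_pipeline(pool_size=1, num_consumers=8))
```

With a pool of one thread and eight consumers, every consumer still receives its element, which is the property the synchronous version lacks.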

caisq pushed a commit that referenced this issue Jun 13, 2017

Merge changes from github.
END_PUBLIC

---
Commit f0e185d1f authored by Benoit Steiner<bsteiner@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Better handle nodes with a variable number of outputs

PiperOrigin-RevId: 158435028

---
Commit bc3e20807 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Remove unused BUILD dependencies

PiperOrigin-RevId: 158431059

---
Commit a0c80e4d5 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Delete unnecessary (mistakenly duplicated) logging message.

PiperOrigin-RevId: 158428506

---
Commit b6ad1d747 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Adds DNN-only tests for DNNLinearCombinedClassifier.

PiperOrigin-RevId: 158423119

---
Commit ddbb58034 authored by Shanqing Cai<cais@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Remove unnecessary pylint disable

PiperOrigin-RevId: 158416140

---
Commit fcaa724e2 authored by Luke Iwanski<luke@codeplay.com>
Committed by gunan<gunan@google.com>:
[OpenCL] Cleans pack and unpack ops (#10336)

* [OpenCL] Cleans pack op

* [OpenCL] Cleans unpack op

---
Commit 2f53cacb2 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Fix a test failure of quantization_utils_test on ASAN

PiperOrigin-RevId: 158414538

---
Commit 50b2f951c authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Update ops-related pbtxt files.

PiperOrigin-RevId: 158413455

---
Commit 1e90b78e9 authored by Brennan Saeta<saeta@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add CacheDataset ops.

Some input pipelines may pull down data from remote webservers or perform
expensive processing. In order to avoid extraneous work, we now support
caching the dataset (e.g. on disk).

PiperOrigin-RevId: 158411901

---
Commit e16cd2ede authored by Taehoon Lee<taehoonlee@snu.ac.kr>
Committed by gunan<gunan@google.com>:
Fix typos (#10533)

---
Commit 50d80ddf9 authored by Jonathan Hseu<jhseu@google.com>
Committed by Jonathan Hseu<jhseu@google.com>:
Fix fft_ops_test.py for CPU

---
Commit d35cbbb44 authored by Mustafa Ispir<ispir@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add weight-column support to the heads.

PiperOrigin-RevId: 158409180

---
Commit 7fb52cd54 authored by Justin Lebar<jlebar@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Don't crash when displaying XLA metrics if they happen to be negative.

PiperOrigin-RevId: 158407664

---
Commit 12a7a752a authored by Jianfei Wang<me@thinxer.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
Add a tip for tf.train.LoggingTensorHook (#10237)

`INFO` logs are not printed by default unless in IPython. Add a friendly tip for newcomers.
---
Commit 216dcbf1e authored by Luke Iwanski<luke@codeplay.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
[OpenCL] Cleans reduction ops (#10340)

* [OpenCL] Cleans reduction_ops_max.cc

* [OpenCL] Cleans reduction_ops_mean.cc

* [OpenCL] Cleans reduction_ops_min.cc

* [OpenCL] Cleans reduction_ops_prod.cc

* [OpenCL] Cleans reduction_ops_sum.cc

---
Commit 2b351062a authored by Androbin<robin.richtsfeld@gmail.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
Improve docs for selective registration headers (#10351)

* Improve docs for selective registration headers

progressing #10299

* Update print_selective_registration_header.py

* Mention both flags

-DSELECTIVE_REGISTRATION and -DSUPPORT_SELECTIVE_REGISTRATION

---
Commit ee919510f authored by Yun Peng<pcloudy@google.com>
Committed by gunan<gunan@google.com>:
Re-enable some python tests in Windows Bazel build (#10526)

---
Commit b0e881457 authored by Androbin<robin.richtsfeld@gmail.com>
Committed by gunan<gunan@google.com>:
[Bash] Declare and assign separately (#10509)

As proposed by static analysis tool:
https://github.com/koalaman/shellcheck/wiki/SC2155
---
Commit 284901b08 authored by Androbin<robin.richtsfeld@gmail.com>
Committed by gunan<gunan@google.com>:
[Bash] Remove unquoting quotes (#10506)

As proposed by static analysis tool:
https://github.com/koalaman/shellcheck/wiki/SC2027
---
Commit 2a1f11556 authored by ksellesk<zhengdachuan200305@gmail.com>
Committed by ksellesk<zhengdachuan200305@gmail.com>:
Fix AttributeError in resnet.py

There is no function tf.softmax() in Tensorflow 1.x.

When running the old code, Python interpreter complains:

File "resnet.py", line 152, in res_net_model
prediction, loss = res_net(x, y)
File "resnet.py", line 148, in res_net
return tf.softmax(logits), loss
AttributeError: 'module' object has no attribute 'softmax'

---
Commit 1d68f729b authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Remove unneeded BUILD dependency

PiperOrigin-RevId: 158391996

---
Commit 08ed32dbb authored by Yun Peng<pcloudy@google.com>
Committed by gunan<gunan@google.com>:
Windows: Make TensorFlow build without --cpu=x64_windows_msvc (#10466)

* Windows: Make TensorFlow build without --cpu=x64_windows_msvc

As of Bazel 0.5.0, the MSVC toolchain is the default toolchain on
Windows, so --cpu=x64_windows_msvc is not required as long as we adjust
the BUILD files in TensorFlow.

--cpu=x64_windows_msvc is still supported for now, but is deprecated.
The configuration for cpu value x64_windows_msvc is a duplicate of
x64_windows, which should be removed in the future.

* Fix breakage on macOS

---
Commit 02dbe153a authored by Androbin<robin.richtsfeld@gmail.com>
Committed by gunan<gunan@google.com>:
[Bash] Simplify Conditional (#10503)

---
Commit c07bc581f authored by Androbin<robin.richtsfeld@gmail.com>
Committed by gunan<gunan@google.com>:
[Bash] Prefer read -a to split path (#10508)

As proposed by static analysis tool:
https://github.com/koalaman/shellcheck/wiki/SC2207
---
Commit 0a389674d authored by Androbin<robin.richtsfeld@gmail.com>
Committed by gunan<gunan@google.com>:
[Bash] Prefer [ p ] && [ q ] over [ p -a q ] (#10507)

As proposed by static analysis tool:
https://github.com/koalaman/shellcheck/wiki/SC2166
---
Commit 87a008ec3 authored by Jonathan Hseu<vomjom@vomjom.net>
Committed by gunan<gunan@google.com>:
Delete non-deterministic testEmpty() test (#10512)

---
Commit 3a2971bd8 authored by Frank Chen<frankchn@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Adds the base for ClusterResolvers, a new way of communicating with and retrieving cluster information for running distributed TensorFlow.

Implementations of this class would eventually allow users to simply point TensorFlow at a cluster management endpoint, and TensorFlow will automatically retrieve the host names/IPs and port numbers of TensorFlow workers from the cluster management service.

PiperOrigin-RevId: 158358761

---
Commit 28b4e7f04 authored by Jonathan Hseu<vomjom@vomjom.net>
Committed by gunan<gunan@google.com>:
Disable stage_op_test and map_stage_op_test (#10516)

---
Commit 390e57a75 authored by Yan (Asta) Li<yanastali@users.noreply.github.com>
Committed by Benoit Steiner<benoitsteiner@users.noreply.github.com>:
Check EIGEN_MAX_ALIGN_BYTES to prevent mod-by-0 (#10380)

* Check EIGEN_MAX_ALIGN_BYTES to prevent mod-by-0

If EIGEN_MAX_ALIGN_BYTES is set to 0, alignment checks that mod by EIGEN_MAX_ALIGN_BYTES fail at runtime.

* Returns true, as in tensorflow/core/framework/tensor.h
* Update unit tests

* Enable tests only if EIGEN_MAX_ALIGN_BYTES > 0

---
Commit cd5ac40b3 authored by Peter Hawkins<phawkins@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Update LLVM to upstream revision r304927.
Add LLVM build rules for the LLVM AMDGPU backend, commented out by default. Fixes issue #10437.

PiperOrigin-RevId: 158351480

---
Commit 91cb809bd authored by David Norman<DavidNorman@users.noreply.github.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
[XLA] Add ability to run the XLA unit tests against a different device (#9759)

* Add ability to run the XLA unit tests against a different device

* Allow for multiple extra backend devices

* Correct merge error

* Include options for additional tags

---
Commit aff4d124b authored by Yuxin Wu<ppwwyyxxc@gmail.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
Compare base_dtype instead of dtype in piecewise_constant (#10280)

* Compare base_dtype instead of dtype in piecewise_constant

Compare base_dtype instead of dtype in piecewise_constant. Fix #10086

* add unit test

* Small lint fix and comment

---
Commit 845539f98 authored by Jianwei Xie<xiejw@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add evaluation test for linear classifier (n==2 or n >2).

PiperOrigin-RevId: 158340296

---
Commit 7c46214ab authored by Jonathan Hseu<vomjom@vomjom.net>
Committed by GitHub<noreply@github.com>:
Fix numpy 1.13 incompatibilities (#10501)

* Fix numpy 1.13 incompatibilities

* Skip tests with numpy 1.13.0

---
Commit 4572c41df authored by gunan<gunan@google.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
A few changes to kernel_tests. (#10502)

* Disable reader_ops_test on windows.

* Run buildifier on kernel_tests/BUILD

* Mark map_stage_op_test as large.

* Set the size of stage_op_test to large

---
Commit 892293d98 authored by Brennan Saeta<saeta@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Set a default for datasets end_of_sequence.

While all datasets carefully set the end_of_sequence to true at the
appropriate time, some datasets might forget to set it to false in the normal
case. In order to avoid potential undefined behavior, we set the
end_of_sequence variable to be false by default.

PiperOrigin-RevId: 158337799

---
Commit 187404eac authored by Benoit Steiner<bsteiner@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Set up the env, since ops such as MatchFileOp rely on it.

PiperOrigin-RevId: 158336344

---
Commit 2741561c8 authored by Justine Tunney<jart@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Fix up vz_projector script structure

We now make sure scripts and HTML imports are declared in the correct
places. In the future, pedantically listing script tags should not be
necessary.

PiperOrigin-RevId: 158334306

---
Commit beeaade46 authored by Kay Zhu<kayzhu@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Resubmit a reverted change. Original description:

[XLA] Enable HloEvaluator for constant folding, also merged a few operations
from hlo_constant_folding to hlo_evaluator.

Additionally:
- In ShapeUtil::ForEachIndex:
    * fix a bug where visitor is called when the shape has zero elements (e.g., F32{1,0})
    * added test case for ForEachIndex.

- In HloEvaluator:
    * Instead of copying and caching a Constant instruction, return the literal directly if the instruction is constant.
    * Fix an issue where TUPLE and OPAQUE primitives are not keyed in the templated typed_visitor.
    * Use (fixed) LiteralUtil::Populate to populate resulting literal, fixes the preexisting bug in the evaluator where R0 and shape with zero size dimensions are not handled.
    * Refactor ElementWiseUnaryOp and HandleCompare to be templatized on the operand's type.
    * Refactor IsFinite to be top level since it is only applicable to floats and the return type is always boolean.
    * Change from std::remainder to std::fmod for kRemainder to be compliant with existing XLA behavior.
    * Change from std::max and std::min to std::fmax and std::fmin to handle NaNs.
    * Minor comments fix.

PiperOrigin-RevId: 158330052

---
Commit b94540e6f authored by Toby Boyd<tobyboyd@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
tf.layers.conv2d use_bias=True to use nn.bias_add

PiperOrigin-RevId: 158326493

---
Commit 379aa9911 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Go: Update generated wrapper functions for TensorFlow ops.

PiperOrigin-RevId: 158325855

---
Commit 4e529f0f1 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Update ops-related pbtxt files.

PiperOrigin-RevId: 158325293

---
Commit 0a9d2dac0 authored by Yuefeng Zhou<yuefengz@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add a util function in virtual placer to return canonicalized device string, which can be used to fix the node's device field before passing them to the maxcut algorithm.

PiperOrigin-RevId: 158322753

---
Commit 2d8da1d9b authored by Daniel Ylitalo<daniel@blodan.se>
Committed by gunan<gunan@google.com>:
Recognize CPU core count in FreeBSD (#10490)

---
Commit c19e6cac0 authored by Peter Hawkins<phawkins@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[TF:XLA] Initial implementation of TensorArray ops.

The XLA implementation of TensorArrays is more restrictive than regular TensorArrays:
* XLA TensorArrays must have dynamic_size=False.
* all elements in an XLA TensorArray must have the same shape.
* writes always add their values to any existing values; neither reads nor writes ever issue errors. Out-of-bounds writes currently wrap.

Refactor Variable handling in the TF/XLA bridge. Use a XlaVariable* to refer to variables inside compilation rather than a numerical ID. Allow for variables that don't correspond to variables known to the user. Also use XlaVariable to handle TensorArrays.

PiperOrigin-RevId: 158322041

---
Commit b5e8d3086 authored by Peter Hawkins<phawkins@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[TF:XLA] Refactor randomized tests to allow testing of larger inputs without running out of memory.

PiperOrigin-RevId: 158321431

---
Commit 5d90bbaac authored by Kay Zhu<kayzhu@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Disable constant_folding in test base, so that intended test code paths
would not be elided by constant_folding pass.

PiperOrigin-RevId: 158317641

---
Commit 036ce8ba6 authored by Luke Iwanski<luke@codeplay.com>
Committed by gunan<gunan@google.com>:
[OpenCL] Cleans dense_update_ops (#10335)

* [OpenCL] Cleans dense_update_ops

* Acts on feedback from: #10335#discussion_r120536460

---
Commit 85f968125 authored by Luke Iwanski<luke@codeplay.com>
Committed by gunan<gunan@google.com>:
[OpenCL] Cleans cast operation (#10330)

* [OpenCL] Removes not needed typedef for SYCLDevice

* [OpenCL] Fixes formatting

* [OpenCL] use SYCLDevice for int32 cast case

---
Commit bff5e72da authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Fix typo.

PiperOrigin-RevId: 158310742

---
Commit 38249d6be authored by Shanqing Cai<cais@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Swap the order of NanTensorHook and custom hooks

to ensure that when the training encounters NaNs in the loss function, user-supplied hooks such as tf_debug.LocalCLIDebugHook can still be used to debug the root cause of the numeric issues.

PiperOrigin-RevId: 158310249

---
Commit 599727c65 authored by Eli Bendersky<eliben@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Propagate debug option flags to hlo_test_base.

Specific HLO tests have to replace the generic test_main target with a manual
main() that invokes RUN_ALL_TESTS.

To get access to a module with debug options set up, a new convenience method
is created on HloTestBase.

Initially algebraic_simplifier_test is modified as a canary; in a followup
we'll convert all HLO tests to this approach.

PiperOrigin-RevId: 158309488

---
Commit 0770393e9 authored by Eric Liu<ioeric@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[Tensorboard] Add a trace viewer component to TensorBoard.

We make the trace viewer a separate app; otherwise, there would be dependency
conflicts (e.g. Polymer) between the trace viewer app and the tensorboard app.
The trace viewer app would be served by a plugin, and Tensorboard dashboard will integrate trace viewer app using iframe in the
future.

This CL also added "mominify" support for link import HTML tags in the
tensorboard home-grown java vulcanizer; otherwise, the vulcanized trace viewer code
would crash the java vulcanizer.

For the open-source build, we add a dependency on the Catapult github repository
(https://github.com/catapult-project/catapult/tree/master/tracing). We use a bazel genrule to vulcanize a trace viewer binary which is then used in the
tf-trace-viewer component.

PiperOrigin-RevId: 158309408

---
Commit 85e832201 authored by RJ Ryan<rjryan@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Support unknown emit shapes in tf.nn.raw_rnn.

PiperOrigin-RevId: 158308002

---
Commit edb5fed7f authored by Mustafa Ispir<ispir@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add label-vocab support to binary logistic head.
Add assertion that binary classifier label is in range [0., 1.]
Fixed Classifier Integration tests.

PiperOrigin-RevId: 158307521

---
Commit f8e1cf8fa authored by Justine Tunney<jart@google.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
Open up visibility of tf_imports (#10500)

This also fixes the definition of Clutz.
---
Commit 9fd7cf054 authored by Luke Iwanski<luke@codeplay.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
[OpenCL] Cleans relu ops (#10343)

* [OpenCL] register relu ops to gpu types (no half)

* [OpenCL] Removes #undef EIGEN_USE_SYCL

---
Commit 09c1455e3 authored by Luke Iwanski<luke@codeplay.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
[OpenCL] Cleans reverse_op.cc (#10346)

---
Commit b7892a30f authored by orome<royl@aldaron.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
Clarify tf.matmul documentation (#10381)

* Update math_ops.py

* Fix non-ascii character

---
Commit 9786b7062 authored by Luke Iwanski<luke@codeplay.com>
Committed by Benoit Steiner<benoitsteiner@users.noreply.github.com>:
[OpenCL] Cleans StridedSlice Op (#10314)

* [OpenCL] Cleans StridedSlice Op

* [OpenCL] Removes half from registered types

---
Commit f105df047 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
In the CUDA path of depthwise_conv2d, optimize backward filter convolution for images 2 or 4 times smaller than 16x16. Also initialize in_cols from blockDim, to fix the regression caused in CL 157906773.

PiperOrigin-RevId: 158296136

---
Commit 492afc2e3 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Go: Update generated wrapper functions for TensorFlow ops.

PiperOrigin-RevId: 158295169

---
Commit abe0877ef authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add bazel version check to .configure

PiperOrigin-RevId: 158294569

---
Commit b702e7e79 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Update ops-related pbtxt files.

PiperOrigin-RevId: 158294289

---
Commit 94085bee7 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Replace std::function object with regular function.

The function is called recursively, and the std::function object had only existed to allow recursion from within a lambda expression. A regular function should be cheaper than a polymorphic function wrapper.

PiperOrigin-RevId: 158292415

---
Commit ba656b261 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Use template specialization instead of overloaded methods. This is a more appropriate tool here. NFC

PiperOrigin-RevId: 158292035

---
Commit 55f987692 authored by Yutaka Leon<yleon@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Make tf.contrib.lookup python functions use the kernels v2 that uses the resource tensor as handler.

PiperOrigin-RevId: 158291836

---
Commit ebae3deba authored by Wei Ho<weiho@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Switch back to max_num_rows_to_load instead of reading slice by slice due to performance regression from network overhead.

Add check when using initializing values to avoid seg fault

PiperOrigin-RevId: 158291218

---
Commit 7b4c01794 authored by RJ Ryan<rjryan@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Support numpy-style padding and slicing of tf.spectral.rfft/irfft to match the desired FFT length.

Fixes incorrect RFFT/IRFFT results when fft_length does not match the input dimension.

PiperOrigin-RevId: 158289991

---
Commit fdb8e2935 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Update iOS examples to use CocoaPods, and moved to tensorflow/examples/ios

PiperOrigin-RevId: 158289285

---
Commit d86167b5f authored by Amit Patankar<amitpatankar@google.com>
Committed by Amit Patankar<amitpatankar@google.com>:
Merging rc2 back into master.

---
Commit dffea202a authored by Eli Bendersky<eliben@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Clean up some code after previous CL

PiperOrigin-RevId: 158282834

---
Commit 7b5302af0 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Adds ability to set a "family" attribute in Tensorflow summaries, which
controls the "tab name" of the summary that is displayed.

This solution keeps using name_scope to keep names unique, but then prefixes the tag with the family name if provided.

PiperOrigin-RevId: 158278922

---
Commit 611c82b5b authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Adds integration test for DNNLinearCombined((Classifier)|(Regressor)).

PiperOrigin-RevId: 158278512

---
Commit cc6c91a9a authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Remove a further unused proto header inclusion

PiperOrigin-RevId: 158278026

---
Commit 9f17c26ca authored by Mark Heffernan<meheff@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Add HloLocation to dataflow analysis.
Add an HloLocation abstraction to dataflow analysis which indicates where (in the output of what instruction and at which index) an HloValue may appear. Previously only uses were stored with an HLO value where a use is an edge in the HLO graph (instruction, operand number and ShapeIndex).

Also, change the handling of tuple-shaped kSelect instructions when ssa_form is true. Previously a phi value would be created. With this change the value set instead contains the union of its inputs, identical to the ssa_form=false case.

PiperOrigin-RevId: 158276598

---
Commit b9d5e1441 authored by Eli Bendersky<eliben@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Start collecting flags for debug options in a single place.

ClientLibraryTestBase will now parse command-line flags for debug options
automatically, permitting subclasses to override certain options by using
mutable_debug_options.

main() still has to call AppendDebugOptionsFlags() explicitly before running
the TF flag parser. In the meantime, this CL leaves flag handling to the
current "legacy" approach. However, this is part of a larger plan to move *all*
debugging flags for XLA into the DebugOptions message and expose them as flags
from a single place. The other flags (which are not controlling debugging
options) will have to be propagated more explicitly.

PiperOrigin-RevId: 158276294

---
Commit 3b6fe94bb authored by Benoit Steiner<bsteiner@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Properly handle shape nodes that have a preexisting control dependency

PiperOrigin-RevId: 158274845

---
Commit 1d67379d5 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Minor cleanup

PiperOrigin-RevId: 158268933

---
Commit 41997756c authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Sort header inclusions; define EIGEN_USE_THREADS where headers depend on it.

PiperOrigin-RevId: 158267803

---
Commit 85355f015 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add missing header inclusion

PiperOrigin-RevId: 158265934

---
Commit 3cf88d390 authored by Gunhan Gulsoy<gunan@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
When GPU is configured, do not require --config=cuda.
Also fix indentation in configure.

PiperOrigin-RevId: 158232959

---
Commit f48673b50 authored by Luke Iwanski<luke@codeplay.com>
Committed by gunan<gunan@google.com>:
[OpenCL] Removes ReductionFunctor for SYCLDevice (#10326)

We are using Eigen implementation
---
Commit 1b6453bec authored by Joan Puigcerver<joapuipe@gmail.com>
Committed by gunan<gunan@google.com>:
Fixes issue #10258 (#10366)

On CUDA versions previous to 8.0, only __shared__ variables could be declared as static in the device code.
---
Commit cd56a638d authored by Beomsu Kim<123bskim@naver.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
Fixed wrong range in docstring (#10272)

---
Commit d13ae380c authored by Michał Jastrzębski<michal.jastrzebski@intel.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
Fix CMD in Dockerfile (#10444)

Currently Notebook fails execution because default user for this container is root, and unless explicitly allowed, jupyter notebook will not start.
---
Commit 8118ab4ec authored by Simon Perkins<simon.perkins@gmail.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
Support partial gets in MapStagingArea (#10276)

* Modify map staging area tests

- size from `small` to `medium`
- introduce 2 shards

* Add partial get support in MapStagingArea

A partial list of tensors in a (key, value) map entry can now be
requested. Once all tensors associated with the entry are removed,
it is removed from the map.

* Correct output/indices mismatch errors

* Rename IncompleteTuple to OptionalTuple

* Add partial get test with indices

* Add some more index checks

* Improve stage test case graph creation

Test sessions (and default graphs) are reused by default.
Create explicit, finalized graphs in each test to prevent
possible interactions between stateful Staging Areas and
other ops created in separate tests.

* Make staging area tests small and remove shards

They were originally made 'medium' to ameliorate timeouts in the test
case, but they usually run in ~1s so they should be small.

* Improve imports

Avoid importing base tensorflow package

* Support both python 2 and python 3 range.

* Set map_stage_op_test to size=large

* Convert the tests to size=medium
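
The partial-get behaviour described in the MapStagingArea commit above can be illustrated with a plain-Python sketch. `PartialGetMap` and its methods are hypothetical names, not the TensorFlow API; the sketch only models the stated semantics, in which a subset of an entry's tensors can be requested by index and the entry is removed once every tensor has been taken.

```python
class PartialGetMap(object):
  """Toy model of a (key -> tuple of values) map that allows partial gets."""

  def __init__(self):
    self._entries = {}  # key -> {index: value} for values not yet removed

  def put(self, key, values):
    self._entries[key] = dict(enumerate(values))

  def get(self, key, indices):
    """Removes and returns the values at `indices` for `key`."""
    entry = self._entries[key]
    missing = [i for i in indices if i not in entry]
    if missing:
      raise KeyError("indices %r are not staged for key %r" % (missing, key))
    out = [entry.pop(i) for i in indices]
    if not entry:
      # Every value has been taken, so the entry itself goes away.
      del self._entries[key]
    return out

  def __contains__(self, key):
    return key in self._entries
```

For example, after `m.put("k", ["a", "b", "c"])`, calling `m.get("k", [0, 2])` leaves the entry in place with only index 1 remaining, and a later `m.get("k", [1])` removes the entry entirely.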

---
Commit 0df102b0a authored by Androbin<robin.richtsfeld@gmail.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
Update `configure` script sample (#10455)

The `configure` script was changed regularly since the generation of the sample.
This PR updates the sample to reflect those changes.
---
Commit f6dc1ac61 authored by Earthson Lu<Earthson.Lu@gmail.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
MKL_INSTALL_PATH should not be ignored when given (#10180)

* MKL_INSTALL_PATH should not be cleared when given

* fix overwrite by default

---
Commit 8ad6a036e authored by Asim Shankar<ashankar@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Java: Update Maven release to 1.2.0-rc2

PiperOrigin-RevId: 158212897

---
Commit 15eddf035 authored by Fritz Obermeyer<fritz.obermeyer@gmail.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
Export C API symbols in _pywrap_tensorflow_internal.so (#10469)

* Export C API symbols

* Export C API symbols under config:default

---
Commit 754e12668 authored by Luke Iwanski<luke@codeplay.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
[OpenCL] Removes half concat op registration (#10331)

---
Commit cfdc22dee authored by Peng Yu<yupbank@users.noreply.github.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
fix the error (#10293)

---
Commit 58747e357 authored by Joel Hestness<jthestness@gmail.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
PhiloxRandom: Fix race in GPU fill function (#10298)

* PhiloxRandom: Fix race in GPU fill function

The PhiloxRandom fill kernel for the GPU had race conditions that caused the
outputs to be non-deterministic. In particular, the code previously executed
with N GPU threads (# thread contexts per GPU), but it would only advance the
fill addresses by N-1 stride in each step. This incorrect stride caused the
0th and N-1st threads to write to the same memory locations, racing for which
was last to write their common locations. Make the stride equal to the number
of threads to eliminate the race.

BONUS: By fixing this race, PhiloxRandom constant-sized GPU initializers now
match CPU initializers.

* Update random_ops_test.py to find race conditions

Increasing the size of arrays in the random_ops_test.py test to manifest
the race conditions to be resolved.
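
The stride problem described in the PhiloxRandom commit above can be reproduced with a small simulation. This is a simplified model, not the GPU kernel: it assumes each simulated thread writes one element per step, and the function names are made up for illustration.

```python
from collections import Counter


def write_counts(num_threads, stride, size):
  """Counts how many simulated threads write each offset in [0, size)."""
  counts = Counter()
  for thread_id in range(num_threads):
    # Each thread starts at its own id and advances by `stride` per step.
    for offset in range(thread_id, size, stride):
      counts[offset] += 1
  return counts


def contended_offsets(num_threads, stride, size):
  """Returns the offsets written by more than one thread."""
  counts = write_counts(num_threads, stride, size)
  return sorted(o for o, c in counts.items() if c > 1)
```

With 4 threads and a stride of 3 (that is, N-1), thread 0 and thread 3 both write offsets 3, 6, 9, and so on. With a stride of 4, every offset is written exactly once.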

---
Commit 2cbcda08f authored by Androbin<robin.richtsfeld@gmail.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
Fixed formatting in Linux install guide (#10353)

Formatting issues were introduced in PR #8825, commit f30918b3694afe844990cbddc82e27e023d88856
---
Commit ab5f38560 authored by Lakshay Garg<lakshayg@outlook.in>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
Fixed typos in documentation & READMEs (#10365)

---
Commit 94dc1dbfa authored by Christos Nikolaou<cNikolaou@users.noreply.github.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
Enable figures in the tfprof README.md (#10372)

---
Commit 3018d4678 authored by Taehoon Lee<taehoonlee@snu.ac.kr>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
Fix typos (#10386)

---
Commit c5f3c6171 authored by Daniel Rasmussen<drasmuss@users.noreply.github.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
Fix unbatch for Datasets with multiple elements (#10401)

* Fix unbatch for datasets with multiple elements

* fixup! pylint (indent two spaces instead of four)

---
Commit 8b065bc10 authored by Yong Tang<yong.tang.github@outlook.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
Fix unaligned args in api_docs/python/tf/contrib/learn/Evaluable (#10423)

This commit fixes unaligned args in api_docs/python/tf/contrib/learn/Evaluable

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
---
Commit 8f89b654f authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Profile memory usage in VirtualScheduler and report peak memory usage.
To do so, NodeState now handles different output ports of a node (in case
a node has multiple outputs).

Also, VirtualScheduler code is cleaned up with more comments.

PiperOrigin-RevId: 158209068

---
Commit 0ea0bf5aa authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add a frontend for viewing the first ops that exhibit bad values (NaN, +/- Inf).

This helps the user identify problematic ops. Also moved the debugger data logic within tf-graph-info into a new tf-graph-debugger-data-card component.

PiperOrigin-RevId: 158208679

---
Commit ed47ecf2d authored by Luke Iwanski<luke@codeplay.com>
Committed by Benoit Steiner<benoitsteiner@users.noreply.github.com>:
[OpenCL] Cleans variable op (#10333)

* [OpenCL] Cleans variable op

* Fixes formatting and float / double -> GPU_NUMBER_TYPES_NO_HALF

---
Commit 9b2c1af63 authored by Luke Iwanski<luke@codeplay.com>
Committed by Benoit Steiner<benoitsteiner@users.noreply.github.com>:
[OpenCL] Improves device reporting (#10462)

Prints: id, type, name, vendor and profile of the device
---
Commit 7f5384dcc authored by Alexandre Passos<apassos@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Making load() work for resource variables.

PiperOrigin-RevId: 158205361

---
Commit 05412bd36 authored by Mark Heffernan<meheff@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Simplify Shape traversal visitors.
Simplify shape traversal visitors in ShapeUtil and ShapeTree. Add a non-Status form because most uses of the traversal methods do not use it, and remove is_leaf parameter from ShapeTree.ForEach* as it is not frequently used.

PiperOrigin-RevId: 158201574

---
Commit 69c9365b4 authored by Mustafa Ispir<ispir@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Extracted linear estimator testing utils to be reused by dnn-linear-combined.
Added tests for linear part of dnn-linear-combined estimator.

PiperOrigin-RevId: 158200827

---
Commit 65ce8c723 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add arrowheads to dataflow edges.
Make reference edges orange.
Remove animations from tooltips in the graph documentation.

Previously, arrowheads were only added to reference edges (because we assumed users knew about the convention that arrowless edges flow upwards). That decision nicely reduces clutter. However, recently, some internal and external folks have expressed confusion, and so I want to try adding arrowheads to all data flow edges. And make the reference edges starkly different.

See #10428

PiperOrigin-RevId: 158195388

---
Commit bf4c3dd6b authored by gunan<gunan@google.com>
Committed by GitHub<noreply@github.com>:
Revert "Fix patching issue on Windows" (#10472)

This reverts commit 47e6785646a1266f01a1a570bd799f8518ee2997.

---
Commit b49515539 authored by David Soergel<soergel@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add only string constants to ASSET_FILEPATHS collection.

PiperOrigin-RevId: 158192152

---
Commit 51acad09c authored by Sergio Guadarrama<sguada@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add tests with different delta to huber_loss.

PiperOrigin-RevId: 158191361

---
Commit a4e7b7add authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Fixes a bug in setting default optimizers for DNNLinearCombinedClassifier.

PiperOrigin-RevId: 158190192

---
Commit ddd67e333 authored by Luke Iwanski<luke@codeplay.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
[OpenCL] Cleans reshape.cc (#10347)

* [OpenCL] Cleans reshape.cc

* Removes half and complex numbers.

 Half is an extension, and complex numbers need an implementation in Eigen first

---
Commit 3ca653304 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Update ops-related pbtxt files.

PiperOrigin-RevId: 158186454

---
Commit 8cda8660e authored by Luke Iwanski<luke@codeplay.com>
Committed by gunan<gunan@google.com>:
[OpenCL] Cleans sendrecv_ops.cc (#10345)

---
Commit 6915bb919 authored by Luke Iwanski<luke@codeplay.com>
Committed by gunan<gunan@google.com>:
[OpenCL] Cleans Slice op (#10341)

---
Commit 54998b45d authored by Michele Colombo<m-colombo@users.noreply.github.com>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
BasicRNNCell comment fix (#10467)

---
Commit df5906fb7 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Mark saver/restore ops that depend on filesystem as stateful to disable them
from being folded into a constant by graph optimizer.

PiperOrigin-RevId: 158182282

---
Commit 96cb4d182 authored by Sergio Guadarrama<sguada@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add support of scale_l1 == 0. or scale_l2 == 0 to l1_l2_regularizer.
Added tests.

PiperOrigin-RevId: 158179790

---
Commit b65eb3f9b authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Speed up atrous_convolution_test by combining evaluations.

To make this test run faster (and prevent it from timing out under
certain circumstances), this change combines all evaluations for each
test method into a single call to Session.run, to eliminate overhead.

This reduces the test time from about 40 seconds to 10 seconds.

RELNOTES: n/a
PiperOrigin-RevId: 158175227

---
Commit b440abce7 authored by Gao, Xiang<qasdfgtyuiop@gmail.com>
Committed by Rasmus Munk Larsen<rmlarsen@google.com>:
add Cuda{2D,3D}LaunchConfig that maximizes occupancy (#10032)

* add Cuda{2D,3D}LaunchConfig that max occupancy

* remove default val, check input<=0

* add max size check

* fix typo

* tests, docs, and related changes

* build the test

* buildify

* cudaOccupancy... call check success, and style fix

---
Commit 81cf61fdb authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Initialize tensor in graph_properties_test, to avoid msan complaint.

PiperOrigin-RevId: 158169374

---
Commit cabc5c35c authored by Eli Bendersky<eliben@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Add xla_disable_hlo_passes to DebugOptions

Also add a SetDebugOptions method to ClientLibraryTestBase; this lets us set
debug options in tests by calling it.

As an example, this CL removes the current way of passing
xla_disable_hlo_passes programmatically in tests - it used to employ a special
constructor parameter which is no longer required.

PiperOrigin-RevId: 158169006

---
Commit 187d23337 authored by Luke Iwanski<luke@codeplay.com>
Committed by gunan<gunan@google.com>:
[OpenCL] Cleans Pad op (#10339)

---
Commit e8bc38ef6 authored by gunan<gunan@google.com>
Committed by GitHub<noreply@github.com>:
Fix test failures on windows. (#10470)

---
Commit 2b3535c64 authored by David Soergel<soergel@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Minor docstring fix for build_parsing_serving_input_receiver_fn

PiperOrigin-RevId: 158163615

---
Commit e55f2e036 authored by Benoit Steiner<bsteiner@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Propagates constants through switch nodes.

PiperOrigin-RevId: 158163537

---
Commit b01d4b905 authored by Jacques Pienaar<jpienaar@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Remove outdated todo.

PiperOrigin-RevId: 158161411

---
Commit 7125733d7 authored by William Chargin<wchargin@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Create a set of sample data for the audio plugin

This implements a simple tone generator, with sine waves, square waves,
and triangle waves, plus two simple combinations of sine waves. The step
value is used to control the frequency.
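
As a rough illustration of the tone generator described above, the sketch below produces sine, square, and triangle samples. It is not the plugin's code: the function names, the sample rate, and the mapping from step to frequency (`frequency_for_step`) are assumptions chosen for the example.

```python
import math


def sine_wave(frequency, t):
  return math.sin(2.0 * math.pi * frequency * t)


def square_wave(frequency, t):
  # +1 while the sine is non-negative, -1 otherwise.
  return 1.0 if sine_wave(frequency, t) >= 0.0 else -1.0


def triangle_wave(frequency, t):
  # asin(sin(x)) folds the phase into a triangle with the sine's period.
  return (2.0 / math.pi) * math.asin(sine_wave(frequency, t))


def frequency_for_step(step, base_hz=440.0, hz_per_step=40.0):
  # Hypothetical mapping: a higher step gives a higher pitch.
  return base_hz + hz_per_step * step


def generate(wave, step, sample_rate=8000, num_samples=80):
  """Returns `num_samples` samples of `wave` at the frequency for `step`."""
  frequency = frequency_for_step(step)
  return [wave(frequency, i / float(sample_rate)) for i in range(num_samples)]
```

Calling `generate(sine_wave, step)` for increasing `step` values yields tones of increasing frequency under this assumed mapping.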

PiperOrigin-RevId: 158160889

---
Commit dc81a2420 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Updates to the WALSMatrixFactorization estimator:
- Add a completed_sweeps variable to keep track of sweeps that have been completed during training.
- Add a StopAtSweepHook, which can request a stop after completing a specified number of sweeps.

PiperOrigin-RevId: 158156347

---
Commit 74220616c authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Set device cores and frequency in op_level_cost_estimator_test,
to avoid asan error about assigning inf to int64 (this comes
in from a divide-by-0).

PiperOrigin-RevId: 158155488

---
Commit 47e678564 authored by Yun Peng<pcloudy@google.com>
Committed by gunan<gunan@google.com>:
Fix patching issue on Windows (#10452)

---
Commit 6d54f09d9 authored by Yun Peng<pcloudy@google.com>
Committed by gunan<gunan@google.com>:
Fix linking errors of lmdb on Windows (#10457)

---
Commit 61c8a745b authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Minor cleanup: Add braces around if statement arms; remove redundant "return" and "static".

PiperOrigin-RevId: 158143418

---
Commit e9a889c5e authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Pass int parameter by value, not by const reference

PiperOrigin-RevId: 158142102

---
Commit 9184726ed authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Avoid unnecessary copying of map data during visitation

PiperOrigin-RevId: 158141962

---
Commit 2e7e1d57b authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Small fix for how std::move is used in constructors

PiperOrigin-RevId: 158141564

---
Commit 2a61c1652 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
In cpu compiler's CompileAheadOfTime, pass ordering when compiling entry computation.

PiperOrigin-RevId: 158140349

---
Commit f3f53e8b3 authored by Derek Murray<mrry@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[tf.contrib.data] Add support for dicts and remove lists from nested structures.

This changes the behavior of constructors like
`tf.contrib.data.Dataset.from_tensors()` when passed a list. Previously, the
`nest` utility would recurse into each element of such a list and create a
separate Dataset component. Now the list will be converted to a tensor, allowing code like:

```python
dataset = tf.contrib.data.Dataset.from_tensor_slices(([1, 2, 3], [4, 5, 6]))
```

...to define a dataset with two components (each of shape `()`).

This change also adds support for dictionaries as nested structures, which
simplifies integration with dictionary-returning ops like `tf.parse_example()`.

Fixes #10151.

RELNOTES: Breaking change to `tf.contrib.data.Dataset` APIs that expect a
nested structure. Lists are now converted to tf.Tensor implicitly. You may need
to change uses of lists to tuples in existing code. In addition, dicts are now
supported as a nested structure.
PiperOrigin-RevId: 158139467
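
The nested-structure rule described above (tuples and dicts are recursed into, while a list is treated as a single value) can be modelled in plain Python. This is a sketch, not the `nest` utility itself; `flatten_components` is a hypothetical name, and visiting dict keys in sorted order is an assumption made to keep the output deterministic.

```python
def flatten_components(structure):
  """Flattens tuples and dicts into leaves; a list counts as one leaf."""
  if isinstance(structure, tuple):
    leaves = []
    for item in structure:
      leaves.extend(flatten_components(item))
    return leaves
  if isinstance(structure, dict):
    leaves = []
    for key in sorted(structure):  # Assumed ordering for this sketch.
      leaves.extend(flatten_components(structure[key]))
    return leaves
  # Lists (and everything else) are leaves that would become one tensor.
  return [structure]
```

Under this model, `([1, 2, 3], [4, 5, 6])` has two components, while the bare list `[1, 2, 3]` has one.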

---
Commit b6a8848c1 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Enabling python configuration to use a remotely generated configuration that is located inside of the org_tensorflow repo (previously it *had* to be a remote repo declared in workspace file).

PiperOrigin-RevId: 158138601

---
Commit 0fe0bfcc3 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Remove unused protobuf header inclusions

PiperOrigin-RevId: 158120864

---
Commit f0c4c6c3f authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
In the CUDA path of depthwise_conv2d, add a fast NCHW backward filter convolution for images smaller than 16x16.

PiperOrigin-RevId: 158111294

---
Commit 8dcf37b47 authored by Jon Malmaud<malmaud@gmail.com>
Committed by gunan<gunan@google.com>:
Fix typo (#10379)

---
Commit 3039d7da2 authored by Androbin<robin.richtsfeld@gmail.com>
Committed by gunan<gunan@google.com>:
Remove "bazel clean" (#10318)

Reverting #8880 (see #10236)
unnecessary since bazelbuild/bazel#2759 was merged
---
Commit ae1c16ae8 authored by Yifei Feng<fengyifei2026@gmail.com>
Committed by gunan<gunan@google.com>:
Update docker to cudnn6. (#10307)

* Update docker to cudnn6.

* Update Dockerfile.gpu

* Add --expunge to bazel clean to make cuda_configure run again and update TF_CUDNN_VERSION.

* Remove expunge and set CUDA and CUDNN version default in configure.

* Update configure

* Only set --action_env once

* Update prints for default version.

---
Commit 232e9d86d authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
tf_workspace() claims that the tf_repo_name argument is unused.
temp_workaround_http_archive still requires it.
This change silences the spurious message.

PiperOrigin-RevId: 158089834

---
Commit cc1a02d37 authored by Francois Chollet<fchollet@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add fp16 support to convolutional layers that support it.

PiperOrigin-RevId: 158086284

---
Commit 7d3fbba48 authored by Mustafa Ispir<ispir@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Extracted dnn estimator testing utils to be reused by dnn-linear-combined.
Added tests for dnn part of dnn-linear-combined estimator.

PiperOrigin-RevId: 158084898

---
Commit 9d12c629c authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Refactor and polish the document

PiperOrigin-RevId: 158083952

---
Commit 134138299 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Corrected comment: import_scoped_metagraph does not return a Saver.

PiperOrigin-RevId: 158082288

---
Commit a58553e4d authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add function in shape inference to try to infer output tensor content based on
the input shapes of the op. In some cases (E.g: shape), knowing the shapes of
the input is all that is necessary to infer the content of the output tensor.
This improves shape inference.

PiperOrigin-RevId: 158079306

---
Commit 0cc851c08 authored by Yuefeng Zhou<yuefengz@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Call maxcut algorithm in the model_based_cost_estimator.

PiperOrigin-RevId: 158078511

---
Commit 7d76a90be authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add question marks next to items in the graph legend.

PiperOrigin-RevId: 158076005

---
Commit 68fdb7628 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add DNNLinearCombinedClassifier.

PiperOrigin-RevId: 158075939

---
Commit 3d52e4cb9 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Fix create_meta_graph to respect an empty collection_list.

PiperOrigin-RevId: 158073112

---
Commit 54ccc3e5a authored by Mark Heffernan<meheff@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Add module-scoped HLO dataflow analysis.
This is the first step to replacing TuplePointsToAnalysis with a global, module-scoped analysis. This dataflow analysis identifies all values and their defs and uses in the XLA graph. The analysis is currently unused. Follow up CLs will add buffer alias analysis using this dataflow analysis, and incrementally switch the transformation passes (for example, CopyInsertion) to use these new module-scoped analyses.

PiperOrigin-RevId: 158067910

---
Commit 93c57c6e4 authored by Benoit Steiner<bsteiner@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Handle control flow logic properly:
 * Don't fold enter/exit nodes since that can interact badly with frames
 * Create proper control dependencies on switch nodes

PiperOrigin-RevId: 158066691

---
Commit 9e6899720 authored by Jingyue Wu<jingyue@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[SE] Add cudnnTransformTensor to StreamExecutor.

PiperOrigin-RevId: 158062553

---
Commit 827874c30 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
In the CUDA path of depthwise_conv2d, add a fast NCHW backward input convolution for images smaller than 16x16.

PiperOrigin-RevId: 158061669

---
Commit bee26215c authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Speed up multinomial_op on CPU by using a vectorized Eigen expression and avoiding unnecessary casts.

Benchmark with AVX+FMA enabled:

Run on <redacted> (12 X 3492 MHz CPUs); 2017-06-05T12:54:07.881672447-07:00
CPU: Intel Haswell with HyperThreading (6 cores) dL1:32KB dL2:256KB dL3:15MB
Benchmark                          Base (ns)   New (ns)  Improvement
--------------------------------------------------------------------
BM_Multinomial_cpu_1_10000_4          250817     172953       +31.0%
BM_Multinomial_cpu_1_10000_128        273834     187552       +31.5%
BM_Multinomial_cpu_1_10000_10000     1174175    1130778        +3.7%
BM_Multinomial_cpu_1_100000_4        2040741    1276761       +37.4%
BM_Multinomial_cpu_32_10000_4       10221765    4498666       +56.0%
BM_Multinomial_cpu_32_10000_128     10638159    4994754       +53.0%
BM_Multinomial_cpu_32_100000_4     100790019   44193314       +56.2%
BM_Multinomial_cpu_128_100000_1    431269640  182506078       +57.7%
PiperOrigin-RevId: 158061480

---
Commit 515b3ac67 authored by Justine Tunney<jart@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add Clutz to TensorBoard build

This is so we can get JavaScript protobufs. This CL also improves the
web_aspect and makes some peculiar Closure Compiler errors go away
relating to externs.

PiperOrigin-RevId: 158061198

---
Commit 0df6760fe authored by Benoit Steiner<bsteiner@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Added a test to make sure that graph properties for variables are properly
reported

PiperOrigin-RevId: 158053084

---
Commit 2ccfe8e76 authored by Benoit Steiner<bsteiner@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Added a new method to extract the graph properties from a cost graph without
having to run the model. This will simplify the process of creating regression
tests

PiperOrigin-RevId: 158050327

---
Commit 27f1b80c2 authored by Alexandre Passos<apassos@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Fixes memory leak in py_func when functions return unwrapped strings.

PiperOrigin-RevId: 158046530

---
Commit cf238e1f2 authored by Eugene Brevdo<ebrevdo@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Fix memory leak in python caused by @tf_should_use.

The issue is that python's GC has trouble collecting objects with __del__ methods.

The solution is two pronged:
* Keep track of usage state outside of the class, via a dict mapping
  id(object) => state
* Remove __del__ (this was the source: python's GC couldn't collect wrapped
  objects), and instead use weakref.finalize to emit warnings just as the object
  is being garbage collected.
* Added tests for garbage collection [they were failing before I fixed the issue]

PiperOrigin-RevId: 158042388

---
Commit e6f581863 authored by Bo Wang<david.b.wang@gmail.com>
Committed by Rasmus Munk Larsen<rmlarsen@google.com>:
New reader for LMDB databases (#9950)

* Add LMDBReader op and test case

* Add testcase to load LMDB from a folder

* Add tensorflow/core/lib/lmdb/testdata/data.mdb

* Add EOF test

* Add license export

* Blacklist the test data in pip_smoke_test.py

* Address issues with respect to review

* Add LICENSE to BUILD rules

* Remove the prefix of LICENSE

* Wrap key with compat.as_bytes()

* Fixed a compilation flag

* Improve BUILD rules

* Support LMDB build in cmake

* Fix BUILD file format with buildifier

* Add fake unistd.h for lmdb to build on Windows

* Avoid building lmdb tools which depends on unistd.h

* Fix the string encoding issue in Python3

* Update lmdb library name in CMakeList.txt

---
Commit cc411f938 authored by Yao Zhang<yaozhang@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
When converting the layout of Conv2DBackpropInput, we need to permute one of
its inputs, which is a constant node. We permute a copy of this node, instead of the
original node, because the original node may be used as input to other nodes.
This kind of sharing of const node could arise if the graph is pre-optimized by common
subexpression elimination, which is part of the L1 optimizations in
TensorFlow.

PiperOrigin-RevId: 158037552

---
Commit 88bdb6fca authored by Dandelion Mané<dandelion@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Remove all remaining references to non-public TF modules from TensorBoard.

I deleted the PluginAssetUtil tests because that code is deprecated.
I'll later add manual testing for backcompat in the text plugin.

PiperOrigin-RevId: 158037466

---
Commit 6c531eb2f authored by Francois Chollet<fchollet@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add file hash to Keras Boston Housing dataset to force cache update.

PiperOrigin-RevId: 158036587

---
Commit afdc38cd3 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Remove deprecated resource handle functions in InferenceContext.

PiperOrigin-RevId: 158034419

---
Commit 9f932e6ce authored by Derek Murray<mrry@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Avoid parsing a rendezvous key for Send/Recv ops outside a loop.

For such ops, the rendezvous key will be constant, because
`ctx->frame_iter()` will always evaluate to `{0, 0}`. Benchmarking
reveals that this can save between 1 and 2 microseconds per Send or
Recv op execution. The optimization applies to all cross-process,
inter-device, and intra-device (host-to/from-device memory) Send/Recv
ops.
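
The optimization described above amounts to parsing the key once when the op cannot be inside a loop. The sketch below models that idea in plain Python; the key layout, `SendLikeOp`, and the `PARSE_CALLS` counter are simplifications invented for illustration and do not reflect the actual C++ kernels.

```python
PARSE_CALLS = [0]  # Counts how often the (pretend-expensive) parse runs.


def make_key(prefix, frame_iter):
  # Simplified key layout: "<prefix>;<frame_id>:<iter_id>".
  return "%s;%d:%d" % (prefix, frame_iter[0], frame_iter[1])


def parse_key(key):
  PARSE_CALLS[0] += 1
  prefix, _, frame_part = key.rpartition(";")
  frame_id, iter_id = frame_part.split(":")
  return (prefix, int(frame_id), int(iter_id))


class SendLikeOp(object):

  def __init__(self, prefix, in_loop):
    self._prefix = prefix
    # Outside a loop the frame/iteration pair is always (0, 0), so the key
    # is constant and can be parsed once at construction time.
    self._cached = None if in_loop else parse_key(make_key(prefix, (0, 0)))

  def parsed_key(self, frame_iter):
    if self._cached is not None:
      return self._cached
    # Inside a loop the key changes per iteration and must be rebuilt.
    return parse_key(make_key(self._prefix, frame_iter))
```

In this model, an op created with `in_loop=False` pays the parsing cost once regardless of how many times it executes.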

PiperOrigin-RevId: 158032522

---
Commit cc2dd4ac8 authored by Shanqing Cai<cais@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
tfdbg: dump debug data from different devices in separate directories

Fixes: #7051
wherein TFDBG failed to load the data dump from a Session.run() involving multiple GPUs.

The root cause of the bug was that TFDBG previously assumed that node names are unique across all partition graphs. This is however not the case when multiple GPUs exist. The Send/Recv nodes in the partition graphs of the GPUs can have duplicate names. There will potentially be other cases like this in the future due to other reasons (e.g., distributed sessions and/or graph optimization).

This CL relaxes this assumption, by dumping the GraphDef and tensor data from different devices into different sub-directories under the dump root directory.

PiperOrigin-RevId: 158029814

---
Commit a5909d643 authored by Toby Boyd<tobyboyd@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Fixed triggering create device multiple times

PiperOrigin-RevId: 158025196

---
Commit 504a307b7 authored by Martin Wicke<wicke@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Make sure that Adam colocates ops with a consistent variable across workers.

PiperOrigin-RevId: 158022292

---
Commit 69ba4d3d4 authored by Asim Shankar<ashankar@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Fix #10371

cpuinfo.get_cpu_info() doesn't seem to include the l2_cache_size key on some
architectures.

PiperOrigin-RevId: 158021008

---
Commit a51a9846c authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Performance-related tweaks: Don't copy loop variables; remove ineffective std::move casts.

PiperOrigin-RevId: 158017670

---
Commit 009789f74 authored by Peter Hawkins<phawkins@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Allow 0-sized slices in DynamicSlice and DynamicUpdateSlice; add tests.

PiperOrigin-RevId: 158015870

---
Commit 48a4853eb authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Miscellaneous cleanups

PiperOrigin-RevId: 158012131

---
Commit 379ddde24 authored by Chris Song<sjhshy@gmail.com>
Committed by Chris Song<sjhshy@gmail.com>:
Fix misspells.

---
Commit a0a76da97 authored by Lakshay Garg<lakshay.garg.1996@gmail.com>
Committed by Lakshay Garg<lakshay.garg.1996@gmail.com>:
Fixed typo in code

---
Commit 7ffc35732 authored by Eugene Brevdo<ebrevdo@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add support for bools in matrix_diag, matrix_diag_part, matrix_set_diag, matrix_band_part.

PiperOrigin-RevId: 157939272

---
Commit edf3d5dbe authored by Darren Garvey<darren.garvey@gmail.com>
Committed by Darren Garvey<darren.garvey@gmail.com>:
configure: Fix default path when enabling MPI.

Correctly show the default path when MPI is installed.

---
Commit aad2e3daf authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
In the CUDA path of depthwise_conv2d, add a fast NCHW forward convolution for images smaller than 16x16.

PiperOrigin-RevId: 157915637

---
Commit 5cf08d9cb authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Drop blockDim.y for the equivalent in_cols, and slightly improve naming (use 'pixels' instead of 'size' for height*width numbers).

PiperOrigin-RevId: 157906773

---
Commit 563f05ff6 authored by Eugene Brevdo<ebrevdo@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[tf contrib seq2seq] Expand tile_batch to handle nested structures.

This allows it to properly tile the initial wrapper state when using
BeamSearchDecoder with AttentionWrapper.  Unit tests updated to show this use.

PiperOrigin-RevId: 157903115

---
Commit 1234e2dda authored by Justine Tunney<jart@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Fix Plottable definition

On Mac OS the build directory in the Node package conflicts with BUILD.

PiperOrigin-RevId: 157899970

---
Commit bb7a8d8e7 authored by Benoit Steiner<bsteiner@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Don't use the _output_shape attribute in the op_level_cost_estimator since
there is no guarantee that it will be present or accurate.

PiperOrigin-RevId: 157898989

---
Commit 6f4204c3d authored by Justine Tunney<jart@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Fix TensorBoard SHA256 in cmake

PiperOrigin-RevId: 157897958

---
Commit c9d2f432b authored by Justine Tunney<jart@google.com>
Committed by Justine Tunney<jart@google.com>:
Fix TensorBoard SHA256 in cmake

---
Commit 1c70fb686 authored by Jianwei Xie<xiejw@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add training test for multi classes (n>2) linear classifier.

PiperOrigin-RevId: 157896002

---
Commit 675d36be0 authored by Yao Zhang<yaozhang@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add fused batch norm to tf.layers.

PiperOrigin-RevId: 157893874

---
Commit f37d0ea47 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Internal change -- first draft docs

PiperOrigin-RevId: 157891937

---
Commit 9b8f6113b authored by Zongheng Yang<zongheng@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
tensor_bundle: fix that the read path forgets to cache file handles.

In a case where a reader is geographically far from the file, this change
achieves a speedup of end-to-end checkpoint restore by 5.8x.

PiperOrigin-RevId: 157889659

---
Commit 0c92dada6 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Use inplace Cholesky factorization and solves to speed up and reduce memory usage in matrix_solve_ls.
Check success before copying outputs in cholesky_op.

PiperOrigin-RevId: 157887564

---
Commit a4caeb2ea authored by William Chargin<wchargin@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Extract the graphs dashboard to a plugin

This completes the great plugin migration!

The graphs plugin is somewhat different from the plugins considered so
far. First, it exposes two kinds of data: graph data and run metadata.
We elect to put both sources of data under the domain of the graphs
plugin for now, because it's not clear that the run metadata would be
useful for anything else. Second, the graph data really has no use for
"tags": a run either has an associated graph or it does not. Thus, we
expose an endpoint /data/plugin/graphs/runs that is different in format
from the /tags routes exposed by other plugins (it returns just a list
instead of a run-to-tag mapping).

This change removes a bunch of tests from application_test.py. The tests
cover the compression behavior of the graph endpoint, but the graph
endpoint doesn't have any special logic in the way of compression. Thus,
the tests are, apparently, testing that werkzeug (or whatever is
relevant here) provides good compression defaults. This isn't
necessarily a bad idea, but it shouldn't be coupled to the graph tests.

To get test data that includes run metadata, you can run this script:

    https://raw.githubusercontent.com/tensorflow/tensorflow/326942394e69074d50d5889218a24c9371eff259/tensorflow/examples/tutorials/mnist/mnist_with_summaries.py

PiperOrigin-RevId: 157884714

---
Commit 05a6a13f7 authored by Gunhan Gulsoy<gunan@google.com>
Committed by gunan<gunan@google.com>:
Make sure all writer caches are closed before deleting directories in dnn_test.

---
Commit d0e761f8d authored by Gunhan Gulsoy<gunan@google.com>
Committed by gunan<gunan@google.com>:
Disable another test that uses matrix_set_diag on windows.

---
Commit 8939b8562 authored by Derek Murray<mrry@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[tf.contrib.data] Re-implement IteratorGetNext as an AsyncOpKernel.

This prevents the op from consuming an inter-op thread pool thread
when blocked, and fixes a potential deadlock when many IteratorGetNext
ops are blocked. Fixes #10369.

PiperOrigin-RevId: 157878885

---
Commit 9e25c68ad authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add loss_only_head to hold additional loss terms for multi_head setup

PiperOrigin-RevId: 157875934

---
Commit 7cdcd0cca authored by Benoit Steiner<bsteiner@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Filter more op types that don't benefit from constant folding.

PiperOrigin-RevId: 157875168

---
Commit 366990d92 authored by Kay Zhu<kayzhu@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Fix a subtle issue in copy_insertion due to the interaction between copy
overriding logic and RecordIndicesToColocatingBuffers:

- When building instructions ShapeTree to be copy overridden, it is possible
that we create a single kCopy for two identical instructions. An example can
be:

    %tuple.19 = tuple(%constant.4, %constant.1793, %constant.1793)

where it is used in a while.init operand, and constant.1793 is read-only within
the loop and also used by another while loop. The copy overriding pass will then
create the following (logical, not finalized) tuple:

    %tuple.19 = tuple(%constant.4, %copy.5, %copy.5)

- In the subsequent pass RecordAmbiguousOrNonDistinctIndices, to add copies to
ensure point_to set is distinct, the duplicate %copy.5 are ignored because they
are not yet finalized, and these indices (1 and 2 in the example) are still
marked as to-be copied.

Therefore distinctiveness is lost.

This fix applies to the override building stage, to explicitly avoid creating
shared copies for non-distinct buffers.

PiperOrigin-RevId: 157872231

---
Commit f4b8d21b8 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Change function parameters to references to avoid copying, or otherwise move from function parameters when moving reduces the amount of copying.

PiperOrigin-RevId: 157867333

---
Commit 3eee61caa authored by Drew Hintz<pushespretn@gmail.com>
Committed by GitHub<noreply@github.com>:
fix quotes in example code from ? to "
---
Commit 4905c0eae authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Remove TODO - the new tolerance is okay to keep.

PiperOrigin-RevId: 157861020

---
Commit 55f6b6ff1 authored by David Soergel<soergel@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add explicit SparseTensor support to SignatureDef.

PiperOrigin-RevId: 157860466

---
Commit 79099d677 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Removes default thresholds from BinaryLogisticHead and adds predict and evaluate tests for DNNClassifier.

PiperOrigin-RevId: 157856471

---
Commit 54595f0f3 authored by Jianwei Xie<xiejw@google.com>
Comm…