Branch 186214551 #17141
Merged
PiperOrigin-RevId: 186018787
PiperOrigin-RevId: 186019263
PiperOrigin-RevId: 186021386
…timization PiperOrigin-RevId: 186021666
This makes it easier for Hlo passes to do interesting rewrites with new, additional parameters which were not operands to the original fusion node. PiperOrigin-RevId: 186024182
Note: unlike many NEON paths in this optimized_ops.h file, which are also enabled on x86 by means of arm_neon_sse.h (#ifdef USE_NEON), this one is only enabled on real NEON (#ifdef GEMMLOWP_NEON). The reason is that gemmlowp's FixedPoint class is templatized on the underlying raw integer/register type, e.g. here int16x8_t, and on SSE there is only a single __m128i type for all integer types (both int16x8_t and int32x4_t), making it non-trivial to support this on SSE without contriving this code on NEON. PiperOrigin-RevId: 186031054
PiperOrigin-RevId: 186032527
…nels: the number of work_elements was too small, which could return a block_count that is too small to cover all elements. We have also been ignoring the suggested thread_per_block, so we were potentially launching more blocks than necessary to fill the GPU (which is inefficient, but functionally correct). Changing 'assert(false && ...' to LOG(FATAL) because it shouldn't be debug only. PiperOrigin-RevId: 186037306
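The coverage problem described in this commit message comes down to ceiling division: the block count times the threads per block must reach every work element. The following is a minimal sketch of that arithmetic, not the actual TensorFlow launch-config code. The names `blocks_to_cover`, `launch_config`, and `max_blocks_to_fill_gpu` are hypothetical, and the cap assumes each thread loops over several elements when fewer blocks are launched than elements require.

```python
def blocks_to_cover(work_elements, threads_per_block):
    """Smallest block count whose threads cover every work element."""
    if work_elements <= 0 or threads_per_block <= 0:
        raise ValueError("work_elements and threads_per_block must be positive")
    # Ceiling division: rounding down would leave trailing elements uncovered.
    return (work_elements + threads_per_block - 1) // threads_per_block


def launch_config(work_elements, threads_per_block, max_blocks_to_fill_gpu):
    """Hypothetical helper: cap the block count at what fills the GPU.

    Assumption: when the cap applies, each thread strides over several
    elements, so every element is still processed.
    """
    needed = blocks_to_cover(work_elements, threads_per_block)
    return min(needed, max_blocks_to_fill_gpu), threads_per_block
```

With 1025 elements and 256 threads per block, floor division would give 4 blocks (1024 threads) and miss the last element; ceiling division gives 5.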
…in replicate_model_fn PiperOrigin-RevId: 186037416
PiperOrigin-RevId: 186038783
SimpleResolver became unused after an LLVM upstream merge, and we never needed the name mangling logic in what is now FindCompiledSymbol. PiperOrigin-RevId: 186039307
PiperOrigin-RevId: 186039949
Was slowing down the creation of _UnreadVariable objects. Adds CheckpointableBase without the __setattr__ override. It's tempting to just override __setattr__ in variables to try making it faster, but it's already just doing an isinstance check. Removing the override entirely seems to be the cleanest option. PiperOrigin-RevId: 186041147
where it needs efficient matrix*vector ("GEMV") code, but it's not exactly the same as the case of stand-alone fully-connected layers as here the output activations are 16bit-quantized. PiperOrigin-RevId: 186044068
…nges. PiperOrigin-RevId: 186045619
PiperOrigin-RevId: 186046129
…mputation. PiperOrigin-RevId: 186047964
PiperOrigin-RevId: 186048665
PiperOrigin-RevId: 186049156
PiperOrigin-RevId: 186050529
…h a single input to multiple hash ids. This column can be then used by one_hot_column to create a multi-hot column. PiperOrigin-RevId: 186050928
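The idea in this commit message, hashing one input to several ids and then one-hot encoding the result into a multi-hot vector, can be sketched in plain Python. This is an illustration only, not the feature-column implementation: the function names, the salted SHA-256 scheme, and the bucket count are all assumptions made for the example.

```python
import hashlib


def hash_ids(value, num_hashes, num_buckets):
    """Map one input string to `num_hashes` bucket ids using salted hashes."""
    ids = []
    for salt in range(num_hashes):
        # Assumption: a different salt per hash function gives independent ids.
        digest = hashlib.sha256(f"{salt}:{value}".encode("utf-8")).digest()
        ids.append(int.from_bytes(digest[:8], "big") % num_buckets)
    return ids


def multi_hot(ids, num_buckets):
    """Set a 1 at every id; colliding ids share a single position."""
    vec = [0] * num_buckets
    for i in ids:
        vec[i] = 1
    return vec


ids = hash_ids("tensorflow", num_hashes=3, num_buckets=16)
vector = multi_hot(ids, num_buckets=16)
```

Because the hashes are deterministic, the same input always yields the same ids, and the number of ones in the vector equals the number of distinct ids.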
PiperOrigin-RevId: 186051752
PiperOrigin-RevId: 186053061
PiperOrigin-RevId: 186053793
…ive runtime conversion, and is a prerequisite to supporting dynamic non-recursive functions. PiperOrigin-RevId: 186053846
…nce for Gather Pretty much everything other than HLO verification and shape inference will fail for Gather with Unimplemented. Note that this CL is intentionally incomplete -- I figured it would be nicer to get some of the boiler-platey stuff out of the way early. Let me know if you want me to send in a larger but more complete CL instead. PiperOrigin-RevId: 186055521
PiperOrigin-RevId: 186055679
This is a prerequisite to moving toward a Saver-like model when graph building. We no longer mess with initializers (when graph building; eager needs it), and restore ops just get queued up and returned. Since initializers are left alone when graph building, there is a new special case for slot variables which needs to be handled. This is the third(!) queue for deferred slot restorations ((1) variable -> slot, (2) optimizer -> slot, (3) (optimizer, variable) -> slot), and should be the last one I need (it's a hypergraph with 3-tuple edges). The plan after this is to switch over to tf.train.Saver's existing restore op creation infrastructure, which will handle any SaveableObjects. There will also be a few CLs for making graph usage prettier, and eventually allowing eager/graph agnostic save/restore. PiperOrigin-RevId: 186059387
In an ideal world this won't make a difference since the compiler should be disciplined about not leaking host-level optimization artifacts into generated code. However, I think this provides some defense-in-depth in preventing non-obvious denormal behavior on the host side from messing up floating point constants etc. we want to embed into generated code. PiperOrigin-RevId: 186061140
PiperOrigin-RevId: 186062850
PiperOrigin-RevId: 186063941
#labeledtensor PiperOrigin-RevId: 186071210
…tion (only forward path). PiperOrigin-RevId: 186071285
PiperOrigin-RevId: 186072673
#labeledtensor PiperOrigin-RevId: 186073035
PiperOrigin-RevId: 186073337
…one instantiation of fixed-point Tanh, for 3 integer bits, regardless of the value of StateIntegerBits PiperOrigin-RevId: 186075161
PiperOrigin-RevId: 186075274
…ompilation. Also ran "buildozer warn //third_party/tensorflow/c/BUILD" and removed an unused symbol. PiperOrigin-RevId: 186081948
PiperOrigin-RevId: 186098155
Replace DCHECK with CHECK so that DoGemmWithAlgorithm is also called in non-debug mode to perform autotune. PiperOrigin-RevId: 186103809
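The reason this change matters is that a debug-only check does not evaluate its argument in non-debug builds, so a call placed inside it silently disappears. The sketch below is a Python analogy for that C++ macro behavior, not TensorFlow code: `DEBUG_BUILD`, `dcheck`, `check`, and the stand-in `do_gemm_with_algorithm` are hypothetical, and the condition is passed as a callable to model an argument that is never evaluated.

```python
DEBUG_BUILD = False  # stands in for a build with debug checks compiled out

calls = []


def do_gemm_with_algorithm():
    """Hypothetical stand-in for a call whose side effect is required."""
    calls.append("gemm")
    return True


def dcheck(condition_fn):
    """Debug-only check: the condition is not evaluated in non-debug builds."""
    if DEBUG_BUILD and not condition_fn():
        raise AssertionError("DCHECK failed")


def check(condition_fn):
    """Always-on check: the condition is evaluated in every build."""
    if not condition_fn():
        raise AssertionError("CHECK failed")


dcheck(do_gemm_with_algorithm)  # skipped here, so the call never happens
check(do_gemm_with_algorithm)   # evaluated, so the call happens once
```

After both lines run, `calls` holds a single entry, contributed by `check` alone.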
Add the input argument (`foo`) to the `tf.slice` example so that it would actually work if run. Previously, the input argument was missing (perhaps implied), but the example is clearer with it included. PiperOrigin-RevId: 186105694
…ative index scatter. PiperOrigin-RevId: 186202761
PiperOrigin-RevId: 186213207
This mirrors the behavior of usual graph construction where a Variable object is added to multiple collections. PiperOrigin-RevId: 186214551
tensorflow/contrib/opt:moving_average_optimizer_test
yifeif approved these changes on Feb 20, 2018
Manually merged, mostly formatting changes.
Conflicts:
RELEASE.md
configure.py
tensorflow/contrib/cmake/external/zlib.cmake
tensorflow/contrib/cmake/python_modules.txt
tensorflow/contrib/cmake/tests/cuda/compatibility_test.c
tensorflow/contrib/cmake/tests/cuda/compatibility_test.cc
tensorflow/contrib/data/python/ops/dataset_ops.py
tensorflow/contrib/gan/python/eval/python/summaries_test.py
tensorflow/contrib/layers/python/layers/layers.py
tensorflow/contrib/layers/python/layers/layers_test.py
tensorflow/contrib/tpu/profiler/pip_package/setup.py
tensorflow/core/public/version.h
tensorflow/docs_src/install/install_c.md
tensorflow/docs_src/install/install_go.md
tensorflow/docs_src/install/install_java.md
tensorflow/docs_src/install/install_linux.md
tensorflow/docs_src/install/install_mac.md
tensorflow/docs_src/install/install_sources.md
tensorflow/examples/image_retraining/retrain.py
tensorflow/python/framework/test_util.py
tensorflow/python/keras/_impl/keras/layers/lstm_test.py
tensorflow/python/layers/utils.py
tensorflow/python/ops/bitwise_ops_test.py
tensorflow/python/ops/distributions/beta.py
tensorflow/python/ops/image_ops_test.py
tensorflow/python/ops/losses/losses_impl.py
tensorflow/tools/pip_package/setup.py