
Branch 188540944 #17603

Merged

merged 98 commits into tensorflow:master on Mar 10, 2018

Conversation

@akshaym (Contributor) commented Mar 9, 2018

No description provided.

miaout17 and others added 30 commits March 7, 2018 17:46
PiperOrigin-RevId: 188263046
PiperOrigin-RevId: 188263337
…iltins_test, which somehow got left out.

PiperOrigin-RevId: 188264644
This lets you use Dimension objects in numerical computations; e.g., it lets you evaluate expressions like `3 + my_tensor.shape[0]` when executing eagerly.

At the time of writing, without this change,

`matplotlib.pyplot.plot(my_tensor, my_other_tensor)`

fails when executing eagerly, but it works with this change.

PiperOrigin-RevId: 188265500
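
A minimal sketch of the behavior this commit describes, assuming a TF 1.x build of this era where `tf.enable_eager_execution` is available (the tensor shape here is illustrative):

```python
import tensorflow as tf

tf.enable_eager_execution()  # TF 1.x API; eager is the default in TF 2.x

my_tensor = tf.zeros([4, 3])
dim = my_tensor.shape[0]  # a Dimension object, not a plain Python int

# With this change, Dimension participates in ordinary arithmetic
# with Python numbers:
print(3 + dim)   # 7
print(dim * 2)   # 8
```

This mixed arithmetic is what lets libraries such as matplotlib consume tensor shapes without special-casing Dimension.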
PiperOrigin-RevId: 188272354
PiperOrigin-RevId: 188273192
In libc++, std::map and std::multimap call the comparison functor from a const object, which requires `operator()` to be a const method.

PiperOrigin-RevId: 188285407
…ix powers of the approximate Fisher

- Added multi-tower support to multi/RNN fully connected layers.
- All op creation is now done inside functions that explicitly create ops, allowing fine control of their placement. One result is that we no longer need any colocation statements (and these have been removed).
- Multi-tower computations are now handled using the PartitionedTensor class, which appears to be a single tensor to the FisherFactors but actually contains a list of tensors.
- To achieve the above, damping values are passed around as special functions, packaged along with "ids" that can be used to uniquely identify the computation they perform. Topohash might provide a better solution for this in the future.
- Variable creation in the factors is now done via special methods so we can have fine control over where these variables are placed.
- FisherEstimator now has special functions to create ops and variables using different placement strategies (currently: no strategy, round-robin, and as thunks); see the sketch after this list. By default it uses the round-robin strategy and manufactures the usual convenience properties ("inv_update_ops", etc.). This default preserves backwards compatibility, but in the future we should deprecate it and require the user to ask for an explicit strategy.
- LossFunctions no longer make any ops in their constructors; they only make ops when evaluated. LayerCollection maintains a list of tensors/ops with which we can colocate LossFunction computations (typically their inputs).
- LossFunctions no longer support multi-tower/mini-batches directly. Instead, LayerCollection maintains a list of these objects, one for each tower. This solution is better because the loss-function-related computations can now take place exclusively on the corresponding tower.
- All loss functions now support multiple towers/minibatches (via LayerCollection).
- tf.gradients is passed a list of loss function values instead of their sum, which prevents extraneous gradient ops from being placed on arbitrary devices. Hopefully, with this change and the one above for loss functions, all ops associated with gradient computations (for computing stats) will occur completely on the device that defines that part of the graph; e.g., this will do the right thing for multiple towers.
- I've also made sure that sensible colocation occurs for the extra ops needed by the curvature_propagation and exact estimation modes.
- Variables and ops made by FisherEstimator are now placed inside name scopes (based on the name given to FisherEstimator).
- Restored the old variable use count tracker implementation, fixing the issue with how generic registrations were handled by check_registration().
- Restored the interface to FisherEstimator (which was changed in the previous CL).
- Fixed a bug in LazyKFacOptimizer: optional/named arguments weren't being passed in properly.
- Lots of other minor refactors/improvements.

PiperOrigin-RevId: 188310846
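
The round-robin placement strategy mentioned in the list above could look roughly like the following. This is a hypothetical sketch, not the actual KFAC code; `thunks` and `devices` are illustrative names:

```python
import itertools

import tensorflow as tf


def round_robin_place(thunks, devices):
    """Run each op-creating thunk under the next device in a cycle.

    Thunks are zero-argument functions that create ops when called, so
    placement is controlled entirely by the tf.device scope they run in.
    """
    created = []
    for thunk, device in zip(thunks, itertools.cycle(devices)):
        with tf.device(device):
            created.append(thunk())
    return created
```

Because op creation is deferred into thunks, the caller (here, something like FisherEstimator) decides placement at call time rather than relying on colocation constraints.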
…d repeated staging.

PiperOrigin-RevId: 188316160
…sponding script.

PiperOrigin-RevId: 188324090
This requires adding a special case to SourceIndexOfBitcast if the bitcast is a
reshape.

PiperOrigin-RevId: 188324197
PiperOrigin-RevId: 188327338
…s logic of launching an XLA computation.

Also changes the resource variable container from a std::vector<OptionalTensor>
to a std::map<int, OptionalTensor> in preparation for backends where the
resource variables aren't ordered densely at the end of the argument list.

PiperOrigin-RevId: 188335574
PiperOrigin-RevId: 188339438
In the common case of clean termination, we can avoid performing several atomic
operations and allocations.

PiperOrigin-RevId: 188339594
…orward compatibility around TF op attributes.

PiperOrigin-RevId: 188359164
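
Assuming this truncated message refers to TensorFlow's forward-compatibility horizon utilities (added around this time), the usage pattern is to gate a new op attribute behind a date horizon; the op names and date below are hypothetical:

```python
from tensorflow.python.compat import compat


def emit_op(x):
    # Only generate the new attribute once consumers built before the
    # horizon date can be assumed to understand it.
    if compat.forward_compatible(2018, 4, 1):  # hypothetical horizon date
        return new_op_with_extra_attr(x)       # hypothetical new path
    return old_op(x)                           # hypothetical fallback
```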
Adds initialization methods to Platform. Some platforms require initialization; platforms that do not require it have trivial implementations of these methods.

PiperOrigin-RevId: 188363315
…n all of the cases that are changed by this CL, a failure indicates a software bug, not a runtime condition that should be handled and continued beyond. Continuing to execute only promotes silently-ignored bugs.

I also removed the useless call which attempts to set the HTTP protocol to HTTP/2, because this call always fails.  I opened b/74351157 to track the possible feature of adding support for HTTP/2.

Also simplified the code around constructing the error string when returning actual Status objects, by moving code into a lambda.

PiperOrigin-RevId: 188363531
I need something like this for my Gather HLO->HLO lowering pass.

PiperOrigin-RevId: 188365102
tensorflower-gardener and others added 20 commits March 9, 2018 10:38
…ency on eigen!)

PiperOrigin-RevId: 188504172
Fusing GTE works, but it's slower than not fusing.  (In some sense, GTE
is *always* fused; it's just that our "implicit fusion" implementation
is faster than our explicit fusion implementation.)

PiperOrigin-RevId: 188509801
PiperOrigin-RevId: 188512706
…h a runner to execute kernels with. In that case, it defaults to using the threadpool provided by the device.

Also makes sure each device has a default threadpool to fall back on.

PiperOrigin-RevId: 188520648
…to third_party/tensorflow/python/training (move WarmStartSettings definition to third_party/tensorflow/python/estimator/estimator.py), and make _warm_start() public under tf.train.warm_start(). WarmStartSettings and VocabInfo are both available under tf.estimator, and VocabInfo is also available under tf.train.

PiperOrigin-RevId: 188522820
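
A minimal sketch of the API surface described above (TF 1.x era); the checkpoint path and variable pattern are hypothetical:

```python
import tensorflow as tf

# Via the Estimator API:
ws = tf.estimator.WarmStartSettings(
    ckpt_to_initialize_from="/tmp/prev_model",  # hypothetical checkpoint
    vars_to_warm_start=".*dense.*")             # hypothetical regex

# Or directly, now that _warm_start() is public as tf.train.warm_start():
# tf.train.warm_start("/tmp/prev_model", vars_to_warm_start=".*dense.*")
```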
PiperOrigin-RevId: 188528771
PiperOrigin-RevId: 188534066
PiperOrigin-RevId: 188540659
@akshaym requested a review from yifeif March 9, 2018 23:00

```
@@ -267,6 +267,7 @@ cuda_py_test(
        "//tensorflow/python/keras",
    ],
    tags = ["no_windows"],  # TODO: needs investigation on Windows
    tags = ["notsan"],
```
Contributor

Needs merge?

Contributor Author
Done.
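
For context, resolving the duplicated `tags` lines above presumably means merging the two lists into one; a hypothetical merged form in Starlark/BUILD syntax (the actual resolution isn't shown on this page, and the rule name and source are placeholders):

```python
cuda_py_test(
    name = "some_test",        # hypothetical name
    srcs = ["some_test.py"],   # hypothetical source
    additional_deps = [
        "//tensorflow/python/keras",
    ],
    tags = [
        "no_windows",  # TODO: needs investigation on Windows
        "notsan",
    ],
)
```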

@akshaym (Contributor Author) commented Mar 9, 2018

I disabled a flaky test.

@akshaym akshaym merged commit 851c289 into tensorflow:master Mar 10, 2018
StanislawAntol pushed a commit to StanislawAntol/tensorflow that referenced this pull request Mar 23, 2018