Upstream changes from internal #2028

Merged: martinwicke merged 54 commits into tensorflow:master on Apr 20, 2016

martinwicke
Member

No description provided.

teamdandelion and others added 30 commits April 15, 2016 14:13
Also update typings in package.json and add a typing dependency on sinon (it's provided by wct)
Change: 119988943
…te the inverse of the adjoint of a matrix. This is convenient in various gradient computations where it also saves an explicit transpose op.

Implements gradients for batch_matrix_determinant and optimizes implementation of gradient for matrix_determinant.
Change: 119990836
try_lock is used by eigen3/unsupported/Eigen/CXX11/src/Tensor/TensorRunQueue.h,
which is disabled by default, so it went unnoticed.

TESTED:
  - passed opensource_build
Change: 119991000
The autoreload logic is factored out into a behavior, and is tested.
Change: 119991467
Auto-reloading defaults on and is controlled by a setting.

To have a place to put that setting, also add a settings panel.

The settings panel uses paper-dialog and is opened by clicking a settings icon button in the global top right.

Minor changes to how the toolbar is laid out by CSS to ensure that the settings all show up in a line and don't bunch up together.

Also, the "TensorBoard" message on the left no longer takes a fixed pixel width (everything is flex), so the toolbar will use space more appropriately when the window size is small.
Change: 119997005
…elected indices had a 0. Making it emulate the old behavior because gather stopped working for length 0 Tensors.

e.g.: Gathering row 1 from a Tensor of shape [2, 0] should result in a [1, 0] Tensor, but instead resulted in an error like:
AttributeError: index 1 is not in [0, 2)
Change: 119998790
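
The expected behavior in the example above can be sketched in plain Python, with nested lists standing in for tensors. The `gather_rows` and `shape_of` helpers are hypothetical and are not the TensorFlow implementation; they only illustrate that selecting a valid row from a shape-[2, 0] value should yield a shape-[1, 0] value rather than an error.

```python
def shape_of(rows):
    """Return [num_rows, num_cols] for a rectangular list of lists."""
    # Assumes at least one row, so the column count is well defined.
    return [len(rows), len(rows[0])]


def gather_rows(params, indices):
    """Select rows of `params` by index (hypothetical helper, not tf.gather)."""
    num_rows = len(params)
    for i in indices:
        # Bounds are checked against the number of rows only; a row with
        # zero columns is still a valid row to select.
        if not 0 <= i < num_rows:
            raise IndexError("index %d is not in [0, %d)" % (i, num_rows))
    return [list(params[i]) for i in indices]


params = [[], []]                 # shape [2, 0]
result = gather_rows(params, [1])
print(shape_of(result))           # [1, 0]
```

The point of the sketch is that the bounds check depends on the row count alone, so an empty inner dimension should not affect whether an index is accepted.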
…locator that has zero memory in its heap.

Change: 120002729
…ession in the supervisor.

Make it innocuous to start a queue runner multiple times.
Change: 120010279
Also clarify that google.any expansion is not supported.
Change: 120013039
Usage: `tf.sparse_add(sp_tensor, tensor)`; CPU only for now.

The newly introduced ScatterNdFunctor can later be extended to handle
ScatterNd{Update,Add,Sub} ops, similar to the existing GatherNd op.
Change: 120018023
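
As a rough illustration of what adding a sparse value to a dense one means, the following plain-Python sketch scatters each stored sparse entry into a copy of the dense operand. The `sparse_add_dense` function, the (indices, values) representation, and the assumption that the result is dense are all illustrative; this is not the TensorFlow kernel or the ScatterNdFunctor.

```python
import copy


def sparse_add_dense(sparse_indices, sparse_values, dense):
    """Add a sparse 2-D value to a dense 2-D list (illustrative only)."""
    # Copy the dense operand so the caller's data is left untouched.
    out = copy.deepcopy(dense)
    for (row, col), value in zip(sparse_indices, sparse_values):
        # Scatter-add: each stored sparse entry is added at its coordinate.
        out[row][col] += value
    return out


dense = [[1.0, 2.0], [3.0, 4.0]]
total = sparse_add_dense([(0, 1), (1, 0)], [10.0, 20.0], dense)
print(total)  # [[1.0, 12.0], [23.0, 4.0]]
```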
This change takes advantage of the fact that a dense gradient tensor
must have the same shape as the corresponding input to the
forward-pass op. It may enable additional optimizations (such as the
ability to use the accumulate_n aggregator) that rely on static shape
information.

This change revealed some broken code, for which fixes are included:

* The shape function for `tf.diag_part()` incorrectly used an equality
  test on shapes, rather than a compatibility test.

* The gradient function for `tf.nn.top_k()` incorrectly returned a
  length-1 vector for the gradient w.r.t. `k`, which should have
  been a scalar.

* A unit test in `control_flow_ops_py_test` inconsistently used
  vectors and scalars.
Change: 120038251
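
The difference between an equality test and a compatibility test on shapes, mentioned in the `tf.diag_part()` fix above, can be sketched as follows. This is a simplified model, assuming shapes are Python lists in which `None` marks an unknown dimension; `shapes_compatible` is a hypothetical helper, and the sketch ignores shapes of unknown rank.

```python
def shapes_compatible(a, b):
    """True if two static shapes could describe the same tensor.

    Shapes are lists of ints, with None marking an unknown dimension.
    """
    if len(a) != len(b):
        return False
    # A dimension pair is compatible if either side is unknown or both match.
    return all(x is None or y is None or x == y for x, y in zip(a, b))


print([None, 3] == [2, 3])                   # False
print(shapes_compatible([None, 3], [2, 3]))  # True
print(shapes_compatible([2, 3], [2, 4]))     # False
```

An equality test rejects a partially known shape even when it could match at run time, which is why a compatibility test is the safer check in a shape function.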
…encies, probably done in pure Python TF, rather than wrapped C++.

Starting by moving lbeta from math_ops to special_math_ops, since a bugfix requires it to depend on control_flow_ops.  The bug was:  lbeta([]) returned -inf, whereas it should have returned 0.0.
Change: 120069192
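
A minimal pure-Python sketch of the corrected behavior, assuming the usual definition log Beta(x) = sum_i lgamma(x_i) - lgamma(sum_i x_i) over a flat list of positive floats. Applied to an empty input, that formula needs lgamma of an empty sum, i.e. lgamma(0), which is a pole of the gamma function, so the empty case is handled explicitly here. This `lbeta` is an illustration only, not the TensorFlow op.

```python
import math


def lbeta(x):
    """log|Beta(x)| for a flat list of positive floats (illustrative sketch)."""
    if not x:
        # Special case from the commit message: the empty input maps to 0.0.
        return 0.0
    # log Beta(x) = sum_i lgamma(x_i) - lgamma(sum_i x_i)
    return sum(math.lgamma(v) for v in x) - math.lgamma(sum(x))


print(lbeta([]))          # 0.0
print(lbeta([1.0, 1.0]))  # 0.0
```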
// OLD
Benchmark                                 Time(ns)    CPU(ns) Iterations
------------------------------------------------------------------------
BM_ConvFloatDepthwiseBkFilterCPU1_conv0  281152179  280588497        100  588.2M items/s 32_112_112_3_8_24_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkFilterCPU1_conv1  760242956  758694909        100  580.1M items/s 32_112_112_64_1_64_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkFilterCPU1_conv2  383554418  382741182        100  574.9M items/s 32_56_56_128_1_128_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkFilterCPU1_conv3   98924384   98665676        100  557.2M items/s 32_56_56_128_1_128_3_3_2_2_cpu1
BM_ConvFloatDepthwiseBkFilterCPU1_conv4   94237506   94005920        100  585.0M items/s 32_28_28_128_1_128_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkFilterCPU1_conv5  106895864  106648144        100  515.7M items/s 32_14_14_512_1_512_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkFilterCPU1_conv6   69247718   69078442        100  398.0M items/s 32_7_7_1024_1_1024_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkFilterCPU1_conv7   70304661   70126053        100  588.1M items/s 32_112_112_3_8_24_3_3_2_2_cpu1
BM_ConvFloatDepthwiseBkFilterCPU1_conv8   67619710   67447142        100  611.4M items/s 32_112_112_3_8_24_3_3_2_1_cpu1

// NEW 1-thread
Benchmark                                 Time(ns)    CPU(ns) Iterations
------------------------------------------------------------------------
BM_ConvFloatDepthwiseBkFilterCPU1_conv0   59981294   59569328        100  2.7G items/s 32_112_112_3_8_24_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkFilterCPU1_conv1  165631344  165250674        100  2.6G items/s 32_112_112_64_1_64_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkFilterCPU1_conv2   76910026   76705735        100  2.8G items/s 32_56_56_128_1_128_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkFilterCPU1_conv3   21491439   21375872        100  2.5G items/s 32_56_56_128_1_128_3_3_2_2_cpu1
BM_ConvFloatDepthwiseBkFilterCPU1_conv4   18677714   18587209        100  2.9G items/s 32_28_28_128_1_128_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkFilterCPU1_conv5   2347423   23377934        100  2.3G items/s 32_14_14_512_1_512_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkFilterCPU1_conv6   17066829   16982791        100  1.6G items/s 32_7_7_1024_1_1024_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkFilterCPU1_conv7   14822571   14744419        100  2.7G items/s 32_112_112_3_8_24_3_3_2_2_cpu1
BM_ConvFloatDepthwiseBkFilterCPU1_conv8   14325480   14254559        100  2.8G items/s 32_112_112_3_8_24_3_3_2_1_cpu1

// NEW 4-threads
Benchmark                                 Time(ns)    CPU(ns) Iterations
------------------------------------------------------------------------
BM_ConvFloatDepthwiseBkFilterCPU4_conv0   21809044   69141049        100  7.4G items/s 32_112_112_3_8_24_3_3_1_2_cpu4
BM_ConvFloatDepthwiseBkFilterCPU4_conv1   57704422  192333505        100  7.5G items/s 32_112_112_64_1_64_3_3_1_2_cpu4
BM_ConvFloatDepthwiseBkFilterCPU4_conv2   29761264   91848609        100  7.2G items/s 32_56_56_128_1_128_3_3_1_2_cpu4
BM_ConvFloatDepthwiseBkFilterCPU4_conv3    9075773   26429821        100  5.9G items/s 32_56_56_128_1_128_3_3_2_2_cpu4
BM_ConvFloatDepthwiseBkFilterCPU4_conv4    7276754   22100190        100  7.4G items/s 32_28_28_128_1_128_3_3_1_2_cpu4
BM_ConvFloatDepthwiseBkFilterCPU4_conv5    6756189   24510067        100  8.0G items/s 32_14_14_512_1_512_3_3_1_2_cpu4
BM_ConvFloatDepthwiseBkFilterCPU4_conv6    4837993   17723279        142  5.6G items/s 32_7_7_1024_1_1024_3_3_1_2_cpu4
BM_ConvFloatDepthwiseBkFilterCPU4_conv7    6676347   19935585        100  6.0G items/s 32_112_112_3_8_24_3_3_2_2_cpu4
BM_ConvFloatDepthwiseBkFilterCPU4_conv8    5951583   17181079        100  6.8G items/s 32_112_112_3_8_24_3_3_2_1_cpu4

TESTED:
  - passed opensource_build
  - passed unit tests
Change: 120125325
Benchmark                               Base (ns)  New (ns)  Improvement
------------------------------------------------------------------------
BM_MaxPool_32_112_112_64_3_3_2_VALID_1   28173747  28956041        -2.8%
BM_MaxPool_32_56_56_192_3_3_2_VALID_1    14467716  14581478        -0.8%
BM_MaxPool_32_28_28_352_3_3_2_VALID_1     5318842   5367336        -0.9%
BM_MaxPool_32_14_14_576_3_3_2_VALID_1     1331917   1351642        -1.5%
BM_MaxPool_32_112_112_64_3_3_2_SAME_1    28757024  29005280        -0.9%
BM_MaxPool_32_56_56_192_3_3_2_SAME_1     15119295  15478783        -2.4%
BM_MaxPool_32_28_28_352_3_3_2_SAME_1      5802450   5871220        -1.2%
BM_MaxPool_32_14_14_576_3_3_2_SAME_1      1632582   1662128        -1.8%
BM_MaxPool_32_112_112_64_3_3_2_VALID_4   28579650   8240771       +71.2%
BM_MaxPool_32_56_56_192_3_3_2_VALID_4    14621344   4373595       +70.1%
BM_MaxPool_32_28_28_352_3_3_2_VALID_4     5404303   1571711       +70.9%
BM_MaxPool_32_14_14_576_3_3_2_VALID_4     1343607    427873       +68.2%
BM_MaxPool_32_112_112_64_3_3_2_SAME_4    29195151   8204002       +71.9%
BM_MaxPool_32_56_56_192_3_3_2_SAME_4     15314088   4642979       +69.7%
BM_MaxPool_32_28_28_352_3_3_2_SAME_4      6094918   1777112       +70.8%
BM_MaxPool_32_14_14_576_3_3_2_SAME_4      1643584    544554       +66.9%

TESTED:
  - passed opensource_build
  - passed unit tests
Change: 120128184
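
The Improvement column in the MaxPool table above is consistent with the relative time saved versus the baseline, (base - new) / base. A short sketch that reproduces two of the rows, with `improvement_percent` as a hypothetical helper name:

```python
def improvement_percent(base_ns, new_ns):
    """Relative time saved versus the baseline, in percent."""
    return 100.0 * (base_ns - new_ns) / base_ns


# Two rows copied from the table above.
print(round(improvement_percent(28173747, 28956041), 1))  # -2.8
print(round(improvement_percent(28579650, 8240771), 1))   # 71.2
```

Under this definition a negative value means the new code is slower, and +71.2% means the new run takes roughly 29% of the baseline time.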
…d a utility to convert a frozen graph into this format.

Change: 120128412
- remove flags (pushed to internal client)
- thread-parallel execution for CountExtremelyRandomStats op.
- critical time-seeding fix.
Change: 120129936
A. Unique TensorFlower and others added 23 commits April 18, 2016 09:42
…s, softmax_cross_entropy_loss, and sum_of_pairwise_squares_loss. Refactoring old losses for consistency.

Change: 120130871
Class represents multi-indexed batches of Dirichlet Multinomial distributions.  Initialized with parameters alpha, which broadcast to arbitrary shapes to match arguments in e.g. dist.pdf(x).
Change: 120138028
…and expanding documentation of get_collection.

Change: 120141411
TESTED:
  - passed opensource_build (only the known-failing loss_ops test fails)
  - passed unit tests
Change: 120156989
Change: 120164016
Fixes a bug in gather that would segfault when gathering
from very large (>2^31 entry) parameter tensors.

Gather can now handle index vectors with more than 2^31 entries
(if you have enough memory).
Change: 120171737
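
The commit message does not state the root cause, but a common source of failures at this size is a signed 32-bit index or counter. The sketch below, which assumes 32-bit indexing purely for illustration, shows how an index past 2^31 - 1 wraps to a negative value when stored as a C int32; `as_int32` is a hypothetical helper.

```python
import ctypes


def as_int32(value):
    """Wrap an integer to signed 32 bits, as a C int32 would store it."""
    return ctypes.c_int32(value).value


num_entries = 2**31 + 5          # more entries than int32 can count
last_index = num_entries - 1

print(as_int32(last_index))      # -2147483644
print(last_index)                # 2147483652
```

A negative offset of that kind would address memory outside the buffer, which is one plausible way to get a segfault; indexing with 64-bit integers avoids the wraparound.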
Combines the 6 entities (sparse, dense) x (features, weights, delta-weights)
into a single class (FeaturesAndWeights) which tracks all features and
weights, hiding their underlying representation.  This class is built by
composing other classes:
  FeaturesAndWeights
    SparseFeaturesAndWeights
      (examples_by_group_)
      WeightsAndDeltas
        WeightsByGroup
        (delta_weights_by_group_)
    DenseFeaturesAndWeights
      (features_by_group_)
      WeightsAndDeltas (same as above)

Also adds a microbenchmark.
Change: 120173207
Change: 120192253
- Oneof values should be printed even when equal to the default.
- Fields should be printed in tag number order, not declaration order.
Change: 120223509
@martinwicke martinwicke self-assigned this Apr 19, 2016
@googlebot

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for the commit author(s). If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.

@martinwicke martinwicke merged commit cc9bfbf into tensorflow:master Apr 20, 2016
fsx950223 pushed a commit to fsx950223/tensorflow that referenced this pull request Nov 28, 2023