Skip to content
This repository has been archived by the owner on Aug 15, 2019. It is now read-only.

Batch matmul #1259

Merged
merged 34 commits into from
Aug 30, 2018
Merged

Batch matmul #1259

merged 34 commits into from
Aug 30, 2018

Conversation

annxingyuan
Copy link
Collaborator

@annxingyuan annxingyuan commented Aug 30, 2018

RE: tensorflow/tfjs#539

This PR changes matmul so that it accepts any two tensors of rank >= 2. This mimics the behavior of Tensorflow in other languages.

The matmul op now passes its inputs to the kernels as 3D tensors, collapsing the batch dimensions together and creating a batch dimension of 1 if necessary. The kernels perform straightforward matmul on each batch element and return a 3D tensor, which is reshaped to its original dimensions before being returned.

Benchmark results

I ran matmul and downloaded the result 200 times for matrices of sizes 1, 100, 400, and 1000 across this branch and master. As we would expect, the results were very similar:

master: 7, 7, 12, 96 (miliseconds per operation)
batch_matmul: 7, 7, 10, 95


For repository owners only:

Please remember to apply all applicable tags to your pull request.
Tags: FEATURE, BREAKING, BUG, PERF, DEV, DOC, SECURITY

For more info see: https://github.com/tensorflow/tfjs/blob/master/DEVELOPMENT.md


This change is Reviewable

@nsthorat
Copy link
Contributor

About to review, but just wanted to show you the travis build. If you look at the bottom of the PR you'll see this (build is failing):

image

If you click on "details" then click on whichever node version is failing (in this case both), you'll see the problem.

You can run the linter locally with yarn lint.

@nsthorat
Copy link
Contributor

Also I think you can set up Sublime with clang-format so that some of these things like line length are done automatically.

The clang-format style we use is "Google", which you can see here: https://github.com/tensorflow/tfjs-core/blob/master/.vscode/settings.json#L24

Copy link
Contributor

@nsthorat nsthorat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan and @dsmilkov)


src/kernels/backend_cpu.ts, line 428 at r2 (raw file):

          }
        }
      }      

remove these trailing spaces

can you set up clang-format with sublime?


src/ops/matmul.ts, line 25 at r2 (raw file):

import {op} from './operation';

function computeBatchDimension_(shape: number[]) {

return type here


src/ops/matmul.ts, line 45 at r2 (raw file):

/** @doc {heading: 'Operations', subheading: 'Matrices'} */
function matMul_<T extends Tensor>(
    a: T|TensorLike, b: T|TensorLike, transposeA = false,

what happens if we pass a vector? we should add a unit test for that


src/ops/matmul_batch_test.ts, line 1 at r2 (raw file):

/**

let's just put these in matmul_test


src/ops/matmul_batch_test.ts, line 3 at r2 (raw file):

/**
 * @license
 * Copyright 2017 Google Inc. All Rights Reserved.

2018 Google LLC.


src/ops/matmul_batch_test.ts, line 24 at r2 (raw file):

describeWithFlags('matmulBatch', ALL_ENVS, () => {
  it('A x B', () => {
    const a = tf.tensor3d([-5, -5, -6, 8, -2, -8, 4, -7, -6, -9, -1, 3, 7, -2, 5, -6, 3, 8, 7, -8, 1, 4, -4, 6, 4, -4, -9, -5, 2, -2], [5, 2, 3]);

can you add a test for a 4D tensor?

Copy link
Contributor

@dsmilkov dsmilkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome progress. Left a few comments, but this is already looking great.

Reviewed 1 of 7 files at r1, 4 of 4 files at r2.
Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan and @dsmilkov)


src/kernels/backend.ts, line 53 at r2 (raw file):

  floatPrecision(): number;

  matMul(a: Tensor3D, b: Tensor3D, transposeA: boolean, transposeB: boolean):

rename this method to batchMatMul (this is internal) so it's explicit that backends should do batched matmul and nodejs should bind to BatchMatMul Kernel in TF C++.


src/kernels/backend_cpu.ts, line 413 at r2 (raw file):

            const kBlock = Math.min(k0 + blockSize, sharedDim);

            for(let b = b0; b < bBlock; b++) {

does the batch need to be blocked? I was expecting a single outermost for loop over batch, and the blocked matmul should just treat batch as a constant.


src/kernels/backend_cpu.ts, line 419 at r2 (raw file):

                  for (let k = k0; k < kBlock; k++) {
                    sum += aValues[b * aBatchStrides + i * aOuterStep + k * aInnerStep] *

wrap to 80 width. See https://github.com/tensorflow/tfjs-core/blob/master/DEVELOPMENT.md#code-editor for pointers to clang-format.


src/kernels/backend_cpu.ts, line 428 at r2 (raw file):

          }
        }
      }      

remove trailing whitespace


src/kernels/webgl/mulmat_gpu.ts, line 57 at r2 (raw file):

          getMatrixB(${bSnippetFromOffset(2, 'i')}),
          getMatrixB(${bSnippetFromOffset(3, 'i')})
        );        

remove trailing whitespace


src/ops/matmul.ts, line 25 at r2 (raw file):

import {op} from './operation';

function computeBatchDimension_(shape: number[]) {

no need for trailing underscore for private methods - we do this for the ops only because we export as matmul = op({matmul_}) and we want to avoid name conflict.


src/ops/matmul.ts, line 26 at r2 (raw file):

function computeBatchDimension_(shape: number[]) {
  return shape.slice(0, -2).reduce((acc, curr) => acc * curr, 1);

reuse util.sizeFromShape(shape) instead of doing reduce.


src/ops/matmul.ts, line 62 at r2 (raw file):

  util.assert(
      $a.rank >= 2 && $b.rank >= 2 && $a.rank === $b.rank,
      `Error in matMul: inputs must have the same rank of at least 2, got ranks ${$a.rank}` +

wrap this to 80 width, otherwise the error message will have a newline character in it.


src/ops/matmul.ts, line 66 at r2 (raw file):

  util.assert(
      batchDimA === batchDimB,

according to the doc in tf.matmul in Python, "all outer dimensions must match", so checking the product is not sufficient. You can use util.arraysEqual() to assert that the outer-most shapes match.


src/ops/matmul.ts, line 67 at r2 (raw file):

  util.assert(
      batchDimA === batchDimB,
      `Error in matMul: batch dimensions (${batchDimA}) and (` + 

This error message is slightly misinformative to the user, because batchDimA is implicitly computed by multiplying the outer dimensions, but the user doesn't know that. Pass the outer shapes in the error message.

Copy link
Contributor

@dsmilkov dsmilkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan and @dsmilkov)


src/ops/matmul_batch_test.ts, line 54 at r2 (raw file):

  });

  it('annyuanA^t x B^t', () => {

left over debugging snippet. FYI, you can just rename it() to fit() and karma will run only that test (no need to do --grep )

Copy link
Collaborator Author

@annxingyuan annxingyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks guys. I think this is ready for another look.

Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan)


src/kernels/backend.ts, line 53 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

rename this method to batchMatMul (this is internal) so it's explicit that backends should do batched matmul and nodejs should bind to BatchMatMul Kernel in TF C++.

Done


src/kernels/backend_cpu.ts, line 413 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

does the batch need to be blocked? I was expecting a single outermost for loop over batch, and the blocked matmul should just treat batch as a constant.

Done


src/kernels/backend_cpu.ts, line 419 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

wrap to 80 width. See https://github.com/tensorflow/tfjs-core/blob/master/DEVELOPMENT.md#code-editor for pointers to clang-format.

Done


src/kernels/backend_cpu.ts, line 428 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

remove these trailing spaces

can you set up clang-format with sublime?

Done


src/kernels/backend_cpu.ts, line 428 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

remove trailing whitespace

Done


src/kernels/webgl/mulmat_gpu.ts, line 57 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

remove trailing whitespace

Done


src/ops/matmul.ts, line 25 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

return type here

Done


src/ops/matmul.ts, line 25 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

no need for trailing underscore for private methods - we do this for the ops only because we export as matmul = op({matmul_}) and we want to avoid name conflict.

Done


src/ops/matmul.ts, line 26 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

reuse util.sizeFromShape(shape) instead of doing reduce.

Done


src/ops/matmul.ts, line 45 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

what happens if we pass a vector? we should add a unit test for that

We should want matmul to throw an error in that case right? That seems to be what tensorflow python does.


src/ops/matmul.ts, line 62 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

wrap this to 80 width, otherwise the error message will have a newline character in it.

Done


src/ops/matmul.ts, line 66 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

according to the doc in tf.matmul in Python, "all outer dimensions must match", so checking the product is not sufficient. You can use util.arraysEqual() to assert that the outer-most shapes match.

Done


src/ops/matmul.ts, line 67 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

This error message is slightly misinformative to the user, because batchDimA is implicitly computed by multiplying the outer dimensions, but the user doesn't know that. Pass the outer shapes in the error message.

Done


src/ops/matmul_batch_test.ts, line 1 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

let's just put these in matmul_test

Done


src/ops/matmul_batch_test.ts, line 3 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

2018 Google LLC.

Done


src/ops/matmul_batch_test.ts, line 24 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

can you add a test for a 4D tensor?

Done


src/ops/matmul_batch_test.ts, line 54 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

left over debugging snippet. FYI, you can just rename it() to fit() and karma will run only that test (no need to do --grep )

Done

Copy link
Contributor

@dsmilkov dsmilkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! LGTM (one tiny comment in cpu matmul)

Reviewed 8 of 8 files at r3.
Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan)


src/kernels/backend_cpu.ts, line 423 at r3 (raw file):

                      bValues[k * bInnerStep + j * bOuterStep + b * bBatch];
                }
                result[b * (leftDim * rightDim) + (i * rightDim + j)] += sum;

precompute leftDim*rightDim outside all the for loops since it's constant. Usually I wouldn't push for this, but in this case, this is the hottest loop and we want to minimize cycles.

Copy link
Collaborator Author

@annxingyuan annxingyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained


src/kernels/backend_cpu.ts, line 423 at r3 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

precompute leftDim*rightDim outside all the for loops since it's constant. Usually I wouldn't push for this, but in this case, this is the hottest loop and we want to minimize cycles.

Done

@annxingyuan annxingyuan merged commit 4915101 into master Aug 30, 2018
@annxingyuan annxingyuan deleted the batch_matmul branch August 30, 2018 23:10
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
3 participants