Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[optimization] Perform vertical FC weights split optimization. #3481

Closed
wants to merge 3 commits into from

Conversation

@rdzhabarov
Copy link
Contributor

commented Aug 30, 2019

Summary:

  • Perform vertical split of FC for better parallelization.
  • Add ElemKind::Int32QTy to the Slice node for Interpreter (this is used in folding of bias through the constant folding pass in the interpreter).

Backends could invoke this pass as part of the backend specific optimizations.

Documentation:
doxygen where applicable

[Optional Fixes #issue]
Addresses optimization part on the #3462

Test Plan:
Added unit test.

@rdzhabarov rdzhabarov changed the title **WIP** [optimization] Perform FC optimization. **WIP** [optimization] Perform vertical FC weights split optimization. Aug 30, 2019

@rdzhabarov rdzhabarov force-pushed the rdzhabarov:optimizationFC branch 3 times, most recently from 2f30e2d to 61f1dea Sep 4, 2019

@rdzhabarov rdzhabarov force-pushed the rdzhabarov:optimizationFC branch from 61f1dea to 18a371c Sep 4, 2019

@rdzhabarov rdzhabarov changed the title **WIP** [optimization] Perform vertical FC weights split optimization. [optimization] Perform vertical FC weights split optimization. Sep 4, 2019

@rdzhabarov

This comment has been minimized.

Copy link
Contributor Author

commented Sep 4, 2019

@facebook-github-bot
Copy link

left a comment

@rdzhabarov has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@jfix71
Copy link
Contributor

left a comment

Looks pretty good to me, just a few comments.

lib/Optimizer/GraphOptimizerPipeline/Pipeline.cpp Outdated Show resolved Hide resolved
tests/unittests/GraphOptzTest.cpp Outdated Show resolved Hide resolved

size_t elemPerChunk = (bias.dims()[0] + numOfChunks - 1) / numOfChunks;
size_t sliceStart = 0;
llvm::SmallVector<NodeValue, numOfChunks> fcs;

This comment has been minimized.

Copy link
@jfix71

jfix71 Sep 4, 2019

Contributor

nit: Could just use a normal NodeValue array of size numOfChunks, since we know that's the exact number of NodeValues we'll have.

This comment has been minimized.

Copy link
@rdzhabarov

rdzhabarov Sep 9, 2019

Author Contributor

standard array was not easy convertible to llvm::ArrayRef which is required in the createConcat call.
Replaced small vector with fixed preallocated std::vector (note, cannot use numOfChunks anymore in the small vector due to it not being constexpr).

lib/Optimizer/GraphOptimizer/GraphOptimizer.cpp Outdated Show resolved Hide resolved
tests/unittests/GraphOptzTest.cpp Outdated Show resolved Hide resolved
@nadavrot

This comment has been minimized.

Copy link
Contributor

commented Sep 4, 2019

I like this PR, but I think that we can improve things with management of complexity (considering support for multiple targetts). Let's split this PR into two sections: the policy and mechanism. Consider having a function that's called 'splitFCs' that accepts some parameters like numOfChunks and minKToSplit instead of the constants. Now you can call this function splitFCs(12, 800) from the code.

The next step would be, how do you decide on the parameters and how you manage the constants for the different backends. I suggest making this a target-specific transformation. The target should decide on the parameters to pass to the transformation function, which needs to be globally accessible in the optimizer.

@jfix71 @rdzhabarov

@rdzhabarov

This comment has been minimized.

Copy link
Contributor Author

commented Sep 4, 2019

Consider having a function that's called 'splitFCs' that accepts some parameters like numOfChunks and minKToSplit instead of the constants. Now you can call this function splitFCs(12, 800) from the code.

That all makes sense and provides better flexibility.

We briefly talked about this in person I agree with this.

...

I'll address comments today/tomorrow.

@facebook-github-bot
Copy link

left a comment

@rdzhabarov has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@rdzhabarov rdzhabarov force-pushed the rdzhabarov:optimizationFC branch 2 times, most recently from ff71ecf to 893a19f Sep 9, 2019

@rdzhabarov rdzhabarov force-pushed the rdzhabarov:optimizationFC branch from 893a19f to eaa57b5 Sep 9, 2019

@facebook-github-bot
Copy link

left a comment

@rdzhabarov has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@jfix71
jfix71 approved these changes Sep 10, 2019
Copy link
Contributor

left a comment

LGTM, few nits. Nice work!

RE: the small vector/standard array/vector thing, I think you should be able to use llvm::makeArrayRef with a standard array? But no big deal.

lib/Optimizer/GraphOptimizer/GraphOptimizer.cpp Outdated Show resolved Hide resolved
lib/Optimizer/GraphOptimizer/GraphOptimizer.cpp Outdated Show resolved Hide resolved
lib/Optimizer/GraphOptimizer/GraphOptimizer.cpp Outdated Show resolved Hide resolved

@rdzhabarov rdzhabarov force-pushed the rdzhabarov:optimizationFC branch from a26f202 to d996a0d Sep 10, 2019

@facebook-github-bot
Copy link

left a comment

@rdzhabarov is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot

This comment has been minimized.

Copy link

commented Sep 11, 2019

@rdzhabarov merged this pull request in e875866.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.