[core] Modularize Div, DivNoNan. #2946
Conversation
nsthorat left a comment
Reviewed 20 of 21 files at r1, 1 of 1 files at r2.
Reviewable status: complete! 1 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @lina128, @nsthorat, and @tafsiri)
tfjs-core/src/kernel_names.ts, line 26 at r2 (raw file):
export type BinaryInputs = Pick<NamedTensorInfoMap, 'a'|'b'>;
export const Div = 'Div';
what do you think of doing DivInputs = BinaryInputs here? I feel like we should avoid this categorization when possible. Yannick?
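For readers following the thread, here is a minimal sketch of the two shapes under discussion; the DivInputs alias is the proposed alternative, not code from the PR:

```ts
// NamedTensorInfoMap comes from tfjs-core's kernel registry.
import {NamedTensorInfoMap} from './kernel_registry';

// As written in the snippet above: kernels reuse the shared type directly.
export type BinaryInputs = Pick<NamedTensorInfoMap, 'a'|'b'>;

// Proposed per-kernel alias, so that kernel_names.ts alone tells you what to
// import when implementing a kernel:
export type DivInputs = BinaryInputs;
```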
tfjs-core/src/backends/cpu/backend_cpu.ts, line 388 at r2 (raw file):
const sumExp = this.sum(b, axes).reshape(expandedShape);
return b.div(sumExp);
use this.div (a test should actually fail because of this)
tfjs-core/src/backends/cpu/kernels/Div.ts, line 22 at r2 (raw file):
import {createBinaryOp} from '../utils/kernel_utils';
export const div = createBinaryOp((a: number, b: number) => a / b);
shouldn't this be createBinaryKernel?
tfjs-core/src/backends/cpu/utils/kernel_utils.ts, line 26 at r2 (raw file):
import {assertNotComplex} from '../cpu_util';
export const createBinaryKernelConfig =
export function
tfjs-core/src/backends/cpu/utils/kernel_utils.ts, line 50 at r2 (raw file):
});
export const createBinaryOp = (op: (a: number, b: number) => number) =>
export function (createBinaryKernel)
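A sketch of the shape being requested here, using nsthorat's suggested name; the element-wise loop is a hypothetical stand-in for the real body in kernel_utils.ts:

```ts
type BinaryImpl = (aVals: Float32Array, bVals: Float32Array) => Float32Array;

// Requested form: a function declaration rather than `export const ... =>`.
export function createBinaryKernel(
    op: (a: number, b: number) => number): BinaryImpl {
  return (aVals, bVals) => {
    const result = new Float32Array(aVals.length);
    for (let i = 0; i < result.length; ++i) {
      result[i] = op(aVals[i], bVals[i]);
    }
    return result;
  };
}

// Usage, matching the CPU Div kernel quoted above:
export const div = createBinaryKernel((a, b) => a / b);
```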
tfjs-core/src/backends/webgl/backend_webgl.ts, line 1588 at r2 (raw file):
const sumExp = this.sum(b, axes).reshape(expandedShape);
return b.div(sumExp);
this.div
tfjs-core/src/backends/webgl/kernels/Div.ts, line 28 at r2 (raw file):
import {BinaryOpPackedProgram} from '../binaryop_packed_gpu';
export const divImpl =
export function
tfjs-core/src/backends/webgl/kernels/SquaredDifference.ts, line 29 at r2 (raw file):
backendName: 'webgl',
kernelFunc: ({inputs, backend}) => {
  const {a, b} = inputs as BinaryInputs;
I think we should keep these all SquaredDifferenceInputs (the name of the kernel), Yannick?
Wait for @tafsiri :)

const sumExp = this.sum(b, axes).reshape(expandedShape);
- return this.realDivide(b, sumExp) as T;
+ return b.div(sumExp);
Hi Ann, isn't it going back to the public API? I thought we want to avoid this as much as possible, see this paragraph in the modularization design doc: "When writing a kernel within a backend it is quite reasonable to call other kernels. We want to avoid using the public api to do this. Thus if functionality from another kernel is needed, that functionality should be factored into a separate file (with no side effects) and imported and used directly." Or am I missing something?
Discussed offline, this is an intermediate step to avoid circular dependency. LGTM.
tafsiri left a comment
Reviewed 15 of 21 files at r1.
Reviewable status: complete! 1 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @lina128, @nsthorat, and @tafsiri)
tfjs-core/src/kernel_names.ts, line 26 at r2 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
what do you think of doing DivInputs = BinaryInputs here? I feel like we should avoid this categorization when possible. Yannick?
+1 I had the same thought initially. But willing to be convinced otherwise.
One pro is that it allows looking at this file (rather than the op) to quickly tell what one should import when implementing a kernel.
tfjs-core/src/backends/cpu/backend_cpu.ts, line 496 at r1 (raw file):
}
realDivide(a: Tensor, b: Tensor): Tensor {
This is a breaking change for other backends. We shouldn't do this unless you are also modularizing the ops in those backends.
Also, should we register a RealDiv kernel so that the converter can call it directly without needing logic to convert RealDiv to Div?
tfjs-core/src/backends/cpu/utils/kernel_utils.ts, line 50 at r1 (raw file):
});
export const createBinaryOp = (op: (a: number, b: number) => number) =>
Let's rename this to something else since 'op' means something more specific in our context. Maybe createBinaryKernelImpl?
tfjs-core/src/backends/webgl/backend_webgl.ts, line 2478 at r1 (raw file):
runWebGLProgram(
    program: GPGPUProgram, inputs: TensorInfo[], outputDtype: DataType,
    output?: TensorInfo,
We can discuss this in sync, but I wanted to return to the discussion of why runWebGLProgram won't create output Tensors. Other than that, I think we should begin to document things like this (especially since it is optional), so could you add a comment for this function?
tfjs-core/src/backends/webgl/kernels/Div.ts, line 28 at r1 (raw file):
import {BinaryOpPackedProgram} from '../binaryop_packed_gpu'; export const divImpl =
If this will be reused elsewhere, put it in a utils file.
tfjs-core/src/gradients/Div_grad.ts, line 27 at r1 (raw file):
inputsToSave: ['a', 'b'],
gradFunc: (dy: Tensor, saved: Tensor[]) => {
  const [$a, $b] = saved;
this should be [a, b] given the inputsToSave spec above
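A simplified sketch of the fix; the derivative body is a placeholder, not the PR's code:

```ts
import {Tensor} from '@tensorflow/tfjs-core';

// With inputsToSave: ['a', 'b'], `saved` holds the op's original inputs in
// that order, so the plain names [a, b] match the spec.
const divGradFunc = (dy: Tensor, saved: Tensor[]) => {
  const [a, b] = saved;  // previously destructured as [$a, $b]
  const derA = () => dy.div(b.toFloat());
  return {a: derA};  // derivative w.r.t. b elided
};
```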
tfjs-core/src/gradients/Div_grad.ts, line 31 at r1 (raw file):
broadcast_util.assertAndGetBroadcastShape($a.shape, $b.shape);
const derA = () => {
  const res = dy.div($b.toFloat());
We should either use non chained ops or import the chained_op helpers. To ensure this you can remove div from the tensor class and ophandler
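A sketch of the distinction, assuming the top-level cast/div exports from tfjs-core:

```ts
import {cast, div, Tensor} from '@tensorflow/tfjs-core';

// The chained form depends on methods attached to the Tensor class (via the
// op handler); modularized gradient files should call ops directly.
function derA(dy: Tensor, b: Tensor): Tensor {
  // Chained (to avoid):  return dy.div(b.toFloat());
  // Direct op calls:
  return div(dy, cast(b, 'float32'));
}
```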
tfjs-core/src/ops/div.ts, line 97 at r1 (raw file):
const inputs: BinaryInputs = {a: $a, b: $b};
const attrs = {};
const inputsToSave = [$a, $b];
No need to add inputsToSave since this is now on the gradient config.
tfjs-core/src/ops/div.ts, line 98 at r1 (raw file):
const attrs = {};
const inputsToSave = [$a, $b];
const outputsToSave: boolean[] = [];
No need to add outputsToSave since this is now on the gradient config.
tfjs-core/src/ops/div.ts, line 101 at r1 (raw file):
return ENGINE.runKernelFunc(
    forward, inputs as {} as NamedTensorMap, der, Div, attrs,
Just FYI, I'm going to look into confirming/making sure we do not need to pass der for gradients that have been modularized. I suspect this should be true now given a recent PR, but some cleanup work may be needed.
tfjs-core/src/ops/divNoNan.ts, line 65 at r1 (raw file):
[$a, $b] = makeTypesMatch($a, $b);
const divResult = div($a, $b);
I think this should be its own kernel. This would allow the converter to call it directly. Thoughts?
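For context, divNoNan returns 0 wherever the divisor is 0, instead of NaN or Infinity. A composite sketch of those semantics in terms of existing ops (not the PR's implementation):

```ts
import * as tf from '@tensorflow/tfjs-core';

// a / b, except 0 wherever b is 0.
function divNoNanSketch(a: tf.Tensor, b: tf.Tensor): tf.Tensor {
  const quotient = tf.div(a, b);
  const bIsZero = tf.equal(b, tf.zerosLike(b));
  return tf.where(bIsZero, tf.zerosLike(quotient), quotient);
}
```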
Note I retracted the comment about removing realDivide being a breaking change. I mistook which file that was.
annxingyuan left a comment
Reviewable status: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @lina128, @nsthorat, and @tafsiri)
tfjs-core/src/kernel_names.ts, line 26 at r2 (raw file):
Previously, tafsiri (Yannick Assogba) wrote…
+1 I had the same thought initially. But willing to be convinced otherwise.
One pro is that it allows looking at this file (rather than the op) to quickly tell what one should import when implementing a kernel.
Done
tfjs-core/src/backends/cpu/backend_cpu.ts, line 388 at r2 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
use this.div (a test should actually fail because of this)
Done
tfjs-core/src/backends/cpu/kernels/Div.ts, line 22 at r2 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
shouldn't this be createBinaryKernel?
Done
tfjs-core/src/backends/cpu/utils/kernel_utils.ts, line 50 at r1 (raw file):
Previously, tafsiri (Yannick Assogba) wrote…
Let's rename this to something else since 'op' means something more specific in our context. Maybe createBinaryKernelImpl?
Done
tfjs-core/src/backends/cpu/utils/kernel_utils.ts, line 26 at r2 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
export function
Done
tfjs-core/src/backends/cpu/utils/kernel_utils.ts, line 50 at r2 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
export function (createBinaryKernel)
Done
tfjs-core/src/backends/webgl/backend_webgl.ts, line 1588 at r2 (raw file):
Previously, lina128 (Na Li) wrote…
Discussed offline, this is an intermediate step to avoid circular dependency. LGTM.
Done
tfjs-core/src/backends/webgl/backend_webgl.ts, line 1588 at r2 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
this.div
Ideally a kernel that calls other kernels, like softmax, would import the impl function from the intermediate kernels; however, that would introduce circular dependency issues in backend_webgl.ts. Once the softmax kernel is modularized (WIP PR: #2751), that problem will go away. I've added a TODO about this here and in backend_cpu.
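A hypothetical excerpt of that end state; the names and relative paths follow the files discussed above:

```ts
import {TensorInfo} from '@tensorflow/tfjs-core';
import {MathBackendWebGL} from '../backend_webgl';
import {divImpl} from './Div';

// A modularized softmax kernel would import the intermediate kernel's impl
// function directly rather than calling the public div op, avoiding a cycle
// through backend_webgl.ts.
function softmaxFinalStep(
    b: TensorInfo, sumExp: TensorInfo, backend: MathBackendWebGL): TensorInfo {
  return divImpl(b, sumExp, backend);
}
```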
tfjs-core/src/backends/webgl/kernels/Div.ts, line 28 at r1 (raw file):
Previously, tafsiri (Yannick Assogba) wrote…
If this will be reused elsewhere, put it in a utils file.
Now that the kernel files just define a config object and have no side effects, I was thinking the impl functions could live within the same files. What do you think?
tfjs-core/src/backends/webgl/kernels/Div.ts, line 28 at r2 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
export function
Done
tfjs-core/src/backends/webgl/kernels/SquaredDifference.ts, line 29 at r2 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
I think we should keep these all SquaredDifferenceInputs (the name of the kernel), Yannick?
Done
tfjs-core/src/gradients/Div_grad.ts, line 27 at r1 (raw file):
Previously, tafsiri (Yannick Assogba) wrote…
this should be [a, b] given the inputsToSave spec above
Done
tfjs-core/src/gradients/Div_grad.ts, line 31 at r1 (raw file):
Previously, tafsiri (Yannick Assogba) wrote…
We should either use non chained ops or import the chained_op helpers. To ensure this you can remove div from the tensor class and ophandler
Done
tfjs-core/src/ops/div.ts, line 97 at r1 (raw file):
Previously, tafsiri (Yannick Assogba) wrote…
No need to add inputsToSave since this is now on the gradient config.
Done
tfjs-core/src/ops/div.ts, line 98 at r1 (raw file):
Previously, tafsiri (Yannick Assogba) wrote…
No need to add outputsToSave since this is now on the gradient config.
Done
tfjs-core/src/ops/divNoNan.ts, line 65 at r1 (raw file):
Previously, tafsiri (Yannick Assogba) wrote…
I think this should be its own kernel. This would allow the converter to call it directly. Thoughts?
Sure, that makes sense. Could this be for a follow-up PR? I can add it to the kernels list in our tracker.
dsmilkov left a comment
A couple of small comments. LGTM otherwise!
Reviewed 7 of 21 files at r1, 4 of 10 files at r3, 8 of 9 files at r4.
Reviewable status: complete! 3 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @lina128, @nsthorat, and @tafsiri)
tfjs-core/src/backends/cpu/backend_cpu.ts, line 391 at r4 (raw file):
// TODO(annxingyuan): Call divImpl rather than op as part of softmax kernel
// modularization.
return div(b, sumExp);
Why not import div from cpu/kernels/Div (which is divImpl) instead of ../ops/div? Or is this due to a circular dep?
tfjs-core/src/backends/webgl/kernels/Div.ts, line 48 at r4 (raw file):
const out = divImpl(a, b, webglBackend);
return {dataId: out.dataId, shape: out.shape, dtype: out.dtype};
Return divImpl(...) directly, since divImpl gives you a TensorInfo?
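In concrete terms, a sketch of the suggested simplification (import paths are illustrative):

```ts
import {TensorInfo} from '@tensorflow/tfjs-core';
import {MathBackendWebGL} from '../backend_webgl';
import {divImpl} from './Div';

// divImpl already returns a TensorInfo, so the kernel can return its result
// directly instead of re-wrapping dataId/shape/dtype.
function divKernelFunc(
    a: TensorInfo, b: TensorInfo, backend: MathBackendWebGL): TensorInfo {
  // Before: const out = divImpl(a, b, backend);
  //         return {dataId: out.dataId, shape: out.shape, dtype: out.dtype};
  return divImpl(a, b, backend);
}
```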
tfjs-core/src/gradients/Div_grad.ts, line 31 at r1 (raw file):
Previously, annxingyuan (Ann Yuan) wrote…
Done
Can you avoid chaining below as well? (We call .sum, .reshape, .neg, etc. as chained.)
We should un-chain as we modularize.
tfjs-core/src/ops/div.ts, line 69 at r4 (raw file):
const der = (dy: Tensor, saved: Tensor[]) => {
  const [$a, $b] = saved;
  const derA = () => {
remove chaining in this file
tafsiri left a comment
Same here, a few small comments, but LGTM.
Reviewable status: complete! 4 of 1 approvals obtained (waiting on @annxingyuan, @lina128, @nsthorat, and @tafsiri)
tfjs-core/src/backends/webgl/kernels/Div.ts, line 28 at r1 (raw file):
Previously, annxingyuan (Ann Yuan) wrote…
Now that the kernel files just define a config object and have no side effects, I was thinking the impl functions could live within the same files. What do you think?
The problem is when tracing which kernels are required: if a symbol (e.g. divImpl) is imported from this file, we would infer that the div kernel needs to be registered (since Div is imported from kernel_names). Hence the desire to move shared utilities out of a kernel definition.
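A sketch of the layout this constraint implies; the utils path and parameter types are assumptions, not the PR's exact code:

```ts
// kernels/Div.ts — this file necessarily imports the Div kernel name, so
// importing anything from it (e.g. divImpl) would make a bundle tracer
// conclude that the Div kernel must be registered. Shared impls therefore
// move to a side-effect-free utils file.
import {TensorInfo} from '@tensorflow/tfjs-core';
import {Div} from '../../../kernel_names';
import {MathBackendWebGL} from '../backend_webgl';
import {divImpl} from '../utils/binary_utils';  // hypothetical location

export const divConfig = {
  kernelName: Div,
  backendName: 'webgl',
  kernelFunc: ({inputs, backend}:
                   {inputs: {a: TensorInfo, b: TensorInfo},
                    backend: MathBackendWebGL}) =>
      divImpl(inputs.a, inputs.b, backend)
};
```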
tfjs-core/src/ops/div.ts, line 101 at r1 (raw file):
Previously, tafsiri (Yannick Assogba) wrote…
Just FYI, I'm going to look into confirming/making sure we do not need to pass der for gradients that have been modularized. I suspect this should be true now given a recent PR, but some cleanup work may be needed.
Confirmed that you can remove der here since you have registered a modular gradient for this kernel.
tfjs-core/src/ops/divNoNan.ts, line 65 at r1 (raw file):
Previously, annxingyuan (Ann Yuan) wrote…
Sure, that makes sense. Could this be for a follow-up PR? I can add it to the kernels list in our tracker.
Sure.
tfjs-core/src/ops/divNoNan.ts, line 1 at r4 (raw file):
/**
the file name should probably be div_no_nan.ts
annxingyuan left a comment
Reviewable status: complete! 4 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @lina128, @nsthorat, and @tafsiri)
tfjs-core/src/backends/cpu/backend_cpu.ts, line 391 at r4 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
Why not import div from cpu/kernels/Div (which is divImpl) instead of ../ops/div? Or is this due to a circular dep?
Yes, this is due to a circular dep. Will import from cpu/kernels/Div as you suggested, as part of modularizing softmax (WIP PR: #2945).
tfjs-core/src/backends/webgl/backend_webgl.ts, line 2478 at r1 (raw file):
Previously, tafsiri (Yannick Assogba) wrote…
We can discuss this in sync, but I wanted to return to the discussion of why runWebGLProgram won't create output Tensors. Other than that, I think we should begin to document things like this (especially since it is optional), so could you add a comment for this function?
Done
tfjs-core/src/backends/webgl/kernels/Div.ts, line 28 at r1 (raw file):
Previously, tafsiri (Yannick Assogba) wrote…
The problem is when tracing which kernels are required: if a symbol (e.g. divImpl) is imported from this file, we would infer that the div kernel needs to be registered (since Div is imported from kernel_names). Hence the desire to move shared utilities out of a kernel definition.
Done
tfjs-core/src/backends/webgl/kernels/Div.ts, line 48 at r4 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
Return divImpl(...) directly, since divImpl gives you a TensorInfo?
Done
tfjs-core/src/gradients/Div_grad.ts, line 31 at r1 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
Can you avoid chaining below as well? (We call .sum, .reshape, .neg, etc. as chained.)
We should un-chain as we modularize.
Done
tfjs-core/src/ops/div.ts, line 101 at r1 (raw file):
Previously, tafsiri (Yannick Assogba) wrote…
Confirmed that you can remove der here since you have registered a modular gradient for this kernel.
Done
tfjs-core/src/ops/div.ts, line 69 at r4 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
remove chaining in this file
Done
tfjs-core/src/ops/divNoNan.ts, line 65 at r1 (raw file):
Previously, tafsiri (Yannick Assogba) wrote…
Sure.
Done
Changes
- `div` op.
- `div` kernels for WebGL / CPU. Note that the WebGL `div` kernel exports an impl function - this will be useful in a follow-up PR for adding `softmax`, which invokes `div` as an intermediate kernel.
- Changed `MathBackendWebGL` so that `runWebGLProgram` takes an optional output `TensorInfo`.
- `divNoNan` op.

To see the logs from the Cloud Build CI, please join either our discussion or announcement mailing list.
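For reference, a minimal sketch of how the modularized CPU pieces fit together; the helper signatures here are assumptions inferred from the snippets quoted in the review, not the PR's exact code:

```ts
import {KernelConfig} from '@tensorflow/tfjs-core';
import {Div} from '../../../kernel_names';
import {createBinaryKernelConfig, createBinaryKernelImpl} from '../utils/kernel_utils';

// A pure numeric impl wrapped by the shared helpers into a registrable
// kernel config (assumed signatures).
export const div = createBinaryKernelImpl((a: number, b: number) => a / b);
export const divConfig: KernelConfig = createBinaryKernelConfig(Div, div);
```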