Skip to content

Conversation

@Linchenn
Copy link
Collaborator

@Linchenn Linchenn commented Jul 14, 2020

This PR is the first step for adding kernel timing information to the return value of tf.profile() (ProfileInfo).
This PR does the following:

  • Separate logging logic from the Profiler.profileKernel and let the Profiler.profileKernel return a KernelProfile object, so tf.profile can access the KernelProfile object to populate timeMs and extraInfo from the Profiler.profileKernel.

Any suggestions about the implementation of adding kernel timing information are appreciated. Thanks!

To see the logs from the Cloud Build CI, please join either our discussion or announcement mailing list.


This change is Reviewable

Copy link
Collaborator Author

@Linchenn Linchenn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, Profiler.profileKernel generated BackendTimingInfo and then used it to log.

Because we need BackendTimingInfo from Profiler.profileKernel to populate timeMs and extraInfo in ProfileInfo, we separate the logic of generating BackendTimingInfo and the logging logic in Profiler.profileKernel at first:

  • Profiler.profileKernel returns kernel profiles.
  • Profiler.logKernelProfile logs kernel profiles.

So we need to define a type for kernel profiles that are returned by Profiler.profileKernel, KernelProfile.

Reviewable status: 0 of 1 approvals obtained


tfjs-core/src/engine.ts, line 55 at r1 (raw file):

};

type KernelInfo = {

Keep consistent with MemoryInfo.


tfjs-core/src/engine.ts, line 639 at r1 (raw file):

    if (this.state.profiling) {
      this.state.activeProfile.kernels.push({

Another solution to add kernel timing information is that let Profiler to write ENGINE.state.activeProfile.kernels (type is KernelInfo), so we do not need the two similar types KernelInfo (a part of return value of tf.profile) and KernelProfile (return value of Profiler.profileKernel).

However, to write this KernelInfo, Profiler needs to import Engine to get ENGINE.state.numBytes for populating bytesAdded, causing circular import.

@Linchenn Linchenn marked this pull request as ready for review July 14, 2020 22:23
Copy link
Collaborator

@lina128 lina128 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you Lin!

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @annxingyuan, @lina128, @Linchenn, and @pyu10055)


tfjs-core/src/engine.ts, line 639 at r1 (raw file):

Previously, Linchenn wrote…

Another solution to add kernel timing information is that let Profiler to write ENGINE.state.activeProfile.kernels (type is KernelInfo), so we do not need the two similar types KernelInfo (a part of return value of tf.profile) and KernelProfile (return value of Profiler.profileKernel).

However, to write this KernelInfo, Profiler needs to import Engine to get ENGINE.state.numBytes for populating bytesAdded, causing circular import.

We should avoid circular import. And it is fine to have KernelInfo and KernelProfile, other than name, they seem to contain very different information.


tfjs-core/src/engine.ts, line 635 at r2 (raw file):

    if (this.ENV.getBool('DEBUG')) {
      this.profiler.logKernelProfile(kernelProfiles);

Should this be in the scopedRun callback? Right after kernelProfiles.

Copy link
Collaborator Author

@Linchenn Linchenn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Na!

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @annxingyuan, @lina128, and @pyu10055)


tfjs-core/src/engine.ts, line 639 at r1 (raw file):

Previously, lina128 (Na Li) wrote…

We should avoid circular import. And it is fine to have KernelInfo and KernelProfile, other than name, they seem to contain very different information.

Thanks Na!


tfjs-core/src/engine.ts, line 635 at r2 (raw file):

Previously, lina128 (Na Li) wrote…

Should this be in the scopedRun callback? Right after kernelProfiles.

Done. Thanks!

Copy link
Collaborator

@pyu10055 pyu10055 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @annxingyuan, @lina128, @Linchenn, and @pyu10055)


tfjs-core/src/engine.ts, line 639 at r1 (raw file):

Previously, Linchenn wrote…

Thanks Na!

If you want to consolidate the two types, you can move the type definition to a common file instead of leave it in the Engine file. So both side can import the type from the common file.


tfjs-core/src/profiler.ts, line 24 at r3 (raw file):

import * as util from './util';

export type KernelProfile = {

I am not sure if the output should be an array, since many information seem to be duplicated per output tensor.
maybe something like {kernelName, result, vals: {timeMs, extraInfo}[], inputs}


tfjs-core/src/profiler_test.ts, line 48 at r3 (raw file):

async function promiseCheckWrapper(
    acturalValPromise: Promise<{}>, truthVal: {}) {
  return acturalValPromise.then(acturalVal => {

if you are using async, use await instead of then. Please also fix this for other files.

Copy link
Collaborator Author

@Linchenn Linchenn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @annxingyuan, @lina128, and @pyu10055)


tfjs-core/src/engine.ts, line 639 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

If you want to consolidate the two types, you can move the type definition to a common file instead of leave it in the Engine file. So both side can import the type from the common file.

Thanks Ping for helping me find solutions! I am not very clear why moving types can help us combine the two types into a single one. Any explanations are appreciated.

From my view, the 'two types' sources from: since Profiler.profilerKern cannot return a KernelInfo object (including information related to numBytes and numTensors), we need to define a new type KernelProfile for its return value.


tfjs-core/src/profiler.ts, line 24 at r3 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

I am not sure if the output should be an array, since many information seem to be duplicated per output tensor.
maybe something like {kernelName, result, vals: {timeMs, extraInfo}[], inputs}

Nice suggestion! Only result (the kernel outputs) and vals (the values downloaded from the kernel outputs) should be an array and I have changed it.


tfjs-core/src/profiler_test.ts, line 48 at r3 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

if you are using async, use await instead of then. Please also fix this for other files.

Thanks! I also checked benchmarks. Done.

Copy link
Collaborator Author

@Linchenn Linchenn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @annxingyuan, @lina128, and @pyu10055)


tfjs-core/src/profiler.ts, line 24 at r3 (raw file):

Previously, Linchenn wrote…

Nice suggestion! Only result (the kernel outputs) and vals (the values downloaded from the kernel outputs) should be an array and I have changed it.

We may delete vals here, as it is the downloaded values from outputs (duplicate).
Originally, I added it because I want to keep the keys consistent with the parameters of Logger.logKernelProfile.

@Linchenn Linchenn requested a review from pyu10055 July 16, 2020 02:40
Copy link
Collaborator

@pyu10055 pyu10055 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @lina128, @Linchenn, and @pyu10055)


tfjs-core/src/engine.ts, line 639 at r1 (raw file):

Previously, Linchenn wrote…

Thanks Ping for helping me find solutions! I am not very clear why moving types can help us combine the two types into a single one. Any explanations are appreciated.

From my view, the 'two types' sources from: since Profiler.profilerKern cannot return a KernelInfo object (including information related to numBytes and numTensors), we need to define a new type KernelProfile for its return value.

I am just suggesting ways to avoid cyclic imports, not to suggest merging two types.

Copy link
Collaborator Author

@Linchenn Linchenn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @lina128, and @pyu10055)


tfjs-core/src/engine.ts, line 639 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

I am just suggesting ways to avoid cyclic imports, not to suggest merging two types.

Thanks Ping. This is a nice point!

@Linchenn Linchenn merged commit 9a1c567 into tensorflow:master Aug 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants