
Some work with CUDNN #2797

Open. Wants to merge 27 commits into master.
Conversation

danpovey (Contributor) commented:

Still preliminary; not working yet.

galv and others added 17 commits September 15, 2018 16:26
Currently, this assumes that cudnn is installed in
/usr/local/cuda/lib64 and /usr/local/cuda/include, because that is
where it is installed on my current machine. This can be cleaned up later.
Double check that we correctly transpose height and width everywhere!
b17 disk is no longer writable.
No longer publicly expose the default constructor.

That could allow someone to do something like this:

CuDevice device1;
CuDevice device2;

We're not sure what this would cause, but it probably wouldn't be good.
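
(A minimal sketch of the pattern this commit message describes, assuming CuDevice is intended as a process-wide singleton; the Instantiate() accessor follows Kaldi's existing cu-device.h, but the details here are illustrative, not the PR's actual code:)

// cu-device.h (sketch): the default constructor is private, so user code
// cannot create a second CuDevice; all access goes through one object.
class CuDevice {
 public:
  // Returns the single process-wide device object.
  static CuDevice &Instantiate() {
    static CuDevice device;  // constructed once, on first use
    return device;
  }
 private:
  CuDevice() { }                        // no public default constructor
  CuDevice(const CuDevice &) = delete;  // no copying either
  CuDevice &operator=(const CuDevice &) = delete;
};

// With this, "CuDevice device1; CuDevice device2;" no longer compiles;
// callers write CuDevice::Instantiate() instead.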
Document data formats expected of each member function.
There may be remaining edge cases in the configure script related to this.

We still don't look up the version of CUDNN to download for your CUDA
version.

Also make sure that non-CUDA builds still compile. This is achieved by
the cu-cudnn-helper.h file, which forward declares cudnn types, which
fortunately are all pointer types (except for enums, which don't matter
in this case).
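
(As a rough illustration of that trick; the struct names below follow how cudnn.h defines its opaque handle typedefs, though the exact set of types forward-declared in cu-cudnn-helper.h is an assumption:)

// cu-cudnn-helper.h (sketch): lets non-CUDA builds compile code that merely
// mentions cuDNN handle types, without including cudnn.h.
#if HAVE_CUDA == 1
#include <cudnn.h>
#else
// cuDNN's descriptor/handle types are typedefs for pointers to opaque
// structs, so forward declarations suffice as long as non-CUDA code only
// passes them around and never dereferences them.
struct cudnnContext;
typedef struct cudnnContext *cudnnHandle_t;
struct cudnnTensorStruct;
typedef struct cudnnTensorStruct *cudnnTensorDescriptor_t;
struct cudnnFilterStruct;
typedef struct cudnnFilterStruct *cudnnFilterDescriptor_t;
struct cudnnConvolutionStruct;
typedef struct cudnnConvolutionStruct *cudnnConvolutionDescriptor_t;
#endif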
Remove support for CUDA versions that don't support CUDNN v7.

Remove 32-bit CUDA mode, since no recent version of CUDA supports it. I
may not have removed all instances of it.
Automatically download CUDNN as part of configure.
Convert assertion to warning, although I am not sure this works, since
if an algorithm fails to be found because of an out-of-memory error, it
is likely to fail at training time for the same reason.
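
(Approximately the kind of change described, sketched against cuDNN v7's cudnnFindConvolutionForwardAlgorithm and Kaldi's KALDI_WARN macro; the descriptor and handle arguments are placeholders, not the PR's actual code:)

#include <cudnn.h>
#include "base/kaldi-error.h"

// Sketch: pick a forward algorithm, but only warn (rather than assert) if
// none can be found, e.g. because of an out-of-memory condition.
cudnnConvolutionFwdAlgo_t ChooseForwardAlgo(
    cudnnHandle_t handle,
    cudnnTensorDescriptor_t input_desc, cudnnFilterDescriptor_t filter_desc,
    cudnnConvolutionDescriptor_t conv_desc,
    cudnnTensorDescriptor_t output_desc) {
  cudnnConvolutionFwdAlgoPerf_t perf = {};
  int returned_count = 0;
  cudnnStatus_t status = cudnnFindConvolutionForwardAlgorithm(
      handle, input_desc, filter_desc, conv_desc, output_desc,
      1 /* requestedAlgoCount */, &returned_count, &perf);
  // Formerly an assertion; now only a warning, since an out-of-memory
  // failure here is likely to recur at training time anyway.
  if (status != CUDNN_STATUS_SUCCESS || returned_count == 0 ||
      perf.status != CUDNN_STATUS_SUCCESS)
    KALDI_WARN << "cuDNN could not find a usable convolution algorithm; "
               << "this may be caused by running out of GPU memory.";
  return perf.algo;
}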
galv (Contributor) commented Oct 23, 2018:

Hi Dan, check out danpovey#53 to see some results of my poking this.

galv and others added 6 commits October 23, 2018 23:49
Use cudnnSetTensor4dDescriptor with strides we calculate ourselves
instead.
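
(For reference, the cuDNN call that accepts caller-computed strides is cudnnSetTensor4dDescriptorEx, presumably what this commit uses; a minimal sketch, with error checking omitted:)

#include <cudnn.h>

// Sketch: build a descriptor for an n x c x h x w tensor with strides we
// compute ourselves (dense NCHW here; a padded layout, e.g. a Kaldi
// CuMatrix whose Stride() exceeds NumCols(), would adjust these values).
cudnnTensorDescriptor_t MakeNchwTensorDescriptor(int n, int c, int h, int w) {
  int w_stride = 1;
  int h_stride = w * w_stride;
  int c_stride = h * h_stride;
  int n_stride = c * c_stride;
  cudnnTensorDescriptor_t desc;
  cudnnCreateTensorDescriptor(&desc);  // error checking omitted
  cudnnSetTensor4dDescriptorEx(desc, CUDNN_DATA_FLOAT, n, c, h, w,
                               n_stride, c_stride, h_stride, w_stride);
  return desc;
}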
Change filter type back to NCHW, since it supports more algorithms;
the NHWC filter layout supports only the precomputed implicit GEMM
implementation of convolution.
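
(A sketch of the descriptor call this refers to, using cuDNN v7's cudnnSetFilter4dDescriptor; parameter names are illustrative:)

#include <cudnn.h>

// Sketch: an NCHW filter descriptor; with NHWC, cuDNN v7 restricts the
// choice of forward algorithm (essentially to
// CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM).
cudnnFilterDescriptor_t MakeNchwFilterDescriptor(int num_filters_out,
                                                 int num_filters_in,
                                                 int filter_height,
                                                 int filter_width) {
  cudnnFilterDescriptor_t desc;
  cudnnCreateFilterDescriptor(&desc);  // error checking omitted
  cudnnSetFilter4dDescriptor(desc, CUDNN_DATA_FLOAT, CUDNN_TENSOR_NCHW,
                             num_filters_out, num_filters_in,
                             filter_height, filter_width);
  return desc;
}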
The ConvolutionComputation class does not know whether or not the bias
is optional at initialization time. It will simply avoid using the bias
if it is a nullptr.
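
(A minimal sketch of that convention; the handle and descriptor arguments are placeholders, and cudnnAddTensor is one way the bias would typically be applied after the convolution:)

#include <cudnn.h>

// Sketch of the "bias is optional" convention: a null bias pointer means
// "no bias", so nothing extra has to be configured at initialization time.
void AddBiasIfPresent(cudnnHandle_t handle,
                      cudnnTensorDescriptor_t bias_desc, const float *bias,
                      cudnnTensorDescriptor_t output_desc, float *output) {
  if (bias == nullptr) return;             // bias-free layer: silently skip
  const float alpha = 1.0f, beta = 1.0f;   // output += 1.0 * broadcast(bias)
  cudnnAddTensor(handle, &alpha, bias_desc, bias, &beta, output_desc, output);
}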


void ConvolutionComputation::
ConvolveForward(const MatrixBase<BaseFloat> &input,
Contributor comment:
Is there some reason why we don't make this class own a "ConvolutionComputation" object, and have it defer to the functions like ConvolveForward defined in convolution.h?
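
(Roughly what that suggestion would amount to, as a purely hypothetical sketch; the class, member, and method names below are invented for illustration, and the ConvolveForward signature is assumed:)

// The cuDNN-backed component would own a ConvolutionComputation and defer
// the actual work to it, rather than re-implementing the convolution logic.
class CudnnConvolutionComponent {
 public:
  void Propagate(const CuMatrixBase<BaseFloat> &in,
                 CuMatrixBase<BaseFloat> *out) {
    computation_.ConvolveForward(in, params_, out);  // defer the real work
  }
 private:
  ConvolutionComputation computation_;  // configured once, at init time
  CuMatrix<BaseFloat> params_;          // filter parameters
};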

danpovey (Contributor, Author) commented Oct 27, 2018 via email.
galv (Contributor) commented Oct 27, 2018 via email.

stale bot commented Jun 19, 2020:

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

The stale bot added the "stale (Stale bot on the loose)" label on Jun 19, 2020.
stale bot commented Jul 19, 2020:

This issue has been automatically closed by a bot strictly because of inactivity. This does not mean that we think that this issue is not important! If you believe it has been closed hastily, add a comment to the issue and mention @kkm000, and I'll gladly reopen it.

The stale bot closed this on Jul 19, 2020.
kkm000 reopened this on Jul 19, 2020.
The stale bot removed the "stale (Stale bot on the loose)" label on Jul 19, 2020.
stale bot commented Sep 17, 2020:

This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.

The stale bot added the "stale (Stale bot on the loose)" label on Sep 17, 2020.
Labels: stale (Stale bot on the loose)
Projects: none yet
Issues that merging this pull request may close: none yet
3 participants