Initialize gradient of uninitialized parameter with default dtype when initializer is callable #5064
This PR follows up #4510.
The gradient of an uninitialized parameter was always initialized with NaN of dtype `numpy.float32` when its initializer was a callable, whereas the data was initialized with the default dtype. This caused a dtype mismatch between the data and the gradient. This PR fixes the gradient to be initialized with the default dtype in that case, so that the data and the gradient share the same dtype.