
Add Cholesky Decompostion #8202

Merged — 21 commits, merged into chainer:master on Nov 20, 2019

Conversation

UmashankarTriforce
Contributor

@UmashankarTriforce UmashankarTriforce commented Sep 30, 2019

This PR aims to add Cholesky decomposition, as per issue #1448.

Route taken to complete this feature:

  • Implement function
  • Write unit test cases
  • Document the function

@UmashankarTriforce UmashankarTriforce changed the title from "[WIP ]Add Cholesky Decompostion" to "[WIP] Add Cholesky Decompostion" on Sep 30, 2019
@toslunar
Member

toslunar commented Oct 1, 2019

Why don't you use numpy.linalg.cholesky?

@UmashankarTriforce
Contributor Author

UmashankarTriforce commented Oct 1, 2019

How is cupy.linalg.cholesky implemented? I'm using cupy.linalg.cholesky for backward computation. I checked the difference between the NumPy and SciPy implementations of cholesky.

scipy.linalg.cholesky is giving you the upper-triangular decomposition by default, whereas np.linalg.cholesky is giving you the lower-triangular version.

https://stackoverflow.com/questions/16699163/what-is-the-difference-between-cholesky-in-numpy-and-scipy
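The convention difference can be seen directly with a small NumPy-only sketch (the SciPy call is only described in a comment to keep the example dependency-free):

```python
import numpy as np

# A small symmetric positive-definite matrix.
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# numpy.linalg.cholesky returns the *lower*-triangular factor L with A = L @ L.T.
L = np.linalg.cholesky(A)
assert np.allclose(L, np.tril(L))   # L is lower triangular
assert np.allclose(L @ L.T, A)      # L @ L.T reconstructs A

# scipy.linalg.cholesky(A) would instead return the *upper*-triangular factor
# (U = L.T) by default; pass lower=True to match NumPy's convention.
```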

@toslunar
Member

toslunar commented Oct 3, 2019

How is cupy.linalg.cholesky implemented?

cupy.linalg.cholesky decomposes a matrix into L @ L.H. The code https://github.com/cupy/cupy/blob/c9c7c4eedcca54e6f53b5dcfc205d4ad94a2b595/cupy/linalg/decomposition.py#L49-L54 has cublas.CUBLAS_FILL_MODE_UPPER, which looks inconsistent, but it isn't, because of the difference between row-major and column-major layouts.

A SciPy-compatible GPU implementation, if one exists, is available in the cupyx.scipy namespace.

I'm using cupy.linalg.cholesky for backward computation.

Inputs to FunctionNode.backward are Variables.

  • Use Chainer functions in backward, or
  • define class CholeskyGrad(FunctionNode) and leave double-backward unimplemented.

@UmashankarTriforce
Contributor Author

UmashankarTriforce commented Oct 7, 2019

Sorry for the late response!
There is no SciPy-compatible GPU implementation for cholesky. I plan to use cupy.linalg.cholesky in forward_gpu, and for backward I plan to take the dot product of L and L.T.conj(). Since chainer.dot is not implemented, I plan to use cupy.dot and wrap the array back into a chainer.Variable. What do you suggest?

Edit: I also found that the results of cupy.linalg.cholesky are consistent with NumPy's implementation, so I will switch to numpy for forward_cpu.

chainer/functions/math/cholesky.py (outdated review thread, resolved)

    def backward(self, indexes, gy):
        a, = self.get_retained_inputs().array
        return cuda.cupy.dot(a, a.T) * gy[0],
Member

Variable-level equivalent code would be

    a, = self.get_retained_inputs()
    return chainer.functions.matmul(a, a, transb=True),

@UmashankarTriforce
Contributor Author

Sorry for the late response! I have made the changes and also added test patterns.
I'm encountering an error when I run the test:
AttributeError: 'Cholesky' object has no attribute 'get_retrained_inputs'

How do I solve this?

@UmashankarTriforce
Contributor Author

When I run the tests, in some cases the gradient computed by backward is absurdly larger than the numeric gradient. For example:

>           raise AssertionError(f.getvalue())
E           chainer.testing.function_link.FunctionTestError: Parameterized test failed.
E           
E           Base test method: TestCholesky_use_chainerx_false__chainerx_device_None__use_cuda_false__cuda_device_None__use_cudnn_never__cudnn_deterministic_false__autotune_false__cudnn_fast_batch_normalization_false__use_ideep_never.test_backward
E           Test parameters:
E             dtype: <class 'numpy.float32'>
E             shape: (2, 2)
E           
E           
E           (caused by)
E           FunctionTestError: backward is not implemented correctly
E           
E           (caused by)
E           AssertionError: check_backward failed (eps=0.001 atol=0.001 rtol=0.001)
E           inputs[0]:
E           [[7670.438  1473.2218]
E            [1473.2218 6982.2075]]
E           grad_outputs[0]:
E           [[-0.70425826  0.97613996]
E            [ 0.5500437  -0.09768751]]
E           directions[0]:
E           [[-0.7262568  -0.0971007 ]
E            [-0.21805311  0.64465135]]
E           gradients (numeric):  0.0015698603367818734
E           gradients (backward): 23360998.461244226
E           
E           x: numeric gradient, y: backward gradient
E           Not equal to tolerance rtol=0.001, atol=0.001
E           
E           Mismatch: 100%
E           Max absolute difference: 23360998.45967437
E           Max relative difference: 1.
E            x: array(0.00157)
E            y: array(23360998.461244)
E           
E           assert_allclose failed: 
E             shape: () ()
E             dtype: float64 float64
E             i: (0,)
E             x[i]: 0.0015698603367818734
E             y[i]: 23360998.461244226
E             relative error[i]: 0.9999999999328
E             absolute error[i]: 23360998.459674366
E             relative tolerance * |y[i]|: 23360.998461244228
E             absolute tolerance: 0.001
E             total tolerance: 23360.999461244228
E           x: 0.00156986
E           y: 23360998.46124423
How do you suggest I fix this issue?

@toslunar
Member

The scale in the CuPy test is large because integer types are tested. To test backprop, scale=(1e-2, 2.0) looks good to me.

@UmashankarTriforce
Contributor Author

Yes, I have made the necessary changes, but the mismatch still exists (it's of a lower magnitude now, but still above the tolerance).

chainer/functions/math/cholesky.py (outdated review thread, resolved)

    def backward(self, indexes, gy):
        a, = self.get_retained_inputs()
        return functions.matmul(a, a, transb=True) * gy[0],
Member

My fix is available at UmashankarTriforce#1. Could you merge it?
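The fix itself lives in the linked branch and is not shown inline. For context, `matmul(a, a, transb=True) * gy[0]` does not match the numeric gradient because the reverse-mode rule for the Cholesky factor is more involved; the standard formula (see e.g. Iain Murray, "Differentiation of the Cholesky decomposition", 2016) can be sketched in plain NumPy and checked against finite differences. The names below are illustrative, not the PR's code:

```python
import numpy as np

def cholesky_backward(L, L_bar):
    """Given L = cholesky(A) and L_bar = dloss/dL (lower triangular),
    return A_bar = dloss/dA for symmetric positive-definite A = L @ L.T."""
    # P = Phi(L^T @ L_bar): take the lower triangle and halve the diagonal.
    P = np.tril(L.T @ L_bar)
    P -= 0.5 * np.diag(np.diag(P))
    # A_bar = L^{-T} @ P @ L^{-1} via two triangular solves,
    # then symmetrized because A is constrained to be symmetric.
    S = np.linalg.solve(L.T, P)            # L^{-T} @ P
    A_bar = np.linalg.solve(L.T, S.T).T    # (L^{-T} @ P) @ L^{-1}
    return 0.5 * (A_bar + A_bar.T)

# Finite-difference check on a random SPD matrix.
rng = np.random.RandomState(0)
n = 3
B = rng.randn(n, n)
A = B @ B.T + n * np.eye(n)                # symmetric positive definite
G = np.tril(rng.randn(n, n))               # upstream gradient w.r.t. L
L = np.linalg.cholesky(A)
A_bar = cholesky_backward(L, G)

eps = 1e-6
E = rng.randn(n, n)
E = 0.5 * (E + E.T)                        # symmetric perturbation direction
num = (np.sum(G * np.linalg.cholesky(A + eps * E)) -
       np.sum(G * np.linalg.cholesky(A - eps * E))) / (2 * eps)
assert np.isclose(num, np.sum(A_bar * E), rtol=1e-4, atol=1e-6)
```

The final assertion is exactly the check that was failing above: the directional derivative from central differences must agree with the analytic gradient contracted against the same symmetric direction.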

Contributor Author

Yes, it's done.

chainer/functions/math/cholesky.py (outdated review thread, resolved)
chainer/functions/math/cholesky.py (outdated review thread, resolved)
UmashankarTriforce and others added 7 commits October 18, 2019 09:50
Co-Authored-By: Toshiki Kataoka <tos.lunar@gmail.com>
Co-Authored-By: Toshiki Kataoka <tos.lunar@gmail.com>
Co-Authored-By: Toshiki Kataoka <tos.lunar@gmail.com>
Member

@toslunar toslunar left a comment

    def forward_chainerx(self, inputs):
        return chainerx.linalg.cholesky(*inputs),

@UmashankarTriforce
Contributor Author

UmashankarTriforce commented Oct 24, 2019

I have addressed points 1 and 3 from your review. I will wait for #8312 and make changes accordingly.

Member

@toslunar toslunar left a comment

#8312 is merged. Could you align F.cholesky with chainerx.linalg.cholesky?

chainer/functions/math/cholesky.py (outdated review thread, resolved)
        return y_expect.astype(self.dtype),

    def forward(self, inputs, device):
        a, = inputs
Member

Insert a = 0.5 * (a + a.T) to make the perturbation symmetric, like the test code for chainerx.linalg.cholesky:

    def forward_xp(self, inputs, xp):
        a, = inputs
        if _numpy_does_not_support_0d_input113 and a.size == 0:
            pytest.skip('Older NumPy versions do not work with empty arrays')
        # Input has to be symmetrized for backward test to work
        a = (a + a.T) / 2. + 1e-3 * xp.eye(*self.shape)
        L = xp.linalg.cholesky(a)
        return L,

It can also be tested that the returned gradient is symmetric. (Not mandatory in this PR.)
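The symmetrize-and-jitter trick can be seen in isolation with a NumPy-only sketch (illustrative values, not the PR's test parameters): starting from an SPD base, an asymmetric finite-difference perturbation is folded back into a symmetric, still positive-definite matrix, so the Cholesky factorization in the numeric-gradient loop never fails.

```python
import numpy as np

rng = np.random.RandomState(0)
n = 3
B = rng.randn(n, n)
base = B @ B.T + n * np.eye(n)     # symmetric positive-definite base input
noise = 1e-3 * rng.randn(n, n)     # an *asymmetric* perturbation, as in check_backward
a = base + noise                   # raw perturbed input is no longer symmetric

# Symmetrize and add a small diagonal jitter before factorizing.
a_sym = 0.5 * (a + a.T) + 1e-3 * np.eye(n)

assert np.allclose(a_sym, a_sym.T)             # symmetric by construction
assert np.all(np.linalg.eigvalsh(a_sym) > 0)   # still positive definite
L = np.linalg.cholesky(a_sym)                  # so the factorization succeeds
assert np.allclose(L @ L.T, a_sym)
```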

Member

Could you also fix TestCholesky.forward? xp means numpy or cupy (or chainerx) in Chainer.

Contributor Author

Yes, I have added this fix.

@toslunar toslunar added the st:awaiting-author State indicating that response is needed from contributors, often authors of pull request. label Nov 8, 2019
Co-Authored-By: Toshiki Kataoka <tos.lunar@gmail.com>
@UmashankarTriforce
Contributor Author

I'm extremely sorry for the late response 🙁 I had my exams in college and hence wasn't able to work on this. Have a look and let me know if any more modifications are necessary 🙂

chainer/functions/math/cholesky.py (outdated review thread, resolved)
Co-Authored-By: Toshiki Kataoka <tos.lunar@gmail.com>
@toslunar toslunar added cat:feature Implementation that introduces new interfaces. and removed st:awaiting-author State indicating that response is needed from contributors, often authors of pull request. labels Nov 14, 2019
Member

@toslunar toslunar left a comment

LGTM

@toslunar
Member

Jenkins & flexCI, test this please.

@chainer-ci
Member

Jenkins CI test (for commit a496ded, target branch master) failed with status FAILURE.

@UmashankarTriforce
Contributor Author

The error seems to be from ChainerX

@toslunar toslunar changed the title from "[WIP] Add Cholesky Decompostion" to "Add Cholesky Decompostion" on Nov 20, 2019
@toslunar toslunar added this to the v7.0.0 milestone Nov 20, 2019
@toslunar toslunar merged commit a2c63c8 into chainer:master Nov 20, 2019
@UmashankarTriforce UmashankarTriforce deleted the cholesky branch November 20, 2019 11:55