
add linalg.lstsq #2165

Merged: 9 commits merged into cupy:master on May 17, 2019

@cjekel (Contributor) commented Apr 28, 2019

Adds cupy.linalg.lstsq

This solves a least-squares problem using SVD, similar to numpy.linalg.lstsq.

Currently it returns only the least-squares solution, while numpy.linalg.lstsq also returns the residuals, rank, and singular values.

If an exact return-signature match with numpy.linalg.lstsq is required, I can modify this PR to also return the residuals, rank, and singular values.

Closes #1273
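For reference, a minimal sketch of the numpy.linalg.lstsq interface this PR mirrors (the data here is illustrative, not from the PR's tests):

```python
import numpy as np

# Fit y = c0 + c1 * t by least squares.
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])
A = np.stack([np.ones_like(t), t], axis=1)  # design matrix, shape (4, 2)

x, resids, rank, s = np.linalg.lstsq(A, y, rcond=None)
# x      -> least-squares solution, here [1., 2.] since y = 1 + 2t exactly
# resids -> sum of squared residuals (empty array if rank-deficient or m <= n)
# rank   -> effective rank of A
# s      -> singular values of A
```

Matching this four-tuple is what "interface compatibility" means in the discussion below.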


Edit 04-28-2019 22:54 EDT.

I have a benchmark comparing the performance of this cupy.linalg.lstsq against numpy.linalg.lstsq; CuPy was about six times faster on an AMD FX-8350 with an NVIDIA Titan Xp for 6,000,000 data points. I'd be curious whether this holds on other hardware configurations. The code to run the benchmark is here, and was run in the following order:

python3 sine_benchmark_fixed_six_break_points.py
python3 sine_benchmark_fixed_six_break_points_TFnoGPU.py
python3 sine_benchmark_fixed_twenty_break_points.py
python3 sine_benchmark_fixed_twenty_break_points_TFnoGPU.py
python3 plot_results.py

@asi1024 asi1024 added the cat:feature label May 5, 2019

@asi1024 (Member) commented May 5, 2019

Could you fix it for interface compatibility?

@asi1024 (Member) commented May 5, 2019

Is it possible to add the other return values with little or no overhead?

@cjekel (Contributor, Author) commented May 5, 2019

@asi1024

> Could you fix it for interface compatibility?

Does this mean returning the other values?

> Is it possible to add the other return values with little or no overhead?

I added the other return values, and cupy.linalg.lstsq is now fully compatible with numpy.linalg.lstsq. This adds some overhead, but the return signature now matches NumPy's.

The new overhead comes from computing the sum of squared residuals (the squared L2 norm):

    if rank != n or m <= n:
        resids = cupy.array([], dtype=a.dtype)
    elif b.ndim > 1:
        # note: the commented-out lines below should give the same result
        # as the loop that follows (and possibly be faster)
        # e = b - core.dot(a, x)
        # resids = cupy.diagonal(core.dot(e.T, e))
        k = b.shape[1]
        resids = cupy.zeros(k, dtype=a.dtype)
        for i in range(k):
            e = b[:, i] - core.dot(a, x[:, i])
            resids[i] = core.dot(e.T, e)
    else:
        e = b - core.dot(a, x)
        resids = core.dot(e.T, e).reshape(-1)
Review thread on the test code:

cupy.testing.assert_allclose(rank_cpu, rank_gpu, atol=1e-3)
cupy.testing.assert_allclose(s_cpu, s_gpu, atol=1e-3)
cupy.testing.assert_array_equal(a_gpu_copy, a_gpu)
cupy.testing.assert_array_equal(b_gpu_copy, b_gpu)

@asi1024 (Member) commented May 7, 2019
Doesn't this implementation need to ensure that the result of cupy.lstsq is equal to that of numpy.lstsq? In other words, does it only ensure that the ||ax-b||^2 values are equal to each other?
If so, could you fix it to check whether ||ax-b||^2 is close?

@cjekel (Contributor, Author) commented May 7, 2019
I check whether ||ax-b||^2 is close with cupy.testing.assert_allclose(resids_cpu, resids_gpu, atol=1e-3).

I'm not sure about atol=1e-3; I picked it because it is the same value used in the tests of cupy.linalg.pinv, which also uses SVD.

@asi1024 (Member) commented May 8, 2019
Doesn't this implementation ensure that the result (the first element of the return values) of cupy.lstsq is equal to that of numpy.lstsq?

@cjekel (Contributor, Author) commented May 8, 2019
The implementation ensures that each element in the result is close, but not equal. They won't be exactly equal in practice, because the backends are different.

Edit: To clarify, the implementation checks that all elements in the NumPy results are close to the CuPy results.

@asi1024 (Member) commented May 8, 2019
Why is @condition.retry(10) at L157 needed? Is one of the result values often far from the expected value?

@cjekel (Contributor, Author) commented May 8, 2019

> Why is @condition.retry(10) at L157 needed? Is one of the result values often far from the expected value?

Long response: #2165 (comment)

Resolved review threads: cupy/linalg/solve.py, cupy/linalg/solve.py (Outdated), tests/cupy_tests/linalg_tests/test_solve.py (Outdated)
Commit: address review comments
Changes:
1. Raise numpy.linalg.LinAlgError on invalid dimensions.
2. Remove the for loop in the calculation of residuals over the k dimension.
3. Clarify tests with comments and new test names.
4. Modify tests to look for numpy.linalg.LinAlgError.
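The loop removal in change 2 can be sketched as follows (illustrated with plain NumPy, since the idea is backend-agnostic; function and variable names here are illustrative, not the PR's actual code):

```python
import numpy as np

def residuals_vectorized(a, x, b):
    """Squared L2 norm of the residual for each right-hand-side column,
    computed without a Python-level loop over the k columns."""
    e = b - a.dot(x)            # residual vectors, shape (m, k)
    return (e * e).sum(axis=0)  # one squared norm per column, shape (k,)

# Small overdetermined system with two right-hand sides.
a = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([[1.0, 2.0], [1.0, 0.0], [3.0, 1.0]])
x, _, _, _ = np.linalg.lstsq(a, b, rcond=None)
r = residuals_vectorized(a, x, b)
```

The elementwise square-and-sum avoids both the per-column loop and the wasteful full k-by-k product dot(e.T, e), of which only the diagonal is needed.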
@cjekel (Contributor, Author) commented May 8, 2019

> Why is @condition.retry(10) at L157 needed? Is one of the result values often far from the expected value?

Sometimes the tests fail, in particular cupy.testing.assert_allclose(x_cpu, x_gpu, atol=1e-3), which fails in roughly 2 out of 100 runs. The obvious case is when a is singular. However, the following case is not as simple:

import numpy as np
a = np.array([[2., 5., 1., 2.],
              [2., 2., 3., 8.],
              [4., 8., 1., 8.],
              [1., 3., 5., 1.]], dtype=np.float32)
b = np.array((9., 6., 0., 3.), dtype=np.float32)

where solving lstsq as

x, resid, rank, s = np.linalg.lstsq(a, b)

produces (x_cpu is the numpy.linalg.lstsq solution, x_gpu is the cupy.linalg.lstsq solution):

AssertionError:
Not equal to tolerance rtol=1e-07, atol=0.001

Mismatch: 50%
Max absolute difference: 0.0032959
Max relative difference: 1.6645674e-05
 x_cpu: array([198.      , -64.71429 ,   6.857143, -35.142857], dtype=float32)
 x_gpu: array([198.0033  , -64.71534 ,   6.857244, -35.14343 ], dtype=float32)

Should I make the tests deterministic to avoid such failures? This could be done by supplying a random seed for each configuration of a and b. Or is it okay to retry when a test fails?
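The seeding idea can be sketched like this (a minimal illustration of the approach, not the PR's actual test code; the helper name and seeding scheme are assumptions):

```python
import numpy as np

def make_case(m, n, seed=0):
    """Generate a fixed (a, b) pair for a given shape, so every test run
    sees the same matrices instead of fresh random ones."""
    rng = np.random.RandomState(seed)
    a = rng.rand(m, n).astype(np.float32)
    b = rng.rand(m).astype(np.float32)
    return a, b

# Identical inputs on every run -> no flaky tolerance failures.
a1, b1 = make_case(5, 3)
a2, b2 = make_case(5, 3)
```

With fixed inputs, a tolerance like atol=1e-3 either always passes or always fails, which removes the need for @condition.retry.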

@asi1024 (Member) commented May 13, 2019

Yes, it seems preferable to make the tests deterministic.
This implementation branches on the condition rank != n or m <= n, so the tests should cover both regular and singular cases.
Additionally, when the given matrix is singular, the first element of the return value does not have to be equal (or close) to NumPy's, but the other elements do.

cjekel added some commits May 14, 2019

@cjekel (Contributor, Author) commented May 14, 2019

@asi1024 Thanks for your recommendations.

> Yes, it seems preferable to make the tests deterministic.

I removed the retry condition and now use fixed random seeds to generate the arrays.

> This implementation branches on the condition rank != n or m <= n, so the tests should cover both regular and singular cases.
> Additionally, when the given matrix is singular, the first element of the return value does not have to be equal (or close) to NumPy's, but the other elements do.

The tests have been reworked to also generate a singular matrix. When a singular matrix is generated, the tests no longer check that the first return element is close to the NumPy solution; all other return elements are still checked.
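One simple way to construct a deterministic singular test matrix is to duplicate a row (an illustrative construction, not necessarily the one the PR's tests use):

```python
import numpy as np

rng = np.random.RandomState(42)
a = rng.rand(4, 4)
a[3] = a[0]  # duplicate a row -> the matrix is rank-deficient

rank = np.linalg.matrix_rank(a)  # 3, not 4
```

A rank-deficient a exercises the rank != n branch, where the solution x is not unique up to the null space, so only the residuals, rank, and singular values can be compared against NumPy.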

@asi1024 asi1024 added this to the v7.0.0b1 milestone May 16, 2019

@asi1024 (Member) commented May 16, 2019

Jenkins, test this please.

@pfn-ci-bot (Collaborator) commented May 16, 2019

Successfully created a job for commit cc5012c:

@chainer-ci (Collaborator) commented May 16, 2019

Jenkins CI test (for commit cc5012c, target branch master) succeeded!

@asi1024 (Member) commented May 17, 2019

LGTM. Thank you for the PR!

@asi1024 merged commit d9d5176 into cupy:master on May 17, 2019

9 checks passed:

- Jenkins: Build finished.
- codecov/patch: 100% of diff hit (target 0%)
- continuous-integration/appveyor/pr: AppVeyor build succeeded
- continuous-integration/travis-ci/pr: The Travis CI build passed
- coverage/coveralls: Coverage increased (+0.3%) to 92.479%
- pfn-public-ci/cupy.py2.cv.examples: test succeeded
- pfn-public-ci/cupy.py2.cv.gpu: test succeeded
- pfn-public-ci/cupy.py3.cv.examples: test succeeded
- pfn-public-ci/cupy.py3.cv.gpu: test succeeded