New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Reduce runtime of graph_lasso #9858

Merged
merged 4 commits into from Oct 2, 2017

Conversation

Projects
None yet
3 participants
@stevendbrown
Contributor

stevendbrown commented Sep 30, 2017

Reference Issue

None.

What does this implement/fix? Explain your changes.

For a 1288x1288 empirical covariance matrix, sklearn.covariance.graph_lasso currently spends 70-80% of it's runtime creating sub_covariance on my computer. This change reduces that to ~3.5% of the function's runtime, and np.allclose(...) comparing both of the arrays in the result tuple return True for each. Iterations have the same losses & dual gaps, and calls to graph_lasso end after the same number of iterations, so I don't see a functional change with my data.

Any other comments?

Just noticed that this may interact with #4787. My dataset is quite dense, so the runtime of graph_lasso based on that PR is almost double that of the original function.

@TomDLT

This comment has been minimized.

Show comment
Hide comment
@TomDLT

TomDLT Oct 2, 2017

Member

LGTM

  • Maybe add a line comment to explain that we only need to update one line and one column.
  • The tests fail because your changes are not PEP8 (line too long).
Member

TomDLT commented Oct 2, 2017

LGTM

  • Maybe add a line comment to explain that we only need to update one line and one column.
  • The tests fail because your changes are not PEP8 (line too long).

@TomDLT TomDLT changed the title from reduce runtime of graph_lasso to [MRG] Reduce runtime of graph_lasso Oct 2, 2017

@stevendbrown

This comment has been minimized.

Show comment
Hide comment
@stevendbrown

stevendbrown Oct 2, 2017

Contributor

Hi @TomDLT, I think the PEP8 error is now resolved; the tests looked OK on my screen. I've added said comment. It took me a little while to convince myself that this was true, but this short script can also verify it for anyone who is interested:

import numpy as np

t = np.random.randint(1, 100, size=(100,100))

indices = np.arange(t.shape[0])
subt = np.ascontiguousarray(t[1:, 1:])
for idx in indices:
    if idx > 0:
        subt[idx-1] = t[idx-1][indices != idx]
        subt[:, idx-1] = t[:, idx-1][indices != idx]
    else:
        subt[:] = t[1:, 1:]
    assert np.all(subt == t[indices != idx].T[indices != idx].T)
Contributor

stevendbrown commented Oct 2, 2017

Hi @TomDLT, I think the PEP8 error is now resolved; the tests looked OK on my screen. I've added said comment. It took me a little while to convince myself that this was true, but this short script can also verify it for anyone who is interested:

import numpy as np

t = np.random.randint(1, 100, size=(100,100))

indices = np.arange(t.shape[0])
subt = np.ascontiguousarray(t[1:, 1:])
for idx in indices:
    if idx > 0:
        subt[idx-1] = t[idx-1][indices != idx]
        subt[:, idx-1] = t[:, idx-1][indices != idx]
    else:
        subt[:] = t[1:, 1:]
    assert np.all(subt == t[indices != idx].T[indices != idx].T)
@TomDLT

This comment has been minimized.

Show comment
Hide comment
@TomDLT

TomDLT Oct 2, 2017

Member

I think the PEP8 error is now resolved;

Indeed, for some weird reason I don't see the commit 070fdcf in this page...

Anyway, thanks for the script, this is indeed non trivial.

Member

TomDLT commented Oct 2, 2017

I think the PEP8 error is now resolved;

Indeed, for some weird reason I don't see the commit 070fdcf in this page...

Anyway, thanks for the script, this is indeed non trivial.

@stevendbrown

This comment has been minimized.

Show comment
Hide comment
@stevendbrown

stevendbrown Oct 2, 2017

Contributor

@TomDLT I've replaced my version of the comment with yours. Thanks.

Contributor

stevendbrown commented Oct 2, 2017

@TomDLT I've replaced my version of the comment with yours. Thanks.

@TomDLT TomDLT changed the title from [MRG] Reduce runtime of graph_lasso to [MRG+1] Reduce runtime of graph_lasso Oct 2, 2017

@TomDLT

This comment has been minimized.

Show comment
Hide comment
@TomDLT

TomDLT Oct 2, 2017

Member

All right, I approved this change, let's just wait for a second reviewer

Member

TomDLT commented Oct 2, 2017

All right, I approved this change, let's just wait for a second reviewer

@agramfort agramfort merged commit 692cd8b into scikit-learn:master Oct 2, 2017

6 checks passed

ci/circleci Your tests passed on CircleCI!
Details
codecov/patch 100% of diff hit (target 96.17%)
Details
codecov/project 96.17% (+<.01%) compared to 4e6b403
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
lgtm analysis: Python No alert changes
Details
@agramfort

This comment has been minimized.

Show comment
Hide comment
@agramfort
Member

agramfort commented Oct 2, 2017

maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

[MRG+1] Reduce runtime of graph_lasso (scikit-learn#9858)
* reduce runtime of graph_lasso

* fixed line length overrun

* added comment explaining the change

* changed explanation comment

jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017

[MRG+1] Reduce runtime of graph_lasso (scikit-learn#9858)
* reduce runtime of graph_lasso

* fixed line length overrun

* added comment explaining the change

* changed explanation comment

@stevendbrown stevendbrown deleted the stevendbrown:graph_lasso_runtime branch May 29, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment