
[MRG+1] Update discriminant analysis code for better memory usage #10904

Merged: 7 commits merged into scikit-learn:master on Apr 25, 2018

Conversation

bobchennan (Contributor)

Fixed #10898

@rth (Member) left a comment

Thanks for this PR @bobchennan !

Please add [MRG] at the beginning of the PR title if this is ready for review. A few comments are below.

What's the estimated memory usage gain from this change? (You can measure memory usage, for instance, with memory_profiler.)
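For reference, the memory effect of the two accumulation patterns can be sketched with the stdlib tracemalloc module (the comment suggests memory_profiler; tracemalloc is a dependency-free stand-in here, and the shapes below are illustrative, not taken from the PR):

```python
import tracemalloc
import numpy as np

n_classes, n_features = 200, 100
rng = np.random.RandomState(0)
data = [rng.randn(50, n_features) for _ in range(n_classes)]

def peak_mib(fn):
    # peak traced allocation, in MiB, while fn runs
    tracemalloc.start()
    fn()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak / 2 ** 20

def covs_as_list():
    # old pattern: keep every per-class covariance, then average the list
    covs = [np.cov(Xg, rowvar=False, bias=True) for Xg in data]
    return np.average(covs, axis=0)

def covs_running_sum():
    # new pattern: accumulate one running sum, never store all matrices
    cov = np.zeros((n_features, n_features))
    for Xg in data:
        cov += np.cov(Xg, rowvar=False, bias=True) / n_classes
    return cov

assert np.allclose(covs_as_list(), covs_running_sum())
assert peak_mib(covs_as_list) > peak_mib(covs_running_sum)
```

The running-sum version holds at most one class covariance at a time, so its peak scales with n_features² rather than n_classes × n_features².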

    covs.append(np.atleast_2d(_cov(Xg, shrinkage)))
    return np.average(covs, axis=0, weights=priors)
    if priors is None:
        cov = cov + np.atleast_2d(_cov(Xg, shrinkage)) / len(classes)
Member

Since cov is already allocated, you could just do an in-place add: cov += ...
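A minimal illustration of why the in-place form helps: += reuses the already-allocated buffer instead of creating a new array (the arrays here are toy data):

```python
import numpy as np

cov = np.zeros((3, 3))
update = np.eye(3)

# `cov = cov + update` allocates a fresh result array and rebinds the name;
# `cov += update` writes into the buffer that is already allocated.
buf_before = cov.__array_interface__['data'][0]  # address of the data buffer
cov += update
buf_after = cov.__array_interface__['data'][0]

assert buf_before == buf_after        # same buffer: no new allocation
assert np.allclose(cov, np.eye(3))
```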

    if priors is None:
        cov = cov + np.atleast_2d(_cov(Xg, shrinkage)) / len(classes)
    else:
        cov = cov + priors[idx] * np.atleast_2d(_cov(Xg, shrinkage))
Member

I'm not familiar with this part of the code base, but if I understand correctly #10898 was only about memory usage and I don't see how this line can be equivalent to the one above (if priors=None).

@jnothman (Member) left a comment

I failed to submit these comments hours ago...

    classes = np.unique(y)
    for group in classes:
    means = np.empty(shape=(len(classes), X.shape[1]), dtype=X.dtype)
    for idx, group in enumerate(classes):
Member

This looks good, though FWIW, I think we could do this without a loop, using np.add.at

Contributor Author

Not sure I understand this part. I can rewrite it as

    means = np.array([X[y == group].mean(0) for group in classes])

but the loop is still inside.

    if priors is None:
        cov = cov + np.atleast_2d(_cov(Xg, shrinkage)) / len(classes)
    else:
        cov = cov + priors[idx] * np.atleast_2d(_cov(Xg, shrinkage))
Member

Please avoid this duplication by applying the if only to the prior factor, not to the whole statement.

You also make the assumption that priors add to 1. The previous version did not. Please verify that this is a safe assumption.
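This matters because np.average divides by the sum of the weights, while a running weighted sum does not; a small illustration with made-up numbers:

```python
import numpy as np

covs = [np.eye(2) * 2.0, np.eye(2) * 4.0]
priors = np.array([0.3, 0.3])  # deliberately does NOT sum to 1

# old behaviour: np.average normalizes by sum(weights)
avg = np.average(covs, axis=0, weights=priors)

# plain weighted sum, as in the new loop
total = priors[0] * covs[0] + priors[1] * covs[1]

assert np.allclose(avg, np.eye(2) * 3.0)      # (0.3*2 + 0.3*4) / 0.6 = 3
assert np.allclose(total, np.eye(2) * 1.8)    # 0.3*2 + 0.3*4 = 1.8
assert not np.allclose(avg, total)            # equal only if priors sum to 1
```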

@bobchennan (Contributor Author)

Regarding the sum of priors: in the fit function it is already assigned (thus not None) and normalized.

@bobchennan bobchennan changed the title Update discriminant analysis code for better memory usage [MRG]Update discriminant analysis code for better memory usage Apr 5, 2018
@bobchennan (Contributor Author)

For the memory usage I will give an example later.

    classes, ny = np.unique(y, return_inverse=True)
    cnt = np.bincount(ny)
    means = np.zeros(shape=(len(classes), X.shape[1]))
    for idx in xrange(X.shape[0]):
Member

This loop over n_samples is slow.
What about the equivalent array operation:

    means = np.zeros(shape=(len(classes), X.shape[1]))
    np.add.at(means, ny, X)
    means /= cnt[:, None]
    return means

Contributor Author

Looks good! I didn't know about this function before.
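The np.add.at version can be checked against the loop-based means on toy data (shapes and labels here are illustrative):

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(10, 3)
y = np.array([0, 1, 0, 2, 1, 0, 2, 2, 1, 0])

classes, ny = np.unique(y, return_inverse=True)
cnt = np.bincount(ny)

# vectorized per-class means via unbuffered scatter-add
means = np.zeros(shape=(len(classes), X.shape[1]))
np.add.at(means, ny, X)   # adds each row of X into its class's slot
means /= cnt[:, None]

# reference: one mean per class with a Python-level loop
expected = np.array([X[y == g].mean(0) for g in classes])
assert np.allclose(means, expected)
```

np.add.at is unbuffered, so repeated indices (several samples in the same class) all accumulate correctly, unlike plain fancy-index assignment.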

    Xg = X[y == group, :]
    covs.append(np.atleast_2d(_cov(Xg, shrinkage)))
    return np.average(covs, axis=0, weights=priors)
    cov += priors[idx] * np.atleast_2d(_cov(Xg, shrinkage))
Member

This line does not work if priors is None. You should do something like:

cov_g = np.atleast_2d(_cov(Xg, shrinkage))
if priors is not None:
    cov_g *= priors[idx]
cov += cov_g
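Folding that branch into the surrounding loop could look roughly like this; np.cov stands in for sklearn's private _cov helper, and the function name class_cov is hypothetical:

```python
import numpy as np

def class_cov(X, y, priors=None):
    # Sketch of the pooled within-class covariance with a running sum.
    # np.cov(..., bias=True) is a stand-in for sklearn's _cov helper.
    classes = np.unique(y)
    cov = np.zeros((X.shape[1], X.shape[1]))
    for idx, group in enumerate(classes):
        Xg = X[y == group, :]
        cov_g = np.atleast_2d(np.cov(Xg, rowvar=False, bias=True))
        if priors is not None:
            cov_g *= priors[idx]          # weight by the class prior
        else:
            cov_g /= len(classes)         # uniform average over classes
        cov += cov_g
    return cov

rng = np.random.RandomState(0)
X = rng.randn(30, 4)
y = rng.randint(0, 3, size=30)
uniform = class_cov(X, y)
weighted = class_cov(X, y, priors=np.full(3, 1 / 3))
assert np.allclose(uniform, weighted)  # equal priors match the uniform average
```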

Member

Oh, priors cannot be None.
Then maybe we should remove the default, which implies None is a valid input.

    covs.append(np.atleast_2d(_cov(Xg, shrinkage)))
    return np.average(covs, axis=0, weights=priors)
    cov += priors[idx] * np.atleast_2d(_cov(Xg, shrinkage))
    return cov
Member

In your version, you don't return the weighted average as before, but the weighted sum.
You should divide cov by the sum of priors, or by n_classes if the priors are None.
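To see the fix in isolation: dividing the running weighted sum by the weight total recovers what np.average computed before (the arrays and weights here are illustrative):

```python
import numpy as np

covs = np.array([np.eye(2), 3 * np.eye(2)])
priors = np.array([0.2, 0.6])   # need not sum to 1

# running weighted sum, as in the new loop
weighted_sum = np.zeros((2, 2))
for p, c in zip(priors, covs):
    weighted_sum += p * c

# dividing by the weight total matches the old np.average result
assert np.allclose(weighted_sum / priors.sum(),
                   np.average(covs, axis=0, weights=priors))
```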

Member

OK, you answered it already, my mistake.

@bobchennan (Contributor Author)

One example of memory usage is given here.
New implementation reduced memory usage from 2.2GB to 838MB for 20000 classes case.

@TomDLT TomDLT changed the title [MRG]Update discriminant analysis code for better memory usage [MRG+1]Update discriminant analysis code for better memory usage Apr 5, 2018
@jnothman (Member) left a comment

LGTM otherwise.

    Xg = X[y == group, :]
    means.append(Xg.mean(0))
    return np.asarray(means)
    classes, ny = np.unique(y, return_inverse=True)
Member

I don't know what ny means. Perhaps just overwrite y?

@jnothman (Member) commented Apr 9, 2018

Please add an entry to the Enhancements change log at doc/whats_new/v0.20.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself (and other contributors, if applicable) with :user:.

@bobchennan (Contributor Author)

@rth any suggestions?

@TomDLT TomDLT merged commit 481dac7 into scikit-learn:master Apr 25, 2018
@TomDLT (Member) commented Apr 25, 2018

Thank you @bobchennan !
