Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid extra data copy induced by np.ravel and np.flatten #5555

Open
rfezzani opened this issue Sep 2, 2021 · 3 comments
Open

Avoid extra data copy induced by np.ravel and np.flatten #5555

rfezzani opened this issue Sep 2, 2021 · 3 comments
Labels

Comments

@rfezzani
Copy link
Member

rfezzani commented Sep 2, 2021

Description

Some unwanted extra data copies can be avoided when a view is needed by using ndarray.reshape(-1) instead of np.ravel and np.flatten.

Upd(@soupault): More info - https://stackoverflow.com/questions/28930465/what-is-the-difference-between-flatten-and-ravel-functions-in-numpy

@andres-fr
Copy link
Contributor

Usage of ravel and flatten in the repo can be queried via the following commands:

grep -r --include=*.py --include=*.pyx "ravel(" .
grep -r --include=*.py --include=*.pyx "flatten(" .

Which at the current master HEAD returns 108 and 11 matches, respectively. For each match, an audit of the code is needed to figure out whether a copy is necessary.

My suggestion is to divide this issue into 2 PRs, one for the 11 flatten matches, that can be implemented and merged to main first, and a second one with the ravel matches that could be done afterwards (maybe we learn something from flatten that helps there).

Another further enhancement would be to expand CI with a further test that checks for ravel and flatten usage and suggests using reshape(-1) instead if possible, with e.g. a message like this one:

np.ravel() and np.flatten() may return array copies. If that is not needed, consider using ndarray.reshape(-1) instead

Upon every PR commit, flake8bot already reports devs with flaws found. It seems that flake8 can be configured to include custom, arbitrary checks like this one (I'm not sure if flake8 has access to the Python AST, but maybe a string matching against {"ravel(", "flatten("} would be enough?)

@rfezzani
Copy link
Member Author

rfezzani commented Sep 2, 2021

My suggestion is to divide this issue into 2 PRs, one for the 11 flatten matches, that can be implemented and merged to main first, and a second one with the ravel matches that could be done afterwards (maybe we learn something from flatten that helps there).

Go for it @andres-fr !

Another further enhancement would be to expand CI with a further test that checks for ravel and flatten usage and suggests using reshape(-1) instead if possible...

This is in fact interesting, I don't know if @scikit-image/core members are OK with such CI features...

@mkcor
Copy link
Member

mkcor commented Sep 3, 2021

My suggestion is to divide this issue into 2 PRs, one for the 11 flatten matches, that can be implemented and merged to main first, and a second one with the ravel matches that could be done afterwards (maybe we learn something from flatten that helps there).

Go for it @andres-fr !

👍 Thanks!

Another further enhancement would be to expand CI with a further test that checks for ravel and flatten usage and suggests using reshape(-1) instead if possible...

This is in fact interesting, I don't know if @scikit-image/core members are OK with such CI features...

Why not? I think it could be a nice CI addition, but let's roll it only once the current instances of ravel and flatten are audited:

For each match, an audit of the code is needed to figure out whether a copy is necessary.

👍 so all instances are either updated with reshape(-1) or marked as checked (so flake8 ignores them).

@scikit-image scikit-image locked and limited conversation to collaborators Oct 18, 2021
@grlee77 grlee77 reopened this Feb 20, 2022
@scikit-image scikit-image unlocked this conversation Feb 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants