-
-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use scipy.sparse array datastructure #5139
Conversation
Seems like a reasonable place to start. nx.to_scipy_sparse_matrix is one of the primary interfaces to scipy.sparse from within NetworkX.
Fix two instances in modularitymatrix where a new 2D array was being created via an outer product of two \"vectors\". In the matrix case, this was a row vector \* a column vector. In the array case this can be disambiguated by being explicit with np.outer.
- A few instances of matrix multiplication operator - Add np.newaxis + transpose to get shape right for broadcasting - Explicitly convert e.g. sp.sparse.spdiags to a csr_array.
- Wrap spdiags in csr_array and update matmul operators.
- Replace .A call with appropriate array semantics - wrap sparse.diags in csr_array.
- Replace * with @ - Remove superfluous calls to flatten.
- Simplify lil.getrowview call - Wrap spdiags in csr_array.
Update undirected laplacian functions.
Conflicts: - [x] networkx/algorithms/node_classification/hmn.py - [x] networkx/algorithms/node_classification/lgc.py - [x] networkx/linalg/algebraicconnectivity.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good so far. I like the notation of places where we should sparsify or not densify.
Just one comment about whether setting x should occur in an if statement or not.
Nice!
- biadjacency_matrix. - bethe_hessian_matrix. - incidence_matrix. - laplacian functions. - modularity_matrix functions. - adjacency_matrix.
Add a new conversion function to preserve array semantics internally while not altering behavior for users. Also adds FutureWarning to to_scipy_sparse_matrix.
2a9a4c0
to
b6ce853
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main questions I have are about tests/test_convert_scipy.py
where there are tests of to_scipy_sparse_array
, but none for from_scipy_sparse_array
. I'm guessing that the existing tests for from_scipy_sparse_matrix
can just be switched over to the array api. Do we want to do that now? Do we want to duplicate it now (convert one copy to array) and then remove the version for matrix later?
Otherwise this look good. I'm assuming that the sparse array itself is being tested elsewhere. So we're just testing the NetworkX usage of this data structure.
This was because I hadn't added
Correct, this was also done in 4a92d07. In fact, |
@MridulS @dschult I think we should merge this before scipy 1.8 is released. Then we can merge #5262. Before releasing networkx 2.7, we will need to update the requirements file. Since this is a big change it would be good to have it live in the main branch for a bit (potentially more people we see it and possibly test it for us). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! Thanks @rossbar!
Wow -- there is a lot in here.... :} LGTM... |
Co-authored-by: Mridul Seth <mail@mriduls.com>
Nope, not at this point. We're still waiting on the official release of scipy 1.8, but the dependency updates can be handled in a separate PR once it's out. |
I'll go ahead and merge this then 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A thorough piece of work, thank you @rossbar. Some tiny pieces of feedback, to use or lose as you choose.
edge_attribute: string | ||
Name of edge attribute to store matrix numeric value. The data will | ||
have the same type as the matrix entry (int, float, (real,imag)). | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing returns section
def from_scipy_sparse_matrix( | ||
A, parallel_edges=False, create_using=None, edge_attribute="weight" | ||
): | ||
"""Creates a new graph from an adjacency matrix given as a SciPy sparse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick, but ideally these fit on one line. We also typically use "Create" instead of "Creates", i.e., something like:
"Create a graph from a sparse adjacency matrix."
@@ -35,7 +35,7 @@ def bethe_hessian_matrix(G, r=None, nodelist=None): | |||
|
|||
Returns | |||
------- | |||
H : Numpy matrix | |||
H : scipy.sparse.csr_matrix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this level of specificity. Should the same be used above, instead of simply "SciPy sparse matrix"? Or were you keeping your options open there, in case it might change in the future?
return (r ** 2 - 1) * I - r * A + D | ||
# TODO: Rm csr_array wrapper when spdiags array creation becomes available | ||
D = sp.sparse.csr_array(sp.sparse.spdiags(A.sum(axis=1), 0, m, n, format="csr")) | ||
# TODO: Rm csr_array wrapper when eye array creation becomes available |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nicely descriptive TODO. Above there is one that simply says "TODO: csr_array"
Ah, perfect timing as usual 😂 |
Woopsie, sorry about that 😅 |
* Step 1: use sparse arrays in nx.to_scipy_sparse_matrix. Seems like a reasonable place to start. nx.to_scipy_sparse_matrix is one of the primary interfaces to scipy.sparse from within NetworkX. * 1: Use np.outer instead of mult col/row vectors Fix two instances in modularitymatrix where a new 2D array was being created via an outer product of two \"vectors\". In the matrix case, this was a row vector \* a column vector. In the array case this can be disambiguated by being explicit with np.outer. * Update _transition_matrix in laplacianmatrix module - A few instances of matrix multiplication operator - Add np.newaxis + transpose to get shape right for broadcasting - Explicitly convert e.g. sp.sparse.spdiags to a csr_array. * Update directed_combinitorial_laplacian w/ sparse array. - Wrap spdiags in csr_array and update matmul operators. * Rm matrix-specific code from lgc and hmn modules - Replace .A call with appropriate array semantics - wrap sparse.diags in csr_array. * Change hits to use sparse array semantics. - Replace * with @ - Remove superfluous calls to flatten. * Update sparse matrix usage in layout module. - Simplify lil.getrowview call - Wrap spdiags in csr_array. * lil_matrix -> lil_array in graphmatrix.py. * WIP: Start working on algebraic connectivity module. * Incorporate auth mat varname feedback. * Revert 1D slice and comment for 1D sparse future. * Add TODOs: rm csr_array wrapper around spdiags etc. * WIP: cleanup algebraicconn: tracemin_fiedler. * Typo. * Finish reviewing algebraicconnectivity. * Convert bethe_hessian matrix to use sparse arrays. * WIP: update laplacian. Update undirected laplacian functions. * WIP: laplacian - add comment about _transition_matrix return types. * Finish laplacianmatrix review. * Update attrmatrix. * Switch to official laplacian function. * Update pagerank to use sparse array. * Switch bipartite matrix to sparse arrays. * Check from_scipy_sparse_matrix works with arrays. Modifies test suite. * Apply changes from review. * Fix failing docstring tests. * Fix missing axis for in-place multiplication. * Use scipy==1.8rc2 * Use matrix multiplication * Fix PyPy CI * [MRG] Create plot_subgraphs.py example (networkx#5165) * Create plot_subgraphs.py networkx#4220 * Update plot_subgraphs.py black * Update plot_subgraphs.py lint plus font_size * Update plot_subgraphs.py added more plots * Update plot_subgraphs.py removed plots from the unit test and added comments * Update plot_subgraphs.py lint * Update plot_subgraphs.py typos fixed * Update plot_subgraphs.py added nodes to the plot of the edges removed that was commented out for whatever reason * Update plot_subgraphs.py revert the latest commit - the line was commented out for a reason - it's broken * Update plot_subgraphs.py fixed node color issue * Update plot_subgraphs.py format fix * Update plot_subgraphs.py forgot to draw the nodes... now fixed * Fix sphinx warnings about heading length. * Update examples/algorithms/plot_subgraphs.py * Update examples/algorithms/plot_subgraphs.py Co-authored-by: Ross Barnowski <rossbar@berkeley.edu> Co-authored-by: Dan Schult <dschult@colgate.edu> * Add traveling salesman problem to example gallery (networkx#4874) Adds an example of the using Christofides to solve the TSP problem to the example galery. Co-authored-by: Ross Barnowski <rossbar@berkeley.edu> * Fixed inconsistent documentation for nbunch parameter in DiGraph.edges() (networkx#5037) * Fixed inconsistent documentation for nbunch parameter in DiGraph.edges() * Resolved Requested Changes * Revert changes to degree docstrings. * Update comments in example. * Apply wording to edges method in all graph classes. Co-authored-by: Ross Barnowski <rossbar@berkeley.edu> * Compatibility updates from testing with numpy/scipy/pytest rc's (networkx#5226) * Rm deprecated scipy subpkg access. * Use recwarn fixture in place of deprecated pytest pattern. * Rm unnecessary try/except from tests. * Replace internal `close` fn with `math.isclose`. (networkx#5224) * Replace internal close fn with math.isclose. * Fix lines in docstring examples. * Fix Python 3.10 deprecation warning w/ int div. (networkx#5231) * Touchups and suggestions for subgraph gallery example (networkx#5225) * Simplify construction of G with edges rm'd * Rm unused graph attribute. * Shorten categorization by node type. * Simplify node coloring. * Simplify isomorphism check. * Rm unit test. * Rm redundant plotting of each subgraph. * Use new package name (networkx#5234) * Allowing None edges in weight function of bidirectional Dijkstra (networkx#5232) * added following feature also to bidirectional dijkstra: The weight function can be used to hide edges by returning None. * changed syntax for better readability and code duplicate avoidance Co-authored-by: Hohmann, Nikolas <nikolas.hohmann@tu-darmstadt.de> * Add an FAQ about assigning issues. (networkx#5182) * Add FAQ about assigning issues. * Add note about linking issues from new PRs. * Update dev deps (networkx#5243) * Update minor doc issues with tex notation (networkx#5244) * Add FutureWarnings to fns that return sparse matrices - biadjacency_matrix. - bethe_hessian_matrix. - incidence_matrix. - laplacian functions. - modularity_matrix functions. - adjacency_matrix. * Add to_scipy_sparse_array and use it everywhere. Add a new conversion function to preserve array semantics internally while not altering behavior for users. Also adds FutureWarning to to_scipy_sparse_matrix. * Add from_scipy_sparse_array. Supercedes from_scipy_sparse_matrix. * Handle deprecations in separate PR. * Fix docstring examples. Co-authored-by: Mridul Seth <mail@mriduls.com> Co-authored-by: Jarrod Millman <jarrod.millman@gmail.com> Co-authored-by: Andrew Knyazev <andrew.knyazev@ucdenver.edu> Co-authored-by: Dan Schult <dschult@colgate.edu> Co-authored-by: eskountis <56514439+eskountis@users.noreply.github.com> Co-authored-by: Anutosh Bhat <87052487+anutosh491@users.noreply.github.com> Co-authored-by: NikHoh <nikhoh@web.de> Co-authored-by: Hohmann, Nikolas <nikolas.hohmann@tu-darmstadt.de> Co-authored-by: Sultan Orazbayev <contact@econpoint.com> Co-authored-by: Mridul Seth <mail@mriduls.com>
* Step 1: use sparse arrays in nx.to_scipy_sparse_matrix. Seems like a reasonable place to start. nx.to_scipy_sparse_matrix is one of the primary interfaces to scipy.sparse from within NetworkX. * 1: Use np.outer instead of mult col/row vectors Fix two instances in modularitymatrix where a new 2D array was being created via an outer product of two \"vectors\". In the matrix case, this was a row vector \* a column vector. In the array case this can be disambiguated by being explicit with np.outer. * Update _transition_matrix in laplacianmatrix module - A few instances of matrix multiplication operator - Add np.newaxis + transpose to get shape right for broadcasting - Explicitly convert e.g. sp.sparse.spdiags to a csr_array. * Update directed_combinitorial_laplacian w/ sparse array. - Wrap spdiags in csr_array and update matmul operators. * Rm matrix-specific code from lgc and hmn modules - Replace .A call with appropriate array semantics - wrap sparse.diags in csr_array. * Change hits to use sparse array semantics. - Replace * with @ - Remove superfluous calls to flatten. * Update sparse matrix usage in layout module. - Simplify lil.getrowview call - Wrap spdiags in csr_array. * lil_matrix -> lil_array in graphmatrix.py. * WIP: Start working on algebraic connectivity module. * Incorporate auth mat varname feedback. * Revert 1D slice and comment for 1D sparse future. * Add TODOs: rm csr_array wrapper around spdiags etc. * WIP: cleanup algebraicconn: tracemin_fiedler. * Typo. * Finish reviewing algebraicconnectivity. * Convert bethe_hessian matrix to use sparse arrays. * WIP: update laplacian. Update undirected laplacian functions. * WIP: laplacian - add comment about _transition_matrix return types. * Finish laplacianmatrix review. * Update attrmatrix. * Switch to official laplacian function. * Update pagerank to use sparse array. * Switch bipartite matrix to sparse arrays. * Check from_scipy_sparse_matrix works with arrays. Modifies test suite. * Apply changes from review. * Fix failing docstring tests. * Fix missing axis for in-place multiplication. * Use scipy==1.8rc2 * Use matrix multiplication * Fix PyPy CI * [MRG] Create plot_subgraphs.py example (networkx#5165) * Create plot_subgraphs.py networkx#4220 * Update plot_subgraphs.py black * Update plot_subgraphs.py lint plus font_size * Update plot_subgraphs.py added more plots * Update plot_subgraphs.py removed plots from the unit test and added comments * Update plot_subgraphs.py lint * Update plot_subgraphs.py typos fixed * Update plot_subgraphs.py added nodes to the plot of the edges removed that was commented out for whatever reason * Update plot_subgraphs.py revert the latest commit - the line was commented out for a reason - it's broken * Update plot_subgraphs.py fixed node color issue * Update plot_subgraphs.py format fix * Update plot_subgraphs.py forgot to draw the nodes... now fixed * Fix sphinx warnings about heading length. * Update examples/algorithms/plot_subgraphs.py * Update examples/algorithms/plot_subgraphs.py Co-authored-by: Ross Barnowski <rossbar@berkeley.edu> Co-authored-by: Dan Schult <dschult@colgate.edu> * Add traveling salesman problem to example gallery (networkx#4874) Adds an example of the using Christofides to solve the TSP problem to the example galery. Co-authored-by: Ross Barnowski <rossbar@berkeley.edu> * Fixed inconsistent documentation for nbunch parameter in DiGraph.edges() (networkx#5037) * Fixed inconsistent documentation for nbunch parameter in DiGraph.edges() * Resolved Requested Changes * Revert changes to degree docstrings. * Update comments in example. * Apply wording to edges method in all graph classes. Co-authored-by: Ross Barnowski <rossbar@berkeley.edu> * Compatibility updates from testing with numpy/scipy/pytest rc's (networkx#5226) * Rm deprecated scipy subpkg access. * Use recwarn fixture in place of deprecated pytest pattern. * Rm unnecessary try/except from tests. * Replace internal `close` fn with `math.isclose`. (networkx#5224) * Replace internal close fn with math.isclose. * Fix lines in docstring examples. * Fix Python 3.10 deprecation warning w/ int div. (networkx#5231) * Touchups and suggestions for subgraph gallery example (networkx#5225) * Simplify construction of G with edges rm'd * Rm unused graph attribute. * Shorten categorization by node type. * Simplify node coloring. * Simplify isomorphism check. * Rm unit test. * Rm redundant plotting of each subgraph. * Use new package name (networkx#5234) * Allowing None edges in weight function of bidirectional Dijkstra (networkx#5232) * added following feature also to bidirectional dijkstra: The weight function can be used to hide edges by returning None. * changed syntax for better readability and code duplicate avoidance Co-authored-by: Hohmann, Nikolas <nikolas.hohmann@tu-darmstadt.de> * Add an FAQ about assigning issues. (networkx#5182) * Add FAQ about assigning issues. * Add note about linking issues from new PRs. * Update dev deps (networkx#5243) * Update minor doc issues with tex notation (networkx#5244) * Add FutureWarnings to fns that return sparse matrices - biadjacency_matrix. - bethe_hessian_matrix. - incidence_matrix. - laplacian functions. - modularity_matrix functions. - adjacency_matrix. * Add to_scipy_sparse_array and use it everywhere. Add a new conversion function to preserve array semantics internally while not altering behavior for users. Also adds FutureWarning to to_scipy_sparse_matrix. * Add from_scipy_sparse_array. Supercedes from_scipy_sparse_matrix. * Handle deprecations in separate PR. * Fix docstring examples. Co-authored-by: Mridul Seth <mail@mriduls.com> Co-authored-by: Jarrod Millman <jarrod.millman@gmail.com> Co-authored-by: Andrew Knyazev <andrew.knyazev@ucdenver.edu> Co-authored-by: Dan Schult <dschult@colgate.edu> Co-authored-by: eskountis <56514439+eskountis@users.noreply.github.com> Co-authored-by: Anutosh Bhat <87052487+anutosh491@users.noreply.github.com> Co-authored-by: NikHoh <nikhoh@web.de> Co-authored-by: Hohmann, Nikolas <nikolas.hohmann@tu-darmstadt.de> Co-authored-by: Sultan Orazbayev <contact@econpoint.com> Co-authored-by: Mridul Seth <mail@mriduls.com>
* Step 1: use sparse arrays in nx.to_scipy_sparse_matrix. Seems like a reasonable place to start. nx.to_scipy_sparse_matrix is one of the primary interfaces to scipy.sparse from within NetworkX. * 1: Use np.outer instead of mult col/row vectors Fix two instances in modularitymatrix where a new 2D array was being created via an outer product of two \"vectors\". In the matrix case, this was a row vector \* a column vector. In the array case this can be disambiguated by being explicit with np.outer. * Update _transition_matrix in laplacianmatrix module - A few instances of matrix multiplication operator - Add np.newaxis + transpose to get shape right for broadcasting - Explicitly convert e.g. sp.sparse.spdiags to a csr_array. * Update directed_combinitorial_laplacian w/ sparse array. - Wrap spdiags in csr_array and update matmul operators. * Rm matrix-specific code from lgc and hmn modules - Replace .A call with appropriate array semantics - wrap sparse.diags in csr_array. * Change hits to use sparse array semantics. - Replace * with @ - Remove superfluous calls to flatten. * Update sparse matrix usage in layout module. - Simplify lil.getrowview call - Wrap spdiags in csr_array. * lil_matrix -> lil_array in graphmatrix.py. * WIP: Start working on algebraic connectivity module. * Incorporate auth mat varname feedback. * Revert 1D slice and comment for 1D sparse future. * Add TODOs: rm csr_array wrapper around spdiags etc. * WIP: cleanup algebraicconn: tracemin_fiedler. * Typo. * Finish reviewing algebraicconnectivity. * Convert bethe_hessian matrix to use sparse arrays. * WIP: update laplacian. Update undirected laplacian functions. * WIP: laplacian - add comment about _transition_matrix return types. * Finish laplacianmatrix review. * Update attrmatrix. * Switch to official laplacian function. * Update pagerank to use sparse array. * Switch bipartite matrix to sparse arrays. * Check from_scipy_sparse_matrix works with arrays. Modifies test suite. * Apply changes from review. * Fix failing docstring tests. * Fix missing axis for in-place multiplication. * Use scipy==1.8rc2 * Use matrix multiplication * Fix PyPy CI * [MRG] Create plot_subgraphs.py example (networkx#5165) * Create plot_subgraphs.py networkx#4220 * Update plot_subgraphs.py black * Update plot_subgraphs.py lint plus font_size * Update plot_subgraphs.py added more plots * Update plot_subgraphs.py removed plots from the unit test and added comments * Update plot_subgraphs.py lint * Update plot_subgraphs.py typos fixed * Update plot_subgraphs.py added nodes to the plot of the edges removed that was commented out for whatever reason * Update plot_subgraphs.py revert the latest commit - the line was commented out for a reason - it's broken * Update plot_subgraphs.py fixed node color issue * Update plot_subgraphs.py format fix * Update plot_subgraphs.py forgot to draw the nodes... now fixed * Fix sphinx warnings about heading length. * Update examples/algorithms/plot_subgraphs.py * Update examples/algorithms/plot_subgraphs.py Co-authored-by: Ross Barnowski <rossbar@berkeley.edu> Co-authored-by: Dan Schult <dschult@colgate.edu> * Add traveling salesman problem to example gallery (networkx#4874) Adds an example of the using Christofides to solve the TSP problem to the example galery. Co-authored-by: Ross Barnowski <rossbar@berkeley.edu> * Fixed inconsistent documentation for nbunch parameter in DiGraph.edges() (networkx#5037) * Fixed inconsistent documentation for nbunch parameter in DiGraph.edges() * Resolved Requested Changes * Revert changes to degree docstrings. * Update comments in example. * Apply wording to edges method in all graph classes. Co-authored-by: Ross Barnowski <rossbar@berkeley.edu> * Compatibility updates from testing with numpy/scipy/pytest rc's (networkx#5226) * Rm deprecated scipy subpkg access. * Use recwarn fixture in place of deprecated pytest pattern. * Rm unnecessary try/except from tests. * Replace internal `close` fn with `math.isclose`. (networkx#5224) * Replace internal close fn with math.isclose. * Fix lines in docstring examples. * Fix Python 3.10 deprecation warning w/ int div. (networkx#5231) * Touchups and suggestions for subgraph gallery example (networkx#5225) * Simplify construction of G with edges rm'd * Rm unused graph attribute. * Shorten categorization by node type. * Simplify node coloring. * Simplify isomorphism check. * Rm unit test. * Rm redundant plotting of each subgraph. * Use new package name (networkx#5234) * Allowing None edges in weight function of bidirectional Dijkstra (networkx#5232) * added following feature also to bidirectional dijkstra: The weight function can be used to hide edges by returning None. * changed syntax for better readability and code duplicate avoidance Co-authored-by: Hohmann, Nikolas <nikolas.hohmann@tu-darmstadt.de> * Add an FAQ about assigning issues. (networkx#5182) * Add FAQ about assigning issues. * Add note about linking issues from new PRs. * Update dev deps (networkx#5243) * Update minor doc issues with tex notation (networkx#5244) * Add FutureWarnings to fns that return sparse matrices - biadjacency_matrix. - bethe_hessian_matrix. - incidence_matrix. - laplacian functions. - modularity_matrix functions. - adjacency_matrix. * Add to_scipy_sparse_array and use it everywhere. Add a new conversion function to preserve array semantics internally while not altering behavior for users. Also adds FutureWarning to to_scipy_sparse_matrix. * Add from_scipy_sparse_array. Supercedes from_scipy_sparse_matrix. * Handle deprecations in separate PR. * Fix docstring examples. Co-authored-by: Mridul Seth <mail@mriduls.com> Co-authored-by: Jarrod Millman <jarrod.millman@gmail.com> Co-authored-by: Andrew Knyazev <andrew.knyazev@ucdenver.edu> Co-authored-by: Dan Schult <dschult@colgate.edu> Co-authored-by: eskountis <56514439+eskountis@users.noreply.github.com> Co-authored-by: Anutosh Bhat <87052487+anutosh491@users.noreply.github.com> Co-authored-by: NikHoh <nikhoh@web.de> Co-authored-by: Hohmann, Nikolas <nikolas.hohmann@tu-darmstadt.de> Co-authored-by: Sultan Orazbayev <contact@econpoint.com> Co-authored-by: Mridul Seth <mail@mriduls.com>
This is a currently in-progress experiment to see what it would take to adopt the sparse array interface proposed in scipy/scipy#14822. Supporting array semantics for sparse data structures will in the long run be a huge improvement for NetworkX IMO. Both dense arrays (i.e. numpy arrays) and sparse arrays are used extensively throughout, and a decent amount of code is dedicated to implementing something for one or the other data structure. The ability to have these data structures behave the same and use them interchangeable would be excellent! Of course, scipy/scipy#14822 doesn't get all the way there yet, but it is a step in the right direction. This PR is to help flesh out the work there and to help start to frame the type of improvements we'd see in NetworkX if such a feature were to be added upstream.
Note also that this is very much a work in progress - the scipy.sparse array API itself is going to be changing a lot, so a lot of the changes here (for example, wrapping objects created with
spdiags
insparse.csr_array
calls) is subject to change.