Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added case in derive_by_array for sparse array #16937

Merged
merged 8 commits into from Jun 10, 2019

Conversation

Projects
None yet
5 participants
@kangzhiq
Copy link
Contributor

commented Jun 1, 2019

References to other Issues or PRs

#16941

Brief description of what is fixed or changed

The sparse array is cast to dense array while performing a derivation. Using directly sparse array will bring an improvement when it comes to a great dimension array whose value are mostly zeros.
The sparse was cast to dense array for derivative by both array and scalar.
Before:

>>> a = MutableSparseNDimArray.zeros(10000, 20000)
>>> a[1, 1] = i
>>> a[1, 2] = j
>>> d = derive_by_array(a, i)
MemoryError
>>> d = derive_by_array(a, (i, j))
MemoryError

Now:

>>> a = MutableSparseNDimArray.zeros(10000, 20000)
>>> a[1, 1] = i
>>> derive_by_array(a, i) = ImmutableSparseNDimArray({20001:1},(10000, 20000))
True
>>> a[1, 2] = j
>>> derive_by_array(a, (i, j)) == ImmutableSparseNDimArray({20001: 1, 200020002: 1},(2, 10000, 20000))
True

Other comments

Release Notes

  • tensor
    • Added specific case for sparse array in derive_by_array
Added case in derive_by_array for sparse array
- derive_by_array for sparse array will not be case to dense array
@sympy-bot

This comment has been minimized.

Copy link

commented Jun 1, 2019

Hi, I am the SymPy bot (v147). I'm here to help you write a release notes entry. Please read the guide on how to write release notes.

Your release notes are in good order.

Here is what the release notes will look like:

  • tensor
    • Added specific case for sparse array in derive_by_array (#16937 by @kangzhiq)

This will be added to https://github.com/sympy/sympy/wiki/Release-Notes-for-1.5.

Note: This comment will be updated with the latest check if you edit the pull request. You need to reload the page to see it.

Click here to see the pull request description that was parsed.

<!-- Your title above should be a short description of what
was changed. Do not include the issue number in the title. -->

#### References to other Issues or PRs
<!-- If this pull request fixes an issue, write "Fixes #NNNN" in that exact
format, e.g. "Fixes #1234". See
https://github.com/blog/1506-closing-issues-via-pull-requests . Please also
write a comment on that issue linking back to this pull request once it is
open. -->
#16941

#### Brief description of what is fixed or changed
The sparse array is cast to dense array while performing a derivation. Using directly sparse array will bring an improvement when it comes to a great dimension array whose value are mostly zeros. 
The sparse was cast to dense array for derivative by both **array** and **scalar**. 
Before:
```python
>>> a = MutableSparseNDimArray.zeros(10000, 20000)
>>> a[1, 1] = i
>>> a[1, 2] = j
>>> d = derive_by_array(a, i)
MemoryError
>>> d = derive_by_array(a, (i, j))
MemoryError
```
Now:
```python
>>> a = MutableSparseNDimArray.zeros(10000, 20000)
>>> a[1, 1] = i
>>> derive_by_array(a, i) = ImmutableSparseNDimArray({20001:1},(10000, 20000))
True
>>> a[1, 2] = j
>>> derive_by_array(a, (i, j)) == ImmutableSparseNDimArray({20001: 1, 200020002: 1},(2, 10000, 20000))
True
```

#### Other comments 

#### Release Notes

<!-- Write the release notes for this release below. See
https://github.com/sympy/sympy/wiki/Writing-Release-Notes for more information
on how to write release notes. The bot will check your release notes
automatically to see if they are formatted correctly. -->

<!-- BEGIN RELEASE NOTES -->
* tensor
  * Added specific case for sparse array in derive_by_array 
<!-- END RELEASE NOTES -->

Update

The release notes on the wiki have been updated.

Show resolved Hide resolved sympy/tensor/array/arrayop.py
Show resolved Hide resolved sympy/tensor/array/ndim_array.py Outdated
Show resolved Hide resolved sympy/tensor/array/sparse_ndim_array.py Outdated
Ameliorated some codes
- regrouped if blocs in _handle_ndarray_creation_inputs
- ameliorated codes modified or added
@codecov

This comment has been minimized.

Copy link

commented Jun 2, 2019

Codecov Report

Merging #16937 into master will increase coverage by 0.233%.
The diff coverage is 94.736%.

@@              Coverage Diff              @@
##            master    #16937       +/-   ##
=============================================
+ Coverage   73.922%   74.156%   +0.233%     
=============================================
  Files          620       620               
  Lines       160243    160548      +305     
  Branches     37600     37665       +65     
=============================================
+ Hits        118456    119057      +601     
+ Misses       36305     36085      -220     
+ Partials      5482      5406       -76
Modified sparse array and added __eq__
- Removed overriden function _eval_derivative_n_times
- Overrode _eval_derivative_array
- Overrode __eq__ to accelerate the equality evaluation
- Overrode __hash__ to make sparse array hashable
Show resolved Hide resolved sympy/tensor/array/sparse_ndim_array.py Outdated
if isinstance(other, SparseNDimArray):
from sympy import Dict
dict_self = Dict(self._sparse_array)
dict_other = Dict(other._sparse_array)

This comment has been minimized.

Copy link
@kangzhiq

kangzhiq Jun 4, 2019

Author Contributor

The dictionaries are cast to Dict because SymPy 's Dict and Python's built-in dict are not fully compatible. By saying that I mean that

>>> from sympy import Dict
>>> a = Dict({3:5, 2:7})
>>> b = {3:5, 2:7}
>>> a == b
False

They are not equal although they have exactly the same value.

This comment has been minimized.

Copy link
@Upabjojr

Upabjojr Jun 5, 2019

Contributor

No casting please. There are better ways to compare.

Show resolved Hide resolved sympy/tensor/array/sparse_ndim_array.py Outdated
Show resolved Hide resolved sympy/tensor/array/sparse_ndim_array.py Outdated
if isinstance(other, SparseNDimArray):
from sympy import Dict
dict_self = Dict(self._sparse_array)
dict_other = Dict(other._sparse_array)

This comment has been minimized.

Copy link
@Upabjojr

Upabjojr Jun 5, 2019

Contributor

No casting please. There are better ways to compare.

from sympy import Dict

if shape is None:
if iterable is None:

This comment has been minimized.

Copy link
@Upabjojr

Upabjojr Jun 5, 2019

Contributor

is this change necessary?

This comment has been minimized.

Copy link
@kangzhiq

kangzhiq Jun 5, 2019

Author Contributor

This change is to refactor the code so that shape is None will not be repeated several times

This comment has been minimized.

Copy link
@Upabjojr

Upabjojr Jun 6, 2019

Contributor

OK, open a new PR next time. A separate PR on this issue would already be mergeable.

@Upabjojr

This comment has been minimized.

Copy link
Contributor

commented Jun 5, 2019

Tests are needed.

@Upabjojr

This comment has been minimized.

Copy link
Contributor

commented Jun 5, 2019

Generally, you should not create new methods for equality checks. There is already one I believe, you should probably work on that one.

kangzhiq added some commits Jun 5, 2019

Improved derive_by_array
- Added support for derivative by array for sparse array
- Modified _eval_derivative_array in sparse array
Removed unnecessary overrides and add tests
- Removed override method __eq__, __hash__, eval_derivative_array in
Sprase array
- Modified __eq__ in NDimArray to add case for sparse array
- Modified derive_by_array to add case for sparse array
- Added tests of __eq__ for large scale sparse array
- Added tests for derive_by_array for large scale sparse array
sparse_array = {}
loop_size = len(expr)
for i, x in enumerate(dx):
sparse_diff = derive_by_array(expr, x)._sparse_array

This comment has been minimized.

Copy link
@Upabjojr

Upabjojr Jun 7, 2019

Contributor

why is this calling derive_by_array?

This comment has been minimized.

Copy link
@kangzhiq

kangzhiq Jun 7, 2019

Author Contributor

This is because .diff() was slow when it comes to large scale sparse array, so I chose not to use it. Besides, derive_by_array , as said in the description of the function, supports the derivate by scalars, so I used it. If .diff() or applyfunc is fixed, then we can simply call expr.diff(x).

Show resolved Hide resolved sympy/tensor/array/arrayop.py Outdated
Show resolved Hide resolved sympy/tensor/array/arrayop.py Outdated
if isinstance(dx, array_types):
new_array = [[y.diff(x) for y in expr] for x in dx]
if isinstance(expr, SparseNDimArray):

This comment has been minimized.

Copy link
@Upabjojr

Upabjojr Jun 7, 2019

Contributor

if both expr and dx are sparse, just use a dict comprehension (in a single line).

This comment has been minimized.

Copy link
@kangzhiq

kangzhiq Jun 7, 2019

Author Contributor

The line 213 uses dict comprehension when expr is sparse but dx is not.
I think for the case where dx is an array(sparse or dense), it is necessary to calculate the new index for the derivative by each term in dx , which is the purpose of _sparse_array_derive_by_array.

This comment has been minimized.

Copy link
@Upabjojr

Upabjojr Jun 8, 2019

Contributor

Yes, so why not just an array comprehension?

This comment has been minimized.

Copy link
@Upabjojr

Upabjojr Jun 8, 2019

Contributor

{k1+k2: v2.diff(v1) for k1, v1 in dx... dict ...items() for k2, v2 in expr... dict ...items()}

This comment has been minimized.

Copy link
@kangzhiq

kangzhiq Jun 8, 2019

Author Contributor

OK, I got your point! Thank you!

Show resolved Hide resolved sympy/tensor/array/ndim_array.py Outdated
all(v == other._sparse_array[k] for (k, v) in self._sparse_array.items()
if k in other._sparse_array)

return list(self) == list(other)

This comment has been minimized.

Copy link
@Upabjojr

Upabjojr Jun 7, 2019

Contributor

why not just use something similar to this? dict(...) == dict(... other ...) ?

This comment has been minimized.

Copy link
@kangzhiq

kangzhiq Jun 7, 2019

Author Contributor

I thought I am not supposed to cast one type to the other so I came up with this code. But if we can cast Dict to dict, that would indeed be much simpler. :-)
By the way, the problem of comparing Dict and dict will dispear if the test about mutableArray.diff() == MutableArray is changed to mutableArray.diff() == ImmutableArray().

This comment has been minimized.

Copy link
@Upabjojr

Upabjojr Jun 8, 2019

Contributor

Just cast to dict.

Show resolved Hide resolved sympy/tensor/array/sparse_ndim_array.py Outdated
Show resolved Hide resolved sympy/tensor/array/tests/test_arrayop.py Outdated
Temporarily resolved derive_by_array
- Temorarily fixed .diff()
- Fixed some mistakes
- Ameliorated tests
if isinstance(dx, array_types):
new_array = [[y.diff(x) for y in expr] for x in dx]
if isinstance(expr, SparseNDimArray):

This comment has been minimized.

Copy link
@Upabjojr

Upabjojr Jun 8, 2019

Contributor

Yes, so why not just an array comprehension?

if isinstance(dx, array_types):
new_array = [[y.diff(x) for y in expr] for x in dx]
if isinstance(expr, SparseNDimArray):

This comment has been minimized.

Copy link
@Upabjojr

Upabjojr Jun 8, 2019

Contributor

{k1+k2: v2.diff(v1) for k1, v1 in dx... dict ...items() for k2, v2 in expr... dict ...items()}

if isinstance(arg, (Iterable, Tuple, MatrixCommon, NDimArray)):
return derive_by_array(self, arg)
else:
if isinstance(self, SparseNDimArray):

This comment has been minimized.

Copy link
@Upabjojr

Upabjojr Jun 8, 2019

Contributor

This method was not supposed to be modified, applyfunc should be able to handle sparse arrays efficiently.

Ameliorated functions and removed unnecessary code
- Ameliorated derive_by_array using dict comprehension
- Removed _sparse_array_derive_by_array
- Ameliorated applyfunc for the case of sparse array
- Ameliorated __eq__ in NDimArray by casting dictionaries to dict
lp = len(expr)
new_array = {k + i*lp: expr.diff(x)._sparse_array[k]
for k in expr._sparse_array
for i, x in enumerate(dx)

This comment has been minimized.

Copy link
@Upabjojr

Upabjojr Jun 10, 2019

Contributor

why not iterate over the sparse array? You are flattening all members and then filtering with an if on the next line.

This comment has been minimized.

Copy link
@kangzhiq

kangzhiq Jun 10, 2019

Author Contributor

This is because dx is an arraytype, not exactly a sparse array. Besides, I was allowing flatterning of dx because dx is supposed to be a sequence of non-zero values since:

>>> S.Zero._diff_wrt
False

So I think dx is not supposed to be a sparse array. Am I misunderstanding something here? Please let me know if so, thank you!!

This comment has been minimized.

Copy link
@kangzhiq

kangzhiq Jun 10, 2019

Author Contributor

I think I got your point! Working on it.

Show resolved Hide resolved sympy/tensor/array/arrayop.py Outdated
@@ -111,7 +111,6 @@ def reshape(self, *newshape):

return type(self)(*(newshape + (self._array,)))


This comment has been minimized.

Copy link
@Upabjojr

Upabjojr Jun 10, 2019

Contributor

two blank lines in zeroth indentation

Ameliorated codes and added tests
- Ameliorated codes in derive_by_array
- Added test for f(0) != 0 for applyfunc

@Upabjojr Upabjojr merged commit 4cf38ad into sympy:master Jun 10, 2019

3 checks passed

codecov/project 74.156% (target 0%)
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
sympy-bot/release-notes The release notes look OK
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.