Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Discussion]NumPy-like behavior in Array module #16911

Open
kangzhiq opened this issue May 27, 2019 · 7 comments

Comments

Projects
None yet
4 participants
@kangzhiq
Copy link
Contributor

commented May 27, 2019

The idea comes originally from this issue #15464.
I have noticed that the initialization of an Array, in some case, accepts only a sequence of int, rather than a tuple like we do in NumPy.

>>> from sympy.tensor.array import MutableDenseNDimArray
>>> MutableDenseNDimArray.zeros(2, 3)
[[0, 0, 0], [0, 0, 0]]
>>> MutableDenseNDimArray.zeros((2, 3))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\CodeHouse\Python\sympy\sympy\tensor\array\dense_ndim_array.py", line 62, in zeros
    return cls._new(([0]*list_length,), shape)
TypeError: can't multiply sequence by non-int of type 'tuple'

@Upabjojr has also noticed a difference of behavior between SymPy and NumPy. I would give the floor to him so that the problem can be illustrated clearly. (Thank you!)

In my opinion, it would be good if we follow the example of NumPy in syntax and behaviors of Array. Besides, as far as I know, Array module is recently created, so it could be eligible to some changes.

Based on this topic, what do you think about it?

@czgdp1807

This comment has been minimized.

Copy link
Member

commented May 28, 2019

I think, the NumPy syntax should be supported. However, it would be good if you can present some advantages and disadvantages of using the NumPy syntax.
I am saying so because, tuple, (2, 3) can safely be converted to sequence 2, 3 by * operator, so, there shouldn't be any problem in adapting it. We can even allow both types of inputs. Let's see what @Upabjojr has to say over it. May be, there is some difference of principles used in SymPy and NumPy which don't allow us to use tuples. I want to know them.

@kangzhiq

This comment has been minimized.

Copy link
Contributor Author

commented May 28, 2019

May be, there is some difference of principles used in SymPy and NumPy which don't allow us to use tuples.

Yes, I agree with you. I would also like to know them. :-)
Regarding the tuple thing, if you look at the codes of zeros():

def zeros(cls, *shape):
        list_length = functools.reduce(lambda x, y: x*y, shape, S.One)
        return cls._new(([0]*list_length,), shape)

In fact, the shape is passed as a tuple to a constructor. So I don't see other constraints to use tuple directly.
I think that both tuple and sequence of numbers can work nicely, and there is not(as far as I know) difference between the performance. So I might not be able to point out which one is better.
My point is that we are having both syntax in our API, which can be ambigious. For example, while initialting an array with a list:

>>> a = MutableDenseNDimArray([0, 1, 2, 3], (2, 2))
>>> a
[[0, 1], [2, 3]]

here we use a tuple to represent the shape. It doen't have to be NumPy-like, it is just a standardization of syntax. Of course, we should only apply it if there are no explicit obstacles.

@Upabjojr

This comment has been minimized.

Copy link
Contributor

commented May 28, 2019

We can even allow both types of inputs.

Yes, that would be optimal.

Many users are used to NumPy's syntax, so the difference may be an obstacle to the adoption of SymPy.

@kangzhiq

This comment has been minimized.

Copy link
Contributor Author

commented May 28, 2019

Ok I would try to make both parameters acceptable!

@kangzhiq

This comment has been minimized.

Copy link
Contributor Author

commented May 28, 2019

@Upabjojr Regarding the index syntax, do you have some comments?
I reproduced the problem:

>>> from sympy.tensor.array import ImmutableDenseNDimArray
>>> a = ImmutableDenseNDimArray([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], (2, 2, 3))
>>> a[1]
1
>>> import numpy as np
>>> b = np.array([[[0, 1, 2], [3, 4, 5]], [[6, 7, 8], [9, 10, 11]]])
>>> b[1]
array([[ 6,  7,  8],
       [ 9, 10, 11]])

Personnally, I have found several differences between those module:

>>> b[1, 1]
array([ 9, 10, 11])
>>> a[1, 1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\CodeHouse\Python\sympy\sympy\tensor\array\dense_ndim_array.py", line 56, in __getitem__
    index = self._parse_index(index)
  File "D:\CodeHouse\Python\sympy\sympy\tensor\array\ndim_array.py", line 78, in _parse_index
    raise ValueError('Wrong number of array axes')
ValueError: Wrong number of array axes
>>> b[b>3]
array([ 4,  5,  6,  7,  8,  9, 10, 11])
>>> a[a>3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '>' not supported between instances of 'ImmutableDenseNDimArray' and 'int'
>>> a[1:9]
[1, 2, 3, 4, 5, 6, 7, 8]
>>> b[1:9]
array([[[ 6,  7,  8],
        [ 9, 10, 11]]])

I will test it more properly to list out the differences.

@czgdp1807

This comment has been minimized.

Copy link
Member

commented May 29, 2019

>>> from sympy.tensor.array import ImmutableDenseNDimArray
>>> a = ImmutableDenseNDimArray([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], (2, 2, 3))
>>> a[1]
1
>>> import numpy as np
>>> b = np.array([[[0, 1, 2], [3, 4, 5]], [[6, 7, 8], [9, 10, 11]]])
>>> b[1]
array([[ 6,  7,  8],
       [ 9, 10, 11]])

I think that ImmutableDenseNDimArray should be in line with numpy. At least, the output should be same for same queries.

@Upabjojr

This comment has been minimized.

Copy link
Contributor

commented May 31, 2019

I agree we should take NumPy as a reference. As the array module is still pretty young, we are still on time to change behaviour.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.