Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

numerix.dot doesn't support tensors #123

Closed
guyer opened this issue Sep 19, 2014 · 10 comments
Closed

numerix.dot doesn't support tensors #123

guyer opened this issue Sep 19, 2014 · 10 comments

Comments

@guyer
Copy link
Member

@guyer guyer commented Sep 19, 2014

>>> A = array([[1e-3,0],[0,1]])
>>> R = array([[cos(pi/4), sin(pi/4)], [sin(-pi/4),cos(pi/4)]])
>>> dot(A,R)
array([ 0.00070711,  0.70710678])
>>> NUMERIX.dot(A,R)
array([[  7.07106781e-04,   7.07106781e-04],
[ -7.07106781e-01,   7.07106781e-01]])

The latter result is what's desired.

The problem is in trunk/fipy/tools/numerix.py@2274#L973, but we can't just use NUMERIX.dot() because it doesn't know about our semantics of the last axis being the Cell or Face index.

Imported from trac ticket #138, created by guyer on 10-16-2007 at 10:53, last modified: 09-30-2013 at 21:44

@guyer
Copy link
Member Author

@guyer guyer commented Sep 19, 2014

NumPy's semantics for dot() are, given [[formula(a[I,:])]] and [[formula(b[J,:,m])]],

#formula
\text{dot}(a,b) = r[I,J,m] = \sum_k a[I,k]  b[J,k,m]

Presently, FiPy is doing, given [[formula(a[:,I])]] and [[formula(b[:,J])]],

#formula
\text{dot}(a,b) = r[K] = \sum_k a[k,I]  b[k,J]

where [[formula(K)]] is the broadcast shape of [[formula(I)]] and [[formula(J)]]. This only does the right thing for vectors and fields of vectors (I guess scalars are OK, too, but definitely not tensors).

What FiPy needs is support for, given [[formula(a[I,:,n])]] and [[formula(b[J,:,m,n])]],

#formula
\text{dot}(a,b) = r[I,J,m,n] = \sum_k a[I,k,n]  b[J,k,m,n]

while still allowing the NumPy syntax.

One possibility is that, if neither a nor b has a dot() method, to just fall back on NumPy's dot(). a and b simply aren't fields of scalars/vectors/tensors unless we declare them as a field object. A NumPy array is not a field, even if shaped correctly; it's an array. That leaves us with figuring out how to make field objects support dot() correctly, irrespective of rank.

Trac comment by guyer on 10-16-2007 at 13:35

@guyer
Copy link
Member Author

@guyer guyer commented Sep 19, 2014

We're not the first ones to want this. See the [http://thread.gmane.org/gmane.comp.python.numeric.general/5470 "Dot product threading?" thread] on mailto:numpy-discussion@scipy.org

Trac comment by guyer on 10-17-2007 at 09:11

@fipymigrate
Copy link

@fipymigrate fipymigrate commented Sep 19, 2014

Neither of NumPy's dot() or tensordot() does the "right" thing with scalars:

>>> dot(array([2]), array([3,4]))
14
>>> tensordot(array([2]), array([3,4]))
Traceback (most recent call last):
...
IndexError: tuple index out of range
>>> tensordot(array([2]), array([3,4]), axes=(-1,0))
Traceback (most recent call last):
...
ValueError: shape-mismatch for sum

when I would expect array([6, 8]).

That being said, this is what I think that our dot() operator should do for each element of a _MeshVariable:

for a#### scalar

>>> dot(array([2]), array([3]))
array([6])
>>> dot(array([2]), array([3, 4]))
array([6, 8])
>>> dot(array([2]), array([[3, 4],
...                        [5, 6]]))
array([[6,   8]
[10, 12]])
>>> dot(array([2]), array([[[3, 4],
...                         [5, 6]],
...                        [[5, 6],
...                         [7, 8]]]))
array([[[6,   8],
[10, 12]],

[[10, 12],
[14, 16]]])

for a#### vector

>>> dot(array([2, 3]), array([2]))
array([4, 6])
>>> dot(array([2, 3]), array([3, 4]))
array([18])
>>> dot(array([2, 3]), array([[3, 4],
...                           [5, 6]]))
array([21, 26])
>>> dot(array([2, 3]), array([[[3, 4],
...                            [5, 6]],
...                           [[5, 6],
...                            [7, 8]]]))
array([[21, 26],
[31, 36]])

for a rank-2#### tensor

>>> dot(array([[2, 3],
...            [4, 5]]), array([2]))
array([[4,  6]
[8, 10]])
>>> dot(array([[2, 3],
...            [4, 5]]), array([2, 3]))
array([13, 23])
>>> dot(array([[2, 3],
...            [4, 5]]), array([[3, 4],
...                             [5, 6]]))
array([[21, 26],
[37, 46]])
>>> dot(array([[2, 3],
...            [4, 5]]), array([[[3, 4],
...                              [5, 6]],
...                             [[5, 6],
...                              [7, 8]]]))
array([[[21, 26],
[37, 46]],

[[31, 36],
[55, 64]]])

for a rank-3#### tensor

>>> dot(array([[[3, 4],
...             [5, 6]],
...            [[5, 6],
...             [7, 8]]]), array([2]))
array([[[6,   8],
[10, 12]],

[[10, 12],
[14, 16]]])
>>> dot(array([[[3, 4],
...             [5, 6]],
...            [[5, 6],
...             [7, 8]]]), array([2, 3]))
array([[18, 28],
[28, 38]])
>>> dot(array([[[3, 4],
...             [5, 6]],
...            [[5, 6],
...             [7, 8]]]), array([[2, 3],
...                               [4, 5]]))
array([[[22, 29],
[34, 45]],

[[34, 45],
[46, 61]]])
>>> dot(array([[[3, 4],
...             [5, 6]],
...            [[5, 6],
...             [7, 8]]]), array([[[2, 3],
...                                [4, 5]],
...                               [[4, 5],
...                                [6, 7]]]))
array([[[[22, 29],
[36, 43]],

[[34, 45],
[56, 67]]],


[[[34, 45],
[56, 67]],

[[46, 61],
[76, 91]]]])

note: other than for scalars, these results can be obtained from NumPy's tensordot(..., axes=(-1,0))

Trac comment by guyer@nist.gov on 10-18-2007 at 10:37

@guyer
Copy link
Member Author

@guyer guyer commented Sep 19, 2014

Replying to guyer:

What FiPy needs is support for, given [[formula(a[I,:,n])]] and [[formula(b[J,:,m,n])]],

#formula
\text{dot}(a,b) = r[I,J,m,n] = \sum_k a[I,k,n]  b[J,k,m,n]

while still allowing the NumPy syntax.

That's not right. That would just replicate the bizarrity of NumPy's dot(), but adding on our mesh element dimension. Instead, we want NumPy's tensordot(..., axis=(-1, 0)) with our additional dimension:

#formula
\text{dot}(a,b) = r[I,J,n] = \sum_K a[I,K,n]  b[K,J,n]

for [[formula(a[I,:,n])]] and [[formula(b[:,J,n])]].

Trac comment by guyer on 10-18-2007 at 20:18

@guyer
Copy link
Member Author

@guyer guyer commented Sep 19, 2014

Replying to guyer@nist.gov:

_sigh''. I was using FiPy's dot(), not NumPy's dot(), so

_note:''' other than for scalars, these results can be obtained from NumPy's tensordot(..., axes=(-1,0))
should be:
'note: other than for scalars, these results can be obtained from NumPy's dot(...) or from tensordot(..., axes=(-1,0))
and the actual result for scalars is not

>> dot(array([2]), array([3,4]))
14

but

>> dot(array([2]), array([3,4]))
Traceback (most recent call last):
  ...
ValueError: objects are not aligned

Also, for rank-2 tensors, it's not

>> dot(array([[2, 3],
...            [4, 5]]), array([[[3, 4],
...                              [5, 6]],
...                             [[5, 6],
...                              [7, 8]]]))
array([[[21, 26],
[37, 46]],

[[31, 36],
[55, 64]]])

but
```python
array([[[21, 26],
[31, 36]],

[[37, 46],
[55, 64]]])

Trac comment by guyer on 10-19-2007 at 08:47

@guyer
Copy link
Member Author

@guyer guyer commented Sep 19, 2014

Replying to guyer@nist.gov:

All cases involving rank-0 fields are just multiplication. For higher rank cases, we first define some field arrays:

>>> N = 3

>>> v1 = empty((2,N), int)
>>> v1[:] = array([2,3])[..., newaxis]
>>> v2 = empty((2,N), int)
>>> v2[:] = array([3,4])[..., newaxis]

>>> t21 = empty((2, 2, N), int)
>>> t21[:] = array([[2, 3],
...                 [4, 5]])[..., newaxis]
>>> t22 = empty((2, 2, N), int)
>>> t22[:] = array([[3, 4],
...                 [5, 6]])[..., newaxis]

>>> t31 = empty((2, 2, 2, N), int)
>>> t31[:] = array([[[3, 4],
...                  [5, 6]],
...                 [[5, 6],
...                  [7, 8]]])[..., newaxis]
>>> t32 = empty((2, 2, 2, N), int)
>>> t32[:] = array([[[2, 3],
...                  [4, 5]],
...                 [[4, 5],
...                  [6, 7]]])[..., newaxis]

then define a routine to print a slice and shape for easy comparison with the non-field results:

>>> def P(a):
...     print repr(a[...,0]), a.shape

and then can see what it takes to get the "correct" results:

for a#### vector

>>> P(sum(v1 * v2, axis=0))
array(18) (3,)
>>> P(sum(v1[...,newaxis,:] * t22, axis=0))
array([21, 26]) (2, 3)
>>> P(sum(v1[...,newaxis,newaxis,:] * t31, axis=0))
array([[21, 26],
[31, 36]]) (2, 2, 3)

for a rank-2#### tensor

>>> P(sum(t21 * v1, axis=1))
array([13, 23]) (2, 3)
>>> P(sum(t21[...,newaxis,:] * t22, axis=1))
array([[21, 26],
[37, 46]]) (2, 2, 3)
>>> P(sum(t21[...,newaxis,newaxis,:] * t31, axis=1))
array([[[21, 26],
[31, 36]],

[[37, 46],
[55, 64]]]) (2, 2, 2, 3)

for a rank-3#### tensor

>>> P(sum(t31 * v1, axis=2))
array([[18, 28],
[28, 38]]) (2, 2, 3)
>>> P(sum(t31[...,newaxis,:] * t21, axis=2))
array([[[22, 29],
[34, 45]],

[[34, 45],
[46, 61]]]) (2, 2, 2, 3)
>>> P(sum(t31[...,newaxis,newaxis,:] * t32, axis=2))
array([[[[22, 29],
[36, 43]],

[[34, 45],
[56, 67]]],


[[[34, 45],
[56, 67]],

[[46, 61],
[76, 91]]]]) (2, 2, 2, 2, 3)

I think the pattern for a field-savvy dot(A, B) is then

>>> def fielddot(A, B):
...     index = index_exp[...] + (newaxis,) * (len(B.shape) - 2) + index_exp[:]
...     return sum(A[index] * B, axis=len(A.shape) - 2)

although this still needs special-casing for scalars.

Trac comment by guyer on 10-19-2007 at 11:46

@guyer
Copy link
Member Author

@guyer guyer commented Sep 19, 2014

Replying to guyer:

I think the pattern for a field-savvy dot(A, B) is then

>> def fielddot(A, B):
...     index = index_exp[...] + (newaxis,) * (len(B.shape) - 2) + index_exp[:]
...     return sum(A[index] * B, axis=len(A.shape) - 2)

although this still needs special-casing for scalars.

Accounting for scalars, I think that should be

>>> def fielddot(A, B):
...     rankA = len(A.shape) - 1
...     rankB = len(B.shape) - 1
...     index = index_exp[...] + (newaxis,) * (rankB - 1) + index_exp[:]
...     if rankA <= 0 or rankB <= 0:
...         return A[index] * B
...     else:
...         return sum(A[index] * B, axis=rankA - 1)

Trac comment by guyer on 10-19-2007 at 13:40

@guyer
Copy link
Member Author

@guyer guyer commented Sep 19, 2014

Fixed in r2347

Trac comment by guyer on 10-19-2007 at 22:15

@fipymigrate
Copy link

@fipymigrate fipymigrate commented Sep 19, 2014

In 52581d5:

#CommitTicketReference repository="fipy" revision="52581d5dd4dce123f1f5e8d240bf440488ec2021"
`numerix.dot()` didn't do the right thing for tensors (see issue #123). This 
patch allows a `_MeshVariable` of any rank to `dot` with any other 
`_MeshVariable` of any rank (including scalars).


git-svn-id: svn+ssh://code.matforge.org/fipy/trunk@2347 d80e17d7-ff13-0410-a124-85740d801063

Trac comment by Jonathan E. Guyer guyer@nist.gov on 06-12-2013 at 14:14

@guyer
Copy link
Member Author

@guyer guyer commented Sep 19, 2014

Marking milestone

Trac comment by guyer on 09-30-2013 at 21:44

@guyer guyer closed this Sep 19, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants