
Indexed matrix expressions can now be parsed into TensorProduct #13622

Open · wants to merge 2 commits into base: master

Conversation

@Upabjojr (Contributor) commented Nov 18, 2017

This PR addresses the issue of matrix derivatives returning higher-rank tensors. A matrix is a rank-2 tensor (i.e. an object with 2 indices), but the derivative of a matrix expression may be a higher-rank tensor; for example, X.diff(X) is the rank-4 identity tensor (the tensor whose ones all lie on the principal diagonal).

This PR tries to introduce results based on TensorProduct:

  • X.diff(X) == TensorProduct(I, I), where I is the identity matrix (TODO: this result is not yet achieved).
  • more examples to be added from the Matrix Cookbook.

Reasoning:

X.diff(X) can be rewritten in index form as X[i, j].diff(X[m, n]), which returns KroneckerDelta(m, i)*KroneckerDelta(n, j). This last expression can be reinterpreted as the tensor product of two identity matrices, TensorProduct(I, I), by assigning indices (m, i) to the first factor and (n, j) to the second.
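The index-form reasoning above can be checked numerically. The following NumPy sketch (not part of the original PR) builds the rank-4 identity tensor from two Kronecker deltas and verifies it against the definition of the derivative:

```python
import numpy as np

n = 3
I = np.eye(n)

# Rank-4 identity tensor: T[i, j, m, k] = d X[i, j] / d X[m, k]
#                                       = delta(i, m) * delta(j, k)
# i.e. the "tensor product" of two identity matrices with index
# pairs (i, m) and (j, k).
T = np.einsum('im,jk->ijmk', I, I)

# Check every entry against the definition of the derivative.
for i in range(n):
    for j in range(n):
        for m in range(n):
            for k in range(n):
                assert T[i, j, m, k] == float(i == m) * float(j == k)
print("rank-4 identity tensor verified")
```

The `einsum` subscript string makes the index assignment explicit: each factor of `I` carries one index pair, exactly as in the KroneckerDelta product.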

@Upabjojr (Author)

Hi @jksuom, I would like to ask whether you think this PR makes sense, that is, allowing the use of TensorProduct when the matrix derivative returns a higher-rank tensor.

@jksuom (Member) commented Nov 20, 2017

TensorProduct(I, I) looks nice, but it should probably be augmented with some data on the indices, which might make it less practical. X.T.diff(X) can also be rewritten as a product of two Kronecker deltas, but with "crossed" indices. On its own, it could perhaps also be written TensorProduct(I, I), but that is not possible when both expressions appear together, as in Y.diff(X) where Y = X - X.T.
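The "crossed indices" point can be illustrated numerically. A NumPy sketch (not part of the original discussion): the derivative tensors of X and X.T differ only by an axis permutation, so both index patterns get mixed in the derivative of Y = X - X.T and a single TensorProduct(I, I) cannot represent the result:

```python
import numpy as np

n = 3
I = np.eye(n)

# d X[i, j]   / d X[m, k] = delta(i, m) * delta(j, k)   (straight indices)
# d X.T[i, j] / d X[m, k] = delta(j, m) * delta(i, k)   (crossed indices)
D_X  = np.einsum('im,jk->ijmk', I, I)
D_XT = np.einsum('jm,ik->ijmk', I, I)

# The two tensors differ only by a transposition of the first two axes,
# which a bare TensorProduct(I, I) cannot record.
assert np.array_equal(D_XT, D_X.swapaxes(0, 1))

# For Y = X - X.T, both index patterns appear in one derivative.
D_Y = D_X - D_XT
```

Both `D_X` and `D_XT` are built from two identity matrices, so any notation for them would need extra index bookkeeping to tell them apart, which is exactly the objection raised here.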

@Upabjojr (Author)

I've come to the conclusion that this cannot be done with TensorProduct, unless we use some trick such as introducing an unevaluated form for dimensional permutations.

If X is a 3×3 matrix, X.diff(X) should return:

⎡⎡1  0  0⎤  ⎡0  1  0⎤  ⎡0  0  1⎤⎤
⎢⎢       ⎥  ⎢       ⎥  ⎢       ⎥⎥
⎢⎢0  0  0⎥  ⎢0  0  0⎥  ⎢0  0  0⎥⎥
⎢⎢       ⎥  ⎢       ⎥  ⎢       ⎥⎥
⎢⎣0  0  0⎦  ⎣0  0  0⎦  ⎣0  0  0⎦⎥
⎢                               ⎥
⎢⎡0  0  0⎤  ⎡0  0  0⎤  ⎡0  0  0⎤⎥
⎢⎢       ⎥  ⎢       ⎥  ⎢       ⎥⎥
⎢⎢1  0  0⎥  ⎢0  1  0⎥  ⎢0  0  1⎥⎥
⎢⎢       ⎥  ⎢       ⎥  ⎢       ⎥⎥
⎢⎣0  0  0⎦  ⎣0  0  0⎦  ⎣0  0  0⎦⎥
⎢                               ⎥
⎢⎡0  0  0⎤  ⎡0  0  0⎤  ⎡0  0  0⎤⎥
⎢⎢       ⎥  ⎢       ⎥  ⎢       ⎥⎥
⎢⎢0  0  0⎥  ⎢0  0  0⎥  ⎢0  0  0⎥⎥
⎢⎢       ⎥  ⎢       ⎥  ⎢       ⎥⎥
⎣⎣1  0  0⎦  ⎣0  1  0⎦  ⎣0  0  1⎦⎦

This is clearly not the same as I⊗I. I don't think there even are algebraic operations on matrices that construct this object.
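This claim can be checked numerically. A NumPy sketch (not part of the PR) builds the block layout printed above and compares it with the plain Kronecker product I ⊗ I:

```python
import numpy as np

n = 3
I = np.eye(n)

# The block matrix above, as a rank-4 array: block (i, j) holds the
# matrix with a single 1 at entry (k, l) = (i, j).
D = np.einsum('ik,jl->ijkl', I, I)

# Flatten into the displayed 9x9 block layout:
# rows indexed by (block row i, inner row k),
# columns indexed by (block column j, inner column l).
block_form = D.transpose(0, 2, 1, 3).reshape(n * n, n * n)

# The Kronecker product I (x) I, by contrast, is just the 9x9 identity.
K = np.kron(I, I)

# Same entries overall, but arranged differently: not equal as matrices.
assert not np.array_equal(block_form, K)
```

Both arrays contain nine ones, but `block_form` scatters them off the diagonal (e.g. at row 0, column 4), while `np.kron(I, I)` is the identity, which makes concrete why the printed object is not I⊗I in any standard matrix layout.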

@Upabjojr (Author)

Maybe a generalization of FunctionMatrix, i.e. a FunctionArray, could help find a closed formula for this result.
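A closed formula does exist in index form. A NumPy sketch of the idea (hypothetical: `function_array` is not a SymPy API, just an analogue of how FunctionMatrix defines a matrix by an index function):

```python
import numpy as np

def function_array(f, shape):
    """Hypothetical FunctionArray analogue: build a dense array whose
    entry at each index tuple is given by the index function f."""
    return np.fromfunction(f, shape)

# Closed formula for X.diff(X): f(i, j, k, l) = delta(i, k) * delta(j, l).
D = function_array(lambda i, j, k, l: (i == k) * (j == l) * 1.0,
                   (3, 3, 3, 3))

assert D[1, 2, 1, 2] == 1.0 and D[1, 2, 2, 1] == 0.0
```

The point is that the object has a simple entrywise formula even though it has no construction in terms of matrix products, which is what a FunctionArray-style class would capture symbolically.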

@czgdp1807 (Member)
@Upabjojr @jksuom Any status updates on this PR?

@Upabjojr (Author)

> @Upabjojr @jksuom Any status updates on this PR?

There are different opinions on whether TensorProduct(I, I) is equal to X.diff(X), so I'd suggest leaving this issue open for now.

4 participants