This repository was archived by the owner on Nov 17, 2025. It is now read-only.

Conversation

@abdalazizrashid
Contributor

@abdalazizrashid abdalazizrashid commented Oct 19, 2020

Issue #115 pointed out that the current implementation of matrix power is naive; this PR implements exponentiation by squaring, which is significantly faster.
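For readers unfamiliar with the technique: exponentiation by squaring reduces the number of matrix multiplications from O(n) to O(log n) by walking the bits of the exponent. A minimal pure-Python sketch of the idea (illustrative only, not this PR's actual code; mat_mul and mat_pow are hypothetical names):

```python
def mat_mul(a, b):
    # Naive square-matrix product on lists of lists.
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(a, n):
    # Exponentiation by squaring: O(log n) multiplications
    # instead of the naive O(n) repeated products.
    size = len(a)
    result = [[int(i == j) for j in range(size)] for i in range(size)]  # identity
    base = a
    while n > 0:
        if n & 1:                        # current bit set: fold the base in
            result = mat_mul(result, base)
        base = mat_mul(base, base)       # square the base each step
        n >>= 1
    return result
```

For example, mat_pow([[1, 1], [1, 0]], 10) computes the tenth power of the Fibonacci matrix with only a handful of multiplications.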

@abdalazizrashid abdalazizrashid changed the title Implementing exponentiation by squaring for matrix power #115 Implementing exponentiation by squaring for matrix power Oct 19, 2020
@ferrine
Contributor

ferrine commented Oct 19, 2020

I think a better way is to implement this via http://deeplearning.net/software/theano/extending/extending_theano.html

@brandonwillard brandonwillard linked an issue Oct 19, 2020 that may be closed by this pull request
@brandonwillard brandonwillard added the enhancement New feature or request label Oct 19, 2020
Member

@brandonwillard brandonwillard left a comment


@abdalazizrashid, this looks significantly better than the current implementation!

@ferrine is correct: for instance, the current implementation won't allow us to use symbolic values for the n argument (e.g. matrix_power(M, tt.iscalar())). There are two basic ways around this:

  1. as @ferrine suggests, we create a MatrixPower Op, or
  2. we reproduce the steps in your current implementation using Theano (e.g. use IfElse or tt.switch Ops for the conditional statements and scan for the while-loop).
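To make Approach 2. concrete, the while-loop can be restated as an iteration over the bits of n, with the conditional replaced by a select between the updated and unchanged accumulator; that is roughly the shape a scan-based graph would take, with scan's symbolic step count playing the role of the loop bound and tt.switch playing the role of the if. A hedged pure-Python sketch of that restructuring (names are illustrative, not actual Theano code):

```python
def mat_mul(a, b):
    # Naive square-matrix product on lists of lists.
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow_scan_style(a, n):
    # One "scan step" per bit of n. Each step computes both branches
    # and selects between them, mirroring a tt.switch, instead of
    # using Python control flow inside the loop body.
    size = len(a)
    result = [[int(i == j) for j in range(size)] for i in range(size)]  # identity
    base = a
    for step in range(max(n.bit_length(), 1)):
        bit = (n >> step) & 1
        updated = mat_mul(result, base)
        result = updated if bit else result  # switch-style select
        base = mat_mul(base, base)           # unconditional squaring
    return result
```

The point of the restructuring is that every step has the same fixed dataflow, which is what a scan body requires; only the selected value changes from step to step.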

Approach 1. can be mostly copy-and-paste work until we get to the Op.grad and Op.c_code implementations. The former seems easy to implement in this case, but the latter can be annoying. We don't have to implement Op.c_code, though. (The same goes for Op.grad, but that one really should be implemented.)

Approach 2. will basically give us everything that a fully implemented Op would (e.g. the gradient and C code will be available, as long as the Theano Ops we use have them implemented), but there could be a performance hit due to the use of scan.

Now that I think about it a little more, I think Approach 2. is the better option. If we take this approach and then invest our efforts into debugging and fixing any performance problems that it may have, those efforts will pay off much more, since they'll apply to anything else that uses the same underlying Ops. In other words, if we improve scan just a little bit, we may end up improving Theano a lot.

In the meantime, once the tests succeed, we can go forward with this implementation and create another issue for further improvements; it's already a big improvement over the current approach, and it doesn't introduce any limitations that weren't already there. If you want to try one of the other approaches, feel free to rebase this branch with those changes.

Also, you can automatically avoid most of these code style errors by installing pre-commit and running pre-commit install in this project's root directory. That will make sure that the checks are run at every commit.

Member

@brandonwillard brandonwillard left a comment


The test needs a few more cases.

Note: you don't need to compile and compute the values; theano.dot is a well-tested Op, and compiling graphs and C code via theano.function is overkill. This is a pervasive problem in this old test suite and has led to unnecessarily long test durations.

Since we need to confirm that the underlying algorithm is correct, we can't simply check that the resulting symbolic graph matches our expectations, but we can at least avoid the compilation step by using test values: just set A.tag.test_value to the NumPy test values and get the result from Q.get_test_value().

@brandonwillard
Member

brandonwillard commented Oct 20, 2020

It looks like some tests are failing because of that global test value setting. There are some brittle tests that check the lengths of captured warning messages, and those messages now include the new test value warnings.

@abdalazizrashid
Contributor Author

I think something is wrong; the coverage test took forever!

@brandonwillard
Member

> I think something is wrong; the coverage test took forever!

That's due to a bug introduced by #119. I just reverted that change, so, if you rebase and push, a coverage report should be available.

Member


p -> P

  - Change class name
  - Remove global test value and use local test values
  - Remove randomly generated test matrices
@brandonwillard brandonwillard merged commit 6150681 into aesara-devs:master Oct 22, 2020


Development

Successfully merging this pull request may close these issues.

Provide a reasonable matrix_power implementation
