Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Matrix and lambdification of matrices #46

Merged
merged 4 commits into from
Mar 20, 2023
Merged

Conversation

oscarbenjamin
Copy link
Owner

Adds a simple Matrix class with some basic properties. I am not sure that this is is really how I want matrices to be implemented in general but this can do for now.

The Matrix class only supports matrix addition and differentiation but not matrix multiplication or anything like LUsolve (it is not hard to add those things). The key feature of the Matrix class is that it uses a single expression graph for all elements of the matrix. This means that when computing the derivative of a matrix a single forward accumulation pass is used to compute the derivative of every element of the Matrix. The Matrix class itself is designed to be sparse and to be somewhat optimised for the case of a matrix that has few nonzero entries.

Also adds lambdification of matrices using numpy and llvm to create a numpy array. The code for that part can be tidied up and better organised. Eventually I want most of the logic for things like IR generation to be handled by symbolic rewrite rules but those aren't implemented yet so for now I've just copy-pasted bts of the lambdify code to get things working quickly.

@oscarbenjamin oscarbenjamin added the enhancement New feature or request label Mar 20, 2023
Copy link
Collaborator

@brocksam brocksam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

The Matrix implementation looks somewhat like a "dictionary-of-keys" (or DOK) sparse matrix, except the values in the mapping are indices that themselves map to a sequence of expressions. Is the reason for this so that the expressions can themselves be simply grouped by the List function for the purpose of the graph representation?

The __getitem__ indexing is neat but might need some modification when we implement matrix operations. I think you're already on the same page as me here. Consider the following:

>>> M = Matrix([[Symbol("M00"), Symbol("M01")], [Symbol("M10"), Symbol("M11")]])
>>> k = Matrix([[Symbol("k0")], [Symbol("k1")]])

Assume a MatrixSolve operation has been implemented such that one can do:

>>> x = MatrixSolve(M, k)

Ideally this would remain unevaluated unless explicitly asked. But it might also be desirable to be able to index back into the result of the matrix solve, like:

>>> x00 = x[0, 0]
>>> x00
MatrixIndex((0, 0), MatrixSolve(M, k))

x00 (i.e. a MatrixIndex node) would perhaps just be an Expr given that it essentially just represents a scalar.

@oscarbenjamin
Copy link
Owner Author

@brocksam let me know what critical features should be added to get this to where it needs to be. I'm guessing it is:

  • to_sympy/from_sympy
  • matrix concatenation
  • matrix multiply
  • LU solve

We don't need to add all of those in one PR but we should be clear about exactly what is the minimum needed to be able to run a full benchmark (emphasis on minimum!).

For now you can implement to_sympy/from_sympy inefficiently like this:

In [14]: from protosym.simplecas import *

In [15]: import sympy

In [16]: to_sympy_mat = lambda M: sympy.Matrix([[e.to_sympy() for e in row] for row in M.tolist()])

In [17]: from_sympy_mat = lambda M: Matrix([[from_sympy(e) for e in row] for row in M.tolist()])

In [18]: M = Matrix([[sin(x), cos(x)], [-cos(x), sin(x)]])

In [19]: to_sympy_mat(M)
Out[19]: 
Matrix([
[ sin(x), cos(x)],
[-cos(x), sin(x)]])

In [20]: from_sympy_mat(to_sympy_mat(M))
Out[20]: Matrix([[sin(x), cos(x)], [(-1*cos(x)), sin(x)]])

@oscarbenjamin
Copy link
Owner Author

Is the reason for this so that the expressions can themselves be simply grouped by the List function for the purpose of the graph representation?

Yes, exactly. I have considered different ways to make this work but for now this is a quick hack that can work for differentiation (although I had to add List to the differentiation code). Ideally we would also use it for eval_f64 etc but it doesn't quite fit with the evaluator model because evaluator expects to fully evaluate the expression to e.g. float. What is needed is something that can handle partial evaluation.

The __getitem__ indexing is neat but might need some modification when we implement matrix operations.

More modifications would be needed than that. Here Matrix is just its own class and is not a Sym/Expr in itself. I'm not sure I like that model as it causes lots of problems in SymPy. It is of course needed if you want Matrix to be mutable (which it currently isn't). You would need something else to be able to represent matrices symbolically within an expression but that something else could be not much different from List (it just needs to also store the indices). I haven't thought through all of the implications so for now this is a quick approach to get it so that we can differentiate and lambdify matrices efficiently.

There are still some fundamental things that aren't really implemented in protosym yet before it can have coherent matrix expressions and more complicated codegen. Really all of that should be based on a rule rewriting system which isn't there. For now though I think that being able to generate a large matrix, differentiate and evaluate numerically gets us some way to where we need to be to implement a benchmark. SymPy's existing lambdify functionality can be used in combination with this where protosym handles the computationally intensive part I think.

@oscarbenjamin
Copy link
Owner Author

But it might also be desirable to be able to index back into the result of the matrix solve, like:

>>> x00 = x[0, 0]
>>> x00
MatrixIndex((0, 0), MatrixSolve(M, k))

x00 (i.e. a MatrixIndex node) would perhaps just be an Expr given that it essentially just represents a scalar.

We can have that. If M could be a symbolic expression then it could be included in the args of MatrixSolve. MatrixSolve itself would not be an instance of the Matrix class added here though but rather some symbolic function like:

In [21]: MatrixSolve = Function('MatrixSolve')

In [22]: M = Symbol('M')

In [23]: b = Symbol('b')

In [24]: MatrixSolve(M, b)
Out[24]: MatrixSolve(M, b)

Likewise MatrixIndex would be a Function. These would not be expected to "evaluate" to anything because so far nothing in protosym evaluates to anything.

@oscarbenjamin
Copy link
Owner Author

I just fixed a bug to make sure that binexpand is used when lambdifying a matrix. That will be needed when lambdifying a matrix that comes from sympy because it generates Adds and Muls with more than 2 args.

@brocksam
Copy link
Collaborator

I think that we should use a linearised N-link pendulum on a cart as the benchmark case for ProtoSym with sympy.physics.mechanics. A few reasons for this:

  • It’s trivial to scale the size of the problem so we can use it to show asymptotic runtimes using SymPy vs ProtoSym
  • As N increases, the majority of the runtime can be attributed to computing Jacobians so should require minimal additional features in ProtoSym
  • The model is already implemented in sympy.physics.mechanics.models as n_link_pendulum_on_cart

I will start putting together a branch using ProtoSym to do the differentiation with a simple from_sympy and to_sympy before and after. From there we can look at the results, benchmark again and work out the next place to focus attention.

I'm guessing it is:

  • to_sympy/from_sympy
  • matrix concatenation
  • matrix multiply
  • LU solve

For now I think we can at least start without any of these. Efficient to_sympy and from_sympy would be the next critical addition. After that I think let's wait for my feedback about the above before confirming.

Copy link
Collaborator

@brocksam brocksam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this in and then I can use it in benchmarking to help identify what else we need.

@oscarbenjamin
Copy link
Owner Author

I agree. Thanks for the review!

@oscarbenjamin oscarbenjamin merged commit 4d80972 into main Mar 20, 2023
@oscarbenjamin oscarbenjamin deleted the pr_lambdify_mat branch March 20, 2023 15:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants