Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sure the output of julia_code is reasonable and correct. #23736

Open
cocolato opened this issue Jul 7, 2022 · 5 comments
Open

Make sure the output of julia_code is reasonable and correct. #23736

cocolato opened this issue Jul 7, 2022 · 5 comments

Comments

@cocolato
Copy link
Contributor

cocolato commented Jul 7, 2022

When I fixed the issue #23667 , I found that the output of Julia_code was not quite right at the moment.

The default printer will use the vectorized "dot" operator to print julia code when use binary operators and the arg is not a number.
For example:

>>> x = Symbol("x")
>>> julia_code(x**10)
'x.^10'
>>> y = Symbol("y")
>>> julia_code(x * y)
'x .* y'
>>> julia_code(x / y)
'x ./ y'
>>> julia_code(x + y)
'x + y'

I am not sure if we should use "dot" operator on all possible mathematical operations to make sure it is compatible with the vectorization in Julia.

If we do this, probably any mathematical function will have to print a "dot" operator that looks a bit strange, although in Julia it is not necessary, for example:

julia> x = [1 2 3]
1×3 Matrix{Int64}:
 1  2  3

julia> y = [1, 2, 3]
3-element Vector{Int64}:
 1
 2
 3

julia> x * y
1-element Vector{Int64}:
 14

julia> x .* y
3×3 Matrix{Int64}:
 1  2  3
 2  4  6
 3  6  9

I think it's up to users to decide whether they want .* or *, depending on what type x and y are in Julia.
We just need to return a correct and beautiful Julia expression x * y, its specific behavior is determined by the user in Julia:

julia> expr_str = "x * y"
"x * y"

julia> expr = Meta.parse(expr_str)
:(x * y)

julia> x = [1 2 3]
1×3 Matrix{Int64}:
 1  2  3

julia> y = [1, 2, 3]
3-element Vector{Int64}:
 1
 2
 3

julia> eval(expr)
1-element Vector{Int64}:
 14

julia> import Base.*

julia> (*)(mat::Matrix{Int64}, vec::Vector{Int64}) = mat .* vec
* (generic function with 365 methods)

julia> eval(expr)
3×3 Matrix{Int64}:
 1  2  3
 2  4  6
 3  6  9

As @moble said in #23729 (review), we can also use @. macro in Julia to give the user the option to make all of the operations in the current expression vectorized, for example:

>>> julia_code(x * y)
'x * y'
>>> julia_code(x * y, vectorized=True)
@. (x * y)
@cocolato
Copy link
Contributor Author

cocolato commented Jul 7, 2022

@moble @asmeurer I am not sure if I have described the problem clearly. Do you have any other suggestions?

@asmeurer
Copy link
Member

asmeurer commented Jul 7, 2022

If you know your variables are matrices and you want to perform matrix operations, you should use SymPy's MatrixExpr classes to represent that. This looks correct:

>>> x = MatrixSymbol('x', n, n)
>>> y = MatrixSymbol('y', n, n)
>>> julia_code(x*y)
'x*y'
>>> x, y = symbols('x y')
>>> julia_code(x*y)
'x.*y'

Generally speaking, SymPy is going to generate code for expressions assuming they represent the thing they represent in SymPy (except a scalar Expr could be translated a vectorized array of scalars).

I think the concern is that if you know that your variables are scalars, then the dot notation is unnecessary. It wasn't exactly clear to me if it actually matters for performance (if I understand #23667 (comment) correctly, it doesn't).

One could argue that it's unreadable to use the dots when they are unnecessary, i.e., when the variables are scalars instead of arrays. Or if you use the @. syntax to avoid using them. My suggestion is to have a flag.

Actually I just tested it and it looks like @. can be used with any expression? This comment made it sound like it only worked with assignments. So maybe the flag should have three options

  • No dots on any operators.
  • Dots on all operators (except non-vector operators like matmul).
  • No dots on any operators and include the @. macro (but this would need to fail if a non-vectorized operator like matmul is included).

Alternately, the second and third could be combined and it can just use @. automatically, e.g., when the expression is large enough.

I would suggest making the second of these the default since that's already the current behavior and since that should automatically work in all cases.

But I'm not really a use of julia_code or even Julia, so your feedback on this would be appreciated.

@cocolato
Copy link
Contributor Author

cocolato commented Jul 7, 2022

Thank you for your reply!

I think the concern is that if you know that your variables are scalars, then the dot notation is unnecessary. It wasn't exactly clear to me if it actually matters for performance (if I understand #23667 (comment) correctly, it doesn't).

The performance issues are due to my bad use of Julia Benchmark, so I guess we can ignore the difference in performance ( And in most cases there will be no difference in performance ).

Actually I just tested it and it looks like @. can be used with any expression? This comment made it sound like it only worked with assignments. So maybe the flag should have three options

Yes, @. will convert every function call or operator in expr into a "dot call" (e.g. convert f(x) to f.(x)), and convert every assignment in expr to a "dot assignment" (e.g. convert += to .+=).

I would suggest making the second of these the default since that's already the current behavior and since that should automatically work in all cases.

I agree that it is better to extend other output styles with Flag without changing the existing default output styles.

But I'm not really a use of julia_code or even Julia, so your feedback on this would be appreciated.

We will normally use parse_latex() in sympy to process the data from our latex dataset and convert it to sympy expr. Then use julia_code() to turn it into an expr that Julia can recognize, so we can use Julia to do some more efficient processing. (Of course, this part should be implemented in Julia, but currently ANTLR does not seem to support Julia.)

@moble
Copy link
Contributor

moble commented Jul 7, 2022

Sorry. I didn't see this issue before leaving this comment. I think Aaron's three options sound good. The only thing I would add is that rather than failing, option 3 could just provide the ugly translations MatPow(A, p) -> $^(A, p) and MatMul(A, B) -> $*(A, B).

@cocolato
Copy link
Contributor Author

cocolato commented Jul 7, 2022

Thank you! I think I'd like to work on this.

@oscarbenjamin oscarbenjamin mentioned this issue Jul 8, 2022
44 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants