simplify: Simplification of non-commutative expressions #14520
This commit introduces nc_simplify - a function for simplifying expressions with non-commutative symbols. To be specific, it simplifies the terms of the expression that involve only multiplication and raising to a power by grouping together repeated subterms, e.g. nc_simplify(a*b*a*b) == (a*b)**2.
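To make the idea of "grouping together repeated subterms" concrete, here is a minimal pure-Python sketch that works on plain sequences of letters rather than SymPy expressions. It is not the code from this commit: the function names `group_adjacent_repeats` and `expand_groups` are hypothetical, and the greedy "earliest position, longest block first" strategy is only an assumption about how a naive version could behave.

```python
def group_adjacent_repeats(word):
    """Greedily collapse adjacent repeated blocks of a word.

    `word` is any sequence of hashable letters. The result is a list of
    (block, exponent) pairs whose expansion reproduces `word`.
    This is an illustrative sketch, not SymPy's nc_simplify.
    """
    word = tuple(word)
    n = len(word)
    out = []
    i = 0
    while i < n:
        best = None
        # Try the longest candidate block starting at position i first.
        for length in range((n - i) // 2, 0, -1):
            block = word[i:i + length]
            count = 1
            # Count how many times the block repeats back to back.
            while word[i + count * length:i + (count + 1) * length] == block:
                count += 1
            if count > 1:
                best = (block, count)
                break
        if best is None:
            # No repetition starts here: emit a single letter.
            best = (word[i:i + 1], 1)
        out.append(best)
        i += len(best[0]) * best[1]
    return out


def expand_groups(groups):
    """Inverse of group_adjacent_repeats: write every power out in full."""
    flat = []
    for block, exponent in groups:
        flat.extend(block * exponent)
    return tuple(flat)


print(group_adjacent_repeats("abab"))   # the word a*b*a*b
print(group_adjacent_repeats("ababc"))
```

Because the longest block is tried first, a chosen block can itself contain repetitions (for example `abababab` is grouped as a square of `abab`), so a real implementation would need to recurse into the blocks or pick candidates more carefully.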
Nice. Your algorithm could probably be useful for #10271. I tried your examples there and they aren't as good.
(side note: I can't believe that pull request is still open. I should probably just fix the merge conflicts and get it in already)
I was wondering if there were any related issues or PRs trying to do similar things. The algorithm I used is mostly naive, the only not-entirely-naive thing about it is a large if clause to modify the subterm we're trying to match by splitting up the powers. E.g. if we are grouping subterms of length 2 in |
I think there must be an issue somewhere in my cse code. The underlying algorithm should be able to recursively get the same results as yours:
>>> shortest_repeated_subsequence((a, a, a, b, a, b, a, b))
(a, b)
>>> shortest_repeated_subsequence((a, b, a, a, b, a, a, b, a))
(a, b, a)
>>> shortest_repeated_subsequence((a, b, a, b, c, a, b, a, b, c))
(a, b)
I haven't had a look at it in a long time so I'm not sure what is going on. I thought maybe it was that it doesn't recognize powers as products, but it fails even with
I might do some more debugging later. Let me know what you think of the "shortest repeated subsequence" idea. It can probably be extended to work with powers implicitly, so that they don't have to be manually extended in the sequence (that would be important for very large powers). At some point I got bogged down in @smichr's suggestions, and it turned out that I didn't need cse on matrix expressions as much as I thought I did, so my focus moved to other places. |
Although cse perhaps has a different goal than nc_simplify. For cse, |
Ah, cse is destroying unevaluated Muls here. And it's a known issue that cse doesn't automatically expand powers (apparently nontrivial, because it was fixed once, but had to be reverted because of performance issues). |
rather than It's true that if you go for the longest one, there might be shorter repeated subsequences inside so you'll have to run the function on it as well, but you could probably reuse the information from the matrix you've computed for the whole expression and then it shouldn't be too inefficient (maybe)? |
The x2 in your example isn't correct (it's (x1, x0) or (x0, x1)). Also, it should be For your example (corrected), we have
vs.
From the point of view of Maybe there's a better example where the total number of multiplications isn't the same. But I'm thinking the cse algorithm isn't directly applicable for simplify, because cse has to care about repeated subsequences anywhere in the expression, whereas for simplify you want to combine repeated subsequences that are next to each other (into a power). I don't think shortest or longest matters without taking that into account. But maybe some of the ideas, like the matrix, could be reused. I'm not sure. This also doesn't take into account at all inverses and simplifying things like |
Another thing that can be done regarding inverses would be to minimize the number of negative powers (this seems to me to be a reasonable heuristic for "more simple", at least assuming other things like combining don't occur). For instance,
Oh yeah... Oops.
In the group theory module, grouped powers are always expanded automatically, so that
And this also sounds like something I should do. Thanks for the suggestion 👍 Generally, the group theory module still doesn't have a simplifying algorithm for group elements and this PR is me finally getting around to it since summer. I thought that I might just as well sort out non-commutative expressions first. So I guess I should do a bit more work here, to include the negative powers and see if using a matrix will make things more efficient. Might do this later this week. |
Any objections to merging this? I think it's a good start for noncommutative simplification. |
I've rewritten it from scratch and the new version is much better: it takes into account the things we talked about, can choose simplifications a bit more cleverly and is essentially as efficient as this one. I've just been debugging and commenting the code quite slowly but it's essentially ready now. I'll run some more tests and hopefully will commit it in a couple of days - that would be better to merge. |
This is a new version of nc_simplify. Now inverses are handled so that nc_simplify(a*b*(b**-1*a**-1)**2) = (a*b)**-1 and nc_simplify(c**-1*b**-1*a**-1) = (a*b*c)**-1. Negative powers are generally handled better, including this sort of thing: nc_simplify(a*b**3*a*b**3*a*b) = (a*b**3)**3*b**-2. Additionally, simplifications are now chosen on the basis of what will leave the fewest arguments rather than the earliest and longest possible repeated subterm as before. The repetitions themselves are found by computing a matrix of overlaps of prefixes and suffixes of the word instead of repeatedly scanning the word for each possible repeated subterm length.
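As a rough illustration of what "overlaps of prefixes and suffixes of the word" can mean, the sketch below computes, for every split point of a word, the lengths `k` for which the last `k` letters of the left part equal the first `k` letters of the right part. A nonzero overlap at a split point marks an adjacent repeated block of that length. This is a brute-force sketch with hypothetical names (`overlap_lengths`, `overlap_table`); the commit's actual matrix may be laid out differently, and presumably reuses earlier entries instead of recomparing slices.

```python
def overlap_lengths(prefix, suffix):
    """Lengths k such that the last k letters of `prefix` equal the first
    k letters of `suffix`, longest first. k = 0 always qualifies."""
    prefix, suffix = tuple(prefix), tuple(suffix)
    limit = min(len(prefix), len(suffix))
    return [k for k in range(limit, -1, -1)
            if prefix[len(prefix) - k:] == suffix[:k]]


def overlap_table(word):
    """Map every split point i to the overlaps of word[:i] with word[i:]."""
    word = tuple(word)
    return {i: overlap_lengths(word[:i], word[i:])
            for i in range(1, len(word))}


# The word c*a*b*a*b*a*b*a*b, one character per letter.
word = "cabababab"
table = overlap_table(word)
print(table[5])  # overlaps of "cabab" with "abab"
```

For the split after the fifth letter, the overlaps are `abab`, `ab` and the empty word, so the table records the lengths 4, 2 and 0 there.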
Just as an example, if you define:
the latest version simplifies it to There are some specific cases the function doesn't handle right now. E.g. it can simplify |
@asmeurer, I think this is ready to merge; can you please have a look?
I ran a coverage test ( Also maybe the tests could be structured so that they are self checking, like
def _check(expr, simplified):
    # the simplified form is what nc_simplify should return
    assert nc_simplify(expr) == simplified
    # and it must agree with the input once both are expanded
    assert expand(expr) == expand(simplified)

_check(a*b*a*b*a*b*c*(a*b)**3*c, ((a*b)**3*c)**2)
...
I'm not sure I understand about the coverage. I had never run this before but now I tried with and without the test I added, and the numbers in |
The percentage is just the number of lines that are covered, which isn't helpful since it's looking at the whole file. What you should do is open the |
This commit makes sure the test covers all lines in the new code and writes the test in a slightly shorter way via an auxiliary subfunction.
Ah, got it. The new commit should deal with that. |
x = Symbol('x')


def _check(expr, simplified, deep=True):
    assert nc_simplify(expr, deep=deep) == simplified
Were you going to add an expand sanity check here?
At first I was, but then expand((c*a)**-1) == (c*a)**-1 and not expand((c*a)**-1) == a**-1*c**-1, so I'd have to add a keyword argument to ignore the expand check for some of the tests. I didn't in the end because all of the expressions I used are sufficiently short that I know with certainty the simplified version is correct. But I guess I could add an ignore keyword argument, and have the check anyway just in case, if you think it's worth it. Or, alternatively, expand((c*a)**-1) should probably be a**-1*c**-1. I think I could write the PR for it today already, and then if it's merged first, I wouldn't need an additional keyword argument.
Wait, I don't even need a separate PR, I can do it right here.
There is a test in core/tests/test_expand.py that does expand(A*B*(A*B)**-1) == A*B*(A*B)**-1 for non-commutative A, B after this issue. So I'm somewhat unsure if that was intentional. After the change I'm proposing to do it'll be expand(A*B*(A*B)**-1) == 1.
Looks like the bug was a wrong result, so the fix was likely to just disable noncommutative expanding for negative powers. But doing the correct thing and actually expanding seems better. I agree expand(A*B*(A*B)**-1) should give 1.
This commit makes sure negative powers are expanded properly, e.g. expand((a*b)**-1) == b**-1*a**-1, and adds an expand check to the tests for nc_simplify.
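The rule behind this expansion is that inverting a non-commutative product reverses the order of the factors. The following pure-Python sketch checks that rule on words written as lists of `(letter, exponent)` pairs. It does not use SymPy, and the helpers `inverse` and `reduce_word` are hypothetical names introduced only for this illustration.

```python
def inverse(word):
    """Inverse of a word [(letter, exponent), ...]: reverse the order and
    negate every exponent, since (x*y)**-1 == y**-1 * x**-1."""
    return [(letter, -exponent) for letter, exponent in reversed(word)]


def reduce_word(word):
    """Merge adjacent powers of the same letter and drop zero exponents."""
    out = []
    for letter, exponent in word:
        if out and out[-1][0] == letter:
            # Same letter as the previous factor: add the exponents.
            exponent += out.pop()[1]
        if exponent != 0:
            out.append((letter, exponent))
    return out


ab = [("a", 1), ("b", 1)]

# (a*b)**-1 written out factor by factor is b**-1 * a**-1.
print(inverse(ab))

# a*b*(a*b)**-1 cancels completely; the empty word stands for 1.
print(reduce_word(ab + inverse(ab)))
```

The same helpers can be used to sanity-check a proposed simplification by reducing both sides to their fully written-out words and comparing them.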
Can this be merged? Everything looks good on my end but I want to double check that there wasn't anything else you planned to do.
Yep, this is the final version (though I might do a bit more work on this later, after it's merged).
If NC expressions are now simplified, maybe it is easier to get their coefs? #14833
@homocomputeris I don't see why simplification would make it harder. Is there some reason to believe this might be the case?
@valglad I had a typo: if they are now simplified, I believe it shouldn't be hard to get their coefs correctly from an expression.
@homocomputeris Ah, I see. I don't think that would help. Have a look at my first comment in the issue - this 0 was an intended feature when there are conflicting coefficients, and simplifying the multiplicative terms wouldn't resolve the conflict.
This PR introduces nc_simplify - a function for simplifying expressions with non-commutative symbols. Specifically, it simplifies the terms of the expression that involve only multiplication and raising to a power by grouping together repeated subterms, e.g. nc_simplify(a*b*a*b) == (a*b)**2.
At present, SymPy doesn't really do much to simplify such expressions so that simplify(a*b*a*b) == a*b*a*b. On this branch, simplify uses nc_simplify if the expression is non-commutative, before attempting further simplifications as it normally would.
Other examples of nc_simplify with non-commutative symbols a, b, c are:
More examples are included in the docstring and the tests.
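A short usage sketch, assuming a SymPy version that already includes this change and that the function is importable from `sympy.simplify.simplify` (the import path is an assumption, not something stated in this thread). It reproduces the `a*b*a*b` example from the description and applies the expand sanity check discussed in the review.

```python
from sympy import symbols, expand
from sympy.simplify.simplify import nc_simplify  # assumed import path

# Non-commutative symbols, so a*b and b*a are different expressions.
a, b, c = symbols("a b c", commutative=False)

expr = a*b*a*b
grouped = nc_simplify(expr)
print(grouped)

# Sanity check in the spirit of the test helper in this PR:
# expanding the simplified form should give back the original product.
assert expand(grouped) == expand(expr)
```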