Automatic Simplification

Leonid Kovalev edited this page Feb 7, 2018 · 16 revisions

These are my thoughts on automatic simplification of expressions. This is adapted from this message to the mailing list.

-- Aaron Meurer

When creating a new object, or modifying an already existing one, there is often a temptation to make some evaluation and simplification rules happen automatically. However, this is often a bad idea. I will elaborate on why in a bit. There are four conditions that should be met for any evaluation or simplification rule to be done automatically (i.e., at object instantiation time):

  1. The rule is correct.
  2. The rule will always make the object simpler.
  3. The rule is very cheap in every case.
  4. The rule is so trivial that no one will ever not want it done.

1. is probably the most obvious of the rules, but it often does not hold. For example, in SymPy 0.6.5 and earlier, sqrt(x*y) was automatically reduced to sqrt(x)*sqrt(y). But this does not hold for general x and y, for the same reason that sqrt(x**2) does not equal x in general. For example, if x == y == -1, then sqrt(x*y) == sqrt(-1*-1) == sqrt(1) == 1, but sqrt(-1)*sqrt(-1) == I*I == -1. This simplification is only true in general if x, y >= 0.

As of SymPy 0.6.6, if x and y are given these assumptions (x, y = symbols('x y', nonnegative=True)), then SymPy will automatically convert sqrt(x*y) to sqrt(x)*sqrt(y), but if they do not have those assumptions, then sqrt(x*y) will remain unevaluated.
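The counterexample above can be checked directly (a sketch against current SymPy; the 0.6.5 behavior described is historical):

```python
from sympy import sqrt, symbols, Integer

# The numeric counterexample: for x == y == -1 the two forms differ.
lhs = sqrt(Integer(-1) * Integer(-1))        # sqrt(1) -> 1
rhs = sqrt(Integer(-1)) * sqrt(Integer(-1))  # I*I -> -1
print(lhs, rhs)

# Without assumptions, sqrt(x*y) is left alone rather than split,
# so the two expressions are structurally different:
x, y = symbols('x y')
print(sqrt(x*y) == sqrt(x)*sqrt(y))
```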

It usually happens that the simplifications you want to apply do hold for a subset of possible inputs, as above, but fail to be correct in the general case. In SymPy, 99% of the time this is related to assumptions.

2. also seems obvious, but it too is often violated. The problem is that the automatically evaluated form may seem simpler for certain inputs, such as small values, but with more carefully chosen inputs it becomes clear that the rule makes things far more complicated. Since less simple expressions are by their nature larger, a violation of this rule will invariably mean a violation of rule 3.

For example, suppose that you wanted to apply the rules log(a**p) == p*log(a) and log(a*b) == log(a) + log(b) automatically, where a and b are positive integers and p is real (apologies to Chris Smith, who wanted to do this; if you can think of a better example, I will use it instead). These rules are correct, given the assumptions on a, b, and p. If you look at small values, it looks like something worth doing. For example, log(18) == log(2*3**2) == log(2) + 2*log(3). And it would allow things like log(6) - log(2) to automatically simplify to log(3), because log(6) would automatically be expanded to log(2) + log(3), and the log(2)s would cancel.

But if you consider large input values, you will soon see that this makes things much less simple. log(factorial(10)) would become 2*log(5) + 4*log(3) + 8*log(2) + log(7) instead of just log(3628800). You can imagine what log(factorial(100)) would look like.
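In current SymPy this expansion is available on request via expand_log rather than happening automatically (a sketch; the positivity and realness assumptions mirror those in the text):

```python
from sympy import symbols, log, expand_log

a, b = symbols('a b', positive=True)
p = symbols('p', real=True)

# The log is left "as written" at construction time...
expr = log(a**p * b)

# ...and expands only when explicitly asked, which is valid here
# because a and b are positive and p is real:
expanded = expand_log(expr)
print(expanded)  # p*log(a) + log(b)
```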

Or consider polynomials: which is simpler, (x+a)**100 or x**100 + ... + a**100? And what about special functions? Which is better: Ylm(20, 10, theta, phi) or the explicit definition? One answer is: it depends on what we want to do.

More to come…

Work from general to special

I would build a general expression that is "as written", then transform it as needed once the specific context can be clearly identified.


If we go this route, we would always construct a symbolic and general representation. For example, legendre(5, x) could easily be changed into a polynomial. But the route back to the form legendre(5, x) is very difficult in general. We lose some information about our object when we transform it into a polynomial. This could make further processing impossible.
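A sketch of the one-way nature of this transformation, using SymPy's legendre (a symbolic degree keeps the general form unevaluated):

```python
from sympy import symbols, legendre

x, n = symbols('x n')

# The general, "as written" form stays symbolic:
general = legendre(n, x)

# Specializing the degree produces an explicit polynomial...
special = general.subs(n, 5)
print(special)

# ...but there is no generic route back from that polynomial to
# legendre(5, x): the structural information is lost.
```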

Summarized, this means that we should never touch (and change) objects without an explicit demand, whether from the user or from an algorithm processing the object.

Such an explicit request could look like refine(ourObject, furtherInformation), asking the object to transform itself. One form currently present in SymPy is ourObject.rewrite("...").
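Both mechanisms exist in SymPy today; a small sketch of each:

```python
from sympy import symbols, sin, exp, refine, Q, Abs

x = symbols('x')

# rewrite: ask an object to express itself in terms of other functions.
print(sin(x).rewrite(exp))

# refine: simplify under extra information supplied with the request.
print(refine(Abs(x), Q.positive(x)))  # x
```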

However, this approach has a clear disadvantage: many 'simple' changes (which most users probably want, and which some algorithms may even need to work correctly) have to be done manually. Who really wants sin(0) to stay as it is and not be changed to 0?

  • We shouldn't let the interface dictate the implementation. An end-user will never want to see sin(0), but a tool writer might need it. sin(x) should return something like Apply(SinFunction, x). That way, sin(0) can return 0 and those who need it can use Apply(SinFunction, 0) (which should obviously not auto-evaluate in any way). --Ronan


Canonicalization can be understood as a transformation that brings objects into a unique canonical form. For example, in the case of polynomials, we can always make the coefficient of the leading monomial equal to 1.
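With SymPy's Poly, this canonicalization is an explicit, opt-in step (a sketch):

```python
from sympy import symbols, Poly

x = symbols('x')

p = Poly(3*x**2 + 6*x + 9, x)

# monic() divides by the leading coefficient only when asked;
# the original polynomial is left untouched:
print(p.monic().as_expr())  # x**2 + 2*x + 3
```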

This sort of change may be required for some algorithms to work. But is it a good idea to perform it by default, with no possibility of turning it off?

Even this set of very restricted transformation rules may violate rule 4! (This assumes that rules 1 to 3 are fulfilled, which can be wrong for 2 and 3.)

Jo "toolforger": I like the general approach.

One detail: whether something is "simpler" depends on what metric you apply. If you count number of operators and constants, 2*log(5) + 4*log(3) + 8*log(2) + log(7) has a metric of 19, while log(3628800) has a metric of 2, hence is far simpler.

If your metric is "number of prime factors in the largest constant", then the former expression has a metric of 1, the latter one of 15. (This metric can indeed be more interesting.)

Just a vague idea: Maybe we want a framework that can consider different metrics. If a transformation reduces a metric, we do it, if it doesn't, we skip it.

We do already have such a framework. Look at the docstring of simplify(). Note that this document is specifically about automatic simplification, the kind that happens at expression instantiation time and that can not easily be undone. --Aaron
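A sketch of that framework: simplify() judges candidate forms with a measure function (count_ops by default, settable via its measure keyword), and both directions of the log transformation above are explicit calls, never automatic:

```python
from sympy import log, count_ops, logcombine

expr = 2*log(5) + 4*log(3) + 8*log(2) + log(7)

# Under the default metric the combined form is far smaller:
print(count_ops(expr), count_ops(log(3628800)))

# Combining is an explicit request (force=True supplies the
# positivity information the integer arguments already satisfy):
print(logcombine(expr, force=True))  # log(3628800)
```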

The danger of automatic simplification

One reason to avoid automatic simplification is that people will start to write algorithms that assume the simplification will always take place, either intentionally or accidentally. In other words, the automatic simplification becomes an invariant of the object. This makes it very hard to someday remove the simplification, because the algorithms that relied on the invariant will become incorrect.

A great example of this is the simplification 2*(x + y) -> 2*x + 2*y, which we still have not removed. Removing it in Mul.flatten is easy; the problem is that there are a lot of test failures. Many of them are just due to different output, but many are completely wrong results, because the algorithms assume that such things will distribute.
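The invariant in question, sketched against current SymPy:

```python
from sympy import symbols

x, y = symbols('x y')

# A numeric coefficient still distributes at construction time:
print(2*(x + y))   # 2*x + 2*y

# A symbolic coefficient does not distribute automatically:
print(x*(x + y))   # x*(x + y)
```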

This also happened back when we changed powers to not automatically combine. It used to be that x**a*x**b always automatically became x**(a + b), even though this clearly breaks rule 4 above. Removing the rule was easy, but then tests failed in nontrivial ways. The easiest fix was to add powsimp calls in the correct places in the code, but it would have been better to fix the algorithms themselves to not assume this.
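Today's behavior, with powsimp as the explicit request (a sketch):

```python
from sympy import symbols, powsimp

x, a, b = symbols('x a b')

e = x**a * x**b
# Powers with symbolic exponents no longer combine at
# construction time...
print(e)
# ...combining them is an explicit request:
print(powsimp(e))  # x**(a + b)
```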