These are my thoughts on automatic simplification of expressions. This is adapted from this message to the mailing list.
-- Aaron Meurer
When creating a new object, or modifying an already existing one, there is often a temptation to make some evaluation and simplification rules happen automatically. However, this is often a bad idea. I will elaborate on why in a bit. There are four conditions that should be met for any evaluation or simplification rule to be done automatically (i.e., at object instantiation time):
- The rule is correct.
- The rule will always make the object simpler.
- The rule is very cheap in every case.
- The rule is so trivial that no one will ever not want it done.
1. is probably the most obvious of the rules, but it often does not hold. For example, in SymPy 0.6.5 and earlier, `sqrt(x*y)` was automatically reduced to `sqrt(x)*sqrt(y)`. But this does not hold for general `x` and `y`, for the same reason that `sqrt(x**2)` does not equal `x` in general. For example, if `x == y == -1`, then `sqrt(x*y) == sqrt(-1*-1) == sqrt(1) == 1`, but `sqrt(x)*sqrt(y) == sqrt(-1)*sqrt(-1) == I*I == -1`. This simplification is only valid in general if `x, y >= 0`.
As of SymPy 0.6.6, if `x` and `y` are given these assumptions (`x, y = symbols('x y', nonnegative=True)`), then SymPy will automatically convert `sqrt(x*y)` to `sqrt(x)*sqrt(y)`; but if they do not have those assumptions, then `sqrt(x*y)` will remain unevaluated.
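This behavior can be checked directly (a sketch against a recent SymPy; `positive` symbols are used here, which in particular satisfy the nonnegative condition from the text):

```python
from sympy import sqrt, symbols

# Plain symbols: sqrt(x*y) must not split, since the identity fails
# for negative values.
x, y = symbols('x y')
assert sqrt(x*y) != sqrt(x)*sqrt(y)

# The concrete counterexample from the text: x == y == -1.
assert sqrt((-1)*(-1)) == 1        # sqrt(1) == 1
assert sqrt(-1)*sqrt(-1) == -1     # I*I == -1

# With suitable sign assumptions the identity holds, and SymPy
# applies the split automatically at construction time.
p, q = symbols('p q', positive=True)
assert sqrt(p*q) == sqrt(p)*sqrt(q)
```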
It usually happens that the simplification you want to apply does hold for some subset of possible inputs, as above, but fails to be correct in the general case. In SymPy, 99% of the time this is related to assumptions.
2. also seems obvious, but it too is often violated. The problem is that the automatically evaluated form may seem simpler for certain inputs, such as small input values, but it becomes clear that the rule makes things far more complicated with more carefully chosen input values. Since less simple expressions are by their nature larger, a violation of this rule will invariably also mean a violation of rule 3.
For example, suppose that you wanted to apply the rules `log(a**p) == p*log(a)` and `log(a*b) == log(a) + log(b)` automatically, where `a` and `b` are positive integers and `p` is real (apologies to Chris Smith, who wanted to do this; if you can think of a better example, I will use it instead). This rule is correct, because of the assumptions on `a`, `b`, and `p`. If you look at small values, it looks like it might be something worth doing. For example, `log(18) == log(2*3**2) == log(2) + 2*log(3)`. And it would allow things like `log(6) - log(2)` to automatically simplify to `log(3)`, because the `log(6)` would automatically be expanded to `log(2) + log(3)`, and the `log(2)`s would cancel.
But if you consider large input values, you will soon see that this makes things much less simple. `log(factorial(10))` would become `2*log(5) + 4*log(3) + 8*log(2) + log(7)` instead of just `log(3628800)`. You can imagine what `log(factorial(100))` would look like.
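In current SymPy these log rules are available on demand rather than automatically; a sketch using the real `expand_log` and `logcombine` functions on symbolic arguments:

```python
from sympy import log, symbols, expand_log, logcombine, factorial

a, b = symbols('a b', positive=True)
p = symbols('p', real=True)

# log(factorial(10)) stays compact; it is not expanded into prime factors.
assert log(factorial(10)) == log(3628800)

# The expansion rules from the text are applied only on request, and only
# under the assumptions that make them valid (a, b positive; p real).
expr = log(a**p * b)
assert expand_log(expr) == p*log(a) + log(b)

# The reverse direction is also an explicit step.
assert logcombine(p*log(a) + log(b)) == log(a**p * b)
```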
Or consider polynomials: which is simpler, an unexpanded power like `(x + a)**100` or its expansion `x**100 + ... + a**100`?
And what about special functions? Which is better: `Ylm(20, 10, theta, phi)` or the explicit definition? One answer is: it depends on what we want to do.
More to come…
Work from general to special
I would build a general expression that is "as written", then transform it as needed once the specific context can be clearly identified.
If we go this route we would always construct a symbolic and general representation.
`legendre(5, x)` could easily be changed into a polynomial. But the route back to the form `legendre(5, x)` is very difficult in general. We lose some information about our object when we transform it to a polynomial. This could make further processing impossible.
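This one-way street is visible in SymPy itself, where `legendre` with a concrete integer degree auto-evaluates (a sketch; the expanded coefficients below are those of the degree-5 Legendre polynomial):

```python
from sympy import legendre, symbols, Rational

x, n = symbols('x n')

# With a concrete degree, legendre evaluates to the explicit polynomial...
p5 = legendre(5, x)
assert p5 == Rational(63, 8)*x**5 - Rational(35, 4)*x**3 + Rational(15, 8)*x

# ...and nothing in the result records that it came from legendre(5, x);
# that lost information is exactly what makes the route back hard.

# With a symbolic degree, the object stays "as written".
assert legendre(n, x).func == legendre
```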
Summarized, this means that we should never touch (and change) objects without explicit demand (from the user, or from an algorithm processing the object).
This special request could look like a method call that asks the object to transform itself. One form currently present in SymPy is the `rewrite` method.
However, this approach has a clear and obvious disadvantage. Many 'simple' changes (which most users probably want, and which some algorithms might even need in order to work correctly) have to be done manually. Who really wants `sin(0)` to stay that way and not be changed to `0`?
- We shouldn't let the interface dictate the implementation. An end-user will never want to see `sin(0)` unevaluated, but a tool writer might need it. Internally, `sin(x)` should return something like `Apply(SinFunction, x)`. That way, end-users get `0`, and those who need the unevaluated form can use `Apply(SinFunction, 0)` (which should obviously not auto-evaluate in any way). --Ronan
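The `Apply` design sketched above was not adopted, but modern SymPy addresses the same need with the `evaluate=False` flag, which is the real mechanism for opting out of auto-evaluation:

```python
from sympy import sin, Integer

# Default behavior: end-users get the evaluated result.
assert sin(0) == 0

# A tool writer can opt out of auto-evaluation explicitly.
unevaluated = sin(Integer(0), evaluate=False)
assert unevaluated.func == sin            # still an applied sin
assert unevaluated.args == (Integer(0),)  # with argument 0 preserved
assert unevaluated != 0
```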
Canonicalization can be understood as a kind of transformation that puts objects into a unique canonical form. For example, in the case of polynomials we can always make the coefficient of the leading monomial equal to 1.
This sort of change may be required for some algorithms to work. But is it a good idea to perform it by default, with no possibility of turning it off?
Even this set of very restricted transformation rules may violate rule 4! (Under the assumption that rules 1 to 3 are fulfilled, which can be wrong for 2 and 3.)
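In SymPy this particular canonicalization is indeed opt-in: `Poly.monic()` is the real API, and it is an explicit method call rather than a construction-time default (a sketch):

```python
from sympy import Poly, Rational, symbols

x = symbols('x')

# Constructing the Poly keeps the coefficients as written...
p = Poly(4*x**2 + 2*x + 6, x)

# ...and making it monic (leading coefficient 1) is an explicit request.
# Note that monic() moves the coefficients into the fraction field QQ.
monic_p = p.monic()
assert monic_p == Poly(x**2 + Rational(1, 2)*x + Rational(3, 2), x, domain='QQ')
```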
Jo "toolforger": I like the general approach.
One detail: whether something is "simpler" depends on what metric you apply. If you count number of operators and constants,
2*log(5) + 4*log(3) + 8*log(2) + log(7) has a metric of 19, while
log(3628800) has a metric of 2, hence is far simpler.
If your metric is instead "number of prime factors, counted with multiplicity, in the largest constant inside a log", then the former expression has a metric of 1, the latter one of 15 (since `3628800 == 2**8 * 3**4 * 5**2 * 7`). (This metric can indeed be more interesting.)
Just a vague idea: Maybe we want a framework that can consider different metrics. If a transformation reduces a metric, we do it, if it doesn't, we skip it.
We do already have such a framework. Look at the docstring of
simplify(). Note that this document is specifically about automatic simplification, the kind that happens at expression instantiation time and that can not easily be undone. --Aaron
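A sketch of that framework as it exists today: `simplify()` uses `count_ops` as its default `measure` and accepts a custom one; the alternative metric below is a hypothetical illustration, not a SymPy built-in:

```python
from sympy import log, count_ops, simplify, Integer

# The default metric, count_ops, agrees with the discussion above.
expr = 2*log(5) + 4*log(3) + 8*log(2) + log(7)
assert count_ops(expr) > count_ops(log(3628800))

# A hypothetical custom metric that also penalizes large integer constants.
def penalize_big_constants(e):
    big = sum(1 for i in e.atoms(Integer) if abs(i) > 1000)
    return count_ops(e) + 10*big

# simplify keeps a transformation only if it does not worsen the measure.
result = simplify(expr, measure=penalize_big_constants)
```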
The danger of automatic simplification
One reason to avoid automatic simplification is that people will start to write algorithms that assume that that simplification will always take place, either intentionally or accidentally. In other words, the automatic simplification will become an invariant of the object. This makes it very hard to someday remove the simplification, because those algorithms that used the invariant will become incorrect.
A great example of this is the simplification `2*(x + y) -> 2*x + 2*y`, which we still have not removed (https://code.google.com/p/sympy/issues/detail?id=1497). Removing this in `Mul.flatten` is easy. The problem is that there are a lot of test failures. Many of them are just due to different output, but many are completely wrong results, because the algorithms assume that such things will distribute.
This also happened back when we changed powers to not automatically combine. It used to happen that `x**a*x**b` always automatically became `x**(a + b)`, even though this clearly breaks rule 4 above. Removing it was easy, but then tests failed in nontrivial ways. The easiest way to fix them was to add `powsimp` calls in the right places in the code, but it would have been better to fix the algorithms themselves to not assume this.
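Today the combination is explicit; `powsimp` is the real SymPy function for it (a sketch):

```python
from sympy import symbols, powsimp

x, a, b = symbols('x a b')

# Powers with symbolic exponents no longer combine at construction time...
expr = x**a * x**b
assert expr != x**(a + b)

# ...combining them is now an explicit request.
assert powsimp(expr) == x**(a + b)
```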