An unevaluated expression is any mathematical expression that is represented
as closely as possible to the way it was input. For example, Mul(3, 4) is
unevaluated if it remains as an expression tree Mul(3, 4). It is evaluated
if it becomes 12. This is closely related to the notion of [[automatic
simplification]].
At its core, an unevaluated expression is simply an expression tree. By remaining unevaluated, it becomes more structural and in some sense loses its mathematical meaning, yet in another sense it keeps it (I will make this clearer below).
The classic way to create an unevaluated expression in SymPy is to call the
class constructor with evaluate=False:

    >>> Mul(3, 4, evaluate=False)
    3*4
The main problem with this is that it is extremely verbose. Imagine creating the
expression (1 + 2*3)/2 like this (remember that /2 is really multiplication by
Pow(2, -1)).
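To make the verbosity concrete, here is a minimal sketch of building `(1 + 2*3)/2` one node at a time. It assumes the division is spelled as multiplication by `Pow(2, -1)`, which is how SymPy represents division in general; the variable names are only illustrative.

```python
from sympy import Add, Integer, Mul, Pow

# Every constructor call needs its own evaluate=False.
product = Mul(Integer(2), Integer(3), evaluate=False)    # 2*3
numerator = Add(Integer(1), product, evaluate=False)     # 1 + 2*3
half = Pow(Integer(2), Integer(-1), evaluate=False)      # 2**-1, i.e. "/2"
expr = Mul(numerator, half, evaluate=False)              # (1 + 2*3)/2

print(expr)         # still an unevaluated tree
print(expr.doit())  # doit() evaluates the tree again
```

Four constructor calls and four flags are needed for an expression that takes eleven characters to type.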
Some simpler methods have been proposed. One is the evaluate context
manager. This sets a global flag in the core that makes evaluate=False
become the default.

    >>> with evaluate(False):
    ...     print(Mul(3, 4))
    3*4
The advantage here is that this works even with operators (assuming the arguments are sympified, of course):

    >>> with evaluate(False):
    ...     print(Integer(3)*4)
    3*4
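Because the flag is global, it matters that it is scoped to the `with` block. The following sketch checks that the same operator call is left alone inside the block and evaluated again once the block exits.

```python
from sympy import Integer, Mul, evaluate

with evaluate(False):
    inside = Integer(3) * 4    # built while the global flag is off

outside = Integer(3) * 4       # flag restored on exit, so this evaluates

print(type(inside).__name__)   # the operator produced a Mul node
print(outside)
```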
The disadvantage here is that every object that supports evaluate=False must
adhere to this global flag. This gets at a core problem with evaluate=False,
which I will expand upon below: it isn't subclassable.
A second method is sympify(evaluate=False). This works at the parser level
to parse a string with the specific classes, using evaluate=False. For
example, sympify('3*4', evaluate=False) is converted (using an AST
transformer) to Mul(Integer(3), Integer(4), evaluate=False).
The advantage here is that arguments are sympified automatically. The disadvantage is that it only works for those classes that are known by the AST transformer.
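A short sketch of the parser-level approach, comparing the same string parsed with and without the flag:

```python
from sympy import Mul, sympify

unevaluated = sympify("3*4", evaluate=False)   # parsed into a Mul node
evaluated = sympify("3*4")                     # default: collapses to 12

# The integer literals were sympified automatically by the parser.
print(type(unevaluated).__name__, unevaluated.args)
print(evaluated)
```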
Which brings us to the chief disadvantage of
evaluate=False: not all classes
support it. The real problem here is that the flag is not required by the
Basic superclass (or the metaclass). And it isn't part of the [[args
invariant]]. So classes are free to not support it, and indeed, quite a few
don't. Quite a few do, including the common classes in the core, and anything
that subclasses from
Idea 1: Require evaluate=False support everywhere
This is the simplest solution architecturally. It would be somewhat annoying to implement, and it would be a rather large change in the sense that it would add something to the args invariant.
The main downside here is that
evaluate=False isn't really the best possible
design in the first place, as noted above. It is verbose. To work well it
should respect the global flag.
Idea 2: Bypassing the constructor completely.
It is possible to create any
Basic subclass in a truly unevaluated way
simply by doing
Basic.__new__(cls, *args). The key issue here is invariants,
which I want to discuss now.
An invariant is any statement that is true of every possible instance of a
particular class. For example, every
Basic subclass should satisfy the basic
args invariants. Even more simply, every
Basic subclass is hashable. In
order for an invariant to hold, it generally must be maintained in the
constructor. This is not true of all invariants (for instance, some invariants
hold simply by virtue of a method being defined on a class), but for the
purposes of this discussion, I will only consider those invariants that are
maintained by the constructor, since that is what we wish to bypass.
Invariants in SymPy can range from the very simple to the mathematically complex. The basic thesis of the automatic simplification article is that invariants from automatic simplifications shouldn't be too complicated.
Example of a very simple invariant: sin only has exactly one argument:

    >>> sin(x) # allowed
    >>> sin(x, y) # not allowed
Example of a more complicated invariant: the argument of sin is not an
integer multiple of pi:

    >>> sin(2*pi) # The resulting object is not `sin`
    0
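Both invariants can be checked directly. The sketch below also shows that `evaluate=False` keeps the simple invariant (the argument count) while deliberately breaking the mathematical one.

```python
from sympy import pi, sin, symbols

x, y = symbols("x y")

# Simple invariant: a sin object has exactly one argument.
try:
    sin(x, y)
    arity_enforced = False
except TypeError:
    arity_enforced = True

# Mathematical invariant: sin of an integer multiple of pi is never a sin.
evaluated = sin(2 * pi)                 # collapses to 0
kept = sin(2 * pi, evaluate=False)      # stays a sin object

print(arity_enforced, evaluated, type(kept).__name__)
```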
The reason why I am talking about these things as invariants rather than
automatic simplifications and type checking, is that any code that receives a
sin object can assume, whether explicitly or implicitly, that these facts
will be true about it. For instance, a function may process a sin object
simply by looking at its
.args, knowing that
.args is a tuple of length
exactly 1. Another function might assume that a
sin object is fully
simplified. In general, the simple assumptions tend to be more explicit,
whereas the more complicated, mathematical assumptions are subtler. They tend
to only reveal themselves if you try to reverse the invariant, either by
removing the automatic simplification or by creating an unevaluated object.
Finally, the most basic invariant of all,
obj.func(*obj.args) == obj, is
broken by unevaluated expressions.
An unevaluated object, by definition, breaks the invariants of the underlying class (except in the trivial cases).
This causes a lot of problems in practice. In the simplest case, an unevaluated object cannot be rebuilt. Many functions assume that any object can be rebuilt from its func and args, and the result is that passing an unevaluated object to these functions "evaluates" the object. Historically, this has been seen most often in the printers, since they are the most common functions called on any given object, especially in an interactive session, where unevaluated expressions are most likely to be used.
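The rebuild problem is easy to reproduce. In this sketch, rebuilding an ordinary expression round-trips, while rebuilding an unevaluated `Mul` silently evaluates it.

```python
from sympy import Integer, Mul, symbols

x = symbols("x")

evaluated = 3 * x
assert evaluated.func(*evaluated.args) == evaluated   # invariant holds

unevaluated = Mul(Integer(3), Integer(4), evaluate=False)
rebuilt = unevaluated.func(*unevaluated.args)         # this is just Mul(3, 4)

print(unevaluated, "->", rebuilt)
```

Any function that walks a tree and reconstructs nodes this way will evaluate the expression as a side effect.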
Now to the idea of using
Basic.__new__(cls, *args). This is a complete
nuclear option. Unlike
cls(*args, evaluate=False), there is no way for
cls's constructor to do anything at all with args. For example:

    >>> sin(x, y, evaluate=False)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "./sympy/core/function.py", line 438, in __new__
        'given': n})
    TypeError: sin takes exactly 1 argument (2 given)
    >>> Basic.__new__(sin, x, y)
    sin(x, y)
Now, I am not very worried about the type checking side of things, such as the
implications of an expression like
sin(x, y). The general
rule in SymPy is garbage in, garbage
out, meaning if
someone creates an expression that is mathematically nonsensical, one should
expect to get nothing better than nonsensical results out (or an exception).
But this shows that
Basic.__new__ really skips everything. This is a
problem, as quite a few objects with
evaluate=False do maintain some very
basic invariants, generally ones that don't really affect the unevaluated-ness
of the resulting expression (like the setting of assumptions).
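As a sketch of how completely the constructor is skipped, the object below is created through `Basic.__new__` and then rebuilt the usual way. The rebuild goes through the `sin` constructor again, so the argument check that was bypassed at creation time fires.

```python
from sympy import Basic, sin, symbols

x, y = symbols("x y")

bypassed = Basic.__new__(sin, x, y)   # no checks run at all

try:
    bypassed.func(*bypassed.args)     # same as calling sin(x, y)
    rebuild_failed = False
except TypeError:
    rebuild_failed = True

print(bypassed.args, rebuild_failed)
```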
The suggestion to make
sympify do this with all classes is issue
13999. This is somewhat
worrisome as it means that expressions cannot be trusted in any functions,
unless they are first "rebuilt" (e.g., using
A potential midway solution here would be a separate constructor
for unevaluated expressions. For example,
would be the same as
cls(*args, evaluate=False). This would enforce the most
basic invariants that don't relate to unevaluated-ness, but which prevent
simple bugs from occurring. The advantage of a separate
constructor is that a default method could be implemented on Basic, and hence
it would be enforceable to some degree from the superclass. It would
also make it much more explicit as to what is a true object invariant and what
is only a constructor (evaluated) invariant.
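A rough sketch of what such a constructor might look like, written as a free function. The name `new_unevaluated` is hypothetical and not part of SymPy, and this version only works for classes that already accept `evaluate=False`; a real default on Basic would need a fallback for the rest.

```python
from sympy import Mul, sympify

def new_unevaluated(cls, *args):
    """Hypothetical helper, not a SymPy API.

    Sympifies the arguments, then defers to the class's own
    evaluate=False path so that basic invariants are still enforced.
    """
    sympified = [sympify(arg) for arg in args]
    return cls(*sympified, evaluate=False)

product = new_unevaluated(Mul, 3, 4)
print(product, product.args)
```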
Idea 3: Separate classes for unevaluated expressions
The other idea is to not try to pretend that a
Mul(3, 4) object can exist.
Sure, it can exist by using
Mul(3, 4, evaluate=False) or
Basic.__new__(Mul, S(3), S(4)), but who knows where it will work and where
it will break. After all, a normal
Mul always has all Numbers (integers,
rationals, and floats) combined in the first argument. So who knows what
functions would make this invalid assumption on such an object, and what kinds
of bugs or even wrong results would ensue.
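The numeric-coefficient assumption can be observed directly. This sketch compares a normal `Mul` with an unevaluated one, then calls `as_coeff_Mul`, one method whose answer depends on where the numbers sit in the arguments.

```python
from sympy import Integer, Mul, symbols

x = symbols("x")

normal = Mul(3, x, 4)                  # numbers are combined up front
unevaluated = Mul(Integer(3), x, Integer(4), evaluate=False)

# The unevaluated object carries two separate numeric arguments.
numbers = [arg for arg in unevaluated.args if arg.is_Number]

# The full numeric coefficient is 12, but that is not what comes back here.
coeff, rest = unevaluated.as_coeff_Mul()

print(normal.args, unevaluated.args, coeff)
```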
The main problem with the current strategy of reusing the existing classes for unevaluated objects is that some things might still work (if they happen to not care about the broken invariants), and some things won't. But there's no way to really tell without either testing them or auditing the code.
Instead of reusing the classes, another option would be to use separate classes entirely for unevaluated expressions. These classes would be very basic expression trees, which do not know anything about their mathematical representation.
Here, any function that wants to operate on unevaluated expressions would need
to know about these classes. The printers obviously would need to, but likely
some other functions would as well. Most users of unevaluated expressions want
to perform some mathematics on those expressions. This is the key tension, as
on the one hand they want to be able to represent something like
1 + 2*3 as
unevaluated, but on the other they want SymPy to be able to do mathematics on
it.
Which functions should support such unevaluated classes directly is unclear to me. In general, you could always convert an unevaluated expression to an equivalent evaluated one and operate on that.
The advantage here is that a function that doesn't know about unevaluated
expressions would simply treat the unevaluated classes as an unknown function. For example,
UnevaluatedAdd(1, 2) would be treated the same as
This would avoid wrong results (except for "wrong" results in the sense of
functions not doing what they are "supposed" to do on unevaluated expressions).
The basic tradeoff here is:
Reuse existing classes: many things "just work", but wrong results and accidental evaluation are possible.
Separate classes: wrong results and accidental evaluation are impossible, but nothing works unless explicitly designed to.
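The "separate classes" side of the tradeoff can be approximated today with an undefined function, which SymPy already treats as opaque. In this sketch, `UnevaluatedAdd` is only a stand-in name created with `Function`; it is not an existing SymPy class.

```python
from sympy import Derivative, Function, symbols

x = symbols("x")

# Hypothetical stand-in: an undefined function with no mathematical meaning.
UnevaluatedAdd = Function("UnevaluatedAdd")

expr = UnevaluatedAdd(1, x)
rebuilt = expr.func(*expr.args)   # rebuilding is safe: nothing evaluates
derivative = expr.diff(x)         # diff cannot see through the node

print(expr, derivative)
```

The rebuild invariant holds and nothing evaluates by accident, but `diff` returns an unevaluated `Derivative` rather than the derivative of `1 + x`.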
A proof-of-concept implementation of this idea is UnevaluatedExpr in
sympy.core.expr. This works by wrapping the expression (so there isn't a
separate class for every possible expression class), and defining
Currently only the printers know about it. This design allows unevaluated
expressions to be used only partially. For instance, you could have
1 + UnevaluatedExpr(2) + x and it will behave functionally the same as
Add(1, 2, x, evaluate=False).
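A small sketch of the wrapper in use. Only the wrapped part is protected, and `doit()` removes the wrapper and evaluates.

```python
from sympy import UnevaluatedExpr, symbols

x = symbols("x")

partial = 1 + UnevaluatedExpr(2) + x    # the 2 is kept apart from the 1
masked = x * UnevaluatedExpr(1 / x)     # x and 1/x do not cancel

print(partial, "->", partial.doit())
print(masked, "->", masked.doit())
```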
The downside of this is that it works "inwards" only. UnevaluatedExpr(1) + 0
gives UnevaluatedExpr(1). You have to wrap the objects that might be
evaluated. In general, creating a larger expression from an UnevaluatedExpr
could result in evaluation. Only those parts that are wrapped are "masked".
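The following sketch shows the enclosing `Add` still simplifying around the wrapped parts: the unwrapped `0` is dropped, and two identical wrapped terms are collected into a product.

```python
from sympy import UnevaluatedExpr

one = UnevaluatedExpr(1)
two = UnevaluatedExpr(2)

absorbed = one + 0       # the outer Add drops the unwrapped zero
collected = two + two    # the outer Add collects like terms

print(absorbed, collected)
```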
This method is very similar to the existing workaround of creating Symbols for
numbers, such as Symbol('1') + Symbol('2'), except it is easier to later
evaluate the expression.
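For comparison, here is a sketch of the Symbol workaround. Evaluating later requires an explicit substitution for every disguised number, whereas `UnevaluatedExpr` only needs `doit()`.

```python
from sympy import Symbol

one, two = Symbol("1"), Symbol("2")

masked = one + two                           # just a sum of two symbols
evaluated = masked.subs({one: 1, two: 2})    # must map each symbol back by hand

print(masked, "->", evaluated)
```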