New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a guide on writing custom functions #23488
Conversation
✅ Hi, I am the SymPy bot (v167). I'm here to help you write a release notes entry. Please read the guide on how to write release notes. Your release notes are in good order. Here is what the release notes will look like: This will be added to https://github.com/sympy/sympy/wiki/Release-Notes-for-1.11. Click here to see the pull request description that was parsed.
Update The release notes on the wiki have been updated. |
🟠Hi, I am the SymPy bot (v167). I've noticed that some of your commits add or delete files. Since this is sometimes done unintentionally, I wanted to alert you about it. This is an experimental feature of SymPy Bot. If you have any feedback on it, please comment at sympy/sympy-bot#75. The following commits add new files:
If these files were added/deleted on purpose, you can ignore this message. |
Benchmark results from GitHub Actions Lower numbers are good, higher numbers are bad. A ratio less than 1 Significantly changed benchmark results (PR vs master) Significantly changed benchmark results (master vs previous release) before after ratio
[77f1d79c] [f4691d30]
<sympy-1.10.1^0>
+ 99.2±0.5ms 180±0.7ms 1.81 sum.TimeSum.time_doit
Full benchmark results can be found as artifacts in GitHub Actions |
doc/src/guides/custom-functions.md
Outdated
If you find yourself defining an `eval` method on a `Function` subclass where | ||
you always return a value and never return `None`, you should consider just | ||
using a normal Python function instead, as there is no benefit to using a | ||
symbolic `Function` subclass in that case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is potential benefit if you can use evaluate=False
and also have printing support. Another point to consider is that maybe it should be possible to call the function with symbols as arguments so the eval method should maybe handle that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But do we really want to recommend that? evaluate=False is very fragile. It's a much better design to just avoid automatic evaluation and move it to some other method such as doit
. Even for something like printing, it can be more complicated than it might seem, because that doesn't really tell the whole story. What if you want to manipulate the expression somewhat to get it in some form to print. What if you later decide to do some stuff that isn't printing. The more code you have build up around a function, the harder it is to remove automatic evaluation from it. That's my experience with SymPy functions and I believe it is also true of user defined functions.
And even if you are just doing printing and nothing else wouldn't it still be simpler to just have a function that doesn't evaluate (in fact, you could just use a Function('f')
like I described earlier in the guide) than passing evaluate=False everywhere?
Another point to consider is that maybe it should be possible to call the function with symbols as arguments so the eval method should maybe handle that.
I'm not following this. Both a def
function and an eval
method of a Function
subclass should handle symbols as arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only benefit I can see is maybe if we implement #4787, you could have a symbolic function object f
which always evaluates. But even then, I don't know if having functions as objects that can be inside of other expressions is something that too many people really need. #4787 is important, but that's the most important part of it. Not many people request that feature.
The reason I am stressing the "don't just return an expression in eval
" and also the "use a def
function if you just want a simple shortcut" is because it's a common thing I see people do on StackOverflow and the mailing list. I think part of it is that people don't really sit down and think about what they actually want, but also part of it is that they don't completely understand how eval
works and maybe don't even realize that just returning an expression in eval
has this implication.
Finally, I would hope that everything in this guide should apply equally well to functions in library code as to user defined functions, and indeed I expect it to be useful for contributors. So I'm trying to apply the highest standard to the recommendations here. Users are of course free to do whatever they want, but we should not recommend something that we wouldn't do ourselves. The goal is to replace this wiki page, which isn't written very well https://github.com/sympy/sympy/wiki/Development-Tips-:-About-implementing-special-functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that automatic evaluation should be discouraged for symbolic functions. My suggestion though is that if someone is in the position of writing a Function subclass with an eval method that always returns maybe they should consider changing it so that the eval method does not always return so that then their function can also work with symbolic arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think which one you really want requires introspection as the whether you're looking for a symbolic function or just a shorthand. I can try to make it clearer here that there are two options.
doc/src/guides/custom-functions.md
Outdated
Note that in many cases, functions like these can be represented directly | ||
using SymPy classes. For example, the above function can be represented | ||
symbolically using `Piecewise`. The function can then be evaluated using | ||
{meth}`subs() <sympy.core.basic.Basic.subs>`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also perhaps worth mentioning Lambda
and maybe even lambdify
here:
In [19]: f = Lambda(x, Piecewise((0, Eq(x, 0)), (x+1, True)))
In [20]: f
Out[20]:
⎧ 0 for x = 0
x ↦ ⎨
⎩x + 1 otherwise
In [21]: f(0)
Out[21]: 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mentioning lambdify is a good idea. I thought about Lambda, but I'm a little unsure because I'm not sure just how well supported it is throughout SymPy. Very few things make use of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The potential advantage of Lambda
is that you get a symbolic object that can be used like a function with Python's call syntax (assuming that's what is wanted):
In [1]: eq_y = Lambda(x, Piecewise((0, Eq(x, y)), (1, True)))
In [2]: eq_y
Out[2]:
⎧0 for x = y
x ↦ ⎨
⎩1 otherwise
In [3]: eq_y(y+1)
Out[3]: 1
In [4]: eq_y(y)
Out[4]: 0
Unlike other options the Lambda
function can be used with subs
to substitute something that is not the "argument" of the function:
In [5]: eq_4 = eq_y.subs(y, 4)
In [6]: eq_4
Out[6]:
⎧0 for x = 4
x ↦ ⎨
⎩1 otherwise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing I don't understand about Lambda
is that it's an Expr
. But does it actually make sense to include a Lambda
inside of other expressions? You can't call it unless the top-level expression is a Lambda
. Does something like Lambda(...) + 1
even make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer it if Lambda
wasn't a subclass of Expr. There is a backwards way of calling it:
In [1]: f = Lambda(x, x**2)
In [2]: f
Out[2]:
2
x ↦ x
In [3]: exp(f)
Out[3]:
2
x ↦ x
ℯ
In [4]: exp(f).rcall(3)
Out[4]:
9
ℯ
I'm not sure I've ever seen that used in the wild though.
doc/src/guides/custom-functions.md
Outdated
This guide describes how to define functions that map a subset of | ||
$\mathbb{C}^n$ to $\mathbb{C}$. Functions that accept or return other kinds of | ||
objects should subclass another class, such as {class}`~.Boolean`, | ||
{class}`~.MatrixExpr`, {class}`~.Expr`, or {class}`~.Basic`. Much of what is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully this will change in the future if we can integrate the kind system into Function (#4787, which might change other aspects of this guide as well).
GitHub has added |
Nice! (Although apparently it doesn't work with quoting...) |
Very cool. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new LaTeX math support if great!
Suggest splitting this guide into several pages, to decrease page size and length. |
We can define numerical evaluation for [our example $\operatorname{versin}(x)$ | ||
function](custom-functions-versine-definition) by recursively evaluating | ||
$2\sin^2\left(\frac{x}{2}\right)$, which is a more numerically stable way of writing $1 - | ||
\cos(x)$. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See also #20321
The {meth}`doit() <sympy.core.basic.Basic.doit>` method is used to evaluate | ||
"unevaluated" functions. To define `doit()` implement `doit(self, deep=True, | ||
**hints)`. If `deep=True`, `doit()` should recursively call `doit()` on the | ||
arguments. `**hints` will be any other keyword arguments passed to the user, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be much better to have _eval_doit
so that subclasses don't need to implement recursion (#17280)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes definitely. I noticed this problem for several methods. expand
does it correctly, but most methods don't.
... | ||
... # For symbolic arguments, require m and n to be integer. | ||
... if m.is_integer is False or n.is_integer is False: | ||
... raise TypeError("m and n should be integers") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we wouldn't use eval
for type checking so that we can still check types when evaluate=False
.
doc/src/guides/custom-functions.md
Outdated
... if isint is True: | ||
... return 1 | ||
... elif isint is False: | ||
... return 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't these be sympfied
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I guess so. That's another problem with defining raw methods rather than _eval_*
methods.
... elif isint is False: | ||
... return 0 | ||
... else: | ||
... return divides(m, n) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this should use evaluate=False
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there are any cases where this can make a difference. And even if there were, I'm unclear if that would be what you'd want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea should be that we don't need to check for evaluation again. I suppose that m
and n
might now be non-integer and divides
could check that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The m
and n
here are evaluated with doit
(when deep=True), so they could be anything. In this example, the doit
is just a more advanced version of the stuff in eval
, but in general, they might not be related. I guess an optimization that can be made here is to just return self
if deep=False
since in that case the arguments are the same. But I also don't know if it's that important.
Really, I don't think evaluate=False
should ever be used. If you're using it, that's a sign that you have something in eval
that should be somewhere else like doit
. If evaluate=False is being used to prevent some evaluation, you're creating an object that breaks the function's invariants, and will be fragile (due to the nature of evaluate=False). If evaluate=False is being used just to skip unnecessary work, that's maybe a sign that eval
is doing too much. Ideally it should be so fast that it doesn't matter if it gets called again.
```py | ||
>>> class versin(Function): | ||
... def fdiff(self, argindex=1): | ||
... return sin(self.args[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps worth explicitly pointing out that argindex uses 1-based indexing rather than 0-based.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's mentioned in the text above but I'll also note it here with a comment. I decided to ignore invalid argindex
, since that should never happen from diff
and users shouldn't really ever call fdiff
(see #23490).
|
||
To leave a derivative unevaluated, raise | ||
`sympy.core.function.ArgumentIndexError(self, argindex)`. This is the default | ||
behavior if `fdiff()` is not defined. Here is an example function $f(x, y)$ that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does returning None
not work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc/src/guides/custom-functions.md
Outdated
There are many other functions in SymPy whose behavior can be defined on | ||
custom functions via a custom `_eval_*` method, analogous to the ones | ||
described above. See the documentation of the specific function for details on | ||
how define each method. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how to define
doc/src/guides/custom-functions.md
Outdated
... def _eval_is_positive(self): | ||
... x = self.args[0] | ||
... # versin(x) is positive if x is real and not an even multiple of pi | ||
... return fuzzy_and([x.is_real, fuzzy_not((x/pi).is_even)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating a new expression like x/pi
in an assumptions query is problematic. This kind of thing is bad for performance and also often leads to infinite recursion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, OK. I thought this was a pretty common pattern, but I can see how it might cause problems.
So is a better pattern to parse the expression (like with the _peeloff_pi helper in the SymPy trig functions)? That seems more complex, especially for end users. What's the simplest way to write this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This kind of thing is more common in the codebase than it should be. There might not really be a better way but there could be some discussion about this in this guide.
The problem is that creating an expression leads to evaluation which leads to assumptions queries. So if assumptions queries create expressions then we have recursion. Whenever you have recursion you need some proof that your recursion terminates. If you know that you are always creating a smaller expression than the one that you started with then that is some proof. Likewise if your assumptions queries only ever query assumptions directly on their args then you know that you are always recursing to a query on a smaller expression tree.
In this case it is probable fine to do this but for a different reason: you are recursing from a query on a versin
expression to a query on an expression from the set of expressions that do not involve versin
. If you know that the set of expressions that do not involve versin
will never create a versin
then that is safe.
Getting this right is non-trivial and there should be clear warnings about it. The general rule is:
- Ideally an assumptions query on x is resolved just by doing some direct assumptions queries on
x.args
. - Assumption queries should aboid creating any new expressions.
- If it is necessary to create any new expression then you need to think very carefully about whether this recursion is guaranteed to terminate.
In this particular case you could use as_independent
to do a structural check that you do actually have a Mul
involving pi
:
coeff, pi_ = x.as_independent(pi, as_Add=False)
if pi_ is pi:
return fuzzy_and([x.is_real, fuzzy_not(coeff.is_even)])
elif x.is_real is False:
return False
Here we always know that as_independent
only creates smaller expressions. At worst it returns the original expression as the coeff and 1 for pi_
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I do remember seeing issues in the past with that sort of thing. It's not something that we can really show with this example, because versin
isn't used anywhere else in the codebase, so it's impossible for it to sneak in. But it can happen if someone is editing assumptions for a more core function in SymPy.
Actually I don't even think it's possible to "think very carefully" about whether the recursion will terminate. It really depends on what assumptions queries are being done elsewhere, and what expressions might create other expressions, which you cannot possibly know unless you've read the whole codebase. It's tough though because there's a lot of advanced logic in other class' assumptions handlers and you really want to be able to reuse that whenever possible.
This really is a major design flaw in the old assumptions. It makes the handlers implement both the specification of what the predicates are and the actual method by which they are resolved. That leads to problems like this where a handler would need to have global information to do that properly, but it can't. It also makes certain optimizations in the inference impossible, even basic ones like not evaluating the same inference twice. Contrast that with the new assumptions where the predicates are specified symbolically and a global solver determines how to resolve everything and which sub-predicates need to be evaluated. Global solvers (like SAT solvers) can handle cyclical inference chains just fine without infinite recursion:
>>> list(satisfiable(And(x >> y, y >> z, z >> x), all_models=True))
[{y: False, z: False, x: False}, {y: True, z: True, x: True}]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I don't even think it's possible to "think very carefully" about whether the recursion will terminate
In general no it isn't possible if arbitrary code can be in any of the handlers. However it is possible if we limit what can go into the handlers. Two rules are enough but they need to be applied universally:
- Assumptions queries should not create new expressions.
- A query on expr should be resolved only by querying assumptions on the args of expr (and not other queries on expr itself).
With these two rules we always know that any assumptions query only ever recurses to queries on smaller expressions and therefore the recursion must at some point terminate.
This really is a major design flaw in the old assumptions
The flaw is expecting the old assumptions to be something that they are not. They are relatively well defined for their purpose which is to be a limited inference system for use in evaluation. Remember that much of what they do is not really about "assumptions" in the sense of assuming things about symbols but actually just about computing basic properties often of numeric expressions.
Review comments have been addressed. I used your example for the It's a little unfortunate that we have to use |
doc/src/guides/custom-functions.md
Outdated
... def _eval_is_nonnegative(self): | ||
... # versin(x) is nonnegative iff x is real | ||
... x = self.args[0] | ||
... return x.is_real |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to be more careful here:
In [12]: (1 - cos(pi+I*pi))
Out[12]: 1 + cosh(π)
If x
is not real then that does not imply that versin(x)
is not nonnegative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah good catch. Of course, the real range is just [0, 2], so there are complex values that map it to (2, oo).
Really I would like to just write (1 - cos(x)).is_real
here. If you look in the guide, a lot of the methods do just delegate to 1 - cos(x)
. Partly that is just me being lazy/keeping the guide simple. But I also do think there are legitimate reasons why you'd want to do that in general. If something can be reduced to something already implemented somewhere else, you'd like to be able to use it. It's generally going to be more robust to reuse something already in SymPy as a user since it's going to be way better tested, and even for library code, it reduces duplication which also minimizes bugs. So it's really unfortunate to have a "no recursive expressions" rule, which I agree is better for performance and avoiding infinite recursion, but makes it harder to avoid rewriting hard mathematical logic.
But also, one of the use-cases for custom functions that this guide talks about is functions that are shorthands for another function (the see FMA example in the guide). In that case, you have a function that represents another expression and evaluates with doit.
Maybe we just need to have a specific way to create something like this, since the basic pattern is
def Unevaluated(Function):
@classmethod
def eval(cls, *args):
# Evaluate to <expression> on explicit numerical inputs
def doit(self, deep=True):
return <expression>
def _eval_whatever(self):
return <expression>._eval_whatever()
where the last part is basically the same for every possible method that you would want to define. This whole thing can be wrapped up into a single Function
subclass which you could subclass to specify the expression you want to represent, and only override methods if you don't want the default behavior (like defining a custom code printer).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes it harder to avoid rewriting hard mathematical logic.
It does but really I think the problem is the use of classes with automatic evaluation:
- The mathematical logic is all tied to instances so you need an instance to be able to call the existing code.
- The instances evaluate whenever you create them and actually most of the mathematical logic is in the evaluation code (so using evaluate=False probably doesn't work).
- We don't want to run all of that that code every time there is an assumptions query because pretty much anything triggers an assumptions query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See also #23693 where creating expressions breaks assumptions queries under evaluate(False)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made #23701 about the wrapper function superclass idea
Any more comments on this? |
... return True | ||
... | ||
... def _eval_is_positive(self): | ||
... # versin(x) is positive iff x is real and not an even multiple of pi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The example above shows that this is not an if and only if.
>>> from sympy import sin | ||
>>> class versin(Function): | ||
... def _eval_evalf(self, prec): | ||
... return (2*sin(self.args[0]/2)**2)._eval_evalf(prec) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is okay but in general I would prefer not to invoke symbolic evaluation as part of numeric evaluation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not following. What is your concern here?
|
||
`FMA` also represents an example of a continuous function defined on multiple | ||
vriables, which demonstrates how `argindex` works in the | ||
[`fdiff`](custom-functions-differentiation) example. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
variables
... # If x is an integer multiple of pi, x/pi will cancel and be an Integer | ||
... n = x/pi | ||
... if isinstance(n, Integer): | ||
... return 1 - (-1)**n |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's worth discussing the use of sympify somewhere. I think you've changed it so that both the inputs and outputs of this function are automatically sympified. That's very different from other Basic subclasses so it should probably be clarified somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With things being sympified automatically it's not something you really need to worry about. I'd love for all Basic
subclasses and methods to work like that, and also for all classes to more generally use an eval
pattern rather than overriding __new__
. There are too many bugs caused by people forgetting to sympify inputs/outputs, plus people tend to use sympify instead of _sympify when they do it themselves.
I had a couple of comments but I think this looks good in any case. |
Actually I meant to leave that for you to merge but hit the wrong button... As I said I think it looks good! |
These were left as comments on sympy#23488 but were not addressed before it was merged.
Followup at #23732 |
This is still a work in progress, but early reviews are always welcome.
Reminder to reviewers that you can use the PR docs preview by clicking "Click here to see a preview of the documentation." in the checks list below.
References to other Issues or PRs
Brief description of what is fixed or changed
Other comments
Release Notes