# Design Notes: Evaluating expr that results in `nan` or `oo`

These are notes on possible solutions to a problem I'm having in evaluating expressions in `sympy.physics.mechanics`. They are here for review by the community, and possible suggestions.

# The Problem:

When evaluating symbolic expressions at points, sometimes the result is `nan` or `oo` unless the expression is fully simplified:

``````>>> expr = sin(x)/tan(x)
>>> expr.subs(x, 0)
nan
>>> simplify(expr).subs(x, 0)
1
``````

For small expressions this simplification is acceptable, but for large expressions (like the kind often generated in `sympy.physics.mechanics` this operation is too time consuming. I have come up with a few proposed solutions, which I will outline here:

## Selective Simplification

Only apply `simplify` to expressions that would evaluate to `nan`. This is what I tried first. The algorithm is fairly simple, and can be done in two traverses of the expression tree:

1. Replace all `tan` with `sin/cos`. This will allow us to only have to check for denominator = 0 conditions.

2. For nodes that are fractions, check if the denominator evaluates to 0. If so, apply simplify to the fraction, then evaluate at the `subs_dict`.

This has been written up and tested. You can check out the code in this blog post (scroll down to the simplification section).

After some testing, it was found to be considerably faster than applying simplify, and then subs. However, for large expressions it consumed 4 GB of my RAM and ran for 6 hours before I had to kill it. This is due to the size of the expressions that need to be simplified. One possible solution is to run `expand` on the expression beforehand to reduce the size of each fraction, but this is also time consuming.

## CSE subs

I haven't tried this, but it's a thought. Perform `cse` on the expression, then evaluate each sub-expression. If it evaluates to `nan` simplify it, then repeat. I have doubts that this will work though, as the parts that could be simplified out will probably already be numbers due to the previous subs operations. By this I mean:

``````>>> expr = cos(x + y)*tan(x + y)
>>> cse(expr)
([(x0, x(t) + y(t))], [cos(x0)*tan(x0)])
>>> sub_dict = {x: 0, y: pi/2}
``````

In this case, x0 would evaluate fine, but then `cos(pi/2)*tan(pi/2) = nan`. In this case applying `simplify` to the final expression before subbing in `x0` would work, but my gut feeling is that there will be expressions where this won't work out. I still plan on trying this.

## Use `limit` to evaluate the troublesome expressions

Similar to the selective simplification, but instead take the limit of the troubling sub-expression. This would probably work, but there are some issues:

• Multivariable `sub_dict`:

For most cases, the sub_dict used contains many variables. These could each be evaluated individually using normal subs, and if it evaluates to `nan` then take the limit for that variable. For example (by hand):

``````>>> expr = cos(x + y)*tan(x + y)
>>> expr1 = expr.subs(x, 0)
>>> expr1
cos(y)*tan(y)
>>> expr2 = expr.subs(y, pi/2)
>>> expr2
nan
>>> expr3 = limit(expr2, y, pi/2)
>>> expr3
1
``````

This would change the algorithmic complexity, as it would require a traversal for each key in the sub_dict (where before it was done in 1 traversal by `msubs`). However, I think this may be the best option for finding a correct solution everytime.

• `limit` doesn't work with `dynamicsymbols` (sympy type `AppliedUndef`)

In `mechanics` the sub_dict is almost always composed of `{dynamicsymbol: value}`. This is less of an issue. Perhaps `limit` could be modified to work with `AppliedUndef`, or a `Dummy` pre-subbed in for the dynamicsymbol before taking the limit. This would add another traversal though, increasing runtime.

## Calculate the value by perturbation

Similar to above (check if `nan`, then apply select rule to sub-expression). Instead of applying `limit`, perturb from the sub_dict value by epsilon (some small value), then check if `expr.subs(sub_dict_plus_epsilon) == expr.subs(sub_dict_minus_epsilon)` within some specified tolerance.

This is probably a bad way to do it, because (if I understand correctly) mathematically we'd need to check all possible paths to the setpoint. For example, suppose we have some expression `f(x, y, z)`, with a setpoint `setpoint=[a, b, c]` that we'd like to evaluate it at. If we have `epsilon=1e-10`, then we'd need to check all possible combinations of `[a +/- epsilon, b +/- epsilon, c +/- epsilon]`. Also, this is more numerical, and may not be proper for sympy.

The benefit is that for we wouldn't have to traverse the expression for each key in the sub_dict, as this could be evaluated for all keys at the same time.

Note that I'm not sure if this is numerically/mathematically correct. In my mind it makes sense, but my preliminary research hasn't turned up any algorithms for calculating limits numerically.

##### Clone this wiki locally
You can’t perform that action at this time.
Press h to open a hovercard with more details.