
EPSILON is a bad error margin and should not be recommended [float_cmp] #6816

Open
CAD97 opened this issue Mar 1, 2021 · 11 comments · May be fixed by #11948
Labels
A-documentation (Area: Adding or improving documentation), C-bug (Category: Clippy is not doing the correct thing), good-first-issue (These issues are a good way to get started with Clippy), L-suggestion (Lint: Improving, adding or fixing lint suggestions)

Comments

@CAD97
Contributor

CAD97 commented Mar 1, 2021

[f32|f64]::EPSILON is the machine epsilon of the type, or (as stated in the Rust docs) the difference between 1.0 and the next larger representable floating point number.

The page linked in the more info specifically says that abs( a - b ) < epsilon is wrong for any fixed value of epsilon. However, it's especially egregious with f__::EPSILON, because for floating point numbers outside the range -2..=2, two distinct values can never be f__::EPSILON close, so abs( a - b ) < f__::EPSILON is actually equivalent to a strict equality check.
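To make that concrete, here's a small sketch showing that at magnitude 100, even the nearest representable neighbor fails the EPSILON check, so the check degenerates into strict equality (the neighbor is computed via bit manipulation so it runs on stable Rust):

```rust
fn main() {
    let a: f64 = 100.0;
    // Next representable f64 above `a`, via its bit pattern.
    let b = f64::from_bits(a.to_bits() + 1);
    assert_ne!(a, b);
    // The gap between adjacent floats near 100.0 (~1.4e-14) dwarfs
    // f64::EPSILON (~2.2e-16), so even the nearest neighbor is rejected:
    assert!((a - b).abs() > f64::EPSILON);
    // i.e. for magnitudes >= 2 the "epsilon check" is just strict equality:
    assert_eq!((a - b).abs() < f64::EPSILON, a == b);
}
```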

There isn't a generally applicable solution to recommend. The most thorough resource I've found suggests comparison in ULPs when testing against a non-zero number, and testing against a fixed epsilon (but one bigger than f__::EPSILON) when comparing against zero.
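As an illustration of the ULP idea (a sketch, not the linked article's actual code), same-sign finite floats can be compared by the distance between their bit patterns, which is monotone in value:

```rust
/// Distance in units of least precision (ULPs) between two finite floats of
/// the same sign; a sketch of the ULP-comparison idea, not production code.
fn ulps_apart(a: f64, b: f64) -> Option<u64> {
    if !a.is_finite() || !b.is_finite() || a.is_sign_positive() != b.is_sign_positive() {
        return None; // ULP distance across zero or infinity needs more care
    }
    // For same-sign finite floats, bit patterns order the same way values do.
    Some(a.to_bits().abs_diff(b.to_bits()))
}

fn main() {
    let sum = 0.1_f64 + 0.2;
    assert_ne!(sum, 0.3);                       // the classic inequality
    assert_eq!(ulps_apart(sum, 0.3), Some(1));  // yet they are adjacent floats
}
```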

At the least, we (and probably std) shouldn't be recommending comparing against f__::EPSILON, as it's basically as poor as bitwise equality and gives a false sense of handling the problem, when it isn't really handled.

@llogiq
Contributor

llogiq commented Mar 1, 2021

When I was still writing float-heavy code, I always had a function to calculate a suitable epsilon-value based on the values to compare and the number of mantissa bits that I expected to be equal.
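A hypothetical sketch of that idea (not llogiq's actual function): scale the tolerance to the inputs' magnitude, trusting only some number of the f64 mantissa's 52 bits to agree:

```rust
/// Hypothetical sketch: a tolerance scaled to the inputs, trusting only
/// `agreeing_bits` of the 52 explicit f64 mantissa bits.
fn tolerance(a: f64, b: f64, agreeing_bits: i32) -> f64 {
    let scale = a.abs().max(b.abs());
    // Disagreement confined to the low mantissa bits is at most
    // about scale * 2^-agreeing_bits.
    scale * 2.0_f64.powi(-agreeing_bits)
}

fn main() {
    // Trusting ~40 of 52 mantissa bits at magnitude 1000:
    let eps = tolerance(1000.0, 1000.0, 40);
    assert!(eps > f64::EPSILON); // far looser than a bare machine-epsilon check
    assert!(eps < 1e-8);         // but still tight in absolute terms
}
```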

@camsteffen
Contributor

So then should we lint against abs(a - b) < f__::EPSILON?

@camsteffen camsteffen added the A-documentation, C-bug, good-first-issue, and L-suggestion labels on Mar 2, 2021
@camsteffen
Contributor

This also has implications for float_equality_without_abs.

@CAD97
Contributor Author

CAD97 commented Mar 2, 2021

So then should we lint against abs(a - b) < f__::EPSILON?

I'm not completely certain. In cases where a and b are expected to be "small" (for some value of "small" smaller than 1), this might be a reasonable comparison epsilon, because ULPs for small enough floats are really small, and the machine epsilon might be a reasonable bound for "small enough". At the same time, the article I linked recommends some small multiple of the machine epsilon for these cases.

That said, I do think that abs(a - b) < f__::EPSILON most likely came about by an unfortunately misguided attempt to address the "imprecise floats problem." As such, I think a lint against specifically < f__::EPSILON is fair, so long as it doesn't fire for small multiples of the machine epsilon, as that shows that some amount of thought went into picking the comparison epsilon.

To be extra pedantic, a pedantic lint against abs(a - b) < ANY_FLOAT_CONST where a and b are function arguments could be argued for, pointing out how a fixed epsilon is subtly wrong for arbitrary floating point input due to its varying precision.

I don't have any strong opinions on linting float comparisons, though, beyond not recommending comparison against machine epsilon.

@ghost

ghost commented Feb 16, 2022

The old classic "What Every Computer Scientist Should Know About Floating-Point Arithmetic" says that when a real number is rounded to the closest floating point number, its relative error is bounded by the machine epsilon. Note that this is relative error, not absolute.

So based on that, I would expect the correct comparison to be abs( (a - b)/a ) <= epsilon for a != 0. When a is zero, I guess you could use fxx::MIN_POSITIVE.
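A sketch of that relative-error check; the eps here is an arbitrary illustrative multiple of machine epsilon, and the absolute fallback at zero is a placeholder (picking a bound there is genuinely application-dependent):

```rust
/// Sketch of the relative-error test described above. `eps` should be a
/// small multiple of f64::EPSILON; the fallback at a == 0 is a placeholder.
fn approx_eq_rel(a: f64, b: f64, eps: f64) -> bool {
    if a == 0.0 {
        b.abs() <= eps // placeholder: application-dependent in practice
    } else {
        ((a - b) / a).abs() <= eps
    }
}

fn main() {
    let eps = 4.0 * f64::EPSILON; // illustrative choice
    assert!(approx_eq_rel(0.1 + 0.2, 0.3, eps));     // 1 ULP apart: accepted
    assert!(!approx_eq_rel(1.0, 1.0 + 1e-9, eps));   // genuinely different
}
```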

@ghost

ghost commented Feb 16, 2022

Oh, it looks like fxx::MIN_POSITIVE is the smallest positive normal value. So that wouldn't be right.

@trentj

trentj commented Feb 18, 2023

This bad suggestion came up on URLO.

@CAD97
Contributor Author

CAD97 commented May 10, 2024

This came up on IRLO again and will likely continue to do so until this is somehow categorically fixed.

@curoli

curoli commented May 10, 2024

Is it that hard? The test should be:

2.0*abs(a-b) <= eps*(abs(a) + abs(b))

where eps is the relative precision the user needs. If eps is around f__::EPSILON or smaller, it becomes strict equality, which most algorithms will fail to achieve, so you want eps to be larger than f__::EPSILON, probably at least twice as large.
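That test translates directly to Rust; the eps used below is an arbitrary illustrative multiple of f64::EPSILON, per the suggestion that it be larger than machine epsilon:

```rust
/// The symmetric test quoted above: 2|a - b| <= eps * (|a| + |b|).
fn close(a: f64, b: f64, eps: f64) -> bool {
    2.0 * (a - b).abs() <= eps * (a.abs() + b.abs())
}

fn main() {
    let eps = 4.0 * f64::EPSILON; // somewhat larger than machine epsilon
    assert!(close(0.1 + 0.2, 0.3, eps));     // 1 ULP apart: accepted
    assert!(!close(1.0, 1.0 + 1e-9, eps));   // genuinely different: rejected
    assert!(close(0.0, 0.0, eps));           // both sides zero: 0 == 0 passes
}
```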

But before we do any of the above, consider alternatives:

  • Use integers if possible. Some calculations, like taxes, wages, bank statements, or election results, should not be implemented using floating point, but with fixed-point numbers, i.e. integers.
  • If you are checking whether a number is zero to see if you can divide by it, or whether the determinant of a matrix is zero to see whether it can be inverted, keep in mind that you will often get inaccurate results if these values are close to zero. Consider alternative algorithms. Consider recognizing that your algorithm does not work in all cases.
  • If you have an iterative algorithm that converges to a solution, instead of checking if the solution is good, check whether iterations make it better or not. Even better if you can figure out in advance how many iterations you need.
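The first alternative in action: integer cents stay exact where the float equivalent does not:

```rust
fn main() {
    // Fixed point per the first bullet: ten cents, three times, stays exact.
    let dime_cents: i64 = 10;
    assert_eq!(dime_cents * 3, 30);
    // The same arithmetic in floats picks up representation error:
    assert_ne!(0.1_f64 * 3.0, 0.3);
}
```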

@curoli

curoli commented May 10, 2024

Also, I'd say use f64 instead of f32. Now that most architectures are 64 bit, I'm not sure using f32 buys you anything, and the added precision of f64 makes things way more robust.

@jedbrown

A backward-stable algorithm evaluating a function $f$ with condition number $\kappa$ has a relative error bounded by $\kappa \, \epsilon_{\text{machine}}$. In many cases for which $f(x) = 0$, the condition number blows up. For example, the algorithm |x| (1.0 + x) - 1.0 is unstable because its second operation has unbounded condition number as $x \to 0$, and indeed produces relative error of size 1. We can't give reliable advice for a (relative or absolute) tolerance without knowing the condition numbers involved. In the above example, the condition number of the entire function is 1 (great!) but the algorithm is unstable and thus violates any useful relative error bound. It would be nice if users always wrote backward-stable algorithms, but it's going to be really confusing if the lint is telling people to test with a tolerance that can only be achieved by doing so. When the functions are differentiable, we could estimate a condition number and provide diagnostics about numerical stability using Enzyme (cf. rust-lang/rust#124509).
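The instability is easy to reproduce: mathematically the closure computes the identity, but for tiny x the intermediate rounding destroys all relative accuracy:

```rust
fn main() {
    // The unstable algorithm from the comment above; mathematically f(x) = x.
    let f = |x: f64| (1.0 + x) - 1.0;
    let x = 1e-20_f64;
    // 1.0 + 1e-20 rounds to exactly 1.0, so the result collapses to zero:
    assert_eq!(f(x), 0.0);
    // Relative error |f(x) - x| / |x| is 1: no tolerance choice rescues this.
    assert_eq!(((f(x) - x) / x).abs(), 1.0);
}
```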
