notes/fp-myths.lyx

#LyX 2.3 created this file. For more info see http://www.lyx.org/
\lyxformat 544
\begin_document
\begin_header
\save_transient_properties true
\origin unavailable
\textclass article
\begin_preamble
\newcommand{\fl}{\operatorname{fl}}
\end_preamble
\use_default_options true
\maintain_unincluded_children false
\language english
\language_package default
\inputencoding auto
\fontencoding global
\font_roman "default" "default"
\font_sans "default" "default"
\font_typewriter "default" "default"
\font_math "auto" "auto"
\font_default_family default
\use_non_tex_fonts false
\font_sc false
\font_osf false
\font_sf_scale 100 100
\font_tt_scale 100 100
\use_microtype false
\use_dash_ligatures false
\graphics default
\default_output_format default
\output_sync 0
\bibtex_command default
\index_command default
\paperfontsize default
\spacing single
\use_hyperref false
\papersize default
\use_geometry true
\use_package amsmath 2
\use_package amssymb 2
\use_package cancel 1
\use_package esint 1
\use_package mathdots 1
\use_package mathtools 1
\use_package mhchem 1
\use_package stackrel 1
\use_package stmaryrd 1
\use_package undertilde 1
\cite_engine basic
\cite_engine_type default
\biblio_style plain
\use_bibtopic false
\use_indices false
\paperorientation portrait
\suppress_date false
\justification true
\use_refstyle 1
\use_minted 0
\index Index
\shortcut idx
\color #008000
\end_index
\topmargin 1in
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\paragraph_indentation default
\is_math_indent 0
\math_numbering_side default
\quotes_style english
\dynamic_quotes 0
\papercolumns 1
\papersides 1
\paperpagestyle default
\tracking_changes false
\output_changes false
\html_math_output 0
\html_css_as_file 0
\html_be_strict false
\end_header

\begin_body

\begin_layout Title
Some myths about floating-point arithmetic
\end_layout

\begin_layout Author
Steven G.
 Johnson
\end_layout

\begin_layout Standard
(This list is adapted from the 
\begin_inset Quotes eld
\end_inset

prevalent misconceptions about floating-point arithmetic
\begin_inset Quotes erd
\end_inset

 by William Kahan's 2004 presentation 
\begin_inset Quotes eld
\end_inset

How Java’s Floating-Point Hurts Everyone Everywhere.
\begin_inset Quotes erd
\end_inset

)
\end_layout

\begin_layout Standard
As in Trefethen's book, we denote floating point operations by 
\begin_inset Formula $\oplus,\otimes,\ldots$
\end_inset

.
 We denote the set of floating-point numbers by 
\begin_inset Formula $\mathbb{F}$
\end_inset

, and 
\begin_inset Formula $\fl(x)$
\end_inset

 denotes the closest element of 
\begin_inset Formula $\mathbb{F}$
\end_inset

 to 
\begin_inset Formula $x\in\mathbb{R}$
\end_inset

.
 Assuming 
\begin_inset Formula $x$
\end_inset

 does not overflow or underflow (exceed the max/min exponent), two key facts
 are that 
\begin_inset Formula $|\fl(x)-x|\leq\epsilon|x|$
\end_inset

, where 
\begin_inset Formula $\epsilon$
\end_inset

 is the machine precision, and that (assuming IEEE 
\begin_inset Quotes eld
\end_inset

correct rounding
\begin_inset Quotes erd
\end_inset

) 
\begin_inset Formula $x\odot y=\fl(x\cdot y)$
\end_inset

 for binary operations 
\begin_inset Formula $\cdot\in\{\times,\pm,/\}$
\end_inset

.
 The other key fact is to understand that 
\begin_inset Formula $\mathbb{F}$
\end_inset

 is a specific set of rational numbers: 
\begin_inset Formula $p$
\end_inset

-digit integers multiplied by powers of two (in binary floating-point) or
 powers of 10 (in decimal floating point).
 
\end_layout

\begin_layout Standard
A number of pernicious myths about floating-point arithmetic are prevalent.
 They include:
\end_layout

\begin_layout Itemize
A unpredictable random number of order 
\begin_inset Formula $\epsilon$
\end_inset

 is added to every result.
 e.g.
 
\begin_inset Formula $1\oplus1$
\end_inset

 may give 
\begin_inset Formula $2\pm\epsilon$
\end_inset

, and 
\begin_inset Formula $0\otimes x$
\end_inset

 may give 
\begin_inset Formula $\pm\epsilon$
\end_inset

.
 
\series bold
False
\series default
.
 (e.g.
 
\begin_inset Formula $1\oplus1$
\end_inset

 always gives exactly 2, and 
\begin_inset Formula $0\otimes x$
\end_inset

 always gives exactly 0 [unless 
\begin_inset Formula $x$
\end_inset

 is 
\begin_inset Formula $\pm$
\end_inset

Inf or NaN], since 
\begin_inset Formula $2$
\end_inset

 and 
\begin_inset Formula $0$
\end_inset

 are exactly representable.)
\end_layout

\begin_layout Itemize
Integer arithmetic is more accurate than floating-point arithmetic.
 
\series bold
False
\series default
.
 (See above: integer arithmetic is performed exactly in floating-point.)
\end_layout

\begin_layout Itemize
Integer arithmetic is much faster than floating-point arithmetic.
 
\series bold
False
\series default
 on any modern general-purpose CPU.
 (Maybe true in 1980s, or on small embedded systems.)
\end_layout

\begin_layout Itemize
Computational 
\series bold
precision
\series default
 (the number of digits stored) is the same thing as the computational 
\series bold
accuracy
\series default
.
 
\series bold
False
\series default
.
 (Numbers can be much more accurate than the number of digits stored, e.g.
 integers are stored exactly, or much less accurate, e.g.
 due to error accumulation.)
\end_layout

\begin_layout Itemize
\begin_inset Quotes eld
\end_inset

Arithmetic much more precise than the data it operates upon is needless,
 and wasteful.
\begin_inset Quotes erd
\end_inset

 (Kahan) 
\series bold
False
\series default
: even if you only need 3 significant digits in the final result, you may
 need many more digits at intermediate steps.
\end_layout

\begin_layout Itemize
Floating-point arithmetic incurs rounding errors in representing typical
 decimal fractions, e.g.
 0.1 or 3.1415.
 
\series bold
True
\series default
 for 
\emph on
binary
\emph default
 floating-point, but 
\series bold
False
\series default
 for 
\emph on
decimal
\emph default
 floating-point (available in many software libraries and some hardware).
\end_layout

\begin_layout Itemize
\begin_inset Quotes eld
\end_inset

In floating–point arithmetic nothing is ever exactly 0 ; but if it is, no
 useful purpose is served by distinguishing +0 from -0.
\begin_inset Quotes erd
\end_inset

 (Kahan) 
\series bold
False.
 
\series default
(Signed zeros are useful for tracking 
\emph on
underflow
\emph default
, indicating branch cuts, and other situations.)
\end_layout

\end_body
\end_document