## Types of Numbers
### Integers
The main issues are __overflow__ and the meaning of division (Euclidean, i.e. integer, division or floating-point division, depending on the algorithm).
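
As a rough illustration of both issues, here is a minimal Python sketch. Python integers never overflow on their own, so the helper `wrap_int32` is a hypothetical function (not part of the notes) that imitates 32-bit two's-complement wrap-around.

```python
def wrap_int32(x: int) -> int:
    """Hypothetical helper: reduce x to the 32-bit range [-2**31, 2**31 - 1]."""
    return (x + 2**31) % 2**32 - 2**31

INT32_MAX = 2**31 - 1
overflowed = wrap_int32(INT32_MAX + 1)  # one past the largest value wraps around

quotient, remainder = divmod(7, 2)  # integer division: 7 = 2 * 3 + 1
floating = 7 / 2                    # floating-point division

print(overflowed)           # -2147483648
print(quotient, remainder)  # 3 1
print(floating)             # 3.5
```

Which of the two divisions is wanted depends on the algorithm, so it is worth making the choice explicit in code.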

### Floating-point numbers
Floating-point numbers are typically used to approximate real numbers.

#### Representation of floating numbers
$\pm d_1.d_2d_3d_4\ldots d_p \times \beta^n \equiv$ [significand / fraction part / mantissa] $\times$ [exponent], where  
$\beta$: base,  
$p$: precision,  
$0\leq d_i < \beta$ for every digit,  
if normalized, $d_1 \neq 0$,  
$L \leq n \leq U$.

__Underflow limit (UFL)__: the smallest positive normalized number, below which underflow occurs, is $1.00\ldots0 \times \beta^L = \beta^L$.  
__Overflow limit (OFL)__: the largest positive number, above which overflow occurs, is $[\beta-1].[\beta-1]\ldots[\beta-1] \times \beta^U = (\beta - \beta^{1-p})\times \beta^U = (1-\beta^{-p})\beta^{U+1}$.
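
The two limits can be checked numerically. The sketch below assumes Python floats are IEEE double precision ($\beta = 2$, $p = 53$, $L = -1022$, $U = 1023$); the function names `ufl` and `ofl` are made up for this illustration.

```python
import sys

def ufl(beta: int, L: int) -> float:
    """Underflow limit: 1.00...0 * beta^L."""
    return float(beta) ** L

def ofl(beta: int, p: int, U: int) -> float:
    """Overflow limit, written as (beta - beta^(1-p)) * beta^U.

    The equivalent form (1 - beta^(-p)) * beta^(U+1) is avoided here
    because beta^(U+1) itself is not representable.
    """
    return (beta - float(beta) ** (1 - p)) * float(beta) ** U

double_ufl = ufl(2, -1022)
double_ofl = ofl(2, 53, 1023)

print(double_ufl == sys.float_info.min)  # True
print(double_ofl == sys.float_info.max)  # True
```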

#### IEEE Floating Point Standard

|   | $p$ | $L$ | $U$ | decimal digits | exponent range in base 10 |
|---|---|---|---|---|---|
|single precision | 24 | -126 | 127 | 6-7 | -38 ~ +38 |
|double precision | 53 | -1022 | 1023 | 16 | -308 ~ +308 |
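
To get a feel for the difference between the two formats, the following sketch stores $0.1$ in single precision and reads it back. It assumes Python's `float` is an IEEE double and uses the standard `struct` module for the single-precision round trip.

```python
import struct
import sys

# Python floats are doubles: 53 significant bits.
double_bits = sys.float_info.mant_dig

# Store 0.1 as a 4-byte single, then read it back as a double.
single = struct.unpack("<f", struct.pack("<f", 0.1))[0]
relative_gap = abs(single - 0.1) / 0.1

print(double_bits)            # 53
print(single == 0.1)          # False
print(relative_gap <= 2**-24) # True: within half a single-precision epsilon
```

The round-tripped value agrees with $0.1$ only to about 7 decimal digits, which matches the single-precision row of the table.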

#### Rounding
In most cases we round to the nearest floating-point number. When the value lies exactly halfway between two floating-point numbers, we round to the one whose least significant digit is even (__round to even__). In binary, a tie therefore always rounds to the neighbor whose last bit is 0 (as 0 is "even").
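
A small sketch of round to even, assuming IEEE double arithmetic in the default rounding mode. Python's built-in `round` also breaks ties toward the even neighbor, which makes the rule easy to see in decimal; the second half shows the same rule on the last bit of a binary significand.

```python
# Ties in decimal: halves are exactly representable, so these are true ties.
decimal_ties = [round(x) for x in (0.5, 1.5, 2.5, 3.5)]
print(decimal_ties)  # [0, 2, 2, 4]

# Ties in binary: 2**-53 is exactly half the spacing of doubles near 1.
eps = 2.0 ** -52
tie_down = 1.0 + 2.0 ** -53           # halfway between 1 and 1 + eps
tie_up = (1.0 + eps) + 2.0 ** -53     # halfway between 1 + eps and 1 + 2*eps

print(tie_down == 1.0)          # True: 1.0 has last bit 0
print(tie_up == 1.0 + 2 * eps)  # True: 1 + 2*eps has last bit 0
```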

IEEE has 4 rounding modes, but we will only encounter rounding to the nearest. 

IEEE has 5 __Basic operations__: $+,-,\times, /, \sqrt{x} $   

__Example__ Consider 3-digit decimal numbers, i.e. $(p = 3, \beta = 10)$, and assume $L = -10, U = 10$. Then 
$$1.54\times 10^1 + 2.56\times 10^{-1} = 15.4 + 0.256 = 15.656 \rightarrow 1.57 \times 10^1$$
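
The same toy system can be imitated with Python's `decimal` module. This is only a sketch: the context below sets `prec=3`, `Emin=-10`, `Emax=10` and round-half-even to mirror $p = 3$, $L = -10$, $U = 10$.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

# A 3-digit decimal floating-point system with exponents in [-10, 10].
toy = Context(prec=3, rounding=ROUND_HALF_EVEN, Emin=-10, Emax=10)

total = toy.add(Decimal("1.54E1"), Decimal("2.56E-1"))  # exact sum is 15.656
print(total)  # 15.7
```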

#### Distance between two adjacent floating-point numbers => Machine epsilon $\epsilon_{mach}$
The number 1 is represented as $1.00\ldots0 \times \beta^0$, and the next larger floating-point number is $1.00\ldots01\times \beta^0$. The distance between them, $d=\beta^{1-p}=:\epsilon_{mach}$, is the machine epsilon. 
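
One common way to find the machine epsilon experimentally is to halve a candidate until adding half of it to 1 no longer changes the result. The sketch assumes IEEE double precision with round to nearest, so the expected value is $\beta^{1-p} = 2^{-52}$.

```python
import sys

eps = 1.0
# Stop as soon as 1 + eps/2 rounds back to 1.
while 1.0 + eps / 2 > 1.0:
    eps /= 2

print(eps == 2.0 ** -52)              # True
print(eps == sys.float_info.epsilon)  # True
```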

For each adjacent pair of floating-point numbers $a, b$ in $[\beta^i, \beta^{i+1})$, $d(a, b) = \beta^i \epsilon_{mach}$; that is, within this interval the numbers are evenly spaced, a distance $\beta^i \epsilon_{mach}$ apart.
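
The spacing rule can be observed directly on doubles. In this sketch, `next_up` and `spacing` are hypothetical helpers that step to the next double by incrementing the raw bit pattern; they are only meant for positive, finite inputs.

```python
import struct

def next_up(x: float) -> float:
    """Next larger double after a positive finite x (illustrative helper)."""
    bits = struct.unpack("<q", struct.pack("<d", x))[0]
    return struct.unpack("<d", struct.pack("<q", bits + 1))[0]

def spacing(x: float) -> float:
    """Distance from x to the next larger double."""
    return next_up(x) - x

eps = 2.0 ** -52
print(spacing(1.0) == eps)                # True: [2^0, 2^1)
print(spacing(1.5) == eps)                # True: same interval, same spacing
print(spacing(2.0) == 2 * eps)            # True: [2^1, 2^2)
print(spacing(3.0) == 2 * eps)            # True
print(spacing(1024.0) == 2.0 ** 10 * eps) # True: [2^10, 2^11)
```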

For a real number $a \in [\beta^i, \beta^{i+1})$, let $fl(a)$ be the nearest floating-point number. Rounding to nearest gives the absolute error $|fl(a) - a| \leq \frac{\beta^i\epsilon_{mach}}{2}$ and, since $|a| \geq \beta^i$, the relative error $\frac{|fl(a)-a|}{|a|}\leq \frac{\beta^i\epsilon_{mach}}{2\beta^i} = \epsilon_{mach}/2$.
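
The relative error bound can be verified exactly with rational arithmetic. The sketch below assumes IEEE doubles and relies on `fractions.Fraction`, which converts a float to its exact rational value; the sample values are arbitrary choices for illustration.

```python
import sys
from fractions import Fraction

HALF_EPS = Fraction(sys.float_info.epsilon) / 2  # epsilon_mach / 2, exactly

def rounding_error(a: Fraction) -> Fraction:
    """Exact relative error |fl(a) - a| / |a| for a nonzero rational a."""
    stored = Fraction(float(a))  # float(a) is the nearest double to a
    return abs(stored - a) / abs(a)

samples = [Fraction(1, 10), Fraction(1, 3), Fraction(22, 7)]
errors = [rounding_error(a) for a in samples]

for a, err in zip(samples, errors):
    print(a, float(err), err <= HALF_EPS)
```

None of the samples is exactly representable in binary, so each error is nonzero but stays below $\epsilon_{mach}/2$.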

#### IEEE Rule of operations
$fl(a \: op\: b):=$ the floating-point number closest to $a\: op\:b$, assuming round-to-nearest mode and that the result neither underflows (below UFL) nor overflows (above OFL).

$\Rightarrow |\frac{fl(a\: op\: b) - (a\: op\: b)}{a\: op\: b}| \leq \frac{\epsilon_{mach}}{2}$
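
A quick check of this rule for the four arithmetic operations, again using exact rationals. The helper `relative_error` is an assumption of this sketch, and the inputs are taken to be the doubles actually stored for `0.1` and `0.2`.

```python
import operator
import sys
from fractions import Fraction

HALF_EPS = Fraction(sys.float_info.epsilon) / 2  # epsilon_mach / 2, exactly

def relative_error(op, a: float, b: float) -> Fraction:
    """Exact relative error of the floating-point result of `a op b`."""
    exact = op(Fraction(a), Fraction(b))  # exact result for the stored inputs
    computed = Fraction(op(a, b))         # rounded result, converted exactly
    return abs(computed - exact) / abs(exact)

a, b = 0.1, 0.2
ops = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}
errors = {name: relative_error(op, a, b) for name, op in ops.items()}

print(a + b == 0.3)                                 # False
print(all(e <= HALF_EPS for e in errors.values()))  # True
```

The familiar `0.1 + 0.2 != 0.3` is consistent with the rule: every single operation is accurate to $\epsilon_{mach}/2$, but the inputs and the literal `0.3` were each rounded separately.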

#### Subnormal (denormal) numbers and Gradual Underflow
Subnormal numbers have $d_1 = 0$ and exponent $n = L$.  
Benefit: they fill the gap between $0$ and $\beta^L$.  
Penalty: the machine epsilon bound no longer holds, i.e. $\exists a.\ |fl(a)-a|/|a| > \epsilon_{mach}/2$
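
Both the benefit and the penalty show up with ordinary doubles. The sketch assumes IEEE double precision with subnormals enabled (no flush-to-zero), which is the usual behaviour of CPython on common hardware.

```python
import math
import sys
from fractions import Fraction

HALF_EPS = Fraction(sys.float_info.epsilon) / 2

smallest_normal = sys.float_info.min           # 2**-1022, the UFL
smallest_subnormal = math.ldexp(1.0, -1074)    # 2**-1074

# Benefit: values below the UFL are still representable.
below_ufl = smallest_normal / 2
print(below_ufl > 0.0)  # True

# Penalty: 1.5 * 2**-1074 lies halfway between two subnormals,
# so the rounded product is off by a third of the exact value.
exact = Fraction(smallest_subnormal) * Fraction(3, 2)
computed = Fraction(smallest_subnormal * 1.5)
rel_err = abs(computed - exact) / exact

print(rel_err)             # 1/3
print(rel_err > HALF_EPS)  # True
```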

__Example__ with $\beta = 10, p = 3, L = -10, U = 10$  

$$\begin{align}
1.01 \times 10^{-5} \cdot 2.02 \times 10^{-6} &= 2.0402 \times 10^{-11}\\
&= 0.20402\times 10^{-10} &\text{using subnormal}\\
&\rightarrow 0.20\times 10^{-10} &\text{still need leading 0 to tell this is subnormal} \\
1.01 \times 10^{-6} \cdot 2.02 \times 10^{-7} &= 2.0402 \times 10^{-13}\\
&= 0.0020402\times 10^{-10} \\
&\rightarrow 0.00\times 10^{-10} &\text{underflow}\\
&\rightarrow 0
\end{align}$$
The first product underflows gradually to a subnormal number; the second is too small even for a subnormal and underflows to $0$.

$$\begin{align}
3.56 \times 10^6 \cdot 5.41 \times 10^6 &= 1.92596 \times 10^{13}\\
&\rightarrow +Inf &\text{overflow happens}
\end{align}$$
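
The three products above can be reproduced with Python's `decimal` module, which supports subnormals and signed infinities. This is a sketch under the assumption that a context with `prec=3`, `Emin=-10`, `Emax=10` is a fair stand-in for the toy system; `traps=[]` is set so that underflow and overflow return a value instead of raising an exception.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN, Underflow

toy = Context(prec=3, rounding=ROUND_HALF_EVEN, Emin=-10, Emax=10, traps=[])

# Gradual underflow: the result is kept as a subnormal with fewer digits.
gradual = toy.multiply(Decimal("1.01E-5"), Decimal("2.02E-6"))
underflow_flagged = bool(toy.flags[Underflow])

# Too small even for a subnormal: the result rounds to zero.
to_zero = toy.multiply(Decimal("1.01E-6"), Decimal("2.02E-7"))

# Too large: the result overflows to +Infinity.
overflow = toy.multiply(Decimal("3.56E6"), Decimal("5.41E6"))

print(gradual == Decimal("0.20E-10"))  # True
print(underflow_flagged)               # True
print(to_zero == 0)                    # True
print(overflow.is_infinite())          # True
```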

#### Inf and NaN
When overflow happens, IEEE returns $\pm Inf$ (this only means the result exceeds the largest representable number, not that it is actually infinite).  
$Inf + Inf\rightarrow +Inf,\ Inf-Inf\rightarrow NaN,\ Inf/Inf \rightarrow NaN,\ 0/0\rightarrow NaN$. `NaN` (not a number) is best understood as "I don't know what it is".
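
These rules can be tried out with Python floats, assuming they are IEEE doubles. One caveat: Python raises `ZeroDivisionError` for `0.0 / 0.0` instead of returning `NaN`, so that case is left out of the sketch.

```python
import math

inf = float("inf")

overflowed = 1e308 * 10   # exceeds the largest double
sum_inf = inf + inf
diff_inf = inf - inf
ratio_inf = inf / inf

print(overflowed)             # inf
print(sum_inf)                # inf
print(math.isnan(diff_inf))   # True
print(math.isnan(ratio_inf))  # True
print(diff_inf == diff_inf)   # False: NaN is not equal to anything, even itself
```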