# Floating-Point Arithmetic Operations

---

## 1. Introduction

In standard mathematics with real numbers, we take certain properties for granted. For example, the way we group the terms of a sum doesn't change the result (associativity), and multiplication distributes over addition (distributivity). However, in computer-based floating-point arithmetic, these rules can be broken.

Because computers store numbers with finite precision, every operation is a potential source of a small rounding error. When many operations are performed, these small errors can accumulate and lead to significant inaccuracies. More surprisingly, the **order of operations** can dramatically change the final result.

In this notebook, we will demonstrate that for floating-point numbers:
- **Associativity of addition is not guaranteed**:  `(a + b) + c` is not always equal to `a + (b + c)`
- **Distributivity is not guaranteed**: `a * (b + c)` is not always equal to `a * b + a * c`

We will analyze these properties by simulating a simple floating-point system and using absolute and relative errors to see which order of operations yields a better result.
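
As a quick sanity check before building the simulation, the same effect shows up in Python's built-in `float` type. This is only a small sketch: it assumes the usual IEEE 754 double-precision floats, and the values 0.1, 0.2, and 0.3 are illustrative rather than taken from the worked examples in this notebook.

```python
# Illustrative values (not from the worked examples in this notebook).
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c    # a + b is rounded first, then c is added
right = a + (b + c)   # b + c is rounded first, then a is added

print(left)           # 0.6000000000000001
print(right)          # 0.6
print(left == right)  # False
```

The difference is tiny here because a double carries roughly 16 significant decimal digits. With only 3 digits, as in the simulated system, the same mechanism becomes visible by hand.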

## 2. Simulating a Floating-Point System

To make the effects of rounding visible, we will simulate a hypothetical decimal floating-point system **F(β=10, t=3, L, U)**. This means:
- **β = 10**: We are working in base 10.
- **t = 3**: We only have a precision of **3 significant digits** for our mantissa.
- **L, U**: We assume the exponent range is sufficient for our examples.

The most important rule in our system is:
> **After every arithmetic operation, the result is rounded to 3 significant digits before it can be used in the next calculation.**

We will use the notation $fl(x)$ to denote the floating-point representation of a number $x$ after it has been rounded to 3 significant digits. Ties are rounded up (round half away from zero), so for example $fl(2.345) = 2.35$.
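
One possible way to model $fl(x)$ in code is with Python's standard `decimal` module. The sketch below is an assumption about how the system could be simulated, not part of the original definition: the names `F` and `fl` are made up for this illustration, and the rounding mode `ROUND_HALF_UP` is chosen because the worked examples round ties upward.

```python
from decimal import Decimal, Context, ROUND_HALF_UP

# Hypothetical model of F(10, 3, L, U): 3 significant digits, ties rounded up.
# The exponent range is left at the module's (very wide) default.
F = Context(prec=3, rounding=ROUND_HALF_UP)

def fl(x):
    """Round x (passed as a decimal string) to 3 significant digits."""
    return F.create_decimal(x)

print(fl("14.58"))    # 14.6
print(fl("24.1425"))  # 24.1
print(fl("8.23"))     # 8.23
```

Passing values as strings keeps them exactly decimal; converting from a binary `float` first would bring in the binary representation error of that float.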

## 3. Testing Associativity: `(a + b) + c` vs. `a + (b + c)`

Let's test the associative property with the following values:
- `a = 11.4`
- `b = 3.18`
- `c = 5.05`

First, let's calculate the true value with full precision.
$$ \text{True Value} = 11.4 + 3.18 + 5.05 = 19.63 $$

Now, we will compute the result using our floating-point system, following two different orders of operation.

#### **Path 1: (a + b) + c**
1.  First, we compute `a + b`:
    $$ a + b = 11.4 + 3.18 = 14.58 $$
2.  Now, we store this result in our 3-digit system, which requires rounding:
    $$ fl(14.58) = 14.6 $$
3.  Finally, we add `c` to this rounded intermediate result:
    $$ fl(a+b) + c = 14.6 + 5.05 = 19.65 $$
4.  We round the final result to 3 significant digits. Because 19.65 lies exactly halfway between 19.6 and 19.7, it is rounded up:
    $$ \text{Result 1} = fl(19.65) = 19.7 $$

#### **Path 2: a + (b + c)**
1.  First, we compute `b + c`:
    $$ b + c = 3.18 + 5.05 = 8.23 $$
2.  This result already has 3 significant digits, so no rounding is needed:
    $$ fl(8.23) = 8.23 $$
3.  Next, we add `a` to this result:
    $$ a + fl(b+c) = 11.4 + 8.23 = 19.63 $$
4.  We round the final result to 3 significant digits:
    $$ \text{Result 2} = fl(19.63) = 19.6 $$
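
The two paths above can be replayed mechanically. The following is a minimal sketch that assumes the 3-digit system is modeled with a `decimal.Context` using `ROUND_HALF_UP`; the names `F`, `path1`, and `path2` are hypothetical.

```python
from decimal import Decimal, Context, ROUND_HALF_UP

# Assumed model of the 3-digit decimal system (ties rounded up).
F = Context(prec=3, rounding=ROUND_HALF_UP)

a, b, c = Decimal("11.4"), Decimal("3.18"), Decimal("5.05")

# Context.add computes the exact sum of its operands, then rounds once.
path1 = F.add(F.add(a, b), c)   # fl(fl(a + b) + c)
path2 = F.add(a, F.add(b, c))   # fl(a + fl(b + c))

print(path1)  # 19.7
print(path2)  # 19.6
```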

#### Error Analysis

Let's compare the errors for both paths against the true value of **19.63**.

| Path            | Result | Absolute Error ($E_a = \|19.63 - \bar{x}\|$) | Relative Error ($E_a / \|19.63\|$) |
| --------------- | ------ | -------------------------------------------- | ---------------------------------- |
| **(a + b) + c** | 19.7   | $\|19.63 - 19.7\| = 0.07$                    | $0.07 / 19.63 \approx 0.357\%$     |
| **a + (b + c)** | 19.6   | $\|19.63 - 19.6\| = 0.03$                    | $0.03 / 19.63 \approx 0.153\%$     |

**Conclusion:** The final results are different (**19.7** vs. **19.6**), which shows that addition is not associative in this floating-point system. The second path, `a + (b + c)`, produced the smaller error and is therefore the more accurate choice in this case.
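
The error figures for the two paths can be recomputed with a small helper. This is a sketch only; the function name `errors` is hypothetical, and `Decimal` is used so that the inputs stay exactly decimal.

```python
from decimal import Decimal

def errors(true_value, approx):
    """Return (absolute error, relative error) as Decimal values."""
    abs_err = abs(true_value - approx)
    return abs_err, abs_err / abs(true_value)

true_value = Decimal("19.63")
results = [("(a + b) + c", Decimal("19.7")),
           ("a + (b + c)", Decimal("19.6"))]

for label, approx in results:
    abs_err, rel_err = errors(true_value, approx)
    # The ".3%" format shows the relative error as a percentage.
    print(f"{label}: abs = {abs_err}, rel = {rel_err:.3%}")
```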

## 4. Testing Distributivity: `a * (b - c)` vs. `a*b - a*c`

Now, let's test the distributive property. This example will highlight a common and severe source of error known as **catastrophic cancellation**.

We will compare `5.55 * (4.45 - 4.35)` with `(5.55 * 4.45) - (5.55 * 4.35)` in our `F(10, 3, L, U)` system, that is, with `a = 5.55`, `b = 4.45`, and `c = 4.35`.

The exact true value is:
$$ \text{True Value} = 5.55 \times (4.45 - 4.35) = 5.55 \times 0.1 = 0.555 $$

#### **Path 1: a * (b - c)**
This is the numerically stable approach.

1.  First, perform the subtraction inside the parentheses:
    $$ b - c = 4.45 - 4.35 = 0.1 $$
2.  The result is exact and doesn't need rounding in our system:
    $$ fl(0.1) = 0.100 $$
3.  Now, perform the multiplication:
    $$ a \times fl(b-c) = 5.55 \times 0.100 = 0.555 $$
4.  The final result fits within 3 significant digits, so no rounding is needed:
    $$ \text{Result 1} = fl(0.555) = 0.555 $$

#### **Path 2: (a * b) - (a * c)**
This is the numerically unstable approach.

1.  First, we compute `a * b`:
    $$ a \times b = 5.55 \times 4.45 = 24.6975 $$
2.  We round this intermediate result to 3 significant digits:
    $$ fl(a \times b) = 24.7 $$
3.  Next, we compute `a * c`:
    $$ a \times c = 5.55 \times 4.35 = 24.1425 $$
4.  And we round this result as well:
    $$ fl(a \times c) = 24.1 $$
5.  Finally, we subtract the two rounded, intermediate results:
    $$ \text{Result 2} = fl(a \times b) - fl(a \times c) = 24.7 - 24.1 = 0.6 $$
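
Both paths can be checked in code as well. The sketch below again assumes a `decimal.Context` with 3 digits and `ROUND_HALF_UP` as a stand-in for the simulated system; `F`, `path1`, and `path2` are hypothetical names.

```python
from decimal import Decimal, Context, ROUND_HALF_UP

# Assumed model of the 3-digit decimal system (ties rounded up).
F = Context(prec=3, rounding=ROUND_HALF_UP)

a, b, c = Decimal("5.55"), Decimal("4.45"), Decimal("4.35")

# Each Context method computes the exact result, then rounds it once.
path1 = F.multiply(a, F.subtract(b, c))                 # fl(a * fl(b - c))
path2 = F.subtract(F.multiply(a, b), F.multiply(a, c))  # fl(fl(a*b) - fl(a*c))

print(path1)  # 0.555
print(path2)  # 0.6
```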

#### Error Analysis: Catastrophic Cancellation

Let's compare the errors for both paths against the true value of **0.555**.

| Path              | Result | Absolute Error ($\|0.555 - \bar{x}\|$) | Relative Error ($E_a / \|0.555\|$) |
| ----------------- | ------ | -------------------------------------- | ---------------------------------- |
| **a * (b - c)**   | 0.555  | $\|0.555 - 0.555\| = 0.0$              | $0.0\%$                            |
| **(a*b) - (a*c)** | 0.600  | $\|0.555 - 0.600\| = 0.045$            | $0.045 / 0.555 \approx 8.11\%$     |

**Conclusion:** The results are dramatically different. Path 1 produced the exact answer, while Path 2 produced a result with a massive **8.11%** relative error!

This happened because Path 2 suffered from **catastrophic cancellation**. When two nearly equal numbers (`24.7` and `24.1`) are subtracted, their leading digits cancel each other out. The final result (`0.6`) is constructed almost entirely from the least significant digits, which are exactly the ones affected by the earlier rounding errors. Of the 3 significant digits the system can hold, at most one can still be trusted.
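
To see where the 0.045 comes from, it may help to track the rounding error of each product separately. The symbols $\delta_1$ and $\delta_2$ are introduced here only for this illustration:

$$ \delta_1 = fl(a \times b) - a \times b = 24.7 - 24.6975 = +0.0025 $$

$$ \delta_2 = fl(a \times c) - a \times c = 24.1 - 24.1425 = -0.0425 $$

$$ \text{Result 2} - \text{True Value} = \delta_1 - \delta_2 = 0.0025 + 0.0425 = 0.045 $$

Each product on its own is accurate to better than 0.2% (for example, $0.0425 / 24.1425 \approx 0.176\%$), and the final subtraction $24.7 - 24.1$ introduces no new rounding error. The inherited absolute error of 0.045 only becomes large when measured against the small result 0.555.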

## 5. Summary

**Key Takeaways:**

1.  **Arithmetic rules for real numbers do not always apply to floating-point numbers.** Properties like associativity and distributivity can fail due to rounding after each operation.
2.  **The order of operations is critical for numerical accuracy.** As demonstrated, changing the order can lead to very different results and errors.
3.  **Beware of catastrophic cancellation.** Subtracting two nearly equal floating-point numbers that already carry rounding errors is one of the most common sources of severe numerical error. If possible, algebraic expressions should be rearranged to avoid this situation.