
# Hussein ElGhandour
# MAT 421
# Module A: Representation of Numbers Homework

## Base-N and Binary

### Introduction:
The **Base-N system** represents numbers using N digits, ranging from 0 to N-1. Each digit corresponds to a power of the base N. While Base-10 (decimal) is the system used in everyday life, other bases, such as **Base-2 (binary)**, are crucial in computing.

In a Base-N number system:
$$
\text{Number} = d_k \cdot N^k + d_{k-1} \cdot N^{k-1} + \ldots + d_0 \cdot N^0
$$
where $d_i$ are the digits of the number (each satisfying $0 \le d_i \le N-1$) and $N$ is the base.

### Why Binary?
**Binary (Base-2)** is fundamental to computers because digital circuits operate using two states: high (1) and low (0). This simplicity makes binary efficient for processing and storage.

### Conversion Concepts:
1. **Decimal to Binary**: Repeatedly divide the decimal number by 2, recording the remainders. The binary representation is the sequence of remainders read in reverse.
2. **Binary to Decimal**: Multiply each digit of the binary number by $2^i$, where $i$ is the position of the digit from right to left (starting at 0), and sum the results.
3. **Arbitrary Base Conversion**: The same logic applies, with N replacing 2 for other bases.
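
The three conversion rules above can be sketched as a pair of reusable functions. The names `to_base`, `from_base`, and `DIGITS` are hypothetical helpers chosen for this illustration, and the digit alphabet (0-9 followed by a-z, so bases 2 through 36) is an assumption. Note that `from_base` accumulates the sum in Horner form (`value * base + digit`) rather than computing each power explicitly; the result is the same.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"  # assumed digit alphabet (bases 2..36)

def to_base(n, base):
    """Convert a non-negative integer to a digit string in the given base."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and 36")
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)  # quotient carries on, remainder is a digit
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))    # remainders are read in reverse

def from_base(text, base):
    """Convert a digit string in the given base back to an integer."""
    value = 0
    for ch in text.lower():
        digit = DIGITS.index(ch)
        if digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        value = value * base + digit    # Horner form of the positional sum
    return value

print(to_base(19, 2))
print(from_base("10011", 2))
print(to_base(47, 5))
```

Python's built-in `int(text, base)` performs the same string-to-integer conversion and can be used to cross-check `from_base`.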

### Examples:
1. **Decimal to Binary Conversion**: Convert 19 (decimal) to binary:
   - $19 \div 2 = 9$, remainder $1$
   - $9 \div 2 = 4$, remainder $1$
   - $4 \div 2 = 2$, remainder $0$
   - $2 \div 2 = 1$, remainder $0$
   - $1 \div 2 = 0$, remainder $1$
   - Binary representation (remainders read in reverse): $10011$.

2. **Binary to Decimal Conversion**: Convert 10011 (binary) to decimal:
$$
10011 = 1 \cdot 2^4 + 0 \cdot 2^3 + 0 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0 = 16 + 0 + 0 + 2 + 1 = 19
$$

3. **Decimal to Base-5 Conversion (Arbitrary Base Conversion Example)**: Convert 47 (decimal) to Base-5:
   - $47 \div 5 = 9$, remainder $2$
   - $9 \div 5 = 1$, remainder $4$
   - $1 \div 5 = 0$, remainder $1$
   - Base-5 representation (remainders read in reverse): $142$.
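
The division steps in the worked examples can be reproduced with a small tracing function. This is only a sketch: `division_steps` is a hypothetical name, and building the digit string with `str(r)` assumes a base of at most 10.

```python
def division_steps(n, base):
    """Return the (quotient, remainder) pairs from repeatedly dividing n by base."""
    steps = []
    while n > 0:
        n, remainder = divmod(n, base)
        steps.append((n, remainder))
    return steps

for number, base in ((19, 2), (47, 5)):
    steps = division_steps(number, base)
    # Reading the remainders in reverse order gives the digits of the result.
    digits = "".join(str(r) for _, r in reversed(steps))
    print(f"{number} in base {base}: steps={steps}, digits={digits}")
```

Each tuple corresponds to one bullet in the examples above: the quotient that carries over to the next step and the remainder that becomes a digit.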


In [None]:
# Decimal to Binary Conversion
decimal_number = 27
binary_result = ""
while decimal_number > 0:
    remainder = decimal_number % 2
    binary_result = str(remainder) + binary_result
    decimal_number = decimal_number // 2
print("Binary representation of 27 (Decimal) is:", binary_result)

# Binary to Decimal Conversion
binary_number = "10101"
decimal_result = 0
position = 0
for digit in reversed(binary_number):
    decimal_result += int(digit) * (2 ** position)
    position += 1
print("Decimal representation of 10101 (Binary) is:", decimal_result)

# The next two examples show an arbitrary N for Base-N conversions
# In these cases, N = 7

# Decimal to Base-7 Conversion
decimal_number = 46
base7_result = ""
while decimal_number > 0:
    remainder = decimal_number % 7
    base7_result = str(remainder) + base7_result
    decimal_number = decimal_number // 7
print("Base-7 representation of 46 (Decimal) is:", base7_result)

# Base-7 to Decimal Conversion
base7_number = "52"
decimal_result = 0
position = 0
for digit in reversed(base7_number):
    decimal_result += int(digit) * (7 ** position)
    position += 1
print("Decimal representation of 52 (Base-7) is:", decimal_result)

Binary representation of 27 (Decimal) is: 11011
Decimal representation of 10101 (Binary) is: 21
Base-7 representation of 46 (Decimal) is: 64
Decimal representation of 52 (Base-7) is: 37


## Floating Point Numbers

### Introduction:
In numerical computing, real numbers are represented as **floating point numbers** to allow a wide range of values while maintaining precision. Unlike integers, floating point numbers can represent fractions, as well as very large or very small values, efficiently.

### Structure of Floating Point Numbers:
A floating point number is typically expressed in the form:
$$
n = (-1)^s \cdot m \cdot 2^e
$$
Where:
- $s$ is the **sign** bit (0 for positive, 1 for negative),
- $m$ is the **mantissa** (or significand), representing the significant digits,
- $e$ is the **exponent**, representing the power of 2 by which $m$ is scaled.

### IEEE 754 Standard:
Modern computers use the **IEEE 754 standard** for floating point representation. The most commonly used format is **double precision (64 bits)**, consisting of:
1. **Sign bit (1 bit)**: Determines whether the number is positive (0) or negative (1).
2. **Exponent (11 bits)**: Encodes the power of 2, using a bias of 1023 (i.e., the stored exponent is $e + 1023$).
3. **Mantissa (52 bits)**: Stores the significant digits. The leading 1 in the mantissa is implicit and not stored to save space.
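
The three fields can be read directly from a Python float with the standard `struct` module. The sketch below assumes that Python floats are IEEE 754 doubles (true on all common platforms); `float_to_fields` is a hypothetical helper name, and 15.0 is just a convenient test value.

```python
import struct

def float_to_fields(x):
    """Return the (sign, exponent, mantissa) bit strings of x as a 64-bit double."""
    # Pack as a big-endian double, then reinterpret the same 8 bytes as an integer.
    (as_int,) = struct.unpack(">Q", struct.pack(">d", x))
    bits = format(as_int, "064b")
    return bits[0], bits[1:12], bits[12:]

sign, exponent, mantissa = float_to_fields(15.0)
print("sign:    ", sign)
print("exponent:", exponent, "-> unbiased:", int(exponent, 2) - 1023)
print("mantissa:", mantissa)
```

Subtracting the bias of 1023 from the stored exponent recovers the true power of 2.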

### Example Representation:
Consider the number $15.0$. Its floating point representation in IEEE 754 is as follows:
1. **Binary Representation**: $15.0 = 1111.0_2 = 1.111_2 \cdot 2^3$
2. **Sign**: 0 (positive number)
3. **Exponent**: The true exponent is $e = 3$, so the stored (biased) exponent is $3 + 1023 = 1026$, which is $10000000010$ in binary.
4. **Mantissa**: The bits after the implicit leading 1, $111$, padded with zeros to 52 bits.

Thus, $15.0$ in IEEE 754 format is:
$$
0 \enspace 10000000010 \enspace 1110000000000000000000000000000000000000000000000000
$$

#### Python Demonstration:
Using Python, this example can be verified by evaluating the formula with the stored bit fields:

In [None]:
15 == (-1)**0 * (2**((1*2**10 + 1*2**1) - 1023)) * (1 + 1*2**-1 + 1*2**-2 + 1*2**-3)

True

### Key Properties of Floating Point Numbers:
1. **Range**: Floating point numbers can represent very large and very small values, but there are limits.
2. **Precision**: Numbers are approximated to a finite number of significant digits, leading to **round-off errors** (discussed in the Round-Off Errors section).
3. **Special Values**:
   - **Infinity $(-\infty, +\infty)$**: Represented when the exponent is all 1s, and the mantissa is all 0s.
   - **NaN (Not a Number)**: Represented when the exponent is all 1s, and the mantissa is non-zero.
   - **Subnormal Numbers**: Represent values smaller than the smallest normalized floating point number.
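
The bit patterns of the special values can be inspected with a short sketch, again assuming Python floats are IEEE 754 doubles. `exponent_and_mantissa` is a hypothetical helper; the fact that subnormal numbers use an all-zero exponent field is not stated above but is part of the standard.

```python
import math
import struct
import sys

def exponent_and_mantissa(x):
    """Return the exponent and mantissa bit strings of x as a 64-bit double."""
    (as_int,) = struct.unpack(">Q", struct.pack(">d", x))
    bits = format(as_int, "064b")
    return bits[1:12], bits[12:]

infinity = float("inf")
nan = float("nan")
subnormal = sys.float_info.min / 4  # below the smallest normalized value

for name, value in (("inf", infinity), ("nan", nan), ("subnormal", subnormal)):
    exponent, mantissa = exponent_and_mantissa(value)
    print(f"{name:9s} exponent={exponent} mantissa={mantissa[:8]}...")

# NaN is the only value that does not compare equal to itself.
print(math.isinf(infinity), math.isnan(nan), nan == nan)
print(0.0 < subnormal < sys.float_info.min)
```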

### Examples:
1. Represent 6.5 as a floating point number:
   - $6.5 = 110.1_2 = 1.101_2 \cdot 2^2$
   - **Sign**: 0 (positive)
   - **Exponent**: $2 + 1023 = 1025$ ($10000000001$ in binary), i.e. the true exponent plus the bias of 1023
   - **Mantissa**: $101000\ldots$ (the bits after the leading 1, padded with zeros to 52 bits)

2. Represent 0.1 as a floating point number:
   - $0.1 = 1.100110011\ldots_2 \cdot 2^{-4}$ (a repeating binary fraction)
   - **Sign**: 0 (positive)
   - **Exponent**: $-4 + 1023 = 1019$ ($01111111011$ in binary), i.e. the true exponent plus the bias of 1023
   - **Mantissa**: $100110011\ldots$ (the repeating pattern after the leading 1, rounded to 52 bits, so the stored value only approximates 0.1)
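
Both examples can be checked with the built-in `float.hex()` method, which prints the mantissa in hexadecimal and the exponent as a power of 2. This is a quick sanity check rather than a full decoder: each hex digit after the point stands for four mantissa bits.

```python
for value in (6.5, 0.1):
    print(value, "->", value.hex())

# 6.5 -> 0x1.a000000000000p+2 : 'a' is 1010, so the mantissa is 1.1010... and e = 2
# 0.1 -> 0x1.999999999999ap-4 : '9' is 1001 repeating; the final 'a' shows rounding
```

The trailing `a` in the second result is where the infinite pattern `1001 1001 ...` was rounded up to fit into 52 bits.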

### Overflow and Underflow

Floating-point numbers in the IEEE 754 standard have a finite range due to the limited number of bits available for the exponent. This limitation can result in two key issues:

#### 1. Overflow:
- **Definition**: Overflow occurs when a number is too large to be represented within the maximum range of the floating-point format.
- **Result**: When a number exceeds the maximum representable value $( \text{max value} \approx 1.8 \times 10^{308} \text{ for double precision})$, it is set to $+\infty$ or $-\infty$, depending on the sign.
- **Example**: Multiplying the maximum double-precision number by 10 results in overflow:
$$
  \text{max} \times 10 = +\infty.
$$

#### 2. Underflow:
- **Definition**: Underflow occurs when a number is too small (in magnitude) to be represented within the minimum normalized range of the floating-point format.
- **Result**: Numbers smaller in magnitude than the minimum normalized value $( \text{min value} \approx 2.2 \times 10^{-308} \text{ for double precision})$ are represented as **subnormal numbers** with reduced precision, or rounded to $0.0$ if they fall below the subnormal range.
- **Example**: Subtracting two very small numbers can produce a result below the minimum normalized value:
$$
  2.0 \times 10^{-308} - 1.0 \times 10^{-308} = 1.0 \times 10^{-308}.
$$
  This result is stored as a subnormal number. If a result is too small even for the subnormal range (well below $\approx 4.9 \times 10^{-324}$), it is rounded to $0.0$.

#### Python Demonstration:
Using Python, overflow and underflow can be observed with the help of `sys.float_info`, which reports the limits of the float type:

In [None]:
import sys
print("Maximum representable value:", sys.float_info.max)
print("Minimum normalized value:", sys.float_info.min)

# Overflow
large_value = sys.float_info.max*10
print("Overflow result:", large_value)  # Expect infinity (inf)

# Underflow
small_value = 5e-324  # Smallest subnormal number (2**-1074); the literal 1e-324 would already round to 0.0
underflow_result = small_value / 2
print("Underflow result:", underflow_result)  # Expect 0.0


Maximum representable value: 1.7976931348623157e+308
Minimum normalized value: 2.2250738585072014e-308
Overflow result: inf
Underflow result: 0.0
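
The subtraction example from the Underflow section can be checked the same way. This short sketch only confirms that the difference lands in the subnormal range, i.e. strictly between zero and the minimum normalized value; it assumes the platform supports subnormal numbers, as standard CPython builds do.

```python
import sys

difference = 2.0e-308 - 1.0e-308
print("Difference:", difference)

# A subnormal result is non-zero but smaller than the minimum normalized value.
is_subnormal = 0.0 < difference < sys.float_info.min
print("Is subnormal:", is_subnormal)
```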


### Gaps Between Representable Floating-Point Numbers:
In floating-point representation, the numbers are spaced unevenly across the range. This spacing is referred to as **gaps** between representable numbers.

1. **How Gaps Work**:
   - The distance between consecutive floating-point numbers increases as the magnitude of the numbers increases.
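   - This happens because the mantissa is scaled by $2^e$: a change of one unit in the last mantissa bit ($2^{-52}$) changes the value by $2^{e-52}$, so the gap doubles each time the exponent increases by 1.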

2. **Implications**:
   - For small numbers, the gaps are tiny, allowing for high precision.
   - For large numbers, the gaps are larger, which can lead to rounding errors in calculations.

3. **Example**:
   - Near $1.0$, the gap between consecutive representable numbers is approximately $2^{-52} \approx 2.22 \times 10^{-16}$.
   - Near $10^9$, the gap grows to approximately $1.19 \times 10^{-7}$.

4. **Python Demonstration**:
   The gap at any number can be calculated using NumPy's `spacing` function. Adding a value smaller than half the gap does not change the result, because the sum rounds back to the original number:


In [None]:

import numpy as np
print(np.spacing(1.0))  # Gap near 1.0
print(np.spacing(1e9))  # Gap near 1e9

1e9 == 1e9 + np.spacing(1e9)/4

2.220446049250313e-16
1.1920928955078125e-07


True
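
The same gaps can be obtained without NumPy. The sketch below assumes Python 3.9 or newer, where `math.ulp` is available, and compares it with the gap computed from the exponent as $2^{e-52}$. `gap_from_exponent` is a hypothetical helper that is only meant for positive normalized numbers.

```python
import math

def gap_from_exponent(x):
    """Gap above a positive normalized x, computed as 2**(e - 52) for x = 1.f * 2**e."""
    _, exp = math.frexp(x)  # frexp gives x = m * 2**exp with 0.5 <= m < 1
    e = exp - 1             # shift to the 1.f * 2**e convention
    return 2.0 ** (e - 52)

for x in (1.0, 1e9):
    print(x, math.ulp(x), gap_from_exponent(x))
```

For both test values the two columns agree with the `np.spacing` results printed above.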

## Round-Off Errors

### Introduction:
**Round-off errors** occur because floating-point numbers cannot represent real numbers with perfect precision. This limitation arises from:
1. The **finite number of bits** used to store floating-point numbers.
2. The inability to represent some numbers exactly in binary form (e.g., 0.1).

### Types of Round-Off Errors:
1. **Representation Error**:
   - Certain numbers cannot be represented exactly in binary form, leading to approximation.
   - For example, 0.1 in decimal is a repeating binary fraction:
$$
     (0.1)_{10} = (0.0001100110011\ldots)_2
$$
     The binary representation is truncated or rounded, causing a small error.

2. **Arithmetic Error**:
   - When performing operations (e.g., addition or subtraction) with floating-point numbers, the results may be slightly inaccurate due to rounding during intermediate calculations.
   - Example:
     $$
     0.3 + 0.6 \neq 0.9
     $$
     In Python:

In [None]:
print(0.3 + 0.6 == 0.9)  # Outputs: False

False


3. **Accumulation Error**:
   - Errors accumulate when many operations with rounding are performed in sequence.
   - Example:
     Adding 1/3 repeatedly and then subtracting the same number of times:



In [None]:
def add_and_subtract(iterations):
    result = 1
    for _ in range(iterations):
        result += 1/3
    for _ in range(iterations):
        result -= 1/3
    return result

print(add_and_subtract(100))  # Outputs: 1.0000000000000002
print(add_and_subtract(10000))  # Outputs: 1.0000000000001166
print(add_and_subtract(1000000))  # Outputs: 0.9999999999727986

1.0000000000000002
1.0000000000001166
0.9999999999727986
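
The three error types above can be examined more closely with the standard library. This is a sketch, not part of the original exercises: `Decimal` and `Fraction` reveal the exact value stored for 0.1, `math.isclose` is one common way to compare floats with a tolerance, and `math.fsum` is a different summation routine that tracks the lost low-order bits.

```python
import math
from decimal import Decimal
from fractions import Fraction

# Representation error: the value actually stored for the literal 0.1.
print(Decimal(0.1))
print(Fraction(0.1))

# Arithmetic error: compare with a tolerance instead of ==.
total = 0.3 + 0.6
print(total == 0.9, math.isclose(total, 0.9))

# Accumulation error: plain summation drifts, fsum compensates for it.
tenths = [0.1] * 10
print(sum(tenths) == 1.0, math.fsum(tenths) == 1.0)
```

The default tolerance of `math.isclose` is a relative error of `1e-09`; whether that is appropriate depends on the calculation.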


### Causes of Round-Off Errors:
1. **Finite Precision**:
   - Double-precision floating-point numbers store only 52 bits for the mantissa (53 significant bits counting the implicit leading 1), which corresponds to roughly 15 to 17 significant decimal digits.
   - Example:
     The number 1/3 = 0.333333... has an infinite binary expansion, so it is rounded to 53 significant bits.

2. **Base Conversion**:
   - Decimal fractions like 0.1 cannot be exactly represented in binary, as binary uses powers of 2, not 10.

3. **Loss of Significance**:
   - Subtracting two nearly equal numbers cancels their leading significant digits, so the rounding errors already present in the operands dominate the result.
   - Example:
     Subtracting 1.00000001 - 1.00000000 should give exactly 0.00000001, but the computed result is correct to only about 8 significant digits:

In [None]:
1.00000001 - 1.00000000

9.99999993922529e-09
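
The size of this error can be quantified by comparing against exact decimal arithmetic. The sketch below uses the standard `decimal` module as the reference; the variable names are illustrative only.

```python
from decimal import Decimal

computed = 1.00000001 - 1.00000000                      # binary floating point
exact = Decimal("1.00000001") - Decimal("1.00000000")   # exact decimal arithmetic

# Decimal(computed) converts the float exactly, so the error is measured faithfully.
relative_error = abs(Decimal(computed) - exact) / exact

print("computed:      ", computed)
print("exact:         ", exact)
print("relative error:", float(relative_error))
```

Each operand carries a relative error of at most about $10^{-16}$, but after cancellation the relative error of the difference is on the order of $10^{-9}$, a loss of roughly eight significant digits.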