# Number Representation

How numbers are stored in computers: bits/bytes, Python integers, floating-point limits, and common pitfalls.
        


## Learning Objectives

- Explain how bits/bytes store integers and why Python `int` differs from fixed-width types.
- Spot floating-point rounding issues and loss of significance in common calculations.
- Encode and decode text with UTF-8 to bridge between characters and bytes.


## Bits and Bytes

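Computer hardware does not store "7" or "Hello" directly. It stores only two states, written as **0** and **1** (think of a voltage that is low or high).
- **Bit**: A single 0 or 1.
- **Byte**: A group of 8 bits (e.g., `01000001`). One byte can take 2^8 = 256 different values.

Everything on your computer (movies, code, documents) is ultimately a sequence of bytes. What differs is how those bytes are interpreted.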
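
As a small illustration of "the same byte, different interpretations", the sketch below converts the bit pattern `01000001` from the list above into an integer and a character. It uses only Python built-ins; the variable names are chosen for this example.

```python
# Convert between an integer, its 8-bit pattern, and the character it encodes.
value = 65
bits = format(value, "08b")   # zero-padded 8-bit binary string
print(bits)                   # 01000001

back = int(bits, 2)           # parse the bit string as a base-2 number
print(back)                   # 65
print(chr(back))              # A (the character with code point 65)

# One byte has 8 bits, so it can hold 2**8 distinct values (0 through 255).
print(2 ** 8)                 # 256
```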

## Integers (Whole Numbers)

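In many languages (like C or Java), integers have a fixed width, such as 32 or 64 bits, and therefore a maximum value. In Java, going past that maximum "overflows": the value wraps around to the most negative number (like a car odometer rolling over). In C, unsigned integers wrap the same way, but signed overflow is undefined behavior and must not be relied on.
**Python is special**: Its integers have no fixed width. Python automatically uses more memory to store bigger numbers, so the only limit is available memory.
However, libraries like **NumPy** use fixed-size integers (like `int64`) for speed, so overflow can still happen there.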

In [None]:
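import numpy as np

print(2 ** 100)  # Python int grows as needed: 1267650600228229401496703205376

# Fixed-width NumPy integers wrap around silently in array arithmetic.
# (An array is used because scalar promotion rules differ between NumPy 1.x and 2.x.)
small = np.array([127], dtype=np.int8)  # 127 is the largest int8 value
print("np.int8 overflow:", small + np.int8(1))  # [-128]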
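
To make the odometer analogy concrete without depending on NumPy, here is a minimal sketch of how a fixed-width signed integer wraps. The helper `wrap_signed` is hypothetical (written for this example, not part of any library) and assumes the two's-complement layout used by fixed-width types such as `int8`.

```python
def wrap_signed(value: int, bits: int = 8) -> int:
    """Interpret `value` as a two's-complement integer of the given width.

    Hypothetical helper for illustration only.
    """
    mask = (1 << bits) - 1        # e.g. 0xFF for 8 bits
    value &= mask                 # keep only the low `bits` bits
    sign_bit = 1 << (bits - 1)    # e.g. 0x80 for 8 bits
    # If the sign bit is set, the pattern represents a negative number.
    return value - (1 << bits) if value & sign_bit else value

print(wrap_signed(127))          # 127  (largest 8-bit signed value)
print(wrap_signed(127 + 1))      # -128 (one past the maximum wraps around)
print(wrap_signed(-129))         # 127  (one below the minimum wraps the other way)

# A Python int simply grows instead: 2**100 needs 101 bits.
print((2 ** 100).bit_length())   # 101
```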
        


## Floating Point Numbers (Decimals)

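Most decimal fractions cannot be stored exactly. Computers use a standard called **IEEE 754** to store them as binary fractions. Python's `float` is a 64-bit IEEE 754 double on virtually all platforms, which gives 53 bits of precision (about 15 to 17 significant decimal digits).
The problem? Some numbers (like 0.1) cannot be written exactly in binary, just as 1/3 cannot be written exactly in decimal (0.3333...).

- **Result**: Tiny rounding errors. `0.1 + 0.2` evaluates to `0.30000000000000004`, slightly larger than `0.3`.
- **Lesson**: Avoid comparing computed floats with `==`. Check whether they are "close enough" instead.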

In [None]:
import math

print(0.1 + 0.2)           # 0.30000000000000004
print((0.1 + 0.2) == 0.3)  # False

# Compare with a tolerance instead of ==
print(math.isclose(0.1 + 0.2, 0.3))  # True (default rel_tol is 1e-09)
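
To see why `0.1` is inexact, the following sketch prints the value that is actually stored for the literal `0.1`. It relies on the standard-library `decimal` and `fractions` modules, which convert a float to its exact value without further rounding. The variable name `stored` is just for this example.

```python
from decimal import Decimal
from fractions import Fraction
import sys

# Decimal(float) shows the exact binary value the float holds.
stored = Decimal(0.1)
print(stored)                     # starts 0.1000000000000000055...
print(stored == Decimal("0.1"))   # False: the float is not exactly one tenth

# The same value as an exact ratio; the denominator is a power of two.
print(Fraction(0.1))

# Machine epsilon: the gap between 1.0 and the next larger float.
print(sys.float_info.epsilon)     # 2.220446049250313e-16
```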
        


### Loss of significance
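Subtracting nearly equal numbers cancels the leading digits they share, so the result is made up of the low-order digits where the rounding error lives. The relative error of the result can then be far larger than that of the inputs. A related effect is absorption: adding a small number to a very large one can lose the small number entirely, as the example below shows.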
        


In [None]:
large = 1e16
print((large + 1) - large)  # 0.0: near 1e16 adjacent floats are 2 apart, so the +1 is rounded away
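
A common remedy for cancellation is to rewrite the formula so that the subtraction disappears. As a hedged sketch, the example below computes `1 - cos(x)` for a small `x` in two ways: directly, and via the identity `1 - cos(x) = 2 * sin(x/2)**2`. The choice of function and the value `x = 1e-8` are assumptions made for illustration. The reference value `x*x/2` is the leading Taylor term, which is accurate enough at this `x`.

```python
import math

x = 1e-8

# Direct form: cos(x) is within rounding distance of 1.0, so the
# subtraction cancels almost every significant digit.
naive = 1.0 - math.cos(x)

# Algebraically identical form with no subtraction of near-equal values.
stable = 2.0 * math.sin(x / 2) ** 2

# Leading Taylor term of 1 - cos(x); the next term is smaller by ~x**2 / 12.
reference = x * x / 2

print(naive)
print(stable)
print(math.isclose(stable, reference, rel_tol=1e-9))  # True
print(math.isclose(naive, reference, rel_tol=1e-9))   # False
```

Both expressions are mathematically equal; only the second keeps its precision in floating-point arithmetic.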
        


## Character encoding (text as numbers)
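Each character is assigned a number, its Unicode code point (for example, "A" is 65 and "€" is 8364, written U+20AC). An encoding turns code points into bytes. UTF-8, the most common encoding, is variable-length: it uses 1 byte for ASCII characters and 2 to 4 bytes for everything else.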
        


In [None]:
text = "€"
encoded = text.encode("utf-8")  # str -> bytes
print(encoded)                  # b'\xe2\x82\xac' (three bytes for one character)
print(encoded.decode("utf-8"))  # bytes -> str, prints €
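
To illustrate the variable-length property, this sketch encodes four characters that need 1, 2, 3, and 4 bytes in UTF-8. The sample characters are chosen for this example and are written as escape sequences so the code points are unambiguous. `bytes.hex` with a separator argument requires Python 3.8 or later.

```python
# A, é, €, and the grinning-face emoji, given by code point.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    data = ch.encode("utf-8")
    # Columns: code point, number of UTF-8 bytes, the bytes in hex.
    print(f"U+{ord(ch):04X}", len(data), data.hex(" "))

# Prints:
# U+0041 1 41
# U+00E9 2 c3 a9
# U+20AC 3 e2 82 ac
# U+1F600 4 f0 9f 98 80

# len() of a str counts characters, not bytes.
print(len("\u20ac"), len("\u20ac".encode("utf-8")))  # 1 3
```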
        


## Summary
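- All data is stored as bits and bytes; a fixed number of bits limits the range and precision of the values it can represent.
- Python `int` avoids overflow; fixed-width dtypes (NumPy) can wrap.
- Floating-point math introduces rounding error, so compare with tolerances rather than `==`.
- Text is stored as encoded numbers. UTF-8 is the most common encoding and the default for `str.encode()` in Python 3.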
        
