# Floating Point Numbers

## Decimal Representation

Numbers can be represented in base 10 as a sum of integer and fractional powers of 10.

For example:

$0.75 = 7/10 + 5/100 = 7 * 10^{-1} + 5 * 10^{-2}$

$0.256 = 2/10 + 5/100 + 6/1000 = 2 * 10^{-1} + 5 * 10^{-2} + 6 * 10^{-3}$

$123.456 = 1 * 100 + 2 * 10 + 3 * 1 + 4 * 1/10 + 5 * 1/100 + 6 * 1/1000 = 1 * 10^{2} + 2 * 10^{1} + 3 * 10^{0} + 4 * 10^{-1} + 5 * 10^{-2} + 6 * 10^{-3}$



### A Decimal number can be generally represented as

$$
d = (-1)^{sign} \sum_{i=-m}^{n} d_i * 10^i
$$
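
As a minimal sketch, the sum above can be evaluated directly in Python. The helper name `positional_value` and the dictionary layout for the digits are illustrative choices for this example, not part of any library. `Fraction` is used so that the arithmetic stays exact.

```python
from fractions import Fraction

def positional_value(sign, digits, base=10):
    """Evaluate (-1)**sign * sum(d_i * base**i).

    `digits` maps each exponent i to its digit d_i (layout assumed for this sketch).
    """
    total = sum(Fraction(d) * Fraction(base) ** i for i, d in digits.items())
    return (-1) ** sign * total

# 123.456 = 1*10^2 + 2*10^1 + 3*10^0 + 4*10^-1 + 5*10^-2 + 6*10^-3
value = positional_value(0, {2: 1, 1: 2, 0: 3, -1: 4, -2: 5, -3: 6})
print(value)  # 15432/125, which is 123456/1000 in lowest terms
```

Passing `base=2` evaluates a binary expansion with the same function.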

#### Some numbers cannot be represented using a finite number of the above terms

Numbers like $\pi$ and $\sqrt{2}$ cannot be represented using a finite number of terms. Even some rational numbers cannot be represented using a finite number of terms.

For example:
* $1/3 = (0.333\dots)_{10}$

## Binary Representation

Numbers can be represented in base 2 as a sum of integer and fractional powers of 2.

For example:

$(0.11)_2 = (1/2 + 1/4)_{10} = (0.5 + 0.25)_{10} = (0.75)_{10} = 1 * 2^{-1} + 1 * 2^{-2}$

$(0.1101)_2 = (1/2 + 1/4 + 0/8 + 1/16)_{10} = (0.5 + 0.25 + 0 + 0.0625)_{10} = (0.8125)_{10} = 1 * 2^{-1} + 1 * 2^{-2} + 0 * 2^{-3} + 1 * 2^{-4}$



Both of these have an exact float representation, since each is a finite sum of powers of 2.
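
One way to check this, as a small sketch, is to ask the float for the exact ratio it stores. `float.as_integer_ratio` and `Fraction` are standard Python; the variable names are only for illustration.

```python
from fractions import Fraction

# as_integer_ratio returns the exact stored value as (numerator, denominator)
print((0.75).as_integer_ratio())    # (3, 4)
print((0.8125).as_integer_ratio())  # (13, 16)

# The stored values match the exact fractions, so nothing was lost
exact_075 = Fraction(0.75) == Fraction(3, 4)
exact_08125 = Fraction(0.8125) == Fraction(13, 16)
print(exact_075, exact_08125)  # True True
```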

### A Binary number can generally be represented as

$$
d = (-1)^{sign} \sum_{i=-m}^{n} d_i * 2^i
$$


#### Binary has the same problem as decimal: some numbers cannot be represented using a finite number of the above terms

For example:



$(0.1)_{10} = 1/10 = (0.0001100110011\dots)_2 = (1/16 + 1/32 + 0/64 + 0/128 + 1/256 + 1/512 + 0/1024 + 0/2048 + 1/4096 + 1/8192 + \dots)_{10}$



This has only an approximate float representation: the binary expansion is infinite, and a float stores a finite number of bits.
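
The binary digits of $1/10$ can be generated with repeated doubling, sketched below. The helper `binary_fraction_digits` is a hypothetical name written for this example. It works on `Fraction` values so the digits themselves are exact.

```python
from fractions import Fraction

def binary_fraction_digits(value, count):
    """Return the first `count` binary digits after the point for 0 <= value < 1."""
    digits = []
    for _ in range(count):
        value *= 2
        bit = int(value)   # integer part after doubling is the next digit (0 or 1)
        digits.append(str(bit))
        value -= bit       # keep only the fractional part
    return "".join(digits)

tenth_bits = binary_fraction_digits(Fraction(1, 10), 16)
print(tenth_bits)  # 0001100110011001 (the block 0011 keeps repeating)

# The float 0.1 stores a nearby value, not 1/10 itself
stored = Fraction(0.1)
print(stored == Fraction(1, 10))  # False
print(stored > Fraction(1, 10))   # True
```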

In [53]:
help(float)

Help on class float in module builtins:

class float(object)
 |  float(x=0, /)
 |
 |  Convert a string or number to a floating point number, if possible.
 |
 |  Methods defined here:
 |
 |  __abs__(self, /)
 |      abs(self)
 |
 |  __add__(self, value, /)
 |      Return self+value.
 |
 |  __bool__(self, /)
 |      True if self else False
 |
 |  __ceil__(self, /)
 |      Return the ceiling as an Integral.
 |
 |  __divmod__(self, value, /)
 |      Return divmod(self, value).
 |
 |  __eq__(self, value, /)
 |      Return self==value.
 |
 |  __float__(self, /)
 |      float(self)
 |
 |  __floor__(self, /)
 |      Return the floor as an Integral.
 |
 |  __floordiv__(self, value, /)
 |      Return self//value.
 |
 |  __format__(self, format_spec, /)
 |      Formats the float according to format_spec.
 |
 |  __ge__(self, value, /)
 |      Return self>=value.
 |
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |
 |  __getnewargs__(self, /)
 |
 |  __gt__(self, value, /)
 

In [54]:
float(10)

10.0

In [55]:
float(10.4)

10.4

In [56]:
float('12.4')

12.4

In [57]:
# Raises ValueError: the float constructor does not parse fraction strings
float('22/7')

ValueError: could not convert string to float: '22/7'

In [58]:
from fractions import Fraction

In [59]:
a = Fraction('22/7')
float(a)

3.142857142857143

In [60]:
# python is fooling us here: it prints the shortest decimal string that maps back to the same float
print(0.1)

0.1


In [61]:
# you can see here that the stored value is slightly off from 0.1
format(0.1, '.25f')

'0.1000000000000000055511151'

In [62]:
# why is this an issue?
#  check this out

a = 0.1 + 0.1 + 0.1
b = 0.3

# a is not equal to b even though it should be
a == b

False

In [63]:
format(a, '.25f')

'0.3000000000000000444089210'

In [64]:
format(b, '.25f')

'0.2999999999999999888977698'
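
For comparison, here is a short sketch of the same sum done with exact types from the standard library. `Decimal` (built from strings) and `Fraction` do not store binary approximations, so the equality holds. This illustrates the cause of the problem and is not a recommendation to replace floats everywhere.

```python
from decimal import Decimal
from fractions import Fraction

# Binary floats: each 0.1 carries a small error, and the errors add up
float_equal = (0.1 + 0.1 + 0.1) == 0.3

# Decimal built from strings keeps the decimal digits exactly
decimal_equal = Decimal('0.1') + Decimal('0.1') + Decimal('0.1') == Decimal('0.3')

# Fraction stores an exact numerator and denominator
fraction_equal = Fraction(1, 10) * 3 == Fraction(3, 10)

print(float_equal, decimal_equal, fraction_equal)  # False True True
```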

## Equality Testing

In [65]:
import math

In [66]:
round(a, 3) == round(b, 3)  # works but not the best way to do it

True

In [67]:
# rounding hides real differences: .01 and .04 differ by a factor of 4, yet both round to 0.0
x = .01
y = .04

round(x, 1) == round(y, 1)

True

In [68]:
# rounding applies the same fixed absolute tolerance no matter how large the numbers are
x = 10000.01
y = 10000.02

round(x, 1) == round(y, 1)

True

In [69]:
# you could use the math module to check if they are close
math.isclose(a, b)

True

In [70]:
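# with the default tolerances (rel_tol=1e-09, abs_tol=0.0) the comparison is purely relative,
# so two tiny numbers that are close in absolute terms are not considered close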
x = .0000001
y = .0000002
math.isclose(x, y)

False

In [71]:
# To fix this we can provide a relative and absolute tolerance
math.isclose(x, y, rel_tol=0.01, abs_tol=0.01)

True

In [72]:
x = 0.0000001
y = 0.0000002

a = 123456789.01
b = 123456789.02

print(math.isclose(x, y, abs_tol=0.0001, rel_tol=0.01))
print(math.isclose(a, b, abs_tol=0.0001, rel_tol=0.01))

True
True
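
The way the two tolerances combine can be sketched with the comparison given in the `math.isclose` documentation. The function `is_close_sketch` below is a simplified stand-in written for this example: it assumes finite inputs and skips the special cases (infinities, NaN, negative tolerances) that the real function handles.

```python
import math

def is_close_sketch(a, b, rel_tol=1e-09, abs_tol=0.0):
    """Simplified form of the documented math.isclose comparison (finite inputs only)."""
    # The allowed difference is the larger of the relative and the absolute tolerance
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

x, y = 0.0000001, 0.0000002
big_a, big_b = 123456789.01, 123456789.02

print(is_close_sketch(x, y))                                      # False
print(is_close_sketch(x, y, abs_tol=0.0001, rel_tol=0.01))        # True
print(is_close_sketch(big_a, big_b, abs_tol=0.0001, rel_tol=0.01))  # True
```

The relative tolerance does the work for large values and the absolute tolerance takes over near zero, which is why providing both is useful.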


## Float -> Integer

Converting a float to an integer drops the fractional part, so data can be lost.

There are different ways to do this conversion, and they disagree on which integer to pick.

For example:
* 10.4... is it 10 or 11?
* 10.5... is it 10 or 11?
* 10.6... is it 10 or 11?
* 10.9... is it 10 or 11?
* 10.00001... is it 10 or 11?
* 10.99999... is it 10 or 11?



In [73]:
help(math.trunc)

Help on built-in function trunc in module math:

trunc(x, /)
    Truncates the Real x to the nearest Integral toward 0.

    Uses the __trunc__ magic method.



In [74]:
# math.trunc drops the fractional part and keeps the integer part (it rounds toward 0)
a = math.trunc(10.4)
b = math.trunc(10.5)
c = math.trunc(-10.6)
d = math.trunc(10.9)
e = math.trunc(10.000001)
f = math.trunc(10.999999)


print(f"a: {a}, b: {b}, c: {c}, d: {d}, e: {e}, f: {f}")

a: 10, b: 10, c: -10, d: 10, e: 10, f: 10


In [75]:
# The int constructor truncates floats as well
a = int(10.4)
b = int(10.5)
c = int(-10.6)
d = int(10.9)
e = int(10.000001)
f = int(10.999999)

print(f"a: {a}, b: {b}, c: {c}, d: {d}, e: {e}, f: {f}")

a: 10, b: 10, c: -10, d: 10, e: 10, f: 10


In [76]:
# Use floor to get the largest integer less than or equal to the number
a = math.floor(10.4)
b = math.floor(10.5)
c = math.floor(-10.6)  # -10.6 is greater than -11 so -11 is the largest integer less than or equal to -10.6 
d = math.floor(10.9)
e = math.floor(10.000001)
f = math.floor(10.999999)

print(f"a: {a}, b: {b}, c: {c}, d: {d}, e: {e}, f: {f}")

a: 10, b: 10, c: -11, d: 10, e: 10, f: 10


In [77]:
# Use ceil to get the smallest integer greater than or equal to the number

a = math.ceil(10.4)
b = math.ceil(10.5)
c = math.ceil(-10.6)  # -10.6 is less than -10 so -10 is the smallest integer greater than or equal to -10.6
d = math.ceil(10.9)
e = math.ceil(10.000001)
f = math.ceil(10.999999)

print(f"a: {a}, b: {b}, c: {c}, d: {d}, e: {e}, f: {f}")

a: 11, b: 11, c: -10, d: 11, e: 11, f: 11
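
The three functions are related: truncation agrees with `floor` for non-negative numbers and with `ceil` for negative numbers. A small sketch that checks this on a few sample values (the list of values is arbitrary):

```python
import math

values = [10.4, 10.5, -10.6, 10.9, -10.000001, -10.999999]

for v in values:
    # floor for non-negative values, ceil for negative values
    toward_zero = math.floor(v) if v >= 0 else math.ceil(v)
    print(f"{v}: trunc={math.trunc(v)}, int={int(v)}, floor/ceil={toward_zero}")

matches = all(
    math.trunc(v) == int(v) == (math.floor(v) if v >= 0 else math.ceil(v))
    for v in values
)
print(matches)  # True
```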


In [78]:
# Use round to get the closest integer to the number
# round measures the distance to the two neighboring integers and picks the closer one (a multiple of 10**0 = 1)

a = round(10.4)
b = round(10.5)
c = round(-10.6)
d = round(10.9)
e = round(10.000001)
f = round(10.999999)

print(f"a: {a}, b: {b}, c: {c}, d: {d}, e: {e}, f: {f}")

a: 10, b: 10, c: -11, d: 11, e: 10, f: 11


In [79]:
""" You can even round before the decimal point using a negative number of digits (-1 rounds to the closest multiple of 10)

imagine a number line
    10          18.2     20
----*------------x-------*----
    <-----8.2---> <-1.8->
"""

round(18.2, -1)

20.0

In [80]:
# what about ties????

"""
is 1.25 closer to 1.2 or 1.3?

number line
    1.2     1.25    1.3
----*--------x--------*----
    <--.05--> <--.05-->
"""
x = 1.25

# there is no single closest value, so we have to decide how to break the tie
# the rule most people learn in school is to round away from zero
# but python uses bankers rounding which rounds to the nearest value with an even least significant digit

round(x, 1)

1.2
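
A short sketch of how ties behave in practice. The first part shows the round-half-to-even rule on values that are exactly representable. The second part shows a case where there is no real tie, because the float stored for `2.675` is slightly below 2.675 (this example comes from the Python documentation for `round`).

```python
from decimal import Decimal

# True ties: each value ends in exactly .5, so the even neighbor wins
halves = [round(n) for n in (0.5, 1.5, 2.5, 3.5)]
print(halves)  # [0, 2, 2, 4]

# Not a true tie: the stored float is a little less than 2.675
below_tie = Decimal(2.675) < Decimal('2.675')
print(below_tie)        # True
print(round(2.675, 2))  # 2.67
```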

In [81]:
# example: rounding with ndigits = -1

a = round(15, -1)
b = round(25, -1)

print(f"a: {a}, b: {b}")  # both round to 20 because of bankers rounding

a: 20, b: 20


In [82]:
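# why does python use bankers rounding?
# it is less biased: ties go up about as often as they go down, so the errors tend to cancel out

# e.g.

num_1, num_2, num_3 = .5, 1.5, 2.5

# rounding away from zero gives 1, 2 and 3, which sum to 6
afz_rounding_avg = 6/3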
bankers_rounding_avg = sum(round(n) for n in [num_1, num_2, num_3]) / 3

actual_avg = sum([num_1, num_2, num_3]) / 3

print(f"afz_rounding_avg: {afz_rounding_avg}, bankers_rounding_avg: {bankers_rounding_avg}, actual_avg: {actual_avg}")


afz_rounding_avg: 2.0, bankers_rounding_avg: 1.3333333333333333, actual_avg: 1.5
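
The effect is clearer on a larger sample. The sketch below averages 100 ties (0.5, 1.5, ..., 99.5) under both rules. The helper `round_half_away` is an illustrative name defined only for this example.

```python
import math

def round_half_away(x):
    """Round to the nearest integer with ties going away from zero (sketch)."""
    return int(x + math.copysign(0.5, x))

# Every value k + 0.5 here is exactly representable as a float
ties = [k + 0.5 for k in range(100)]

actual_avg = sum(ties) / len(ties)
bankers_avg = sum(round(t) for t in ties) / len(ties)
afz_avg = sum(round_half_away(t) for t in ties) / len(ties)

print(actual_avg, bankers_avg, afz_avg)  # 50.0 50.0 50.5
```

With an even number of ties, half are rounded down and half are rounded up under bankers rounding, so the average is preserved. Rounding away from zero pushes every tie up and shifts the average by 0.5.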


In [87]:
# How to actually round away from zero
def round_away_from_zero(number):
    return int(number + math.copysign(0.5, number))

a = 10.5
b = 11.5
c = -10.5
d = -11.5
e = 10.51
f = 11.49
g = -10.49
h = -11.51

print(f"a: {round_away_from_zero(a)}, b: {round_away_from_zero(b)}, c: {round_away_from_zero(c)}, d: {round_away_from_zero(d)}, e: {round_away_from_zero(e)}, f: {round_away_from_zero(f)}, g: {round_away_from_zero(g)}, h: {round_away_from_zero(h)}")

a: 11, b: 12, c: -11, d: -12, e: 11, f: 11, g: -10, h: -12
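
An alternative sketch uses the standard library `decimal` module, whose `ROUND_HALF_UP` mode rounds to nearest with ties going away from zero. The helper name `round_half_up_decimal` and its string-based interface are assumptions made for this example. Passing the number as a string avoids the binary approximation of the float.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up_decimal(text, exponent='1'):
    """Round a decimal string to the precision of `exponent`, ties away from zero."""
    return Decimal(text).quantize(Decimal(exponent), rounding=ROUND_HALF_UP)

print(round_half_up_decimal('10.5'))         # 11
print(round_half_up_decimal('-10.5'))        # -11
print(round_half_up_decimal('1.25', '0.1'))  # 1.3
```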
