# Numeric Data

- Python's built-in numeric types are integer, float, and complex.  The `decimal` module in the standard library adds a decimal type.
- Operators like +, -, /, and * are built into Python and allow for basic calculations.  There are also a handful of built-in functions that work with numbers. 
- However, most numerical functions are in modules or libraries that must be imported.  We'll cover:
    1. Built-in functions
    1. Floating point data type
    1. Decimal module
    1. Math module
    1. Statistics module
    1. Random module

---

## Built-in Functions

Code | Use
--- | ---
`round()` | Returns the number rounded to the desired precision.  First argument is the number to be rounded.  Optional second argument is the precision (the number of decimal places).  If the precision is omitted, rounds to the nearest integer and returns an integer; otherwise a float input gives a float result.
`abs()` | Returns the absolute value of a number
`pow()` | Returns the first argument raised to the power of the second
`min(), max()` | Return the min or max value.  Work on numbers separated by commas, strings, ranges, and collections.
`sum()` | Returns the sum of the items in an iterable object

---

**EXAMPLES**

**`round()`**

In [1]:
round(12.345, 2)  # Second number is the precision

12.35

**`abs()`**

In [2]:
abs(-1)

1

**`pow()`**

In [3]:
pow(2, 3)  # 2 raised to the power 3

8

**`min()`**

In [4]:
min(1,2,3)

1

In [5]:
min(range(3))

0

In [6]:
min('Hello')

'H'

In [7]:
min(['apple', 'Apple', 'banana'])

'Apple'

**`sum()`**

In [8]:
sum([1,2,3])

6
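
A few behaviors of these built-ins are easy to miss. The sketch below (the variable names are illustrative only) shows that `round()` returns an integer when no precision is given, that ties are rounded to the nearest even integer, and that `pow()`, `max()`, and `sum()` accept optional extra arguments.

```python
# round() with no precision returns an int; with a precision it returns a float
no_precision = round(12.345)
with_precision = round(12.345, 1)
print(type(no_precision).__name__, no_precision)      # int 12
print(type(with_precision).__name__, with_precision)  # float 12.3

# Ties go to the nearest even integer ("banker's rounding")
ties = [round(0.5), round(1.5), round(2.5)]
print(ties)  # [0, 2, 2]

# pow() takes an optional third argument: (base ** exp) % mod
power = pow(2, 10)
power_mod = pow(2, 10, 1000)
print(power, power_mod)  # 1024 24

# max() accepts a key function, sum() accepts a starting value
longest = max(['fig', 'banana', 'kiwi'], key=len)
total = sum([1, 2, 3], 10)
print(longest, total)  # banana 16
```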

---

## Floating Point
- Integers are stored in Python 3 using the integer data type.  They are "unbounded": an integer can be as large (positive or negative) as computer memory allows, so there is no fixed limit on integer size.  Internally, these base 10 integers are stored as base 2 (binary) numbers.  Base 10 integers can be represented exactly in base 2.
- Real numbers are stored in Python 3 using the float data type.  Specifically, a float is a "double precision floating point" number, which uses 64 bits to store a real number.  Python does NOT use single precision floating point numbers, which use 32 bits per number.  Floats are "scientific notation, but in binary".  Like integers, base 10 real numbers are stored internally as base 2 (binary) numbers.  Unlike integers, they:
    1. Are bounded, because a fixed number of bits is allotted to the mantissa and to the exponent
    2. Usually can NOT be represented exactly in base 2

**1. BOUNDED**

- Floats are stored in scientific notation.  In the example below, 1.2 is the **mantissa**, 10 is the base, and 3 is the exponent.
    - E.g. 1.2 x 10^3, which Python writes as 1.2e3 or 1.2E3.
- There is a set number of digits allotted for the mantissa and a set number of digits allotted for the exponent
1. If the mantissa has too many digits, it is rounded to fit.  This leads to a loss of **precision**.  Precision is the number of digits in the mantissa.  For most purposes, the result will still be usable.
    - E.g. pi (3.1415... x 10^0) has to be cut off at some point
1. If the exponent is too large in magnitude to fit, the result is either overflow or underflow.
    1. **Overflow**: the *positive* exponent is too large to store. Occurs for numbers that are very large positive or very large negative.  The true value cannot even be approximated; in Python the result is `inf` (or `-inf`) or an `OverflowError`, neither of which is usable as a number.
        - E.g. $1.2 \times 10^{3000000000000}$ or $-1.2 \times 10^{3000000000000}$ cannot be stored
    1. **Underflow**: the *negative* exponent is too large in magnitude to store. Occurs for numbers that are very close to zero, positive or negative.  The result is rounded to 0, which may or may not be usable.  I.e. for many purposes 1/1 billion and 1/1 trillion are both so small that we treat them both as 0 and they are equivalent.
        - E.g. $1.2 \times 10^{-3000000000000}$ or $-1.2 \times 10^{-3000000000000}$ is stored as 0
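
Overflow and underflow can be observed directly. The following is a minimal sketch, assuming the usual 64-bit floats; the variable names are illustrative only.

```python
import sys

largest = sys.float_info.max      # about 1.8e308
overflowed = largest * 10         # too big for a float: becomes inf
print(overflowed)                 # inf

# Some operations raise an exception instead of returning inf
try:
    10.0 ** 400
    raised = False
except OverflowError:
    raised = True
print(raised)                     # True

smallest = sys.float_info.min     # about 2.2e-308 (smallest normal float)
underflowed = smallest / 1e20     # too close to zero: becomes 0.0
print(underflowed)                # 0.0

# Integers are unbounded, so the same power is fine as an int
digit_count = len(str(10 ** 400))
print(digit_count)                # 401
```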
       
**2. NOT EXACT**

- Though we have represented numerical values using the base 10 numbering system, computers store numbers internally in base 2 (binary).  Unfortunately, most base 10 fractions (decimals) cannot be represented exactly as base 2 fractions. The consequence is that, in general, the base 10 floating-point numbers we enter are only approximated by the base 2 floating-point numbers actually stored in the computer.
- The problem is easier to understand at first in base 10. Consider the fraction 1/3. We can approximate that as a base 10 fraction:
    - 0.3
    - Or better yet, 0.33
    - Or better yet, 0.333
    - Or better yet, 0.3333
    - However we can never exactly write 1/3 as base 10
- In the same way, no matter how many base 2 digits we’re willing to use, the base 10 value 0.1 cannot be represented exactly as a base 2 fraction.  In base 2, the base 10 fraction 1/10 is the infinitely repeating fraction 0.000110011001100110011001100110011...
- Many users are not aware of the approximation because of the way values are displayed.
    1. Humans enter an exact base 10 number.  E.g. 0.1
    1. Python stores a base 2 approximation with many digits.  E.g. 0.0001100110011001101 (shortened here for illustration; a float actually keeps 53 significant binary digits)
    1. When displaying the number to the human, Python first converts the base 2 number back to a base 10 number.  E.g. 0.1000003814697265625 for the shortened value above.
    1. Before displaying, Python rounds it to the shortest base 10 number that maps back to the same stored value.  E.g. 0.1
- Just remember, even though the displayed result looks like the exact value of 1/10, the actual stored value is the nearest representable base 2 fraction.
- The approximation becomes visible in floating point arithmetic:

```
>>> .1 + .1 + .1 == .3
False
```
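
One way to look at the value that is actually stored for `0.1` is sketched below. It uses only string formatting and standard float methods; the variable names are illustrative.

```python
stored = 0.1

# Ask for more digits than Python shows by default
many_digits = format(stored, '.20f')
print(many_digits)                # 0.10000000000000000555

# The exact stored value as a ratio of two integers (denominator is 2**55)
numerator, denominator = stored.as_integer_ratio()
print(numerator, denominator)     # 3602879701896397 36028797018963968

# The stored bits, written in hexadecimal scientific notation
hex_form = stored.hex()
print(hex_form)                   # 0x1.999999999999ap-4
```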

- Floating point should not be used for money or other applications that require very precise, very large, or very small real numbers
    - These problems can be partially avoided by using integers with smaller denominations where applicable
        - E.g. instead of recording seconds with floats we could record picoseconds with integers
    - If changing denominations is not practical, then the decimal data type can be used.  See the *Decimal* section for more information.
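
As a rough illustration of the smaller-denomination idea, the sketch below tracks a hypothetical price in integer cents instead of float dollars. The prices and variable names are made up for the example.

```python
# Three items at $0.10 each, tracked as floats (dollars) and as ints (cents)
price_dollars = 0.10
price_cents = 10

total_dollars = price_dollars * 3   # float arithmetic
total_cents = price_cents * 3       # integer arithmetic, exact

print(total_dollars == 0.30)        # False
print(total_cents == 30)            # True

# Convert to dollars only when displaying
display = f"${total_cents // 100}.{total_cents % 100:02d}"
print(display)                      # $0.30
```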
    
Code | Use
--- | ---
`sys` | Module
`sys.float_info` | Attribute (not a function) that holds information about the float data type

---

**EXAMPLES**

**`float_info`**

In [9]:
import sys

sys.float_info

sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)
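
The individual fields can be read by name. The sketch below assumes the same 64-bit float layout as the output above and shows what `epsilon` means in practice.

```python
import sys

info = sys.float_info
print(info.max)        # largest representable float
print(info.mant_dig)   # number of binary digits in the mantissa
print(info.dig)        # number of decimal digits that always survive a round trip

# epsilon is the gap between 1.0 and the next larger float
eps = info.epsilon
next_is_different = (1.0 + eps != 1.0)
half_is_lost = (1.0 + eps / 2 == 1.0)
print(next_is_different)  # True: 1.0 + eps is the next float after 1.0
print(half_is_lost)       # True: the smaller step rounds back to 1.0
```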

**Float Arithmetic**

- Integers are unbounded.  Floats are bounded.

- Above 2^53, floats can no longer represent every integer, so adding 1 to a float may be rounded away.

In [10]:
print((2**53) == (2**53)+1)  # Integer
print(float(2**52) == float(2**52)+1)  # Float
print(float(2**53) == float(2**53)+1)  # Float

False
False
True


- When humans enter an exact base 10 number, Python often displays that same base 10 number

In [11]:
print(0.1)

0.1


- Sometimes the accumulated rounding error is large enough that Python displays the extra digits

In [12]:
print(0.1 + 0.1 + 0.1)

0.30000000000000004


- Python actually stores a base 2 approximation of the exact base 10 number.  This binary number is used for math equations.  The binary approximations of 0.1 added together do not equal the binary approximation of 0.3.

In [13]:
0.1 + 0.1 + 0.1 == 0.3

False
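
When two floats only need to be "equal enough", a common alternative to `==` is `math.isclose()`. A minimal sketch, where the tolerance value is an arbitrary choice for the example:

```python
import math

total = 0.1 + 0.1 + 0.1

exact_match = (total == 0.3)
close_match = math.isclose(total, 0.3)
print(exact_match)   # False: exact comparison
print(close_match)   # True: equal within a relative tolerance

# Near zero a relative tolerance is not enough, so pass abs_tol
difference = total - 0.3
near_zero_default = math.isclose(difference, 0.0)
near_zero_abs = math.isclose(difference, 0.0, abs_tol=1e-9)
print(near_zero_default)  # False
print(near_zero_abs)      # True
```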

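- Some expressions happen to give the answer we'd expect.  Note that `1/3` is still a float in Python 3, so rounding is not avoided; the comparison below is `True` only because the final sum happens to round to exactly 1.0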

In [14]:
(1/3) + (1/3) + (1/3) == 1

True
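
For fraction arithmetic that is exact by construction, the standard library's `fractions` module stores the numerator and denominator as integers. A minimal sketch:

```python
from fractions import Fraction

third = Fraction(1, 3)             # stored as the integers 1 and 3
thirds_sum = third + third + third
print(thirds_sum == 1)             # True, with no rounding involved

tenth = Fraction(1, 10)
tenths_sum = tenth + tenth + tenth
print(tenths_sum == Fraction(3, 10))  # True
print(tenths_sum)                     # 3/10
```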

---

## Decimal
- The float data type does not give good representations of decimal numbers.  It mostly has to do with the fact that humans use base 10 numbers, computers store base 2 numbers, and base 10 fractions (decimals) usually cannot be stored exactly as base 2 numbers and must be approximated. See the *Floating Point* section above for more detail.
- The decimal data type avoids these issues for numbers written in base 10.  It is good for money values or when significant figures are important.
- From the Python Standard Library documentation:
    - *"Decimal “is based on a floating-point model which was designed with people in mind, and necessarily has a paramount guiding principle – computers must provide an arithmetic that works in the same way as the arithmetic that people learn at school.” – excerpt from the decimal arithmetic specification.*
    - *Decimal numbers can be represented exactly. In contrast, numbers like 1.1 and 2.2 do not have exact representations in binary floating point. End users typically would not expect 1.1 + 2.2 to display as 3.3000000000000003 as it does with binary floating point.*
    - *The exactness carries over into arithmetic. In decimal floating point, 0.1 + 0.1 + 0.1 - 0.3 is exactly equal to zero. In binary floating point, the result is 5.5511151231257827e-017. While near to zero, the differences prevent reliable equality testing and differences can accumulate. For this reason, decimal is preferred in accounting applications which have strict equality invariants.*
    - *The decimal module incorporates a notion of significant places so that 1.30 + 1.20 is 2.50. The trailing zero is kept to indicate significance. This is the customary presentation for monetary applications. For multiplication, the “schoolbook” approach uses all the figures in the multiplicands. For instance, 1.3 * 1.2 gives 1.56 while 1.30 * 1.20 gives 1.5600."*
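- The "schoolbook approach" refers to long multiplication as taught in school: every digit of both operands is kept, so the result has as many decimal places as the two operands combined (1.30 * 1.20 has 2 + 2 = 4 decimal places).  This is different from standard significant figures calculations.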
- See the [Python documentation on decimals](https://docs.python.org/3/library/decimal.html) for more details
    
Code | Use
--- | ---
`decimal` | Module
`decimal.getcontext()` | Returns current settings and allows us to change settings.  Allows us to change precision, rounding, or enabled traps.
`decimal.Decimal()` | Returns decimal type number.  Input is integer, float, or string.  Float input is NOT recommended because a float is already an approximation of the desired real number; use a string instead.  Once converted, a decimal can be used with math operators, with built-in functions, and with its own methods such as `.sqrt()` and `.ln()`.

---

**EXAMPLES**

In [15]:
import decimal

- Use integer or string inputs

In [16]:
print(type(decimal.Decimal(1)))
print(decimal.Decimal(1))  # Integer input
print(decimal.Decimal('0.1'))  # String input

<class 'decimal.Decimal'>
1
0.1


- Don't use float inputs

In [17]:
print(decimal.Decimal(0.1))  # Float input

0.1000000000000000055511151231257827021181583404541015625


- The decimal data type fixes the problem seen with float

In [18]:
print(.1 + .1 + .1 == .3)
print(decimal.Decimal('0.1') + decimal.Decimal('0.1') + decimal.Decimal('0.1') == decimal.Decimal('0.3'))

False
True


- The decimal data type is not 100% precise though.  Results are rounded to the precision set in the context, which by default is 28 significant digits.

In [19]:
#  The 1 int is turned into a decimal.  The decimal is divided by 3 giving another decimal type
decimal.Decimal(1) / 3

Decimal('0.3333333333333333333333333333')

In [20]:
decimal.Decimal(1)/3 + decimal.Decimal(1)/3 + decimal.Decimal(1)/3 == 1

False

- The default settings can be seen by using the `.getcontext()` function

In [21]:
print(decimal.getcontext())

Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[Inexact, FloatOperation, Rounded], traps=[InvalidOperation, DivisionByZero, Overflow])
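
The settings can be changed as well as read. The sketch below uses `decimal.localcontext()` to change the precision temporarily and `quantize()` to round to two decimal places; the sample amount is made up for the example.

```python
import decimal
from decimal import Decimal

# localcontext() gives a temporary copy of the settings,
# so the global context is left unchanged
with decimal.localcontext() as ctx:
    ctx.prec = 5
    short_third = Decimal(1) / 3
print(short_third)               # 0.33333

# quantize() rounds to a fixed number of decimal places
amount = Decimal('2.665')
cent = Decimal('0.01')
half_even = amount.quantize(cent, rounding=decimal.ROUND_HALF_EVEN)
half_up = amount.quantize(cent, rounding=decimal.ROUND_HALF_UP)
print(half_even, half_up)        # 2.66 2.67
```

Passing the rounding mode explicitly keeps the result independent of whatever the current context happens to be.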


---

## Math

Code | Use
--- | ---
`math` | Module
`math.log()` | Log.  By default, natural log.  Optionally, specify base as second argument.
`math.log10()` | Base 10 log
`math.sin()` | sin.  Argument is in radians.
`math.cos()` | cos.  Argument is in radians.
`math.tan()` | tan.  Argument is in radians.
`math.pi` | pi constant to about 15 digits
`math.factorial()` | Factorial

---

**EXAMPLES**

In [22]:
import math

In [23]:
print(math.log(10))  # Natural log
print(math.log10(10))
print(math.pi)
print(math.sin(math.pi/2))
print(math.factorial(3))

2.302585092994046
1.0
3.141592653589793
1.0
6
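
Because the trig functions take radians, angles in degrees need converting first. A short sketch using `math.radians()` and `math.degrees()`, plus a reminder that results are floats:

```python
import math

# Trig functions expect radians; convert from degrees first
angle = math.radians(90)
sine = math.sin(angle)
print(sine)                        # 1.0
print(math.degrees(math.pi))       # 180.0

# Results are floats, so exact zeros are not guaranteed
sin_pi = math.sin(math.pi)
print(sin_pi)                      # a tiny number near 0, not exactly 0
print(math.isclose(sin_pi, 0.0, abs_tol=1e-12))  # True

# Logs with a specific base
log_two = math.log2(8)
print(log_two)                     # 3.0
print(math.log(100, 10))           # base given as the second argument
```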


---

## Statistics

Code | Use
--- | ---
`statistics` | Module
`statistics.mean()` | Arithmetic mean
`statistics.median()` | Median
`statistics.mode()` | Single mode
`statistics.multimode()` | List of modes
`statistics.pstdev()` | Population standard deviation
`statistics.stdev()` | Sample standard deviation (uses n-1 in formula)

---

**EXAMPLES**

In [2]:
import statistics

In [3]:
numbers = [1, 2, 3, 4, 5]
print(statistics.mean(numbers))
print(statistics.median(numbers))
#print(statistics.mode(numbers)) # Only runs correctly in Python versions 3.8 and newer
#print(statistics.multimode(numbers)) # Only runs correctly in Python versions 3.8 and newer
print(statistics.pstdev(numbers))
print(statistics.stdev(numbers))

3
3
1.4142135623730951
1.5811388300841898
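
The difference between `mode()` and `multimode()` only shows up when values tie for most common. A minimal sketch, assuming Python 3.8 or newer and a made-up list of scores:

```python
import statistics

scores = [1, 1, 2, 2, 3]           # two values tie for most common

single = statistics.mode(scores)
several = statistics.multimode(scores)
print(single)    # 1 (the first mode encountered)
print(several)   # [1, 2]

# mean() of integers returns a float when the result is not whole
average = statistics.mean([1, 2])
print(average)   # 1.5
```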


---

## Random
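- Used to generate pseudorandom numbers.  Not truly random, but good enough for simulations, games, and sampling.  Not suitable for security purposes such as passwords or tokens; the standard library's `secrets` module is meant for those.
- Random/pseudorandom functions are different from deterministic functions.  Deterministic means that given the same inputs a function will always have the same output.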

Code | Use
--- | ---
`random` | Module
`random.random()` | Returns random float from 0.0 up to, but not including, 1.0.  Can be used in conjunction with multiplication to create numbers within a certain range.
`random.randint()` | Returns random integer.  Arguments specify range.  Both ends of range inclusive.
`random.choice()` | Returns a randomly selected element/item from a non-empty sequence (e.g. a list, string, or range)

---

**EXAMPLES**

In [26]:
import random

In [27]:
print(random.random())
print(random.randint(0,10))
numbers = range(10)
print(random.choice(numbers))

0.919398454330448
5
2
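
Pseudorandom sequences can be made repeatable by fixing the seed, which is handy for testing. The sketch below uses separate `random.Random` generators so the module-level state is untouched; the seed value 42 is an arbitrary choice.

```python
import random

# Two private generators with the same fixed seed
rng_a = random.Random(42)
rng_b = random.Random(42)

draws_a = [rng_a.randint(0, 10) for _ in range(5)]
draws_b = [rng_b.randint(0, 10) for _ in range(5)]
print(draws_a == draws_b)     # True: same seed, same sequence

# Scale random() to get a float between 5.0 and 15.0
scaled = 5.0 + rng_a.random() * 10
print(5.0 <= scaled <= 15.0)  # True

# uniform() does the scaling for us
direct = rng_a.uniform(5.0, 15.0)
print(direct)
```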


---