# Ways to store numbers in numpy
numpy numerical data types (dtypes), briefly explained. See [here](https://docs.scipy.org/doc/numpy/user/basics.types.html) for a slightly technical overview of data types in scipy and numpy, and [Wikipedia](https://en.wikipedia.org/wiki/Floating_point) for more about floating point numbers.

Numbers are stored in computers as strings of bits (ones and zeros). For example, a string of three ones and zeros (three bits) can be used to represent eight values:

    000 = 0
    001 = 1
    010 = 2
    011 = 3
    100 = 4
    101 = 5
    110 = 6
    111 = 7
    
More bits (longer strings of ones and zeros, and thus more computer memory per number) are required to store larger numbers. 4 bits are required for numbers up to 15 (`1000=8`, `1001=9`, ..., `1111=15`), 5 bits for numbers up to 31 (`10000=16`, `10001=17`, ..., `11111=31`), 6 bits for numbers up to 63 (`100000=32`, `100001=33`, ..., `111111=63`), and so on. This means that the larger the range of values you want to store in a dataset (a list, array, image, or what have you), the more memory that data will require.
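The bit counts above can be checked directly. This is a minimal sketch, assuming Python 3's built-in `int.bit_length()` and numpy's `np.iinfo`; the names `values` and `bits_needed` are illustrative only.

```python
import numpy as np

# Number of bits needed to write each value in binary
values = [7, 15, 31, 63, 64]
bits_needed = {v: v.bit_length() for v in values}
print(bits_needed)

# Largest value each unsigned integer data type can hold
for dtype in (np.uint8, np.uint16, np.uint32):
    print(dtype.__name__, np.iinfo(dtype).max)
```

Note that 63 still fits in 6 bits, while 64 needs a seventh.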

A similar principle applies when dealing with decimal values, but instead of greater *range* costing more bits, greater *precision* costs more bits: the more decimal places you wish to specify, the more memory you will need for your data. Note that there are infinitely many numbers between 0 and 1 (0.1, 0.11, 0.1111, 0.111111111, ...), so no finite number of bits can represent them all. If you want to store decimals (or floating point values, as they are usually termed), you have to choose a level of precision that will satisfy you. Are you satisfied with representing the number three as 3.000, and thus with neglecting the difference between 3.0000 and 3.0001? Or do you need to be sure that the three means exactly 3.0000000000000000 and not 3.0000000000000001?

Decimal precision is specified by selecting a *data type* for the numbers in your data set. numpy has several data types; the two most commonly used for floating point (decimal) numbers are `np.float32` and `np.float64` (the latter is numpy's default, and is the more precise format). Data types for arrays can often be specified at array creation with the `dtype` argument, or changed afterwards with `.astype()` (as in the examples below).
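As a rough illustration of the `dtype` argument and of what the extra precision buys you, here is a minimal sketch. The variable names are made up for this example, and `np.finfo(...).eps` is used as one convenient measure of precision (the gap between 1.0 and the next larger representable number).

```python
import numpy as np

# Request a data type at creation time with the dtype argument
x32 = np.array([3.0001, 3.0], dtype=np.float32)
x64 = np.zeros(5, dtype=np.float64)
print(x32.dtype, x64.dtype)

# Machine epsilon: the gap between 1.0 and the next larger representable number
eps32 = np.finfo(np.float32).eps
eps64 = np.finfo(np.float64).eps
print(eps32, eps64)

# A difference this small survives in float64 but is lost in float32
small = 1e-10
kept_in_64 = np.float64(1.0) + np.float64(small) != np.float64(1.0)
kept_in_32 = np.float32(1.0) + np.float32(small) != np.float32(1.0)
print(kept_in_64, kept_in_32)
```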

In [1]:
import numpy as np

In [2]:
# Create two ten-thousand-element arrays, convert them to float32 and float64
a = np.random.rand(10000).astype(np.float32)
b = np.random.rand(10000).astype(np.float64)

In [3]:
# By default, when printing float32 and float64 numbers, python prints more decimal
# places for float64 (to reflect its greater precision)
a[0], b[0]

(0.017625982, 0.37947126409119358)

You can also check which variable, `a` or `b`, is using more memory (`whos` is an IPython/Jupyter command, so it will not work in a plain Python script). Note that both arrays have exactly the same shape and the same number of elements. The difference in memory use is purely due to the precision of the decimals stored in each.

In [4]:
whos

Variable   Type       Data/Info
-------------------------------
a          ndarray    10000: 10000 elems, type `float32`, 40000 bytes
b          ndarray    10000: 10000 elems, type `float64`, 80000 bytes
np         module     <module 'numpy' from '/Us<...>kages/numpy/__init__.py'>
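Outside IPython, the same information is available from the arrays themselves. This is a small sketch that recreates `a` and `b` as above and reads the standard `itemsize` and `nbytes` array attributes.

```python
import numpy as np

a = np.random.rand(10000).astype(np.float32)
b = np.random.rand(10000).astype(np.float64)

# itemsize: bytes per element; nbytes: bytes used by all elements together
print(a.itemsize, a.nbytes)
print(b.itemsize, b.nbytes)
```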


In [5]:
a = np.float32(np.pi)
b = np.float64(np.pi)
print('a(float32): {}, b (float64):{}'.format(a, b))

a(float32): 3.1415927410125732, b (float64):3.141592653589793


The data type also determines how small a number can get before it rounds to zero:

In [75]:
for x in [1, 16, 32, 46, 64, 324]:
    print('float32: 10^-{x}==0: '.format(x=x), np.float32(10**(-x))==0)
    print('float64: 10^-{x}==0: '.format(x=x), np.float64(10**(-x))==0)
    print('---')

float32: 10^-1==0:  False
float64: 10^-1==0:  False
---
float32: 10^-16==0:  False
float64: 10^-16==0:  False
---
float32: 10^-32==0:  False
float64: 10^-32==0:  False
---
float32: 10^-46==0:  True
float64: 10^-46==0:  False
---
float32: 10^-64==0:  True
float64: 10^-64==0:  False
---
float32: 10^-324==0:  True
float64: 10^-324==0:  True
---
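Rather than probing with powers of ten, you can ask numpy for the smallest positive number each type can hold. The sketch below assumes `np.nextafter`, which returns the next representable value after its first argument in the direction of its second; the dictionary name `smallest` is illustrative.

```python
import numpy as np

smallest = {}
for dtype in (np.float32, np.float64):
    # One representable step above zero for this data type
    smallest[dtype.__name__] = np.nextafter(dtype(0), dtype(1))
    print(dtype.__name__, smallest[dtype.__name__])

# Halving the smallest positive value leaves nothing to represent it, so it rounds to zero
print(smallest['float32'] / np.float32(2) == 0)
print(smallest['float64'] / np.float64(2) == 0)
```

This is consistent with the output above: `10^-46` is below the smallest positive `float32`, and `10^-324` is below the smallest positive `float64`.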


You might think that you could store small numbers in an array with just a few bits and large numbers with larger numbers of bits. For example, you might think that you could store the array: 
```python
data = [1, 3, 12, 9]
```

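using just 1 bit for the 1, 2 bits for the 3, and 4 bits each for the 12 and the 9. It does not work that way.

Each value in an array of numbers has to have a placeholder that is a certain, fixed number of bits, no matter how small the value stored in it is. For example, with 8 bits per value: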

| 8 bits | 8 bits | 8 bits | 8 bits | 8 bits | 8 bits |
| :---: | :---: | :---: | :---: | :---: | :---: |
| 2 | 3 | 45 | 211 | 10 | 244 |
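The table above corresponds to an array of unsigned 8-bit integers. This is a minimal sketch using `np.uint8`; the names `data` and `wide` are illustrative.

```python
import numpy as np

# Every element gets the same 8-bit (1-byte) slot, whatever its value
data = np.array([2, 3, 45, 211, 10, 244], dtype=np.uint8)
print(data.itemsize, data.nbytes)  # bytes per element, total bytes

# The same values stored in 64-bit slots take eight times the memory
wide = data.astype(np.int64)
print(wide.itemsize, wide.nbytes)

# 8 unsigned bits cover 0 through 255, so all six values fit
print(np.iinfo(np.uint8).min, np.iinfo(np.uint8).max)
```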


# If you really want to go down the rabbit hole...

[What Every Computer Scientist Should Know About Floating-Point Arithmetic](http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html), by David Goldberg