# Python: NUMPY | Numerical Python Arrays Tutorial
https://www.youtube.com/watch?v=8Mpc9ukltVA

In [1]:
import numpy as np

## Advantages
### Save Coding time
* No for loops: many vector and matrix operations save coding time

Using Python List:

In [11]:
my_list = [1,2,3]

In [12]:
for i in range(len(my_list)):
    my_list[i] *= 3

In [9]:
my_list

[3, 6, 9]

Using NumPy Array:

In [6]:
my_array = np.array([1,2,3])

In [7]:
my_array *= 3

In [10]:
my_array

array([3, 6, 9])

### Faster Execution
* Every element has the same type, so no per-element type checking is needed (this also makes arrays less flexible than lists)
* Uses contiguous blocks of memory
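
As a rough illustration of the speed difference, the sketch below times a Python list comprehension against the equivalent vectorized NumPy expression using the standard-library `timeit` module. The array size, repeat count, and function names are arbitrary choices for this example, and actual timings depend on hardware and NumPy version.

```python
import timeit

import numpy as np

n = 100_000  # arbitrary size chosen for illustration
py_list = list(range(n))
np_array = np.arange(n)


def triple_list():
    # Element-by-element loop over individual Python objects
    return [x * 3 for x in py_list]


def triple_array():
    # One vectorized operation over a contiguous, single-typed buffer
    return np_array * 3


list_time = timeit.timeit(triple_list, number=20)
array_time = timeit.timeit(triple_array, number=20)
print(f"list: {list_time:.4f}s  array: {array_time:.4f}s")
```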

### Uses Less Memory
* Python List: an array of pointers to Python objects, at 4 bytes or more per pointer (8 bytes on 64-bit builds) plus 16 bytes or more for each numerical object
* NumPy Array: no per-element pointers; the type and item size are the same for every element
* Compact data types like uint8 and float16
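
To make the memory argument concrete, here is a small sketch that compares the bytes used by array data of two dtypes with the size of a list's pointer array. The variable names are illustrative, and `sys.getsizeof` on a list counts only the container, not the objects it points to.

```python
import sys

import numpy as np

n = 1000
py_list = list(range(n))
small = np.zeros(n, dtype=np.uint8)    # 1 byte per element
large = np.zeros(n, dtype=np.float64)  # 8 bytes per element

print(small.nbytes)  # 1000
print(large.nbytes)  # 8000

# Size of the list container (its pointer array) only;
# the integer objects it refers to take additional memory.
list_bytes = sys.getsizeof(py_list)
print(list_bytes)
```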

## NumPy Array Pointers
* **Data type** - all elements have the same NumPy data type
* **Item size** - memory size of each item in bytes
* **Shape** - dimensions of the array
* **Data** - the easiest way to access the data is through indexing, not this pointer

## NumPy Data Types
### Numerical types:
* integers (int)
* unsigned integers (uint)
* floating point (float)
* complex
### Other data types:
* booleans (bool)
* string
* datetime
* Python object
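
The sketch below creates one small array for each of the type families listed above and prints its `dtype` together with the one-letter `dtype.kind` code. The variable names and sample values are only for illustration.

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int32)
uints = np.array([1, 2, 3], dtype=np.uint8)
floats = np.array([1.5, 2.5], dtype=np.float32)
cplx = np.array([1 + 2j, 3 - 1j])
flags = np.array([True, False])
names = np.array(["a", "bc"])
dates = np.array(["2020-01-01", "2020-01-02"], dtype="datetime64[D]")
objs = np.array([1, "two", 3.0], dtype=object)

# dtype.kind is a single character identifying the type family
for arr in (ints, uints, floats, cplx, flags, names, dates, objs):
    print(arr.dtype, arr.dtype.kind)
```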

Let's try some NumPy commands:

In [13]:
# pip install numpy --upgrade #if you want to update NumPy

In [16]:
a = np.array([2,3,4])
a

array([2, 3, 4])

In [17]:
a = np.arange(1,12,2)
a

array([ 1,  3,  5,  7,  9, 11])

In [18]:
a = np.linspace(1,12,6) # third argument is the number of evenly spaced elements (endpoint included)
a

array([ 1. ,  3.2,  5.4,  7.6,  9.8, 12. ])

In [19]:
a.reshape(3,2)

array([[ 1. ,  3.2],
       [ 5.4,  7.6],
       [ 9.8, 12. ]])

In [20]:
a

array([ 1. ,  3.2,  5.4,  7.6,  9.8, 12. ])

In [23]:
a =  a.reshape(3,2)
a

array([[ 1. ,  3.2],
       [ 5.4,  7.6],
       [ 9.8, 12. ]])
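
As the cells above show, `reshape` returns a new array object and leaves the original's shape unchanged. A short sketch of two related details, using illustrative names: passing `-1` lets NumPy infer one dimension, and for a contiguous array like this one the reshaped result is a view that shares memory with the original.

```python
import numpy as np

flat = np.linspace(1, 12, 6)
grid = flat.reshape(3, -1)  # -1 asks NumPy to infer the second dimension

print(flat.shape, grid.shape)  # (6,) (3, 2)

# The reshaped array is a view here, so writing through it
# also changes the original data.
grid[0, 0] = 100.0
print(flat[0])  # 100.0
print(np.shares_memory(flat, grid))  # True
```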

In [25]:
a.size # how many elements are in the array

6

In [26]:
a.shape

(3, 2)

Get data type

In [27]:
a.dtype

dtype('float64')

float64 is the default floating-point type.

In [28]:
a.itemsize

8

Each float64 element takes 8 bytes of memory.

In [29]:
b = np.array([(1.5,2,3), (4,5,6)])
b

array([[1.5, 2. , 3. ],
       [4. , 5. , 6. ]])

In [30]:
a < 4

array([[ True,  True],
       [False, False],
       [False, False]])
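
A comparison like `a < 4` produces a boolean array, which is commonly used as a mask. The following sketch rebuilds the same array and shows three typical uses; the names `mask`, `selected`, and `replaced` are illustrative.

```python
import numpy as np

a = np.linspace(1, 12, 6).reshape(3, 2)
mask = a < 4

selected = a[mask]                # 1-D array of the elements where mask is True
count = int(mask.sum())           # True counts as 1, so this counts matches
replaced = np.where(mask, 0.0, a) # 0.0 where mask is True, original value elsewhere

print(selected)
print(count)  # 2
print(replaced)
```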

In [31]:
a * 3

array([[ 3. ,  9.6],
       [16.2, 22.8],
       [29.4, 36. ]])

In [32]:
a

array([[ 1. ,  3.2],
       [ 5.4,  7.6],
       [ 9.8, 12. ]])

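`a * 3` returns a new array and leaves `a` unchanged. To store the result back in `a`, use the in-place operator: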

In [33]:
a *= 3

In [34]:
a

array([[ 3. ,  9.6],
       [16.2, 22.8],
       [29.4, 36. ]])
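
One caveat with in-place operators is that the result must fit the array's existing dtype. The sketch below, with illustrative names, shows an integer array rejecting an in-place multiplication by a float, while the out-of-place form simply returns a new float array.

```python
import numpy as np

ints = np.array([1, 2, 3])
error = None

try:
    ints *= 1.5  # the float result cannot be stored in an integer array
except TypeError as exc:
    error = exc
    print("in-place multiply failed:", type(exc).__name__)

scaled = ints * 1.5  # out-of-place: creates a new float64 array
print(scaled.dtype)  # float64
```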

Create an array filled with zeros:

In [35]:
a = np.zeros((3,4))
a

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [36]:
a.dtype

dtype('float64')

In [37]:
a = np.ones((2,3))
a

array([[1., 1., 1.],
       [1., 1., 1.]])

In [38]:
a = np.ones(10)
a

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

In [39]:
a = np.array([2,3,4], dtype = np.int16)
a

array([2, 3, 4], dtype=int16)

In [40]:
a.dtype

dtype('int16')

In [41]:
a.itemsize

2

This is more memory-efficient: each element takes only 2 bytes.
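
The trade-off for a smaller item size is a smaller value range. This sketch uses `np.iinfo` to inspect the limits of `int16` and compares total memory against `int64`; the array length of 1000 is an arbitrary choice.

```python
import numpy as np

info = np.iinfo(np.int16)
print(info.min, info.max)  # -32768 32767

compact = np.arange(1000, dtype=np.int16)
wide = np.arange(1000, dtype=np.int64)

print(compact.nbytes)  # 2000
print(wide.nbytes)     # 8000
```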

In [42]:
a = np.random.random((2,3))
a

array([[0.04128619, 0.81009017, 0.76737866],
       [0.88196584, 0.42178028, 0.78460316]])

Note: the values are drawn uniformly from the half-open interval [0, 1).
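
For reproducible results, one option is NumPy's `Generator` interface created with `np.random.default_rng`. This is an alternative to the `np.random.random` call used above, not something taken from the tutorial, and the seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # seeded generator gives repeatable output
sample = rng.random((2, 3))           # uniform floats in [0, 1)

# A second generator with the same seed reproduces the same values
repeat = np.random.default_rng(seed=42).random((2, 3))
print(np.array_equal(sample, repeat))  # True
```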

In [45]:
np.set_printoptions(precision=2, suppress=True)
# suppress=True turns off scientific notation for small numbers

**Note: this will set the options for all future print statements until we revise it.**

In [44]:
a

array([[0.04, 0.81, 0.77],
       [0.88, 0.42, 0.78]])
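
If the formatting change should apply only temporarily, `np.printoptions` can be used as a context manager. A minimal sketch with made-up sample values; the previously active options are restored when the `with` block exits.

```python
import numpy as np

values = np.array([0.04128619, 0.81009017, 1e-10])

with np.printoptions(precision=2, suppress=True):
    inside = str(values)  # formatted with 2 decimals, no scientific notation

outside = str(values)     # formatted with whatever options were active before

print(inside)
print(outside)
```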

In [47]:
a = np.random.randint(0,10,5)
a

array([8, 7, 2, 9, 8])

In [48]:
a.sum()

34

In [52]:
a.min()

2

In [50]:
a.max()

9

In [53]:
a.mean()

6.8

In [57]:
# a.average()  # arrays have no average() method; use np.average(a) or a.mean()

In [55]:
a.var()

6.16

In [56]:
a.std()

2.4819347291981715
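
By default `var()` and `std()` compute the population statistics (dividing by N). The sketch below reuses the values printed above to show the sample variance via `ddof=1`, and `np.average`, which is a function rather than an array method and accepts optional weights. The weights are made up for illustration.

```python
import numpy as np

a = np.array([8, 7, 2, 9, 8])

pop_var = a.var()             # divides by N
sample_var = a.var(ddof=1)    # divides by N - 1

plain = np.average(a)                               # same as a.mean()
weighted = np.average(a, weights=[0, 0, 0, 1, 1])   # only the last two values count

print(pop_var, sample_var)
print(plain, weighted)
```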

In [58]:
a = np.random.randint(0,10,6)
a

array([7, 9, 4, 7, 1, 7])

In [59]:
a = a.reshape(3,2)
a

array([[7, 9],
       [4, 7],
       [1, 7]])

In [60]:
a.sum(axis=1)

array([16, 11,  8])

In [61]:
a.sum(axis=0)

array([12, 23])

The `axis=` argument works with any of the mathematical / statistical functions above: `axis=1` reduces across each row, `axis=0` down each column.
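
A few more reductions along an axis, using the same 3x2 array as above. `keepdims=True` is an optional extra shown here because it keeps the reduced axis with length 1, which can be convenient for later arithmetic.

```python
import numpy as np

a = np.array([[7, 9],
              [4, 7],
              [1, 7]])

row_max = a.max(axis=1)    # one value per row
col_min = a.min(axis=0)    # one value per column
col_mean = a.mean(axis=0)  # column means as floats
row_sum_kept = a.sum(axis=1, keepdims=True)  # shape (3, 1) instead of (3,)

print(row_max)   # [9 7 7]
print(col_min)   # [1 7]
print(col_mean)
print(row_sum_kept.shape)  # (3, 1)
```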

In [64]:
data = np.loadtxt("data.txt", dtype=np.uint8, delimiter=",", skiprows=1)

In [65]:
data

array([[9, 3, 8, 7, 6, 1, 0, 4, 2, 5],
       [1, 7, 4, 9, 2, 6, 8, 3, 5, 0],
       [4, 8, 3, 9, 5, 7, 2, 6, 0, 1],
       [1, 7, 4, 2, 5, 9, 6, 8, 0, 3],
       [0, 7, 5, 2, 8, 6, 3, 4, 1, 9],
       [5, 9, 1, 4, 7, 0, 3, 6, 8, 2]], dtype=uint8)

`np.loadtxt()` expects every row to be complete and well-formed. If the file has missing values, try `np.genfromtxt()` instead.
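
A minimal sketch of `np.genfromtxt` handling a missing field. It reads from an in-memory `io.StringIO` buffer with made-up contents rather than the tutorial's `data.txt`, so it runs without any file on disk.

```python
import io

import numpy as np

# Hypothetical CSV text: a header row, then a row with an empty middle field
text = io.StringIO("c1,c2,c3\n1,2,3\n4,,6\n")

data = np.genfromtxt(text, delimiter=",", skip_header=1)
print(data)  # the missing field is read as nan
```

Note that `genfromtxt` uses `skip_header` where `loadtxt` uses `skiprows`.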

In [66]:
a = np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Randomly shuffle these: (note: this is an in-place shuffle)

In [67]:
np.random.shuffle(a)

In [68]:
a

array([4, 5, 7, 8, 9, 2, 0, 1, 6, 3])
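
When the original order needs to be kept, `np.random.permutation` returns a shuffled copy instead of modifying its argument. A short sketch with illustrative names:

```python
import numpy as np

original = np.arange(10)
shuffled = np.random.permutation(original)  # shuffled copy; original is untouched

print(original)
print(shuffled)
```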

In [69]:
np.random.choice(a)

7

In [70]:
np.random.choice(a)

9

In [71]:
np.random.choice(a)

3

In [75]:
np.random.randint(5,10 + 2)

10
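
The upper bound of `np.random.randint` is exclusive, so `np.random.randint(5, 10 + 2)` can return any integer from 5 through 11. The sketch below draws many values at once with the `size` argument to check the range; the sample count of 1000 is arbitrary.

```python
import numpy as np

draws = np.random.randint(5, 10 + 2, size=1000)

# Every draw lies in the half-open range [5, 12)
print(draws.min(), draws.max())
```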