## NumPy

NumPy is a Python package used for numerical calculations, working with arrays of homogeneous values, and scientific computing.

In previous chapters, NumPy was used for the functions and methods it provides. In addition to math functions such as ```np.sin()```, NumPy can also be used to construct homogeneous arrays and perform mathematical operations on arrays. A NumPy array is different from a Python list. The values stored in a Python list can all have different data types:

```python
python_list = [1, -0.038, 'gear', True]
```

The list above contains four different data types: ```1``` is an integer, ```-0.038``` is a float, ```'gear'``` is a string, and ```True``` is a boolean.

The code below prints the data type of each value stored in ```python_list```.

In [25]:
python_list = [1, -0.038, 'gear', True]
for item in python_list:
    print(type(item))

<class 'int'>
<class 'float'>
<class 'str'>
<class 'bool'>


The values stored in a NumPy array must all share the same data type. Consider the NumPy array below:

```python
np.array([1.0, 3.1, 5e-04, 0.007])
```

All four values stored in the NumPy array above share the same data type: ```1.0```, ```3.1```, ```5e-04```, and ```0.007``` are all floats.

If the same four elements stored in the previous Python list are stored in a NumPy array, NumPy will force all four items to conform to the same data type. In the case below, all four items are converted to type ```'<U32'```, which is a Unicode string data type in NumPy.

In [16]:
import numpy as np
np.array([1, -0.038, 'gear', True])

array(['1', '-0.038', 'gear', 'True'], dtype='<U32')
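The conversion rule described above can be checked with the ```.dtype``` attribute of an array. The snippet below is a minimal sketch; the variable names are illustrative only and do not come from earlier in the chapter.

```python
import numpy as np

# Each array holds a single data type, reported by the .dtype attribute.
float_array = np.array([1.0, 3.1, 5e-04, 0.007])
print(float_array.dtype)       # float64

# Mixing an integer and a float promotes every element to a float.
mixed_numbers = np.array([1, -0.038])
print(mixed_numbers.dtype)     # float64

# Adding a string forces every element to become a string.
mixed_all = np.array([1, -0.038, 'gear', True])
print(mixed_all.dtype.kind)    # U (the code for a Unicode string type)
```

Checking ```.dtype``` right after building an array is a quick way to catch an unintended conversion, such as numbers that were silently turned into strings.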

NumPy arrays can also be two-dimensional, three-dimensional, or in general n-dimensional. The array size is limited only by computer resources, but every element stored in an array must have the same data type.
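As a short illustration of arrays with more than one dimension, the sketch below builds a two-dimensional array from nested lists and a three-dimensional array with ```np.zeros()```. The names ```array_2d``` and ```array_3d``` are hypothetical and used only for this example.

```python
import numpy as np

# A two-dimensional array built from a list of equal-length lists.
array_2d = np.array([[1, 2, 3],
                     [4, 5, 6]])
print(array_2d.ndim)     # 2
print(array_2d.shape)    # (2, 3)

# A three-dimensional array of zeros with shape 2 x 3 x 4.
array_3d = np.zeros((2, 3, 4))
print(array_3d.ndim)     # 3
print(array_3d.dtype)    # float64
```

The ```.ndim``` attribute gives the number of dimensions and ```.shape``` gives the length along each dimension.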

NumPy arrays are useful because mathematical operations can be run on an entire array simultaneously. If a list of numbers is stored in a regular Python list, when the list is multiplied by a scalar, the list extends and repeats instead of multiplying each number in the list by the scalar.

In [17]:
lst = [1, 2, 3, 4]
lst*2

[1, 2, 3, 4, 1, 2, 3, 4]

To multiply each element of a Python list by the scalar number ```2```, a loop can be used:

In [18]:
lst = [1, 2, 3, 4]
for i, item in enumerate(lst):
    lst[i] = lst[i]*2
lst

[2, 4, 6, 8]

The method above is fairly cumbersome and is also quite _computationally expensive_. An operation that is computationally expensive is an operation that takes a lot of processing time and/or storage resources like RAM. Another way of completing this same action is to use a NumPy array. The NumPy array can be multiplied by a scalar and this will produce an array with each element multiplied by the scalar.

In [19]:
nparray = np.array([1, 2, 3, 4])
2*nparray

array([2, 4, 6, 8])
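Scalar multiplication is only one of the operations that apply to every element of an array at once. The sketch below shows a few others, under the assumption that the array holds small integers as in the example above.

```python
import numpy as np

nparray = np.array([1, 2, 3, 4])

# Arithmetic operators act on each element of the array.
print(nparray + 10)        # [11 12 13 14]
print(nparray ** 2)        # [ 1  4  9 16]

# Two arrays of the same shape are combined element by element.
print(nparray * nparray)   # [ 1  4  9 16]

# NumPy math functions such as np.sin() also accept whole arrays.
sines = np.sin(nparray)
print(sines.shape)         # (4,)
```

In each case the result is a new array; the original ```nparray``` is left unchanged.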

If we have a very long list, we can compare the amount of time it takes for each operation. Jupyter notebooks have a nice built-in way to time how long it takes a line of code to execute. In a Jupyter notebook, when a line starts with ```%timeit``` followed by code, the notebook will run the line of code multiple times and output an average of the time spent to complete the line of code. We can use ```%timeit``` to compare a mathematical operation on a Python list using a for loop to the same mathematical operation on a NumPy array.

In [26]:
lst = list(range(10000))
%timeit for i, item in enumerate(lst): lst[i] = lst[i]*2

4.29 ms ± 1.23 ms per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [27]:
nparray = np.arange(0, 10000, 1)
%timeit 2*nparray

7.4 µs ± 158 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


With 10,000 integers, the Python list and for loop take an average of about 4.3 milliseconds, while the NumPy array completes the same operation in about 7.4 microseconds. This is a speed increase of over 100x (roughly 580x for the timings shown above) by using the NumPy array (1 millisecond = 1000 microseconds). For larger lists and NumPy arrays, the speed increase using NumPy is considerable.
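The ```%timeit``` command only works inside Jupyter and IPython. In a regular Python script, a similar comparison can be sketched with the ```timeit``` module from the Python Standard Library. The function names ```double_list``` and ```double_array``` below are hypothetical. This sketch also swaps the in-place for loop for a list comprehension that builds a new list, so every repetition works on the same values. Measured times depend on the computer and will not match the timings shown above.

```python
import timeit

import numpy as np

n = 10000
lst = list(range(n))
nparray = np.arange(n)


def double_list():
    # Build a new list so the input list is unchanged between repetitions.
    return [item * 2 for item in lst]


def double_array():
    # Multiply every element of the array by the scalar 2.
    return 2 * nparray


repeats = 200
# timeit.timeit() returns the total time in seconds for all repetitions.
list_time = timeit.timeit(double_list, number=repeats) / repeats
array_time = timeit.timeit(double_array, number=repeats) / repeats

print(f"Python list: {list_time * 1e6:.1f} microseconds per loop")
print(f"NumPy array: {array_time * 1e6:.1f} microseconds per loop")
```

Dividing the total time by the number of repetitions gives an average time per loop, which is comparable to the mean reported by ```%timeit```.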