# Introduction to NumPy


The learning objectives of this section are:

* Understand the advantages of vectorised code using NumPy (over standard Python approaches)
* Create NumPy arrays
    * Convert lists and tuples to NumPy arrays 
    * Create (initialise) arrays
* Inspect the structure and content of arrays
* Subset, slice, index and iterate through arrays

### NumPy Basics

NumPy is a library written for scientific computing and data analysis. Its name stands for Numerical Python.

The most basic object in NumPy is the ```ndarray```, or simply an ```array```, which is an **n-dimensional, homogeneous** array. By homogeneous, we mean that all the elements in a NumPy array have to be of the **same data type**, which is commonly numeric (float or integer).
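To make the "same data type" requirement concrete, here is a minimal sketch (the variable names are illustrative, not from the original notebook). It shows that when integers and floats are mixed, NumPy stores every element as a float; the exact integer width printed in the first case depends on the platform.

```python
import numpy as np

# All-integer input gives an integer dtype (the exact width is platform-dependent)
int_array = np.array([1, 2, 3])
print(int_array.dtype)

# Mixing integers and floats upcasts every element to a float
mixed_array = np.array([1, 2, 3.5])
print(mixed_array.dtype)
print(mixed_array)
```

The ```dtype``` attribute is a quick way to check which type an array ended up with.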

Let's see some examples of arrays.

In [2]:
# Import the numpy library
# np is simply an alias; you may use any other, though np is the standard convention
import numpy as np

In [2]:
# Creating a 1-D array using a list
# np.array() takes a list or a tuple as its argument and converts it into an array
array_1d = np.array([2, 4, 5, 6, 7, 9])
print(array_1d)
print(type(array_1d))

[2 4 5 6 7 9]
<class 'numpy.ndarray'>


In [3]:
# Creating a 2-D array using two lists
array_2d = np.array([[2, 3, 4], [5, 8, 7]])
print(array_2d)


[[2 3 4]
 [5 8 7]]


In NumPy, dimensions are called **axes**. The 2-D array above has two axes, of lengths two and three respectively.

In NumPy terminology, for 2-D arrays:
* ```axis = 0``` runs along the rows (vertically, from one row to the next)
* ```axis = 1``` runs along the columns (horizontally, from one column to the next)

<img src="numpy_axes.jpg" style="width: 600px; height: 400px">
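As a small illustration of the axes, the sketch below reuses the values of ```array_2d``` from above and inspects its structure with the ```ndim``` and ```shape``` attributes. It then sums along each axis; the names ```column_totals``` and ```row_totals``` are only illustrative.

```python
import numpy as np

array_2d = np.array([[2, 3, 4], [5, 8, 7]])

print(array_2d.ndim)   # number of axes
print(array_2d.shape)  # length of each axis: (rows, columns)

# Summing along axis 0 collapses the rows: one total per column
column_totals = array_2d.sum(axis=0)

# Summing along axis 1 collapses the columns: one total per row
row_totals = array_2d.sum(axis=1)

print(column_totals)
print(row_totals)
```

Note that an operation along ```axis=0``` returns one value per column, because the row dimension is the one being reduced.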

### Advantages of NumPy 

What is the advantage of arrays over lists, specifically for data analysis? Put crudely, it is **convenience and speed**:

1. You can write **vectorised** code on NumPy arrays, but not on lists; such code is **concise and convenient to read and write**.
2. NumPy is typically **much faster** than standard Python approaches to numerical computation.

Vectorised code typically contains no explicit loops or indexing (these happen behind the scenes, in precompiled C code), which makes it much more concise.

Let's see an example of convenience first; we'll see one for speed later.

Say you have two lists of numbers and want to calculate their element-wise product. With standard Python lists, you need to map a lambda function over them (or worse, write a ```for``` loop), whereas with NumPy, you simply multiply the arrays.

In [4]:
list_1 = [3, 6, 7, 5]
list_2 = [4, 5, 1, 7]

# the list way to do it: map a function to the two lists
product_list = list(map(lambda x, y: x*y, list_1, list_2))
print(product_list)


[12, 30, 7, 35]


In [5]:
# The numpy array way to do it: simply multiply the two arrays
array_1 = np.array(list_1)
array_2 = np.array(list_2)

array_3 = array_1*array_2
print(array_3)
print(type(array_3))

[12 30  7 35]
<class 'numpy.ndarray'>


As you can see, the NumPy way is clearly more concise.

Even simple mathematical operations on lists require an explicit loop or a list comprehension, unlike with arrays. For example, to calculate the square of every number in a list:

In [6]:
# Square a list
list_squared = [i**2 for i in list_1]

# Square a numpy array
array_squared = array_1**2

print(list_squared)
print(array_squared)


[9, 36, 49, 25]
[ 9 36 49 25]


This was with 1-D arrays. You'll often work with 2-D arrays (matrices), where the difference is even greater. With lists, you have to store matrices as lists of lists and loop through them. With NumPy, you simply multiply the arrays: ```*``` gives the element-wise product and ```@``` gives the matrix product.
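The 2-D case can be sketched as follows. The two small matrices and all variable names here are made up for illustration. The pure-Python version needs nested iteration over the lists of lists, while the NumPy version is a single expression.

```python
import numpy as np

matrix_a = [[1, 2], [3, 4]]
matrix_b = [[5, 6], [7, 8]]

# Pure Python: the element-wise product needs nested iteration over lists of lists
product_lists = [[a * b for a, b in zip(row_a, row_b)]
                 for row_a, row_b in zip(matrix_a, matrix_b)]
print(product_lists)

# NumPy: convert once, then operate on whole arrays
array_a = np.array(matrix_a)
array_b = np.array(matrix_b)

elementwise = array_a * array_b     # element-wise product
matrix_product = array_a @ array_b  # matrix (dot) product

print(elementwise)
print(matrix_product)
```

Keep in mind that ```*``` on arrays is always element-wise; use ```@``` (or ```np.matmul```) when you want the linear-algebra product.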


In [1]:
import numpy as np

In [3]:
array1D=np.array([1,3,5,6])
print(array1D)
print(type(array1D))

[1 3 5 6]
<class 'numpy.ndarray'>


In [7]:
array2D=np.array([[1,2,3],[5,6,7]])
print(type(array2D))
print(array2D)


<class 'numpy.ndarray'>
[[1 2 3]
 [5 6 7]]


In [8]:
list1=[2,6,4,1]
list2=[7,8,1,0]
productlist=list(map(lambda x,y:x*y,list1,list2))
print(productlist)

[14, 48, 4, 0]


In [14]:
array1=np.array([1,4,5,6])
array2=np.array([2,5,0,1])
array3=array1*array2
arraysq=array1**2
arraysq2=array2**2
print(array3)
print(arraysq)
print(arraysq2)

[ 2 20  0  6]
[ 1 16 25 36]
[ 4 25  0  1]


In [15]:
import numpy as np
list1=[2,6,4,1]
list2=[7,8,1,0]
listsquare=[i**2 for i in list2]
print(listsquare)

[49, 64, 1, 0]
