# Python beginners course - Level 1 - NumPy
(inspired by work by Numan Yilmaz and exercises by Nicolas P. Rougier)

This tutorial consists of the following parts:

 - What is NumPy?
 - How to create NumPy arrays
 - Slightly more advanced tricks
 - Indexing and Slicing
 - Universal Functions (Ufuncs)
 - Sorting, Comparison and Masking

## 1. What is NumPy?

NumPy can be seen as the foundation of mathematical calculations in Python. It provides a user-friendly way to represent numerical data as array objects (lists of numbers or matrices) and to do calculations on these objects. For example, if you have a list of numbers ```[1, 2, 3]``` and want to calculate its mean, NumPy provides a simple syntax to do so.
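
As a minimal sketch of the example above (the variable names are only illustrative), the mean of ```[1, 2, 3]``` can be computed like this:

```python
import numpy as np

# Turn a plain Python list into a NumPy array
numbers = np.array([1, 2, 3])

# np.mean returns the arithmetic mean of all elements
mean_value = np.mean(numbers)
print(mean_value)  # 2.0
```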

Because of its ease of use and high performance, NumPy has become the basis for many of Python's data science packages. In this notebook, we will demonstrate some of the most important functionality that NumPy has to offer, and provide you with some exercises to help you learn to use it. The exercises are designed to be similar to the examples provided. If you are really lost or want to check your answer, you can find the answers [here](https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises.md).

Before we get started, let's check the version of NumPy and Python.

In [2]:
# import numpy
import numpy as np

# sys was imported to check the python version
import sys 

# check the version of python and numpy
print('NumPy version:', np.__version__)
print('Python version',sys.version)

NumPy version: 1.16.2
Python version 3.7.3 (default, Mar 27 2019, 17:13:21) [MSC v.1915 64 bit (AMD64)]


## 2. How to create NumPy arrays
One of the most basic objects that NumPy uses is called a ```NumPy array```. You can think of a NumPy array as an ordered list of numbers. NumPy arrays can be used to represent many kinds of numerical data.

For example, you can use NumPy arrays to represent:
- the height of 10 family members
- the temperature in the last month
- the EUR-USD exchange rates of the past 20 years
- the first 5 million decimals of $\pi$

There are many ways to create NumPy arrays. We will take a look at a few of them here.

In [3]:
# Creates a numpy array where we specify the values
np.array([1, 2, 3])

array([1, 2, 3])

In [5]:
# Creates a numpy array of specified length (3) where all values are 0
np.zeros(3)

array([0., 0., 0.])

In [6]:
# Creates a numpy array of specified length (3) where all values are 1
np.ones(3)

array([1., 1., 1.])

In [15]:
# Creates a numpy array of specified length (3) where all values are random integers between 1 and 10
np.random.randint(1, 10, 3)

array([8, 2, 6])

### Exercise:
Run the cell above a couple of times. Do you ever see 10 appear? Adjust the code above such that 10 can also appear.

In [6]:
# Creates a numpy array of specified length (5) where all values are evenly spaced between 0 and 10
np.linspace(0, 10, 5)

array([ 0. ,  2.5,  5. ,  7.5, 10. ])
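
A quick way to check what "evenly spaced" means is to look at the differences between neighbouring values. The sketch below uses `np.diff` for that and also shows the optional `endpoint` argument of `np.linspace`; the variable names are illustrative.

```python
import numpy as np

# 5 values from 0 to 10, with both end points included
points = np.linspace(0, 10, 5)

# Differences between neighbouring values: all equal to 2.5
steps = np.diff(points)
print(steps)

# With endpoint=False the stop value (10) is left out,
# which gives the values 0, 2, 4, 6 and 8
open_points = np.linspace(0, 10, 5, endpoint=False)
print(open_points)
```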

All of the NumPy arrays created above are 1-dimensional. However, most of the data we use on a day-to-day basis is 2-dimensional; for example, tabular data. Luckily, NumPy is also able to handle 2-dimensional data in the form of 2-dimensional NumPy arrays.

In [17]:
# Creates a 2-D numpy array with 3 columns and 4 rows
np.array([[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9],
          [10,11,12]])

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

In [19]:
# Creates a 2-D numpy array with 3 rows and 5 columns where the values are random numbers between 0 and 1
np.random.random((3,5))

array([[0.62439808, 0.46237779, 0.22109322, 0.67145785, 0.08796029],
       [0.95489768, 0.68204725, 0.37025698, 0.09148251, 0.27046809],
       [0.57078145, 0.86034435, 0.29289796, 0.14796683, 0.63082574]])
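
The creation functions shown earlier for 1-dimensional arrays also accept a shape given as a tuple `(rows, columns)`. A small sketch, with illustrative variable names:

```python
import numpy as np

# 2 rows and 3 columns, all values 0
zeros_2d = np.zeros((2, 3))

# 2 rows and 3 columns, all values 1
ones_2d = np.ones((2, 3))

print(zeros_2d.shape)  # (2, 3)
print(ones_2d)
```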

### Exercise: Create an array of zeroes of size 10
(**hint**: np.zeros)

## 3. Slightly more advanced tricks
We now have seen how to create basic 1- and 2-dimensional NumPy arrays. However, we have not yet done anything with the data. Let's see what we can do with NumPy arrays once we have created them.

Previously, we just created NumPy arrays without giving them a name, so we could not use them afterwards. Let's now again create a 1-dimensional array and a 2-dimensional array, but this time we will store them in variables so that we can actually manipulate the data.

We will store the 1-dimensional array under the name ```a``` and store the 2-dimensional array under the name ```b```.

In [30]:
# Creates a 1-D array with predefined values and stores it under the name 'a'
a = np.array([1,2,3])

# Creates a 2-D array of size (3,3) with random integers from 0 to 9 and stores it under the name 'b'
b = np.random.randint(0,10, (3,3))

# In order to see the data, we can print it to the screen:
print("array a:")
print(a)

print("array b:")
print(b)

The variables ```a``` and ```b``` are now stored in the computer's memory. Now let's make some modifications to our variables.

In [10]:
# adding values to an existing array
a = np.append(a, 4)
a

array([1, 2, 3, 4])
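
Note that `np.append` does not change the original array. It returns a new one, which is why the result is assigned back to `a` above. A minimal sketch with illustrative names:

```python
import numpy as np

original = np.array([1, 2, 3])

# np.append returns a new, longer array
extended = np.append(original, 4)

print(original)  # [1 2 3]    (unchanged)
print(extended)  # [1 2 3 4]
```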

In [11]:
# print the shape and dimension of arrays
print("Shape of a:", np.shape(a))
print("Shape of b:", np.shape(b))

print('Dimension of a:', np.ndim(a))
print('Dimension of b:', np.ndim(b))

Shape of a: (4,)
Shape of b: (3, 3)
Dimension of a: 1
Dimension of b: 2


In [12]:
#  number of elements in the arrays
print('Number of elements in a:', np.size(a))
print('Number of elements in b:', np.size(b))

Number of elements in a: 4
Number of elements in b: 9
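
The same information is also available as attributes of the array itself, which is the form you will often see in other code. A small sketch using a fixed array (named `c` here so that it does not overwrite `a` or `b`):

```python
import numpy as np

c = np.array([[1, 2, 3],
              [4, 5, 6]])

print(c.shape)  # (2, 3): 2 rows, 3 columns
print(c.ndim)   # 2
print(c.size)   # 6
```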


### Exercise: Create a vector with values ranging from 10 to 49
(**hint**: np.arange)

### Exercise: Create a 10x10 array with random values and find the minimum and maximum values
(**hint**: min, max)

## 4. Indexing

Contents of a NumPy array can be accessed and modified through indexing. Three types of indexing methods are available: field access, basic slicing and advanced indexing. Here we look at two of them: picking out single items by their index, which this tutorial calls field access, and basic slicing.

##### Tip: notice that numbering of items in an array starts at zero. 

In [15]:
# create an array of integers from 1 to 10
X = np.arange(1, 11, dtype=int)
X

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [16]:
## FIELD ACCESS
## get a specific item in the array by specifying its index

# get the first item in the array
first = X[0]
# get the fourth item in the array
fourth = X[3]
print(first, fourth)

1 4


In [None]:
# EXERCISE: get the seventh item in the array by replacing ___
seventh = X[____]
print(seventh)

In [19]:
## BASIC SLICING
## get a slice of the array by defining the start and end index

# get the third through fifth items (the end index 5 itself is not included)
X[2:5]

array([3, 4, 5])
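
Two details of slicing are worth a quick sketch: the end index itself is not included, and a slice can take an optional step, written as `start:stop:step`. Negative indices count from the end. The array `Z` is introduced here only for illustration.

```python
import numpy as np

Z = np.arange(1, 11)  # 1, 2, ..., 10

print(Z[2:5])   # [3 4 5]       index 5 itself is excluded
print(Z[:3])    # [1 2 3]       a missing start means "from the beginning"
print(Z[::2])   # [1 3 5 7 9]   every second item
print(Z[-1])    # 10            the last item
```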

In [37]:
## Now let's try a 2-dimensional array

# first, we create the array
Y = np.arange(1, 17).reshape(4, 4)
Y

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16]])

In [39]:
# use slicing to get the first two rows
Y[0:2]

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [53]:
# use slicing on both rows and columns to get only the last two elements of the first two rows
Y[0:2, 2:4]

array([[3, 4],
       [7, 8]])
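
Indexing with a row and a column position works the same way for single elements, and a bare `:` selects everything along that axis. A sketch with a separate array `M` (an illustrative name):

```python
import numpy as np

M = np.arange(1, 17).reshape(4, 4)

print(M[1, 2])   # 7: second row, third column
print(M[:, 0])   # [ 1  5  9 13]: the whole first column
print(M[3])      # [13 14 15 16]: the whole last row
```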

### Exercise: Create a null vector of size 10 in which the fifth value is 1
(**hint**: np.zeros and array\[4\])

### Exercise: use slicing to get the first two elements of the last two rows

In [None]:
Y[___, ____]

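## 5. Universal Functions (Ufuncs)

##### Tip: press TAB after np. to see a list of available functions. np.{TAB}

Universal functions (ufuncs) such as ```np.sqrt``` and ```np.power``` allow fast, element-by-element computation on NumPy arrays. This section also uses functions such as ```np.max``` and ```np.mean```, which summarize a whole array in a single value.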
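
To see what array-wide computation looks like in practice, the sketch below squares every value once with an explicit Python loop and once with a ufunc, and the results agree. The arithmetic operators on arrays (such as `**`) are shorthand for the corresponding ufuncs. Variable names are illustrative.

```python
import numpy as np

values = np.arange(1, 6)  # 1, 2, 3, 4, 5

# Plain Python: loop over the items one by one
squares_loop = [v ** 2 for v in values]

# NumPy: one call that works on the whole array
squares_ufunc = np.square(values)

# The ** operator on an array gives the same result
squares_operator = values ** 2

print(squares_ufunc)  # [ 1  4  9 16 25]
```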

In [3]:
# create an array of integers from 1 to 10
X = np.arange(1, 11, dtype=int)
X

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [8]:
# find the maximum element of X
np.max(X)

10

In [None]:
# EXERCISE: find the minimum element
np.__(X)

In [9]:
# find the mean of the values in X
np.mean(X)

5.5

In [11]:
# get the 4th power of each value
np.power(X, 4)

array([    1,    16,    81,   256,   625,  1296,  2401,  4096,  6561,
       10000], dtype=int32)

In [12]:
# get each value to the power 2
np.square(X)

array([  1,   4,   9,  16,  25,  36,  49,  64,  81, 100], dtype=int32)

In [13]:
# get the square root of each value
np.sqrt(X)

array([1.        , 1.41421356, 1.73205081, 2.        , 2.23606798,
       2.44948974, 2.64575131, 2.82842712, 3.        , 3.16227766])

In [14]:
# trigonometric functions 
print(np.sin(X))
print(np.tan(X))

[ 0.84147098  0.90929743  0.14112001 -0.7568025  -0.95892427 -0.2794155
  0.6569866   0.98935825  0.41211849 -0.54402111]
[ 1.55740772 -2.18503986 -0.14254654  1.15782128 -3.38051501 -0.29100619
  0.87144798 -6.79971146 -0.45231566  0.64836083]
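
`np.sin` and `np.tan` interpret their input as radians, so the values above are the sine and tangent of 1, 2, ..., 10 radians. If your angles are in degrees, convert them first, for instance with `np.deg2rad`. A small sketch with illustrative values:

```python
import numpy as np

angles_deg = np.array([0, 30, 90])

# Convert degrees to radians before calling np.sin
angles_rad = np.deg2rad(angles_deg)
sines = np.sin(angles_rad)

# Rounded to 3 decimals: approximately 0, 0.5 and 1
print(np.round(sines, 3))
```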


In [45]:
# calculate more complex functions: item^3 + item^2
np.power(X, 3) + np.square(X)

array([   2,   12,   36,   80,  150,  252,  392,  576,  810, 1100],
      dtype=int32)

### Now let's try a 2-dimensional array

In [55]:
Y = np.arange(1, 17).reshape(4, 4)
Y

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16]])

In [56]:
# multiply all elements by 2
np.multiply(Y, 2)

array([[ 2,  4,  6,  8],
       [10, 12, 14, 16],
       [18, 20, 22, 24],
       [26, 28, 30, 32]])

In [48]:
# calculate more complex functions: item^3 + item^2
np.power(Y, 3) + np.square(Y)

array([[   2,   12,   36,   80],
       [ 150,  252,  392,  576],
       [ 810, 1100, 1452, 1872],
       [2366, 2940, 3600, 4352]], dtype=int32)
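
For 2-dimensional arrays, functions such as `np.max` and `np.sum` accept an optional `axis` argument that controls whether the calculation runs over the whole array, down each column (`axis=0`) or along each row (`axis=1`). A sketch with a separate array `M` (an illustrative name):

```python
import numpy as np

M = np.arange(1, 17).reshape(4, 4)

print(np.max(M))          # 16: maximum of the whole array
print(np.max(M, axis=0))  # [13 14 15 16]: maximum of each column
print(np.max(M, axis=1))  # [ 4  8 12 16]: maximum of each row
print(np.sum(M, axis=1))  # [10 26 42 58]: sum of each row
```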

### Exercise: find the median of values in X


In [None]:
np.__(X)

### Exercise: create a random vector of size 30 and find the mean value
(**hint**: mean)

### Exercise: create a 5x5 matrix with values 1,2,3,4 just below the diagonal
(**hint**: np.diag)

## 6. Sorting, Comparison and Masking

In [60]:
# create an array of 15 random integers from 1 to 9
X = np.random.randint(1, 10, 15)
X

array([4, 9, 2, 4, 9, 7, 4, 9, 8, 1, 6, 6, 5, 7, 6])

In [63]:
# create a (3,3) array of random integers from 1 to 4
Y = np.random.randint(1,5, (3,3))
Y

array([[4, 4, 4],
       [2, 3, 3],
       [1, 2, 1]])

In [61]:
# sort elements in array X
np.sort(X)

array([1, 2, 4, 4, 4, 5, 6, 6, 6, 7, 7, 8, 9, 9, 9])

In [64]:
# sort along axis 0: the values in each column are sorted from top to bottom
np.sort(Y, axis=0)

array([[1, 2, 1],
       [2, 3, 3],
       [4, 4, 4]])

In [67]:
# sort along axis 1: the values within each row are sorted from left to right
np.sort(Y, axis=1)

array([[4, 4, 4],
       [2, 3, 3],
       [1, 1, 2]])
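
Because `Y` is random, it can be easier to follow the `axis` argument on a fixed array. The sketch below also shows that `np.sort` returns a sorted copy and leaves the original untouched. The array `M` is an illustrative name.

```python
import numpy as np

M = np.array([[3, 1, 2],
              [9, 7, 8],
              [6, 4, 5]])

# axis=0: sort the values down each column
sorted_columns = np.sort(M, axis=0)
print(sorted_columns)
# [[3 1 2]
#  [6 4 5]
#  [9 7 8]]

# axis=1: sort the values along each row
sorted_rows = np.sort(M, axis=1)
print(sorted_rows)
# [[1 2 3]
#  [7 8 9]
#  [4 5 6]]

# M itself has not changed
print(M[0])  # [3 1 2]
```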

In [68]:
# perform a true/false comparison test on every element
# == , !=, < , >, >=, <= operations on arrays

# is the element greater than 3?
X > 3

array([ True,  True, False,  True,  True,  True,  True,  True,  True,
       False,  True,  True,  True,  True,  True])

In [50]:
# use masking to get the values for which the comparison is True
X[X > 3]

array([4, 9, 4, 9, 7, 4, 9, 8, 6, 6, 5, 7, 6])

In [69]:
# more complex masking with multiple conditions
X[(X <= 3) & (X > 1)]

array([2])
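
A mask is just an array of `True`/`False` values, so it can be stored in a variable, counted, and used to change the selected elements. A minimal sketch on a fixed array (all names are illustrative):

```python
import numpy as np

values = np.array([4, 9, 2, 7, 1, 6])

# Store the comparison result as a mask
is_large = values > 5
print(is_large)          # [False  True False  True False  True]

# True counts as 1, so summing the mask counts the matches
print(np.sum(is_large))  # 3

# Assigning through a mask changes only the selected elements
values[is_large] = 0
print(values)            # [4 0 2 0 1 0]
```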

### Exercise: given a 1D array, negate all elements which are either lower than 28 or higher than 37.

In [None]:
y = np.arange(60)
y[___]