<h1><center>DS 200 - Lec5: NumPy Part 2</center></h1>

# Section 1: Numpy Array Indexing, Slicing and Selection

In this section we will discuss how to select elements or groups of elements from an ndarray.

In [1]:
import numpy as np

In [2]:
#Creating sample array
arr = np.arange(0,11)

#Show
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

## 1. Indexing and Slicing
The simplest way to pick one or more elements of an array looks very similar to indexing Python lists:

In [3]:
#Get a value at an index
arr[8]



8

In [4]:
#Get values in a range
arr[1:5]



array([1, 2, 3, 4])

In [5]:
#Get values from the start up to (but not including) index 5
arr[:5]



array([0, 1, 2, 3, 4])

## 2. Broadcasting

NumPy arrays differ from normal Python lists in their ability to broadcast: an operation between an array and a scalar is applied to every element of the array.

In [6]:
arr * 5



array([ 0,  5, 10, 15, 20, 25, 30, 35, 40, 45, 50])
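To make the contrast with Python lists concrete, here is a small sketch. The names `py_list`, `np_arr`, `repeated`, `scaled` and `scaled_list` are made up for this illustration and are not used elsewhere in the lecture.

```python
import numpy as np

py_list = [0, 1, 2]
np_arr = np.array(py_list)

# Multiplying a list by an integer repeats the list
repeated = py_list * 3

# Multiplying an ndarray by a scalar scales every element
scaled = np_arr * 3

# The list equivalent needs an explicit loop or comprehension
scaled_list = [x * 3 for x in py_list]

print(repeated)      # [0, 1, 2, 0, 1, 2, 0, 1, 2]
print(scaled)        # [0 3 6]
print(scaled_list)   # [0, 3, 6]
```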

In [7]:
# Change a slice of ndarray with broadcasting
arr[:5] = 100



#Show
arr

array([100, 100, 100, 100, 100,   5,   6,   7,   8,   9,  10])

**Note**: a slice of an ndarray returns a view of the original array. Therefore, if we modify the slice, the original array is modified too.

In [8]:
# Important notes on Slices
slice_of_arr = arr[0:6]

# Show slice
slice_of_arr

array([100, 100, 100, 100, 100,   5])

In [9]:
# Change Slice
slice_of_arr[:] = 99

# Show Slice again
print('Slice: ', slice_of_arr)
print('Original: ', arr)

Slice:  [99 99 99 99 99 99]
Original:  [99 99 99 99 99 99  6  7  8  9 10]


Note that the changes also occur in our original array! The slice does not copy the data; it is a view of the original array. This avoids copying large arrays unnecessarily and saves memory.

But what if we need to keep the original array intact?

In [10]:
# To get a copy, need to be explicit
arr2 = arr.copy()
arr2[:6] = 90



In [11]:
arr

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])
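If you are unsure whether you are holding a view or a copy, `np.shares_memory` is one way to check. The sketch below uses fresh names (`base`, `view`, `independent`) chosen for this example only.

```python
import numpy as np

base = np.arange(0, 11)
view = base[0:6]                 # basic slice: a view of base
independent = base[0:6].copy()   # explicit copy: its own data

print(np.shares_memory(base, view))          # True
print(np.shares_memory(base, independent))   # False

view[:] = 99          # writes through to base
independent[:] = -1   # leaves base untouched

print(base)           # [99 99 99 99 99 99  6  7  8  9 10]
```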

## 3. Indexing a 2D array (matrices)

The general format is ``arr_2d[row][col]`` or `arr_2d[row,col]`. I recommend using the comma notation for clarity.
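One reason to prefer the comma notation: the two forms agree for a single element, but they stop being interchangeable once a slice is involved. A minimal sketch on a small made-up matrix `m`:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Both notations select the same single element
single_chained = m[1][2]
single_comma = m[1, 2]
print(single_chained, single_comma)   # 6 6

# With a slice they differ
col_part = m[:2, 1]    # column 1 of the first two rows
row_part = m[:2][1]    # row 1 of the first two rows
print(col_part)        # [2 5]
print(row_part)        # [4 5 6]
```

`m[:2][1]` first takes the first two rows and then indexes that result again along the row axis, so it never reaches the columns.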

In [12]:
grade_lst = \
[[79, 95, 60],
 [95, 60, 61],
 [99, 67, 84],
 [76, 76, 97],
 [91, 84, 98],
 [70, 69, 96],
 [88, 65, 76],
 [67, 73, 80],
 [82, 89, 61],
 [94, 67, 88]]

grade_arr = np.array(grade_lst)

#Show
grade_arr

array([[79, 95, 60],
       [95, 60, 61],
       [99, 67, 84],
       [76, 76, 97],
       [91, 84, 98],
       [70, 69, 96],
       [88, 65, 76],
       [67, 73, 80],
       [82, 89, 61],
       [94, 67, 88]])

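#### Exercises:

Each row of `grade_arr` holds one student's grades; the columns are midterm 1, midterm 2, and the final exam.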

1. How to get final exam grade of student 0? 
	
2. How to get all grades of student 2?

3. How to get grades of all students in midterm 1?

4. How to get all midterm grades of the first three students?


In [13]:
# Getting individual element value
grade_arr[0, 2]




60

In [14]:
# Indexing row
grade_arr[2]



array([99, 67, 84])

In [15]:
# Getting an entire column
grade_arr[:,0]



array([79, 95, 99, 76, 91, 70, 88, 67, 82, 94])

In [16]:
# 2D array slicing
# Shape (3,2) from top left corner
grade_arr[:3,:2]



array([[79, 95],
       [95, 60],
       [99, 67]])

## 4. Fancy Indexing

Fancy indexing allows you to select entire rows or columns out of order.

In [17]:
grade_arr

array([[79, 95, 60],
       [95, 60, 61],
       [99, 67, 84],
       [76, 76, 97],
       [91, 84, 98],
       [70, 69, 96],
       [88, 65, 76],
       [67, 73, 80],
       [82, 89, 61],
       [94, 67, 88]])

#### Exercise:

1. Find all exam grades for student 0 and student 2. 
2. Get the first midterm and final exam for the first two students.
3. Get the first midterm and final exam grades for both student 0 and student 2.

In [18]:
# Q1
grade_arr[[0,2]]


array([[79, 95, 60],
       [99, 67, 84]])

In [19]:
# Q1 - different format
grade_arr[np.array([0,2])]


array([[79, 95, 60],
       [99, 67, 84]])

In [20]:
# Q2
grade_arr[:2, [0,2]]


array([[79, 60],
       [95, 61]])

In [21]:
# Q3 
grade_arr[[0,2]][:,[0,2]]



array([[79, 60],
       [99, 84]])

In [22]:
# Q3 another solution




## 5. Numpy Indexing with Strides
Indexing a 2D matrix can be confusing at first, especially once you add a step size. This tutorial may help reinforce your understanding of NumPy array indexing: <http://www.scipy-lectures.org/intro/numpy/numpy.html>

Try a Google image search for "NumPy indexing" to find useful images, like the one that follows. What does a[2::2, ::2] mean? Try to figure it out by yourself.

<img src= 'http://www.scipy-lectures.org/_images/numpy_indexing.png' width=500/>
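Once you have a guess, you can check it with a sketch like the one below. The 6x6 array `a` is built here so that each entry encodes its own position (10 * row + col); treat that construction as an assumption made for this example.

```python
import numpy as np

# 6x6 array: the entry at (row, col) equals 10 * row + col
a = np.arange(6) + 10 * np.arange(6)[:, np.newaxis]

# start:stop:step applies separately to rows and to columns
sub = a[2::2, ::2]   # rows 2 and 4; columns 0, 2 and 4
print(sub)
# [[20 22 24]
#  [40 42 44]]
```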

## 6. Boolean Selection

Let's briefly go over how to use brackets to select elements based on comparison operators.

In [23]:
# Reset arr
arr = np.arange(1,11)
arr

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

#### Exercise: select the elements that are greater than 4 from the ndarray. 

Step 1: generate a boolean vector.

In [24]:
bool_arr = arr > 4


# Show
bool_arr

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

Step 2: use the boolean vector to select the matching data. 

In [25]:
arr[bool_arr]



array([ 5,  6,  7,  8,  9, 10])
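In practice the two steps are usually written as one expression. The sketch below also shows two closely related patterns, combining conditions and assigning through a mask; the names `big`, `big_even` and `capped` are made up for the example.

```python
import numpy as np

arr = np.arange(1, 11)

# Steps 1 and 2 combined
big = arr[arr > 4]
print(big)        # [ 5  6  7  8  9 10]

# Combine conditions with & (and) or | (or); the parentheses are required
big_even = arr[(arr > 4) & (arr % 2 == 0)]
print(big_even)   # [ 6  8 10]

# A boolean mask also works on the left side of an assignment
capped = arr.copy()
capped[capped > 8] = 8
print(capped)     # [1 2 3 4 5 6 7 8 8 8]
```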

# Section 2: Arithmetic

You can easily perform arithmetic between two arrays of the same shape, or between an array and a scalar. In both cases the operation is applied element by element. Let's see some examples:

In [26]:

arr + arr


array([ 2,  4,  6,  8, 10, 12, 14, 16, 18, 20])

In [27]:
arr * arr



array([  1,   4,   9,  16,  25,  36,  49,  64,  81, 100])

In [28]:
arr - arr



array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [29]:
# arr contains no zeros, so this division raises no warning
# True division always returns floats
arr / arr



array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

In [30]:
arr2 = np.arange(0,11)

In [31]:
arr2

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [32]:
# 0/0 triggers a RuntimeWarning, not an error; the result is nan
arr2/arr2





array([nan,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])

In [33]:
# 1/0 also triggers a RuntimeWarning, not an error; the result is inf
1 / arr2





array([       inf, 1.        , 0.5       , 0.33333333, 0.25      ,
       0.2       , 0.16666667, 0.14285714, 0.125     , 0.11111111,
       0.1       ])
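If the warnings get in the way, here are two options, sketched under the assumption that you either want to silence them or want a fill value in place of `nan`. The names `num`, `den`, `raw` and `safe` are made up for this example.

```python
import numpy as np

num = np.arange(0, 5)
den = np.arange(0, 5)

# Option 1: silence the warnings locally; the nan result is unchanged
with np.errstate(divide="ignore", invalid="ignore"):
    raw = num / den          # first element is nan (0 / 0)

# Option 2: divide only where the denominator is nonzero;
# the remaining slots keep the 0.0 already stored in `out`
safe = np.divide(num, den, out=np.zeros(len(num)), where=den != 0)

print(raw)
print(safe)    # [0. 1. 1. 1. 1.]
```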

In [34]:
arr ** 3

array([   1,    8,   27,   64,  125,  216,  343,  512,  729, 1000],
      dtype=int32)

# Section 3: Universal Array Functions (ufunc)

NumPy comes with many [universal array functions](http://docs.scipy.org/doc/numpy/reference/ufuncs.html), which are mathematical functions that operate element by element on a whole array. Let's show some common ones:
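To see what "across the array" means, the sketch below compares a plain Python loop with the corresponding ufunc call. The names `looped` and `vectorized` are made up for this illustration.

```python
import math
import numpy as np

arr = np.arange(1, 11)

# Python loop: one function call per element
looped = [math.sqrt(x) for x in arr]

# ufunc: one call for the whole array
vectorized = np.sqrt(arr)

same = np.allclose(looped, vectorized)
print(same)                             # True
print(isinstance(np.sqrt, np.ufunc))    # True
```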

## 1. On 1D arrays. 

In [36]:
arr

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [35]:
# Taking Square Roots
np.sqrt(arr)



array([1.        , 1.41421356, 1.73205081, 2.        , 2.23606798,
       2.44948974, 2.64575131, 2.82842712, 3.        , 3.16227766])

In [37]:
# Calculating exponential (e^x)
np.exp(arr)



array([2.71828183e+00, 7.38905610e+00, 2.00855369e+01, 5.45981500e+01,
       1.48413159e+02, 4.03428793e+02, 1.09663316e+03, 2.98095799e+03,
       8.10308393e+03, 2.20264658e+04])

In [38]:
# Find the largest
# same as arr.max()
np.max(arr)



10

In [39]:
# Calculate sin()
np.sin(arr)



array([ 0.84147098,  0.90929743,  0.14112001, -0.7568025 , -0.95892427,
       -0.2794155 ,  0.6569866 ,  0.98935825,  0.41211849, -0.54402111])

In [40]:
# Calculate natural logarithm
np.log(arr)



array([0.        , 0.69314718, 1.09861229, 1.38629436, 1.60943791,
       1.79175947, 1.94591015, 2.07944154, 2.19722458, 2.30258509])

## 2. On 2D arrays

#### Exercise:

1. How to get mean grade of each exam?
2. How to get the highest grade for each exam?
3. How to get the standard deviation for each exam?
4. How to get evenly weighted average exam grade for each student?
5. How to get the summation of all grades for exam 1 and exam 2, respectively?
6. If the exams are weighted 30%, 30%, and 40%, calculate the weighted average grade for each student. 

In [41]:
grade_arr

array([[79, 95, 60],
       [95, 60, 61],
       [99, 67, 84],
       [76, 76, 97],
       [91, 84, 98],
       [70, 69, 96],
       [88, 65, 76],
       [67, 73, 80],
       [82, 89, 61],
       [94, 67, 88]])

In [53]:
# Q1
np.mean(grade_arr ,axis = 0)


array([84.1, 74.5, 80.1])

In [54]:
# Q2
np.max(grade_arr, axis = 0)


array([99, 95, 98])

In [55]:
# Q3
np.std(grade_arr, axis = 0)


array([10.43503713, 10.80971785, 14.44610674])

In [56]:
# Q4
np.mean(grade_arr, axis = 1)


array([78.        , 72.        , 83.33333333, 83.        , 91.        ,
       78.33333333, 76.33333333, 73.33333333, 77.33333333, 83.        ])

In [60]:
# Q5
np.sum(grade_arr, axis = 0)[:2]


array([841, 745])

In [61]:
# Q5 quicker solution
np.sum(grade_arr[:,:2], axis = 0)

array([841, 745])

In [62]:
# Q6
grade_arr.dot([0.3, 0.3, 0.4])

array([76.2, 70.9, 83.4, 84.4, 91.7, 80.1, 76.3, 74. , 75.7, 83.5])
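The `axis` argument used above can be read as "the axis that gets collapsed". The sketch below checks this on the first four students (under the made-up name `scores`) and shows `np.average` as another way to compute Q6; this is an alternative, not a replacement for the `dot` solution.

```python
import numpy as np

# First four rows of the grade table (students x exams)
scores = np.array([[79, 95, 60],
                   [95, 60, 61],
                   [99, 67, 84],
                   [76, 76, 97]])

per_exam = scores.mean(axis=0)      # collapses the student axis
per_student = scores.mean(axis=1)   # collapses the exam axis
print(scores.shape, per_exam.shape, per_student.shape)   # (4, 3) (3,) (4,)

weights = [0.3, 0.3, 0.4]
via_dot = scores.dot(weights)
via_average = np.average(scores, axis=1, weights=weights)

print(via_dot)
print(np.allclose(via_dot, via_average))   # True
```

The two results agree here because the weights sum to 1; `np.average` divides by the sum of the weights, so it would still give an average if they did not.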

# Great Job!