# Introduction to numpy

### Package for scientific computing with Python

* Numerical Python, or "Numpy" for short
* foundational package on which many of the most common data science packages are built.
* high performance multi-dimensional arrays.  
* key benefits: **speed** and **functionality**.

## Numpy

The key features of numpy are:

- **ndarrays**: n-dimensional arrays of the same data type which are fast and space-efficient.
- **Broadcasting**: a useful tool which defines implicit behavior between multi-dimensional arrays of different sizes.
- **Vectorization**: applies numeric operations to entire ndarrays at once, without explicit Python loops.
- **Input/Output**: simplifies reading and writing of data from/to file.

### Benefits and characteristics of NumPy arrays


NumPy arrays have several advantages over Python lists. 

1. High-performance manipulation of sequences of homogeneous data items

2. Vectorized operations

3. Boolean selection

4. Sliceability
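
The four points above can be sketched in a few lines. This is only an illustration; the variable names are made up for the example and do not appear elsewhere in the notebook.

```python
import numpy as np

values = [1, 2, 3, 4, 5, 6]          # plain Python list
arr = np.array(values)               # homogeneous ndarray

# Vectorized operation: one expression instead of an explicit loop
doubled_list = [v * 2 for v in values]
doubled_arr = arr * 2

# Boolean selection: keep only the elements that satisfy a condition
evens = arr[arr % 2 == 0]

# Slicing uses the same syntax as lists, but returns a view rather than a copy
middle = arr[1:4]

print(doubled_arr)   # [ 2  4  6  8 10 12]
print(evens)         # [2 4 6]
print(middle)        # [2 3 4]
```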

### Difference between Numpy and Python

### standard deviation

In [1]:
a = [1,2,3,4,5,6,7,8,9]

In [2]:
import numpy as np

### standard deviation (python)

In [3]:
import math

def std(lst):
    mean = sum(lst) / len(lst)
    variance = 0
    for e in lst:
        variance += (e - mean) ** 2
    variance /= len(lst)
    
    return math.sqrt(variance)

std(a)

2.581988897471611

### standard deviation (numpy)

In [4]:
a = np.array(a)
a.std()

2.581988897471611

### covariance matrix (numpy)

In [5]:
x = np.array([1,2,3,4,5,6,7,8,9])
y = np.array([9,8,7,6,5,4,3,2,1])

z = np.cov(x, y)
print(z)
type(z)
z.shape

[[ 7.5 -7.5]
 [-7.5  7.5]]


(2, 2)
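
Note that the variance on the diagonal of the covariance matrix (7.5) is not the square of the standard deviation computed earlier (2.58...² ≈ 6.67). The sketch below shows the likely reason: `std()` and `var()` divide by N by default (`ddof=0`), while `np.cov` divides by N - 1.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

pop_var = x.var()              # ddof=0: divide by N (the default for var/std)
sample_var = x.var(ddof=1)     # ddof=1: divide by N - 1 (the default for np.cov)

print(pop_var)                 # roughly 6.667
print(sample_var)              # 7.5
print(np.cov(x, x)[0, 0])      # 7.5
```

Passing `ddof=1` to `std()` or `var()` gives the sample estimate when that is what the analysis calls for.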

### Getting started with ndarray

**ndarrays** are time and space-efficient multidimensional arrays at the core of numpy.

### How to create 1 dimensional numpy arrays

In [6]:
import numpy as np                 # Importing numpy

In [22]:
an_array = np.array([3, 33, 333])  # Create a 1 dimensional array

In [26]:
print(type(an_array))   # The type of an ndarray is: "<class 'numpy.ndarray'>"

<class 'numpy.ndarray'>


In [27]:
print(an_array)

[  3  33 333]


In [28]:
# test the shape of the array we just created, it should have just one dimension
print(an_array.shape)

(3,)


### Indexing

In [29]:
# because this is a 1-rank array, we need only one index to access each element
print(an_array[0], an_array[1], an_array[2]) 

3 33 333


In [30]:
an_array[0] = 888                 # ndarrays are mutable, here we change an element of the array

print(an_array)

[888  33 333]


### How to create a 2 dimensional numpy array

Notice the format below of [ [row] , [row] ].  2 dimensional arrays are great for representing matrices which are often useful in data science.

In [31]:
another = np.array([[11,12,13],[21,22,23]])   # Create a 2 dimensional array

In [32]:
print(another)  # print the array

[[11 12 13]
 [21 22 23]]


In [33]:
print("The shape is 2 rows, 3 columns: ", another.shape)  # rows x columns                   

The shape is 2 rows, 3 columns:  (2, 3)


In [34]:
print("Accessing elements [0,0], [0,1], and [1,0] of the ndarray: ", 
      another[0, 0], ", ",another[0, 1],", ", another[1, 0])

Accessing elements [0,0], [0,1], and [1,0] of the ndarray:  11 ,  12 ,  21


### Ways to create numpy arrays

In [35]:
# create a 2x2 array of zeros
ex1 = np.zeros((2,2))      
print(ex1)                              

[[0. 0.]
 [0. 0.]]


In [36]:
# create a 2x2 array filled with 9.0
ex2 = np.full((2,2), 9.0)  
print(ex2)   

[[9. 9.]
 [9. 9.]]


### Ways to create numpy arrays

In [37]:
# create a 2x2 matrix with the diagonal 1s and the others 0
ex3 = np.eye(2,2)
print(ex3)  

[[1. 0.]
 [0. 1.]]


In [38]:
# create an array of ones
ex4 = np.ones((1,2))
print(ex4)    

[[1. 1.]]


In [39]:
# notice that the above ndarray (ex4) is actually rank 2, it is a 1x2 array
print(ex4.shape)

(1, 2)


In [40]:
# which means we need to use two indexes to access an element
print(ex4[0,1])

1.0


### Ways to create numpy arrays

In [41]:
# create an array of random floats between 0 and 1
ex5 = np.random.random((2,2))
print(ex5)    

[[0.14172463 0.66354215]
 [0.82974444 0.56403656]]


### Array Indexing

#### Slice indexing:

- use slice indexing to pull out sub-regions of ndarrays.

In [42]:
# Rank 2 array of shape (3, 4)
an_array = np.array([[11,12,13,14], [21,22,23,24], [31,32,33,34]])

In [43]:
print(an_array)

[[11 12 13 14]
 [21 22 23 24]
 [31 32 33 34]]


#### Slice indexing
Use array slicing to get a subarray consisting of the first 2 rows and the columns at index 1 and 2 (the second and third columns).

In [44]:
a_slice = an_array[:2, 1:3]
print(a_slice)

[[12 13]
 [22 23]]


#### Slice indexing
When you modify a slice, you actually modify the underlying array.

In [45]:
print("Before:", an_array[0, 1])   #inspect the element at 0, 1  
a_slice[0, 0] = 1000    # a_slice[0, 0] is the same piece of data as an_array[0, 1]
print("After:", an_array[0, 1])    

Before: 12
After: 1000
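
If an independent subarray is needed, one option is to call `.copy()` on the slice. The sketch below uses a small made-up array (`base`) and `np.shares_memory` to check which result still points at the original data.

```python
import numpy as np

base = np.array([[11, 12, 13], [21, 22, 23]])

view = base[:, 1:]                  # basic slicing: shares memory with base
independent = base[:, 1:].copy()    # explicit copy: owns its own data

independent[0, 0] = -1              # does not touch base
view[0, 0] = 1000                   # writes through to base[0, 1]

print(np.shares_memory(base, view))         # True
print(np.shares_memory(base, independent))  # False
print(base[0, 1])                           # 1000
```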


#### Integer & Slice indexing

We can use combinations of integer indexing and slice indexing to create different shaped matrices.

In [46]:
# Create a Rank 2 array of shape (3, 4)
an_array = np.array([[11,12,13,14], [21,22,23,24], [31,32,33,34]])

In [47]:
print(an_array)

[[11 12 13 14]
 [21 22 23 24]
 [31 32 33 34]]


#### Integer & Slice indexing

In [48]:
# Using both integer indexing & slicing generates an array of lower rank
row_rank1 = an_array[1, :]    # Rank 1 view 

print(row_rank1, row_rank1.shape)  # notice only a single []

[21 22 23 24] (4,)


In [49]:
# Slicing alone: generates an array of the same rank as the an_array
row_rank2 = an_array[1:2, :]  # Rank 2 view 

print(row_rank2, row_rank2.shape)   # Notice the [[ ]]

[[21 22 23 24]] (1, 4)


In [50]:
#We can do the same thing for columns of an array:
col_rank1 = an_array[:, 1]
col_rank2 = an_array[:, 1:2]

print(col_rank1, col_rank1.shape)  # Rank 1
print()
print(col_rank2, col_rank2.shape)  # Rank 2

[12 22 32] (3,)

[[12]
 [22]
 [32]] (3, 1)


### Array Indexing for changing elements

In [51]:
# Create a new array
an_array = np.array([[11,12,13], [21,22,23], [31,32,33], [41,42,43]])

In [52]:
print('Original Array:')
print(an_array)

Original Array:
[[11 12 13]
 [21 22 23]
 [31 32 33]
 [41 42 43]]


### Array Indexing for changing elements

In [53]:
# Create an array of indices
col_indices = np.array([0, 1, 2, 0])
print('\nCol indices picked : ', col_indices)


Col indices picked :  [0 1 2 0]


In [54]:
row_indices = np.arange(4)
print('\nRows indices picked : ', row_indices)


Rows indices picked :  [0 1 2 3]


In [55]:
# Examine the pairings of row_indices and col_indices.  These are the elements we'll change next.
for row,col in zip(row_indices,col_indices):
    print(row, ", ",col)

0 ,  0
1 ,  1
2 ,  2
3 ,  0


### Array Indexing for changing elements

In [56]:
# Select one element from each row
print('Values in the array at those indices: ',an_array[row_indices, col_indices])

Values in the array at those indices:  [11 22 33 41]


In [57]:
# Change one element from each row using the indices selected
an_array[row_indices, col_indices] += 100000

In [58]:
print('\nChanged Array:')
print(an_array)


Changed Array:
[[100011     12     13]
 [    21 100022     23]
 [    31     32 100033]
 [100041     42     43]]
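
Unlike slices, selecting with integer index arrays returns a copy, so changes to the selected values do not reach the original array unless the assignment goes through the index expression itself. A minimal sketch with a made-up 3x3 array (`grid`):

```python
import numpy as np

grid = np.array([[11, 12, 13], [21, 22, 23], [31, 32, 33]])
rows = np.array([0, 1, 2])
cols = np.array([0, 1, 2])

picked = grid[rows, cols]     # integer-array indexing returns a copy
picked[0] = -1                # so this leaves grid unchanged
unchanged = grid[0, 0]        # still 11

grid[rows, cols] = 0          # assigning through the index does modify grid

print(picked)                                # [-1 22 33]
print(unchanged)                             # 11
print(grid[0, 0], grid[1, 1], grid[2, 2])    # 0 0 0
```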


### Hands-on Exercise 1: Get portion of numpy array

In [59]:
# DO NOT MODIFY

# we will use this dummy matrix for some portions of this exercise.

def get_matrix():
    return np.array(
      [[ 0.35066314,  0.94844269,  0.69324339,  -0.32790416],
       [ -0.7935923 ,  0.9212632 ,  0.13607887,  0.56358399],
       [ 0.25597054,  0.74834666,  -0.81322464,  0.11280075],
       [ -0.53822742,  -0.63970183,  0.1439784 ,  0.58045905]])

mat = get_matrix()

In the cell below, modify the function to RETURN the elements occurring in the first two rows and the 2nd-3rd columns

In [60]:
# modify this cell

def find_slice(matx):
    ### BEGIN SOLUTION

    ### END SOLUTION

In [61]:
# DO NOT MODIFY
ans = [[ 0.94844269,  0.69324339],[ 0.9212632 ,  0.13607887]]

try: assert np.all(find_slice(mat) == np.array(ans))  # np.alltrue was removed in NumPy 2.0
except AssertionError as e: print("Try again, your output did not match the expected answer above")

### Hands-on Exercise 2: Update portion of numpy array

In the cell below, modify the function to perform in-place addition of 1000 to every element of the slice you just created and RETURN the changed version

In [65]:
# modify this cell

def update_slice(matx):
    ### BEGIN SOLUTION
    
    ### END SOLUTION

In [66]:
# DO NOT MODIFY

ans = [[ 1000.94844269,  1000.69324339],[ 1000.9212632 ,  1000.13607887]]

try: 
    mat=get_matrix()
    update_slice(mat)
    assert np.allclose(find_slice(mat), np.array(ans))  # tolerant comparison for floats
except AssertionError as e: print("Try again: be sure to check both your update_slice and find_slice functions.")

### Boolean Indexing

In [67]:
# create a 3x2 array
an_array = np.array([[11,12], [21, 22], [31, 32]])
print(an_array)

[[11 12]
 [21 22]
 [31 32]]


In [68]:
# create a filter which will be boolean values for whether each element meets this condition
filter = (an_array > 15)
print (filter)

[[False False]
 [ True  True]
 [ True  True]]


### Boolean Indexing

In [69]:
# we can now select just those elements which meet that criteria
print(an_array[filter])

[21 22 31 32]


In [70]:
# For short, we could have just used the approach below without the need for the separate filter array.
print( an_array[an_array > 15] )

[21 22 31 32]


### Boolean Indexing

What is particularly useful is that we can actually change elements in the array applying a similar logical filter.  Let's add 100 to all the even values.

In [71]:
an_array[an_array % 2 == 0] +=100
print(an_array)

[[ 11 112]
 [ 21 122]
 [ 31 132]]


### Hands-on Exercise 3: Filter numpy array

In the cell below, modify the function to RETURN the elements of the matrix that are greater than 1

In [72]:
# modify this cell

def bool_filter(matx):
    ### BEGIN SOLUTION
   
    ### END SOLUTION

In [73]:
# DO NOT MODIFY

ans = [1000.94844269,  1000.69324339,  1000.9212632, 1000.13607887]


try: assert np.allclose(bool_filter(mat), np.array(ans))  # tolerant comparison for floats
except AssertionError as e: print("Try again, your solution did not produce the expected output above")

### Datatypes and Array Operations

### Datatypes

In [74]:
ex1 = np.array([11, 12]) # NumPy infers the data type
print(ex1.dtype)

int64


In [75]:
ex2 = np.array([11.0, 12.0]) # NumPy infers the data type
print(ex2.dtype)

float64


In [76]:
ex3 = np.array([11, 21], dtype=np.int64) # You can also tell NumPy the data type explicitly
print(ex3.dtype)

int64


### Datatypes

In [77]:
# you can use this to force floats into integers (the fractional part is truncated, not rounded)
ex4 = np.array([11.1,12.7], dtype=np.int64)
print(ex4.dtype)

int64


In [78]:
print(ex4)

[11 12]
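
Converting floats to an integer dtype discards the fractional part, which is not the same as flooring once negative values are involved. The sketch below, using made-up values, contrasts `astype` with an explicit `np.floor`.

```python
import numpy as np

values = np.array([11.1, 12.7, -0.3, -1.7])

truncated = values.astype(np.int64)          # drops the fractional part
floored = np.floor(values).astype(np.int64)  # rounds toward negative infinity

print(truncated)   # [11 12  0 -1]
print(floored)     # [11 12 -1 -2]
```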


In [79]:
# you can use this to force integers into floats if you anticipate
# the values may change to floats later
ex5 = np.array([11, 21], dtype=np.float64)
print(ex5.dtype)

float64


In [80]:
print(ex5)

[11. 21.]


### Hands-on Exercise 4: Int Converter


In the cell below, modify the function to convert each element of the input matrix to an integer data type (for example `np.int64`; the old `np.int` alias was removed in NumPy 1.24) AND return a new matrix with the converted data type (i.e. full of integers)

In [81]:
# modify this cell

def int_converter(matx):
    ### BEGIN SOLUTION

    ### END SOLUTION

In [82]:
# DO NOT MODIFY

ans = [[   0, 1000, 1000,    0],
       [   0, 1000, 1000,    0],
       [   0,    0,    0,    0],
       [   0,    0,    0,    0]]

try: assert np.all(int_converter(mat) == np.array(ans))  # np.alltrue was removed in NumPy 2.0
except AssertionError as e: print("Try again - be sure your code from Exercise 2 worked properly as well.")

### Arithmetic Array Operations

In [83]:
x = np.array([[111,112],[121,122]], dtype=np.int64)  # np.int was removed in NumPy 1.24
print(x)

[[111 112]
 [121 122]]


In [84]:
y = np.array([[211.1,212.1],[221.1,222.1]], dtype=np.float64)
print(y)

[[211.1 212.1]
 [221.1 222.1]]


### Addition

In [85]:
print(x + y)         # The plus sign works

[[322.1 324.1]
 [342.1 344.1]]


In [86]:
print(np.add(x, y))  # so does the numpy function "add"

[[322.1 324.1]
 [342.1 344.1]]


### Subtraction

In [87]:
print(x - y)

[[-100.1 -100.1]
 [-100.1 -100.1]]


In [88]:
print(np.subtract(x, y))

[[-100.1 -100.1]
 [-100.1 -100.1]]


### Multiplication

In [89]:
print(x * y)

[[23432.1 23755.2]
 [26753.1 27096.2]]


In [90]:
print(np.multiply(x, y))

[[23432.1 23755.2]
 [26753.1 27096.2]]


### Division

In [91]:
print(x / y)

[[0.52581715 0.52805281]
 [0.54726368 0.54930212]]


In [92]:
print(np.divide(x, y))

[[0.52581715 0.52805281]
 [0.54726368 0.54930212]]


### Square Root & Exponentiation

In [93]:
# square root
print(np.sqrt(x))

[[10.53565375 10.58300524]
 [11.         11.04536102]]


In [94]:
# exponent (e ** x)
print(np.exp(x))

[[1.60948707e+48 4.37503945e+48]
 [3.54513118e+52 9.63666567e+52]]


### Statistical Methods, Sorting, and Set Operations

### Basic Statistical Operations

In [95]:
# set up a random 2 x 5 matrix
arr = 10 * np.random.randn(2,5)
print(arr)

[[20.68560242  1.34898724  4.13774816  5.08346089 17.25015535]
 [ 7.05053415  4.61703708  5.72942328 -3.76117827  8.60944239]]


### Mean

In [96]:
# compute the mean for all elements
print(arr.mean())

7.075121270188355


In [97]:
# compute the means by row
print(arr.mean(axis = 1))

[9.70119081 4.44905173]


In [98]:
# compute the means by column
print(arr.mean(axis = 0))

[13.86806829  2.98301216  4.93358572  0.66114131 12.92979887]
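
Because the random values above change on every run, a fixed array can make the `axis` argument easier to follow. In this sketch (the array `table` is made up for illustration), `axis=1` collapses the columns and `axis=0` collapses the rows; `keepdims=True` keeps the reduced axis so the result can be subtracted from the original.

```python
import numpy as np

table = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])         # shape (2, 3)

row_means = table.mean(axis=1)              # one value per row
col_means = table.mean(axis=0)              # one value per column
kept = table.mean(axis=1, keepdims=True)    # shape (2, 1) instead of (2,)
centered = table - kept                     # each row minus its own mean

print(row_means)   # [2. 5.]
print(col_means)   # [2.5 3.5 4.5]
print(centered)
```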


### Sorting

In [99]:
# create a 10 element array of randoms
unsorted = np.random.randn(10)
print(unsorted)

[-1.36614519 -0.9984754   0.29879734 -1.40637067  0.04194029 -0.92919917
  0.39200789 -0.1301152  -1.94956902 -0.75284992]


In [100]:
# create copy and sort
sorted = np.array(unsorted)
sorted.sort()
print(sorted)

[-1.94956902 -1.40637067 -1.36614519 -0.9984754  -0.92919917 -0.75284992
 -0.1301152   0.04194029  0.29879734  0.39200789]


In [101]:
# inplace sorting
unsorted.sort() 
print(unsorted)

[-1.94956902 -1.40637067 -1.36614519 -0.9984754  -0.92919917 -0.75284992
 -0.1301152   0.04194029  0.29879734  0.39200789]
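
As an alternative to copying and then sorting in place, `np.sort` returns a sorted copy directly, and `np.argsort` returns the indices that would sort the array. A short sketch with a made-up three-element array:

```python
import numpy as np

data = np.array([3.0, 1.0, 2.0])

ordered = np.sort(data)       # sorted copy; data itself is untouched
order = np.argsort(data)      # indices that would sort data

print(ordered)                # [1. 2. 3.]
print(order)                  # [1 2 0]
print(data)                   # [3. 1. 2.]
```

Indexing with the result of `argsort` (`data[order]`) reproduces the sorted array, which is handy for reordering a second array in the same way.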


### Hands-on Exercise 5: More Filtering

In the cell below, modify the function to perform an in-place update of the input matrix such that
each element that is negative is replaced with 0. <br> Note that the function does not return anything but changes the matrix in place.

In [None]:
# modify this cell

def selective_replace(matx):
    ### BEGIN SOLUTION

    ### END SOLUTION

In [None]:
# DO NOT MODIFY
mat = get_matrix()
ans = np.array([[ 0.35066314,  0.94844269,  0.69324339,  0.        ],
                [ 0.,          0.9212632,   0.13607887,  0.56358399],
                [ 0.25597054,  0.74834666,  0.,          0.11280075],
                [ 0.,          0.,          0.1439784,   0.58045905]])
try: 
    selective_replace(mat)
    assert np.all(mat == ans)  # np.alltrue was removed in NumPy 2.0
except AssertionError as e: print("Try again, your function did not produce the expected output above.")

### Finding Unique elements

In [102]:
array = np.array([1,2,1,4,2,1,4,2])

print(np.unique(array))

[1 2 4]


### Set Operations with np.array data type

In [103]:
s1 = np.array(['desk','chair','bulb'])
s2 = np.array(['lamp','bulb','chair'])
print(s1, s2)

['desk' 'chair' 'bulb'] ['lamp' 'bulb' 'chair']


### Set Operations with np.array data type

In [104]:
print( np.intersect1d(s1, s2) )    # Intersection

['bulb' 'chair']


In [105]:
print( np.union1d(s1, s2) )       # Union

['bulb' 'chair' 'desk' 'lamp']


In [106]:
print( np.setdiff1d(s1, s2) )     # elements in s1 that are not in s2

['desk']


In [107]:
print( np.isin(s1, s2) )          # which elements of s1 are also in s2 (np.in1d is deprecated)

[False  True  True]


### Broadcasting

In [108]:
start = np.zeros((4,3))
print(start)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


In [109]:
# create a rank 1 ndarray with 3 values
add_rows = np.array([1, 0, 2])
print(add_rows)

[1 0 2]


In [110]:
y = start + add_rows  # add to each row of 'start' using broadcasting

print(y)
y

[[1. 0. 2.]
 [1. 0. 2.]
 [1. 0. 2.]
 [1. 0. 2.]]


array([[1., 0., 2.],
       [1., 0., 2.],
       [1., 0., 2.],
       [1., 0., 2.]])

### Broadcasting

In [111]:
# create an ndarray which is 4 x 1 to broadcast across columns
add_cols = np.array([[0,1,2,3]])
add_cols = add_cols.T

print(add_cols)

[[0]
 [1]
 [2]
 [3]]


In [112]:
y = start + add_cols # add to each column of 'start' using broadcasting
print(y)

[[0. 0. 0.]
 [1. 1. 1.]
 [2. 2. 2.]
 [3. 3. 3.]]


### Broadcasting

In [113]:
add_scalar = np.array([1])  # this will just broadcast in both dimensions
print(start+add_scalar)

[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
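
The three cases above follow one rule: shapes are compared from the trailing dimension backwards, and each pair of sizes must either match or contain a 1. The sketch below checks this with `np.broadcast_shapes` (assumed available, i.e. NumPy 1.20 or later) and shows one shape that does not broadcast.

```python
import numpy as np

start = np.zeros((4, 3))

# Compatible shapes: trailing sizes match or one of them is 1
shape_rows = np.broadcast_shapes((4, 3), (3,))     # (4, 3)
shape_cols = np.broadcast_shapes((4, 3), (4, 1))   # (4, 3)
print(shape_rows, shape_cols)

# A length-4 vector does not line up with the trailing dimension of size 3
try:
    start + np.array([0, 1, 2, 3])
    failed = False
except ValueError as err:
    failed = True
    print("cannot broadcast:", err)

# Adding a new axis turns shape (4,) into (4, 1), which does broadcast
fixed = start + np.array([0, 1, 2, 3])[:, np.newaxis]
print(fixed.shape)                                 # (4, 3)
```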


### Hands-on Exercise 6: Broadcasting  

add first row to every row of the input matrix

In the cell below, modify the function to RETURN a new matrix by taking the first row of the input matrix and adding it to every row of that matrix.

So if the input matrix is:
2, 3
1, 4

The output should be:
4, 6
3, 7

See how the first row (2,3) was added to itself and all subsequent rows.

In [None]:
# modify this cell

def first_row_adder(matx):
    ### BEGIN SOLUTION

    ### END SOLUTION
    

In [None]:
# DO NOT MODIFY
test = np.array([[ 22.33,  4.53, 10.64],[ 10.64 ,  4.53, 100.97]])
ans = [[  44.66,    9.06,   21.28], [  32.97,    9.06,  111.61]]

try: assert np.allclose(first_row_adder(test), np.array(ans))
except AssertionError as e: print("Try again, first row was not added to all other rows")

### Speed Test: ndarrays vs lists

First set up the parameters for the speed test.  We'll be testing the time to sum the elements of an ndarray versus a list.

In [114]:
from timeit import Timer

size    = 1000000
timeits = 1000

### Numpy

In [115]:
# create the ndarray with values 0,1,2...,size-1
nd_array = np.arange(size)
print( type(nd_array) )

<class 'numpy.ndarray'>


In [116]:
# timer expects the operation as a parameter, here we pass nd_array.sum()
timer_numpy = Timer("nd_array.sum()", "from __main__ import nd_array")

print("Time taken by numpy ndarray: %.3e" % (timer_numpy.timeit(timeits)/timeits))

Time taken by numpy ndarray: 9.273e-04


### Python

In [117]:
# create the list with values 0,1,2...,size-1
a_list = list(range(size))
print (type(a_list) )

<class 'list'>


In [118]:
# timer expects the operation as a parameter, here we pass sum(a_list)
timer_list = Timer("sum(a_list)", "from __main__ import a_list")

print("Time taken by list:  %.3e" % (timer_list.timeit(timeits)/timeits))

Time taken by list:  7.619e-03


### Read or Write to Disk

#### Binary Format

In [119]:
x = np.array([ 23.23, 24.24] )

np.save('an_array', x)

y = np.load('an_array.npy')

print (y)

[23.23 24.24]


#### Text Format

In [120]:
np.savetxt('array.txt', X=x, delimiter=',')

!cat array.txt

2.323000000000000043e+01
2.423999999999999844e+01


In [121]:
np.loadtxt('array.txt', delimiter=',')

array([23.23, 24.24])
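
The `delimiter` argument only becomes visible with two-dimensional data, and the `fmt` argument controls how each number is written. The following sketch writes a made-up 2x2 array to a temporary file (the file name `table.csv` is arbitrary) and reads it back.

```python
import os
import tempfile

import numpy as np

table = np.array([[1.5, 2.5], [3.5, 4.5]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "table.csv")      # hypothetical file name
    np.savetxt(path, table, delimiter=",", fmt="%.2f")
    with open(path) as fh:
        text = fh.read()                       # raw file contents
    restored = np.loadtxt(path, delimiter=",")

print(text)       # two lines: 1.50,2.50 and 3.50,4.50
print(restored)
```

A short `fmt` keeps the file readable but limits precision; the binary `.npy` format preserves the values exactly.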

### Additional Common ndarray Operations

### Dot Product on Matrices

In [122]:
# determine the dot product of two matrices
x2d = np.array([[1,1],[1,1]])
y2d = np.array([[2,2],[2,2]])

In [123]:
print(x2d.dot(y2d))

[[4 4]
 [4 4]]


In [124]:
print(np.dot(x2d, y2d))

[[4 4]
 [4 4]]


###  Inner Product on Vectors

In [125]:
# determine the inner product of two vectors
a1d = np.array([9 , 9 ])
b1d = np.array([10, 10])

In [126]:
print(a1d.dot(b1d))

180


In [127]:
print(np.dot(a1d, b1d))

180


### Dot Product on Matrix and vector

In [128]:
# dot product of a matrix and a vector
print(x2d.dot(a1d))

[18 18]


In [129]:
print(np.dot(x2d, a1d))

[18 18]
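
For the 1-D and 2-D cases shown above, the `@` operator (Python 3.5 and later) gives the same results as `dot`. The sketch repeats the arrays from this section so that it runs on its own.

```python
import numpy as np

x2d = np.array([[1, 1], [1, 1]])
y2d = np.array([[2, 2], [2, 2]])
a1d = np.array([9, 9])
b1d = np.array([10, 10])

mat_mat = x2d @ y2d     # matrix-matrix product
inner = a1d @ b1d       # inner product of two vectors
mat_vec = x2d @ a1d     # matrix-vector product

print(mat_mat)
print(inner)            # 180
print(mat_vec)          # [18 18]
```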


### Element-wise Functions:

For example, let's compare the values of two arrays element by element to get the maximum of each pair.

In [131]:
# random array
x = np.random.randn(8)
x

array([ 0.04182744, -1.25721899,  0.86100066,  1.25378393,  0.11825724,
        0.76297965,  0.63268867,  0.05112061])

In [132]:
# another random array
y = np.random.randn(8)
y

array([ 0.63413699, -0.96083603,  0.02871878,  0.99043101, -0.00414479,
        0.08577817, -0.74104969, -0.76921094])

In [133]:
# returns element wise maximum between two arrays

np.maximum(x, y)

array([ 0.63413699, -0.96083603,  0.86100066,  1.25378393,  0.11825724,
        0.76297965,  0.63268867,  0.05112061])

### Reshaping array:

In [134]:
# grab values from 0 through 19 in an array
arr = np.arange(20)
print(arr)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]


In [135]:
# reshape to be a 4 x 5 matrix
arr.reshape(4,5)

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])
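
Two details of `reshape` are worth a quick check: it returns a new array object rather than changing `arr`, and one dimension may be given as `-1` so that NumPy infers it. In this sketch the reshaped result happens to be a view of the original data, which is the usual case for a contiguous array.

```python
import numpy as np

arr = np.arange(20)

grid = arr.reshape(4, -1)     # -1 lets NumPy infer the missing dimension (5)
print(grid.shape)             # (4, 5)
print(arr.shape)              # (20,)  arr keeps its original shape

grid[0, 0] = 99               # grid is a view here, so arr sees the change
print(arr[0])                 # 99
```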

### Transpose:

In [136]:
# transpose
ex1 = np.array([[11,12],[21,22]])

ex1.T


array([[11, 21],
       [12, 22]])

### Indexing using where():

In [137]:
x_1 = np.array([1,2,3,4,5])

y_1 = np.array([11,22,33,44,55])

filter = np.array([True, False, True, False, True])

In [138]:
out = np.where(filter, x_1, y_1)
print(out)

[ 1 22  3 44  5]


### Indexing using where():

In [139]:
mat = np.random.rand(5,5)
mat

array([[0.01389154, 0.24352332, 0.48820203, 0.43755395, 0.26079674],
       [0.33594782, 0.84444731, 0.5337959 , 0.93442802, 0.19528942],
       [0.92388965, 0.28010694, 0.52361559, 0.68469169, 0.01021842],
       [0.67322393, 0.88351375, 0.72417387, 0.0730209 , 0.88856387],
       [0.74236594, 0.2125964 , 0.46531809, 0.51985354, 0.46854751]])

In [140]:
np.where( mat > 0.5, 1000, -1)

array([[  -1,   -1,   -1,   -1,   -1],
       [  -1, 1000, 1000, 1000,   -1],
       [1000,   -1, 1000, 1000,   -1],
       [1000, 1000, 1000,   -1, 1000],
       [1000,   -1,   -1, 1000,   -1]])

### "any" or "all" conditionals:

In [141]:
arr_bools = np.array([ True, False, True, True, False ])

In [142]:
arr_bools.any()

True

In [143]:
arr_bools.all()

False

### Random Number Generation:

In [144]:

Y = np.random.normal(size = (2,5))[0]
print(Y)

[-0.06362971  0.46470904  1.10668186 -0.29806328 -1.81000475]


In [145]:
Z = np.random.randint(low=2,high=50,size=4)
print(Z)

[20 43 49 41]


In [146]:
np.random.permutation(Z) #return a new ordering of elements in Z

array([49, 20, 43, 41])

### Random Number Generation:

In [147]:
np.random.uniform(size=4) #uniform distribution

array([0.57434025, 0.73688616, 0.85050752, 0.60496441])

In [148]:
np.random.normal(size=4) #normal distribution

array([-1.44962095, -1.34521045,  0.61451917,  2.21946566])
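
The outputs in this section differ on every run. When reproducible numbers are needed, one option is NumPy's `Generator` interface created with `np.random.default_rng` and a fixed seed. The seed value 42 below is arbitrary.

```python
import numpy as np

rng_a = np.random.default_rng(seed=42)   # seed value chosen arbitrarily
rng_b = np.random.default_rng(seed=42)

draw_a = rng_a.normal(size=4)
draw_b = rng_b.normal(size=4)
print(np.array_equal(draw_a, draw_b))    # True: same seed, same numbers

ints = rng_a.integers(low=2, high=50, size=4)   # high is exclusive
shuffled = rng_a.permutation(ints)              # new ordering, ints unchanged
print(ints, shuffled)
```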

### Merging data sets

In [149]:
K = np.random.randint(low=2,high=50,size=(2,2))
print(K)

[[34 48]
 [ 8 18]]


In [150]:
M = np.random.randint(low=2,high=50,size=(2,2))
print(M)

[[ 9 11]
 [ 2 34]]


### Merging data sets along rows:

In [151]:
np.vstack((K,M))

array([[34, 48],
       [ 8, 18],
       [ 9, 11],
       [ 2, 34]])

In [152]:
np.concatenate([K, M], axis = 0)

array([[34, 48],
       [ 8, 18],
       [ 9, 11],
       [ 2, 34]])

### Merging data sets along columns:

In [153]:
np.hstack((K,M))

array([[34, 48,  9, 11],
       [ 8, 18,  2, 34]])

In [154]:
np.concatenate([K, M.T], axis = 1)

array([[34, 48,  9,  2],
       [ 8, 18, 11, 34]])

### Removing Outliers

In [155]:
arr = [10, 386, 479, 627, 20, 523, 482, 483, 542, 699, 535, 617, 577, 471, 615, 583, 441, 562, 563, 
       527, 453, 530, 433, 541, 585, 704, 443, 569, 430, 637, 331, 511, 552, 496, 484, 566, 554, 472, 
       335, 440, 579, 341, 545, 615, 548, 604, 439, 556, 442, 461, 624, 611, 444, 578, 405, 487, 490, 
       496, 398, 512, 422, 455, 449, 432, 607, 679, 434, 597, 639, 565, 415, 486, 668, 414, 665, 763, 
       557, 304, 404, 454, 689, 610, 483, 441, 657, 590, 492, 476, 437, 483, 529, 363, 711, 543]

In [156]:
elements = np.array(arr)

In [157]:
mean = np.mean(elements, axis=0)
sd = np.std(elements, axis=0)

### Removing Outliers

In [158]:
final_list = [x for x in arr if (x > mean - 2 * sd)]
final_list = [x for x in final_list if (x < mean + 2 * sd)]

In [159]:
print(final_list)

[386, 479, 627, 523, 482, 483, 542, 699, 535, 617, 577, 471, 615, 583, 441, 562, 563, 527, 453, 530, 433, 541, 585, 704, 443, 569, 430, 637, 331, 511, 552, 496, 484, 566, 554, 472, 335, 440, 579, 341, 545, 615, 548, 604, 439, 556, 442, 461, 624, 611, 444, 578, 405, 487, 490, 496, 398, 512, 422, 455, 449, 432, 607, 679, 434, 597, 639, 565, 415, 486, 668, 414, 665, 557, 304, 404, 454, 689, 610, 483, 441, 657, 590, 492, 476, 437, 483, 529, 363, 711, 543]
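
The same filter can also be written as a single boolean mask, in line with the boolean indexing shown earlier. This is a sketch on a small made-up data set (`elements` below is not the array above); it keeps the values lying strictly within two standard deviations of the mean.

```python
import numpy as np

# toy data with one obvious outlier
elements = np.array([10, 480, 500, 510, 520, 490, 505, 495, 515, 2000])

mean = elements.mean()
sd = elements.std()

# True where the value lies strictly within 2 standard deviations of the mean
mask = np.abs(elements - mean) < 2 * sd
filtered = elements[mask]

print(filtered)
```

The result is an ndarray rather than a list, so further NumPy operations can be applied to it directly.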


# Solutions for Hands-on Exercises

### Hands-on 1 Solution

In [None]:
# modify this cell

def find_slice(matx):
    ### BEGIN SOLUTION
    return matx[:2, 1:3]
    ### END SOLUTION

### Hands-on 2 Solution

In [65]:
# modify this cell

def update_slice(matx):
    ### BEGIN SOLUTION
    matx[:2, 1:3] += 1000
    return matx
    ### END SOLUTION

### Hands-on 3 Solution

In [None]:
# modify this cell

def bool_filter(matx):
    ### BEGIN SOLUTION
    return matx[matx > 1]
    ### END SOLUTION

### Hands-on 4 Solution

In [81]:
# modify this cell

def int_converter(matx):
    ### BEGIN SOLUTION
    return np.array(matx, dtype=np.int64)  # np.int was removed in NumPy 1.24
    ### END SOLUTION

### Hands-on 5 Solution

In [None]:
# modify this cell

def selective_replace(matx):
    ### BEGIN SOLUTION
    matx[matx < 0] = 0
    ### END SOLUTION

### Hands-on 6 Solution

In [None]:
# modify this cell

def first_row_adder(matx):
    ### BEGIN SOLUTION
    return matx + matx[0,:]
    ### END SOLUTION
    

### Hands-on 7 Solution