<img src = "https://img.betapage.co/images/77640967-77641456.png" height=50% width = 50%>

In [2]:
import numpy as np

# Introduction to NumPy

"Numpy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object, and tools for working with these arrays. This library provides you with an array data structure that holds some benefits over Python lists, such as: being more compact, faster access in reading and writing items, being more convenient and more efficient."


# What is a NumPy array?

"The central feature of NumPy is the array object class. Arrays are similar to lists in Python, except that every element of an array must be of the same type, typically a numeric type like float or int. Arrays make operations with large amounts of numeric data very fast and are generally much more efficient than lists."

LINK: https://engineering.ucsb.edu/~shell/che210d/numpy.pdf

<img src = "http://community.datacamp.com.s3.amazonaws.com/community/production/ckeditor_assets/pictures/332/content_arrays-axes.png">

# NumPy Array Syntax
The function np.array takes the list to be converted into an array and, optionally, the data type (dtype) of its elements. If no type is given, NumPy infers one from the contents of the list.

In [3]:
#List to be converted
lst = [1,2,3,4,5,6,7,8,9]

arr = np.array(lst)
arr

array([1, 2, 3, 4, 5, 6, 7, 8, 9])
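
As a small sketch of the optional type argument of np.array, the dtype can be passed explicitly. The variable names below are illustrative only and are not used elsewhere in this notebook.

```python
import numpy as np

lst = [1, 2, 3]

# Without dtype, NumPy infers an integer type from the list contents
int_arr = np.array(lst)

# With dtype=float, every element is stored as a floating-point number
float_arr = np.array(lst, dtype=float)

print(int_arr.dtype.kind)   # 'i' (a signed integer; the exact width depends on the platform)
print(float_arr)            # [1. 2. 3.]
```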

Array elements are accessed, sliced, and manipulated just like lists.

In [4]:
#Slice from index 2 to the end
arr[2:]

array([3, 4, 5, 6, 7, 8, 9])

In [12]:
#manipulate item at index 0
arr[0] = 10
arr

array([10,  2,  3,  4,  5,  6,  7,  8,  9])
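
One difference from lists is worth keeping in mind when slicing: a list slice is an independent copy, while an array slice is a view onto the same data. The sketch below uses illustrative names only.

```python
import numpy as np

lst = [1, 2, 3, 4, 5]
arr = np.array(lst)

lst_slice = lst[1:3]   # a list slice is a new, independent list
arr_slice = arr[1:3]   # an array slice is a view onto the same data

lst_slice[0] = 99
arr_slice[0] = 99

print(lst)   # [1, 2, 3, 4, 5]
print(arr)   # [ 1 99  3  4  5]

# .copy() gives an independent array when a view is not wanted
safe = arr[1:3].copy()
```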


<b>* Why can't we simply use a Python list for these scientific computations?</b>

# Python List VS NumPy Array

"Arrays and lists are both used in Python to store data, but they don't serve exactly the same purposes. They both can be used to store any data type (real numbers, strings, etc), and they both can be indexed and iterated through, but the similarities between the two don't go much further. The main difference between a list and an array is the functions that you can perform to them. For example, you can divide an array by 3, and each number in the array will be divided by 3 and the result will be printed if you request it. If you try to divide a list by 3, Python will tell you that it can't be done, and an error will be thrown."


In [13]:
lst = [3,6,9,12,15,18,12]
lst/3

TypeError: unsupported operand type(s) for /: 'list' and 'int'

In [14]:
arr = np.array([3,6,9,12,15,18,12])
arr/3

array([1., 2., 3., 4., 5., 6., 4.])
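
For comparison, here is a minimal sketch of what the same division looks like with a plain list: each element has to be handled explicitly, for example with a list comprehension.

```python
import numpy as np

lst = [3, 6, 9, 12, 15, 18, 12]

# With a plain list, each element has to be divided explicitly
lst_div = [x / 3 for x in lst]

# With an array, the division is applied to every element at once
arr_div = np.array(lst) / 3

print(lst_div)   # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 4.0]
print(arr_div)   # [1. 2. 3. 4. 5. 6. 4.]
```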

Arrays can be multidimensional. Unlike lists, different axes are accessed using commas inside bracket notation. Here is an example with a two-dimensional array (e.g., a matrix):

In [9]:
lst1 = [1,2,3,4,5]
lst2 = [5,6,7,8,9]
arr = np.array([lst1,lst2])
arr

array([[1, 2, 3, 4, 5],
       [5, 6, 7, 8, 9]])

In [10]:
arr/3

array([[0.33333333, 0.66666667, 1.        , 1.33333333, 1.66666667],
       [1.66666667, 2.        , 2.33333333, 2.66666667, 3.        ]])

In [11]:
lst_lst = [lst1,lst2]
lst_lst

[[1, 2, 3, 4, 5], [5, 6, 7, 8, 9]]

In [12]:
lst_lst/3

TypeError: unsupported operand type(s) for /: 'list' and 'int'

# Indexing Arrays VS Lists

In [14]:
arr

array([[1, 2, 3, 4, 5],
       [5, 6, 7, 8, 9]])

In [17]:
arr[1][0]

5

In [14]:
lst_lst

[[1, 2, 3, 4, 5], [5, 6, 7, 8, 9]]

In [109]:
lst_lst[0,1]

TypeError: list indices must be integers or slices, not tuple

In [110]:
lst_lst[0][1]

2

In [111]:
arr[-1]

array([5, 6, 7, 8, 9])

In [112]:
lst_lst[-1]

[5, 6, 7, 8, 9]

In [113]:
arr

array([[1, 2, 3, 4, 5],
       [5, 6, 7, 8, 9]])
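
The cells above use chained brackets on the array. As a short sketch, the comma notation gives the same element and can also select along the second axis, which chained indexing on a list of lists cannot do.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5],
                [5, 6, 7, 8, 9]])

# Both forms return the element in row 1, column 0
print(arr[1][0])   # 5
print(arr[1, 0])   # 5

# Comma notation can also take a whole column
print(arr[:, 0])   # [1 5]
```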

<h3> How to index a multidimensional array? </h3><br>
Individual elements can be accessed as in lists. In addition, each axis can be sliced with its own start:stop:step range, with the axes separated by commas.

<img src = "http://www.scipy-lectures.org/_images/numpy_indexing.png" height = 60% width = 60%>

In [19]:
list_2d = [[0,1,2,3,4,5],
           [10,11,12,13,14,15],
           [20,21,22,23,24,25],
           [30,31,32,33,34,35],
           [40,41,42,43,44,45],
           [50,51,52,53,54,55]]

In [20]:
array_2d = np.array(list_2d)
print(array_2d)
array_2d.shape

[[ 0  1  2  3  4  5]
 [10 11 12 13 14 15]
 [20 21 22 23 24 25]
 [30 31 32 33 34 35]
 [40 41 42 43 44 45]
 [50 51 52 53 54 55]]


(6, 6)

In [21]:
print(array_2d[0,3:5])

[3 4]


In [23]:
print(array_2d[4:,4:])

[[44 45]
 [54 55]]


In [24]:
print(array_2d[:,2])

[ 2 12 22 32 42 52]


In [27]:
print(array_2d[2::2,::2]) # step by 2

[[20 22 24]
 [40 42 44]]


In [29]:
print(array_2d[:,5])

[ 5 15 25 35 45 55]


In [120]:
# adding new column to numpy array

In [28]:
calc = array_2d[:,5] * 1.05
new_column = np.array(calc)
print(new_column)

[ 5.25 15.75 26.25 36.75 47.25 57.75]


In [122]:
new_array_2d = np.column_stack((array_2d,new_column))
print(new_array_2d)

[[ 0.    1.    2.    3.    4.    5.    5.25]
 [10.   11.   12.   13.   14.   15.   15.75]
 [20.   21.   22.   23.   24.   25.   26.25]
 [30.   31.   32.   33.   34.   35.   36.75]
 [40.   41.   42.   43.   44.   45.   47.25]
 [50.   51.   52.   53.   54.   55.   57.75]]


# Converting Between Arrays and Lists

In [30]:
arr

array([[1, 2, 3, 4, 5],
       [5, 6, 7, 8, 9]])

In [31]:
arr = arr.tolist()
arr

[[1, 2, 3, 4, 5], [5, 6, 7, 8, 9]]

In [32]:
type(arr)

list

In [33]:
arr = np.array(arr)
arr

array([[1, 2, 3, 4, 5],
       [5, 6, 7, 8, 9]])

In [34]:
type(arr)

numpy.ndarray

In [35]:
arr.shape

(2, 5)

# Change Array Shape

<img src = "https://www.safaribooksonline.com/library/view/python-for-data/9781449323592/httpatomoreillycomsourceoreillyimages1346880.png" height = 50% width = 30% style = display.left> 

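Transposed versions of arrays can also be generated. For a two-dimensional array, transpose() swaps the rows and columns (in general it reverses the order of the axes). The result is a view of the original data rather than a copy: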

In [128]:
arr

array([[1, 2, 3, 4, 5],
       [5, 6, 7, 8, 9]])

In [129]:
arr.shape

(2, 5)

In [36]:
arr.transpose()

array([[1, 5],
       [2, 6],
       [3, 7],
       [4, 8],
       [5, 9]])

In [37]:
arr.transpose().shape

(5, 2)

In [38]:
arr.reshape((5,2))

array([[1, 2],
       [3, 4],
       [5, 5],
       [6, 7],
       [8, 9]])
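
Although transpose() and reshape((5,2)) both produce a (5, 2) array here, they arrange the elements differently. A brief sketch of the difference:

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5],
                [5, 6, 7, 8, 9]])

transposed = arr.transpose()     # rows become columns
reshaped = arr.reshape((5, 2))   # same elements read row by row, regrouped in pairs

print(transposed[0])   # [1 5]
print(reshaped[0])     # [1 2]
print(np.array_equal(transposed, reshaped))   # False
```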

Flatten a multidimensional array into a one-dimensional array.

In [39]:
arr

array([[1, 2, 3, 4, 5],
       [5, 6, 7, 8, 9]])

In [40]:
arr.shape

(2, 5)

In [41]:
arr.flatten()

array([1, 2, 3, 4, 5, 5, 6, 7, 8, 9])

In [42]:
arr.flatten().shape

(10,)
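
A related method, ravel(), also returns a one-dimensional array. The sketch below illustrates the practical difference: flatten() always copies, while ravel() returns a view when the memory layout allows it, as it does for this small example array.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

flat_copy = arr.flatten()   # always a new array
flat_view = arr.ravel()     # a view here, because arr is contiguous in memory

flat_copy[0] = 100   # does not touch arr
flat_view[1] = 200   # writes through to arr

print(arr)
# [[  1 200   3]
#  [  4   5   6]]
```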

# Create New Array (Specific)

NumPy also provides many functions to create arrays.

Creates an array of all zeros with a specified shape.

In [136]:
#1-Dimensional
np.zeros(10)

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [55]:
#2-Dimensional
np.zeros((2,2), int)

array([[0, 0],
       [0, 0]])

Creates an array of all ones with a specified shape.

In [57]:
#1-Dimensional
np.ones(10, int)

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

In [139]:
#2-Dimensional
np.ones((2,2))

array([[1., 1.],
       [1., 1.]])

Creates a constant array (specified number) with a specified shape.

In [140]:
#1-Dimensional
np.full(10,7)

array([7, 7, 7, 7, 7, 7, 7, 7, 7, 7])

In [141]:
#2-Dimensional
np.full((2, 2), 7)

array([[7, 7],
       [7, 7]])

Creates an array of a specified shape with random values drawn uniformly from the interval [0, 1).

In [142]:
#1-Dimensional
np.random.random(10)

array([0.16201163, 0.33641115, 0.02440854, 0.56081309, 0.27404955,
       0.88830142, 0.60964334, 0.06476067, 0.3287845 , 0.80430298])

In [143]:
#2-Dimensional
np.random.random((2,2))

array([[0.13427091, 0.8833328 ],
       [0.8799714 , 0.06539739]])

Creates an array of evenly spaced values. With a single argument, np.arange(10) yields the integers 0 through 9.

In [58]:
#1-Dimensional
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Creates an array with a specified "start", "stop", and number of values, evenly spaced.

In [59]:
#1-Dimensional
np.linspace(1, 10, num = 20)

array([ 1.        ,  1.47368421,  1.94736842,  2.42105263,  2.89473684,
        3.36842105,  3.84210526,  4.31578947,  4.78947368,  5.26315789,
        5.73684211,  6.21052632,  6.68421053,  7.15789474,  7.63157895,
        8.10526316,  8.57894737,  9.05263158,  9.52631579, 10.        ])
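
To contrast the two functions, here is a minimal sketch: np.arange is driven by a step size and excludes the stop value, while np.linspace is driven by the number of points and includes the stop value by default.

```python
import numpy as np

# arange: fixed step, the stop value is excluded
by_step = np.arange(1, 10, 2)

# linspace: fixed number of points, the stop value is included by default
by_count = np.linspace(1, 9, num=5)

print(by_step)    # [1 3 5 7 9]
print(by_count)   # [1. 3. 5. 7. 9.]
```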

Creates an identity matrix (array) of a specified size; the examples below are 10x10.

An identity matrix is a square matrix having 1s on the main diagonal, and 0s everywhere else. These are called identity matrices because, when you multiply them with a compatible matrix, you get back the same matrix.
http://www.sparknotes.com/math/algebra2/matrices/section3.rhtml

In [146]:
#2-Dimensional
np.eye(10)

array([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]])

OR

In [147]:
#2-Dimensional
np.identity(10)

array([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]])
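
The defining property of the identity matrix can be checked directly. This is a small sketch with an arbitrary 2x2 matrix A chosen for illustration.

```python
import numpy as np

A = np.array([[2, 7],
              [1, 8]])
I = np.eye(2, dtype=int)

# The @ operator performs matrix multiplication (not element-wise multiplication)
print(np.array_equal(A @ I, A))   # True
print(np.array_equal(I @ A, A))   # True
```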

# Math Functions using NumPy

"As such, it probably won’t surprise you that you can just use +, -, *, / or % to add, subtract, multiply, divide or calculate the remainder of two (or more) arrays. However, a big part of why NumPy is so handy, is because it also has functions to do this. The equivalent functions of the operations that you have seen just now are, respectively, np.add(), np.subtract(), np.multiply(), np.divide() and np.remainder()."

https://www.datacamp.com/community/tutorials/python-numpy-tutorial

In [148]:
arr = np.ones((10,10))
arr

array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])

In [149]:
np.add(arr,2)

array([[3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.]])

In [150]:
#OR
arr + 2

array([[3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.]])

In [151]:
np.multiply(arr,2)

array([[2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]])

In [152]:
#OR
arr*2

array([[2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]])

In [153]:
np.subtract(arr,1)

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

In [154]:
#OR
arr -1 

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

In [155]:
np.divide(arr,2)

array([[0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]])

In [156]:
#OR
arr/2

array([[0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]])

In [157]:
np.remainder(arr,1)

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

In [158]:
#OR
arr % 1

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

In [159]:
arr.sum()

100.0

In [160]:
arr.min()

1.0

In [161]:
arr.max()

1.0

In [162]:
arr.mean()

1.0
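
The examples above combine an array with a single number. As a sketch with two small illustrative arrays, the same operators and functions work element by element between arrays of the same shape, and the aggregation methods accept an axis argument.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])
b = np.array([[10, 20, 30],
              [40, 50, 60]])

# Operator and function forms give the same element-wise result
print(np.array_equal(a + b, np.add(a, b)))   # True
print(b % a)
# [[0 0 0]
#  [0 0 0]]

# Aggregations accept an axis argument
print(a.sum())         # 21
print(a.sum(axis=0))   # [5 7 9]   (one total per column)
print(a.max(axis=1))   # [3 6]     (one maximum per row)
```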

# <font color = magenta> NumPy Problem 1 </font>
<font color = magenta>
Create the three arrays displayed in the image, below.

<img src = "https://i.stack.imgur.com/ojnFF.jpg">

In [30]:
list_2a = [[4,6,4],
           [1,1,8],
           [0,7,5],
           [5,3,3],
           [8,9,5]]
           


In [31]:
array_2a = np.array(list_2a)
print (array_2a)


[[4 6 4]
 [1 1 8]
 [0 7 5]
 [5 3 3]
 [8 9 5]]


In [21]:
list_2b = [[8,8,4],
          [3,4,4],
          [0,0,9],
          [3,7,3],
          [3,4,7],
          [3,4,7]]


In [22]:
array_2b = np.array(list_2b)
print (array_2b.shape)
print (array_2b)

(6, 3)
[[8 8 4]
 [3 4 4]
 [0 0 9]
 [3 7 3]
 [3 4 7]
 [3 4 7]]


In [23]:
list_2c = [[9,5,4],
           [7,7,3],
           [9,5,9],
           [8,7,8],
           [5,8,8]]


In [24]:
array_2c = np.array(list_2c)
print (array_2c.shape)
print (array_2c)

(5, 3)
[[9 5 4]
 [7 7 3]
 [9 5 9]
 [8 7 8]
 [5 8 8]]


# <font color = magenta> NumPy Problem 2 </font>
<font color = magenta>
Create a multidimensional array with dimensions of your choice and fill it with random values (not filled manually).

In [68]:
# multidimensional array with random value.
np.random.randint(1,5+1)

5

In [77]:
list_p2 = np.random.randint(5, size = (2,3))

In [78]:
print (list_p2)

[[0 2 3]
 [2 0 2]]


Find the min and max values of your array.

In [79]:
print ("Minimum value is", list_p2.min())
print ("Maximum value is", list_p2.max())

Minimum value is 0
Maximum value is 3
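
Because the values are random, the output above changes on every run. One possible way to make such an exercise reproducible is a seeded Generator; the seed value and the name arr_p2 below are arbitrary choices for illustration.

```python
import numpy as np

# A seeded Generator gives the same "random" array on every run
rng = np.random.default_rng(seed=42)

# Integers from 0 up to (but not including) 5, in a 2x3 array
arr_p2 = rng.integers(0, 5, size=(2, 3))

print(arr_p2.shape)                 # (2, 3)
print(arr_p2.min(), arr_p2.max())   # the values depend on the seed
```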


# <font color = magenta> NumPy Problem 3 </font>
<font color = magenta>
Create another multidimensional array with dimensions of your choice and fill it with random values (not filled manually). Find the max value of your new array and replace it with your min value. Find the min value and replace it in your array with the max value.

In [83]:
list_p3 = np.random.randint(10, size = (2,3))
print (list_p3)

[[9 9 2]
 [4 1 9]]


In [84]:
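maxA = list_p3.max()
minB = list_p3.min()
print ("maxA is", maxA)
print ("minB is", minB)
# Build both masks before assigning, so the two replacements do not interfere
is_max = (list_p3 == maxA)
is_min = (list_p3 == minB)
list_p3[is_max] = minB  # every max value becomes the min
list_p3[is_min] = maxA  # every min value becomes the max
print (list_p3)

maxA is 9
minB is 1
[[1 1 2]
 [4 9 1]]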


# <font color = magenta> NumPy Problem 4 </font>

Create a random vector of size 10 and sort it.

In [28]:
list_p4 = np.random.randint(10, size = 10)
print (list_p4)
arr = np.sort(list_p4)  # returns a sorted copy and keeps the vector one-dimensional
arr

[6 2 8 2 6 6 5 6 0 2]


array([0, 2, 2, 2, 5, 6, 6, 6, 6, 8])

# <font color = magenta> NumPy Problem 5 </font>

<font color = magenta>
How to swap two rows of an array?

In [36]:
print (array_2a)
array_2a.shape



[[4 6 4]
 [1 1 8]
 [0 7 5]
 [5 3 3]
 [8 9 5]]


(5, 3)
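
The cell above only displays the array. One common way to swap two rows, sketched here for rows 0 and 1, is to index with a list of row numbers on both sides of the assignment.

```python
import numpy as np

array_2a = np.array([[4, 6, 4],
                     [1, 1, 8],
                     [0, 7, 5],
                     [5, 3, 3],
                     [8, 9, 5]])

# Indexing with a list of row numbers copies those rows on the right-hand
# side before assignment, so the two rows are exchanged safely
array_2a[[0, 1]] = array_2a[[1, 0]]

print(array_2a[:2])
# [[1 1 8]
#  [4 6 4]]
```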

# Numpy with Bay Area housing data set

In [2]:
def read_file_housing(filename):
    # Clean the raw CSV and write the result to fixed-housing-data.csv.
    # The with statement closes both files, so the output is fully written
    # before it is read back with np.loadtxt.
    with open(filename,"r") as file_open, open("fixed-housing-data.csv","w") as fixed_file:
        for line in file_open:
            if "HomeID" in line:  # skip the header row
                continue
            line = line.rstrip()
            # Ex9 -- zip codes typed with a leading 8 instead of 9
            for bad, good in (("84085","94085"), ("84087","94087"),
                              ("85014","95014"), ("85051","95051")):
                line = line.replace(bad, good)
            line = line.replace("l","1") #Ex11 -- Car_Garage: letter l typed for digit 1
            line_split = line.split(",")
            api = int(line_split[5])
            # Ex10 -- School_API: two-digit scores are scaled by 10
            line_split[5] = str(api * 10 if api < 100 else api)
            fixed_file.write(",".join(line_split) + "\n")

In [3]:
read_file_housing("bayarea_home_prices.csv")

In [4]:
import numpy as np

In [5]:
"""
0 = HomeID
1 = HomeAge
2 = HomeSqft
3 = LotSize
4 = BedRooms
5 = HighSchoolAPI
6 = ProxFwy
7 = CarGarage
8 = ZipCode
9 = HomePriceK
"""

'\n0 = HomeID\n1 = HomeAge\n2 = HomeSqft\n3 = LotSize\n4 = BedRooms\n5 = HighSchoolAPI\n6 = ProxFwy\n7 = CarGarage\n8 = ZipCode\n9 = HomePriceK\n'

In [6]:
housing_data = np.loadtxt("fixed-housing-data.csv",
                          dtype=int,
                          delimiter=",")

In [7]:
print(housing_data[0:100])

[[    1    24  1757  6056     2   899     3     3 94085   894]
 [    2    10  1563  6085     2   959     4     3 94085   861]
 [    3    14  1344  6089     2   865     4     3 94085   831]
 [    4    14  1215  6129     3   959     4     2 94085   809]
 [    5    24  1866  6141     3   877     4     1 94085   890]
 [    6    18  1589  6148     2   920     3     0 94085   867]
 [    7    13  1947  6183     3   959     3     1 94085   843]
 [    8    19  1839  6186     3   905     4     0 94085   820]
 [    9    17  1501  6233     2   884     3     1 94085   874]
 [   10    24  1933  6276     2   950     4     1 94085   885]
 [   11    12  1798  6346     3   931     3     2 94085   903]
 [   12    22  1221  6430     3   904     2     1 94085   912]
 [   13    15  1541  6514     2   872     2     1 94085   933]
 [   14    25  1974  6547     2   857     4     3 94085   865]
 [   15    10  1510  6633     2   884     3     2 94085   918]
 [   16    20  1979  6680     2   894     3     0 95051

In [8]:
print(housing_data.shape)

(100, 10)


In [9]:
# home prices
print(housing_data[:,9])

[ 894  861  831  809  890  867  843  820  874  885  903  912  933  865
  918  950  882  896  942  859  904  912  916  972  908  934  914  949
  919  953  991 1049 1042  994 1030 1019 1044 1038 1024  976 1115 1128
 1071 1059 1000 1185 1015 1114 1138 1068 1068 1097 1074 1114 1075 1130
 1116 1103 1080 1150 1177 1149 1163 1132 1138 1199 1179 1173 1128 1165
 1233 1180 1240 1242 1184 1173 1194 1181 1190 1182 1221 1288 1275 1300
 1272 1294 1219 1282 1256 1205 1252 1294 1269 1335 1267 1307 1336 1284
 1269 1250]


In [10]:
print(housing_data[:,9] + 10)

[ 904  871  841  819  900  877  853  830  884  895  913  922  943  875
  928  960  892  906  952  869  914  922  926  982  918  944  924  959
  929  963 1001 1059 1052 1004 1040 1029 1054 1048 1034  986 1125 1138
 1081 1069 1010 1195 1025 1124 1148 1078 1078 1107 1084 1124 1085 1140
 1126 1113 1090 1160 1187 1159 1173 1142 1148 1209 1189 1183 1138 1175
 1243 1190 1250 1252 1194 1183 1204 1191 1200 1192 1231 1298 1285 1310
 1282 1304 1229 1292 1266 1215 1262 1304 1279 1345 1277 1317 1346 1294
 1279 1260]


In [11]:
print(housing_data.sum(axis=0)) # sum by columns

[   5050    1720  161528  784050     271   90443     310     152 9455925
  108099]


In [12]:
print(housing_data.sum(axis=1)) # sum by rows

[103724 103574 103240 103224 103896 103638 104044 103869 103609 104170
 104094 103592 103978 104376 104062 105595 104218 104335 104940 104012
 104075 104583 104684 105583 104958 104446 104876 105593 104503 106006
 105990 106058 106256 105929 106298 105629 106320 106196 106023 106261
 105369 105445 106369 105987 105936 105549 106606 105320 105358 106782
 106365 106479 106225 105461 106713 105795 106138 105594 106827 105847
 105796 106180 105869 105800 106169 107018 106669 106315 106430 106551
 107465 106524 107181 107608 106533 107139 107961 106805 106554 106595
 107843 107864 108077 107983 108182 108227 107735 107879 108478 108106
 107902 108487 107956 108654 108172 108678 108406 108111 108617 108407]


In [13]:
homes_94085 = (housing_data[:,8] == 94085)

In [14]:
print(homes_94085)
# houseCount = 0
# for j in range (homes_94085):
#     if (housing_data[:,8] == 94085):
#         houseCount_94085 += 1
# print(houseCount_94085)

[ True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True False  True  True False  True  True  True  True False
  True  True  True False  True False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False]
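
A boolean mask like this can be counted without a loop. The sketch below uses a short, hypothetical zip-code column in place of housing_data[:,8].

```python
import numpy as np

# Hypothetical zip-code column standing in for housing_data[:, 8]
zip_codes = np.array([94085, 94085, 95051, 94085, 94087])

mask = (zip_codes == 94085)

# True counts as 1 and False as 0, so summing the mask counts the matches
print(mask.sum())               # 3
print(np.count_nonzero(mask))   # 3
```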


In [15]:
data_94085 = housing_data[homes_94085]  # the boolean mask keeps only the rows marked True
print(data_94085)

[[    1    24  1757  6056     2   899     3     3 94085   894]
 [    2    10  1563  6085     2   959     4     3 94085   861]
 [    3    14  1344  6089     2   865     4     3 94085   831]
 [    4    14  1215  6129     3   959     4     2 94085   809]
 [    5    24  1866  6141     3   877     4     1 94085   890]
 [    6    18  1589  6148     2   920     3     0 94085   867]
 [    7    13  1947  6183     3   959     3     1 94085   843]
 [    8    19  1839  6186     3   905     4     0 94085   820]
 [    9    17  1501  6233     2   884     3     1 94085   874]
 [   10    24  1933  6276     2   950     4     1 94085   885]
 [   11    12  1798  6346     3   931     3     2 94085   903]
 [   12    22  1221  6430     3   904     2     1 94085   912]
 [   13    15  1541  6514     2   872     2     1 94085   933]
 [   14    25  1974  6547     2   857     4     3 94085   865]
 [   15    10  1510  6633     2   884     3     2 94085   918]
 [   17    23  1464  6773     3   965     4     2 94085

In [16]:
sum_price_94085 = data_94085[:,9].sum()

In [17]:
average_94085 = sum_price_94085/len(data_94085)  # divide by the number of matching rows
print(average_94085)

885.96


# NumPy Problem 6
### Calculate average price in each zip code: 94085, 94087, 95014, 95051
### Calculate minimum and max price in each zip code: 94085, 94087, 95014, 95051
### Calculate standard deviation of price in each zip code: 94085, 94087, 95014, 95051

In [18]:
#----------------------------------------
# zip code 94085
#----------------------------------------
homes_94085 = (housing_data[:,8] == 94085)
data_94085 = housing_data[homes_94085]
sum_price_94085 = data_94085[:,9].sum()
print(sum_price_94085)
average_94085 = sum_price_94085/len(data_94085)
print("Average Home Price in 94085",format(average_94085,"10.2f"))
#---------------------------------------
# zip code 94087
#---------------------------------------
homes_94087 = (housing_data[:,8] == 94087)
data_94087 = housing_data[homes_94087]
sum_price_94087 = data_94087[:,9].sum()
print(sum_price_94087)
average_94087 = sum_price_94087/len(data_94087)
print("Average Home Price in 94087",format(average_94087,"10.2f"))
#------------------------------------------
# zip code 95014
#------------------------------------------
homes_95014 = (housing_data[:,8] == 95014)
data_95014 = housing_data[homes_95014]
sum_price_95014 = data_95014[:,9].sum()
print(sum_price_95014)
average_95014 = sum_price_95014/len(data_95014)
print("Average Home Price in 95014",format(average_95014,"10.2f"))
#-------------------------------------------
# zip code 95051
#-------------------------------------------
homes_95051 = (housing_data[:,8] == 95051)
data_95051 = housing_data[homes_95051]
sum_price_95051 = data_95051[:,9].sum()
print(sum_price_95051)
average_95051 = sum_price_95051/len(data_95051)
print("Average Home Price in 95051",format(average_95051,"10.2f"))

22149
Average Home Price in 94085     885.96
28787
Average Home Price in 94087    1151.48
31583
Average Home Price in 95014    1263.32
25580
Average Home Price in 95051    1023.20


In [33]:
arrayPrice_94085 = data_94085[:,9]
print ("Minimum value at 94085 is", arrayPrice_94085.min())
print ("Maximum value at 94085 is", arrayPrice_94085.max())

arrayPrice_94087 = data_94087[:,9]
print ("Minimum value at 94087 is", arrayPrice_94087.min())
print ("Maximum value at 94087 is", arrayPrice_94087.max())

arrayPrice_95014 = data_95014[:,9]
print ("Minimum value at 95014 is", arrayPrice_95014.min())
print ("Maximum value at 95014 is", arrayPrice_95014.max())

arrayPrice_95051 = data_95051[:,9]
print ("Minimum value at 95051 is", arrayPrice_95051.min())
print ("Maximum value at 95051 is", arrayPrice_95051.max())



Minimum value at 94085 is 809
Maximum value at 94085 is 934
Minimum value at 94087 is 1103
Maximum value at 94087 is 1190
Minimum value at 95014 is 1194
Maximum value at 95014 is 1336
Minimum value at 95051 is 942
Maximum value at 95051 is 1097


In [46]:
arraySTD_94085 = np.std(data_94085[:,9])
print ("Price Standard Deviation at 94085 is", format (arraySTD_94085, "10.2f"))

arraySTD_94087 = np.std(data_94087[:,9])
print ("Price Standard Deviation at 94087 is", format(arraySTD_94087, "10.2f"))

arraySTD_95014 = np.std(data_95014[:,9])
print ("Price Standard Deviation at 95014 is", format(arraySTD_95014, "10.2f"))

arraySTD_95051 = np.std(data_95051[:,9])
print ("Price Standard Deviation at 95051 is", format(arraySTD_95051, "10.2f"))



Price Standard Deviation at 94085 is      33.71
Price Standard Deviation at 94087 is      27.57
Price Standard Deviation at 95014 is      37.74
Price Standard Deviation at 95051 is      46.04
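
The per-zip-code statistics can also be computed in a single loop instead of one block per zip code. This is only a sketch: the two-column array sample below is hypothetical stand-in data, not the housing file, and the name stats is an illustrative choice.

```python
import numpy as np

# Hypothetical two-column sample: zip code, price (in thousands)
sample = np.array([[94085, 894],
                   [94085, 861],
                   [94087, 1138],
                   [94087, 1190],
                   [95014, 1240],
                   [95014, 1194]])

stats = {}
for zip_code in np.unique(sample[:, 0]):
    # Boolean mask on column 0 selects the rows, index 1 selects the price column
    prices = sample[sample[:, 0] == zip_code, 1]
    stats[int(zip_code)] = (prices.mean(), prices.min(), prices.max(), prices.std())

for zip_code, (mean, low, high, std) in stats.items():
    print(zip_code, format(mean, "10.2f"), low, high, format(std, "10.2f"))
```

With the real data, sample[:, 0] and the price index would correspond to columns 8 and 9 of housing_data.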


In [45]:
data_94085 = housing_data[homes_94085,][:,:]
data_94087 = housing_data[homes_94087,][:,:]
data_95014 = housing_data[homes_95014,][:,:]
data_95051 = housing_data[homes_95051,][:,:]

In [32]:
sum_price_94085 = data_94085[:,9].sum()
sum_price_94087 = data_94087[:,9].sum()
sum_price_95014 = data_95014[:,9].sum()
sum_price_95051 = data_95051[:,9].sum()

In [56]:
h1 = housing_data[housing_data[:,5].argsort()] # by school_api ascending
print(h1)

[[   65    14  1617  8394     2   850     2     0 94087  1138]
 [   73    25  1302  8668     3   850     4     2 95014  1240]
 [   23    15  1828  6956     3   851     4     3 94085   916]
 [   20    13  1358  6819     2   851     3     2 94085   859]
 [   79    17  1373  8953     2   851     2     0 94087  1190]
 [   77    17  1881  8921     3   852     2     0 95014  1194]
 [   19    10  1246  6810     2   853     4     3 95051   942]
 [   32    18  1866  7181     2   854     2     3 95051  1049]
 [   95    13  1582  9339     3   856     3     0 95014  1267]
 [   26    12  1500  7025     2   856     4     2 94085   934]
 [   99    19  1880  9470     3   857     3     3 95014  1269]
 [   53    23  1289  7873     2   857     3     0 95051  1074]
 [   67    24  1947  8502     2   857     4     0 94087  1179]
 [  100    11  1691  9476     4   857     4     0 95014  1250]
 [   14    25  1974  6547     2   857     4     3 94085   865]
 [   44    11  1415  7541     3   859     4     0 95051

In [49]:
h2 = housing_data[housing_data[:,5].argsort()[::-1]] # by school_api descending
print(h2)

[[   38    22  1724  7339     3   975     3     3 95051  1038]
 [   35    12  1943  7249     2   974     2     0 95051  1030]
 [   27    13  1836  7027     2   966     3     3 94085   914]
 [   17    23  1464  6773     3   965     4     2 94085   882]
 [   69    21  1575  8579     2   962     4     3 94087  1128]
 [   37    13  1874  7333     3   960     3     2 95051  1044]
 [   45    15  1249  7609     3   960     2     2 95051  1000]
 [    2    10  1563  6085     2   959     4     3 94085   861]
 [    4    14  1215  6129     3   959     4     2 94085   809]
 [   33    11  1953  7199     3   959     3     2 95051  1042]
 [    7    13  1947  6183     3   959     3     1 94085   843]
 [   76    12  1947  8882     3   954     3     2 94087  1173]
 [   59    22  1559  8096     2   953     2     3 95051  1080]
 [   57    11  1927  7983     3   950     3     1 94087  1116]
 [   10    24  1933  6276     2   950     4     1 94085   885]
 [   50    19  1836  7803     3   949     3     0 95051

# NumPy Problem 7
### Find top-2 listings by School API for all zipcodes

In [60]:
list_Top2API = h2[0:2]
print (list_Top2API)

[[   38    22  1724  7339     3   975     3     3 95051  1038]
 [   35    12  1943  7249     2   974     2     0 95051  1030]]


# NumPy Problem 8
### Prices are expected to go up by 4% next year.
### Add another column with predicted prices

In [67]:
calc = h2[:,9] * 1.04
new_column = np.array(calc)
print (new_column)
new_h2 = np.column_stack((h2,new_column))
print (new_h2)

[1079.52 1071.2   950.56  917.28 1173.12 1085.76 1040.    895.44  841.36
 1083.68  876.72 1219.92 1123.2  1160.64  920.4  1110.72 1110.72 1064.96
 1389.44 1319.76 1326.   1302.08 1219.92 1231.36 1033.76  931.84  991.12
  939.12 1352.   1333.28 1209.52 1228.24 1059.76 1345.76  940.16  901.68
 1345.76 1030.64 1159.6  1232.4  1335.36 1113.84 1359.28 1010.88  852.8
  948.48  948.48 1196.   1246.96 1055.6  1224.08  955.76  944.32  929.76
  988.   1269.84 1227.2   986.96 1015.04 1158.56 1118.   1183.52 1282.32
 1158.56  908.96  954.72 1229.28 1147.12 1306.24 1211.6  1267.76 1253.2
  925.6  1339.52 1175.2  1291.68  970.32 1194.96 1177.28 1173.12 1140.88
  864.24 1388.4  1322.88 1101.36  899.6  1300.   1226.16 1116.96 1319.76
  971.36 1317.68 1090.96  979.68 1241.76 1237.6   893.36  952.64 1289.6
 1183.52]
[[3.80000e+01 2.20000e+01 1.72400e+03 ... 9.50510e+04 1.03800e+03
  1.07952e+03]
 [3.50000e+01 1.20000e+01 1.94300e+03 ... 9.50510e+04 1.03000e+03
  1.07120e+03]
 [2.70000e+01 1.30000e+01 1.

# NumPy Problem 9
### Sort the matrix based on HomeID. Save the updated numpy matrix with added column in Problem 8 to a file.

In [71]:
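h3 = housing_data[housing_data[:,0].argsort()]  # sort rows by HomeID (column 0)
print (h3)
calc = h3[:,9] * 1.04  # predicted prices, as in Problem 8
new_column = np.array(calc)
print (new_column)
new_h3 = np.column_stack((h3,new_column))  # attach the column to the HomeID-sorted rows
print (new_h3)
# Save to a CSV file; the file name is an arbitrary choice
np.savetxt("housing-data-predicted.csv", new_h3, delimiter=",", fmt="%.2f")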



[[    1    24  1757  6056     2   899     3     3 94085   894]
 [    2    10  1563  6085     2   959     4     3 94085   861]
 [    3    14  1344  6089     2   865     4     3 94085   831]
 [    4    14  1215  6129     3   959     4     2 94085   809]
 [    5    24  1866  6141     3   877     4     1 94085   890]
 [    6    18  1589  6148     2   920     3     0 94085   867]
 [    7    13  1947  6183     3   959     3     1 94085   843]
 [    8    19  1839  6186     3   905     4     0 94085   820]
 [    9    17  1501  6233     2   884     3     1 94085   874]
 [   10    24  1933  6276     2   950     4     1 94085   885]
 [   11    12  1798  6346     3   931     3     2 94085   903]
 [   12    22  1221  6430     3   904     2     1 94085   912]
 [   13    15  1541  6514     2   872     2     1 94085   933]
 [   14    25  1974  6547     2   857     4     3 94085   865]
 [   15    10  1510  6633     2   884     3     2 94085   918]
 [   16    20  1979  6680     2   894     3     0 95051

# <font color = magenta> NumPy Problem 10 </font>

Write a function that takes a long string containing multiple words. Print the same string, except with the words in backwards order. 

<i>HINT: Use <b>YOUR_STRING<code>.split()</code></b> function<br></i>

In [10]:
import numpy as np

In [43]:
List_string = "Learning Python for DataScience"
print (List_string.split(" "))


['Learning', 'Python', 'for', 'DataScience']


In [44]:
len(List_string)

31

In [45]:
print(List_string[0])

L


In [46]:
print(List_string[30])

e


In [47]:
for i in range(len(List_string)-1, -1, -1):  # stop at -1 so that index 0 is included
    print (List_string[i], end="")

ecneicSataD rof nohtyP gninraeL

In [42]:
arr = np.array(List_string)
arr

array('Learning Python for DataScience', dtype='<U31')

In [50]:
arr.split(",")

AttributeError: 'numpy.ndarray' object has no attribute 'split'

In [15]:
for i in range (len(List_string)-1, -1, -1):  # start at the last index when stepping by -1
    print (List_string[i])
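
The attempts above reverse characters rather than words. A minimal sketch of the requested function, following the hint to use split(), could look like this; the function name reverse_words is an illustrative choice.

```python
def reverse_words(text):
    """Return text with its words in reverse order."""
    words = text.split()          # split on whitespace
    return " ".join(words[::-1])  # reverse the list, then rejoin with spaces

print(reverse_words("Learning Python for DataScience"))
# DataScience for Python Learning
```

No NumPy is needed for this problem, since plain Python strings and lists already provide split, slicing, and join.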