# Python for Data Science

# Introduction to NumPy

NumPy is built around a powerful data structure: the multidimensional array.
Pandas is another powerful Python library that provides a fast and easy data analysis platform.

Understand the advantages of vectorised code using NumPy (over standard Python approaches)
Create NumPy arrays
Convert lists and tuples to NumPy arrays
Create (initialise) arrays
Inspect the structure and content of arrays
Subset, slice, index and iterate through arrays
Compare computation times in NumPy and standard Python lists

The most basic object in NumPy is the ndarray (or simply an array), an n-dimensional, homogeneous array. By homogeneous, we mean that all the elements in a NumPy array must be of the same data type, which is commonly numeric (float or integer).
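
To make the homogeneity rule concrete, here is a minimal sketch of what happens when integers and a float are mixed in one array. The variable names `ints` and `mixed` are illustrative only and do not appear elsewhere in this notebook.

```python
import numpy as np

# All integers: NumPy picks an integer dtype (exact width is platform dependent).
ints = np.array([1, 2, 3])

# One float among integers: every element is stored as a float.
mixed = np.array([1, 2, 3.5])

print(ints.dtype)
print(mixed.dtype)  # float64
```

Because a single dtype applies to the whole array, NumPy upcasts the integers rather than storing mixed types.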


In [2]:
import numpy as np

In [5]:
array_1d =np.array([2,3,4,5,6,7,9])
print(array_1d)
print(type(array_1d))

[2 3 4 5 6 7 9]
<class 'numpy.ndarray'>


In [8]:
#2D array
array_2d=np.array([[2,3,4],[5,6,7]])
print(array_2d)

[[2 3 4]
 [5 6 7]]


In [9]:
list_1 = [3,6,7,5]
list_2 = [4,5,1,7]

#the list way to do it: map a function to the two lists
product_list = list(map(lambda x, y: x*y, list_1, list_2))
print(product_list)

[12, 30, 7, 35]


In [10]:
# the numpy array way to do it: simply multiply the two arrays
array_1 = np.array(list_1)
array_2 = np.array(list_2)

array_3 = array_1*array_2
print(array_3)
print(type(array_3))

[12 30  7 35]
<class 'numpy.ndarray'>


In [11]:
array_3 = array_1 + array_2
print(array_3)

[ 7 11  8 12]


In [12]:
#square a list
list_squared = [i**2 for i in list_1]

#square a numpy array
array_squared = array_1**2

print(list_squared)
print(array_squared)

[9, 36, 49, 25]
[ 9 36 49 25]
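
The objectives list comparing computation times between NumPy and standard Python lists. A rough sketch of such a comparison follows, squaring one million numbers both ways. The size `n` and the variable names are assumptions chosen for illustration, and the timings depend on the machine, so no expected values are shown.

```python
import time
import numpy as np

n = 1_000_000  # hypothetical problem size, chosen only for illustration
values_list = list(range(n))
values_array = np.arange(n, dtype=np.int64)  # int64 avoids overflow when squaring

# Square every element with a list comprehension.
start = time.perf_counter()
squared_list = [v ** 2 for v in values_list]
list_seconds = time.perf_counter() - start

# Square every element with one vectorised NumPy operation.
start = time.perf_counter()
squared_array = values_array ** 2
array_seconds = time.perf_counter() - start

print(f"list comprehension: {list_seconds:.4f} s")
print(f"numpy array:        {array_seconds:.4f} s")
```

On most machines the vectorised version is expected to be noticeably faster, because the loop runs in compiled code rather than in the Python interpreter.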


In [13]:
input_list = [[2,3,4,5],[7,8,9,6]]
list_1 = input_list[0]
list_2 = input_list[1]

import numpy as np
array_1 = np.array(list_1)
array_2 = np.array(list_2)
array_3 = array_1*array_2

print(list(array_3))

[14, 24, 36, 30]


In [14]:
input_list = [[1,2,3],[4,5,6],[7,8,9]]
list_1 = input_list[0]
list_2 = input_list[1]
list_3 = input_list[2]

import numpy as np
array_1 = np.array([list_1,list_2,list_3])

print(array_1)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


# Creating NumPy Arrays

In [16]:
#pass the function itself to help(); calling np.ones() without a shape raises
#TypeError: ones() missing 1 required positional argument: 'shape'
help(np.ones)

In [17]:
array_from_list = np.array([2,5,6,7])
array_from_tuple = np.array((4,5,8,9))

print(array_from_list)
print(array_from_tuple)

[2 5 6 7]
[4 5 8 9]


The following ways are commonly used:

np.ones(): Create an array of 1s
np.zeros(): Create an array of 0s
np.random.random(): Create an array of random numbers between 0 and 1
np.arange(): Create an array with increments of a fixed step size
np.linspace(): Create an array of fixed length

In [18]:
#creating a 5 x 3 array of ones
np.ones((5,3))

array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

In [19]:
#notice that, by default, NumPy creates arrays of dtype float64
#you can provide the dtype explicitly
#(the np.int alias was removed in NumPy 1.24; the built-in int works instead)

np.ones((5,3), dtype = int)

array([[1, 1, 1],
       [1, 1, 1],
       [1, 1, 1],
       [1, 1, 1],
       [1, 1, 1]])

In [20]:
#creating array of zeros

np.zeros(4, dtype = int)

array([0, 0, 0, 0])

In [21]:
#array of random numbers
np.random.random([4,2])

array([[ 0.09421022,  0.03890004],
       [ 0.91332668,  0.05004087],
       [ 0.06786937,  0.41070539],
       [ 0.30617034,  0.99887614]])

In [22]:
numbers = np.arange(10, 100, 5)
print(numbers)

[10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95]


In [23]:
#sometimes, you know the length of the array, not the step size
np.linspace(15, 18, 25)
#25 evenly spaced points (24 intervals), both endpoints included

array([ 15.   ,  15.125,  15.25 ,  15.375,  15.5  ,  15.625,  15.75 ,
        15.875,  16.   ,  16.125,  16.25 ,  16.375,  16.5  ,  16.625,
        16.75 ,  16.875,  17.   ,  17.125,  17.25 ,  17.375,  17.5  ,
        17.625,  17.75 ,  17.875,  18.   ])
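
As a small sketch of how `np.linspace` and `np.arange` relate, the `retstep=True` argument makes `np.linspace` also return the spacing it used. The names `points`, `step` and `same_points` are illustrative only.

```python
import numpy as np

# retstep=True also returns the spacing between consecutive points.
points, step = np.linspace(15, 18, 25, retstep=True)

print(len(points))  # 25
print(step)         # 0.125

# np.arange excludes the stop value, so the stop is pushed one step past 18
# to reproduce the same 25 points.
same_points = np.arange(15, 18 + step, step)
print(np.allclose(points, same_points))  # True
```

Use `np.linspace` when the number of points is known and `np.arange` when the step size is known.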

In [26]:
# Create a constant array of any number ‘n’
np.full(8,4)

array([4, 4, 4, 4, 4, 4, 4, 4])

In [27]:
#np.tile(): Create a new array by repeating an existing array a given number of times

np.tile(4,6)

array([4, 4, 4, 4, 4, 4])

In [34]:
#np.eye(): Create an identity matrix of any dimension

np.eye(3,3,0)

array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])

In [36]:
#np.random.randint(): Create a random array of integers within a particular range

np.random.randint(0,5,10)

array([3, 4, 4, 0, 0, 1, 4, 4, 1, 4])

Create NumPy Array
Description
Given an integer 'x', create an array of size m*n having all integer values equal to 'x'. 
Hint: Use dtype to specify integer.

Format:
Input: 
Line 1: A single integer 'x'
Line 2: A single integer 'm' indicating the number of rows
Line 3: A single integer 'n' indicating the number of columns
Output: An array of size 'm*n' having all the values as 'x'

In [55]:
int_x = 1
rows_m = 3
cols_n = 3

import numpy as np
array_x = np.full((rows_m, cols_n), int_x, dtype = int)

print(array_x)

[[1 1 1]
 [1 1 1]
 [1 1 1]]


Array 'arange' Function
Description
Create an array of first 10 multiples of 5 using the 'arange' function.

In [56]:
np.arange(5,55,5)

array([ 5, 10, 15, 20, 25, 30, 35, 40, 45, 50])

Checkerboard Matrix
Description
Given an even integer ‘n’, create an ‘n*n’ checkerboard matrix with the values 0 and 1, using the tile function. 
 
Format:
Input: A single even integer 'n'.
Output: An 'n*n' NumPy array in checkerboard format.

Example: 
Input 1: 
2 
Output 1: 
[[0 1] 
 [1 0]] 
Input 2: 
4 
Output 2: 
[[0 1 0 1] 
 [1 0 1 0]
 [0 1 0 1]
 [1 0 1 0]]


In [87]:
inputv = 4

lst = tuple(np.arange(2,inputv*4,1)%2)
lst
#print(np.asarray([lst]))



(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1)
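
The cell above stops at an alternating tuple of 0s and 1s rather than an n*n matrix. One possible way to finish the exercise with `np.tile` is sketched below. The names `n`, `base` and `checkerboard` are illustrative and not taken from the original notebook, and `n` is assumed to be even, as the exercise states.

```python
import numpy as np

n = 4  # any even integer

# Smallest checkerboard unit: a 2 x 2 block starting with 0.
base = np.array([[0, 1],
                 [1, 0]])

# Repeat the block n/2 times along the rows and n/2 times along the columns.
checkerboard = np.tile(base, (n // 2, n // 2))
print(checkerboard)
```

Passing a tuple as the second argument to `np.tile` sets the number of repetitions along each axis, which is what turns the 2 x 2 block into an n x n grid.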

# Structure and Content of Arrays

It is helpful to inspect the structure of NumPy arrays, especially while working with large arrays. Some attributes of NumPy arrays are:

shape: Shape of array (n x m)
dtype: data type (int, float etc.)
ndim: Number of dimensions (or axes)
itemsize: Memory used by each array element in bytes
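
The four attributes listed above can be read directly from any array. The short sketch below uses a hypothetical array named `sample`, with the dtype set explicitly so that `itemsize` is the same on every platform.

```python
import numpy as np

# Explicit dtype so itemsize does not depend on the platform default.
sample = np.zeros((1000, 300), dtype=np.float64)

print(sample.shape)     # (1000, 300)
print(sample.dtype)     # float64
print(sample.ndim)      # 2
print(sample.itemsize)  # 8 (bytes per float64 element)
```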

In [91]:
#initialising a random 1000 * 300 array

rand_array = np.random.random((1000 , 300))

#print the first row (row index 0)
print(rand_array[0, ])

[ 0.51415009  0.55394796  0.08875432  0.39752032  0.40072938  0.86074053
  0.76608339  0.8747973   0.90082153  0.04209759  0.46188342  0.70911722
  0.01228701  0.2671012   0.90476062  0.28285167  0.08178716  0.2566357
  0.32511433  0.6955244   0.03455081  0.81726813  0.92373802  0.55688605
  0.26011434  0.45475665  0.19874811  0.21717154  0.24332039  0.27233649
  0.61983322  0.91976147  0.37597083  0.91521041  0.2581207   0.35760159
  0.56271193  0.5842697   0.99250844  0.27231277  0.91742939  0.67990233
  0.031101    0.75953419  0.97957915  0.63272446  0.92279382  0.47136613
  0.39022986  0.05897726  0.69299221  0.4449234   0.75401668  0.95939229
  0.06442861  0.76146959  0.90644898  0.05545098  0.60515727  0.57682134
  0.66101018  0.36848358  0.71513657  0.28883871  0.58620806  0.06186832
  0.73485674  0.55701047  0.19033896  0.35409704  0.67036766  0.89057044
  0.7794677   0.34683758  0.27712581  0.53021212  0.27660473  0.94085794
  0.87133925  0.49607393  0.00354431  0.37986625  0.

In [97]:
#creating a 3D array
#reshape() rearranges the 24 elements of the 1-D array into shape (2, 3, 4)
array_3d = np.arange(24).reshape(2, 3, 4)
print(array_3d)

[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]


In [109]:
input_list = [[10,11,12,13],[15,12,13,14]]

list_1 = input_list[0]
list_2 = input_list[1]

import numpy as np

array_1 = np.array([list_1,list_2])
print(array_1)
print(np.shape(array_1))
print(np.ndim(array_1))

[[10 11 12 13]
 [15 12 13 14]]
(2, 4)
2


# Subset, Slice, Index and Iterate through Arrays

In [111]:
#Indexing and slicing one-dimensional arrays
array_1d = np.arange(10)
print(array_1d)

[0 1 2 3 4 5 6 7 8 9]


In [118]:
#third element
print(array_1d[2])

#specific elements
#notice that array_1d[2, 5, 6] will throw an error

print(array_1d[[2, 5, 6]])

#slicing third element onwards
print(array_1d[2:])

#slice third to seventh element
print(array_1d[2:7])

#subset starting 0 at increment of 2
print(array_1d[0::2])

2
[2 5 6]
[2 3 4 5 6 7 8 9]
[2 3 4 5 6]
[0 2 4 6 8]


In [122]:
#the same slicing syntax works on a plain Python list
array_1 = [1, 2, 3, 5, 4, 6, 7, 8, 5, 3, 2]

print(array_1[:3])

[1, 2, 3]


In [123]:
print(array_1[0::2])

[1, 3, 4, 7, 5, 2]


# Multidimensional Arrays

Multidimensional arrays are indexed using as many indices as there are dimensions (axes). For instance, to index a 2-D array you need two indices: array[x, y], where x selects the row and y the column.
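
A minimal sketch of this indexing rule follows, using two small illustrative arrays named `grid` and `cube`, which are not part of the original notebook.

```python
import numpy as np

grid = np.array([[2, 3, 4],
                 [5, 6, 7]])

print(grid[1, 2])  # 7: row 1, column 2
print(grid[0, :])  # [2 3 4]: the whole first row
print(grid[:, 1])  # [3 6]: the whole second column

# A 3-D array needs three indices: block, row, column.
cube = np.arange(24).reshape(2, 3, 4)
print(cube[1, 2, 3])  # 23
```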

In [127]:
#iterating over a 2-D array yields one row (a 1-D array) at a time
array_2d = np.array([[2,3,4],[5,6,7]])
for row in array_2d:
    print(row)

[2 3 4]
[5 6 7]


In [128]:
#3D array
array_3d = np.arange(24).reshape(2,3,4)
print(array_3d)

[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]


In [129]:
for row in array_3d:
    print(row)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[[12 13 14 15]
 [16 17 18 19]
 [20 21 22 23]]
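
Iterating over a 3-D array as above yields 2-D slices, not individual elements. If every element is needed together with its position, one option is `np.ndenumerate`, sketched below. The list `visited` is an illustrative name.

```python
import numpy as np

array_3d = np.arange(24).reshape(2, 3, 4)

# np.ndenumerate yields (index tuple, value) pairs for every element.
visited = []
for index, value in np.ndenumerate(array_3d):
    visited.append((index, int(value)))

print(visited[0])    # ((0, 0, 0), 0)
print(visited[-1])   # ((1, 2, 3), 23)
print(len(visited))  # 24
```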


2D Array
Description
From a 2-D array, extract all the rows of the second column.
Hint: the second column has index 1.

In [182]:
input_list = [[5,6,7],[7,6,5],[0,8,7]]
import numpy as np
array_2d =np.array(input_list) 

array_2d

array([[5, 6, 7],
       [7, 6, 5],
       [0, 8, 7]])

In [150]:
lst = []

for x in array_2d:
    lst.append(int(x[1]))  #int() keeps NumPy scalar reprs out of str(lst)

list_str = str(lst).replace(',', '')

print(list_str)

[6 6 8]


In [183]:
import numpy as np
array_2d =np.array(input_list) 

print(array_2d[ : ,1])

[6 6 8]


Border Rows and Columns
Description
Extract all the border rows and columns from a 2-D array.

Format:
Input: A 2-D Python list
Output: Four NumPy arrays - First column of the input array, first row of the input array, last column of the input array, last row of the input array respectively.

Example:
Input 1:
[[11 12 13 14]
 [21 22 23 24]
 [31 32 33 34]]
Output 1:
[11 21 31]
[11 12 13 14]
[14 24 34]
[31 32 33 34]

In [181]:
input_list = [[11, 12, 13, 14], [21, 22, 23, 24], [31, 32, 33, 34]]

import numpy as np

# Convert the input list to a NumPy array
array_2d =np.array(input_list)

lst_1 = []
lst_3 = []

for x in array_2d:
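    lst_1.append(int(x[0]))   #int() keeps NumPy scalar reprs out of str(lst_1)
    lst_3.append(int(x[-1]))  #index -1 selects the last column for any width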

# Extract the first column, first row, last column and last row respectively using
# appropriate indexing
col_first = str(lst_1).replace(',', '')
row_first = array_2d[0:1]
row_first = row_first.ravel()
col_last = str(lst_3).replace(',', '')
row_last = array_2d[2:3]
row_last = row_last.ravel()

print(col_first)
print(row_first)
print(col_last)
print(row_last)

[11 21 31]
[11 12 13 14]
[14 24 34]
[31 32 33 34]


In [None]:
#solution

import numpy as np

# Convert the input list to a NumPy array
array_2d =np.array(input_list)

# Extract the number of rows and columns of the array
rows, cols = array_2d.shape

# Extract the first column, first row, last column and last row respectively using
# appropriate indexing
col_1 = array_2d[:, 0]
row_1 = array_2d[0, :]
col_last = array_2d[:, cols-1]
row_last = array_2d[rows-1, :]

print(col_1)
print(row_1)
print(col_last)
print(row_last)
