# NumPy
## Chapter 5
### Predictive Analytics for the Modern Enterprise 

This is a Jupyter notebook that can be used to follow along with the code examples for **Chapter 5 Section 1 - Numpy** of the book.
The code examples go through some of the functionality for working with *multi-dimensional arrays* using the NumPy library in Python.

The notebook has been tested with the following prerequisites:

* Python V3.9.13 - https://www.python.org/
* Anaconda Navigator V3 for Python 3.9 - https://www.anaconda.com/
* Jupyter - V6.4.12 - https://jupyter.org/ 
* Desktop computer - macOS Ventura V13.1 

### Installation

If you don't have NumPy installed, use the following command to install it in your environment:

In [1]:
pip install numpy

Note: you may need to restart the kernel to use updated packages.


In [2]:
import numpy as np

### Arrays, dimensions, indexes and data types

In [3]:
np.array([0,1,2,3]) # One-dimensional array

array([0, 1, 2, 3])

In [4]:
np.array([[1,2,3,4], [4,3,2,1]]) #Two dimensional array 

array([[1, 2, 3, 4],
       [4, 3, 2, 1]])

In [5]:
threeD = np.array([[[1,2,3], [4,2,1]], [[5,6,7], [4,2,3]]]) #Three dimensional array
threeD

array([[[1, 2, 3],
        [4, 2, 1]],

       [[5, 6, 7],
        [4, 2, 3]]])

In [6]:
threeD[0,1,2]

1

In [7]:
a = np.array([1,2,3,4,5], dtype=np.float32) # Create an array of type float32
a

array([1., 2., 3., 4., 5.], dtype=float32)
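
The data type decides how each element is stored and how many bytes it occupies. The following is a minimal sketch, not part of the original notebook, that inspects the type with `dtype` and `itemsize` and converts it with `astype`; the variable name `b` is only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5], dtype=np.float32)
print(a.dtype)      # float32
print(a.itemsize)   # 4 (bytes per element)

# astype returns a new array with the requested type; a itself is unchanged
b = a.astype(np.int64)
print(b.dtype)      # int64
print(b)            # [1 2 3 4 5]
```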

### Generating Arrays

In [8]:
np.zeros((2,3)) #Two dimensional array of 0s

array([[0., 0., 0.],
       [0., 0., 0.]])

In [9]:
np.ones((3,2,4)) #Three dimensional array of 1s

array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.]]])

In [10]:
np.random.rand(2,2,3) #Three dimensional array of random numbers

array([[[0.34722037, 0.4064136 , 0.45872499],
        [0.86784904, 0.32554903, 0.00541894]],

       [[0.08278181, 0.54906014, 0.08144777],
        [0.39925984, 0.54004031, 0.52540672]]])
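
Because the values above are random, they change on every run. If repeatable output is needed, one option is a seeded generator. This sketch assumes NumPy's `np.random.default_rng` is available in your installed version; the seed `42` is an arbitrary choice.

```python
import numpy as np

# default_rng creates a Generator; the seed value 42 is an arbitrary choice
rng = np.random.default_rng(42)
first = rng.random((2, 2, 3))   # values in [0, 1), shape (2, 2, 3)

# Re-creating the generator with the same seed reproduces the same values
rng = np.random.default_rng(42)
second = rng.random((2, 2, 3))

print(first.shape)                    # (2, 2, 3)
print(np.array_equal(first, second))  # True
```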

In [11]:
np.arange(1,10,2) # Generate an array from 1 up to (but excluding) 10 in steps of 2

array([1, 3, 5, 7, 9])

In [12]:
np.eye(4) #Generate a 2-D array with 1 as the diagonal

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

In [13]:
a = np.eye(4,k=-2) #Move the diagonal up or down
a

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [1., 0., 0., 0.],
       [0., 1., 0., 0.]])

In [14]:
print('Dimensions: ',  a.ndim) # Print Array dimensions

Dimensions:  2


In [15]:
print('Shape: ', a.shape) # Length of each axis

Shape:  (4, 4)


In [16]:
print('Size: ', a.size) #Total Number of elements

Size:  16


### Array slicing

In [17]:
# For every row (0:), return the item at index 4 (the fifth column)
a = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]) 
print(a[0:, 4])

[ 5 10]


In [18]:
# For every block (0:), take rows 0-1 (0:2, with 2 excluded)
# and return the item at index 3 of each row
a = np.array([[[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]], 
              [[4, 7, 2, 3, 1], [2, 5, 6, 1, 6]]]) 
print('Array:\n', a)
print('Element:\n', a[0:,0:2,3])

Array:
 [[[ 1  2  3  4  5]
  [ 6  7  8  9 10]]

 [[ 4  7  2  3  1]
  [ 2  5  6  1  6]]]
Element:
 [[4 9]
 [3 1]]
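
One detail that is easy to miss in the slicing cells: an integer index removes the axis it is applied to, while a slice keeps it. A small sketch on the same data (the names `picked` and `kept` are illustrative):

```python
import numpy as np

a = np.array([[[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]],
              [[4, 7, 2, 3, 1], [2, 5, 6, 1, 6]]])

picked = a[:, :, 3]    # integer index on the last axis removes that axis
kept = a[:, :, 3:4]    # a length-1 slice keeps the axis

print(a.shape)         # (2, 2, 5)
print(picked.shape)    # (2, 2)
print(kept.shape)      # (2, 2, 1)
```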


### Array Transformations

In [19]:
a = np.eye(4,k=-2)
b = np.reshape(a,(8,-1)) # -1 lets NumPy infer that dimension from the total size
print('a:\n', a)
print('b:\n', b)

a:
 [[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [1. 0. 0. 0.]
 [0. 1. 0. 0.]]
b:
 [[0. 0.]
 [0. 0.]
 [0. 0.]
 [0. 0.]
 [1. 0.]
 [0. 0.]
 [0. 1.]
 [0. 0.]]
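
The `-1` placeholder only works when the remaining dimensions divide the total number of elements evenly. A short sketch of both cases, using a 16-element array as a stand-in for the 4x4 array above:

```python
import numpy as np

a = np.arange(16)

print(a.reshape(8, -1).shape)   # (8, 2)
print(a.reshape(-1, 4).shape)   # (4, 4)

# 16 elements cannot be split into 5 equal rows, so NumPy raises ValueError
raised = False
try:
    a.reshape(5, -1)
except ValueError as err:
    raised = True
    print("Cannot reshape:", err)
```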


In [20]:
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]) #1 dimensional array
b = a.reshape(2, 3, 2) #Reshape to 3 dimensional array
print('a:\n', a)
print('b:\n', b)


a:
 [ 1  2  3  4  5  6  7  8  9 10 11 12]
b:
 [[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]]


In [21]:
a = np.array([[1,2,3],[4,5,6]])
b = np.transpose(a) # Transpose the Array Matrix
print('Original','\n','Shape',a.shape,'\n',a)
print('Transposed:','\n','Shape',b.shape,'\n',b)

Original 
 Shape (2, 3) 
 [[1 2 3]
 [4 5 6]]
Transposed: 
 Shape (3, 2) 
 [[1 4]
 [2 5]
 [3 6]]


### Other operations

In [22]:
a = np.array([[5,6,7,4],
              [9,2,3,7]])
print('a:\n', a)
print('Sort along column :','\n',np.sort(a, axis=1)) # axis=1: sort the values within each row (along the column axis)
print('Sort along row :','\n',np.sort(a, axis=0)) # axis=0: sort the values within each column (along the row axis)

a:
 [[5 6 7 4]
 [9 2 3 7]]
Sort along column : 
 [[4 5 6 7]
 [2 3 7 9]]
Sort along row : 
 [[5 2 3 4]
 [9 6 7 7]]
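
The `axis` argument names the direction along which values are compared, which can be confusing at first. This sketch repeats the sort with descriptive variable names (chosen here for illustration) and adds `axis=None`, which flattens the array before sorting.

```python
import numpy as np

a = np.array([[5, 6, 7, 4],
              [9, 2, 3, 7]])

within_rows = np.sort(a, axis=1)      # each row is sorted independently
within_columns = np.sort(a, axis=0)   # each column is sorted independently
flattened = np.sort(a, axis=None)     # flatten first, then sort

print(within_rows)      # [[4 5 6 7] [2 3 7 9]]
print(within_columns)   # [[5 2 3 4] [9 6 7 7]]
print(flattened)        # [2 3 4 5 6 7 7 9]
```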


In [23]:
a = np.array([1, 2, 3, 4, 5, 4])
x = np.where(a == 5) #Find an element by value
print(x)

(array([4]),)
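
`np.where` with a single argument returns a tuple holding one index array per dimension, which is why the output is wrapped in parentheses. A minimal sketch of how that result might be used, plus the three-argument form that selects values element by element:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 4])

idx = np.where(a == 4)[0]   # take the index array for the first (only) axis
print(idx)                  # [3 5]
print(a[idx])               # [4 4]

# Three-argument form: where the condition holds use 3, otherwise keep a
capped = np.where(a > 3, 3, a)
print(capped)               # [1 2 3 3 3 3]
```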


### Cinema Example

In [24]:
# Define dimensions
months = 12

# Create a 1D array with dimensions (months) 
# We have just 1 cinema and 1 category
# We are using a random function to generate ticket sales values 
# between 80 - 1200
ticket_sales = np.random.randint(80, 1201, size=(months))

print("1D Array Shape:", ticket_sales.shape)
print("Sample Data:")
ticket_sales

1D Array Shape: (12,)
Sample Data:


array([401, 534, 790, 954, 375, 339, 196, 794, 317, 132, 830, 214])

In [25]:
# Define dimensions
months = 12
categories = 3

# Create a 2D array with dimensions (months, categories) 
# We just have 1 cinema
# We are using a random function to generate ticket sales values 
# between 80 - 1200
ticket_sales = np.random.randint(80, 1201, size=(months, categories))

print("2D Array Shape:", ticket_sales.shape)
print("Sample Data:")
ticket_sales

2D Array Shape: (12, 3)
Sample Data:


array([[ 894,  760,  131],
       [ 147, 1033, 1022],
       [ 668,  110, 1030],
       [1053, 1039,  787],
       [ 251,  553,  378],
       [ 719,  424,  621],
       [ 840, 1143,  482],
       [ 369,  853,  912],
       [1099,  435,  699],
       [ 762,  876,  733],
       [ 335,  165,  842],
       [ 780,  799,  844]])

In [26]:
# Define dimensions
months = 12
categories = 3
cinemas = 9

# Create a 3D array with dimensions (cinemas, months, categories)
# We are using a random function to generate ticket sales values 
# between 80 - 1200
ticket_sales = np.random.randint(80, 1201, size=(cinemas, months, categories))

print("3D Array Shape:", ticket_sales.shape)
print("Sample Data:")
ticket_sales

3D Array Shape: (9, 12, 3)
Sample Data:


array([[[1104,  300,  306],
        [ 953,  131,  690],
        [ 863,  526,  508],
        [1072, 1145,  553],
        [ 148,  630, 1085],
        [ 832, 1149,  655],
        [ 705,  487,  355],
        [1187,  310,  798],
        [  99, 1042,  722],
        [ 273,  461,  525],
        [ 732,  273,  182],
        [ 640,  120, 1120]],

       [[ 582,  671, 1030],
        [ 129,  981,  962],
        [ 897,  752,  357],
        [ 363,  408,  975],
        [ 904,  170,  271],
        [ 846,  548,  611],
        [1060, 1003,  758],
        [ 647, 1131,  277],
        [ 777,  694, 1100],
        [ 247,  917, 1184],
        [1026,  282,  534],
        [ 465,   98, 1057]],

       [[ 851,  742,  230],
        [ 349,  588,  198],
        [1182,  880, 1149],
        [ 547,  732,  827],
        [ 102,  576,  315],
        [ 342,  603,  151],
        [ 151,  471,  592],
        [ 655,  216,  291],
        [ 180, 1126,  112],
        [ 923,  529,  889],
        [ 624,  343, 1136],
        [1052,  

Let us try to answer the questions we discussed earlier.

Q1: What is the total number of tickets sold across all cinemas in each month?

In [27]:
# Sum the ticket sales across all cinemas (axis 0) and all categories (axis 2) for each month
total_tickets_per_month = np.sum(ticket_sales, axis=(0, 2))

# Print the total number of tickets sold across all cinemas in each month
for month, total_tickets in enumerate(total_tickets_per_month):
    print(f"Month {month}: Total tickets sold = {total_tickets}")

Month 0: Total tickets sold = 19352
Month 1: Total tickets sold = 14104
Month 2: Total tickets sold = 17691
Month 3: Total tickets sold = 16192
Month 4: Total tickets sold = 16858
Month 5: Total tickets sold = 18923
Month 6: Total tickets sold = 18264
Month 7: Total tickets sold = 17111
Month 8: Total tickets sold = 17617
Month 9: Total tickets sold = 18557
Month 10: Total tickets sold = 14456
Month 11: Total tickets sold = 18061
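
The axes passed to `np.sum` are the ones that get collapsed; whatever is left out survives in the result. The sketch below rebuilds a synthetic array with the same `(cinemas, months, categories)` layout (using a seeded generator, which is an assumption and not what the cell above uses) and shows a few alternative aggregations.

```python
import numpy as np

cinemas, months, categories = 9, 12, 3
rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
ticket_sales = rng.integers(80, 1201, size=(cinemas, months, categories))

per_month = ticket_sales.sum(axis=(0, 2))          # collapse cinemas and categories
per_cinema = ticket_sales.sum(axis=(1, 2))         # collapse months and categories
per_month_and_category = ticket_sales.sum(axis=0)  # collapse cinemas only

print(per_month.shape)               # (12,)
print(per_cinema.shape)              # (9,)
print(per_month_and_category.shape)  # (12, 3)
```

Each of these totals adds up to the same grand total, which is a quick sanity check when choosing axes.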


Q2: How do the sales of Regular tickets in Cinema 1 compare to Cinema 7 in the month of June?

In [28]:
# Index order is (cinema, month, category); the cinema number is used directly as the index
# (Cinema 1, June = 5 (as January = 0), Regular Category = 0)
june_sales_cinema1_regular = ticket_sales[1, 5, 0]
# (Cinema 7, June = 5 (as January = 0), Regular Category = 0)
june_sales_cinema7_regular = ticket_sales[7, 5, 0]
print("June Sales for Regular Tickets")
print("Cinema 1: ", june_sales_cinema1_regular)
print("Cinema 7: ", june_sales_cinema7_regular)

June Sales for Regular Tickets
Cinema 1:  846
Cinema 7:  986
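
To go one step beyond printing the two values, the comparison can be expressed numerically. This is a sketch on synthetic data with the same layout; the constants `JUNE` and `REGULAR` and the seeded generator are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed; the values are synthetic
ticket_sales = rng.integers(80, 1201, size=(9, 12, 3))

JUNE, REGULAR = 5, 0  # zero-based month and category indexes

# Regular-ticket June sales for every cinema at once
june_regular = ticket_sales[:, JUNE, REGULAR]

difference = june_regular[7] - june_regular[1]
best_cinema = int(np.argmax(june_regular))

print("Index 7 minus index 1:", difference)
print("Cinema index with the highest June Regular sales:", best_cinema)
```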


### Additional code samples

In [29]:
a = np.array([True, True, False, True]) # Sorting booleans (False sorts before True)
print('a:\n', a)
print('Sorted:\n', np.sort(a))

a:
 [ True  True False  True]
Sorted:
 [False  True  True  True]


In [30]:
a = np.array(['Karachi','Dubai','New York']) # Sorting strings
print('a:\n', a)
print('Sorted:\n', np.sort(a))

a:
 ['Karachi' 'Dubai' 'New York']
Sorted:
 ['Dubai' 'Karachi' 'New York']


In [31]:
arr = np.array([1, 2, 3, 4, 5, 4, 4])
x = np.where(arr < 4) #Find an element by comparison
print(x)

(array([0, 1, 2]),)


In [32]:
arr = np.array([10, 12, 14, 20])
x = np.searchsorted(arr, 15) # Find the index where 15 should be inserted
print(x)

3
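
`np.searchsorted` assumes the array is already sorted. When the value is already present, the `side` parameter decides whether the returned position is before or after the existing entries. A brief sketch:

```python
import numpy as np

arr = np.array([10, 12, 14, 20])

left = np.searchsorted(arr, 14)                  # insert before the existing 14
right = np.searchsorted(arr, 14, side='right')   # insert after the existing 14
many = np.searchsorted(arr, [5, 15, 25])         # several values in one call

print(left)    # 2
print(right)   # 3
print(many)    # [0 3 4]
```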


In [33]:
a = np.eye(4)
b = a.flatten() # Flattens the array and returns a copy of the original array
print('a:\n', a)
print('b:\n', b)

a:
 [[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
b:
 [1. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1.]


In [34]:
b[0] = 3
print(a) #a is unchanged when b is updated

[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]


In [35]:
b = a.ravel() # Flattens and, where possible, returns a view that shares memory with the original array
print('a:\n', a)
print('b:\n', b)

a:
 [[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
b:
 [1. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1.]


In [36]:
b[0] = 3
print(a) #a gets changed when b is updated

[[3. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
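
Rather than modifying an element to find out whether two arrays are linked, `np.shares_memory` can answer the question directly. The sketch below also shows that `ravel` falls back to a copy when the input is not contiguous in memory; the variable names are illustrative.

```python
import numpy as np

a = np.eye(4)
flat_copy = a.flatten()
flat_view = a.ravel()

print(np.shares_memory(a, flat_copy))   # False: flatten always copies
print(np.shares_memory(a, flat_view))   # True: ravel returned a view

# Every second column is not contiguous in memory, so ravel has to copy
strided = a[:, ::2]
print(np.shares_memory(a, strided.ravel()))   # False
```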


In [37]:
a = np.array(["A","C","B","X"])
b = np.expand_dims(a,axis=0) #increase dimension on axis 0
c = np.expand_dims(a,axis=1) #increase dimension on axis 1
print('a:\n', a)
print('b:\n', b)
print('c:\n', c)

a:
 ['A' 'C' 'B' 'X']
b:
 [['A' 'C' 'B' 'X']]
c:
 [['A']
 ['C']
 ['B']
 ['X']]


In [38]:
a = np.array([[[1,2,3],[4,5,6]]])
b = np.squeeze(a, axis=0) #Reduce dimension on axis 0
print('a:\n', a)
print('b:\n', b)


a:
 [[[1 2 3]
  [4 5 6]]]
b:
 [[1 2 3]
 [4 5 6]]


In [39]:
a = np.array([[1,2,3,4,5],
              [6,7,8,9,10]])
print('a :','\n',a)
print('Vertical Flip :','\n',np.flip(a,axis=1)) # Reverses the column order (a flip about the vertical axis)
print('Horizontal Flip :','\n',np.flip(a,axis=0)) # Reverses the row order (a flip about the horizontal axis)

a : 
 [[ 1  2  3  4  5]
 [ 6  7  8  9 10]]
Vertical Flip : 
 [[ 5  4  3  2  1]
 [10  9  8  7  6]]
Horizontal Flip : 
 [[ 6  7  8  9 10]
 [ 1  2  3  4  5]]


In [40]:
a = np.arange(0,5)
b = np.arange(5,10)
print('a :','\n',a)
print('b :','\n',b)
print('Vertical stack :','\n',np.vstack((a,b))) # Stack vertically
print('Horizontal stack :','\n',np.hstack((a,b))) #Stack horizontally

a : 
 [0 1 2 3 4]
b : 
 [5 6 7 8 9]
Vertical stack : 
 [[0 1 2 3 4]
 [5 6 7 8 9]]
Horizontal stack : 
 [0 1 2 3 4 5 6 7 8 9]


In [41]:
a = np.arange(10,20,2)
b = np.array([[2],[5]])
print('a :',a)
print('Adding two different size arrays :','\n',a+b) # Adding arrays: shapes (5,) and (2,1) broadcast to (2,5)
print('Multiplying an ndarray and a number :',a*2) #Multiplying an array with a scalar

a : [10 12 14 16 18]
Adding two different size arrays : 
 [[12 14 16 18 20]
 [15 17 19 21 23]]
Multiplying an ndarray and a number : [20 24 28 32 36]
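
Adding arrays of different shapes works through broadcasting: shapes are compared from the last axis backwards, and two axes are compatible when they are equal or one of them is 1. A minimal sketch of a compatible and an incompatible pair (the array `c` is a hypothetical addition for illustration):

```python
import numpy as np

a = np.arange(10, 20, 2)    # shape (5,)
b = np.array([[2], [5]])    # shape (2, 1)

# (5,) is treated as (1, 5); each size-1 axis is stretched to match the other
total = a + b
print(total.shape)          # (2, 5)

# Trailing axes of size 5 and 3 are neither equal nor 1, so this fails
c = np.arange(3)            # shape (3,)
incompatible = False
try:
    a + c
except ValueError as err:
    incompatible = True
    print("Cannot broadcast:", err)
```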


In [42]:
print('a :',a)
print('Subtract :',a-2)
print('Multiply :',a*10)
print('Divide :',a/50)
print('Power :',a**4)
print('Remainder :',a%3)

a : [10 12 14 16 18]
Subtract : [ 8 10 12 14 16]
Multiply : [100 120 140 160 180]
Divide : [0.2  0.24 0.28 0.32 0.36]
Power : [ 10000  20736  38416  65536 104976]
Remainder : [1 0 2 1 0]


In [43]:
a = np.arange(5,15,2)
print('a :',a)
print('Mean :',np.mean(a))
print('Standard deviation :',np.std(a))
print('Median :',np.median(a))

a : [ 5  7  9 11 13]
Mean : 9.0
Standard deviation : 2.8284271247461903
Median : 9.0
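
By default `np.std` computes the population standard deviation (dividing by `n`). When the data is a sample, the `ddof=1` argument switches to the sample version (dividing by `n - 1`). A short sketch on the same array:

```python
import numpy as np

a = np.arange(5, 15, 2)            # [ 5  7  9 11 13]

population_std = np.std(a)         # divides by n
sample_std = np.std(a, ddof=1)     # divides by n - 1

print(population_std)   # square root of 8, about 2.83
print(sample_std)       # square root of 10, about 3.16
```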


In [44]:
a = np.array([[1,6],[4,3]])
print('a :\n',a)
print('Min - column:',np.min(a,axis=0)) # minimum of each column
print('Max - row:',np.max(a,axis=1)) # maximum of each row

a :
 [[1 6]
 [4 3]]
Min - column: [1 3]
Max - row: [6 4]


In [45]:
a = np.array([[1,6,5],[4,3,7]])
print('Min :',np.argmin(a,axis=0)) # Index of the minimum in each column
print('Max :',np.argmax(a,axis=1)) # Index of the maximum in each row

Min : [0 1 0]
Max : [1 2]
