# NumPy Intro
---
NumPy is a powerful Python library for numerical computing.   
It provides a high-performance multidimensional array object and tools for working with these arrays efficiently.   
NumPy arrays are the main way to use the NumPy library.   
In this course we will learn how to work with multidimensional arrays.   
NumPy is much faster than ordinary Python lists for numerical work because its core operations run in compiled C code.

##### 1-d array - vector
##### 2-d array - matrix
##### n-d array - tensor

# Outline
---
What is NumPy   
How to create NumPy arrays   
How to create NumPy arrays using functions   
How to reshape data   
What -1 means in reshape   
Generate same random numbers   
Indexing and slicing of 1-d & 2-d arrays   
View vs Copy   
Conditional Selection   
NumPy Operations   
NumPy axis   
Shuffle and unique   
Stacking   
Exercise   

## Create NumPy arrays

In [1]:
pip install numpy

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.


In [2]:
import numpy as np
# if numpy is not installed, then run a command in anaconda prompt 'conda install numpy'

In [3]:
nums = [1,2,3]        # simple 1-d list
nums = np.array(nums) # converted into numpy arrays
nums

array([1, 2, 3])

In [4]:
nums = [[1,2,3],[4,5,6]]  # 2-d list (a list of lists)
nums = np.array(nums)   # converted into 2-d numpy arrays
nums

array([[1, 2, 3],
       [4, 5, 6]])

In [5]:
nums = [[[1,2,3],[4,5,6]], [[7,8,9],[10,11,12]]]  # 3-d list
nums = np.array(nums)   # converted into 3-d numpy arrays
nums

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [6]:
nums.ndim # to check dimension of array

3

In [7]:
nums.shape

(2, 2, 3)

In [8]:
# a NumPy array stores a single datatype, so the ints are upcast to float
values = [1,2,3.0]   # avoid naming a variable `list`; it would shadow the built-in
nums = np.array(values)
nums

array([1., 2., 3.])

In [9]:
# mixing strings and a number: every element is stored as a string
arr = np.array(['apple', 'banana', 'cherry',1])
print(arr)

['apple' 'banana' 'cherry' '1']
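
The two cells above suggest that NumPy settles on one datatype for the whole array. The following is a minimal sketch, assuming a default NumPy installation, that inspects the `dtype` attribute to make that conversion visible. The variable names are illustrative only.

```python
import numpy as np

# Mixed ints and floats are upcast to a common float dtype.
mixed_numbers = np.array([1, 2, 3.0])
print(mixed_numbers.dtype)        # float64

# Mixing strings and numbers stores everything as text.
mixed_text = np.array(['apple', 'banana', 'cherry', 1])
print(mixed_text.dtype.kind)      # 'U' means Unicode string

# An explicit dtype overrides the inferred one.
as_ints = np.array([1.0, 2.0, 3.0], dtype=np.int64)
print(as_ints)
```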


## Create NumPy arrays using functions

In [10]:
#zeros, array of zeros
arr = np.zeros((4)) #1-d array of 4 zeros
arr

array([0., 0., 0., 0.])

In [11]:
arr = np.zeros((4,3)) #2-d array of 4 rows x 3 columns
arr

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [12]:
#ones, array of ones
arr = np.ones((4)) #1-d array of 4 ones
arr

array([1., 1., 1., 1.])

In [13]:
arr = np.ones((2,3)) #2-d array of 2 rows x 3 columns
arr

array([[1., 1., 1.],
       [1., 1., 1.]])

In [14]:
arr = np.ones((3,1,4)) # 3-d array
arr

array([[[1., 1., 1., 1.]],

       [[1., 1., 1., 1.]],

       [[1., 1., 1., 1.]]])

In [15]:
#eye, ones on diagonal
arr = np.eye((4)) #square matrix of 4 ones on diagonal
# arr = np.eye(4,3) #2-d array of ones on diagonal
arr

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

In [16]:
#diag, builds a square matrix with the given list on its diagonal
arr = np.diag([1,2,3])
arr

array([[1, 0, 0],
       [0, 2, 0],
       [0, 0, 3]])

In [17]:
# Random array
rand_arr = np.random.randint(1,15,4) # 4 random integers from 1 to 14 (the upper bound 15 is excluded)
rand_arr

array([ 1,  1,  8, 13])

In [18]:
rand_arr_2d = np.random.randint(1, 15, (2, 4))
print(rand_arr_2d)

[[ 2  5 13  3]
 [ 6 13  2 12]]


In [19]:
rand_arr = np.random.rand(4)   #4 random numbers between 0 and 1
rand_arr

array([0.21150688, 0.33380623, 0.60018476, 0.75363127])

In [20]:
rand_arr = np.random.rand(4,3) #2-d array
rand_arr

array([[0.86895622, 0.51286649, 0.50263619],
       [0.9320481 , 0.37858775, 0.13391952],
       [0.36189187, 0.72705459, 0.38554728],
       [0.39467182, 0.04895092, 0.38718316]])

In [21]:
np.mean(rand_arr) #mean of all values

0.46952616020604854

## Reshaping of data

In [37]:
# with reshaping, we can change the dimensions of arrays
arr = np.random.randint(1,100,20)
# arr

In [36]:
# reshaping
# the new shape must multiply to the total number of elements (here 2 x 10 = 20)
arr = arr.reshape(2,10)
# arr

In [35]:
arr = arr.reshape(5,4)
# arr

In [34]:
arr = arr.reshape(5,2,2)
# arr

In [31]:
arr.ndim   #number of dimensions of arr

3

In [32]:
arr.shape  #size of the array along each dimension

(5, 2, 2)

In [33]:
arr.size   #tells total elements of array

20
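
As a small sketch of the rule that the new shape must account for every element, the example below tries one valid and one invalid shape on a 20-element array. It uses `np.arange` instead of random numbers purely for readability; that substitution is an assumption of this example, not part of the cells above.

```python
import numpy as np

arr = np.arange(20)            # 20 elements: 0..19

# Valid: 4 * 5 == 20
ok_shape = arr.reshape(4, 5).shape
print(ok_shape)

# Invalid: 3 * 7 == 21, which does not match 20 elements
reshape_failed = False
try:
    arr.reshape(3, 7)
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```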

## Using -1 in reshape

In [39]:
# -1 lets NumPy work out the missing dimension from the total number of elements
arr = np.random.randint(1,100,20)
arr = arr.reshape(5,-1) # -1 is inferred as 4 columns (20 / 5)
arr 

array([[74, 43,  3, 30],
       [52, 31, 12, 51],
       [74, 22, 71, 72],
       [68,  7, 29, 89],
       [60, 46, 54, 92]])

In [40]:
arr = arr.reshape(-1,4) # -1 is inferred as 5 rows (20 / 4)
arr

array([[74, 43,  3, 30],
       [52, 31, 12, 51],
       [74, 22, 71, 72],
       [68,  7, 29, 89],
       [60, 46, 54, 92]])

In [44]:
arr = arr.reshape(-1,10) # -1 is inferred as 2 rows (20 / 10)
arr

array([[74, 43,  3, 30, 52, 31, 12, 51, 74, 22],
       [71, 72, 68,  7, 29, 89, 60, 46, 54, 92]])
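
The sketch below probes the limits of `-1`, assuming a 20-element array built with `np.arange` for readability. It shows the inferred dimension, the flattening case, and two shapes that NumPy is expected to reject.

```python
import numpy as np

arr = np.arange(20)

rows_fixed = arr.reshape(5, -1)     # NumPy infers 20 / 5 = 4 columns
flat = rows_fixed.reshape(-1)       # a single -1 flattens to 1-d
print(rows_fixed.shape, flat.shape)

# -1 only works when the known dimensions divide the size evenly,
# and at most one dimension may be -1.
errors = []
for bad_shape in [(3, -1), (-1, -1)]:
    try:
        arr.reshape(bad_shape)
    except ValueError as err:
        errors.append(str(err))
print(len(errors), "shapes were rejected")
```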

## Generate same random numbers

In [45]:
np.random.seed(1)
arr = np.random.randint(1,10,5)
arr

array([6, 9, 6, 1, 1])
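
To illustrate why seeding gives repeatable results, the sketch below resets the seed and draws again. The second half uses `np.random.default_rng`, a newer interface that is not used elsewhere in this notebook and is shown here only as an optional alternative.

```python
import numpy as np

np.random.seed(1)
first = np.random.randint(1, 10, 5)

np.random.seed(1)                      # reset to the same seed
second = np.random.randint(1, 10, 5)

print(np.array_equal(first, second))   # True: same seed, same numbers

# Newer NumPy code often uses a Generator object instead of the global seed.
rng_a = np.random.default_rng(1)
rng_b = np.random.default_rng(1)
same_from_rng = np.array_equal(rng_a.integers(1, 10, 5), rng_b.integers(1, 10, 5))
print(same_from_rng)
```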

## Indexing and slicing of 1-d array

In [None]:
arr[0]     #6
arr[3]     #1
arr[-3:]   #array([6, 1, 1])
arr[:]     #array([6, 9, 6, 1, 1])
arr[1:4]   #array([9, 6, 1])
arr[1:4:2] #array([9, 1])
arr[::-1]  #array([1, 1, 6, 9, 6])

## Indexing and slicing of 2-d array

In [47]:
matrix = np.array([[1,2,3],[4,5,6],[7,8,9]])
matrix

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [48]:
# indexing with chained brackets: matrix[row][col]
matrix[0][2] #3
matrix[1][1] #5 

5

In [49]:
# indexing with one bracket and a comma: matrix[row, col] (the usual NumPy style)
matrix[0,2] #3
matrix[1,1] #5

5

In [50]:
matrix

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [51]:
matrix[:2,:2]

array([[1, 2],
       [4, 5]])

In [52]:
matrix[1:,1:]

array([[5, 6],
       [8, 9]])
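
The outline lists "View vs Copy", and the slices above are a natural place to see it. As a minimal sketch: a basic slice shares memory with the original array, while `.copy()` creates independent data. The variable names `view` and `independent` are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

view = matrix[:2, :2]          # basic slice: shares memory with matrix
view[0, 0] = 100
print(matrix[0, 0])            # 100, the original changed too

independent = matrix[1:, 1:].copy()   # explicit copy: separate memory
independent[0, 0] = -1
print(matrix[1, 1])            # still 5

print(np.shares_memory(matrix, view), np.shares_memory(matrix, independent))
```

If a slice is going to be modified and the original must stay intact, calling `.copy()` first is the safer choice.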

## Conditional selection

In [53]:
arr = np.arange(1,11)
arr

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [54]:
arr>5

array([False, False, False, False, False,  True,  True,  True,  True,
        True])

In [55]:
arr[arr>5]

array([ 6,  7,  8,  9, 10])

In [56]:
# conditional selection
arr[arr%2==0]

array([ 2,  4,  6,  8, 10])

In [58]:
arr[arr%2==0] = 0
arr[arr%2==1] = 1
arr

array([1, 0, 1, 0, 1, 0, 1, 0, 1, 0])
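
Two common extensions of conditional selection are sketched below: combining conditions, and building a new array with `np.where` instead of overwriting the original. This is one possible way to write it, not the only one.

```python
import numpy as np

arr = np.arange(1, 11)

# Combine conditions with & (and) / | (or); each condition needs parentheses.
big_and_even = arr[(arr > 3) & (arr % 2 == 0)]
print(big_and_even)            # [ 4  6  8 10]

# np.where builds a new array and leaves arr unchanged.
parity = np.where(arr % 2 == 0, 0, 1)
print(parity)                  # [1 0 1 0 1 0 1 0 1 0]
```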

## NumPy operations with scalar

In [59]:
# a scalar operation is applied to every element of the array
arr = np.array([1,2,3,4])
arr

array([1, 2, 3, 4])

In [60]:
arr * 2

array([2, 4, 6, 8])

In [61]:
arr

array([1, 2, 3, 4])

In [62]:
arr + 5

array([6, 7, 8, 9])

In [63]:
arr ** 2

array([ 1,  4,  9, 16])

In [66]:
print(arr // 1)

[1 2 3 4]


In [65]:
arr / 0  # dividing by zero gives inf; NumPy emits a RuntimeWarning instead of raising an error

array([inf, inf, inf, inf])
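
If the warning from dividing by zero is expected and unwanted, one option is to silence it locally with `np.errstate`. The sketch below assumes that `inf` is an acceptable result for the calculation at hand.

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

# Silence the divide-by-zero warning only inside this block.
with np.errstate(divide='ignore'):
    result = arr / 0

print(result)                    # [inf inf inf inf]
print(np.isinf(result).all())    # True
```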

## Operations on two or more arrays 

In [67]:
arr1 = np.array([1,2,3,4])
# arr2 = np.array([5,6,7]) # shapes (4,) and (3,) do not match, so element-wise operations would raise an error
arr2 = np.array([5,6,7,8])
arr3 = np.array([9,10,11,12])

In [68]:
arr1 * arr2

array([ 5, 12, 21, 32])

In [69]:
arr1 * arr2 + arr3

array([14, 22, 32, 44])
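
The commented-out `arr2` above hints at what happens when shapes do not match. The sketch below triggers that error on purpose, then shows a case where differing shapes are still compatible (NumPy calls this broadcasting). The names `short` and `grid` are made up for the illustration.

```python
import numpy as np

arr1 = np.array([1, 2, 3, 4])
short = np.array([5, 6, 7])          # length 3 does not match length 4

try:
    arr1 * short
    shape_error = None
except ValueError as err:
    shape_error = str(err)
print(shape_error)

# Broadcasting: a (2, 4) array and a (4,) array are compatible,
# so the 1-d array is applied to every row.
grid = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(grid + arr1)
```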

In [71]:
# 2-d matrices
np.random.seed(24)
matrix1 = np.random.randint(1,15,9).reshape(3,3)
matrix2 = np.random.randint(16,30,(3,3))

In [72]:
matrix1

array([[ 3,  4,  1],
       [ 8,  2,  2],
       [ 2,  5, 12]])

In [73]:
matrix2

array([[20, 19, 18],
       [27, 27, 19],
       [19, 23, 25]])

In [74]:
matrix1 * matrix2

array([[ 60,  76,  18],
       [216,  54,  38],
       [ 38, 115, 300]])

In [75]:
# matrix multiplication according to linear algebra (rows times columns)
matrix1.dot(matrix2)

array([[187, 188, 155],
       [252, 252, 232],
       [403, 449, 431]])
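
To make the difference between `*` and `.dot` concrete, the sketch below rebuilds the two matrices shown above by hand (rather than relying on the seed) and checks one entry of the matrix product manually. The `@` operator is used as an equivalent spelling of `.dot` for 2-d arrays.

```python
import numpy as np

matrix1 = np.array([[3, 4, 1], [8, 2, 2], [2, 5, 12]])
matrix2 = np.array([[20, 19, 18], [27, 27, 19], [19, 23, 25]])

elementwise = matrix1 * matrix2     # multiplies matching positions
product = matrix1 @ matrix2         # rows of matrix1 times columns of matrix2

# Top-left entry by hand: 3*20 + 4*27 + 1*19 = 187
print(product[0, 0])
print(np.array_equal(product, matrix1.dot(matrix2)))   # True
```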

## Useful NumPy functions

In [76]:
arr = np.array([2,80,1,97,5])
arr

array([ 2, 80,  1, 97,  5])

In [77]:
np.min(arr) #to find minimum element of array

1

In [78]:
np.max(arr) #to find maximum element of array

97

In [79]:
np.argmin(arr) #to find location (index) of minimum element

2

In [80]:
np.argmax(arr) #to find location of maximum element

3

In [81]:
np.sin(arr)

array([ 0.90929743, -0.99388865,  0.84147098,  0.37960774, -0.95892427])

In [82]:
np.cos(arr)

array([-0.41614684, -0.11038724,  0.54030231, -0.92514754,  0.28366219])

In [83]:
arr2 = np.array([1,2,3,0],dtype='bool') # non-zero elements will become True
arr2

array([ True,  True,  True, False])

In [84]:
np.random.seed(34)
matrix = np.random.randint(1,15,(3,3))
matrix

array([[ 2, 11, 11],
       [10,  7,  6],
       [ 5,  4,  6]])

In [85]:
matrix.sum()
# or
np.sum(matrix)

62

In [86]:
# axis=1 works across the columns (one result per row); axis=0 works down the rows (one result per column)
np.min(matrix, axis=1) #min of each row

array([2, 6, 4])

In [87]:
np.max(matrix, axis=1) #max of each row

array([11, 10,  6])

In [88]:
np.sum(matrix, axis=1) #sum of each row

array([24, 23, 15])

In [89]:
np.sum(matrix, axis=0) #sum of each column

array([17, 22, 23])

In [90]:
#cumulative sum
matrix.cumsum()

array([ 2, 13, 24, 34, 41, 47, 52, 56, 62])

In [91]:
np.prod(matrix, axis=1) #product of each row

array([242, 420, 120])
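
One way to remember the `axis` argument is that it names the dimension that gets collapsed. The sketch below uses the same matrix values as above, typed in directly, and adds `keepdims=True` as an optional extra that keeps the result two-dimensional.

```python
import numpy as np

matrix = np.array([[2, 11, 11], [10, 7, 6], [5, 4, 6]])

col_sums = matrix.sum(axis=0)                     # collapse the rows: one value per column
row_sums = matrix.sum(axis=1)                     # collapse the columns: one value per row
row_sums_2d = matrix.sum(axis=1, keepdims=True)   # same values, kept as shape (3, 1)

print(col_sums)            # [17 22 23]
print(row_sums)            # [24 23 15]
print(row_sums_2d.shape)   # (3, 1)

# cumsum also accepts an axis; without one it flattens the array first.
row_cumsum = matrix.cumsum(axis=1)
print(row_cumsum)
```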

## Shuffle and Unique

In [92]:
np.random.seed(34)
arr = np.random.randint(1,15,15)
arr

array([ 2, 11, 11, 10,  7,  6,  5,  4,  6, 12, 11, 14,  3, 14, 10])

In [93]:
np.random.shuffle(arr)
arr

array([ 4, 11, 11, 10, 12, 10, 11,  2, 14,  7,  3,  6,  5, 14,  6])

In [94]:
np.unique(arr) #array of unique numbers  in sorted form

array([ 2,  3,  4,  5,  6,  7, 10, 11, 12, 14])

In [95]:
np.unique(arr).size

10
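
Two related tools may be useful here, sketched under the assumption that the array holds the same values as the seeded one above: `np.unique` can also count occurrences, and `np.random.permutation` returns a shuffled copy instead of shuffling in place.

```python
import numpy as np

arr = np.array([2, 11, 11, 10, 7, 6, 5, 4, 6, 12, 11, 14, 3, 14, 10])

values, counts = np.unique(arr, return_counts=True)
occurrences = dict(zip(values.tolist(), counts.tolist()))
print(occurrences)

# np.random.permutation returns a shuffled copy and leaves arr untouched,
# whereas np.random.shuffle reorders arr in place.
shuffled = np.random.permutation(arr)
print(np.array_equal(np.sort(shuffled), np.sort(arr)))   # True
```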

## Horizontal and Vertical Stacking 

In [96]:
np.random.seed(123)
matrix1 = np.random.randint(1,20,9).reshape(3,3)
matrix2 = np.random.randint(20,40,9).reshape(3,3)

In [102]:
# matrix1

In [101]:
# matrix2

In [99]:
np.hstack((matrix1,matrix2))

array([[14,  3,  3, 35, 29, 20],
       [ 7, 18, 11, 34, 20, 35],
       [ 2,  1, 18, 39, 34, 24]])

In [100]:
np.vstack((matrix1,matrix2))

array([[14,  3,  3],
       [ 7, 18, 11],
       [ 2,  1, 18],
       [35, 29, 20],
       [34, 20, 35],
       [39, 34, 24]])
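
For 2-d arrays, horizontal and vertical stacking can also be written with `np.concatenate` and an explicit axis. The sketch below uses two small hand-written matrices (`a` and `b`, chosen only for illustration) to show the correspondence.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

side_by_side = np.concatenate((a, b), axis=1)   # same as np.hstack for 2-d
stacked = np.concatenate((a, b), axis=0)        # same as np.vstack for 2-d

print(side_by_side.shape, stacked.shape)        # (2, 4) (4, 2)
```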

## Exercise 

In [None]:
# Import numpy library
# Create an array using array function
# Create an array containing numbers from 20 to 40
# Create an array which contains 10 elements and every element is 5
# Create a 1-d array and then convert it into 3-d array
# Create 2-d array which contains 25 random numbers between 0 and 1
# Create array containing 20 random integers then replace every even number with -1

# You have a matrix
# [[35, 29, 20],
#  [34, 20, 35],
#  [39, 34, 24]]
# Extract
# [[34, 20],
#  [39, 34]]

# Concatenate two 1-d arrays
# Stack 2-d matrix horizontally and vertically

In [103]:
# Create an array using array function 
arr = np.array([1,2,3])
arr

array([1, 2, 3])

In [104]:
# Create an array containing numbers from 20 to 40 
arr2 = np.arange(20,41)
arr2

array([20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
       37, 38, 39, 40])

In [105]:
# Create an array which contains 10 elements and every element is 5
fives = np.ones(10) * 5
fives

array([5., 5., 5., 5., 5., 5., 5., 5., 5., 5.])
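
Multiplying `np.ones` by 5 gives floats. If integers are wanted, `np.full` is a possible alternative; the sketch below compares the two.

```python
import numpy as np

fives_float = np.ones(10) * 5        # float64, as in the cell above
fives_int = np.full(10, 5)           # integer dtype inferred from the fill value

print(fives_float.dtype, fives_int.dtype)
```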

In [107]:
# Create a 1-d array and then convert it into 3-d array
arr3 = np.random.randint(1,30,9).reshape(3,1,3)
arr3

array([[[ 3, 21, 16]],

       [[25, 17,  8]],

       [[10,  4, 29]]])

In [108]:
# Create 2-d array which contains 25 random numbers between 0 and 1 
arr4 = np.random.rand(5,5)
arr4

array([[0.35772892, 0.41720995, 0.65472131, 0.37380143, 0.23451288],
       [0.98799529, 0.76599595, 0.77700444, 0.02798196, 0.17390652],
       [0.15408224, 0.07708648, 0.8898657 , 0.7503787 , 0.69340324],
       [0.51176338, 0.46426806, 0.56843069, 0.30254945, 0.49730879],
       [0.68326291, 0.91669867, 0.10892895, 0.49549179, 0.23283593]])

In [109]:
# Create array containing 20 random integers then replace every even number with -1
arr5 = np.random.randint(1,50,20)
arr5[arr5%2==0] = -1
arr5

array([21, 13, 19, -1, -1, 45, -1, 49, -1, 23, -1, -1, -1, -1, -1, -1, -1,
       35, -1, -1])

In [110]:
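# Extract the lower-left 2x2 block (this seeded matrix differs from the exercise matrix, but the slice is the same)
np.random.seed(123)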
matrix = np.random.randint(20,40,9).reshape(3,3)
matrix

array([[33, 22, 22],
       [26, 37, 39],
       [30, 21, 20]])

In [111]:
matrix[1:,0:2]

array([[26, 37],
       [30, 21]])

In [113]:
# Concatenate two 1-d arrays
arr1=np.array([1,2,3])
arr2=np.array([4,5,6])
np.hstack((arr1,arr2))

array([1, 2, 3, 4, 5, 6])

In [114]:
np.vstack((arr1,arr2))

array([[1, 2, 3],
       [4, 5, 6]])

In [115]:
# Stack 2-d matrix horizontally and vertically 
np.random.seed(123)
matrix1 = np.random.randint(1,20,9).reshape(3,3)
matrix2 = np.random.randint(20,40,9).reshape(3,3)

In [116]:
np.hstack((matrix1,matrix2))

array([[14,  3,  3, 35, 29, 20],
       [ 7, 18, 11, 34, 20, 35],
       [ 2,  1, 18, 39, 34, 24]])

In [117]:
np.vstack((matrix1,matrix2))

array([[14,  3,  3],
       [ 7, 18, 11],
       [ 2,  1, 18],
       [35, 29, 20],
       [34, 20, 35],
       [39, 34, 24]])

## NumPy arrays vs Python lists

In [119]:
%timeit 7+3

14.5 ns ± 0.323 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


In [120]:
%timeit [i**2 for i in range(1,1000000)]

151 ms ± 10.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [121]:
%timeit np.arange(1,1000000)**2

4.5 ms ± 72.5 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
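
`%timeit` only works inside IPython or Jupyter. Outside a notebook, a rough equivalent can be sketched with the standard-library `timeit` module. The size `n` and the helper function names below are choices made for this example, and the timings will differ from machine to machine.

```python
import timeit
import numpy as np

n = 100_000   # smaller than the notebook's 1,000,000 to keep the run short

def squares_list():
    return [i**2 for i in range(1, n)]

def squares_numpy():
    # int64 is requested explicitly; some platforms default to 32-bit ints,
    # which would overflow for large squares
    return np.arange(1, n, dtype=np.int64) ** 2

list_time = timeit.timeit(squares_list, number=10)
numpy_time = timeit.timeit(squares_numpy, number=10)
print(f"list: {list_time:.4f} s, numpy: {numpy_time:.4f} s")

# Both approaches produce the same values.
same_values = squares_list() == squares_numpy().tolist()
print(same_values)
```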


#### Scenario:
---
You are hired as a <b>Data Analyst</b> for a supermarket chain.   
You are given <b>sales data</b> of different products from multiple branches in the form of NumPy arrays.   
Your job is to perform <b>basic analysis</b> to help the business understand their performance.

In [None]:
# Branch names
branches = np.array(['North', 'South', 'East', 'West'])
# Products sold in 4 branches (rows = branches, columns = products)
sales = np.array([
    [200, 220, 250, 210],   # North
    [180, 190, 230, 200],   # South
    [210, 215, 240, 220],   # East
    [190, 205, 225, 215]    # West
])
# Prices of 4 products
prices = np.array([20, 30, 25, 22])  # in dollars

##### Tasks:
1. Total Sales Calculation
Find the total items sold by each branch.

2. Revenue Calculation
Calculate the total revenue for each branch. (Revenue = items sold × price)

3. Best Performing Branch
Find the branch with the highest total revenue.

4. Product Popularity
Find out which product is the most sold across all branches.

5. Product Contribution
Find the percentage contribution of each product to the total sales.

In [123]:
import numpy as np

# Data Provided
branches = np.array(['North', 'South', 'East', 'West'])

sales = np.array([
    [200, 220, 250, 210],   # North
    [180, 190, 230, 200],   # South
    [210, 215, 240, 220],   # East
    [190, 205, 225, 215]    # West
])

prices = np.array([20, 30, 25, 22])  # Price per product

In [124]:
# 1. Total Items Sold by Each Branch
total_items_sold = np.sum(sales, axis=1)
print("Total items sold by each branch:", total_items_sold)

Total items sold by each branch: [880 800 885 835]


In [125]:
# 2. Total Revenue for Each Branch
# Revenue per product per branch = sales * prices
branch_revenue = np.sum(sales * prices, axis=1)
print("Total revenue by each branch:", branch_revenue)

Total revenue by each branch: [21470 19450 21490 20305]


In [126]:
# 3. Best Performing Branch (Highest Revenue)
best_branch_index = np.argmax(branch_revenue)
best_branch = branches[best_branch_index]
print("Best performing branch:", best_branch)

Best performing branch: East


In [131]:
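# 4. Most Sold Product Across All Branches
product_sales_total = np.sum(sales, axis=0)
most_sold_product_index = np.argmax(product_sales_total)
print("Total items sold per product:", product_sales_total)
print("Most sold product index:", most_sold_product_index)

Total items sold per product: [780 830 945 845]
Most sold product index: 2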


In [133]:
#5. Percentage Contribution of Each Product in Total Sales
total_sales_all = np.sum(product_sales_total)
print(total_sales_all)
product_percentage = (product_sales_total / total_sales_all) * 100
print("Product sales percentage:", np.round(product_percentage, 2))

3400
Product sales percentage: [22.94 24.41 27.79 24.85]
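
As an optional variation on task 2, the per-branch revenue can also be written as a matrix-vector product, since each branch's revenue is the dot product of its sales row with the price vector. The sketch below repeats the provided data and adds a ranking of branches; the ranking step is an extra beyond the listed tasks.

```python
import numpy as np

branches = np.array(['North', 'South', 'East', 'West'])
sales = np.array([
    [200, 220, 250, 210],   # North
    [180, 190, 230, 200],   # South
    [210, 215, 240, 220],   # East
    [190, 205, 225, 215]    # West
])
prices = np.array([20, 30, 25, 22])

# Each row of sales dotted with prices gives that branch's revenue.
branch_revenue = sales @ prices

# argsort is ascending, so reverse it to rank from highest to lowest revenue.
ranking = np.argsort(branch_revenue)[::-1]
for idx in ranking:
    print(branches[idx], branch_revenue[idx])
```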
