
In [None]:
import numpy as np


NumPy is a Python library used for working with arrays.

It also provides functions for linear algebra, Fourier transforms, and matrix operations.

NumPy was created in 2005 by Travis Oliphant. It is an open source project and you can use it freely.

NumPy stands for Numerical Python.


# Why Use NumPy?

In Python, lists serve the purpose of arrays, but they are slow to process.

NumPy aims to provide an array object that is up to 50x faster than traditional Python lists.

The array object in NumPy is called ndarray. It comes with many supporting functions that make working with it easy.

Arrays are used very frequently in data science, where speed and resources are very important.

## Why is NumPy Faster Than Lists?
NumPy arrays are stored in one contiguous block of memory, unlike lists, so processes can access and manipulate them very efficiently.

This behavior is called locality of reference in computer science.

This is the main reason why NumPy is faster than lists. It is also optimized to work with the latest CPU architectures.
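
The contiguous layout described above can be inspected directly. The following is a minimal sketch, assuming a 1-D array of 64-bit integers chosen only for illustration: `itemsize` is the size of one element in bytes, and `strides` is the number of bytes to step to reach the next element.

```python
import numpy as np

arr = np.arange(5, dtype=np.int64)

# Every element occupies the same number of bytes...
print(arr.itemsize)               # 8
# ...and moving to the next element advances exactly that many bytes.
print(arr.strides)                # (8,)
# The flag reports that the data sits in one contiguous block.
print(arr.flags['C_CONTIGUOUS'])  # True
```

Because the elements sit back to back, NumPy can walk through them with simple pointer arithmetic instead of following a separate reference for each item, as a Python list must.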


# Create a NumPy ndarray Object
NumPy is used to work with arrays. The array object in NumPy is called ndarray.

We can create a NumPy ndarray object by using the array() function.

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)

print(type(arr))

[1 2 3 4 5]
<class 'numpy.ndarray'>


type(): This built-in Python function tells us the type of the object passed to it. In the code above, it shows that arr is of type numpy.ndarray.

To create an ndarray, we can pass a list, tuple, or any array-like object into the array() function, and it will be converted into an ndarray:

In [None]:
# Use a tuple to create a NumPy array:
arr = np.array((1, 2, 3, 4, 5))
print(arr)

[1 2 3 4 5]


# Dimensions in Arrays
A dimension in arrays is one level of array depth (nested arrays).

# 0-D Arrays
0-D arrays, or scalars, are the elements in an array. Each value in an array is a 0-D array.

In [None]:
# Create a 0-D array with value 42
arr = np.array(42)
print(arr)

42


# 1-D Arrays
An array that has 0-D arrays as its elements is called uni-dimensional or 1-D array.

These are the most common and basic arrays.



In [None]:
# Create a 1-D array containing the values 1,2,3,4,5:
arr = np.array([1, 2, 3, 4, 5])

print(arr)

[1 2 3 4 5]


# 2-D Arrays
An array that has 1-D arrays as its elements is called a 2-D array.

These are often used to represent matrices or 2nd-order tensors.

In [None]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr)

[[1 2 3]
 [4 5 6]]


# Check the Number of Dimensions
NumPy arrays provide the ndim attribute, which returns an integer telling us how many dimensions the array has.

In [None]:
import numpy as np

a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)

0
1
2
3


# Higher Dimensional Arrays
An array can have any number of dimensions.

When creating an array, you can set the minimum number of dimensions with the ndmin argument.

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4], ndmin=5)

print(arr)
print('number of dimensions :', arr.ndim)

[[[[[1 2 3 4]]]]]
number of dimensions : 5


# NumPy Array Slicing
Slicing in Python means taking elements from one given index up to, but not including, another given index.

We pass a slice instead of an index, like this: [start:end].

We can also define the step, like this: [start:end:step].

If we don't pass start, it is taken to be 0.

If we don't pass end, it is taken to be the length of the array in that dimension.

If we don't pass step, it is taken to be 1.
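
The slicing rules above can be tried on a small array. This is a minimal sketch; the array values are chosen only for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[1:5])    # [2 3 4 5]  index 1 up to (not including) index 5
print(arr[4:])     # [5 6 7]    end omitted: runs to the end of the array
print(arr[:4])     # [1 2 3 4]  start omitted: begins at index 0
print(arr[1:5:2])  # [2 4]      every second element between index 1 and 5
print(arr[::2])    # [1 3 5 7]  every second element of the whole array
```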

# Data Types in NumPy
NumPy has some extra data types and can refer to data types with a single character, like i for integers, u for unsigned integers, etc.

Below is a list of the data types in NumPy and the characters used to represent them.


i - integer
b - boolean
u - unsigned integer
f - float
c - complex float
m - timedelta
M - datetime
O - object
S - string
U - unicode string
V - fixed chunk of memory for other type ( void )
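
These characters can be passed to the dtype argument of array() when creating an array. The sketch below assumes three small example arrays (the variable names are illustrative) and reads the character back through `dtype.kind`.

```python
import numpy as np

ints = np.array([1, 2, 3], dtype='i4')   # 4-byte integer
floats = np.array([1, 2, 3], dtype='f')  # single-precision float
text = np.array([1, 2, 3], dtype='S')    # byte strings

print(ints.dtype, ints.dtype.kind)      # int32 i
print(floats.dtype, floats.dtype.kind)  # float32 f
print(text, text.dtype.kind)            # [b'1' b'2' b'3'] S
```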




In [None]:
# Checking the Data Type of an Array
# The NumPy array object has a property called dtype that returns the data type of the array:


arr = np.array([1, 2, 3, 4])

print(arr.dtype)

int64


# NumPy Array Shape
The shape of an array is the number of elements in each dimension.



In [None]:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

print(arr.shape)

# The example above returns (2, 4), which means that the array has 2 dimensions, where the first dimension has 2 elements and the second has 4.

(2, 4)


What does the shape tuple represent?
The integer at each index gives the number of elements in the corresponding dimension.

In the example above, index 1 holds the value 4, so the 2nd (1 + 1) dimension has 4 elements.
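
The same reading of the shape tuple applies to arrays with more dimensions. As a small sketch, the array below is built with ndmin=5 (an illustrative choice), so its shape tuple has five entries and the last one, at index 4, counts the elements of the innermost dimension.

```python
import numpy as np

arr = np.array([1, 2, 3, 4], ndmin=5)

print(arr.shape)     # (1, 1, 1, 1, 4)
print(arr.shape[4])  # 4  -> the 5th dimension has 4 elements
print(arr.shape[0])  # 1  -> the 1st dimension has 1 element
```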

# Reshaping arrays
Reshaping means changing the shape of an array.

The shape of an array is the number of elements in each dimension.

By reshaping we can add or remove dimensions or change number of elements in each dimension.

In [None]:
# Convert the following 1-D array with 12 elements into a 2-D array.
# The outermost dimension will have 4 arrays, each with 3 elements:

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

newarr = arr.reshape(4, 3)

print(newarr)

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


In [None]:
# Convert the following 1-D array with 12 elements into a 3-D array.
# The outermost dimension will have 2 arrays that contains 3 arrays, each with 2 elements:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

newarr = arr.reshape(2, 3, 2)

print(newarr)

[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]]


In [None]:
# Can we reshape into any shape?
# Yes, as long as both shapes hold the same total number of elements.
# An 8-element 1-D array can be reshaped into a 2-D array of 2 rows with 4 elements each,
# but not into a 3x3 array, because that would require 3 x 3 = 9 elements.
# The reshape below therefore raises a ValueError:


arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

newarr = arr.reshape(3, 3)

print(newarr)

ValueError: cannot reshape array of size 8 into shape (3,3)

In [None]:
# Check if the returned array is a copy or a view:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

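# .base is None for an array that owns its data.
# Here it is the original array, so reshape returned a view.
print(arr.reshape(2, 4).base)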

[1 2 3 4 5 6 7 8]
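
One practical consequence of getting a view back is that writing through the reshaped array changes the original. The sketch below illustrates this under the assumption that the source array owns its data; the names `view` and `copy` are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

view = arr.reshape(2, 4)
print(view.base is arr)  # True: the reshaped array shares arr's data

view[0, 0] = 99          # write through the view...
print(arr[0])            # 99  ...and the original changes too

copy = arr.reshape(2, 4).copy()  # an explicit, independent copy
print(copy.base is None)         # True: the copy owns its own data
copy[0, 1] = -1
print(arr[1])                    # 2  the original is untouched
```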


# Flattening the arrays
Flattening an array means converting a multidimensional array into a 1D array.

We can use reshape(-1) to do this.

In [None]:
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

newarr = arr.reshape(-1)

print(newarr)

[1 2 3 4 5 6]
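
NumPy also offers the flatten() method for the same job. A minimal sketch of the difference, assuming a contiguous input array: reshape(-1) returns a view where it can, while flatten() always returns a copy. `np.shares_memory` is used here only to make that visible.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

flat_view = arr.reshape(-1)  # view onto the same data (for a contiguous array)
flat_copy = arr.flatten()    # always a new, independent array

print(np.shares_memory(arr, flat_view))  # True
print(np.shares_memory(arr, flat_copy))  # False
```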


If an array has n dimensions, writing n nested for loops to reach each element quickly becomes impractical.

# Iterating Arrays Using nditer() - used for complex iteration over items
The function nditer() is a helper function that can be used for very basic to very advanced iterations. It solves some basic issues we face in iteration. Let's go through it with examples.

In [None]:


# Iterating on Each Scalar Element
# With basic for loops, iterating through each scalar of an n-D array requires n nested loops, which is hard to write for arrays of high dimensionality.

# Example
# Iterate through the following 3-D array:

import numpy as np

arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

for x in np.nditer(arr):
  print(x)

1
2
3
4
5
6
7
8
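
When the position of each element matters as well as its value, `np.ndenumerate` is a related helper that yields the index tuple together with the value. A small sketch on an illustrative 2-D array:

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])

# Each step yields (index tuple, value), whatever the number of dimensions
for index, value in np.ndenumerate(arr):
    print(index, value)

# (0, 0) 1
# (0, 1) 2
# (1, 0) 3
# (1, 1) 4
```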


# NumPy Splitting Array
Splitting is the reverse operation of joining.

Joining merges multiple arrays into one, and splitting breaks one array into multiple.

We use array_split() to split arrays, passing it the array we want to split and the number of splits.

In [None]:

arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 3)

print(newarr[0])
print(newarr[1])
print(newarr[2])
newarr

[1 2]
[3 4]
[5 6]


[array([1, 2]), array([3, 4]), array([5, 6])]
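
array_split() also copes with arrays that cannot be divided evenly: the earlier pieces simply get one extra element. The sketch below uses an illustrative 7-element array and contrasts this with `np.split`, which requires an equal division.

```python
import numpy as np

arr = np.arange(7)  # 7 elements cannot be divided evenly into 3 parts

parts = np.array_split(arr, 3)
print([p.tolist() for p in parts])  # [[0, 1, 2], [3, 4], [5, 6]]

# np.split() insists on an equal division and raises ValueError otherwise
try:
    np.split(arr, 3)
except ValueError as err:
    print("np.split failed:", err)
```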

In [None]:
# Splitting 2-D Arrays
# Use the same syntax when splitting 2-D arrays.
# Use the array_split() function, passing in the array you want to split and the number of splits you want.


import numpy as np

arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])

newarr = np.array_split(arr, 3)

print(newarr)

[array([[1, 2],
       [3, 4]]), array([[5, 6],
       [7, 8]]), array([[ 9, 10],
       [11, 12]])]
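
By default the split runs along the first axis (rows). As a sketch, the axis argument of array_split() can be used to split along columns instead; the variable names are illustrative.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])

# axis=1 splits the columns: each piece keeps all 6 rows but only 1 column
left, right = np.array_split(arr, 2, axis=1)

print(left.shape, right.shape)  # (6, 1) (6, 1)
print(left.ravel())             # [ 1  3  5  7  9 11]
print(right.ravel())            # [ 2  4  6  8 10 12]
```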


# NumPy Searching Arrays
You can search an array for a certain value and return the indexes where it matches.

To search an array, use the where() function.

In [None]:

arr = np.array([1, 2, 3, 4, 5, 4, 4])

x = np.where(arr == 4)

print(x)
# returns the indexes where the element was found

(array([3, 5, 6]),)
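
where() accepts any boolean condition, not only equality. The sketch below, on an illustrative array, finds the indexes of the even values and also shows the three-argument form, which picks a value from one of two sources depending on the condition.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# where() returns a tuple with one index array per dimension; take the first
even_idx = np.where(arr % 2 == 0)[0]
print(even_idx)       # [1 3 5 7]
print(arr[even_idx])  # [2 4 6 8]

# Three-argument form: keep even values, replace the rest with 0
print(np.where(arr % 2 == 0, arr, 0))  # [0 2 0 4 0 6 0 8]
```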


# Search Sorted
There is a function called searchsorted() which performs a binary search in the array and returns the index where the specified value would be inserted to maintain the sort order.

The searchsorted() function assumes the array is already sorted.

Example
Find the indexes where the value 7 should be inserted:

In [None]:
import numpy as np

arr = np.array([6, 7, 8, 9])

x = np.searchsorted(arr, 7)

print(x)


1


Example explained: The number 7 should be inserted at index 1 to preserve the sort order.

With the default side='left', the function returns the first index at which 7 is no longer larger than the value at that position.


# Search From the Right Side
By default the leftmost index is returned, but we can pass side='right' to return the rightmost index instead.

Example
Find the indexes where the value 7 should be inserted, starting from the right:



In [None]:
import numpy as np

arr = np.array([6, 7, 8, 9])

x = np.searchsorted(arr, 7, side='right')

print(x)


2


Example explained: The number 7 should be inserted at index 2 to preserve the sort order.

With side='right', the function returns the index just past the last value that is less than or equal to 7, so an existing 7 ends up to the left of the insertion point.
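
searchsorted() can also look up several values in one call by passing an array-like of values. A small sketch with illustrative data, followed by a check that inserting at the returned index keeps the array sorted:

```python
import numpy as np

arr = np.array([1, 3, 5, 7])

# One insertion index per searched value
print(np.searchsorted(arr, [2, 4, 6]))  # [1 2 3]

# Inserting at the returned index keeps the array sorted
idx = np.searchsorted(arr, 4)
print(np.insert(arr, idx, 4))           # [1 3 4 5 7]
```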

# NumPy Sorting Arrays
Sorting means putting elements in an ordered sequence.

Ordered sequence is any sequence that has an order corresponding to elements, like numeric or alphabetical, ascending or descending.

NumPy provides the function np.sort(), which returns a sorted copy of the specified array and leaves the original unchanged.


In [None]:

arr = np.array([3, 2, 0, 1])

print(np.sort(arr))

[0 1 2 3]


# Sorting a 2-D Array
If you use sort() on a 2-D array, each row is sorted independently (by default the sort runs along the last axis):

In [None]:


# Example
# Sort a 2-D array:

import numpy as np

arr = np.array([[3, 2, 4], [5, 0, 1]])

print(np.sort(arr))

[[2 3 4]
 [0 1 5]]
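
The axis argument of np.sort() controls the direction of the sort. The sketch below reuses the same 2-D array to sort down the columns, to sort the flattened array, and to sort in place with the ndarray.sort() method.

```python
import numpy as np

arr = np.array([[3, 2, 4], [5, 0, 1]])

print(np.sort(arr, axis=0))     # sort each column
# [[3 0 1]
#  [5 2 4]]

print(np.sort(arr, axis=None))  # flatten first, then sort
# [0 1 2 3 4 5]

arr.sort()                      # in-place sort along the last axis (rows)
print(arr)
# [[2 3 4]
#  [0 1 5]]
```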


# NumPy Filter Array
Getting some elements out of an existing array and creating a new array out of them is called filtering.

In NumPy, you filter an array using a boolean index list.

In [None]:
import numpy as np

arr = np.array([41, 42, 43, 44])

x = arr[[True, False, True, False]]

print(x)


[41 43]


The example above will return [41, 43], why?

Because the new array contains only the values where the filter array had the value True, in this case, index 0 and 2.

In [None]:
# Create a filter array that will return only values higher than 42:

import numpy as np

arr = np.array([41, 42, 43, 44])

# Create an empty list
filter_arr = []

# go through each element in arr
for element in arr:
  # if the element is higher than 42, set the value to True, otherwise False:
  if element > 42:
    filter_arr.append(True)
  else:
    filter_arr.append(False)

newarr = arr[filter_arr]

print(filter_arr)
print(newarr) # contains the filtered value from the array

[False, False, True, True]
[43 44]
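
The loop above can be replaced by a single comparison, because comparing an array with a scalar already produces a boolean array. A minimal sketch of the same filter:

```python
import numpy as np

arr = np.array([41, 42, 43, 44])

# The comparison is applied element by element and yields a boolean array
filter_arr = arr > 42

print(filter_arr)       # [False False  True  True]
print(arr[filter_arr])  # [43 44]
```

The mask can also be written inline, as in `arr[arr > 42]`.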


# **Broadcasting Basics** 🔥
Broadcasting in NumPy is a powerful mechanism that allows NumPy to work with arrays of different shapes during arithmetic operations. It is an elegant way to vectorize array operations so that looping is done in C instead of Python, allowing for efficient computation.

The simplest example of broadcasting occurs when performing arithmetic operations between an array and a scalar. NumPy allows you to add a scalar to an array or multiply an array by a scalar directly:

In [None]:
import numpy as np

# Example 1: Adding a scalar to an array
array = np.array([1, 2, 3])
scalar = 2
result = array + scalar  # [3, 4, 5]
print(result)

[3 4 5]


# Broadcasting with Arrays of Different Shapes
Broadcasting can also happen with arrays of different shapes. For two arrays to be broadcast-compatible, the following rules must be satisfied:

1. If the arrays differ in their number of dimensions, the shape of the smaller array is padded with ones on its left side.
2. Arrays are compatible for broadcasting if, in every dimension, the axis lengths are either the same or one of them is 1.

Here are some examples:

Example 2: Broadcasting a 1D array to a 2D array

In [None]:
# 2D array
matrix = np.array([[1, 2, 3], [4, 5, 6]])

# 1D array
vector = np.array([1, 0, 1])

# Broadcasting the 1D array to match the 2D array
result = matrix + vector
# result is:
# array([[2, 2, 4],
#        [5, 5, 7]])


In this example, the 1D array vector is broadcast to each row of the 2D array matrix.
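
The two compatibility rules can be checked on shapes directly. The sketch below uses illustrative arrays: a (3, 1) column combined with a (4,) row broadcasts to (3, 4), while a (2, 3) matrix and a (2,) vector are incompatible because the trailing axis lengths 3 and 2 differ and neither is 1.

```python
import numpy as np

col = np.array([[0], [10], [20]])  # shape (3, 1)
row = np.array([1, 2, 3, 4])       # shape (4,), treated as (1, 4)

result = col + row
print(result.shape)  # (3, 4)
print(result)
# [[ 1  2  3  4]
#  [11 12 13 14]
#  [21 22 23 24]]

matrix = np.ones((2, 3))
bad = np.array([1, 2])             # shape (2,), treated as (1, 2)

try:
    matrix + bad                   # trailing axes: 3 vs 2, neither is 1
except ValueError as err:
    print("ValueError:", err)
```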

# **Practical Example**
Suppose we have a dataset where we need to normalize the features by subtracting the mean and dividing by the standard deviation of each feature. Here's how you can do it using broadcasting:

In [None]:
# Example dataset (rows are samples, columns are features)
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0],
                 [7.0, 8.0, 9.0]])

# Compute the mean and standard deviation along the columns
mean = np.mean(data, axis=0)  # mean along columns
std = np.std(data, axis=0)    # standard deviation along columns

# Normalize the dataset
normalized_data = (data - mean) / std
# normalized_data is:
# array([[-1.22474487, -1.22474487, -1.22474487],
#        [ 0.        ,  0.        ,  0.        ],
#        [ 1.22474487,  1.22474487,  1.22474487]])


In this example, the arrays mean and std are broadcast across each row of data to perform element-wise subtraction and division.
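
Statistics taken per row need one extra step before they broadcast correctly. As a sketch under the same row/column layout, passing keepdims=True keeps the result as a (3, 1) column, which then lines up against each row; the names `row_mean` and `centered` are illustrative.

```python
import numpy as np

data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0],
                 [7.0, 8.0, 9.0]])

# keepdims=True keeps the reduced axis with length 1: shape (3, 1), not (3,)
row_mean = data.mean(axis=1, keepdims=True)
print(row_mean.shape)  # (3, 1)

# Each row has its own mean subtracted
centered = data - row_mean
print(centered)
# [[-1.  0.  1.]
#  [-1.  0.  1.]
#  [-1.  0.  1.]]
```

Without keepdims=True the means would have shape (3,) and would be lined up against the columns instead of the rows.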

Conclusion
Broadcasting allows NumPy to perform operations on arrays of different shapes and sizes efficiently. It eliminates the need for explicit loops and makes code more readable and concise. Understanding broadcasting rules is essential for writing efficient numerical computations in NumPy.

# **Real Life Usefulness of Broadcasting**

# Example 1: Image Processing - Adjusting Brightness
Suppose you have an image represented as a 3D NumPy array (height x width x color channels), and you want to increase the brightness of the image by a fixed amount. This can be done using broadcasting:

In [None]:
import numpy as np

# Example image (height=2, width=3, color channels=3)
# Every value must fit in uint8 (0-255)
image = np.array([[[100, 150, 200], [120, 170, 220], [130, 180, 230]],
                  [[140, 190, 240], [150, 200, 250], [160, 210, 255]]], dtype=np.uint8)

# Amount added to every channel of every pixel
brightness_increase = 50

# Widen to int16 before adding: uint8 arithmetic wraps around (220 + 50 would become 14),
# so clipping the uint8 sum would come too late. Clip, then convert back to uint8.
brightened_image = np.clip(image.astype(np.int16) + brightness_increase, 0, 255).astype(np.uint8)
# brightened_image is:
# array([[[150, 200, 250],
#         [170, 220, 255],
#         [180, 230, 255]],
#        [[190, 240, 255],
#         [200, 250, 255],
#         [210, 255, 255]]], dtype=uint8)


Here, the scalar brightness_increase is broadcast to each pixel in the 3D image array. The image is widened to int16 before the addition because uint8 arithmetic wraps around on overflow, and np.clip then ensures pixel values stay within the valid range [0, 255].

# Example 2: Financial Data - Normalizing Stock Prices

Imagine you have daily stock prices for multiple companies over a year, and you want to normalize these prices by subtracting the mean and dividing by the standard deviation for each company.

In [None]:
# Example stock prices (rows are days, columns are companies)
stock_prices = np.array([[100, 200, 300],
                         [110, 210, 310],
                         [120, 220, 320],
                         [130, 230, 330]])

# Compute the mean and standard deviation along the columns (for each company)
mean_prices = np.mean(stock_prices, axis=0)
std_prices = np.std(stock_prices, axis=0)

# Normalize the stock prices
normalized_prices = (stock_prices - mean_prices) / std_prices
# normalized_prices is:
# array([[-1.34164079, -1.34164079, -1.34164079],
#        [-0.4472136 , -0.4472136 , -0.4472136 ],
#        [ 0.4472136 ,  0.4472136 ,  0.4472136 ],
#        [ 1.34164079,  1.34164079,  1.34164079]])


In this example, the arrays mean_prices and std_prices are broadcast across each row of stock_prices to perform element-wise subtraction and division.

# Example 3: Data Analysis - Scaling Features

Suppose you have a dataset with multiple features (columns) and want to scale each feature to a range [0, 1]. This is commonly done in machine learning preprocessing.

In [None]:
# Example dataset (rows are samples, columns are features)
data = np.array([[2.0, 8.0, 6.0],
                 [4.0, 2.0, 8.0],
                 [1.0, 3.0, 7.0]])

# Compute the minimum and maximum values along the columns
min_vals = np.min(data, axis=0)
max_vals = np.max(data, axis=0)

# Scale the dataset to [0, 1]
scaled_data = (data - min_vals) / (max_vals - min_vals)
# scaled_data is:
# array([[0.33333333, 1.        , 0.        ],
#        [1.        , 0.        , 1.        ],
#        [0.        , 0.16666667, 0.5       ]])


Here, min_vals and the per-feature range max_vals - min_vals are broadcast across each row of data to perform element-wise subtraction and division.