## NumPy

### Introduction

NumPy (Numerical Python) is the fundamental Python library for scientific computing. It provides large, multi-dimensional arrays and matrices, along with a collection of mathematical functions that operate on them.

1. Core Component: ndarray
ndarray (N-dimensional array): The primary object in NumPy is the ndarray, a powerful N-dimensional array. It is a grid of values, all of the same type, indexed by a tuple of non-negative integers.
Dimensions (axes): The number of dimensions is called the rank of the array (available as the `ndim` attribute), and the size of the array along each dimension is given by its shape, which is a tuple of integers.

2. Key Features
Efficient Storage: Arrays in NumPy are more compact and efficient than Python lists, particularly for large data sets. This is because an array holds elements of a single type, typically in one contiguous block of memory, allowing for fast access and manipulation.
Mathematical Operations: NumPy supports a wide range of mathematical operations such as element-wise operations, linear algebra (e.g., matrix multiplication), and statistical functions (e.g., mean, standard deviation).
Broadcasting: NumPy arrays support broadcasting, which allows you to perform arithmetic operations on arrays of different but compatible shapes without explicitly reshaping or copying them.
Universal Functions (ufuncs): These are functions that operate on ndarray in an element-wise fashion, such as numpy.add, numpy.subtract, and trigonometric functions like numpy.sin.
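Broadcasting is easier to see with a small example. The sketch below is illustrative only; the array names and values are made up for this purpose. It adds a row vector and then a column vector to a 3x4 matrix without any explicit reshaping or looping.

```python
import numpy as np

# Hypothetical example data: a (3, 4) matrix and a length-4 row vector.
matrix = np.arange(12).reshape(3, 4)
row = np.array([10, 20, 30, 40])

# The row is broadcast across each of the 3 rows of the matrix.
shifted = matrix + row
print(shifted.shape)   # (3, 4)

# A (3, 1) column is broadcast across the 4 columns instead.
column = np.array([[100], [200], [300]])
offset = matrix + column
print(offset)
```

Shapes are compared from the trailing dimension backwards; two dimensions are compatible when they are equal or when one of them is 1.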

3. Applications
Data Analysis: NumPy is widely used in data analysis and is often the basis for other data analysis libraries like Pandas.
Machine Learning: Libraries such as scikit-learn are built on NumPy, and deep learning frameworks such as TensorFlow and PyTorch use their own tensor types that closely mirror and interoperate with NumPy arrays.
Scientific Computing: Researchers use NumPy in scientific computing fields like physics, chemistry, and biology for simulations and data processing.

4. Interoperability
Integration with Other Libraries: NumPy is highly interoperable with other Python libraries, like SciPy (for more advanced mathematical operations), Matplotlib (for plotting), and Pandas (for data manipulation).


In [12]:
import numpy as np

## Create an array using Numpy
## Create a 1D array

arr1 = np.array([1,2,3,4,5,6])

print(arr1)
print(type(arr1))
print(arr1.shape) # arr.shape is a tuple with one entry per dimension; (6,) means a single axis of length 6

## Create a 2D array
# When creating a 2D array, every row must have the same number of columns.
arr2 = np.array([[1,2,3,4,5],[5,6,7,8,9]])

[1 2 3 4 5 6]
<class 'numpy.ndarray'>
(6,)


In [13]:
## arr.reshape(rows, columns) returns the array in a new shape; rows * columns must equal arr.size
arr3 = np.array([1,2,3,4,5,6])
arr3.reshape(2,3)

array([[1, 2, 3],
       [4, 5, 6]])
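Two details of `reshape` are worth a short sketch: one dimension can be given as `-1` so that NumPy infers it, and a target shape whose product does not match the number of elements raises a `ValueError`. The variable names here are illustrative.

```python
import numpy as np

arr = np.arange(1, 7)            # six elements: 1..6

# -1 asks NumPy to infer that dimension from the total size.
inferred = arr.reshape(3, -1)
print(inferred.shape)            # (3, 2)

# reshape returns a new array object; arr itself keeps its shape.
print(arr.shape)                 # (6,)

# 4 * 2 = 8 does not equal arr.size (6), so this raises ValueError.
try:
    arr.reshape(4, 2)
    reshape_failed = False
except ValueError as exc:
    reshape_failed = True
    print("reshape failed:", exc)
```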

In [16]:
np.arange(1,10,2) ## arange(start, stop, step) works like Python's range but returns an array; stop is exclusive
np.arange(2,50,2).reshape(4,6)

array([[ 2,  4,  6,  8, 10, 12],
       [14, 16, 18, 20, 22, 24],
       [26, 28, 30, 32, 34, 36],
       [38, 40, 42, 44, 46, 48]])

In [65]:
print(np.ones((5,6))) # 5 rows and 6 columns of ones (float by default)
print(np.ones(4, dtype=int)) # 1D array of four integer ones
print(np.zeros((5,6))) # 5 rows and 6 columns of zeros

[[1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1.]]
[1 1 1 1]
[[0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]]


In [20]:
## Identity matrix
## np.eye(n) returns an n x n matrix with ones on the main diagonal and zeros elsewhere
np.eye(6)

array([[1., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 1.]])
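The defining property of the identity matrix is that a matrix product with it leaves the other matrix unchanged. A minimal sketch with a made-up 2x2 matrix `a` also shows the difference between the matrix product (`@`) and element-wise multiplication (`*`).

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])   # hypothetical example matrix
identity = np.eye(2)

# Matrix product: a @ I gives back a.
product = a @ identity
print(np.array_equal(product, a))        # True

# Element-wise product: only the diagonal entries of a survive.
elementwise = a * identity
print(elementwise)
```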

In [21]:
## Attributes of Numpy Array
arr = np.array([[1, 2, 3], [4, 5, 6]])

print("Array:\n", arr)
print("Shape:", arr.shape)  # Output: (2, 3)
print("Number of dimensions:", arr.ndim)  # Output: 2
print("Size (number of elements):", arr.size)  # Output: 6
print("Data type:", arr.dtype)  # Output: int32 here; int64 on many platforms
print("Item size (in bytes):", arr.itemsize)  # Output: 4 for int32, 8 for int64

Array:
 [[1 2 3]
 [4 5 6]]
Shape: (2, 3)
Number of dimensions: 2
Size (number of elements): 6
Data type: int32
Item size (in bytes): 4
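Because the default integer type depends on the platform, it can help to request a `dtype` explicitly when the element size matters. The following sketch uses illustrative variable names and shows how `dtype`, `itemsize`, and `nbytes` relate.

```python
import numpy as np

values = [[1, 2, 3], [4, 5, 6]]

# Request the element type explicitly instead of relying on the platform default.
as_int64 = np.array(values, dtype=np.int64)
as_float32 = np.array(values, dtype=np.float32)

print(as_int64.dtype, as_int64.itemsize)      # int64 8
print(as_float32.dtype, as_float32.itemsize)  # float32 4

# nbytes is the total data size: size * itemsize.
print(as_int64.nbytes)                        # 48
```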


In [58]:
## NumPy vectorized operations

arr5 = np.array([10,20,30,40,50])
arr6 = np.array([1,2,3,4,5])

#### Unlike Python lists, arithmetic on NumPy arrays is applied element-wise

### Element-wise addition
print("Addition:", arr5+arr6)

## Element-wise subtraction
print("Subtraction:", arr5-arr6)

# Element-wise multiplication
print("Multiplication:", arr5 * arr6)

# Element-wise division
print("Division:", arr5 / arr6)


## The same operator behaves differently on a list and on a NumPy array
lst = [1,2,3,4,5]
arr6 = np.array([1,2,3,4,5])

print(lst*2) ## list repetition: [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
print(arr6*2) ## element-wise multiplication: [ 2  4  6  8 10]


Addition: [11 22 33 44 55]
Subtraction: [ 9 18 27 36 45]
Multiplication: [ 10  40  90 160 250]
Division: [10. 10. 10. 10. 10.]
[1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
[ 2  4  6  8 10]
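To make the contrast with lists concrete, here is a small sketch that computes the same element-wise product twice: once with an explicit pairing over Python lists and once with NumPy arrays. The `prices` and `quantities` data are invented for the example.

```python
import numpy as np

prices = [10.0, 20.0, 30.0]     # hypothetical data
quantities = [1, 2, 3]

# Pure Python: pair the elements up explicitly.
totals_list = [p * q for p, q in zip(prices, quantities)]

# NumPy: the * operator already works element by element.
totals_arr = np.array(prices) * np.array(quantities)

print(totals_list)   # [10.0, 40.0, 90.0]
print(totals_arr)
```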


In [26]:
## Universal functions (ufuncs)
## Each ufunc below is applied element-wise to the entire array
arr=np.array([2,3,4,5,6])

## square root
print(np.sqrt(arr))

## Exponential
print(np.exp(arr))

## Sine
print(np.sin(arr))

## natural log
print(np.log(arr))

[1.41421356 1.73205081 2.         2.23606798 2.44948974]
[  7.3890561   20.08553692  54.59815003 148.4131591  403.42879349]
[ 0.90929743  0.14112001 -0.7568025  -0.95892427 -0.2794155 ]
[0.69314718 1.09861229 1.38629436 1.60943791 1.79175947]
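Ufuncs are not limited to one-argument math functions. As a brief sketch, binary ufuncs such as `np.add` and `np.maximum` accept an array and a scalar (or two arrays), and a binary ufunc also exposes a `reduce` method that folds it along an axis.

```python
import numpy as np

arr = np.array([2, 3, 4, 5, 6])

added = np.add(arr, 10)          # same result as arr + 10
largest = np.maximum(arr, 4)     # element-wise maximum against a scalar
total = np.add.reduce(arr)       # folds addition over the array, like arr.sum()

print(added)     # [12 13 14 15 16]
print(largest)   # [4 4 4 5 6]
print(total)     # 20
```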


In [43]:
## Array Slicing and Indexing Operations

arr7 = np.array([[1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15]])
print("arr7 = \n", arr7)

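arr7[1][0]       # element at row 1, column 0 (value 6); arr7[1, 0] is equivalent
arr7[0:2, 1:3]   # rows 0-1, columns 1-2
arr7[1:, 1:4]    # rows 1 onward, columns 1-3; only this last expression is displayed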




arr7 = 
 [[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]]


array([[ 7,  8,  9],
       [12, 13, 14]])
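One point that slicing examples often leave implicit is that a basic slice is a view onto the original data, not a copy. The sketch below uses a made-up 3x3 array and shows that writing through a slice changes the original, while `.copy()` gives an independent array.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# A basic slice shares memory with the original array.
view = grid[0:2, 0:2]
view[0, 0] = -1
print(grid[0, 0])        # -1: the original changed too

# .copy() creates independent data.
independent = grid[0:2, 0:2].copy()
independent[0, 0] = 99
print(grid[0, 0])        # still -1

print(np.shares_memory(grid, view), np.shares_memory(grid, independent))
```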

In [45]:
## Modify array elements
arr7[0,0]=100  # assign to a single element
print(arr7)

arr7[2:] = 100  # assign a scalar to every element from row 2 onward
print(arr7)

[[100   2   3   4   5]
 [  6   7   8   9  10]
 [ 11  12  13  14  15]]
[[100   2   3   4   5]
 [  6   7   8   9  10]
 [100 100 100 100 100]]


In [46]:
### Statistical concepts: standardization (z-score normalization)
## Rescale the data to have a mean of 0 and a standard deviation of 1
data = np.array([1, 2, 3, 4, 5])

# Calculate the mean and standard deviation
mean = np.mean(data)
std_dev = np.std(data)

# Normalize the data
normalized_data = (data - mean) / std_dev
print("Normalized data:", normalized_data)

Normalized data: [-1.41421356 -0.70710678  0.          0.70710678  1.41421356]
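A quick way to check a standardization step is to confirm that the result has mean 0 and standard deviation 1, up to floating-point rounding. This sketch repeats the calculation on the same five values and uses `np.isclose` for the comparison.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])

# z-score: subtract the mean, divide by the (population) standard deviation.
z = (data - data.mean()) / data.std()

# Compare with a tolerance rather than ==, because of floating-point rounding.
print(np.isclose(z.mean(), 0.0))   # True
print(np.isclose(z.std(), 1.0))    # True
```

Note that the division assumes the standard deviation is non-zero; constant data would need separate handling.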


In [47]:
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Mean
mean = np.mean(data)
print("Mean:", mean)

# Median
median = np.median(data)
print("Median:", median)

# Standard deviation
std_dev = np.std(data)
print("Standard Deviation:", std_dev)

# Variance
variance = np.var(data)
print("Variance:", variance)

Mean: 5.5
Median: 5.5
Standard Deviation: 2.8722813232690143
Variance: 8.25
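By default `np.std` and `np.var` divide by n (the population formulas). Passing `ddof=1` divides by n - 1, which gives the sample estimate. The sketch below shows the difference on the same data and how the `axis` argument computes a statistic per column or per row; the 2x5 `table` is just the data reshaped for illustration.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

population_var = np.var(data)        # divides by n
sample_var = np.var(data, ddof=1)    # divides by n - 1
print(population_var, sample_var)

# axis=0 aggregates down the rows (one value per column);
# axis=1 aggregates across the columns (one value per row).
table = data.reshape(2, 5)
column_means = table.mean(axis=0)
row_means = table.mean(axis=1)
print(column_means)   # [3.5 4.5 5.5 6.5 7.5]
print(row_means)      # [3. 8.]
```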


In [55]:
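## Logical operations and boolean masking
data=np.array([1,2,3,4,5,6,7,8,9,10])

print(data>6)                # element-wise comparison returns a boolean array
print(data[data>5])          # a boolean mask selects the matching elements
data[(data>=5) & (data<=8)]  # combine conditions with & and parenthesize each one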

[False False False False False False  True  True  True  True]
[ 6  7  8  9 10]


array([5, 6, 7, 8])
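Boolean masks can also be used to choose between values or to assign in place. As a small sketch on the same data, `np.where` picks a value per element from a condition, and assigning through a mask changes only the selected elements.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Three-argument form: choose per element based on the condition.
labels = np.where(data % 2 == 0, "even", "odd")

# One-argument form: returns a tuple of index arrays where the condition holds.
indices = np.where(data > 7)[0]

# Assign through a mask on a copy so that data itself is left untouched.
capped = data.copy()
capped[capped > 8] = 8

print(labels)
print(indices)   # [7 8 9]
print(capped)    # [1 2 3 4 5 6 7 8 8 8]
```

Conditions are combined with `&` and `|` rather than `and` and `or`, because the Python keywords try to reduce a whole array to a single truth value.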

In [57]:
## np.linspace(start, stop, num) returns num evenly spaced points from start to stop, with both endpoints included by default
np.linspace(1,10,50)

array([ 1.        ,  1.18367347,  1.36734694,  1.55102041,  1.73469388,
        1.91836735,  2.10204082,  2.28571429,  2.46938776,  2.65306122,
        2.83673469,  3.02040816,  3.20408163,  3.3877551 ,  3.57142857,
        3.75510204,  3.93877551,  4.12244898,  4.30612245,  4.48979592,
        4.67346939,  4.85714286,  5.04081633,  5.2244898 ,  5.40816327,
        5.59183673,  5.7755102 ,  5.95918367,  6.14285714,  6.32653061,
        6.51020408,  6.69387755,  6.87755102,  7.06122449,  7.24489796,
        7.42857143,  7.6122449 ,  7.79591837,  7.97959184,  8.16326531,
        8.34693878,  8.53061224,  8.71428571,  8.89795918,  9.08163265,
        9.26530612,  9.44897959,  9.63265306,  9.81632653, 10.        ])
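The main difference from `np.arange` is that `linspace` fixes the number of points rather than the step. A short sketch: `retstep=True` also returns the spacing that was used, and `endpoint=False` leaves the stop value out.

```python
import numpy as np

# Five points from 0 to 1, endpoints included; also return the spacing.
points, step = np.linspace(0, 1, 5, retstep=True)
print(points)   # [0.   0.25 0.5  0.75 1.  ]
print(step)     # 0.25

# With endpoint=False the stop value is excluded, so the spacing becomes 0.2.
open_points = np.linspace(0, 1, 5, endpoint=False)
print(open_points)
```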

In [66]:
## Returns uniformly distributed random values in [0, 1) in the given shape (rows, columns)
np.random.rand(3,3)

array([[0.8712898 , 0.40107639, 0.41654583],
       [0.767591  , 0.3513714 , 0.7602165 ],
       [0.25877722, 0.21434004, 0.13256026]])

In [67]:
## Returns samples from the standard normal distribution (mean 0, standard deviation 1) in the given shape (rows, columns)
np.random.randn(4,4)

array([[-0.84758222, -1.24273397,  2.01037357,  1.55203471],
       [-1.03558289,  0.79806283,  1.19870829, -0.39880597],
       [ 1.73829601, -1.15704738, -0.49322039, -0.95482812],
       [-0.55306824,  1.25029203, -1.02989427, -1.04302969]])

In [72]:
## randint(low, high, size) returns size random integers from low (inclusive) to high (exclusive)
np.random.randint(1,100,5)

array([47, 71, 45, 99, 18])

In [74]:
## Returns random floats in the half-open interval [0.0, 1.0)
np.random.random_sample((1,5))

array([[0.08502066, 0.12869843, 0.19963522, 0.69825183, 0.74365366]])
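The random values above change on every run. When reproducible output is needed, one option is NumPy's `Generator` interface, created with `np.random.default_rng` and a seed. The sketch below is an alternative to the `np.random.*` functions used above, not a replacement taken from this notebook; the seed 42 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

uniform_a = rng_a.random((3, 3))            # uniform floats in [0, 1)
uniform_b = rng_b.random((3, 3))
print(np.array_equal(uniform_a, uniform_b)) # True

normal = rng_a.standard_normal((4, 4))      # standard normal samples
integers = rng_a.integers(1, 100, size=5)   # low inclusive, high exclusive
print(normal.shape, integers)
```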