<b> Introduction to NumPy:</b>

NumPy, short for Numerical Python, is a fundamental package for scientific computing in Python. It provides support for arrays (multi-dimensional data structures) and matrices, along with a collection of mathematical functions to operate on these arrays. NumPy arrays are usually more efficient than Python lists for numerical operations because every element has the same data type and the data is stored in a compact, fixed-size block of memory.

<b> NumPy Array Creation: </b>

Arrays can be created in NumPy in several ways. One of the simplest is to convert a Python list.

<b>->From Lists:</b> Convert a Python list to a NumPy array

In [3]:
import numpy as np
array_from_list = np.array([1, 2, 3, 4, 5])
print(array_from_list)

[1 2 3 4 5]


<b> Basic Operations:</b>

NumPy allows element-wise arithmetic operations on arrays, which typically run faster than equivalent for-loops written in Python. It also supports broadcasting, which enables operations on arrays of different but compatible shapes.

<b>->Arithmetic Operations:</b> Perform element-wise operations

In [25]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
sum_ab = a + b  
print(sum_ab)

[5 7 9]
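
To make the comparison with for-loops concrete, the sketch below computes the same sum twice: once with an explicit Python loop and once with the element-wise operator. The names loop_sum and vector_sum are chosen here for illustration. Timing is left out on purpose, since it depends on the machine and the array size.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Explicit Python loop: add one pair of elements at a time
loop_sum = [int(x) + int(y) for x, y in zip(a, b)]

# Element-wise form: one expression, the loop runs inside NumPy
vector_sum = a + b

print(loop_sum)    # [5, 7, 9]
print(vector_sum)  # [5 7 9]
```

Both forms give the same values; the second one returns a NumPy array instead of a list.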


<b>->Broadcasting:</b> NumPy automatically stretches the smaller operand (here a scalar) to match the shape of the larger one

In [26]:
a = np.array([1, 2, 3])
b = 2
product_ab = a * b  
print(product_ab)

[2 4 6]
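
Broadcasting is not limited to scalars. As a small sketch, the example below adds a row and a column to a 2-D array; the names matrix, row, and column are chosen here for illustration. Shapes that cannot be matched up raise a ValueError.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])      # shape (2, 3)
row = np.array([10, 20, 30])        # shape (3,)
column = np.array([[100], [200]])   # shape (2, 1)

# The row is repeated for every row of the matrix
print(matrix + row)
# [[11 22 33]
#  [14 25 36]]

# The column is repeated for every column of the matrix
print(matrix + column)
# [[101 102 103]
#  [204 205 206]]

# Shapes (2, 3) and (2,) are not compatible, so this fails
try:
    matrix + np.array([1, 2])
except ValueError as error:
    print("Broadcasting failed:", error)
```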


<b>Array Properties:</b>

Arrays have various properties such as size and data type, which provide important information about the array structure and content.

<b>->Size:</b> The total number of elements

In [28]:
array_size = array_from_list.size  
print(array_size )

5


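<b>->Data Type:</b> The type of elements in the array (the default integer type depends on the platform and NumPy version, so this may print int64 instead of int32)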

In [29]:
array_dtype = array_from_list.dtype 
print(array_dtype)

int32
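
Size and data type are two of several such properties. The sketch below also reads shape and ndim, and passes an explicit dtype so the result does not depend on the platform default. The array names are chosen here for illustration.

```python
import numpy as np

array_2d = np.array([[1, 2, 3], [4, 5, 6]])

print(array_2d.shape)  # (2, 3) -> 2 rows, 3 columns
print(array_2d.ndim)   # 2      -> number of dimensions
print(array_2d.size)   # 6      -> total number of elements

# Requesting a dtype explicitly gives the same result on every platform
float_array = np.array([1, 2, 3], dtype=np.float64)
print(float_array.dtype)     # float64
print(float_array.itemsize)  # 8 (bytes per element)
```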


<b> Data Manipulation:</b>

Data manipulation in NumPy involves creating arrays, accessing elements through indexing, extracting subarrays via slicing, reshaping arrays, and applying mathematical operations. These operations enable efficient data processing and transformation.

<b>Array Creation:</b>
Creating arrays is the first step in data manipulation. Arrays can be created from lists, tuples, or other array-like structures. Arrays can also be initialized with specific values using functions like np.zeros(), np.ones(), and np.full().

In [42]:
import numpy as np
array_1d = np.array([1, 2, 3, 4, 5])
array_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(array_1d)
print(array_2d)

[1 2 3 4 5]
[[1 2 3]
 [4 5 6]]
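
The initialization functions mentioned above can be sketched as follows. The shapes and the fill value 7 are arbitrary choices for illustration.

```python
import numpy as np

zeros = np.zeros((2, 3))          # 2x3 array of 0.0 (float64 by default)
ones = np.ones(4, dtype=int)      # four integer ones
filled = np.full((2, 2), 7)       # 2x2 array where every element is 7
from_tuple = np.array((1.5, 2.5, 3.5))  # tuples work like lists

print(zeros)
# [[0. 0. 0.]
#  [0. 0. 0.]]
print(ones)        # [1 1 1 1]
print(filled)
# [[7 7]
#  [7 7]]
print(from_tuple)  # [1.5 2.5 3.5]
```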


<b>NumPy Array Indexing: </b>

Array indexing means accessing an array element. You can access an element by referring to its index number. Indexes in NumPy arrays start at 0, meaning that the first element has index 0, the second has index 1, and so on.

In [31]:
#Get the first element from the following array:
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr[0])

1


In [32]:
#Get the second element from the following array:
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr[1])

2
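
The same idea extends to negative indexes and to arrays with more than one dimension. A short sketch, with a 2-D array named grid chosen here for illustration:

```python
import numpy as np

arr = np.array([1, 2, 3, 4])
print(arr[-1])      # 4 -> negative indexes count from the end

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
print(grid[0, 1])   # 2 -> row 0, column 1
print(grid[1, -1])  # 6 -> row 1, last column
print(grid[1])      # [4 5 6] -> a single index selects a whole row
```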


<b>NumPy Array Slicing:</b>

Slicing in Python means taking elements from one given index up to, but not including, another given index. We pass a slice instead of an index, like this: [start:end]. We can also define the step, like this: [start:end:step]. If we don't pass start, it is considered 0. If we don't pass end, it is considered the length of the array in that dimension. If we don't pass step, it is considered 1.

In [33]:
#Slice elements from index 1 to index 5 (not included) from the following array:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5])

[2 3 4 5]


In [34]:
#Slice elements from the beginning to index 4 (not included):
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[:4])

[1 2 3 4]


In [35]:
#Slice elements from index 4 to the end of the array:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[4:])

[5 6 7]


<b>Negative Slicing:</b> Use the minus operator to refer to an index from the end

In [36]:
#Slice from index 3 from the end to index 1 from the end (not included):
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[-3:-1])

[5 6]


<b>Step:</b> Use the step value to take every n-th element of a slice

In [37]:
#Return every other element from index 1 to index 5 (not included):
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5:2])

[2 4]


In [38]:
#Return every other element from the entire array:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[::2])

[1 3 5 7]
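
Slicing also works per dimension, and a basic slice is a view onto the original data, not a copy. The sketch below illustrates both points; the names grid and part are chosen here for illustration.

```python
import numpy as np

grid = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]])

# One slice per dimension: rows 0-1, columns 1-2
print(grid[0:2, 1:3])
# [[2 3]
#  [6 7]]

# All rows, first column
print(grid[:, 0])  # [1 5 9]

# A slice shares memory with the original array
arr = np.array([1, 2, 3, 4, 5, 6, 7])
part = arr[1:4]
part[0] = 99
print(arr)  # [ 1 99  3  4  5  6  7]

# Use .copy() when an independent array is needed
independent = arr[1:4].copy()
independent[0] = 0
print(arr[1])  # 99 -> unchanged by the edit to the copy
```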


<b>Reshaping: </b>

Reshaping changes the shape of an array without altering its data, provided the new shape holds the same total number of elements. This is useful for converting one-dimensional arrays to multi-dimensional arrays and vice versa.

In [43]:
reshaped_array = array_1d.reshape(1, 5)
print(reshaped_array)

[[1 2 3 4 5]]
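
A slightly larger sketch of reshaping in both directions is shown below. The six-element array named values is chosen here for illustration. Passing -1 asks NumPy to work out that dimension from the number of elements.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6])

matrix = values.reshape(2, 3)   # 1-D -> 2 rows, 3 columns
print(matrix)
# [[1 2 3]
#  [4 5 6]]

auto = values.reshape(3, -1)    # -1 is inferred as 2
print(auto.shape)               # (3, 2)

flat = matrix.reshape(-1)       # 2-D -> back to 1-D
print(flat)                     # [1 2 3 4 5 6]

# Six elements cannot fill a 4x2 shape, so this raises ValueError
try:
    values.reshape(4, 2)
except ValueError as error:
    print("Reshape failed:", error)
```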


<b>Mathematical Operations:</b>

Mathematical operations like addition, subtraction, multiplication, and division can be applied element-wise on arrays. NumPy also provides functions for more complex operations like trigonometry, logarithms, and exponentiation.

<b>Element-wise Operations</b>

Element-wise operations in NumPy pair up the elements at the same position in two arrays of the same shape (or of shapes that are compatible under broadcasting). Each operation is available both as an operator and as a function.

<b>Addition (+ or np.add()):</b> Adds corresponding elements of two arrays.

<b>Subtraction (- or np.subtract()):</b> Subtracts corresponding elements of two arrays.

<b>Multiplication (* or np.multiply()):</b> Multiplies corresponding elements of two arrays.

<b>Division (/ or np.divide()):</b> Divides corresponding elements of two arrays.

<b>Complex Operations</b>

<b>Trigonometric Functions (np.sin(), np.cos(), np.tan()):</b> Compute the sine, cosine, and tangent of each element. The input is interpreted as an angle in radians.

<b>Logarithmic Functions (np.log(), np.log10()):</b> Compute the natural logarithm or the base-10 logarithm of each element.

<b>Exponentiation (np.exp()):</b> Compute the exponential (e^x) of each element.

In [54]:
import numpy as np

# Creating sample arrays
array1 = np.array([1, 2, 3, 4, 5])
array2 = np.array([10, 20, 30, 40, 50])

# Element-wise addition
addition_result = np.add(array1, array2)
print("Addition:", addition_result)  

# Element-wise subtraction
subtraction_result = np.subtract(array1, array2)
print("Subtraction:", subtraction_result)  

# Element-wise multiplication
multiplication_result = np.multiply(array1, array2)
print("Multiplication:", multiplication_result)  

# Element-wise division
division_result = np.divide(array1, array2)
print("Division:", division_result)  

# Trigonometric functions
sin_result = np.sin(array1)
print("Sine:", sin_result)  

cos_result = np.cos(array1)
print("Cosine:", cos_result) 

tan_result = np.tan(array1)
print("Tangent:", tan_result)  

# Logarithmic functions
log_result = np.log(array1)
print("Natural Logarithm:", log_result)  

log10_result = np.log10(array1)
print("Base-10 Logarithm:", log10_result) 

# Exponentiation
exp_result = np.exp(array1)
print("Exponential:", exp_result)  


Addition: [11 22 33 44 55]
Subtraction: [ -9 -18 -27 -36 -45]
Multiplication: [ 10  40  90 160 250]
Division: [0.1 0.1 0.1 0.1 0.1]
Sine: [ 0.84147098  0.90929743  0.14112001 -0.7568025  -0.95892427]
Cosine: [ 0.54030231 -0.41614684 -0.9899925  -0.65364362  0.28366219]
Tangent: [ 1.55740772 -2.18503986 -0.14254654  1.15782128 -3.38051501]
Natural Logarithm: [0.         0.69314718 1.09861229 1.38629436 1.60943791]
Base-10 Logarithm: [0.         0.30103    0.47712125 0.60205999 0.69897   ]
Exponential: [  2.71828183   7.3890561   20.08553692  54.59815003 148.4131591 ]
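
Two details of the cell above are worth a short sketch: the operator and function forms give identical results, and the trigonometric functions expect radians. The degree values below are chosen here for illustration, and np.deg2rad() is used for the conversion.

```python
import numpy as np

array1 = np.array([1, 2, 3, 4, 5])
array2 = np.array([10, 20, 30, 40, 50])

# Operator and function forms are interchangeable
print(np.array_equal(array1 + array2, np.add(array1, array2)))       # True
print(np.array_equal(array1 * array2, np.multiply(array1, array2)))  # True

# Convert degrees to radians before calling np.sin()
angles_deg = np.array([0, 30, 90])
sines = np.sin(np.deg2rad(angles_deg))
print(np.round(sines, 3))  # [0.  0.5 1. ]

# np.log() and np.exp() undo each other (up to rounding error)
print(np.allclose(np.log(np.exp(array1)), array1))  # True
```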


<b>Data Aggregation:</b>
Data aggregation involves computing summary statistics, such as mean, median, standard deviation, and sum. These functions provide insights into the dataset and help summarize data concisely.

<b>->Summary Statistics:</b>
Summary statistics provide a quick overview of the data, highlighting central tendencies, dispersion, and overall range. Functions like np.mean(), np.median(), and np.std() are used for this purpose.

In [55]:
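mean = np.mean(array_1d)      # mean of [1 2 3 4 5]
median = np.median(array_2d)  # median of all six elements of array_2d
std_dev = np.std(array_2d)    # population standard deviation (ddof=0) of array_2d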
print(mean)
print(median)
print(std_dev)

3.0
3.5
1.707825127659933
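
The same functions can also aggregate along one axis of a 2-D array instead of over all elements. A minimal sketch, reusing the values of array_2d from earlier:

```python
import numpy as np

array_2d = np.array([[1, 2, 3], [4, 5, 6]])

print(np.sum(array_2d))           # 21 -> sum of all elements
print(np.sum(array_2d, axis=0))   # [5 7 9] -> one sum per column
print(np.mean(array_2d, axis=1))  # [2. 5.] -> one mean per row
print(np.min(array_2d), np.max(array_2d))  # 1 6

# ddof=1 gives the sample standard deviation (divides by n - 1),
# which is roughly 1.8708 here, compared with 1.7078 for the default ddof=0
sample_std = np.std(array_2d, ddof=1)
print(sample_std)
```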


<b>Grouping Data and Aggregations:</b>

Grouping data and performing aggregations on groups can reveal patterns and relationships within the data. The np.unique() function helps identify unique elements and their counts in the array.

In [53]:
grouped_data = np.array([1, 2, 2, 3, 3, 3, 4])
unique_elements, counts = np.unique(grouped_data, return_counts=True)
print(grouped_data)
print(unique_elements, counts )

[1 2 2 3 3 3 4]
[1 2 3 4] [1 2 3 1]
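
np.unique() only counts the groups. To aggregate a second array per group, one simple approach is a boolean mask for each unique label, as sketched below. The labels and values arrays are hypothetical data made up for this example.

```python
import numpy as np

# Hypothetical data: one group label and one measurement per observation
labels = np.array([1, 2, 2, 3, 3, 3, 4])
values = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0])

group_means = {}
for label in np.unique(labels):
    # labels == label is True only for the rows that belong to this group
    group_means[int(label)] = float(values[labels == label].mean())

print(group_means)  # {1: 10.0, 2: 25.0, 3: 50.0, 4: 70.0}
```

This loops once per group, which is fine for a handful of groups; for many groups a dedicated tool such as pandas groupby is usually a better fit.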


<b>Data Analysis:</b>

Data analysis in NumPy includes finding correlations, identifying outliers, and calculating percentiles. These operations help uncover relationships, detect anomalies, and understand data distribution.

<b>->Correlation:</b>
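Correlation measures the linear relationship between two datasets. The np.corrcoef() function returns the correlation coefficient matrix. Each coefficient lies between -1 and 1: the sign gives the direction of the relationship and the magnitude gives its strength. The diagonal entries are always 1 (each dataset correlated with itself), and the off-diagonal entries hold the correlation between the two datasets.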

In [47]:
array_3 = np.array([1, 2, 3, 4, 5])
array_4 = np.array([5, 4, 3, 2, 1])
correlation = np.corrcoef(array_3, array_4)
print(correlation )

[[ 1. -1.]
 [-1.  1.]]
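
Usually only the off-diagonal entry of the matrix is of interest. The sketch below reads it out and checks it against the Pearson formula computed by hand. The arrays x and y are hypothetical data made up for this example.

```python
import numpy as np

# Hypothetical data with a positive but imperfect linear relationship
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 5, 4, 5])

# Entry [0, 1] is the correlation between x and y
r = np.corrcoef(x, y)[0, 1]

# Pearson formula: covariance divided by the product of the spreads
x_centered = x - x.mean()
y_centered = y - y.mean()
r_manual = np.sum(x_centered * y_centered) / np.sqrt(
    np.sum(x_centered ** 2) * np.sum(y_centered ** 2)
)

print(round(float(r), 4))        # 0.7746
print(np.isclose(r, r_manual))   # True
```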


<b>Identifying Outliers:</b>
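Outliers are data points that differ significantly from other observations. Identifying outliers is crucial for data cleaning and ensuring accurate analysis. A common approach is the standard deviation method: a value is flagged as an outlier when it lies more than a chosen number of standard deviations (2 in the example below) from the mean.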

In [48]:
data = np.array([10, 12, 12, 13, 12, 15, 16, 100])
mean = np.mean(data)
std_dev = np.std(data)
outliers = data[np.abs(data - mean) > 2 * std_dev]
print(mean )
print(std_dev)
print(outliers)

23.75
28.873647154455565
[100]
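
The threshold is a choice, not a fixed rule. The sketch below wraps the same method in a helper so the threshold can be varied; find_outliers is a hypothetical name introduced here, not a NumPy function.

```python
import numpy as np

def find_outliers(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    values = np.asarray(values, dtype=float)
    std_dev = values.std()
    if std_dev == 0:
        # All values are identical, so nothing can be an outlier
        return values[:0]
    distance = np.abs(values - values.mean()) / std_dev
    return values[distance > threshold]

data = [10, 12, 12, 13, 12, 15, 16, 100]
print(find_outliers(data))                 # [100.]
print(find_outliers(data, threshold=3.0))  # []
```

With a threshold of 3 nothing is flagged, because the value 100 inflates the mean and the standard deviation it is measured against. This is a known weakness of the method on small datasets.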


<b>Percentiles:</b>
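Percentiles indicate the relative standing of a value within a dataset: the p-th percentile is the value below which roughly p percent of the observations fall. The np.percentile() function computes percentiles, interpolating between neighbouring data points by default, and so provides insight into the data distribution.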

In [50]:
percentile_25 = np.percentile(data, 25)  
percentile_50 = np.percentile(data, 50)  
percentile_75 = np.percentile(data, 75)  
print(percentile_25)
print(percentile_50)
print(percentile_75)

12.0
12.5
15.25
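
np.percentile() also accepts a list of percentiles, so the three calls above can be combined into one. A minimal sketch using the same data, which also shows that the 50th percentile is the median:

```python
import numpy as np

data = np.array([10, 12, 12, 13, 12, 15, 16, 100])

# One call returns all three quartiles
q1, q2, q3 = np.percentile(data, [25, 50, 75])
print(q1, q2, q3)              # 12.0 12.5 15.25

# The 50th percentile is the median
print(q2 == np.median(data))   # True

# The distance between the quartiles (interquartile range) describes
# the spread of the middle half of the data
iqr = q3 - q1
print(iqr)                     # 3.25
```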


<b>Application in Data Science:</b>

NumPy's capabilities make it an essential tool for data science professionals. Its efficient handling of numerical computations, large datasets, and mathematical functions provides a strong foundation for data analysis, machine learning, financial analysis, and scientific research.

<b>Advantages of NumPy:</b>

<b>Performance:</b> NumPy arrays are faster and more memory-efficient than Python lists for numerical operations, because the work is done in compiled code over a compact block of same-typed data.

<b>Functionality:</b> Provides a wide range of mathematical functions and operations, enabling complex calculations.

<b>Scalability:</b> Handles large in-memory datasets efficiently, which is crucial for data-intensive fields like machine learning and big data analysis.

<b>Real-World Applications:</b>

<b>Machine Learning:</b> NumPy is used for data preprocessing, creating datasets, and performing mathematical operations on data.

<b>Financial Analysis:</b> Helps in performing complex financial calculations, such as portfolio optimization and risk assessment.

<b>Scientific Research:</b> Widely used in scientific computing for simulations and data analysis, and as the array foundation that visualization libraries build on.