# Introduction to NumPy

In [30]:
import numpy as np

## Overview

NumPy, short for Numerical Python, is a foundational package for numerical computations in Python. It provides support for arrays (including multidimensional arrays), as well as an assortment of mathematical functions to operate on these arrays. With NumPy, mathematical and logical operations on arrays can be executed efficiently.

> **Efficiency of NumPy Arrays vs. Python Lists**
>
> NumPy arrays offer significant advantages over Python lists in terms of efficiency, making them the preferred choice for numerical and scientific computing. Here's a comparison of key factors that highlight the efficiency of NumPy arrays:
>
> - **Homogeneous Data Types:** NumPy arrays are designed to store elements of the same data type, which allows for efficient memory usage and faster mathematical operations. In contrast, Python lists can contain elements of different data types, resulting in increased memory overhead and slower computations when performing operations.
>
> - **Vectorization:** NumPy leverages vectorized operations, which means that mathematical operations are applied element-wise to entire arrays, rather than using explicit loops. This approach results in faster execution times for operations like addition, multiplication, and more. With Python lists, you would need explicit loops or comprehensions, which run in the interpreter and are slower.
>
> - **Optimized Algorithms:** NumPy implements optimized algorithms in low-level languages like C and Fortran. These algorithms are highly efficient and are used for various operations, such as sorting, searching, and mathematical computations. Python lists rely on general-purpose algorithms, which are not as optimized for numerical tasks.
>
> - **Contiguous Memory:** NumPy arrays are stored in contiguous blocks of memory, allowing for efficient data access and minimizing cache misses. This contiguous memory allocation is represented as follows:
>
> ![NumPy Contiguous Memory](https://jakevdp.github.io/PythonDataScienceHandbook/figures/array_vs_list.png)
>
> In the diagram, you can see that NumPy arrays store elements in a contiguous block of memory, making it easy to access and process data efficiently. A Python list, on the other hand, stores references to separately allocated objects that may be scattered across memory, which can lead to slower data access.
>
> - **Parallel Processing:** Some NumPy operations, notably linear algebra routines backed by a multithreaded BLAS/LAPACK library, can use multiple cores. Most element-wise operations run on a single thread, but they still execute in compiled code, which is much faster than looping over a Python list.
>
> - **Third-Party Libraries:** NumPy seamlessly integrates with other data science and numerical computing libraries like SciPy, scikit-learn, and pandas. These libraries are also optimized for NumPy arrays, resulting in efficient workflows.
>
> In summary, NumPy arrays are designed with efficiency in mind, offering homogeneous data storage, vectorized operations, optimized algorithms, and better memory management with contiguous memory blocks. For numerical and scientific computing tasks, using NumPy arrays over Python lists can lead to significant performance improvements and faster development.
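
The points about homogeneous types, vectorization, and contiguous memory can be checked directly. The following is a minimal sketch; the names `values` and `arr` are illustrative and not part of the notebook above, and the integer dtype that NumPy picks depends on the platform.

```python
import numpy as np

values = list(range(1, 6))   # plain Python list: [1, 2, 3, 4, 5]
arr = np.array(values)       # NumPy array built from the list

# Explicit loop over the list vs. one vectorized expression on the array
squares_list = [v * v for v in values]
squares_arr = arr * arr

print(squares_list)          # [1, 4, 9, 16, 25]
print(squares_arr)           # [ 1  4  9 16 25]

# Every element shares one dtype and lives in a single contiguous buffer
print(arr.dtype)                              # an integer dtype (platform-dependent width)
print(arr.flags["C_CONTIGUOUS"])              # True
print(arr.nbytes == arr.size * arr.itemsize)  # True
```

Both approaches produce the same values; the vectorized form simply moves the loop into compiled code.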


# Creating NumPy Arrays

## From Lists

> **Python List vs. NumPy Array**
> 
> Python lists and NumPy arrays are both used to store collections of data, but they have significant differences in terms of functionality and performance.
> 
> - **Data Types:** Python lists can hold elements of different data types, making them versatile but potentially less efficient for numerical operations. NumPy arrays, on the other hand, are homogeneous, meaning all elements must have the same data type, which allows for more efficient numerical computations.
> 
> - **Performance:** NumPy arrays are optimized for numerical operations and are significantly faster than Python lists when performing element-wise operations. This is because NumPy arrays are implemented in C and are stored in contiguous memory, whereas Python lists are more flexible but may involve additional overhead.
> 
> - **Functionality:** NumPy arrays provide a wide range of mathematical and array-oriented operations, such as element-wise addition, multiplication, and statistical functions. Python lists offer fewer built-in operations and require more explicit iteration for similar tasks.
> 
> - **Broadcasting:** NumPy arrays support broadcasting, which allows operations between arrays of different shapes to be performed efficiently. Python lists do not have this feature, requiring more explicit element-wise operations.
> 
> In summary, if your task involves numerical computations, data analysis, or machine learning, NumPy arrays are the preferred choice due to their performance and built-in functionality. Python lists are more general-purpose and flexible but may be less efficient for numerical tasks.
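
One practical consequence of these differences is that the same operator can behave differently on a list and on an array. The sketch below uses illustrative names (`py_list`, `np_arr`, `mixed`) that do not appear elsewhere in this notebook.

```python
import numpy as np

py_list = [1, 2, 3]
np_arr = np.array(py_list)

# The * operator repeats a list but multiplies an array element-wise
list_times_two = py_list * 2   # [1, 2, 3, 1, 2, 3]
arr_times_two = np_arr * 2     # [2 4 6]

# Mixed numeric inputs are coerced to one common dtype
mixed = np.array([1, 2.5, 3])  # the integers are upcast to float
print(mixed.dtype)             # float64
```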


In [63]:
# Creating a 1D array from a list
array1 = np.array([1, 2, 3])
print("1D Array from a List:")
print(array1)

# Creating a 2D array from a list of lists
array2 = np.array([[1, 2], [3, 4], [5, 6]])
print("\n2D Array from a List of Lists:")
print(array2)

1D Array from a List:
[1 2 3]

2D Array from a List of Lists:
[[1 2]
 [3 4]
 [5 6]]


## Using Built-in Functions

> **NumPy Built-In Functions**
>
> NumPy provides a wide range of built-in functions that simplify common array operations. Here are explanations and common use cases for some of these functions:
> 
> - **Array of Zeros (`np.zeros`):** The `np.zeros` function creates a NumPy array filled with zeros. This function is commonly used when you want to initialize an array with zeros before performing numerical computations. For example, it can create a placeholder for data or a buffer that stores the results of an operation.
> 
> - **Array of Ones (`np.ones`):** Similar to `np.zeros`, the `np.ones` function generates a NumPy array filled with ones. It is often used in scenarios where you need to initialize an array with ones before performing operations. For instance, it can define initial weights in a machine learning model or create a mask for certain data manipulation tasks.
> 
> - **Identity Matrix (`np.eye`):** The `np.eye` function produces an identity matrix of the specified size. Identity matrices are commonly used in linear algebra, especially in matrix operations like matrix multiplication. They serve as the multiplicative identity element in matrix algebra and are fundamental for transformations and solving linear equations.
>
> [Identity Matrix Video](https://www.youtube.com/watch?v=3cnIa0fYJkY)
> 
> These NumPy functions are valuable for various data science and scientific computing tasks. They help you quickly create arrays with specific values or structures, making your code more efficient and readable.
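
As a small illustration of these functions, the sketch below passes an explicit `dtype` (the default is `float64`) and checks that multiplying by an identity matrix leaves a matrix unchanged. The matrix `m` is a made-up example, not data from this notebook.

```python
import numpy as np

# dtype is optional; without it, np.zeros and np.ones return float64 arrays
int_zeros = np.zeros((2, 3), dtype=int)
ones_mask = np.ones(4, dtype=bool)     # all-True boolean mask

# The identity matrix is the multiplicative identity for matrix products
m = np.array([[2.0, 1.0],
              [0.0, 3.0]])             # hypothetical example matrix
identity2 = np.eye(2)
product = identity2 @ m

print(int_zeros.shape)                 # (2, 3)
print(np.array_equal(product, m))      # True
```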


In [32]:
# Creating an array of zeros
zeros = np.zeros(3)
print("Array of Zeros:")
print(zeros)

# Creating an array of ones with shape (2, 3)
ones = np.ones((2, 3))
print("\nArray of Ones:")
print(ones)

# Creating a 3x3 identity matrix
identity = np.eye(3)
print("\nIdentity Matrix:")
print(identity)

Array of Zeros:
[0. 0. 0.]

Array of Ones:
[[1. 1. 1.]
 [1. 1. 1.]]

Identity Matrix:
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


# NumPy Array Operations

## Arithmetic Operations

> **NumPy Arithmetic Operations with Arrays**
>
> NumPy allows for various arithmetic operations with arrays, making it a powerful tool for numerical computing. Here's an explanation of how each of these arithmetic operations works with arrays or matrices:
> 
> - **Addition (`+`):** When you add two NumPy arrays together using the `+` operator, element-wise addition is performed. This means that each corresponding element in the two arrays is added together, resulting in a new array of the same shape.
> 
> - **Subtraction (`-`):** Subtraction between two NumPy arrays using the `-` operator also happens element-wise. Each element in the second array is subtracted from the corresponding element in the first array, producing a new array with the same shape.
> 
> - **Multiplication (`*`):** Multiplication of NumPy arrays using the `*` operator is, once again, performed element-wise. Each element in one array is multiplied by the corresponding element in the other array, resulting in a new array with the same shape.
> 
> - **Division (`/`):** Division between NumPy arrays using the `/` operator is element-wise as well. Each element in the first array is divided by the corresponding element in the second array, yielding a new array with the same shape.
> 
> - **Exponentiation (`**` or `np.power`):** You can raise a NumPy array to a power using the `**` operator or the `np.power` function. This performs element-wise exponentiation, where each element in the array is raised to the specified power, producing a new array with the same shape.
> 
> - **Modulus (`%` or `np.mod`):** The modulus operation calculates the remainder when one array is divided by another. You can use the `%` operator or the `np.mod` function for element-wise modulus calculations, resulting in a new array of the same shape.
>
> These element-wise arithmetic operations make it easy to perform various mathematical computations on arrays of data, which is a fundamental capability in data science and scientific computing.
>
> **Array Shape Compatibility in NumPy**
>
> In NumPy, it's crucial to ensure that arrays involved in basic arithmetic operations (such as addition, subtraction, multiplication, division, exponentiation, and modulus) have compatible shapes. Compatibility means that the arrays either have the same shape or can be broadcast to a common shape.
>
> **Element-Wise Operations**
>
> Element-wise operations are mathematical operations performed on arrays by applying the operation to corresponding elements based on their positions in the arrays. These operations are performed independently for each element pair, resulting in an output array of the same shape as the input arrays. Element-wise operations are a fundamental concept in NumPy and play a key role in numerical computing and data manipulation.
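
Shape compatibility is easiest to see with a small example. The sketch below uses illustrative arrays `row` and `col` (not defined elsewhere in this notebook) to show two shapes that broadcast together and one pair that does not.

```python
import numpy as np

row = np.array([1, 2, 3])            # shape (3,)
col = np.array([[10], [20], [30]])   # shape (3, 1)

# Shapes (3, 1) and (3,) broadcast to a common shape of (3, 3)
grid = col + row
print(grid.shape)                    # (3, 3)

# A scalar broadcasts against any shape
scaled = row * 10                    # [10 20 30]

# Shapes (3,) and (2,) are incompatible and raise a ValueError
try:
    row + np.array([1, 2])
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
print(broadcast_failed)              # True
```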



In [50]:
# Creating two NumPy arrays
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

In [34]:
# Addition
addition = a + b
print("Addition:")
print(addition)

Addition:
[5 7 9]


In [35]:
# Subtraction
subtraction = a - b
print("\nSubtraction:")
print(subtraction)


Subtraction:
[-3 -3 -3]


In [36]:
# Multiplication
multiplication = a * b
print("\nMultiplication:")
print(multiplication)


Multiplication:
[ 4 10 18]


In [37]:
# Division
division = a / b
print("\nDivision:")
print(division)


Division:
[0.25 0.4  0.5 ]


In [38]:
# Exponentiation
exponentiation = a ** b
print("\nExponentiation:")
print(exponentiation)


Exponentiation:
[  1  32 729]


In [39]:
# Modulus
modulus = a % b
print("\nModulus:")
print(modulus)


Modulus:
[1 2 3]


## Aggregation


> **NumPy Aggregation Operations**
>
> NumPy provides various aggregation operations to summarize data in arrays. These operations help extract valuable insights from numerical data. Here are some common aggregation operations and their use cases:
> 
> - **Summation (`np.sum`):** The `np.sum` function calculates the sum of all elements in an array. It is commonly used to find the total of numerical values in an array, such as calculating the total sales for a period.
> 
> - **Mean (`np.mean`):** The `np.mean` function computes the average or mean of the elements in an array. It is used to determine the central tendency of the data. For instance, you might use it to find the average score in a set of test results.
> 
> - **Maximum Value (`np.max`):** The `np.max` function identifies the maximum value in an array. It is helpful when you want to find the highest recorded value in a dataset, such as the maximum temperature in a weather dataset.
> 
> - **Minimum Value (`np.min`):** The `np.min` function finds the minimum value in an array. It's useful for discovering the lowest data point in a dataset, like finding the minimum stock price in a financial dataset.
> 
> - **Median (`np.median`):** The `np.median` function calculates the median of an array, which represents the middle value when the elements are sorted. The median is often preferred over the mean for data containing outliers, because extreme values affect it less.
> 
> - **Standard Deviation (`np.std`):** The `np.std` function computes the standard deviation of the elements in an array. It measures the amount of variation or dispersion in the data. By default it returns the population standard deviation (`ddof=0`); pass `ddof=1` for the sample estimate.
> 
> - **Variance (`np.var`):** The `np.var` function calculates the variance of an array, which quantifies how much data points deviate from the mean. Like `np.std`, it uses `ddof=0` by default.
> 
> These aggregation operations help data scientists and analysts understand, summarize, and draw insights from data efficiently.
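
Aggregations can also run along one axis of a multidimensional array, and `np.std` accepts a `ddof` argument. The following sketch assumes a made-up 2x3 array named `scores`; `b` repeats the values used in this notebook.

```python
import numpy as np

scores = np.array([[1, 2, 3],
                   [4, 5, 6]])        # hypothetical 2x3 data

total = np.sum(scores)                # 21, over all elements
col_sums = np.sum(scores, axis=0)     # [5 7 9], one sum per column
row_means = np.mean(scores, axis=1)   # [2. 5.], one mean per row

b = np.array([4, 5, 6])
pop_std = np.std(b)                   # population standard deviation (ddof=0)
sample_std = np.std(b, ddof=1)        # sample standard deviation, 1.0 here
```

With `axis=0` the rows are collapsed and one value per column remains; with `axis=1` the columns are collapsed.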


In [40]:
# Sum of elements
sum_val = np.sum(a)
print(f"Sum of elements in 'a': {sum_val}")

Sum of elements in 'a': 6


In [41]:
# Mean of elements
mean_val = np.mean(b)
print(f"Mean of elements in 'b': {mean_val}")

Mean of elements in 'b': 5.0


In [42]:
# Maximum value
max_val = np.max(a)
print(f"Maximum value in 'a': {max_val}")

Maximum value in 'a': 3


In [43]:
# Minimum value
min_val = np.min(b)
print(f"Minimum value in 'b': {min_val}")

Minimum value in 'b': 4


In [44]:
# Median
median_val = np.median(a)
print(f"Median of elements in 'a': {median_val}")

Median of elements in 'a': 2.0


In [45]:
# Standard deviation
std_dev = np.std(b)
print(f"Standard deviation of elements in 'b': {std_dev}")

Standard deviation of elements in 'b': 0.816496580927726


In [46]:
# Variance
variance = np.var(a)
print(f"Variance of elements in 'a': {variance}")

Variance of elements in 'a': 0.6666666666666666


# Array Indexing and Slicing


## Indexing


**Indexing with NumPy Arrays**

NumPy provides versatile indexing capabilities for arrays. Here are some common indexing techniques:

- **Second element of 'a' (`a[1]`):** You can access individual elements of a NumPy array using square brackets and the index. In this example, we retrieve the second element of the array 'a'.

- **First row of a 2D array (`array2[0]`):** For 2D arrays, you can use indexing to access rows or columns. Here, we extract the first row of 'array2'.

- **Conditional indexing:** You can use boolean indexing to filter elements based on a condition. For instance, `a[a > 2]` selects elements greater than 2 in 'a'.

These indexing techniques are powerful tools for data manipulation and analysis using NumPy.
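
A few further indexing forms are worth knowing for 2D arrays. The sketch below redefines `array2` with the same values as earlier so that it runs on its own; the other names are illustrative.

```python
import numpy as np

array2 = np.array([[1, 2], [3, 4], [5, 6]])

element = array2[1, 0]      # row 1, column 0 -> 3
last_row = array2[-1]       # negative index counts from the end -> [5 6]
second_col = array2[:, 1]   # every row, column 1 -> [2 4 6]

# A boolean mask selects matching elements and returns a 1D array
mask = array2 % 2 == 0
evens = array2[mask]        # [2 4 6]

# A list of integers selects several rows at once
picked = array2[[0, 2]]     # rows 0 and 2
```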


In [55]:
# Indexing with NumPy Arrays
# NumPy provides versatile indexing capabilities for arrays.

# Second element of 'a'
second_element = a[1]
print(f"Second element of 'a': {second_element}")

# First row of a 2D array
first_row = array2[0]
print(f"First row of 'array2': {first_row}")

# Conditional indexing
# You can use boolean indexing to filter elements based on a condition.

# Elements greater than 2 in 'a'
greater_than_2 = a[a > 2]
print(f"Elements greater than 2 in 'a': {greater_than_2}")

Second element of 'a': 2
First row of 'array2': [1 2]
Elements greater than 2 in 'a': [3]


## Slicing


**Slicing with NumPy Arrays**

NumPy offers powerful slicing capabilities for arrays, allowing you to extract specific subsets of data efficiently. Here are some common slicing techniques:

- **Elements from index 1 to 3 (exclusive) (`a[1:3]`):** You can use slicing to extract a contiguous subarray from an array. In this example, we obtain elements from index 1 (inclusive) to 3 (exclusive) in 'a', that is, the elements at indices 1 and 2.

- **First two rows and two columns of a 2D array (`array2[:2, :2]`):** For 2D arrays, you can slice both rows and columns simultaneously. Here, we extract the first two rows and the first two columns of 'array2'.

- **Slicing with step (`a[::2]`):** You can specify a step value to skip elements while slicing. In this case, we retrieve every second element from 'a'.

- **Reverse slicing (`a[::-1]`):** By using a negative step, you can reverse the order of elements in an array. In this example, we reverse the order of elements in 'a'.

These slicing techniques offer flexibility and efficiency when working with NumPy arrays, making it easier to manipulate and analyze data.
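
One detail that often surprises newcomers is that a basic slice is a view onto the original data rather than a copy. The sketch below uses a fresh array named `data` (an illustrative name) so that the notebook's own arrays are not modified.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])

view = data[1:4]              # basic slice: a view, no data is copied
view[0] = 99                  # writes through to 'data'
print(data)                   # [ 1 99  3  4  5]

independent = data[1:4].copy()   # explicit copy
independent[0] = -1              # does not affect 'data'
print(data[1])                   # 99

print(np.shares_memory(data, view))         # True
print(np.shares_memory(data, independent))  # False
```

Call `.copy()` whenever the slice must be modified without touching the source array.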


In [62]:
# Slicing examples
print("Slicing Examples:")

# Elements from index 1 up to (but not including) index 3
slice1 = a[1:3]
print("\nSlice 1 (a[1:3]):", slice1)

# First two rows and two columns of a 2D array
slice2 = array2[:2, :2]
print("\nSlice 2 (array2[:2, :2]):")
print(slice2)

# Slicing with step (every second element)
slice3 = a[::2]
print("\nSlice 3 (a[::2]):", slice3)

# Reverse slicing (reversing the order of elements)
slice4 = a[::-1]
print("\nSlice 4 (a[::-1]):", slice4)

Slicing Examples:

Slice 1 (a[1:3]): [2 3]

Slice 2 (array2[:2, :2]):
[[1 2]
 [3 4]]

Slice 3 (a[::2]): [1 3]

Slice 4 (a[::-1]): [3 2 1]


# Reshaping and Combining Arrays


## Reshaping


>In this section, we explore various reshaping operations using NumPy.
>
>**Reshaping a 1D array to 2D (3 rows, 1 column)**: We start by reshaping a 1D array 'a' into a 2D array with 3 rows and 1 column. This is a common operation when you want to convert a 1D array into a vertical column vector.
>
>**Flattening a 2D array to 1D**: Here, we flatten a 2D array 'array2' into a 1D array. Flattening is useful when you want to convert a multi-dimensional array into a 1D representation.
>
>**Reshaping a 1D array to 2D (2 rows, 2 columns)**: In this example, we reshape a 1D array 'c' into a 2D array with 2 rows and 2 columns. This operation transforms the 1D data into a matrix-like structure.
>
>**Reshaping a 1D array to a 3D array (2x2x1)**: Finally, we reshape 'c' into a 3D array with dimensions 2x2x1. This demonstrates how you can change the dimensions of an array to match your data representation needs.
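
Two related details are not covered above: `reshape` can infer one dimension when it is given as `-1`, and `ravel` differs from `flatten` in whether it copies. The sketch below uses a six-element array named `c6` (an illustrative name, distinct from the notebook's 'c').

```python
import numpy as np

c6 = np.arange(1, 7)              # [1 2 3 4 5 6]

two_rows = c6.reshape(2, -1)      # -1 is inferred as 3 -> shape (2, 3)
column = c6.reshape(-1, 1)        # shape (6, 1)

flat_view = two_rows.ravel()      # returns a view when possible
flat_copy = two_rows.flatten()    # always returns a copy

print(np.shares_memory(two_rows, flat_view))   # True
print(np.shares_memory(two_rows, flat_copy))   # False

# The total number of elements must stay the same
try:
    c6.reshape(4, 2)              # 8 slots for 6 elements
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)             # True
```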

In [66]:
# Reshaping examples
print("Reshaping Examples:")

# Reshaping a 1D array to 2D (3 rows, 1 column)
reshaped1 = a.reshape(3, 1)
print("\nReshaped 1 (a.reshape(3, 1)):")
print(f"'a' is: {a}")
print(reshaped1)

# Flattening a 2D array to 1D

flattened = array2.ravel()
print("\nFlattened (array2.ravel()):", flattened)
print(f"'array2' is: {array2}")

# Reshaping a 1D array to 2D (2 rows, 2 columns)

c = np.array([1, 2, 3, 4])

reshaped2 = c.reshape(2, 2)
print("\nReshaped 2 (c.reshape(2, 2)):")
print(reshaped2)

# Reshaping a 1D array to a 3D array (2x2x1)
reshaped3 = c.reshape(2, 2, 1)
print("\nReshaped 3 (c.reshape(2, 2, 1)):")
print(reshaped3)

Reshaping Examples:

Reshaped 1 (a.reshape(3, 1)):
'a' is: [1 2 3]
[[1]
 [2]
 [3]]

Flattened (array2.ravel()): [1 2 3 4 5 6]
'array2' is: [[1 2]
 [3 4]
 [5 6]]

Reshaped 2 (c.reshape(2, 2)):
[[1 2]
 [3 4]]

Reshaped 3 (c.reshape(2, 2, 1)):
[[[1]
  [2]]

 [[3]
  [4]]]


## Combining Arrays


> **Stacking Arrays Examples**
>
> In NumPy, you can stack arrays both vertically and horizontally to combine them into larger arrays. Here are some examples of stacking arrays and their use cases:
>
> - **Vertically Stacked 1 (np.vstack((a, b))):** In this example, arrays 'a' and 'b' are stacked vertically using `np.vstack`. This operation concatenates the arrays along the rows, effectively stacking 'b' below 'a'. It's useful when you have arrays with the same number of columns and want to combine them vertically.
>
> - **Vertically Stacked 2 (np.vstack((a, c))):** Here, 'a' and 'c' are stacked vertically using `np.vstack`. In this cell 'c' is redefined as a three-element array, so it has the same shape as 'a' and the stack succeeds. If the two arrays had different lengths, `np.vstack` would raise a `ValueError` because the number of columns must match.
>
> - **Horizontally Stacked 1 (np.hstack((a, b))):** This example demonstrates horizontal stacking of 'a' and 'b' using `np.hstack`. Because both arrays are 1D, they are joined end to end, with the elements of 'b' following those of 'a'. For 2D arrays, `np.hstack` concatenates along the columns and requires the same number of rows.
>
> - **Horizontally Stacked 2 (np.hstack((a, d))):** In this case, 'a' and 'd' are stacked horizontally using `np.hstack`. Both are 1D, so the result is a single six-element array. Unlike `np.vstack`, `np.hstack` accepts 1D arrays of different lengths; a `ValueError` arises only for inputs with two or more dimensions whose remaining dimensions do not match.
>
> Stacking arrays is a valuable operation when you need to combine data from different sources or reshape your data for further analysis or processing.
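
To see when stacking actually fails, the sketch below stacks arrays of unequal length. The arrays `short`, `left`, and `right` are hypothetical and do not appear in the notebook cells.

```python
import numpy as np

a = np.array([1, 2, 3])
short = np.array([7, 8])          # hypothetical array with a different length

# vstack treats each 1D array as a row, so the lengths must match
try:
    np.vstack((a, short))
    vstack_failed = False
except ValueError:
    vstack_failed = True
print(vstack_failed)              # True

# hstack joins 1D arrays end to end, so the lengths may differ
joined = np.hstack((a, short))    # [1 2 3 7 8]

# For 2D inputs, hstack requires the same number of rows
left = np.array([[1], [2]])            # shape (2, 1)
right = np.array([[3, 4], [5, 6]])     # shape (2, 2)
wide = np.hstack((left, right))        # shape (2, 3)
```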


In [67]:
# Stacking Arrays Examples
print("Stacking Arrays Examples:")

# Stacking arrays vertically
vstacked1 = np.vstack((a, b))
print("\nVertically Stacked 1 (np.vstack((a, b))):")
print(f"'a' is: {a}")
print(f"'b' is: {b}")
print(vstacked1)

# Stacking a second pair of arrays vertically ('c' has the same shape as 'a')
c = np.array([7, 8, 9])
vstacked2 = np.vstack((a, c))
print("\nVertically Stacked 2 (np.vstack((a, c))):")
print(f"'a' is: {a}")
print(f"'c' is: {c}")
print(vstacked2)

# Stacking arrays horizontally
hstacked1 = np.hstack((a, b))
print("\nHorizontally Stacked 1 (np.hstack((a, b))):")
print(f"'a' is: {a}")
print(f"'b' is: {b}")
print(hstacked1)

# Stacking a second pair of arrays horizontally (1D arrays are joined end to end)
d = np.array([10, 11, 12])
hstacked2 = np.hstack((a, d))
print("\nHorizontally Stacked 2 (np.hstack((a, d))):")
print(f"'a' is: {a}")
print(f"'d' is: {d}")
print(hstacked2)

Stacking Arrays Examples:

Vertically Stacked 1 (np.vstack((a, b))):
'a' is: [1 2 3]
'b' is: [4 5 6]
[[1 2 3]
 [4 5 6]]

Vertically Stacked 2 (np.vstack((a, c))):
'a' is: [1 2 3]
'c' is: [7 8 9]
[[1 2 3]
 [7 8 9]]

Horizontally Stacked 1 (np.hstack((a, b))):
'a' is: [1 2 3]
'b' is: [4 5 6]
[1 2 3 4 5 6]

Horizontally Stacked 2 (np.hstack((a, d))):
'a' is: [1 2 3]
'd' is: [10 11 12]
[ 1  2  3 10 11 12]
