# Introduction to NumPy

In this lesson, we will introduce **NumPy**, a powerful library for numerical computing in Python. We will cover:

- What is NumPy and why use it
- How to install and import NumPy
- Creating NumPy arrays and exploring their attributes
- Array operations (indexing, slicing, mathematical operations, broadcasting, aggregation functions)
- Matrix operations (creation, multiplication, transposing, reshaping)
- Practical exercises to apply these concepts

## What is NumPy and Why Use It?

- **NumPy** stands for *Numerical Python* and is one of the core libraries for numerical computing in Python.
- It provides support for large multi-dimensional arrays and matrices.
- It includes many built-in mathematical functions to operate on these arrays efficiently.

### Advantages

- **Performance:** Array operations are implemented in C, making them much faster than Python loops.
- **Memory Efficiency:** Arrays store raw values of a single type in one contiguous block, so they typically use less memory than Python lists, which hold references to separate objects.
- **Functionality:** Offers many vectorized operations and functions for linear algebra, statistics, and more.
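
As a rough illustration of the memory and vectorization points above, the sketch below compares a Python list with an equivalent array. The variable names are illustrative, and the list's byte count depends on the Python version and platform, so no figure is shown for it.

```python
import sys
import numpy as np

n = 1_000
python_list = list(range(n))
numpy_array = np.arange(n, dtype=np.int64)

# A list stores references to separate Python int objects;
# an array stores the raw 8-byte values back to back.
list_bytes = sys.getsizeof(python_list) + sum(sys.getsizeof(x) for x in python_list)
array_bytes = numpy_array.nbytes  # n values * 8 bytes each

print('List (container + int objects):', list_bytes, 'bytes')
print('Array data buffer:', array_bytes, 'bytes')

# The same computation as a Python-level loop and as one vectorized expression
squares_loop = [x * x for x in python_list]
squares_vectorized = numpy_array * numpy_array
```

Both approaches produce the same squares; the vectorized form simply moves the loop into compiled code.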

## Installing and Importing NumPy

If you haven't installed NumPy yet, run `pip install numpy` in your terminal. In a notebook cell, prefix the command with `!` so that it is passed to the shell:

```python
!pip install numpy
```

Then, import NumPy in your Python code using:

```python
import numpy as np
```

In [None]:
# Importing numpy
import numpy as np

# Creating a basic NumPy array from a Python list
array_from_list = np.array([1, 2, 3, 4, 5])
print('Array from list:', array_from_list)

## NumPy Arrays vs Python Lists

- **Homogeneity:** NumPy arrays are homogeneous (all elements have the same type), while Python lists can contain mixed types.
- **Performance:** NumPy arrays support vectorized operations and are much faster for numerical tasks.
- **Memory Efficiency:** Arrays typically use less memory than lists holding the same numbers.
- **Functionality:** Many built-in mathematical operations are available directly on NumPy arrays.

In [None]:
# Comparing Python lists and NumPy arrays
python_list = [1, 2, 3, 4, 5]
numpy_array = np.array([1, 2, 3, 4, 5])

# Multiplying a Python list replicates the list
print('Python list multiplied by 2:', python_list * 2)

# Multiplying a NumPy array multiplies each element by 2
print('NumPy array multiplied by 2:', numpy_array * 2)
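
The homogeneity point can be seen by checking `dtype` after building arrays from mixed inputs. This is a small sketch with illustrative variable names; the exact integer and string dtypes reported depend on the platform and NumPy version.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed_numbers = np.array([1, 2.5, 3])      # integers are upcast to float
mixed_with_text = np.array([1, 'two', 3])  # every element becomes a string

print('ints dtype:', ints.dtype)
print('mixed_numbers dtype:', mixed_numbers.dtype)
print('mixed_with_text dtype:', mixed_with_text.dtype)
```

NumPy picks one common type that can represent every element, which is why mixing numbers and text usually defeats the purpose of using an array.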

## Creating NumPy Arrays

There are several ways to create arrays in NumPy:

### 1. Using `np.array`

Convert a Python list into a NumPy array.

In [None]:
data = [10, 20, 30, 40, 50]
arr = np.array(data)
print('NumPy array:', arr)
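
`np.array` also accepts nested lists and an optional `dtype` argument. A minimal sketch, with illustrative variable names:

```python
import numpy as np

nested = [[1, 2, 3], [4, 5, 6]]

arr_2d = np.array(nested)                  # a list of lists becomes a 2D array
arr_float = np.array(nested, dtype=float)  # request a specific element type

print('Shape:', arr_2d.shape)        # (2, 3)
print('Float dtype:', arr_float.dtype)  # float64
```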

### 2. Using `np.zeros` and `np.ones`

- `np.zeros(shape)` creates an array filled with zeros.
- `np.ones(shape)` creates an array filled with ones.

In [None]:
# Array of zeros: 3 rows x 4 columns
zeros_array = np.zeros((3, 4))
print('Zeros array:\n', zeros_array)

# Array of ones: 2 rows x 5 columns
ones_array = np.ones((2, 5))
print('Ones array:\n', ones_array)
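
Both functions return floating-point arrays unless told otherwise. The sketch below shows the `dtype` argument and, as a related function not covered above, `np.full` for an arbitrary fill value.

```python
import numpy as np

default_zeros = np.zeros(3)         # float64 by default
int_zeros = np.zeros(3, dtype=int)  # integer zeros instead
sevens = np.full((2, 3), 7)         # 2 x 3 array filled with 7

print('Default dtype:', default_zeros.dtype)
print('Integer zeros:', int_zeros)
print('Filled array:\n', sevens)
```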

### 3. Using `np.arange` and `np.linspace`

- `np.arange(start, stop, step)` creates an array of values from `start` up to, but not including, `stop`, spaced `step` apart.
- `np.linspace(start, stop, num)` creates an array of `num` evenly spaced values from `start` to `stop`, including `stop` by default.

In [None]:
# Using np.arange: values from 0 up to (but not including) 10, in steps of 2
arange_array = np.arange(0, 10, 2)
print('np.arange:', arange_array)

# Using np.linspace: 5 values evenly spaced between 0 and 1
linspace_array = np.linspace(0, 1, 5)
print('np.linspace:', linspace_array)
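
The main difference between the two functions is how they treat the endpoint. A short sketch (variable names are illustrative):

```python
import numpy as np

stepped = np.arange(0, 10, 2)                         # 0, 2, 4, 6, 8 (10 is excluded)
with_endpoint = np.linspace(0, 10, 5)                 # 0, 2.5, 5, 7.5, 10 (10 is included)
without_endpoint = np.linspace(0, 10, 5, endpoint=False)  # 0, 2, 4, 6, 8

print('arange:', stepped)
print('linspace:', with_endpoint)
print('linspace without endpoint:', without_endpoint)
```

`np.linspace` is usually the safer choice when you need a fixed number of points, since the count does not depend on floating-point rounding of the step.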

## Array Attributes

Every NumPy array has several useful attributes:

- **`shape`**: A tuple giving the size of the array along each dimension, for example `(rows, columns)` for a 2D array
- **`dtype`**: The data type of the elements
- **`size`**: Total number of elements in the array
- **`ndim`**: Number of dimensions (axes) of the array

In [None]:
arr_example = np.array([[1, 2, 3], [4, 5, 6]])
print('Array:')
print(arr_example)

print('Shape:', arr_example.shape)
print('Data type:', arr_example.dtype)
print('Size:', arr_example.size)
print('Number of dimensions:', arr_example.ndim)
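
The same attributes apply to arrays with more than two dimensions. As a sketch, a hypothetical 3D array of zeros shows how they relate to one another:

```python
import numpy as np

cube = np.zeros((2, 3, 4))

print('Shape:', cube.shape)  # (2, 3, 4)
print('ndim:', cube.ndim)    # 3, the length of the shape tuple
print('Size:', cube.size)    # 24, the product 2 * 3 * 4
print('dtype:', cube.dtype)  # float64, the default for np.zeros
```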

## Array Operations

### Indexing and Slicing

You can access elements of a NumPy array in much the same way as elements of a Python list:

- **Indexing:** Use square brackets `[]` with the index (starting at 0) to get a single element.
- **Slicing:** Use the colon `:` to extract a subset of the array.

In [None]:
sample_array = np.array([10, 20, 30, 40, 50])

# Indexing: Get the third element (index 2)
print('Third element:', sample_array[2])

# Slicing: Get elements from index 1 to 3 (index 4 is excluded)
print('Slice from index 1 to 3:', sample_array[1:4])
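
Indexing extends to 2D arrays by separating the row and column positions with a comma. The sketch below also shows one behavior that differs from lists: a basic slice is a view onto the original data, not a copy. Variable names are illustrative.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

last_element = grid[-1, -1]     # 9 (negative indices count from the end)
second_row = grid[1]            # [4 5 6]
first_column = grid[:, 0]       # [1 4 7]
top_right_block = grid[:2, 1:]  # [[2 3], [5 6]]

# Slices are views: writing through the slice changes the original array
row = np.array([10, 20, 30, 40, 50])
view = row[1:4]
view[0] = 99                    # row[1] is now 99

# Use .copy() when an independent array is needed
independent = row[1:4].copy()
independent[1] = -1             # row[2] is still 30

print('Row after edits:', row)
```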

### Mathematical Operations (Element-wise)

NumPy allows you to perform arithmetic operations on arrays element-wise.

In [None]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print('Addition:', a + b)
print('Subtraction:', a - b)
print('Multiplication:', a * b)
print('Division:', a / b)
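
Other operators and many NumPy functions work element-wise in the same way. A brief sketch with the same `a` and `b` (the result names are illustrative):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

true_division = a / b    # always produces floats, even for integer inputs
floor_division = b // a  # [4 2 2]
powers = a ** 2          # [1 4 9]
comparison = a < 2       # [ True False False]
roots = np.sqrt(powers)  # [1. 2. 3.]

print('Division dtype:', true_division.dtype)
print('Floor division:', floor_division)
print('Powers:', powers)
print('Comparison:', comparison)
print('Square roots:', roots)
```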

### Broadcasting

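Broadcasting lets NumPy operate on arrays of different shapes by virtually "stretching" the smaller array to match the larger one, without copying data. Shapes are compared from the last dimension backward, and two dimensions are compatible when they are equal or one of them is 1. For example, a vector of length 3 can be added to each row of a 2 x 3 matrix.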

In [None]:
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
vector = np.array([1, 0, 1])

# Broadcasting: Add the vector to each row of the matrix
result = matrix + vector
print('Matrix:\n', matrix)
print('Vector:', vector)
print('Result of broadcasting addition:\n', result)
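
Broadcasting also works along the other axis, and it fails with an error when the shapes cannot be matched. The following sketch uses an illustrative column vector and a deliberately incompatible shape:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])   # shape (2, 3)

# A (2, 1) column is stretched across the 3 columns of each row
column = np.array([[10], [20]])  # shape (2, 1)
added_per_row = matrix + column  # [[11 12 13], [24 25 26]]

# Shapes (2, 3) and (2,) do not match in the last dimension (3 vs 2)
try:
    matrix + np.array([1, 2])
    shapes_compatible = True
except ValueError as error:
    shapes_compatible = False
    print('Broadcasting failed:', error)

print('Column added to each row:\n', added_per_row)
```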

### Aggregation Functions

NumPy provides functions to aggregate values in an array:

- **`np.sum()`**: Sum of all elements
- **`np.mean()`**: Mean (average) of elements
- **`np.std()`**: Standard deviation (the population standard deviation by default, `ddof=0`)
- **`np.min()`**: Minimum value
- **`np.max()`**: Maximum value

In [None]:
data = np.array([1, 2, 3, 4, 5])

print('Sum:', np.sum(data))
print('Mean:', np.mean(data))
print('Standard Deviation:', np.std(data))
print('Minimum:', np.min(data))
print('Maximum:', np.max(data))
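
On multi-dimensional arrays, the `axis` argument controls which dimension is collapsed. The sketch below also shows `ddof=1`, which switches `np.std` from the population to the sample standard deviation. Variable names are illustrative.

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

column_sums = table.sum(axis=0)  # collapse rows: [5 7 9]
row_sums = table.sum(axis=1)     # collapse columns: [ 6 15]
row_means = table.mean(axis=1)   # [2. 5.]

data = np.array([1, 2, 3, 4, 5])
population_std = np.std(data)      # divides by n:     sqrt(10 / 5)
sample_std = np.std(data, ddof=1)  # divides by n - 1: sqrt(10 / 4)

print('Column sums:', column_sums)
print('Row sums:', row_sums)
print('Population vs sample std:', population_std, sample_std)
```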

## Matrix Operations

### Creating Matrices

NumPy offers functions to create special matrices:

- **`np.eye(n)`**: Creates an *n x n* identity matrix
- **`np.random.rand(m, n)`**: Creates an *m x n* matrix of random values drawn uniformly from the interval [0, 1)

In [None]:
# Identity matrix of size 4x4
identity_matrix = np.eye(4)
print('Identity Matrix:\n', identity_matrix)

# Random 3x3 matrix
random_matrix = np.random.rand(3, 3)
print('Random Matrix:\n', random_matrix)
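
Random values change on every run, which can make results hard to reproduce. One option, not used elsewhere in this lesson, is NumPy's `Generator` interface created with `np.random.default_rng`; the seed value 42 below is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
random_matrix = rng.random((3, 3))              # floats in [0, 1)
random_ints = rng.integers(0, 10, size=(2, 4))  # integers from 0 to 9

# A second generator with the same seed reproduces the same values
rng_again = np.random.default_rng(seed=42)
same_matrix = rng_again.random((3, 3))

print('Random matrix:\n', random_matrix)
print('Reproducible:', np.array_equal(random_matrix, same_matrix))
```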

### Matrix Multiplication

Perform matrix multiplication using either `np.dot` or the `@` operator; for 2D arrays the two give the same result. Note that `*` multiplies element-wise instead.

In [None]:
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Using np.dot
product_dot = np.dot(A, B)
print('Matrix multiplication using np.dot:\n', product_dot)

# Using @ operator
product_at = A @ B
print('Matrix multiplication using @ operator:\n', product_at)
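
A common source of confusion is the difference between `*` and `@`. The sketch below contrasts them on the same `A` and `B`, and checks the shape rule for a matrix product: an (m, k) array times a (k, n) array gives an (m, n) array. The arrays `C` and `D` are illustrative.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

elementwise = A * B     # [[ 5 12], [21 32]]
matrix_product = A @ B  # [[19 22], [43 50]]
reversed_product = B @ A  # [[23 34], [31 46]], order matters

C = np.ones((2, 3))
D = np.ones((3, 4))
product_shape = (C @ D).shape  # (2, 4)

print('Element-wise:\n', elementwise)
print('Matrix product:\n', matrix_product)
print('Shape of C @ D:', product_shape)
```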

### Transposing and Reshaping

- **Transpose:** Flip rows and columns using `.T`.
- **Reshape:** Change the shape of an array using `np.reshape`. The total number of elements must stay the same, and elements are refilled in row-major order, so reshaping is not the same as transposing.

In [None]:
matrix = np.array([[1, 2, 3], [4, 5, 6]])

# Transpose the matrix
transpose_matrix = matrix.T
print('Original Matrix:\n', matrix)
print('Transposed Matrix:\n', transpose_matrix)

# Reshape the matrix to a 3x2 array
reshaped_matrix = np.reshape(matrix, (3, 2))
print('Reshaped Matrix (3x2):\n', reshaped_matrix)
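
Transposing and reshaping a 2 x 3 matrix both yield a 3 x 2 result, but with the elements in different positions. The sketch below shows this side by side, along with `-1`, which asks NumPy to infer one dimension from the total number of elements.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])

transposed = matrix.T             # [[1 4], [2 5], [3 6]]
reshaped = matrix.reshape(3, 2)   # [[1 2], [3 4], [5 6]]
inferred = matrix.reshape(-1, 2)  # -1 is inferred as 3 here
flat = matrix.reshape(-1)         # [1 2 3 4 5 6]

print('Transposed:\n', transposed)
print('Reshaped:\n', reshaped)
print('Flattened:', flat)
```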

## Practical Exercises

### Exercise 1: Generate an Array and Perform Operations

1. Create a 2D NumPy array of shape **(4, 5)** with random integers between 0 and 50 (inclusive).
2. Compute the **sum** of all elements in the array.
3. Find the **mean** of each column.

Try to solve this on your own before checking the solution below.

In [None]:
# Exercise 1 Solution
# The upper bound of np.random.randint is exclusive, so use 51 to include 50
array_ex1 = np.random.randint(0, 51, (4, 5))
print('Exercise 1 Array:\n', array_ex1)

total_sum = np.sum(array_ex1)
print('Total Sum of Elements:', total_sum)

mean_columns = np.mean(array_ex1, axis=0)  # axis=0 averages down the rows, giving one mean per column
print('Mean of Each Column:', mean_columns)

### Exercise 2: Calculate Statistics on a Dataset Using NumPy

Given the following dataset of exam scores, calculate:

- The overall **mean** score
- The **standard deviation** of the scores
- The **highest** and **lowest** scores

Dataset: `[88, 92, 79, 93, 85, 78, 91, 87, 95, 89]`

In [None]:
# Exercise 2 Solution
scores = np.array([88, 92, 79, 93, 85, 78, 91, 87, 95, 89])

mean_score = np.mean(scores)
std_score = np.std(scores)
min_score = np.min(scores)
max_score = np.max(scores)

print('Mean Score:', mean_score)
print('Standard Deviation:', std_score)
print('Minimum Score:', min_score)
print('Maximum Score:', max_score)

### Exercise 3: Simple Numerical Computation Task

Imagine you have an array representing the distances (in kilometers) traveled by different vehicles. 

1. Create an array with the following distances: `[15, 30, 45, 60, 75]`.
2. Convert these distances into **miles** using the conversion factor (1 km = 0.621371 miles).
3. Calculate the **total distance** traveled in miles.

In [None]:
# Exercise 3 Solution
distances_km = np.array([15, 30, 45, 60, 75])
conversion_factor = 0.621371

# Convert distances to miles
distances_miles = distances_km * conversion_factor
print('Distances in miles:', distances_miles)

# Total distance in miles
total_distance_miles = np.sum(distances_miles)
print('Total distance traveled in miles:', total_distance_miles)

## Summary

In this notebook, we learned:

- The purpose and advantages of NumPy
- How to install and import NumPy
- Different ways to create arrays and inspect their properties
- Array operations including indexing, slicing, arithmetic operations, broadcasting, and aggregation
- Matrix operations such as multiplication, transposition, and reshaping
- Practical exercises to apply these concepts

This foundation will be useful as we move on to more advanced data manipulation libraries.

### Next Steps

In the next lesson, we will dive into **Pandas**, a powerful library for data manipulation and analysis in Python. Stay tuned!