# NumPy Introduction

### Learning Objectives
- Understand what NumPy is and why it is used
- Learn how to create and manipulate NumPy arrays
- Perform mathematical and statistical operations with NumPy
- Work with random data generation and aggregation functions

---

## What is NumPy?

NumPy (Numerical Python) is a powerful Python library for numerical computing. It provides:
- Efficient multi-dimensional arrays (ndarrays) for large datasets
- A large collection of high-performance mathematical functions
- Broadcasting capabilities for flexible arithmetic operations
- Random number generation and sampling tools

NumPy forms the backbone of the scientific Python ecosystem and is widely used in data analysis, machine learning, simulations, and image processing.

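Compared to Python's built-in lists, NumPy arrays are:
- Faster for numerical work, because operations run in compiled code rather than in a Python loop
- More memory efficient, because all elements share one data type and are stored compactly
- Easier to perform element-wise operations on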
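
As a rough illustration of the element-wise point, the sketch below doubles a few values with a plain list and with an array, then uses broadcasting to add a 1D row to every row of a 2D array. The variable names and values are made up for this example.

```python
import numpy as np

values = [1, 2, 3, 4]

# With a list, element-wise arithmetic needs an explicit loop or comprehension
doubled_list = [v * 2 for v in values]

# With an array, the operation is written once and applied to every element
arr = np.array(values)
doubled_arr = arr * 2

# Broadcasting: the 1D row `arr` is added to each row of the 2D grid
grid = np.array([[0, 0, 0, 0], [10, 10, 10, 10]])
shifted = grid + arr

print(doubled_list)   # [2, 4, 6, 8]
print(doubled_arr)    # [2 4 6 8]
print(shifted)        # rows: [1 2 3 4] and [11 12 13 14]
```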

---

## Section 1: Importing the Library
To use NumPy, you first need to import it. The convention is to import it using the alias `np`:

In [1]:
import numpy as np  # NumPy functions are now accessible with the prefix np.

## Section 2: Creating NumPy Arrays

#### What is a NumPy Array?

A NumPy array (technically called an `ndarray`, or *n*-dimensional array) is a grid of values of the same data type, indexed by a tuple of nonnegative integers. NumPy arrays:

- Support fast, vectorized operations (without explicit loops)  
- Use less memory and are more efficient than Python lists  
- Can be multi-dimensional (1D, 2D, 3D, etc.)  
- Are foundational for scientific and numerical computing in Python  

Unlike the built-in `list` type, which can hold elements of **different types**, a NumPy array has **one data type** for all of its elements. This means that:
- A NumPy array holds either integers or floats, but **not both at the same time**; if the input mixes them, NumPy converts every element to a common type (floats in this case)
- This homogeneity lets NumPy store the data compactly and perform **fast linear algebra operations**
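
A small sketch of what the single data type means in practice. The arrays here are illustrative; the exact integer type printed depends on the platform.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])   # the integers are converted to floats

print(ints.dtype)    # a platform-dependent integer type, e.g. int64
print(mixed.dtype)   # float64
print(mixed)         # [1.  2.5 3. ]

# Writing a float into an integer array drops the fractional part
ints[0] = 7.9
print(ints)          # [7 2 3]
```

Checking `.dtype` after creating an array is a cheap way to catch an unintended conversion early.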

NumPy also supports a wide range of **mathematical and statistical functions** such as:
- Average
- Minimum and Maximum
- Standard Deviation
- Variance
- Aggregation functions, and many more

After mastering NumPy, you’ll have a **powerful tool for analyzing numerical multi-dimensional data**, forming the backbone of most data science and machine learning workflows.

<div align="center">
  <img src="figures/fig11.png" alt="NumPy Array Illustration" width="600"/>
  <p style="font-size:small;">
    Figure 11: NumPy array layout (1D and 2D structures)  
    Source: <a href="https://www.pythontutorial.net/python-numpy/what-is-numpy/" target="_blank">
    PythonTutorial.net – What is NumPy</a>
  </p>
</div>


### From Python Lists

In [12]:
# Convert a list to a NumPy array
my_list = [1,2,3,4,5]

# conversion using np.array
numpy_array = np.array(my_list)
print("NumPy Array:", numpy_array)

NumPy Array: [1 2 3 4 5]


> Why use NumPy arrays instead of lists?
> - Support for vectorized operations (no need for loops)
> - More powerful indexing and slicing
> - Easier integration with scientific libraries

### Using `np.asarray()`
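`np.asarray()` behaves like `np.array()`, but it does not copy the data when the input is already an array of the requested type. A Python list, as in the cell below, is always copied into a new array.  
Avoiding the copy is useful when working with large datasets where memory efficiency is a concern.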


In [None]:
# Conversion using np.asarray()
numpy_array2 = np.asarray(my_list)
print("Array with np.asarray:", numpy_array2)

Array with np.asarray: [1 2 3 4 5]
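
The difference between the two functions only shows up when the input is already an array. The sketch below uses a hypothetical array named `original` to make that visible.

```python
import numpy as np

original = np.array([1, 2, 3])

copied = np.array(original)     # np.array copies by default
aliased = np.asarray(original)  # no copy: the input is already an ndarray

print(copied is original)    # False
print(aliased is original)   # True

# A change made through `aliased` is visible in `original`, but not in `copied`
aliased[0] = 99
print(original)   # [99  2  3]
print(copied)     # [1 2 3]
```

Because `np.asarray()` may hand back the very same object, call `.copy()` explicitly when an independent array is needed.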


### Creating Arrays with Random Numbers

In [3]:
# Create an array with random integers
random_array = np.random.randint(0, 2024, 30) # 30 random integers from 0 to 2023 (the upper bound 2024 is exclusive)
print("Random Array:", random_array)

Random Array: [ 624 1695 1153  384 1749 1051  706  748 1603  461  173   25  168  213
 1085  850 1024 1280  561  227  294 1273  563   64 1844 1767 1000  880
  980 1095]


In [8]:
# Generate normally distributed random numbers
normal_array = np.random.randn(1000) # 1000 samples from the standard normal distribution (mean 0, standard deviation 1)
print("some normally distributed random numbers:", normal_array[:10]) # print first 10 numbers

some normally distributed random numbers: [-0.26354362 -0.87388542  0.12940227 -0.03645156 -0.98572959 -1.36567331
 -0.81582488 -0.07333171 -0.91803362  1.64299021]


> Why generate random numbers?
> - Useful for simulations, initializing weights in machine learning, or testing statistical methods.
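
The random cells above produce different numbers on every run. When reproducible results matter, one option is NumPy's `Generator` interface, created with `np.random.default_rng()`. The sketch below is an alternative to the calls used in this notebook, not a replacement the notebook relies on, and the seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)   # seeded generator

random_ints = rng.integers(0, 2024, size=30)   # upper bound is exclusive
normal_samples = rng.standard_normal(1000)     # mean 0, standard deviation 1

# A second generator with the same seed reproduces the same numbers
rng_again = np.random.default_rng(42)
same_ints = rng_again.integers(0, 2024, size=30)

print(np.array_equal(random_ints, same_ints))   # True
```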

## Section 3: Manipulating Arrays

### Sorting Arrays
You can sort an array using `np.sort()`:  
> This returns a new sorted array and does not modify the original. Use `array.sort()` to sort in-place.

In [None]:
sorted_array = np.sort(random_array)
print("Sorted array:", sorted_array)

Sorted array: [  25   64  168  173  213  227  294  384  461  561  563  624  706  748
  850  880  980 1000 1024 1051 1085 1095 1153 1273 1280 1603 1695 1749
 1767 1844]
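
To make the difference between `np.sort()` and the in-place `.sort()` method concrete, here is a short sketch on a small made-up array. Reversing the sorted result with `[::-1]` is one common way to get descending order.

```python
import numpy as np

data = np.array([3, 1, 2])

sorted_copy = np.sort(data)        # new array; `data` is unchanged
print(data)          # [3 1 2]
print(sorted_copy)   # [1 2 3]

descending = np.sort(data)[::-1]   # reverse the sorted copy
print(descending)    # [3 2 1]

data.sort()          # sorts in place and returns None
print(data)          # [1 2 3]
```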


### Indexing and Slicing

In [14]:
numpy_array

array([1, 2, 3, 4, 5])

In [None]:
print("First element:", numpy_array[0])
print("Last element:", numpy_array[-1])

First element: 1
Last element: 5


In [15]:
print("First three elements:", numpy_array[:3])
print("Middle elements:", numpy_array[1:4])

First three elements: [1 2 3]
Middle elements: [2 3 4]
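
One detail worth knowing about slices: a basic slice is a view onto the same data, not a copy. The sketch below uses a separate, made-up array so that `numpy_array` is left untouched.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

view = arr[1:4]        # basic slicing returns a view of `arr`
view[0] = 999          # this also changes arr[1]
print(arr)             # [ 10 999  30  40  50]

independent = arr[1:4].copy()   # explicit copy with its own data
independent[0] = -1
print(arr)             # unchanged by the edit to `independent`
```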


### Updating Values

In [None]:
numpy_array[0] = 100
print("Modified array:", numpy_array)

Modified array: [100   2   3   4   5]


### Stacking

<div align="center">
  <img src="figures/stacking.png" alt="stacking" width="550"/>
  <p style="font-size:small;">
    Vertical and horizontal stacking of arrays
  </p>
</div>


In [17]:
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])

# Vertical stacking
vstacked = np.vstack((a, b))
print("Vertical Stack:")
print(vstacked)

# Horizontal stacking
hstacked = np.hstack((a, a))
print("Horizontal Stack:")
print(hstacked)

Vertical Stack:
[[1 2]
 [3 4]
 [5 6]]
Horizontal Stack:
[[1 2 1 2]
 [3 4 3 4]]
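
Stacking only works when the shapes line up. The sketch below reuses the same `a` and `b` to show the rule for 2D input: `np.vstack` needs matching column counts and `np.hstack` needs matching row counts. The comparison with `np.concatenate` is added here for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # shape (2, 2)
b = np.array([[5, 6]])           # shape (1, 2)

# For 2D arrays, vstack is equivalent to concatenating along axis 0
v = np.concatenate((a, b), axis=0)
print(np.array_equal(v, np.vstack((a, b))))   # True

# hstack joins along axis 1, so 2 rows cannot be joined with 1 row
try:
    np.hstack((a, b))
except ValueError as err:
    print("Cannot hstack:", err)

# Transposing b gives shape (2, 1), which has a matching row count
h = np.hstack((a, b.T))
print(h)   # rows: [1 2 5] and [3 4 6]
```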


## Section 4: Statistical Calculations

### Mean (Average)
The arithmetic mean is calculated using `np.mean()` or the `.mean()` method:
> The mean is a measure of the central tendency of your data, useful for summarizing distributions.

In [None]:
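# Calculating the mean
# numpy_array was modified in the previous section, so rebuild it from the original list
numpy_array = np.array(my_list)
mean = np.mean(numpy_array)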
print("Mean using np.mean():", mean)

mean2 = numpy_array.mean()
print("Mean using .mean():", mean2)

Mean using np.mean(): 3.0
Mean using .mean(): 3.0


### Standard Deviation
The standard deviation measures how spread out the values are:
> A low standard deviation means the values are close to the mean, while a high one means they are more spread out.

In [12]:
# 4.2: Calculating standard deviation
std_dev = np.std(normal_array)
print("Standard deviation of the normally distributed array:", std_dev)

Standard deviation of the normally distributed array: 1.0124964895455801
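
By default `np.std()` computes the population standard deviation, dividing by the number of values N. Passing `ddof=1` divides by N - 1 instead, which gives the sample standard deviation. The data below is a small made-up example chosen so the population result is a round number.

```python
import numpy as np

data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

population_std = np.std(data)       # divides by N
sample_std = np.std(data, ddof=1)   # divides by N - 1

print(population_std)   # 2.0
print(sample_std)       # slightly larger than the population value
```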


### Minimum and Maximum
> These are useful for range and normalization operations.

In [9]:
# 4.3: Maximum and minimum of an array
max_value = np.max(normal_array)
min_value = np.min(normal_array)
print("Maximum value:", max_value)
print("Minimum value:", min_value)

Maximum value: 3.677466538732689
Minimum value: -3.965997536317456
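
As an example of the normalization use mentioned above, the sketch below rescales a small made-up array to the interval [0, 1] using its minimum and maximum. It assumes the array contains at least two distinct values; otherwise the range would be zero.

```python
import numpy as np

data = np.array([2.0, 4.0, 6.0, 10.0])

value_range = data.max() - data.min()        # 8.0
scaled = (data - data.min()) / value_range   # min maps to 0, max maps to 1

print(scaled)   # [0.   0.25 0.5  1.  ]
```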


### Sum

In [10]:
# 4.4: Sum of all values in an array
total_sum = np.sum(normal_array)
print("Sum of values:", total_sum)

Sum of values: 7.825047369297339
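
The cells in this section reduce a 1D array to a single number. On a multi-dimensional array, the same functions accept an `axis` argument to aggregate along rows or columns. A minimal sketch on a made-up 2D array:

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

total = table.sum()               # all elements
column_sums = table.sum(axis=0)   # collapse the rows: one sum per column
row_sums = table.sum(axis=1)      # collapse the columns: one sum per row
column_means = table.mean(axis=0)

print(total)          # 21
print(column_sums)    # [5 7 9]
print(row_sums)       # [ 6 15]
print(column_means)   # [2.5 3.5 4.5]
```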


## Section 5: Conclusion

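NumPy provides a wide range of functions for numerical computation.  
The table below summarizes the functions covered in this notebook, along with a few related tools (`reshape`, `flatten`, and Boolean masking) that were not demonstrated in the cells above: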

| **Function**         | **Description**                                  |
|----------------------|--------------------------------------------------|
| `np.random.randint`  | Generate random integers (upper bound exclusive) |
| `np.random.randn`    | Generate standard normal random numbers          |
| `np.sort`            | Sort an array                                    |
| `np.array`           | Convert a list into a NumPy array                |
| `np.asarray`         | Like `np.array`, but avoids unneeded copies      |
| `np.mean`            | Calculate the mean                               |
| `np.std`             | Calculate the standard deviation                 |
| `np.max`, `np.min`   | Find the maximum and minimum                     |
| `np.sum`             | Calculate the sum of all values                  |
| `array.reshape()`    | Reshape an array                                 |
| `array.flatten()`    | Flatten a multidimensional array                 |
| `np.vstack`          | Stack arrays vertically                          |
| `np.hstack`          | Stack arrays horizontally                        |
| Indexing/Slicing     | Access, modify, and select subsets of data       |
| Boolean Masking      | Filter arrays using conditions                   |
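
The last few entries in the table (`reshape`, `flatten`, and Boolean masking) have no cell of their own in this notebook. As a minimal sketch of how they are typically used, with made-up values:

```python
import numpy as np

values = np.arange(1, 7)          # [1 2 3 4 5 6]

matrix = values.reshape(2, 3)     # 2 rows, 3 columns
flat = matrix.flatten()           # back to 1D, as a copy

mask = values > 3                 # Boolean array, one entry per element
selected = values[mask]           # keep only the elements where mask is True

print(matrix)     # rows: [1 2 3] and [4 5 6]
print(flat)       # [1 2 3 4 5 6]
print(mask)       # [False False False  True  True  True]
print(selected)   # [4 5 6]
```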


Experiment with these functions to develop a deeper understanding!


