# NumPy Fundamentals for Data Science – Jupyter Notebook Guide

## Introduction to NumPy

NumPy is a core library for scientific computing in Python. It provides the `ndarray` type for large, multi-dimensional arrays, along with fast mathematical functions that operate on whole arrays at once.


## Installing and Importing NumPy

```python
# Install NumPy if needed
!pip install numpy
```

In [1]:
# Import NumPy
import numpy as np

## Python Lists vs. NumPy Arrays

In [2]:
# Comparing Python list and NumPy array behavior
my_list = [1, 2, 3]
my_array = np.array([1, 2, 3])

print("List * 2:", my_list * 2)        # Repeats the list
print("Array * 2:", my_array * 2)      # Multiplies each element

List * 2: [1, 2, 3, 1, 2, 3]
Array * 2: [2 4 6]


> NumPy arrays apply arithmetic element-wise, and the underlying loops run in compiled code, which makes them faster than Python lists for numerical tasks.

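As a minimal sketch of what element-wise behavior means when two sequences are combined, the snippet below adds two lists and two arrays. The values and the names `list_a`, `list_b`, `arr_a`, and `arr_b` are illustrative and not part of the notebook.

```python
import numpy as np

# Hypothetical values, chosen only for illustration
list_a = [1, 2, 3]
list_b = [10, 20, 30]
arr_a = np.array(list_a)
arr_b = np.array(list_b)

# Lists: + concatenates, so element-wise addition needs an explicit loop
list_sum = [x + y for x, y in zip(list_a, list_b)]

# Arrays: + is applied to each pair of elements
arr_sum = arr_a + arr_b

print(list_a + list_b)   # [1, 2, 3, 10, 20, 30]
print(list_sum)          # [11, 22, 33]
print(arr_sum)           # [11 22 33]
```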
## Creating Arrays

NumPy provides several methods to create arrays:
- `np.array()` creates an array from a list.
- `np.zeros()` and `np.ones()` create arrays of zeros or ones.
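- `np.arange()` generates values with a fixed step over a half-open range (the stop value is excluded).
- `np.linspace()` creates a fixed number of evenly spaced values over an interval (the stop value is included by default).
- `np.random.rand()` generates uniformly distributed random values in the range [0, 1).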

These tools allow flexible ways to initialize data for analysis or simulation.

In [5]:
# Various ways to create arrays
print(np.array([1, 2, 3]))                # Create from a list
print(np.zeros((2, 3)))                   # 2x3 array of zeros
print(np.ones(5))                         # 1D array of ones
print(np.arange(0, 10, 2))                # Evenly spaced values from 0 to 8
print(np.linspace(0, 1, 5))               # 5 values from 0 to 1 (inclusive)
print(np.random.rand(3, 4))               # 3x4 array of random values between 0 and 1

[1 2 3]
[[0. 0. 0.]
 [0. 0. 0.]]
[1. 1. 1. 1. 1.]
[0 2 4 6 8]
[0.   0.25 0.5  0.75 1.  ]
[[0.78799679 0.20580319 0.23940884 0.85849995]
 [0.277503   0.87401744 0.28678102 0.33413595]
 [0.08126722 0.6592306  0.76912676 0.38525132]]


> Tip: Choose the method based on the structure or range of data you want to initialize.

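The difference between `np.arange()` and `np.linspace()` is easiest to see side by side. The sketch below also uses `np.random.default_rng()` with a seed so that random values are reproducible between runs; the seed value 42 is arbitrary and the variable names are illustrative.

```python
import numpy as np

# arange: fixed step, stop value excluded
steps = np.arange(0, 1, 0.25)
print(steps)                      # [0.   0.25 0.5  0.75]

# linspace: fixed number of points, stop value included by default
points = np.linspace(0, 1, 5)
print(points)                     # [0.   0.25 0.5  0.75 1.  ]

# Seeded generators: the same seed reproduces the same values
rng_a = np.random.default_rng(seed=42)   # 42 is an arbitrary seed
rng_b = np.random.default_rng(seed=42)
sample_a = rng_a.random((3, 4))          # 3x4 array of values in [0, 1)
sample_b = rng_b.random((3, 4))
print(np.array_equal(sample_a, sample_b))  # True
```

Seeding is useful when a notebook needs to produce the same output every time it is run, for example when sharing results.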
## Array Properties

Once you create an array, you can inspect its structure and characteristics:
- `.shape`: returns a tuple representing array dimensions.
- `.ndim`: number of dimensions (1D, 2D, etc.).
- `.dtype`: data type of elements (e.g., int, float).

Understanding these attributes is essential for manipulating and reshaping data.

In [6]:
# Exploring array attributes
array = np.array([[1, 2], [3, 4], [5, 6]])
print("Shape:", array.shape)       # (3, 2)
print("Dimensions:", array.ndim)   # 2D array
print("Data Type:", array.dtype)   # Default integer type: int64 on most platforms, int32 on some (e.g. older NumPy on Windows)

Shape: (3, 2)
Dimensions: 2
Data Type: int32

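To show how these attributes change when data is reshaped or converted, here is a small sketch using `reshape()` and `astype()`. The array contents and the names `flat`, `grid`, and `as_float` are illustrative.

```python
import numpy as np

flat = np.arange(6)                  # [0 1 2 3 4 5]
grid = flat.reshape(3, 2)            # same 6 values arranged as 3 rows x 2 columns
as_float = grid.astype(np.float64)   # same shape, elements converted to float

print(flat.shape, flat.ndim)         # (6,) 1
print(grid.shape, grid.ndim)         # (3, 2) 2
print(as_float.dtype)                # float64
print(grid.size)                     # 6 (total number of elements)
```

Checking `.shape` before and after a reshape is a quick way to confirm that the total number of elements is unchanged.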

## Indexing and Slicing

Accessing and selecting elements is key in data transformation:
- Use `[index]` for single elements.
- Use `[start:stop]` for slicing.
- Use `[row, col]` for 2D arrays.
- Use `:` to select full rows or columns.

In [7]:
# Accessing elements in a 1D array
a = np.array([10, 20, 30, 40, 50])
print(a[0])           # First element
print(a[1:4])         # Elements from index 1 to 3
print(a[-1])          # Last element

# Accessing elements in a 2D array
b = np.array([[1, 2, 3], [4, 5, 6]])
print(b[1, 0])        # First element of second row
print(b[0, 2])        # Third element of first row
print(b[:, 1])        # Second column of all rows

10
[20 30 40]
50
4
3
[2 5]


> Slicing is a powerful way to select rows, columns, or subarrays of interest.

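The following sketch extends the slicing patterns above to sub-blocks and stepped slices. It also illustrates that a basic slice is a view onto the original array, so writing through the slice changes the original unless `.copy()` is used. The array `grid` is illustrative.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(grid[0, :])        # First row: [0 1 2 3]
print(grid[:2, 1:3])     # Rows 0-1, columns 1-2: [[1 2], [5 6]]
print(grid[:, ::2])      # Every second column (columns 0 and 2)

# A copy is independent of the original array
block = grid[:2, 1:3].copy()
block[0, 0] = 99
print(grid[0, 1])        # 1 (unchanged)

# A plain slice is a view: writing to it modifies the original
view = grid[:2, 1:3]
view[0, 0] = -1
print(grid[0, 1])        # -1
```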
## Task 1: Temperature Data Analysis

**Instructions:**
- Create a NumPy array with 7 daily temperature readings.
- Use indexing and slicing to:
  - Print weekday temperatures (Mon–Fri)
  - Print weekend temperatures (Sat–Sun)
- Find the highest and lowest temperatures using built-in functions.

In [8]:
# Task: Analyze temperature data for a week
temps = np.array([22.5, 23.1, 21.8, 24.0, 23.3, 22.9, 23.7])

# Print weekday and weekend temperatures
print("Weekday temps:", temps[:5])     # Monday to Friday
print("Weekend temps:", temps[-2:])    # Saturday and Sunday

# Find the max and min temperatures of the week
print("Max temp:", np.max(temps))
print("Min temp:", np.min(temps))

Weekday temps: [22.5 23.1 21.8 24.  23.3]
Weekend temps: [22.9 23.7]
Max temp: 24.0
Min temp: 21.8


> Note: Arrays make it simple to split data into meaningful segments like weekdays/weekends.

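As a possible extension of this task, `np.argmax()` and `np.argmin()` return the position of the extreme values, which can be mapped back to a day. The `days` list is an assumed labeling that follows the Monday-first order used above.

```python
import numpy as np

temps = np.array([22.5, 23.1, 21.8, 24.0, 23.3, 22.9, 23.7])
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]  # assumed labels

hottest = np.argmax(temps)   # index of the highest reading
coldest = np.argmin(temps)   # index of the lowest reading

print("Hottest day:", days[hottest], temps[hottest])  # Hottest day: Thu 24.0
print("Coldest day:", days[coldest], temps[coldest])  # Coldest day: Wed 21.8

# Slices can feed directly into aggregate functions
print("Weekday mean:", np.round(temps[:5].mean(), 2))
print("Weekend mean:", np.round(temps[-2:].mean(), 2))
```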
## Task 2: Mathematical Operations on Arrays

**Instructions:**
- Create two arrays representing weekly sales for two weeks.
- Use element-wise addition to compute total daily sales.
- Calculate percentage increase from Week 1 to Week 2.
- Calculate a 3-day moving average using convolution.

In [10]:
# Task: Compare weekly sales data
week1 = np.array([150, 200, 250, 175, 300, 220, 190])
week2 = np.array([180, 210, 260, 200, 330, 240, 210])

# Total daily sales
total = week1 + week2
print("Total daily sales:", total)

# Percentage increase between Week 1 and Week 2
increase = ((week2 - week1) / week1) * 100
print("% Increase:", np.round(increase, 2))

# 3-day moving average using convolution
moving_avg = np.convolve(week2, np.ones(3)/3, mode='valid')
print("3-day moving average:", np.round(moving_avg, 2))

Total daily sales: [330 410 510 375 630 460 400]
% Increase: [20.    5.    4.   14.29 10.    9.09 10.53]
3-day moving average: [216.67 223.33 263.33 256.67 260.  ]

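To see why convolution yields a moving average, the sketch below compares `np.convolve()` with an explicit mean over each 3-day window. Convolving with three equal weights of 1/3 sums each window and divides by 3, and `mode='valid'` keeps only windows that fit entirely inside the data. The names `window`, `conv_avg`, and `loop_avg` are illustrative.

```python
import numpy as np

week2 = np.array([180, 210, 260, 200, 330, 240, 210])  # same data as above
window = 3

# Convolution with equal weights 1/window
conv_avg = np.convolve(week2, np.ones(window) / window, mode="valid")

# Explicit version: mean of each consecutive 3-day slice
loop_avg = np.array([
    week2[i:i + window].mean()
    for i in range(len(week2) - window + 1)
])

print(np.allclose(conv_avg, loop_avg))  # True
print(len(conv_avg))                    # 5, i.e. 7 - 3 + 1 full windows
```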

## Task 3: Normalizing and Rescaling Exam Scores

**Instructions:**
- Create a NumPy array of student exam scores.
- Normalize scores to a 0–1 range.
- Rescale the normalized scores to 50–100.

In [13]:
# Task: Normalize and rescale exam scores
scores = np.array([65, 70, 85, 90, 75, 60, 95])

# Normalize to [0, 1] range
min_score, max_score = np.min(scores), np.max(scores)
normalized = (scores - min_score) / (max_score - min_score)
print("Normalized (0–1):", np.round(normalized, 2))

# Rescale to 50–100 range
scaled = normalized * 50 + 50
print("Scaled (50–100):", np.round(scaled, 2))

Normalized (0–1): [0.14 0.29 0.71 0.86 0.43 0.   1.  ]
Scaled (50–100): [ 57.14  64.29  85.71  92.86  71.43  50.   100.  ]

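The two steps above can be folded into one reusable function. This is a minimal sketch: `rescale` is a hypothetical helper, not a NumPy function, and the guard against identical values is an added assumption to avoid division by zero.

```python
import numpy as np

def rescale(values, new_min=0.0, new_max=1.0):
    """Linearly map values so the minimum becomes new_min and the maximum new_max.

    Hypothetical helper for illustration; not part of NumPy.
    """
    values = np.asarray(values, dtype=float)
    old_min, old_max = values.min(), values.max()
    if old_max == old_min:
        raise ValueError("all values are equal; min-max scaling is undefined")
    normalized = (values - old_min) / (old_max - old_min)   # now in [0, 1]
    return normalized * (new_max - new_min) + new_min       # now in [new_min, new_max]

scores = np.array([65, 70, 85, 90, 75, 60, 95])
scaled = rescale(scores, 50, 100)
print(np.round(scaled, 2))   # same values as the "Scaled (50–100)" output above
```

Writing the target range as parameters makes the formula `normalized * (new_max - new_min) + new_min` explicit, of which `normalized * 50 + 50` is the 50–100 case.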

## Task 4: NumPy Speed Test

**Instructions:**
- Generate 1 million random numbers in a Python list and a NumPy array.
- Sum the values using the built-in `sum()` (list) and `np.sum()` (array).
- Compare execution times.

In [14]:
# Task: Compare performance of sum in list vs NumPy array
# Import necessary libraries
import time
import random

# Generate 1 million random numbers in a list
py_list = [random.random() for _ in range(1_000_000)]
start = time.time()
total = sum(py_list)
print("List sum time:", time.time() - start)

# Generate 1 million random numbers in a NumPy array
np_array = np.random.rand(1_000_000)
start = time.time()
total = np.sum(np_array)
print("NumPy sum time:", time.time() - start)

List sum time: 0.010819196701049805
NumPy sum time: 0.0


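> Note: NumPy's speed comes from vectorized operations that run in compiled code over contiguous, fixed-type memory. The `0.0` reading above reflects the limited resolution of `time.time()` on some platforms rather than a truly instantaneous sum; `time.perf_counter()` or the `timeit` module gives finer measurements.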

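For a more reliable comparison, one option is the standard-library `timeit` module, which uses a high-resolution timer and makes repeated runs easy. The sketch below takes the best of five runs for each approach; the repeat count and variable names are arbitrary choices, and actual timings will vary by machine.

```python
import random
import timeit

import numpy as np

n = 1_000_000
py_list = [random.random() for _ in range(n)]
np_array = np.array(py_list)   # same values, so the two sums are comparable

# Best of 5 single runs for each approach
list_time = min(timeit.repeat(lambda: sum(py_list), number=1, repeat=5))
numpy_time = min(timeit.repeat(lambda: np.sum(np_array), number=1, repeat=5))

print(f"List sum:  {list_time:.6f} s")
print(f"NumPy sum: {numpy_time:.6f} s")
print("Sums agree:", np.isclose(sum(py_list), np.sum(np_array)))
```

Taking the minimum over several runs reduces the influence of background activity on the measurement.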
## Final Activity: Weather Sensor Analysis

**Instructions:**
- Simulate 3 sensor readings for 7 days in a 2D array.
- Calculate:
  - Mean temperature per day
  - Sensor with the highest weekly average
  - Overall min and max temps
  - Standard deviation per sensor
  - Center the dataset by subtracting each sensor's mean from its readings

In [15]:
# Task: Analyze temperature readings from 3 sensors over 7 days
data = np.array([
    [22.1, 22.5, 21.9],
    [23.2, 22.8, 22.4],
    [21.7, 22.0, 21.5],
    [24.0, 23.8, 23.2],
    [23.0, 22.9, 22.1],
    [22.5, 23.4, 22.8],
    [23.6, 24.0, 23.7]
])

# Mean temperature per day
daily_avg = np.mean(data, axis=1)
print("Daily Averages:", np.round(daily_avg, 2))

# Mean temperature per sensor across all days
sensor_avg = np.mean(data, axis=0)
print("Sensor Averages:", np.round(sensor_avg, 2))
print("Best Sensor Index:", np.argmax(sensor_avg))

# Minimum and maximum temperature overall
print("Min Temp:", np.min(data))
print("Max Temp:", np.max(data))

# Standard deviation per sensor
sensor_std = np.std(data, axis=0)
print("Sensor STD:", np.round(sensor_std, 2))

# Center the data by subtracting sensor mean from each reading
centered = data - sensor_avg
print("Centered Data (first row):", np.round(centered[0], 2))

Daily Averages: [22.17 22.8  21.73 23.67 22.67 22.9  23.77]
Sensor Averages: [22.87 23.06 22.51]
Best Sensor Index: 1
Min Temp: 21.5
Max Temp: 24.0
Sensor STD: [0.76 0.66 0.71]
Centered Data (first row): [-0.77 -0.56 -0.61]

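The centering step relies on broadcasting: an array of shape `(7, 3)` minus an array of shape `(3,)` subtracts each sensor's mean down its column. Centering by day instead needs the daily means as a column, which `keepdims=True` provides. The sketch below uses only the first four days of the readings to keep it short; the names `centered_by_sensor` and `centered_by_day` are illustrative.

```python
import numpy as np

# First four days of the readings above (rows = days, columns = sensors)
data = np.array([
    [22.1, 22.5, 21.9],
    [23.2, 22.8, 22.4],
    [21.7, 22.0, 21.5],
    [24.0, 23.8, 23.2],
])

sensor_avg = data.mean(axis=0)                 # shape (3,): one mean per sensor
daily_avg = data.mean(axis=1, keepdims=True)   # shape (4, 1): one mean per day

centered_by_sensor = data - sensor_avg   # (4, 3) - (3,)   -> subtracts per column
centered_by_day = data - daily_avg       # (4, 3) - (4, 1) -> subtracts per row

print(sensor_avg.shape, daily_avg.shape)                 # (3,) (4, 1)
print(np.allclose(centered_by_sensor.mean(axis=0), 0))   # True
print(np.allclose(centered_by_day.mean(axis=1), 0))      # True
```

Without `keepdims=True`, the daily means would have shape `(4,)`, which does not broadcast against `(4, 3)` and raises an error.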

---

## Summary
- NumPy provides efficient tools for numerical computation
- Arrays enable fast, memory-efficient operations
- Use slicing/indexing to manipulate data
- Normalize and transform values for analysis

📚 **Explore More**:
- [NumPy Official Docs](https://numpy.org/doc/)
- [W3Schools - NumPy Tutorial](https://www.w3schools.com/python/numpy/default.asp)


✅ Keep practicing, try new datasets, and challenge yourself to apply these tools to real-world problems. You've got this!

---