# Introduction to NumPy and pandas (Beginner-Friendly)

This notebook introduces **NumPy** and **pandas**, two of the most popular Python libraries for data manipulation and analysis.

We assume **no prior experience** and explain every concept in detail.

## 1. What is NumPy?

NumPy (Numerical Python) is a library for fast mathematical operations on arrays.

**Why NumPy?**
- Faster than Python lists for numerical work, because operations run in optimized compiled code.
- Supports multi-dimensional arrays.
- Provides many mathematical functions.

Example use cases: image processing, scientific computing, and data analysis.

In [None]:
import numpy as np

# Creating a simple NumPy array
arr = np.array([1, 2, 3, 4, 5])
print("Array:", arr)
print("Type:", type(arr))

### Key Difference Between Python Lists and NumPy Arrays
- Python lists are slower for numerical operations.
- NumPy arrays are optimized and support **vectorized operations**.

**Example:** Multiply every element by 2.

In [None]:
py_list = [1, 2, 3, 4, 5]
np_array = np.array(py_list)

# Multiply every element by 2
# A list needs a loop or a list comprehension:
py_result = [x * 2 for x in py_list]
np_result = np_array * 2  # Vectorized operation: no explicit loop

print("List result:", py_result)
print("NumPy result:", np_result)
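
To get a rough feel for the speed difference, the sketch below times both approaches with the standard-library `timeit` module. The array size (`n`) and the number of repeats are arbitrary choices for illustration, and the timings you see will depend on your machine.

```python
import timeit

import numpy as np

# Arbitrary size chosen for illustration
n = 100_000
big_list = list(range(n))
big_array = np.arange(n)

# Time 20 runs of each approach
list_time = timeit.timeit(lambda: [x * 2 for x in big_list], number=20)
array_time = timeit.timeit(lambda: big_array * 2, number=20)

print(f"List comprehension: {list_time:.4f} s")
print(f"NumPy vectorized:   {array_time:.4f} s")

# Both approaches produce the same values
same = [x * 2 for x in big_list] == (big_array * 2).tolist()
print("Same result:", same)
```

On most machines the vectorized version should come out noticeably faster, but treat the numbers as indicative only.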

## 2. Creating NumPy Arrays

There are many ways to create NumPy arrays:

In [None]:
# From a Python list
arr = np.array([1, 2, 3])
print(arr)

# Using arange (like range, but returns an array; the stop value 10 is excluded)
arr2 = np.arange(0, 10, 2)
print("arange:", arr2)

# Using linspace (5 evenly spaced values from 0 to 1, both ends included)
arr3 = np.linspace(0, 1, 5)
print("linspace:", arr3)

# Zeros and ones
zeros_arr = np.zeros((2, 3))
ones_arr = np.ones((3, 3))
print("Zeros:\n", zeros_arr)
print("Ones:\n", ones_arr)
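
After creating an array it often helps to check what you got. The short sketch below inspects the `shape` and `dtype` attributes and shows that `arange` stops before its end value while `linspace` includes it. The variable names `evens` and `points` are just illustrative.

```python
import numpy as np

zeros_arr = np.zeros((2, 3))
evens = np.arange(0, 10, 2)     # stop value 10 is excluded
points = np.linspace(0, 1, 5)   # stop value 1 is included

print(zeros_arr.shape)   # (2, 3)
print(zeros_arr.dtype)   # float64
print(evens.tolist())    # [0, 2, 4, 6, 8]
print(points.tolist())   # [0.0, 0.25, 0.5, 0.75, 1.0]
```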

## 3. Indexing and Slicing in NumPy

Just like lists, you can access elements using indexes, but NumPy also supports multi-dimensional indexing.

In [None]:
arr = np.array([10, 20, 30, 40, 50])
print(arr[0])  # First element
print(arr[1:4])  # Elements at indexes 1 to 3 (index 4 is excluded)

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("Matrix:")
print(matrix)
print("Element at row 0, col 1:", matrix[0, 1])  # Indexes start at 0
print("First two rows:\n", matrix[:2, :])
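
The same `[row, column]` notation also works with slices on both axes. As a small sketch using the same 3x3 matrix (the variable names are illustrative):

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

second_col = matrix[:, 1]       # all rows, column 1
bottom_right = matrix[1:, 1:]   # rows 1-2, columns 1-2
last_row = matrix[-1]           # negative indexes count from the end

print(second_col)     # [2 5 8]
print(bottom_right)   # [[5 6]
                      #  [8 9]]
print(last_row)       # [7 8 9]
```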

## 4. Array Operations (Element-wise and Broadcasting)

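NumPy supports element-wise operations and **broadcasting**: when two arrays have different but compatible shapes, NumPy treats the smaller one as if it were repeated to match the larger one, without actually copying the data.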

In [None]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
print("Addition:", arr1 + arr2)
print("Multiplication:", arr1 * arr2)

# Broadcasting example
arr3 = np.array([1, 2, 3])
print("Broadcasted add:", arr3 + 5)
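
Broadcasting is not limited to a single number. The sketch below, with illustrative variable names, adds a 1D row and a column-shaped array to a 2D matrix. NumPy lines the shapes up from the right and stretches any dimension of size 1.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])       # shape (2, 3)
row = np.array([10, 20, 30])         # shape (3,)
col = np.array([[100], [200]])       # shape (2, 1)

# The row is added to every row of the matrix
row_result = matrix + row
print(row_result)   # [[11 22 33]
                    #  [14 25 36]]

# The column is added to every column of the matrix
col_result = matrix + col
print(col_result)   # [[101 102 103]
                    #  [204 205 206]]
```

If the shapes are not compatible (for example, a `(2, 3)` matrix and a 1D array of length 2), NumPy raises a `ValueError` instead of guessing.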

## 5. Useful NumPy Functions

- `reshape()`: Change the shape of an array.
- `sum()`, `mean()`, `max()`, `min()`: Aggregation.
- `np.random`: Random number generation.

In [None]:
arr = np.arange(1, 7)
print("Original:", arr)
print("Reshaped 2x3:\n", arr.reshape(2, 3))

print("Sum:", arr.sum())
print("Mean:", arr.mean())
print("Max:", arr.max())
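
Two things from the list above are worth a quick sketch: aggregating along one axis of a 2D array, and drawing random numbers. The example uses `np.random.default_rng`, NumPy's current random generator interface. The seed value and variable names are arbitrary choices for illustration.

```python
import numpy as np

grid = np.arange(1, 7).reshape(2, 3)   # [[1 2 3], [4 5 6]]

# axis=0 aggregates down the rows (one result per column)
col_sums = grid.sum(axis=0)
# axis=1 aggregates across the columns (one result per row)
row_sums = grid.sum(axis=1)

print("Column sums:", col_sums)   # [5 7 9]
print("Row sums:", row_sums)      # [ 6 15]
print("Min:", grid.min())         # 1

# A seeded generator gives the same numbers on every run
rng = np.random.default_rng(seed=42)
rolls = rng.integers(1, 7, size=5)   # five integers from 1 to 6
print("Dice rolls:", rolls)
```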

# Introduction to pandas

pandas is used for **data manipulation and analysis**.

It has two main data structures:
- **Series:** 1D labeled array.
- **DataFrame:** 2D labeled table of rows and columns (like an Excel sheet).

In [None]:
import pandas as pd

# Creating a Series
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
print(s)

# Accessing elements
print("Element with index b:", s['b'])
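
A Series behaves much like a NumPy array with labels attached. As a brief sketch (variable names are illustrative), arithmetic is vectorized, and a condition can be used to filter values:

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

doubled = s * 2        # vectorized, labels are kept
large = s[s > 15]      # keep only values greater than 15

print(doubled.tolist())      # [20, 40, 60]
print(large.index.tolist())  # ['b', 'c']
print(s.loc['b'])            # 20 (access by label)
print(s.iloc[0])             # 10 (access by position)
```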

## Creating DataFrames

A DataFrame is like a table (rows and columns).

In [None]:
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}

# Create DataFrame
df = pd.DataFrame(data)
print(df)
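
Once a DataFrame exists, a few attributes give a quick overview of its size and columns. A minimal sketch using the same data:

```python
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}
df = pd.DataFrame(data)

print(df.shape)           # (3, 3): 3 rows, 3 columns
print(list(df.columns))   # ['Name', 'Age', 'City']
print(len(df))            # 3 (number of rows)
```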

## Accessing Data in DataFrames

- `df['column']`: Access a column.
- `df.loc[row_label]`: Access a row by its index label.
- `df.iloc[row_position]`: Access a row by its integer position (starting at 0).

In [None]:
print(df['Name'])  # Access column
print(df.loc[0])     # Row with index label 0 (the first row here)
print(df.iloc[1])    # Second row (by integer position)
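
With the default index, labels and positions are both 0, 1, 2, so `loc` and `iloc` look interchangeable. The sketch below uses a hypothetical string index (`'a'`, `'b'`, `'c'`) to make the difference visible.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]},
    index=['a', 'b', 'c'],   # custom labels, chosen for illustration
)

by_label = df.loc['b']       # row whose label is 'b'
by_position = df.iloc[1]     # row at position 1 (the second row)
print(by_label['Name'])      # Bob
print(by_position['Name'])   # Bob

# Both accept a row and a column together
print(df.loc['b', 'Age'])    # 30
print(df.iloc[0, 0])         # Alice
```

Here `df.loc[1]` would raise a `KeyError`, because no row carries the label `1`.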

## Basic DataFrame Operations

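- `df.head()`: First 5 rows by default; pass a number, such as `df.head(2)`, to change it.
- `df.tail()`: Last 5 rows by default.
- `df.describe()`: Summary statistics (count, mean, min, max, and so on) for numeric columns.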

In [None]:
print(df.head())
print(df.describe())
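
Because the example table has only three rows, `head()` and `tail()` with their defaults return everything. A short sketch with explicit row counts, and with one value pulled out of the `describe()` result:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
})

first_two = df.head(2)   # first 2 rows
last_one = df.tail(1)    # last row only
print(first_two)
print(last_one)

# describe() returns a DataFrame, so single statistics can be looked up
stats = df.describe()
print(stats.loc['mean', 'Age'])   # 30.0
```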

## FAQs

- **Q:** What is the difference between a list and a NumPy array? 
  - **A:** NumPy arrays store elements of a single data type, are faster for numerical work, and support vectorized operations.
- **Q:** Why use pandas DataFrame? 
  - **A:** It makes tabular data (like an Excel sheet) easy to inspect, select from, and summarize.
- **Q:** Can I convert between NumPy and pandas? 
  - **A:** Yes. Use `df.to_numpy()` to get an array from a DataFrame, and `pd.DataFrame(array)` to build a DataFrame from an array.
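
The conversion mentioned in the last answer can be sketched as a round trip. The column names `'x'` and `'y'` are made up for this example.

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 2], [3, 4]])

# NumPy -> pandas: column names are optional but helpful
df = pd.DataFrame(arr, columns=['x', 'y'])
print(df)

# pandas -> NumPy
back = df.to_numpy()
print(np.array_equal(arr, back))   # True
```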

## Exercises (with solutions at the bottom)

1. Create a NumPy array of numbers 1 to 20 and reshape it into 4x5.
2. Find the mean and max of `[10, 20, 30, 40, 50]`.
3. Create a pandas DataFrame of 3 students with name, marks, and grade.
4. Access the marks of the second student.
5. Get summary statistics of your DataFrame.