A **comprehensive NumPy guide** with **concepts, syntax, explanations, and hands-on examples**, organized into **categories for clarity** and covering arrays, operations, indexing, linear algebra, statistics, and randomization.

---

# **NumPy Master Table**

| **Concept**                 | **Syntax / Function**       | **Description**                          | **Hands-on Example**                                                         |
| --------------------------- | --------------------------- | ---------------------------------------- | ---------------------------------------------------------------------------- |
| **Importing NumPy**         | `import numpy as np`        | Import the NumPy library                 | `import numpy as np`                                                         |
| **Array Creation**          | `np.array([1,2,3])`         | Create 1D array                          | `arr = np.array([1,2,3]); print(arr)`                                        |
|                             | `np.zeros((3,3))`           | Create 2D array of zeros                 | `zeros = np.zeros((3,3)); print(zeros)`                                      |
|                             | `np.ones((2,4))`            | Create 2D array of ones                  | `ones = np.ones((2,4)); print(ones)`                                         |
|                             | `np.arange(0,10,2)`         | Create array with a range and step       | `arr = np.arange(0,10,2); print(arr)`                                        |
|                             | `np.linspace(0,1,5)`        | Create array with linearly spaced values | `arr = np.linspace(0,1,5); print(arr)`                                       |
|                             | `np.eye(3)`                 | Create identity matrix                   | `eye = np.eye(3); print(eye)`                                                |
| **Array Properties**        | `arr.shape`                 | Shape of the array                       | `arr = np.array([[1,2,3],[4,5,6]]); print(arr.shape)`                        |
|                             | `arr.ndim`                  | Number of dimensions                     | `print(arr.ndim)`                                                            |
|                             | `arr.size`                  | Total number of elements                 | `print(arr.size)`                                                            |
|                             | `arr.dtype`                 | Data type of elements                    | `print(arr.dtype)`                                                           |
| **Array Reshaping**         | `arr.reshape(3,2)`          | Change shape of array                    | `arr = np.arange(6).reshape(3,2); print(arr)`                                |
|                             | `arr.flatten()`             | Flatten multi-dimensional array          | `arr = np.array([[1,2],[3,4]]); print(arr.flatten())`                        |
|                             | `arr.T`                     | Transpose array                          | `arr = np.array([[1,2],[3,4]]); print(arr.T)`                                |
| **Indexing & Slicing**      | `arr[0]`                    | Access first row or element              | `arr = np.array([10,20,30]); print(arr[0])`                                  |
|                             | `arr[1,2]`                  | Access element in 2D array               | `arr = np.array([[1,2,3],[4,5,6]]); print(arr[1,2])`                         |
|                             | `arr[:,1]`                  | Access all rows, column 1                | `arr = np.array([[1,2,3],[4,5,6]]); print(arr[:,1])`                         |
|                             | `arr[1:3]`                  | Slice array                              | `arr = np.arange(10); print(arr[1:3])`                                       |
| **Mathematical Operations** | `np.add(arr1, arr2)`        | Element-wise addition                    | `arr1=np.array([1,2]); arr2=np.array([3,4]); print(np.add(arr1,arr2))`       |
|                             | `np.subtract(arr1, arr2)`   | Element-wise subtraction                 | `print(np.subtract(arr1,arr2))`                                              |
|                             | `np.multiply(arr1, arr2)`   | Element-wise multiplication              | `print(np.multiply(arr1,arr2))`                                              |
|                             | `np.divide(arr1, arr2)`     | Element-wise division                    | `print(np.divide(arr1,arr2))`                                                |
|                             | `arr1 + 5`                  | Broadcasting                             | `arr = np.array([1,2,3]); print(arr+5)`                                      |
|                             | `arr1 * 2`                  | Broadcasting                             | `print(arr*2)`                                                               |
|                             | `np.dot(arr1, arr2)`        | Dot product                              | `arr1=np.array([1,2]); arr2=np.array([3,4]); print(np.dot(arr1,arr2))`       |
|                             | `np.cross(arr1, arr2)`      | Cross product (3D vectors)               | `arr1=np.array([1,0,0]); arr2=np.array([0,1,0]); print(np.cross(arr1,arr2))` |
| **Statistical Functions**   | `np.mean(arr)`              | Compute mean                             | `arr=np.array([1,2,3,4]); print(np.mean(arr))`                               |
|                             | `np.median(arr)`            | Compute median                           | `print(np.median(arr))`                                                      |
|                             | `np.std(arr)`               | Standard deviation                       | `print(np.std(arr))`                                                         |
|                             | `np.var(arr)`               | Variance                                 | `print(np.var(arr))`                                                         |
|                             | `np.min(arr)`               | Minimum value                            | `print(np.min(arr))`                                                         |
|                             | `np.max(arr)`               | Maximum value                            | `print(np.max(arr))`                                                         |
|                             | `np.sum(arr)`               | Sum of elements                          | `print(np.sum(arr))`                                                         |
|                             | `np.argmin(arr)`            | Index of min value                       | `print(np.argmin(arr))`                                                      |
|                             | `np.argmax(arr)`            | Index of max value                       | `print(np.argmax(arr))`                                                      |
| **Logical Operations**      | `arr > 2`                   | Compare elements                         | `arr=np.array([1,2,3]); print(arr>2)`                                        |
|                             | `np.all(arr>0)`             | Check all elements                       | `print(np.all(arr>0))`                                                       |
|                             | `np.any(arr>2)`             | Check if any element satisfies           | `print(np.any(arr>2))`                                                       |
| **Random Numbers**          | `np.random.rand(3,3)`       | Random numbers [0,1)                     | `print(np.random.rand(3,3))`                                                 |
|                             | `np.random.randn(3,3)`      | Standard normal distribution             | `print(np.random.randn(3,3))`                                                |
|                             | `np.random.randint(0,10,5)` | Random integers                          | `print(np.random.randint(0,10,5))`                                           |
|                             | `np.random.seed(0)`         | Set seed for reproducibility             | `np.random.seed(0); print(np.random.rand(3))`                                |
| **Linear Algebra**          | `np.linalg.inv(arr)`        | Matrix inverse                           | `arr=np.array([[1,2],[3,4]]); print(np.linalg.inv(arr))`                     |
|                             | `np.linalg.det(arr)`        | Determinant                              | `print(np.linalg.det(arr))`                                                  |
|                             | `np.linalg.eig(arr)`        | Eigenvalues & eigenvectors               | `print(np.linalg.eig(arr))`                                                  |
|                             | `np.linalg.norm(arr)`       | Vector norm                              | `vec=np.array([3,4]); print(np.linalg.norm(vec))`                            |
| **Stacking & Splitting**    | `np.vstack([arr1, arr2])`   | Stack arrays vertically                  | `arr1=np.array([1,2]); arr2=np.array([3,4]); print(np.vstack([arr1,arr2]))`  |
|                             | `np.hstack([arr1, arr2])`   | Stack arrays horizontally                | `print(np.hstack([arr1,arr2]))`                                              |
|                             | `np.split(arr, 2)`          | Split array                              | `arr=np.array([1,2,3,4]); print(np.split(arr,2))`                            |
| **Copy & View**             | `arr.copy()`                | Deep copy                                | `arr=np.array([1,2]); arr2=arr.copy()`                                       |
|                             | `arr.view()`                | Shallow copy (view of the same data)     | `arr2 = arr.view()`                                                          |
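
The difference between `copy()` and `view()` in the last two rows is easiest to see by modifying the original array afterwards. The following is a minimal sketch; the variable names `original`, `independent`, and `shared` are illustrative and not part of the table above.

```python
import numpy as np

original = np.array([1, 2, 3])

independent = original.copy()  # owns its own data buffer
shared = original.view()       # new array object over the same buffer

# Change the original after both were created.
original[0] = 99

print(independent)  # the copy keeps the old values
print(shared)       # the view sees the new value
print(np.shares_memory(original, shared))
print(np.shares_memory(original, independent))
```

Use `copy()` when later changes to the source must not leak into the result, and a view when avoiding the extra memory matters more.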

---

###  **Brief Concept Overview**

1. **NumPy Arrays**: Fundamental data structure (n-dimensional array), efficient and fast for numerical operations.
2. **Vectorization**: NumPy allows element-wise operations without explicit loops, making computations faster.
3. **Broadcasting**: Applies element-wise operations to arrays of different but compatible shapes by stretching size-1 dimensions, without copying data.
4. **Linear Algebra**: NumPy provides tools for matrix multiplication, determinants, eigenvalues, norms, and more.
5. **Randomization**: Functions for generating random numbers, integers, and distributions for simulation or ML.
6. **Statistical Functions**: Efficient computations of mean, median, variance, standard deviation, min, max.
7. **Indexing & Slicing**: Extract data efficiently without loops.
8. **Stacking & Splitting**: Combine or divide arrays for data preprocessing.
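
Vectorization and broadcasting from the list above can be sketched in a few lines. This is an illustrative example only; the `prices` array and the 1.1 multiplier are made-up values.

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0])

# Vectorization: one array expression replaces an explicit Python loop.
looped = np.array([p * 1.1 for p in prices])
vectorized = prices * 1.1

# Broadcasting: a (3, 1) column and a (4,) row combine into a (3, 4) grid.
column = np.arange(3).reshape(3, 1)
row = np.arange(4)
grid = column + row
print(grid.shape)

# Shapes that cannot be aligned raise a ValueError instead of guessing.
try:
    np.ones(3) + np.ones(4)
except ValueError as err:
    print("incompatible shapes:", err)
```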

---



A **comprehensive Pandas master table** follows, including **concepts, brief explanations, hands-on examples, and usage**. It is structured to be practical for both learning and reference.

---

# **Pandas Master Table: Concepts, Syntax, Examples, and Usage**

| **Concept**               | **Syntax / Method**                               | **Description**               | **Hands-on Example**                                                  | **Usage**                                |
| ------------------------- | ------------------------------------------------- | ----------------------------- | --------------------------------------------------------------------- | ---------------------------------------- |
| Importing pandas          | `import pandas as pd`                             | Load pandas library           | `import pandas as pd`                                                 | Start using pandas for data manipulation |
| Creating DataFrame        | `pd.DataFrame(data)`                              | Create tabular data structure | `df = pd.DataFrame({'Name':['Alice','Bob'],'Age':[25,30]})`           | Store structured data for analysis       |
| Creating Series           | `pd.Series(data)`                                 | Create 1D labeled array       | `s = pd.Series([10,20,30], index=['a','b','c'])`                      | Handle one-dimensional data              |
| Reading CSV               | `pd.read_csv('file.csv')`                         | Load data from CSV            | `df = pd.read_csv('data.csv')`                                        | Import datasets                          |
| Reading Excel             | `pd.read_excel('file.xlsx')`                      | Load data from Excel          | `df = pd.read_excel('data.xlsx')`                                     | Import Excel data                        |
| Viewing Data              | `df.head(n)`                                      | View first n rows             | `df.head(5)`                                                          | Quick inspection of data                 |
| Viewing Data              | `df.tail(n)`                                      | View last n rows              | `df.tail(5)`                                                          | Inspect end of dataset                   |
| Info about DataFrame      | `df.info()`                                       | Get summary of dataset        | `df.info()`                                                           | Check column types and nulls             |
| Statistical Summary       | `df.describe()`                                   | Summary statistics            | `df.describe()`                                                       | Quick stats on numeric columns           |
| Shape of Data             | `df.shape`                                        | Rows and columns              | `df.shape`                                                            | Understand dataset size                  |
| Column Selection          | `df['col']`                                       | Access single column          | `df['Age']`                                                           | Work with specific columns               |
| Multiple Columns          | `df[['col1','col2']]`                             | Access multiple columns       | `df[['Name','Age']]`                                                  | Subset data                              |
| Row Selection by Index    | `df.loc[index]`                                   | Select row by label           | `df.loc[0]`                                                           | Access specific rows                     |
| Row Selection by Position | `df.iloc[index]`                                  | Select row by position        | `df.iloc[0]`                                                          | Position-based selection                 |
| Slicing Rows              | `df[start:end]`                                   | Select rows by range          | `df[0:3]`                                                             | Subset rows                              |
| Conditional Filtering     | `df[df['col'] > value]`                           | Filter rows by condition      | `df[df['Age']>25]`                                                    | Filter dataset                           |
| Multiple Conditions       | `(df['col1']>x) & (df['col2']<y)`                 | Combine filters               | `df[(df['Age']>25) & (df['Salary']>5000)]`                            | Complex filtering                        |
| Adding Column             | `df['new_col'] = ...`                             | Add new column                | `df['SalaryTax'] = df['Salary']*0.1`                                  | Feature engineering                      |
| Deleting Column           | `df.drop('col', axis=1, inplace=True)`            | Remove column                 | `df.drop('SalaryTax', axis=1, inplace=True)`                          | Drop unnecessary columns                 |
| Renaming Columns          | `df.rename(columns={'old':'new'}, inplace=True)`  | Rename columns                | `df.rename(columns={'Age':'Years'}, inplace=True)`                    | Clean column names                       |
| Handling Missing Values   | `df.isnull()`                                     | Identify nulls                | `df.isnull().sum()`                                                   | Detect missing data                      |
| Handling Missing Values   | `df.dropna()`                                     | Drop rows with nulls          | `df.dropna(inplace=True)`                                             | Remove incomplete data                   |
| Handling Missing Values   | `df.fillna(value)`                                | Fill nulls                    | `df['Age'] = df['Age'].fillna(0)`                                     | Replace missing data                     |
| Sorting                   | `df.sort_values('col')`                           | Sort dataset                  | `df.sort_values('Age')`                                               | Order data                               |
| Sorting Descending        | `df.sort_values('col', ascending=False)`          | Descending sort               | `df.sort_values('Age', ascending=False)`                              | Sort data in reverse order               |
| Reset Index               | `df.reset_index(drop=True, inplace=True)`         | Reset row index               | `df.reset_index(drop=True, inplace=True)`                             | Clean index after filtering              |
| Set Index                 | `df.set_index('col', inplace=True)`               | Use column as index           | `df.set_index('Name', inplace=True)`                                  | Easier data selection                    |
| Grouping                  | `df.groupby('col')`                               | Group data                    | `df.groupby('Department')['Salary'].mean()`                           | Aggregate data                           |
| Aggregations              | `df['col'].sum()`                                 | Sum values                    | `df['Salary'].sum()`                                                  | Summarize data                           |
| Aggregations              | `df['col'].mean()`                                | Mean values                   | `df['Salary'].mean()`                                                 | Summary statistics                       |
| Aggregations              | `df['col'].max()`, `df['col'].min()`              | Max/min values                | `df['Age'].max()`                                                     | Find extremes                            |
| Value Counts              | `df['col'].value_counts()`                        | Count rows per unique value   | `df['Department'].value_counts()`                                     | Frequency distribution                   |
| Unique Values             | `df['col'].unique()`                              | Find unique values            | `df['Department'].unique()`                                           | Identify categories                      |
| Map / Apply               | `df['col'].map(func)`                             | Apply function to column      | `df['Age'].map(lambda x:x+1)`                                         | Transform data                           |
| Apply Function            | `df.apply(func)`                                  | Apply to DataFrame            | `df.apply(lambda x:x*2)`                                              | Column-wise transformation               |
| Merging                   | `pd.merge(df1, df2, on='key')`                    | Join datasets                 | `pd.merge(df1, df2, on='ID')`                                         | Combine datasets                         |
| Concatenation             | `pd.concat([df1, df2])`                           | Stack datasets                | `pd.concat([df1, df2])`                                               | Combine vertically/horizontally          |
| Pivot Table               | `df.pivot_table(values, index, columns, aggfunc)` | Summarize data                | `df.pivot_table(values='Salary', index='Department', aggfunc='mean')` | Aggregated view                          |
| Melt                      | `pd.melt(df, id_vars, value_vars)`                | Convert wide to long format   | `pd.melt(df, id_vars='Name', value_vars=['Math','Science'])`          | Reshape data                             |
| String Operations         | `df['col'].str.method()`                          | String manipulation           | `df['Name'].str.upper()`                                              | Clean text columns                       |
| DateTime Operations       | `pd.to_datetime(df['col'])`                       | Convert to datetime           | `df['JoinDate'] = pd.to_datetime(df['JoinDate'])`                     | Time-series analysis                     |
| Extract Date Parts        | `df['col'].dt.year`, `.dt.month`, `.dt.day`       | Extract date info             | `df['Year'] = df['JoinDate'].dt.year`                                 | Feature engineering for dates            |
| Sampling                  | `df.sample(n=5)`                                  | Random sample                 | `df.sample(5)`                                                        | Explore data randomly                    |
| Duplicates                | `df.duplicated()`                                 | Check duplicates              | `df.duplicated().sum()`                                               | Identify repeated rows                   |
| Drop Duplicates           | `df.drop_duplicates()`                            | Remove duplicates             | `df.drop_duplicates(inplace=True)`                                    | Clean dataset                            |
| Correlation               | `df.corr(numeric_only=True)`                      | Correlation between columns   | `df.corr(numeric_only=True)`                                          | Feature selection                        |
| Plotting                  | `df['col'].plot(kind='hist')`                     | Plot column                   | `df['Salary'].plot(kind='hist')`                                      | Quick visualization                      |
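
The merging and concatenation rows are the ones where the result is hardest to picture from a one-liner. The sketch below uses two small made-up tables (`employees` and `salaries` are hypothetical names) to show how `pd.merge` treats keys that appear in only one table, and how `pd.concat` stacks rows.

```python
import pandas as pd

employees = pd.DataFrame({"ID": [1, 2, 3], "Name": ["Alice", "Bob", "Cara"]})
salaries = pd.DataFrame({"ID": [1, 2, 4], "Salary": [5000, 6000, 7000]})

# Default inner join: only IDs present in both tables survive.
inner = pd.merge(employees, salaries, on="ID")

# Left join: every employee is kept; a missing salary becomes NaN.
left = pd.merge(employees, salaries, on="ID", how="left")

# concat stacks rows; ignore_index=True rebuilds a clean 0..n-1 index.
more = pd.DataFrame({"ID": [5], "Name": ["Dan"]})
stacked = pd.concat([employees, more], ignore_index=True)

print(inner)
print(left)
print(stacked)
```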

---

### 🔹 Brief Concept Summary for Pandas:

1. **DataFrame** – 2D table structure (rows & columns) for storing datasets.
2. **Series** – 1D labeled array.
3. **Indexing/Selection** – `loc`, `iloc` for row/column selection.
4. **Aggregation & Grouping** – `groupby`, `sum`, `mean` to summarize data.
5. **Merging/Joining** – `merge`, `concat` to combine datasets.
6. **Handling Missing Values** – `isnull`, `fillna`, `dropna`.
7. **Reshaping** – `pivot_table`, `melt` to reshape data.
8. **String & DateTime Operations** – Clean textual or temporal data.
9. **Visualization** – Use built-in `plot` for quick graphs.
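
As a rough illustration of the reshaping item above, the following sketch converts a small made-up score table from wide to long format with `melt` and back again with `pivot_table`. The column names `Subject` and `Score` are assumptions chosen for the example.

```python
import pandas as pd

wide = pd.DataFrame(
    {"Name": ["Alice", "Bob"], "Math": [90, 80], "Science": [85, 95]}
)

# Wide -> long: one row per (Name, Subject) pair.
long = pd.melt(
    wide,
    id_vars="Name",
    value_vars=["Math", "Science"],
    var_name="Subject",
    value_name="Score",
)

# Long -> wide again: subjects become columns, names become the index.
back = long.pivot_table(
    values="Score", index="Name", columns="Subject", aggfunc="mean"
)

print(long)
print(back)
```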

---



# **Pandas Cheat Sheet Master Table**

This cheat sheet covers **basic to advanced features**, each with **concept, description, hands-on example, and usage**.



---


## **1. Basics & Data Structures**

| Concept          | Syntax / Method       | Description                 | Hands-on Example                                            | Usage                      |
| ---------------- | --------------------- | --------------------------- | ----------------------------------------------------------- | -------------------------- |
| Import Pandas    | `import pandas as pd` | Load pandas library         | `import pandas as pd`                                       | Start using pandas         |
| Create DataFrame | `pd.DataFrame(data)`  | Create 2D table             | `df = pd.DataFrame({'Name':['Alice','Bob'],'Age':[25,30]})` | Store structured data      |
| Create Series    | `pd.Series(data)`     | 1D labeled array            | `s = pd.Series([10,20,30], index=['a','b','c'])`            | Handle single column data  |
| View Data        | `df.head(n)`          | First n rows                | `df.head(5)`                                                | Quick data inspection      |
| View Data        | `df.tail(n)`          | Last n rows                 | `df.tail(5)`                                                | Inspect last rows          |
| Info             | `df.info()`           | Column types, nulls         | `df.info()`                                                 | Dataset summary            |
| Describe         | `df.describe()`       | Statistics for numeric cols | `df.describe()`                                             | Quick statistical overview |
| Shape            | `df.shape`            | Rows & columns              | `df.shape`                                                  | Dataset size               |
| Columns          | `df.columns`          | Column names                | `df.columns`                                                | Check column labels        |
| Index            | `df.index`            | Row index                   | `df.index`                                                  | Understand index structure |
| Values           | `df.values`           | NumPy array of data         | `df.values`                                                 | Quick conversion to NumPy  |
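
The rows above can be tried together on the same two-row table used in the examples. This is a minimal sketch; `df.to_numpy()` is used here as the method form of `df.values`.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [25, 30]})
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

print(df.shape)          # (rows, columns)
print(list(df.columns))  # column labels
print(list(df.index))    # row labels (default integer range)
print(s["b"])            # a Series is accessed by its labels

# Mixed column types produce an object-dtype NumPy array.
arr = df.to_numpy()
print(arr.dtype, arr.shape)
```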

---

## **2. Indexing & Selection**

| Concept               | Syntax / Method                   | Description             | Example                                  | Usage                      |
| --------------------- | --------------------------------- | ----------------------- | ---------------------------------------- | -------------------------- |
| Select Column         | `df['col']`                       | Access single column    | `df['Age']`                              | Work with column           |
| Multiple Columns      | `df[['col1','col2']]`             | Access multiple columns | `df[['Name','Age']]`                     | Subset columns             |
| Row by Label          | `df.loc[label]`                   | Select row(s)           | `df.loc[0]`                              | Access specific rows       |
| Row by Position       | `df.iloc[pos]`                    | Select row(s)           | `df.iloc[0]`                             | Position-based selection   |
| Row & Column          | `df.loc[row, col]`                | Row & column selection  | `df.loc[0,'Name']`                       | Access single value        |
| Conditional Filtering | `df[df['col']>value]`             | Filter rows             | `df[df['Age']>25]`                       | Select rows by condition   |
| Multiple Conditions   | `(df['col1']>x) & (df['col2']<y)` | Combine filters         | `df[(df['Age']>25)&(df['Salary']>5000)]` | Complex filtering          |
| isin                  | `df[df['col'].isin(list)]`        | Filter rows in list     | `df[df['Dept'].isin(['HR','IT'])]`       | Subset specific categories |
| Between               | `df[df['col'].between(a,b)]`      | Filter between values   | `df[df['Age'].between(20,30)]`           | Range filtering            |
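
With the default integer index, `df.loc[0]` and `df.iloc[0]` return the same row, which hides the difference between them. The sketch below assumes a made-up DataFrame with a non-default index (10, 20, 30) so that labels and positions diverge, and also exercises the filtering rows.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Cara"], "Age": [25, 30, 35]},
    index=[10, 20, 30],
)

by_label = df.loc[10]     # row whose index label is 10
by_position = df.iloc[0]  # first row, whatever its label

# No row carries the label 0, so label-based lookup fails.
try:
    df.loc[0]
except KeyError:
    print("no row is labelled 0")

# Combine conditions with & and wrap each one in parentheses.
filtered = df[(df["Age"] > 25) & (df["Name"].isin(["Bob", "Cara"]))]

# between() includes both endpoints by default.
in_range = df[df["Age"].between(25, 30)]

print(filtered)
print(in_range)
```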

---

## **3. Adding, Modifying & Deleting Columns**

| Concept        | Syntax / Method                                  | Description            | Example                                            | Usage               |
| -------------- | ------------------------------------------------ | ---------------------- | -------------------------------------------------- | ------------------- |
| Add Column     | `df['new_col'] = ...`                            | Add new column         | `df['SalaryTax'] = df['Salary']*0.1`               | Feature engineering |
| Modify Column  | `df['col'] = ...`                                | Modify existing column | `df['Age'] = df['Age']+1`                          | Update data         |
| Drop Column    | `df.drop('col', axis=1, inplace=True)`           | Remove column          | `df.drop('SalaryTax', axis=1, inplace=True)`       | Clean dataset       |
| Rename Columns | `df.rename(columns={'old':'new'}, inplace=True)` | Rename columns         | `df.rename(columns={'Age':'Years'}, inplace=True)` | Clean names         |
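
The `inplace=True` forms above modify `df` directly. An alternative, sketched below on a made-up table, is to let `drop` and `rename` return a new DataFrame and chain them, which leaves the original untouched. Which style to prefer is a matter of taste; this is one possible arrangement, not the only one.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Alice", "Bob"], "Age": [25, 30], "Salary": [5000, 6000]}
)

# Adding and modifying columns is plain assignment.
df["SalaryTax"] = df["Salary"] * 0.1
df["Age"] = df["Age"] + 1

# Without inplace=True, drop and rename return new objects and can be chained.
cleaned = df.drop(columns="SalaryTax").rename(columns={"Age": "Years"})

print(list(df.columns))       # original still has SalaryTax and Age
print(list(cleaned.columns))  # cleaned copy has the new layout
```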

---

## **4. Handling Missing Values**

| Concept       | Syntax / Method             | Description         | Example                             | Usage                 |
| ------------- | --------------------------- | ------------------- | ----------------------------------- | --------------------- |
| Detect Nulls  | `df.isnull()`               | Boolean mask        | `df.isnull()`                       | Identify missing data |
| Count Nulls   | `df.isnull().sum()`         | Number of missing   | `df.isnull().sum()`                 | Summary of nulls      |
| Drop Nulls    | `df.dropna()`               | Remove missing rows | `df.dropna(inplace=True)`           | Clean dataset         |
| Fill Nulls    | `df.fillna(value)`          | Replace missing     | `df['Age'] = df['Age'].fillna(0)`   | Impute missing data   |
| Forward Fill  | `df.ffill()`                | Fill from previous  | `df.ffill()`                        | Time series cleaning  |
| Backward Fill | `df.bfill()`                | Fill from next      | `df.bfill()`                        | Time series cleaning  |
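
The following sketch runs the missing-value operations on a small made-up table so the effect of each is visible. It assigns the result of `fillna` back to a column rather than calling it with `inplace=True` on a selected column, and it uses the `ffill()` and `bfill()` methods for directional filling.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"Age": [25, np.nan, 35], "Salary": [np.nan, 6000, 7000]}
)

null_counts = df.isnull().sum()  # missing values per column

# Fill one column with a constant and keep the rest unchanged.
filled = df.assign(Age=df["Age"].fillna(0))

forward = df.ffill()   # copy the previous valid value downwards
backward = df.bfill()  # copy the next valid value upwards
dropped = df.dropna()  # keep only rows with no missing values

print(null_counts)
print(forward)
print(backward)
print(dropped)
```

Note that forward fill cannot repair a missing value in the first row, because there is nothing above it to copy.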

---

## **5. Sorting & Indexing**

| Concept         | Syntax / Method                               | Description        | Example                                       | Usage            |
| --------------- | --------------------------------------------- | ------------------ | --------------------------------------------- | ---------------- |
| Sort by Column  | `df.sort_values('col')`                       | Ascending order    | `df.sort_values('Age')`                       | Order data       |
| Sort Descending | `df.sort_values('col', ascending=False)`      | Descending         | `df.sort_values('Age', ascending=False)`      | Reverse order    |
| Reset Index     | `df.reset_index(drop=True, inplace=True)`     | Reset row index    | `df.reset_index(drop=True, inplace=True)`     | After filtering  |
| Set Index       | `df.set_index('col', inplace=True)`           | Column as index    | `df.set_index('Name', inplace=True)`          | Easy selection   |
| MultiIndex      | `df.set_index(['col1','col2'], inplace=True)` | Hierarchical index | `df.set_index(['Dept','Team'], inplace=True)` | Grouped indexing |
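
A short sketch of sorting and hierarchical indexing on a made-up table follows. Sorting the MultiIndex with `sort_index()` is an addition here; it is not required by the table above but keeps tuple lookups predictable.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Dept": ["IT", "HR", "IT"],
        "Team": ["A", "B", "B"],
        "Name": ["Alice", "Bob", "Cara"],
        "Age": [30, 25, 35],
    }
)

# Sort descending, then rebuild a clean 0..n-1 index.
by_age = df.sort_values("Age", ascending=False).reset_index(drop=True)

# Two columns become a hierarchical (Dept, Team) index.
indexed = df.set_index(["Dept", "Team"]).sort_index()

# A full (Dept, Team) tuple selects a single row.
row = indexed.loc[("IT", "B")]

print(by_age)
print(indexed)
print(row["Name"])
```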

---

## **6. Aggregation & Grouping**

| Concept       | Syntax / Method            | Description     | Example                                 | Usage                  |
| ------------- | -------------------------- | --------------- | --------------------------------------- | ---------------------- |
| GroupBy       | `df.groupby('col')`        | Group data      | `df.groupby('Dept')['Salary'].mean()`   | Aggregation            |
| Aggregation   | `df.agg(func)`             | Apply function  | `df.agg({'Salary':'sum','Age':'mean'})` | Multiple stats         |
| Value Counts  | `df['col'].value_counts()` | Count per value | `df['Dept'].value_counts()`             | Frequency distribution |
| Unique Values | `df['col'].unique()`       | Unique elements | `df['Dept'].unique()`                   | Categories             |
| Nunique       | `df['col'].nunique()`      | Distinct count  | `df['Dept'].nunique()`                  | Unique counts          |
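
To make the grouping rows concrete, here is a minimal sketch on a made-up four-row table. It also applies the dictionary form of `agg` per group, which is a common combination of the first two rows.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Dept": ["IT", "HR", "IT", "HR"],
        "Salary": [6000, 4000, 8000, 5000],
        "Age": [30, 25, 35, 40],
    }
)

# One aggregate per group.
mean_salary = df.groupby("Dept")["Salary"].mean()

# Different aggregates for different columns, per group.
summary = df.groupby("Dept").agg({"Salary": "sum", "Age": "mean"})

counts = df["Dept"].value_counts()  # rows per department
n_depts = df["Dept"].nunique()      # number of distinct departments

print(mean_salary)
print(summary)
print(counts)
print(n_depts)
```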

---

## **7. String Operations**

| Concept   | Syntax / Method                      | Description          | Example                                    | Usage            |
| --------- | ------------------------------------ | -------------------- | ------------------------------------------ | ---------------- |
| Uppercase | `df['col'].str.upper()`              | Convert to uppercase | `df['Name'].str.upper()`                   | Clean text       |
| Lowercase | `df['col'].str.lower()`              | Convert to lowercase | `df['Name'].str.lower()`                   | Standardize      |
| Length    | `df['col'].str.len()`                | String length        | `df['Name'].str.len()`                     | Text analysis    |
| Contains  | `df['col'].str.contains('pattern')`  | Check substring      | `df['Name'].str.contains('A')`             | Filter strings   |
| Replace   | `df['col'].str.replace('old','new')` | Replace string       | `df['Name'].str.replace('Alice','Alicia')` | Correct text     |
| Split     | `df['col'].str.split('sep')`         | Split string         | `df['Name'].str.split(' ')`                | Extract features |

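The `.str` methods above skip missing values rather than failing, which matters when filtering. A minimal sketch with a made-up `names` Series; the `na=False`, `regex=False`, and `expand=True` arguments are optional choices added here for illustration.

```python
import pandas as pd

# Hypothetical sample data with one missing entry
names = pd.Series(["Alice Smith", "bob jones", None, "Alicia Keys"])

upper = names.str.upper()
lengths = names.str.len()

# na=False treats missing values as "no match", so the mask is purely boolean
has_ali = names.str.contains("Ali", na=False)

# regex=False requests a literal substring replacement
fixed = names.str.replace("bob", "Bob", regex=False)

# expand=True returns a DataFrame with one column per piece
parts = names.str.split(" ", expand=True)

print(names[has_ali])
print(parts)
```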
---

## **8. DateTime Operations**

| Concept             | Syntax / Method             | Description        | Example                                           | Usage               |
| ------------------- | --------------------------- | ------------------ | ------------------------------------------------- | ------------------- |
| Convert to datetime | `pd.to_datetime(df['col'])` | Convert column     | `df['JoinDate'] = pd.to_datetime(df['JoinDate'])` | Time series         |
| Extract Year        | `df['col'].dt.year`         | Year from datetime | `df['Year'] = df['JoinDate'].dt.year`             | Feature engineering |
| Extract Month       | `df['col'].dt.month`        | Month              | `df['Month'] = df['JoinDate'].dt.month`           | Monthly aggregation |
| Extract Day         | `df['col'].dt.day`          | Day                | `df['Day'] = df['JoinDate'].dt.day`               | Daily analysis      |
| Weekday             | `df['col'].dt.weekday`      | Day of week (Monday=0) | `df['Weekday'] = df['JoinDate'].dt.weekday`       | Pattern detection   |
| Time Delta          | `df['col2'] - df['col1']`   | Difference         | `df['Duration'] = df['End']-df['Start']`          | Compute durations   |

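A worked sketch of the datetime rows above. The `Start` and `End` columns and their dates are invented for illustration; `DurationDays` is an extra column added here to show how a timedelta can be turned into a plain number.

```python
import pandas as pd

# Hypothetical sample data: dates stored as strings
df = pd.DataFrame({
    "Start": ["2024-01-01", "2024-02-15", "2024-03-31"],
    "End": ["2024-01-10", "2024-03-01", "2024-04-02"],
})

# Convert strings to datetime64 so the .dt accessor becomes available
df["Start"] = pd.to_datetime(df["Start"])
df["End"] = pd.to_datetime(df["End"])

df["Year"] = df["Start"].dt.year
df["Month"] = df["Start"].dt.month
df["Weekday"] = df["Start"].dt.weekday   # Monday=0 ... Sunday=6

# Subtracting two datetime columns gives a timedelta column
df["Duration"] = df["End"] - df["Start"]
df["DurationDays"] = df["Duration"].dt.days

print(df)
```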
---

## **9. Window Functions & Rolling**

| Concept      | Syntax / Method                      | Description     | Example                                     | Usage              |
| ------------ | ------------------------------------ | --------------- | ------------------------------------------- | ------------------ |
| Rolling Mean | `df['col'].rolling(window=n).mean()` | Moving average  | `df['Salary'].rolling(3).mean()`            | Smooth data        |
| Rolling Sum  | `df['col'].rolling(n).sum()`         | Moving sum      | `df['Sales'].rolling(3).sum()`              | Aggregate trends   |
| Expanding    | `df['col'].expanding().mean()`       | Cumulative mean | `df['Sales'].expanding().mean()`            | Running statistics |
| Shift        | `df['col'].shift(n)`                 | Shift values    | `df['Prev_Salary'] = df['Salary'].shift(1)` | Compare periods    |
| Rank         | `df['col'].rank()`                   | Rank values     | `df['SalaryRank'] = df['Salary'].rank()`    | Ordering           |

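The window rows above are easiest to compare side by side on one small Series. A minimal sketch with made-up values; note that a rolling window of 3 yields `NaN` until three observations are available.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
sales = pd.Series([10, 20, 30, 40, 50], name="Sales")

rolling_mean = sales.rolling(window=3).mean()   # NaN for the first two rows
running_mean = sales.expanding().mean()         # mean of everything seen so far
previous = sales.shift(1)                       # value from the prior row
change = sales - previous                       # period-over-period difference
ranks = sales.rank(ascending=False)             # 1.0 = largest value

print(pd.DataFrame({
    "sales": sales,
    "rolling_mean": rolling_mean,
    "running_mean": running_mean,
    "change": change,
    "rank": ranks,
}))
```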
---

## **10. Categorical & Advanced Features**

| Concept            | Syntax / Method                                   | Description           | Example                                                      | Usage             |
| ------------------ | ------------------------------------------------- | --------------------- | ------------------------------------------------------------ | ----------------- |
| Categorical dtype  | `df['col'].astype('category')`                    | Convert to category   | `df['Dept'] = df['Dept'].astype('category')`                 | Efficient storage |
| Category Codes     | `df['col'].cat.codes`                             | Numeric codes         | `df['DeptCode'] = df['Dept'].cat.codes`                      | ML features       |
| Reorder Categories | `df['col'].cat.reorder_categories(['A','B','C'])` | Reorder               | `df['Dept'].cat.reorder_categories(['IT','HR','Sales'])`     | Analysis order    |
| One-Hot Encode     | `pd.get_dummies(df['col'])`                       | Encode categorical    | `pd.get_dummies(df['Dept'])`                                 | ML preprocessing  |
| MultiIndex         | `df.set_index(['col1','col2'])`                   | Hierarchical index    | `df.set_index(['Dept','Team'])`                              | Grouped analysis  |
| Stack              | `df.stack()`                                      | Pivot columns to rows | `df.stack()`                                                 | Reshape data      |
| Unstack            | `df.unstack()`                                    | Pivot rows to columns | `df.unstack()`                                               | Reshape data      |
| Melt               | `pd.melt(df, id_vars, value_vars)`                | Wide→long             | `pd.melt(df, id_vars='Name', value_vars=['Math','Science'])` | Reshape           |

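A sketch tying together the categorical and reshaping rows above. The frame and its `Math`/`Science` columns are hypothetical, and the `Subject`/`Score` names passed to `melt` are choices made here for readability. The final `pivot` simply reverses the melt to show the round trip.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Dept": ["IT", "HR", "IT"],
    "Math": [90, 80, 70],
    "Science": [85, 75, 95],
})

# Categories inferred from strings are sorted, so HR=0 and IT=1
df["Dept"] = df["Dept"].astype("category")
codes = df["Dept"].cat.codes

# One indicator column per category
dummies = pd.get_dummies(df["Dept"])

# Wide to long: one row per (Name, Subject) pair
long_df = pd.melt(df, id_vars="Name", value_vars=["Math", "Science"],
                  var_name="Subject", value_name="Score")

# Long back to wide
wide_df = long_df.pivot(index="Name", columns="Subject", values="Score")

print(long_df)
print(wide_df)
```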
---

## **11. Merging & Joining**

| Concept     | Syntax / Method                             | Description      | Example                                    | Usage            |
| ----------- | ------------------------------------------- | ---------------- | ------------------------------------------ | ---------------- |
| Merge       | `pd.merge(df1, df2, on='key')`              | SQL-style join   | `pd.merge(df1, df2, on='ID')`              | Combine datasets |
| Merge Left  | `pd.merge(df1, df2, on='key', how='left')`  | Left join        | `pd.merge(df1, df2, on='ID', how='left')`  | Combine datasets |
| Merge Right | `pd.merge(df1, df2, on='key', how='right')` | Right join       | `pd.merge(df1, df2, on='ID', how='right')` | Combine datasets |
| Merge Outer | `pd.merge(df1, df2, on='key', how='outer')` | Outer join       | `pd.merge(df1, df2, on='ID', how='outer')` | Combine datasets |
| Merge Inner | `pd.merge(df1, df2, on='key', how='inner')` | Inner join       | `pd.merge(df1, df2, on='ID', how='inner')` | Combine datasets |
| Concatenate | `pd.concat([df1, df2])`                     | Stack vertically | `pd.concat([df1, df2])`                    | Combine datasets |
| Join        | `df1.join(df2, on='key')`                   | Match `df1['key']` against `df2`'s index | `df1.join(df2.set_index('ID'), on='ID')`   | Combine datasets |

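The join types above differ only in which keys survive. A minimal sketch on two made-up frames that share some `ID` values; `indicator=True` is an optional argument used here to show where each row came from.

```python
import pandas as pd

# Hypothetical sample data: IDs 2 and 3 appear in both frames
df1 = pd.DataFrame({"ID": [1, 2, 3], "Name": ["A", "B", "C"]})
df2 = pd.DataFrame({"ID": [2, 3, 4], "Score": [85, 90, 95]})

inner = pd.merge(df1, df2, on="ID", how="inner")   # keys in both frames
left = pd.merge(df1, df2, on="ID", how="left")     # every key from df1
outer = pd.merge(df1, df2, on="ID", how="outer", indicator=True)

# DataFrame.join matches df1's "ID" column against df2's index,
# so df2 is re-indexed by ID first
joined = df1.join(df2.set_index("ID"), on="ID")

print(outer)
```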
---

## **12. Advanced Aggregation & Pivot**

| Concept     | Syntax / Method                                   | Description            | Example                                                         | Usage                  |
| ----------- | ------------------------------------------------- | ---------------------- | --------------------------------------------------------------- | ---------------------- |
| Pivot Table | `df.pivot_table(values, index, columns, aggfunc)` | Aggregation            | `df.pivot_table(values='Salary', index='Dept', aggfunc='mean')` | Summarize data         |
| Crosstab    | `pd.crosstab(df['col1'], df['col2'])`             | Frequency table        | `pd.crosstab(df['Dept'], df['Gender'])`                         | Analyze relationships  |
| Apply       | `df.apply(func)`                                  | Apply function         | `df.apply(lambda x:x*2)`                                        | Column-wise operations |
| Applymap    | `df.applymap(func)`                               | Element-wise (renamed `df.map` in pandas 2.1+) | `df.applymap(lambda x:x*2)`                                     | Transform data         |
| Map         | `df['col'].map(func)`                             | Map function to column | `df['Age'].map(lambda x:x+1)`                                   | Feature engineering    |

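A sketch contrasting the rows above: `pivot_table` and `crosstab` summarize, while `apply` and `map` transform. The frame and its `Gender` column are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({
    "Dept": ["IT", "IT", "HR", "HR"],
    "Gender": ["F", "M", "F", "F"],
    "Salary": [60000, 70000, 50000, 54000],
    "Age": [30, 40, 25, 35],
})

# Mean salary per department
mean_by_dept = df.pivot_table(values="Salary", index="Dept", aggfunc="mean")

# Frequency table of department by gender
counts = pd.crosstab(df["Dept"], df["Gender"])

# apply on a DataFrame passes one column (a Series) at a time by default
ranges = df[["Salary", "Age"]].apply(lambda col: col.max() - col.min())

# map on a Series transforms one value at a time
next_age = df["Age"].map(lambda x: x + 1)

print(mean_by_dept)
print(counts)
```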
---

## **13. Plotting**

| Concept      | Syntax / Method                      | Description  | Example                                      | Usage                          |
| ------------ | ------------------------------------ | ------------ | -------------------------------------------- | ------------------------------ |
| Line Plot    | `df['col'].plot()`                   | Line graph   | `df['Sales'].plot()`                         | Trend visualization            |
| Bar Plot     | `df['col'].plot(kind='bar')`         | Bar chart    | `df['Dept'].value_counts().plot(kind='bar')` | Compare categories             |
| Histogram    | `df['col'].plot(kind='hist')`        | Distribution | `df['Age'].plot(kind='hist')`                | Check frequency distribution   |
| Box Plot     | `df.boxplot(column='col')`           | Boxplot      | `df.boxplot(column='Salary')`                | Detect outliers                |
| Scatter Plot | `df.plot.scatter(x='col1',y='col2')` | Scatter      | `df.plot.scatter(x='Age',y='Salary')`        | Relationship between variables |
| Area Plot    | `df.plot.area()`                     | Area plot    | `df.plot.area()`                             | Time series visualization      |

---

## **14. Time Series Manipulation**

| Concept        | Syntax / Method                      | Description         | Example                                                 | Usage                |
| -------------- | ------------------------------------ | ------------------- | ------------------------------------------------------- | -------------------- |
| Set Date Index | `df.set_index('date', inplace=True)` | Use date as index   | `df.set_index('Date', inplace=True)`                    | Time series analysis |
| Resample       | `df.resample('M').mean()`            | Aggregate by period (`'M'` = month end; pandas 2.2+ prefers `'ME'`) | `df.resample('M').mean()`                               | Monthly average      |
| Shift          | `df['col'].shift(1)`                 | Lag values          | `df['PrevSales'] = df['Sales'].shift(1)`                | Compare periods      |
| Rolling        | `df['col'].rolling(3).mean()`        | Moving average      | `df['SalesMA'] = df['Sales'].rolling(3).mean()`         | Smooth trend         |
| Expanding      | `df['col'].expanding().sum()`        | Cumulative sum      | `df['CumulativeSales'] = df['Sales'].expanding().sum()` | Accumulate values    |

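Resampling only works on a datetime-like index, so the usual order is: set the index, then resample. A minimal sketch on 60 days of made-up data. It uses the `'MS'` (month start) alias as an assumption-free choice across pandas versions, since the spelling of the month-end alias has changed between releases.

```python
import numpy as np
import pandas as pd

# Hypothetical daily data: Sales is simply 0, 1, 2, ... for 60 days
dates = pd.date_range("2024-01-01", periods=60, freq="D")
df = pd.DataFrame({"Date": dates, "Sales": np.arange(60)})

# A DatetimeIndex is required before resampling
df = df.set_index("Date")

# One row per calendar month, labelled by the first day of the month
monthly = df["Sales"].resample("MS").mean()

df["PrevSales"] = df["Sales"].shift(1)                    # previous day's value
df["SalesMA"] = df["Sales"].rolling(3).mean()             # 3-day moving average
df["CumulativeSales"] = df["Sales"].expanding().sum()     # running total

print(monthly)
```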
---

This **Pandas Master Table** covers:

* **Basics, Indexing, Selection**
* **Column manipulation**
* **Missing values**
* **Sorting, grouping, aggregation**
* **String, datetime, categorical operations**
* **Window & rolling functions**
* **Merging, joining, pivot, crosstab**
* **Plotting & visualization**
* **Time series handling**



# **NumPy Reference Guide**

**The following code covers:**

* Array creation methods (array, zeros, ones, arange, linspace, eye, diag)
* Attributes (shape, ndim, dtype, size, itemsize)
* Indexing, slicing, boolean and fancy indexing
* Reshape, flatten, concatenate, split, transpose
* Broadcasting
* Arithmetic and universal functions
* Statistical functions (sum, mean, std, var, min, max, etc.)
* Logical and comparison operations
* Random module (rand, randint, randn, choice, seed)
* Linear algebra (dot, inv, det, eig, solve)
* File I/O (save, load, savetxt, loadtxt)
* Utility functions (unique, sort, argsort, where, clip)


# ============================================================
# NUMPY COMPLETE REFERENCE
# Covers: array creation, indexing, slicing, reshaping,
# broadcasting, math, stats, random, linear algebra, file I/O
# ============================================================

import numpy as np

# ------------------------------------------------------------
# 1️⃣ ARRAY CREATION
# ------------------------------------------------------------

# From Python list
arr1 = np.array([1, 2, 3, 4, 5])

# From nested list (creates 2D array)
arr2 = np.array([[1, 2, 3], [4, 5, 6]])

# Specify dtype (data type)
arr3 = np.array([1, 2, 3], dtype=float)

# Create arrays of zeros / ones
zeros = np.zeros((2, 3))         # 2x3 matrix of zeros
ones = np.ones((3, 2))           # 3x2 matrix of ones
empty = np.empty((2, 2))         # uninitialized (arbitrary leftover memory contents)

# Create arrays with a range of values
arange_arr = np.arange(0, 10, 2) # 0,2,4,6,8
linspace_arr = np.linspace(0, 1, 5) # 5 values between 0 and 1 inclusive

# Identity and diagonal matrices
identity = np.eye(3)
diag_arr = np.diag([10, 20, 30])

# ------------------------------------------------------------
# 2️⃣ ARRAY ATTRIBUTES
# ------------------------------------------------------------
shape = arr2.shape       # (2, 3)
ndim = arr2.ndim         # 2 dimensions
dtype = arr2.dtype       # data type (e.g. int64)
size = arr2.size         # total elements
itemsize = arr2.itemsize # bytes per element

# ------------------------------------------------------------
# 3️⃣ INDEXING AND SLICING
# ------------------------------------------------------------

arr = np.array([10, 20, 30, 40, 50])

first = arr[0]         # single element
last = arr[-1]         # last element
slice1 = arr[1:4]      # elements 1,2,3
slice2 = arr[:3]       # first 3
slice3 = arr[::2]      # every 2nd element

# 2D indexing
arr_2d = np.array([[10, 20, 30], [40, 50, 60]])
val = arr_2d[1, 2]     # row 1, col 2 = 60
row = arr_2d[0, :]     # first row
col = arr_2d[:, 1]     # second column

# Boolean indexing
bool_idx = arr > 25
filtered = arr[arr > 25]   # returns [30,40,50]

# Fancy indexing
fancy = arr[[0, 2, 4]]     # elements 0,2,4

# ------------------------------------------------------------
# 4️⃣ ARRAY MANIPULATION
# ------------------------------------------------------------

arr_a = np.array([[1, 2], [3, 4]])
arr_b = np.array([[5, 6]])

# Reshape
reshaped = np.arange(6).reshape(2, 3)

# Flatten and ravel
flat = reshaped.flatten()  # copy
ravelled = reshaped.ravel() # view when possible, otherwise a copy

# Transpose
transposed = arr_a.T

# Concatenation
concat_v = np.vstack((arr_a, arr_b))
concat_h = np.hstack((arr_a, arr_b.T))

# Split
split_arr = np.split(np.arange(10), 2)

# ------------------------------------------------------------
# 5️⃣ BROADCASTING
# ------------------------------------------------------------

# Adding scalar
a = np.array([1, 2, 3])
b = 5
res = a + b  # adds 5 to each element

# Broadcasting 1D to 2D
m = np.ones((3, 3))
n = np.array([1, 2, 3])
broadcasted = m + n

# ------------------------------------------------------------
# 6️⃣ MATHEMATICAL OPERATIONS
# ------------------------------------------------------------

arr_math = np.array([1, 2, 3, 4])

# Element-wise operations
add = arr_math + 10
sub = arr_math - 2
mul = arr_math * 3
div = arr_math / 2
pow_arr = arr_math ** 2

# Universal functions (ufuncs)
sqrt = np.sqrt(arr_math)
exp = np.exp(arr_math)
log = np.log(arr_math)

# Trigonometric
sin = np.sin(arr_math)
cos = np.cos(arr_math)
tan = np.tan(arr_math)

# ------------------------------------------------------------
# 7️⃣ AGGREGATIONS AND STATISTICS
# ------------------------------------------------------------

stats_arr = np.array([[1, 2, 3], [4, 5, 6]])

total = np.sum(stats_arr)
col_sum = np.sum(stats_arr, axis=0)
row_sum = np.sum(stats_arr, axis=1)
mean = np.mean(stats_arr)
median = np.median(stats_arr)
std_dev = np.std(stats_arr)
var = np.var(stats_arr)
min_val = np.min(stats_arr)
max_val = np.max(stats_arr)
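argmin = np.argmin(stats_arr)  # index into the flattened array (pass axis= for per-axis)
argmax = np.argmax(stats_arr)  # index into the flattened array (pass axis= for per-axis)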

# ------------------------------------------------------------
# 8️⃣ COMPARISONS AND LOGICAL OPERATIONS
# ------------------------------------------------------------

x = np.array([1, 2, 3])
y = np.array([2, 2, 3])

equal = x == y
not_equal = x != y
greater = x > y
logical_and = np.logical_and(x > 1, y > 1)
logical_or = np.logical_or(x > 1, y == 3)
any_true = np.any(equal)
all_true = np.all(equal)

# ------------------------------------------------------------
# 9️⃣ RANDOM MODULE
# ------------------------------------------------------------

# Random integers
rand_int = np.random.randint(0, 10, (2, 3))

# Random floats
rand_float = np.random.rand(2, 3)

# Normal distribution
normal = np.random.randn(3, 3)

# Random choice
choice = np.random.choice([10, 20, 30, 40], size=3)

# Shuffle and permutation
arr_rand = np.arange(10)
np.random.shuffle(arr_rand)
perm = np.random.permutation(arr_rand)

# Set seed for reproducibility (affects only draws made after this call,
# so the arrays above still differ from run to run)
np.random.seed(42)
seeded = np.random.rand(2)

# ------------------------------------------------------------
# 🔟 LINEAR ALGEBRA
# ------------------------------------------------------------

mat_a = np.array([[1, 2], [3, 4]])
mat_b = np.array([[5, 6], [7, 8]])

dot_product = np.dot(mat_a, mat_b)
matrix_mult = mat_a @ mat_b
transpose = mat_a.T
determinant = np.linalg.det(mat_a)
inverse = np.linalg.inv(mat_a)
eig_vals, eig_vecs = np.linalg.eig(mat_a)

# Solve linear system Ax = b
A = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])
solution = np.linalg.solve(A, b)

# ------------------------------------------------------------
# 1️⃣1️⃣ FILE INPUT/OUTPUT
# ------------------------------------------------------------

# Save and load arrays
np.save('array_data.npy', arr1)
loaded_arr = np.load('array_data.npy')

# Save/load multiple arrays
np.savez('multi_data.npz', a=arr1, b=arr2)
multi_load = np.load('multi_data.npz')  # dict-like: multi_load['a'], multi_load['b']

# Save to text (CSV)
np.savetxt('array.csv', arr2, delimiter=',', fmt='%d')
loaded_csv = np.loadtxt('array.csv', delimiter=',')

# ------------------------------------------------------------
# 1️⃣2️⃣ MISC / UTILITIES
# ------------------------------------------------------------

unique_vals = np.unique(np.array([1, 2, 2, 3, 3, 3]))
sorted_arr = np.sort(np.array([3, 1, 2]))
argsorted = np.argsort(np.array([3, 1, 2]))
where_cond = np.where(arr > 25)
clipped = np.clip(arr, 20, 40)
copy_arr = np.copy(arr)
reshaped_contig = np.ascontiguousarray(arr2)

# ------------------------------------------------------------
# END OF FILE
# ------------------------------------------------------------

print("NumPy reference script executed successfully!")

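Three points from the script are worth checking by hand: how broadcasting combines shapes, whether the solved system really satisfies `Ax = b`, and how seeding makes random draws repeatable. The sketch below is an illustration, not part of the original script; it uses `np.random.default_rng`, NumPy's newer Generator interface, in place of the `np.random.seed` call shown above.

```python
import numpy as np

# Broadcasting: shapes (3, 1) and (3,) stretch to a common (3, 3) result
col = np.array([[0], [10], [20]])
row = np.array([1, 2, 3])
grid = col + row

# Solve Ax = b, then verify the solution by substituting it back
A = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])
x = np.linalg.solve(A, b)
ok = np.allclose(A @ x, b)

# Two generators built from the same seed produce identical draws
rng1 = np.random.default_rng(42)
rng2 = np.random.default_rng(42)
same = np.array_equal(rng1.integers(0, 10, size=5),
                      rng2.integers(0, 10, size=5))

print(grid)
print(x, ok, same)
```

Keeping a generator object in a variable, rather than seeding the global state, makes it easier to see which draws a seed controls.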

# **Pandas Reference Guide**

**It covers:**

* Series and DataFrames
* Indexing, filtering, slicing
* Data cleaning, missing values
* Grouping, merging, joining, pivoting
* Date/time handling
* File I/O (CSV, Excel, JSON, etc.)
* Descriptive stats and visualization basics

| Category                | Concepts Included                               |
| ----------------------- | ----------------------------------------------- |
| **Core Structures**     | Series, DataFrame, Index                        |
| **Data Input/Output**   | CSV, Excel, JSON                                |
| **Data Inspection**     | `info()`, `describe()`, `dtypes`, `shape`, etc. |
| **Indexing/Filtering**  | `loc`, `iloc`, boolean filters                  |
| **Missing Data**        | `isna()`, `dropna()`, `fillna()`                |
| **Data Transformation** | `apply()`, `map()`, renaming, adding columns    |
| **Aggregation**         | `groupby()`, `agg()`, custom functions          |
| **Merging & Joining**   | `merge()`, `concat()`                           |
| **Reshaping**           | `pivot()`, `melt()`                             |
| **Datetime Operations** | `date_range()`, `resample()`                    |
| **Statistics**          | mean, median, corr, cov, std                    |
| **Categoricals**        | `astype('category')`                            |
| **Visualization**       | integrated plotting                             |
| **Optimization**        | memory usage introspection                      |


# ================================================================
# PANDAS COMPLETE REFERENCE SCRIPT
# Covers: Series, DataFrames, indexing, filtering, grouping,
# merging, reshaping, datetime, I/O, statistics, and more.
# ================================================================

import pandas as pd
import numpy as np

# ---------------------------------------------------------------
# 1️⃣ PANDAS BASIC CONCEPTS
# ---------------------------------------------------------------
# Pandas is built on top of NumPy — it adds labeled data structures.
# Core objects:
# - Series (1D labeled array)
# - DataFrame (2D labeled table)
# - Index (labels for rows/columns)

# ---------------------------------------------------------------
# 2️⃣ SERIES CREATION AND BASIC OPERATIONS
# ---------------------------------------------------------------

# Create a Series from a Python list
s = pd.Series([10, 20, 30, 40], name="Numbers")

# Custom index
s_custom = pd.Series([1, 2, 3], index=['a', 'b', 'c'], name="CustomSeries")

# Series from dict
s_dict = pd.Series({'A': 10, 'B': 20, 'C': 30})

# Access elements
val = s.iloc[0]      # by position (s[0] would look up the label 0 instead)
val2 = s_custom['b'] # by label

# Vectorized operations (like NumPy)
s_add = s + 5
s_sqrt = np.sqrt(s)

# Boolean filtering
filtered = s[s > 20]

# ---------------------------------------------------------------
# 3️⃣ DATAFRAME CREATION
# ---------------------------------------------------------------

# From dictionary of lists
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 32],
    'City': ['NY', 'LA', 'Chicago', 'Houston']
}
df = pd.DataFrame(data)

# From list of dicts
df2 = pd.DataFrame([
    {'A': 1, 'B': 2},
    {'A': 3, 'B': 4}
])

# From NumPy array
arr = np.arange(9).reshape(3, 3)
df_np = pd.DataFrame(arr, columns=['X', 'Y', 'Z'])

# ---------------------------------------------------------------
# 4️⃣ INSPECTING DATA
# ---------------------------------------------------------------

df.head()        # first 5 rows
df.tail(2)       # last 2 rows
df.info()        # summary of columns and dtypes
df.describe()    # statistics for numeric columns
df.shape         # (rows, columns)
df.columns       # column labels
df.index         # row labels
df.dtypes        # data types
df.values        # NumPy representation

# ---------------------------------------------------------------
# 5️⃣ INDEXING, SELECTION, AND FILTERING
# ---------------------------------------------------------------

# Single column
names = df['Name']      # returns a Series

# Multiple columns
subset = df[['Name', 'City']]

# Row selection by position (iloc) and label (loc)
first_row = df.iloc[0]           # by index position
row_bob = df.loc[1]              # by index label (if default int index)

# Cell access
cell = df.at[1, 'Age']           # fast access by label
cell_i = df.iat[1, 1]            # fast access by position

# Conditional filtering
filter_age = df[df['Age'] > 25]
filter_mult = df[(df['Age'] > 25) & (df['City'] == 'LA')]

# ---------------------------------------------------------------
# 6️⃣ ADDING, MODIFYING, AND DROPPING DATA
# ---------------------------------------------------------------

# Add a new column
df['Salary'] = [50000, 54000, 49000, 62000]

# Modify values
df.loc[df['Name'] == 'Alice', 'Salary'] = 55000

# Drop column / row
df_dropped_col = df.drop('City', axis=1)
df_dropped_row = df.drop(2, axis=0)

# Rename columns
df_renamed = df.rename(columns={'Name': 'FullName'})

# Reset and set index
df_reset = df.reset_index(drop=True)
df_indexed = df.set_index('Name')

# ---------------------------------------------------------------
# 7️⃣ HANDLING MISSING DATA
# ---------------------------------------------------------------

df_missing = pd.DataFrame({
    'A': [1, np.nan, 3],
    'B': [np.nan, 5, 6]
})

# Detect missing values
df_missing.isna()
df_missing.notna()

# Drop rows/columns with NaN
dropped_na_rows = df_missing.dropna(axis=0)
dropped_na_cols = df_missing.dropna(axis=1)

# Fill missing values
filled = df_missing.fillna(0)
filled_mean = df_missing.fillna(df_missing.mean())

# Forward/backward fill (use previous/next value)
forward_fill = df_missing.ffill()
backward_fill = df_missing.bfill()

# ---------------------------------------------------------------
# 8️⃣ SORTING
# ---------------------------------------------------------------

# Sort by column
sorted_age = df.sort_values(by='Age', ascending=False)

# Sort by index
sorted_idx = df.sort_index()

# ---------------------------------------------------------------
# 9️⃣ GROUPING AND AGGREGATION
# ---------------------------------------------------------------

# Group by one column
grouped = df.groupby('City')['Salary'].mean()

# Multiple aggregations
agg_multi = df.groupby('City').agg({'Salary': ['mean', 'max'], 'Age': 'median'})

# Apply custom function
custom_func = df.groupby('City')['Salary'].apply(lambda x: x.max() - x.min())

# ---------------------------------------------------------------
# 🔟 MERGING, JOINING, CONCATENATING
# ---------------------------------------------------------------

df_left = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['A', 'B', 'C']})
df_right = pd.DataFrame({'ID': [2, 3, 4], 'Score': [85, 90, 95]})

# Merge (SQL-style join)
merged_inner = pd.merge(df_left, df_right, on='ID', how='inner')
merged_outer = pd.merge(df_left, df_right, on='ID', how='outer')

# Concatenate vertically (stack rows)
concat_rows = pd.concat([df_left, df_right], axis=0, ignore_index=True)

# Concatenate horizontally (add columns)
concat_cols = pd.concat([df_left, df_right], axis=1)

# ---------------------------------------------------------------
# 1️⃣1️⃣ PIVOT, MELT, AND RESHAPING
# ---------------------------------------------------------------

sales = pd.DataFrame({
    'Month': ['Jan', 'Jan', 'Feb', 'Feb'],
    'City': ['NY', 'LA', 'NY', 'LA'],
    'Sales': [200, 180, 210, 190]
})

# Pivot (reshape long → wide)
pivoted = sales.pivot(index='Month', columns='City', values='Sales')

# Melt (reshape wide → long)
melted = pivoted.reset_index().melt(id_vars='Month', value_name='Sales')

# ---------------------------------------------------------------
# 1️⃣2️⃣ DATE AND TIME OPERATIONS
# ---------------------------------------------------------------

# Create date range
dates = pd.date_range(start='2024-01-01', periods=6, freq='D')

# DataFrame with datetime index
df_time = pd.DataFrame({'Value': np.random.randint(10, 100, 6)}, index=dates)

# Access by date
df_time.loc['2024-01-03']

# Resampling (useful for time series)
weekly_mean = df_time.resample('W').mean()

# Extract date parts
df_time['Year'] = df_time.index.year
df_time['Month'] = df_time.index.month
df_time['Day'] = df_time.index.day

# ---------------------------------------------------------------
# 1️⃣3️⃣ DESCRIPTIVE STATISTICS
# ---------------------------------------------------------------

df_stats = pd.DataFrame({
    'A': [10, 20, 30, 40, 50],
    'B': [5, 15, 25, 35, 45]
})

mean_vals = df_stats.mean()
median_vals = df_stats.median()
std_vals = df_stats.std()
corr = df_stats.corr()
cov = df_stats.cov()
value_counts = df['City'].value_counts()

# ---------------------------------------------------------------
# 1️⃣4️⃣ APPLY, MAP, AND LAMBDA FUNCTIONS
# ---------------------------------------------------------------

# Apply to columns
df['Age_plus_10'] = df['Age'].apply(lambda x: x + 10)

# Map to transform categorical values
df['City_Code'] = df['City'].map({'NY': 1, 'LA': 2, 'Chicago': 3, 'Houston': 4})

# Apply to entire DataFrame
df_apply = df[['Age', 'Salary']].apply(np.log)

# ---------------------------------------------------------------
# 1️⃣5️⃣ FILE I/O OPERATIONS
# ---------------------------------------------------------------

# CSV
df.to_csv('pandas_data.csv', index=False)
loaded_csv = pd.read_csv('pandas_data.csv')

# Excel (needs an Excel engine installed, e.g. openpyxl)
df.to_excel('pandas_data.xlsx', index=False)
loaded_excel = pd.read_excel('pandas_data.xlsx')

# JSON
df.to_json('pandas_data.json', orient='records')
loaded_json = pd.read_json('pandas_data.json')

# ---------------------------------------------------------------
# 1️⃣6️⃣ ADVANCED: CATEGORICALS AND MEMORY OPTIMIZATION
# ---------------------------------------------------------------

df['City'] = df['City'].astype('category')   # saves memory for repeated values
memory_usage = df.memory_usage(deep=True)

# ---------------------------------------------------------------
# 1️⃣7️⃣ VISUALIZATION INTEGRATION (MINIMAL DEMO)
# ---------------------------------------------------------------

# Pandas integrates with Matplotlib
import matplotlib.pyplot as plt

df.plot(x='Name', y='Salary', kind='bar', title='Salary by Employee')
plt.xlabel("Employee")
plt.ylabel("Salary")
plt.show()

# ---------------------------------------------------------------
# END OF SCRIPT
# ---------------------------------------------------------------
print("Pandas comprehensive reference script executed successfully!")

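The missing-data section of the script produces several frames without showing how they differ. The sketch below reuses the same two-column `df_missing` frame and compares the strategies; it is an illustration of their behaviour on this particular data, not a recommendation of one over another.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the script: one NaN in each column
df_missing = pd.DataFrame({"A": [1, np.nan, 3], "B": [np.nan, 5, 6]})

# Count missing values per column
n_missing = df_missing.isna().sum()

# Replace each NaN with its column mean (A -> 2.0, B -> 5.5)
filled_mean = df_missing.fillna(df_missing.mean())

# Keep only rows with no missing values
complete_rows = df_missing.dropna()

# Forward fill cannot repair a NaN in the first row: nothing precedes it
forward = df_missing.ffill()

print(n_missing)
print(filled_mean)
print(forward)
```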