# 📊 Pandas DataFrame Methods – Quick Reference with Argument Purpose

| Method               | Purpose (Plain English)                                      | Common Arguments / Variants                          | Why Use These Arguments                                |
|----------------------|--------------------------------------------------------------|------------------------------------------------------|--------------------------------------------------------|
| `df.info()`          | Overview of columns, types, nulls, memory                    | `verbose=True`, `memory_usage='deep'`                | Show full column info; get accurate memory usage       |
| `df.describe()`      | Summary stats for numeric or all columns                     | `include='all'`, `percentiles=[...]`                 | Include non-numeric columns; customize quantiles       |
| `df.head()`          | First few rows                                               | `n=5`                                                | Preview top `n` rows                                   |
| `df.tail()`          | Last few rows                                                | `n=5`                                                | Preview bottom `n` rows                                |
| `df.shape`           | Tuple of (rows, columns)                                     | —                                                    | Check dataset size                                     |
| `df.columns`         | Column labels (a pandas `Index`, not a plain list)           | —                                                    | Useful for renaming or selection                       |
| `df.index`           | Row index info                                               | —                                                    | Use when resetting or setting index                    |
| `df.dtypes`          | Data types of each column                                    | —                                                    | Identify numeric, object, datetime types               |
| `df.count()`         | Count of non-null entries                                    | `axis=0`                                             | Count per column (`axis=0`) or per row (`axis=1`)      |
| `df.nunique()`       | Number of unique values per column                           | `dropna=True`                                        | Include/exclude NaNs in uniqueness count               |
| `df.value_counts()`  | Frequency of unique rows; use `df['col'].value_counts()` for the values of one column | `normalize=True`, `dropna=False`                     | Show proportions; keep missing values in the counts    |
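| `df.mean()`          | Column-wise mean                                             | `axis=0`, `skipna=True`, `numeric_only=True`         | Choose column- or row-wise mean; skip NaNs; ignore non-numeric columns (needed for mixed dtypes in pandas 2.x) |
| `df.median()`        | Column-wise median                                           | `axis=0`, `skipna=True`, `numeric_only=True`         | Robust central tendency; skip NaNs; ignore non-numeric columns |
| `df.std()`           | Standard deviation (sample, `ddof=1` by default)             | `axis=0`, `skipna=True`, `numeric_only=True`         | Assess spread; skip NaNs; ignore non-numeric columns   |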
| `df.min()`           | Minimum value per column                                     | `axis=0`, `skipna=True`                              | Find lowest values; skip NaNs                          |
| `df.max()`           | Maximum value per column                                     | `axis=0`, `skipna=True`                              | Find highest values; skip NaNs                         |
| `df.sum()`           | Sum of values per column                                     | `axis=0`, `skipna=True`                              | Total values; skip NaNs                                |
| `df.corr()`          | Pairwise correlation matrix of numeric columns               | `method='pearson'`, `'spearman'`, `'kendall'`, `numeric_only=True` | Pearson for linear relationships; Spearman or Kendall for rank-based ones; skip non-numeric columns |
| `df.mode()`          | Most frequent value(s)                                       | `dropna=True`                                        | Include/exclude NaNs in mode calculation               |
| `df.isnull()`        | Boolean mask of missing values                               | —                                                    | Identify missing entries                               |
| `df.notnull()`       | Boolean mask of non-missing values                           | —                                                    | Identify complete entries                              |
| `df.duplicated()`    | Boolean mask of duplicate rows                               | `subset=[...]`, `keep='first'`                       | Check duplicates in specific columns; choose which occurrence is left unmarked (`'first'`, `'last'`, or `False` to mark all) |
| `df.dropna()`        | Remove missing values                                        | `axis=0`, `subset=[...]`, `how='any'`                | Drop rows/columns with missing data based on rules     |
| `df.fillna()`        | Fill missing values                                          | `value=...`; for forward/backward fill use `df.ffill()` / `df.bfill()` | Impute with a static value (scalar or per-column dict); `method='ffill'`/`'bfill'` is deprecated since pandas 2.1 |
| `df.sort_values()`   | Sort by column(s)                                            | `by='col'`, `ascending=True`                         | Sort by one or more columns; control order             |
| `df.sort_index()`    | Sort by index                                                | `ascending=True`                                     | Reorder rows by index                                  |
| `df.rename()`        | Rename columns or index                                      | `columns={'old':'new'}`                              | Standardize or clarify column names                    |
| `df.astype()`        | Convert data types                                           | `{'col': 'int'}`                                     | Fix types for analysis or modeling                     |
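| `df.drop()`          | Remove rows or columns                                       | `columns=[...]`, or `labels=[...]` with `axis=1`     | Clean up unwanted data; `columns=` and `labels=` + `axis=1` are equivalent, so use one or the other |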
| `df.apply()`         | Apply function across rows or columns                        | `func=...`, `axis=0`                                 | Use custom logic; `axis=0` passes each column to `func`, `axis=1` passes each row |
| `df.groupby()`       | Group data by column(s)                                      | `by='col'`                                           | Aggregate by category or group                         |
| `df.agg()`           | Aggregate with multiple functions                            | `{'col': ['mean', 'std']}`                           | Apply several stats per column, on a DataFrame or after `groupby()` |
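
A few of the methods in the table can be seen working together in the short sketch below. The DataFrame and its column names (`city`, `sales`, `units`) are made up for illustration, and the steps are one possible way to combine these calls rather than a recommended workflow.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for this example only.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Oslo", "Bergen", "Bergen", "Oslo"],
        "sales": [10.0, np.nan, 7.0, 7.0, 10.0],
        "units": [1, 2, 3, 3, 1],
    }
)

# Inspection: size, non-null counts per column, distinct values.
n_rows, n_cols = df.shape
non_null = df.count()
unique_cities = df["city"].nunique()

# Missing data: count it, fill it per column, or drop the affected rows.
missing_sales = int(df["sales"].isnull().sum())
filled = df.fillna({"sales": 0.0})
complete = df.dropna(subset=["sales"])

# Duplicates: keep="first" leaves the first occurrence unmarked.
dup_mask = df.duplicated(keep="first")

# Frequencies of a single column, as proportions.
city_share = df["city"].value_counts(normalize=True)

# Grouped aggregation with several functions (NaNs are skipped).
summary = df.groupby("city")["sales"].agg(["mean", "count"])

# Drop, rename, and sort in one chain; NaNs sort last by default.
tidy = (
    df.drop(columns=["units"])
    .rename(columns={"sales": "revenue"})
    .sort_values(by="revenue", ascending=False)
)

print(summary)
print(tidy.head(n=3))
```

Each call returns a new object and leaves `df` unchanged, which is why the intermediate results are assigned to their own names.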