# Pandas

Pandas provides methods and attributes for inspecting a DataFrame and checking its quality. The entries below assume a DataFrame named `df` and the conventional `import pandas as pd`:

### 1. **Basic Information**
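- **`df.head()`**: Returns the first 5 rows of the DataFrame by default; pass a number (e.g., `df.head(10)`) to change the count.
- **`df.tail()`**: Returns the last 5 rows of the DataFrame by default; it accepts a row count in the same way.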
- **`df.info()`**: Provides a summary of the DataFrame, including column names, data types, and non-null counts.
- **`df.describe()`**: Generates descriptive statistics (e.g., count, mean, standard deviation) for numerical columns.
- **`df.shape`**: Returns the shape of the DataFrame as a `(rows, columns)` tuple. It is an attribute, so it takes no parentheses.
- **`df.columns`**: Lists all column names.
- **`df.dtypes`**: Returns the data types of each column.
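
As a minimal sketch of the calls above, the following builds a small DataFrame and inspects it. The sample data and variable names are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Cara", "Dev", "Eli", "Fay"],
        "age": [34, 28, 45, 23, 39, 31],
        "city": ["Lima", "Oslo", "Pune", "Rome", "Kyiv", "Doha"],
    }
)

first_rows = df.head()           # first 5 rows by default
last_two = df.tail(2)            # pass a count to change how many rows come back
n_rows, n_cols = df.shape        # attribute, not a method
column_names = list(df.columns)
age_stats = df.describe()        # only numeric columns are summarized by default

df.info()                        # prints its summary and returns None
print(df.dtypes)
```

Note that `df.info()` prints directly rather than returning a value, so it is usually called on its own line.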

### 2. **Missing Data**
- **`df.isnull()`**: Checks for missing values, returning a DataFrame of booleans.
- **`df.isnull().sum()`**: Counts the number of missing values in each column.
- **`df.notnull()`**: Checks for non-missing values.
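- **`df.dropna()`**: Returns a copy with every row that contains at least one missing value removed; the original DataFrame is unchanged unless you reassign the result.
- **`df.fillna()`**: Fills missing values with a specified value (a scalar or a per-column dictionary). For forward or backward fill, use `df.ffill()` or `df.bfill()`; the `method` argument of `fillna()` is deprecated in recent pandas versions.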
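
The missing-data checks above can be sketched as follows. The column names and fill values are assumptions made for this example, not recommendations.

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps in both a numeric and a text column.
df = pd.DataFrame(
    {
        "temp": [21.5, np.nan, 19.0, np.nan],
        "city": ["Lima", "Oslo", None, "Rome"],
    }
)

missing_per_column = df.isnull().sum()   # count of missing values per column
complete_rows = df.dropna()              # keeps only rows with no missing values
filled_constant = df.fillna({"temp": 0.0, "city": "unknown"})  # per-column fill values
forward_filled = df.ffill()              # carry the last valid value forward

print(missing_per_column)
```

Each of these calls returns a new object, so `df` itself still contains its missing values afterwards.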

### 3. **Duplicated Data**
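- **`df.duplicated()`**: Returns a boolean Series that marks each row repeating an earlier row. By default the first occurrence is not marked.
- **`df.drop_duplicates()`**: Returns a copy with duplicate rows removed, keeping the first occurrence by default. Use `subset` to compare only certain columns.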
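
A short sketch of duplicate detection, assuming a hypothetical table where `id` is expected to be unique:

```python
import pandas as pd

# Hypothetical data: row 2 repeats row 1 exactly; rows 3 and 4 share an id only.
df = pd.DataFrame(
    {
        "id": [1, 2, 2, 3, 3],
        "score": [90, 85, 85, 70, 75],
    }
)

dup_mask = df.duplicated()                 # True only for fully repeated rows
n_duplicates = int(dup_mask.sum())
deduped = df.drop_duplicates()             # drops exact repeats, keeps the first
deduped_by_id = df.drop_duplicates(subset="id", keep="last")  # one row per id

print(n_duplicates)
```

Comparing on a `subset` is stricter than comparing whole rows: here it also collapses the two rows with `id` 3, which differ in `score`.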

### 4. **Data Types**
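- **`df.astype()`**: Converts a DataFrame or a single column to the given data type, e.g., `df['column'].astype(int)`. It raises an error if any value cannot be converted.
- **`pd.to_numeric()`**: Converts a column to numeric values (useful for converting strings to numbers). Pass `errors='coerce'` to turn unparseable values into `NaN` instead of raising an error.
- **`pd.to_datetime()`**: Converts a column of date strings or timestamps to a datetime type.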
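
The conversions above might look like this on a DataFrame whose columns were all read in as strings. The data is hypothetical, and `errors="coerce"` is one possible policy for bad values, not the only one.

```python
import pandas as pd

# Hypothetical data where every column arrived as text.
df = pd.DataFrame(
    {
        "qty": ["1", "2", "3"],
        "price": ["9.99", "n/a", "4.50"],
        "day": ["2024-01-15", "2024-02-01", "2024-03-10"],
    }
)

df["qty"] = df["qty"].astype(int)                           # strict: fails on bad input
df["price"] = pd.to_numeric(df["price"], errors="coerce")   # "n/a" becomes NaN
df["day"] = pd.to_datetime(df["day"])                       # ISO date strings to datetimes

print(df.dtypes)
```

Checking `df["price"].isnull().sum()` after a coercing conversion shows how many values failed to parse.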

### 5. **Unique Values and Value Counts**
- **`df['column'].unique()`**: Returns the unique values of a column.
- **`df['column'].nunique()`**: Returns the number of unique values in a column.
- **`df['column'].value_counts()`**: Counts how often each unique value occurs in a column, sorted from most to least frequent. Missing values are excluded by default.
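
As an illustration of these three calls, here is a sketch on a hypothetical `color` column:

```python
import pandas as pd

# Hypothetical categorical column.
df = pd.DataFrame({"color": ["red", "blue", "red", "green", "red", "blue"]})

distinct = df["color"].unique()                    # values in order of first appearance
n_distinct = df["color"].nunique()                 # number of distinct values
counts = df["color"].value_counts()                # most frequent value first
shares = df["color"].value_counts(normalize=True)  # proportions instead of counts

print(counts)
```

The `normalize=True` variant is convenient when the relative share of each value matters more than the raw count.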

### 6. **Summarization**
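- **`df.mean(numeric_only=True)`**: Calculates the mean of each numeric column. Since pandas 2.0, `numeric_only` defaults to `False`, so a plain `df.mean()` raises an error when the DataFrame contains columns that cannot be averaged, such as strings.
- **`df.median(numeric_only=True)`**: Calculates the median of each numeric column, with the same `numeric_only` caveat.
- **`df.mode()`**: Finds the mode (most frequent value) of each column. The result can have more than one row when several values tie.
- **`df.corr(numeric_only=True)`**: Computes the pairwise correlation between numeric columns (Pearson by default).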
- **`df.count()`**: Returns the number of non-null observations in each column.
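
The summaries above can be sketched on a small mixed-type DataFrame. The data is hypothetical, and `numeric_only=True` is passed explicitly so the text column `group` is skipped.

```python
import pandas as pd

# Hypothetical data: one text column, two numeric columns, one missing value.
df = pd.DataFrame(
    {
        "group": ["a", "b", "a", "b"],
        "x": [1.0, 2.0, 3.0, 4.0],
        "y": [2.0, 4.0, 6.0, None],
    }
)

means = df.mean(numeric_only=True)       # missing values are skipped by default
medians = df.median(numeric_only=True)
modes = df["group"].mode()               # ties produce more than one entry
corr = df.corr(numeric_only=True)        # uses rows where both columns are present
non_null = df.count()                    # non-null observations per column

print(means)
print(corr)
```

Comparing `df.count()` against `len(df)` is a quick way to see which columns contain missing values.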

### 7. **Conditional Checks**
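- **`df[df['column'] > value]`**: Returns the rows where a condition holds (e.g., the column value is greater than `value`). Combine conditions with `&` and `|`, wrapping each one in parentheses.
- **`df.query('column > @value')`**: Filters rows using an expression string. Prefix a Python variable with `@`; without it, `query()` looks for a column named `value`.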
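
Both filtering styles can be sketched side by side. The data and the `min_age` threshold are hypothetical.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Cara", "Dev"],
        "age": [34, 28, 45, 23],
        "city": ["Lima", "Oslo", "Pune", "Rome"],
    }
)

min_age = 30  # hypothetical threshold held in a Python variable

by_mask = df[df["age"] > min_age]            # boolean-mask filtering
by_query = df.query("age > @min_age")        # same filter; @ refers to the variable
combined = df[(df["age"] > min_age) & (df["city"] == "Lima")]  # parentheses required

print(by_mask)
```

The two approaches select the same rows here; `query()` tends to read more cleanly for long conditions, while boolean masks work with any column name, including names that are not valid Python identifiers.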

These functions can help you better understand the structure and quality of your dataset.