### 4. Data Cleaning

**Definition**:  
Data cleaning is the process of detecting and correcting (or removing) inaccurate, incomplete, or duplicated records in a dataset, so that the data is reliable enough to analyze.

**Key Concepts**:
- **Handling Missing Values**:
  - **Drop Missing Values**: Remove rows or columns with missing data.
  - **Fill Missing Values**: Replace missing entries using forward-fill, backward-fill, or a fixed value such as the column mean or median.
  
- **Removing Duplicates**:
  - Identify and drop duplicate records to maintain data integrity.

- **Correcting Data Types**:
  - Ensure that each column has the correct data type (e.g., integers, floats, dates) for accurate analysis.

- **Library**:
  - **Pandas**: Provides functions for data cleaning and preprocessing.

- **Example Usage**:
Here's how to clean a DataFrame in Pandas:

```python
import pandas as pd

# Sample data with missing values and duplicates
data = {
    'Title': ['Video 1', 'Video 2', None, 'Video 2'],
    'URL': ['http://youtube.com/video1', 'http://youtube.com/video2', 'http://youtube.com/video3', 'http://youtube.com/video2']
}

# Create a DataFrame
df = pd.DataFrame(data)

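# Drop exact duplicate rows (the first occurrence is kept)
df = df.drop_duplicates()

# Fill missing titles. Assign the result back to the column: calling
# fillna(inplace=True) on a column selection may not update df in newer pandas.
df['Title'] = df['Title'].fillna('Unknown')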

# Display cleaned DataFrame
print(df)
```
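
The example above fills a missing value with a fixed string. The other strategies listed under Handling Missing Values (dropping, forward-fill, backward-fill, and filling with the mean) might look like the following minimal sketch. The `views` and `likes` columns and their values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical numeric data with gaps (not from the example above)
df = pd.DataFrame({
    'views': [100.0, None, 300.0, None],
    'likes': [10.0, 20.0, None, 40.0],
})

# Drop every row that contains at least one missing value
dropped = df.dropna()

# Forward-fill: carry the last valid value down into the gap
forward_filled = df.ffill()

# Backward-fill: pull the next valid value up into the gap
# (a gap at the very end of a column has nothing to pull from and stays missing)
backward_filled = df.bfill()

# Fill each column's gaps with that column's mean
mean_filled = df.fillna(df.mean())

print(mean_filled)
```

Which strategy fits depends on the data: forward-fill and backward-fill assume that row order is meaningful (for example, a time series), while mean or median filling treats rows as interchangeable.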
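
Correcting data types is not covered by the example above, so here is a small sketch of one common approach, assuming the raw values arrived as strings. The column names and values are hypothetical. `pd.to_numeric` and `pd.to_datetime` both accept `errors='coerce'`, which turns entries that cannot be parsed into missing values instead of raising an error.

```python
import pandas as pd

# Hypothetical raw data in which every value was read in as a string
raw = pd.DataFrame({
    'views': ['100', '250', 'n/a'],
    'published': ['2024-01-05', '2024-02-10', 'not a date'],
})

# Convert to numbers; 'n/a' cannot be parsed and becomes NaN
raw['views'] = pd.to_numeric(raw['views'], errors='coerce')

# Convert to datetimes; 'not a date' cannot be parsed and becomes NaT
raw['published'] = pd.to_datetime(raw['published'], errors='coerce')

# Inspect the resulting column types
print(raw.dtypes)
```

Coercing can hide bad input, so it is usually worth counting the missing values the conversion produced (for example with `raw.isna().sum()`) before moving on.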