# Features of Pandas Profiling (now ydata-profiling)

1. **Overview of Variables**:
   - Provides a detailed summary of each variable in the dataset.
   - Includes counts, mean, median, mode, minimum, maximum, standard deviation, and more.
   - Visualizes the distribution of numerical and categorical variables.

2. **Missing Values Analysis**:
   - Identifies missing values and calculates their percentage.
   - Visualizes missing value patterns with heatmaps and bar charts.

3. **Correlations**:
   - Computes and displays correlation coefficients such as Pearson, Spearman, and Kendall.
   - Highlights highly correlated features for easier identification of redundant variables.

4. **Data Quality Assessment**:
   - Detects potential issues like:
     - Constant values.
     - High cardinality variables (too many unique values in categorical columns).
     - Duplicate rows.
   - Flags potential data cleaning requirements.

5. **Interactive Visualizations**:
   - Offers visualizations such as:
     - Histograms and KDE plots for numerical distributions.
     - Box plots to identify outliers.
     - Pie charts for categorical data representation.
     - Heatmaps for correlations.

6. **Customizable Reports**:
   - Generates detailed, interactive reports in HTML and can export the same results as JSON.
   - Reports can be shared as standalone files or generated as a step in a data pipeline.

7. **Descriptive Warnings**:
   - Highlights:
     - Columns with zero variance.
     - Skewed distributions.
     - Sparse data in columns.
   - Provides insights on potential preprocessing needs.

8. **Explorative Mode**:
   - Setting `explorative=True` enables additional, more detailed analyses intended for exploratory data analysis (EDA).

9. **Support for Large Datasets**:
   - Provides configuration options, such as the `minimal=True` mode, that skip expensive computations to reduce run time on large datasets.

10. **Integration with Pandas**:
    - Works seamlessly with pandas DataFrames, making it a convenient tool for data scientists and analysts.

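The missing-value and data-quality checks listed above can be approximated by hand with plain pandas. The sketch below is only an illustration of the kind of statistics involved, not the library's own implementation. The `toy` DataFrame and its column names are invented for the example.

```python
import numpy as np
import pandas as pd

# Toy data invented for illustration; it is not the credit dataset used below.
toy = pd.DataFrame(
    {
        "age": [25, 32, np.nan, 41, 25],
        "country": ["BD", "BD", "BD", "BD", "BD"],  # same value in every row
        "user_id": ["a1", "b2", "c3", "d4", "a1"],
        "income": [3000.0, np.nan, np.nan, 5200.0, 3000.0],
    }
)

# Percentage of missing values per column.
missing_pct = toy.isna().mean() * 100

# Columns that hold a single distinct value (zero variance).
constant_cols = [c for c in toy.columns if toy[c].nunique(dropna=False) == 1]

# Number of rows that exactly repeat an earlier row.
n_duplicates = int(toy.duplicated().sum())

# Distinct values per non-numeric column, a rough signal of high cardinality.
cardinality = toy.select_dtypes(exclude="number").nunique()

print(missing_pct)
print("Constant columns:", constant_cols)
print("Duplicate rows:", n_duplicates)
print(cardinality)
```

The profiling report gathers checks of this kind for every column at once, which is the main convenience over writing them individually.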

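To make the correlation feature concrete, here is a minimal sketch that computes Pearson and Spearman matrices with `DataFrame.corr` and lists strongly correlated column pairs. The toy data and the `0.9` cutoff are assumptions chosen for the example; the report applies its own settings. Kendall is left out only because pandas needs SciPy to compute it.

```python
import pandas as pd

# Toy numeric data invented for illustration.
toy = pd.DataFrame(
    {
        "x": [1, 2, 3, 4, 5],
        "y": [2, 4, 6, 8, 10],  # exactly 2 * x, so perfectly correlated with x
        "z": [5, 3, 4, 1, 2],
    }
)

pearson = toy.corr(method="pearson")    # linear relationship
spearman = toy.corr(method="spearman")  # rank-based relationship

# Hypothetical cutoff for calling a pair "highly correlated".
threshold = 0.9

# Visit each unordered column pair once and keep the strong ones.
high_pairs = [
    (a, b, float(pearson.loc[a, b]))
    for i, a in enumerate(pearson.columns)
    for b in pearson.columns[i + 1:]
    if abs(pearson.loc[a, b]) >= threshold
]

print(pearson.round(2))
print(high_pairs)
```

Pairs that pass the cutoff, such as `x` and `y` here, are candidates for dropping one of the two columns as redundant.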
In [None]:
import pandas as pd
# pandas_profiling was renamed to ydata_profiling; the old import is kept for reference only
# from pandas_profiling import ProfileReport
from ydata_profiling import ProfileReport

In [None]:
# Load the credit dataset straight from GitHub into a DataFrame
url = 'https://raw.githubusercontent.com/rashakil-ds/Public-Datasets/refs/heads/main/credit.csv'
df = pd.read_csv(url)
df.head()  # preview the first five rows

In [None]:
df.shape

# Generate a profiling report

In [None]:
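# explorative=True turns on the additional analyses of explorative mode
profile = ProfileReport(df, title="Pandas Profiling Report", explorative=True)
profile  # as the last expression in a cell, this renders the report inline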
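Besides displaying the report inline, it can be written to disk for sharing. The sketch below wraps that in a small helper. The name `save_report` and its parameters are hypothetical, introduced only for this example, while `ProfileReport(..., minimal=True)` and `to_file` are the library calls it relies on. As far as I know, `to_file` picks HTML or JSON output from the file extension.

```python
import pandas as pd


def save_report(df: pd.DataFrame, path: str, minimal: bool = False) -> None:
    """Profile `df` and write the report to `path` (.html or .json).

    Hypothetical helper for illustration; not part of ydata_profiling.
    """
    # Imported here because ydata_profiling is comparatively slow to import.
    from ydata_profiling import ProfileReport

    # minimal=True skips expensive computations, which helps on large datasets.
    profile = ProfileReport(df, title="Pandas Profiling Report", minimal=minimal)
    profile.to_file(path)
```

With the `df` loaded earlier, `save_report(df, "credit_report.html")` would produce a standalone HTML file, and `save_report(df, "credit_report.json", minimal=True)` a lighter JSON export. Both file names are arbitrary.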