# 1 YData Profiling
YData Profiling (formerly pandas_profiling) is a Python library that automatically generates an HTML Exploratory Data Analysis (EDA) report from a Pandas DataFrame.

It quickly shows data structure, missing values, distributions, correlations, and basic warnings.
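For comparison, the same first-look facts can be pulled by hand with plain pandas. The sketch below is only an approximation of what the report summarizes, not how the library computes it. The rows are made-up values shaped like the `tips` dataset, and the variable names are illustrative.

```python
import pandas as pd

# Made-up rows shaped like the tips dataset (illustrative values only).
sample = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, None],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
    "day": ["Sun", "Sun", "Sat", "Sat", None],
})

structure = sample.dtypes                                 # column types
missing = sample.isna().sum()                             # missing values per column
summary = sample.describe()                               # numeric distributions
correlations = sample.select_dtypes("number").corr()      # pairwise correlations

print(structure)
print(missing)
print(summary)
print(correlations)
```

A profiling report bundles these checks, plus plots and warnings, into a single HTML page.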

It is useful for a first look / sanity check on small to medium datasets.

It does not clean data, build models, or replace manual EDA.

It is slow and memory-heavy on large datasets and is not meant for production pipelines.
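One common way to soften the cost on larger data is to profile a random sample and turn on the library's `minimal=True` mode, which skips the most expensive computations. The sketch below assumes that approach; the helper names `sample_for_profiling` and `build_report`, the row limit, and the output filename are all hypothetical choices, not part of the library.

```python
import pandas as pd

def sample_for_profiling(df: pd.DataFrame, max_rows: int = 10_000, seed: int = 0) -> pd.DataFrame:
    """Return df unchanged if it is small enough, otherwise a random sample of max_rows rows."""
    if len(df) <= max_rows:
        return df
    return df.sample(n=max_rows, random_state=seed)

def build_report(df: pd.DataFrame, path: str = "sampled_report.html") -> None:
    """Profile a (sampled) DataFrame in minimal mode and write an HTML report."""
    from ydata_profiling import ProfileReport  # imported lazily; requires ydata-profiling
    profile = ProfileReport(df, minimal=True, title="Sampled profile")
    profile.to_file(path)

# Synthetic frame standing in for a large dataset.
big = pd.DataFrame({"x": range(50_000), "y": [i % 7 for i in range(50_000)]})
small = sample_for_profiling(big, max_rows=5_000)
print(len(small))  # 5000
```

Calling `build_report(small)` would then write the report. A sample can hide rare values, so treat the result as a rough first look.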

## One-line definition (exam-safe):

YData Profiling automatically creates a detailed EDA report to quickly understand a dataset’s structure and quality.

In [1]:
import seaborn as sns
df=sns.load_dataset('tips')
from ydata_profiling import ProfileReport


In [2]:
profile = ProfileReport(df, explorative=True)
# to_file writes HTML (or JSON); PDF is not a supported output format
profile.to_file("report.html")


# 2 D-Tale
D-Tale is a Python library that provides an interactive web-based UI to explore Pandas DataFrames.

1. You can sort, filter, search, plot, and inspect data live in a browser.

2. It is mainly used for hands-on, interactive EDA, not reports.

3. Changes are temporary (it doesn’t clean or save data automatically).

4. Best for small to medium datasets.

5. Not suitable for production pipelines or large datasets.

## One-line definition (exam-safe):

D-Tale is an interactive web tool for exploring Pandas DataFrames in real time using a browser-based interface.

In [3]:
import pandas as pd
import seaborn as sns

In [4]:
df=sns.load_dataset('titanic')

In [5]:
import dtale
dtale.show(df)



# 3 Sweetviz
Sweetviz is a Python library that automatically generates an HTML EDA report from a Pandas DataFrame, with a strong focus on target-variable comparison.

It is an open-source tool.

## What Sweetviz is used for

1. Quick EDA summary (distributions, missing values, correlations)

2. Comparing features against a target (classification or regression)

3. Simple before/after dataset comparison
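The target and comparison use cases map onto `sv.analyze(..., target_feat=...)` and `sv.compare(...)`. The sketch below is a minimal illustration under assumptions: the helpers `split_frame` and `build_reports`, the 80/20 split, and the toy rows are hypothetical, and the target is kept numeric (0/1) because Sweetviz expects a numeric or boolean target.

```python
import pandas as pd

def split_frame(df: pd.DataFrame, frac: float = 0.8, seed: int = 0):
    """Split df into two disjoint parts to stand in for train/test data."""
    train = df.sample(frac=frac, random_state=seed)
    test = df.drop(train.index)
    return train, test

def build_reports(df: pd.DataFrame, target: str):
    """Build a target-focused report and a train-vs-test comparison report."""
    import sweetviz as sv  # imported lazily; requires sweetviz
    train, test = split_frame(df)
    target_report = sv.analyze(df, target_feat=target)
    compare_report = sv.compare([train, "Train"], [test, "Test"], target_feat=target)
    return target_report, compare_report

# Toy rows (illustrative values only) with a numeric 0/1 target.
demo = pd.DataFrame({
    "age": [22, 38, 26, 35, 54, 2, 27, 14, 4, 58],
    "survived": [0, 1, 1, 1, 0, 0, 1, 1, 1, 1],
})
train, test = split_frame(demo)
print(len(train), len(test))  # 8 2
```

Each returned report can be written out with `show_html("some_name.html")`; on a dataset as tiny as `demo` the output would not be meaningful, so the toy frame only demonstrates the split.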

In [None]:
import sweetviz as sv
report =sv.analyze(df)
report.show_html()

# 4 AutoViz
AutoViz is a Python library that automatically generates visual EDA plots (graphs) from a dataset.

It focuses on visualizations, not reports or interactive tables.

In [3]:
! pip install autoviz
from autoviz import AutoViz_Class
av = AutoViz_Class()
# AutoViz() requires a `filename` argument; pass an empty string and
# supply the in-memory DataFrame through `dfte` instead.
dft = av.AutoViz(filename="", dfte=df)

# 5 DataPrep
DataPrep is a Python library used for quick exploratory data analysis (EDA) with interactive visualizations, mainly in Jupyter Notebook and Google Colab.

## What it does

1. Generates automatic EDA plots

2. Shows missing values and distributions

3. Provides interactive charts in notebooks

## What it does NOT do

1. No model training

2. No feature engineering

3. Not for production pipelines

## Limitations

1. Slow on large datasets

2. Installation issues are common

3. Limited customization

4. Less actively maintained

In [None]:
! pip install dataprep
from dataprep.datasets import load_dataset
from dataprep.eda import create_report
# create_report expects a DataFrame, so load the dataset first
df = load_dataset('titanic')
create_report(df).show_browser()

# 6 Lux
Lux is a Python library that provides automatic visualization recommendations for Pandas DataFrames while you explore data.

It is an open-source project hosted under the lux-org organization on GitHub.

## What Lux does

1. Automatically suggests relevant plots

2. Integrates directly with Pandas

3. Updates visualizations when you filter or modify data

4. Helps discover patterns without writing plot code
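Recommendations can also be steered by setting an intent on the DataFrame, and an ordinary pandas filter produces a new frame that Lux re-analyzes when it is displayed. The sketch below assumes Lux is installed in a Jupyter environment; the helper `set_intent` and the rows are hypothetical, with column names borrowed from `college.csv`.

```python
import pandas as pd

# Made-up rows that reuse three column names from college.csv.
colleges = pd.DataFrame({
    "FundingModel": ["Public", "Public", "Private For-Profit", "Public"],
    "SATAverage": [823, 1146, 1225, 1020],
    "AverageCost": [18888, 19990, 37848, 12946],
})

def set_intent(frame: pd.DataFrame, columns):
    """Tell Lux which columns its recommendations should focus on."""
    import lux  # noqa: F401  (importing lux attaches Lux behavior to DataFrames)
    frame.intent = list(columns)
    return frame

# Plain pandas filtering; displaying `public` in a notebook with Lux loaded
# would show recommendations for the filtered rows only.
public = colleges[colleges["FundingModel"] == "Public"]
print(len(public))  # 3
```

In a notebook, `set_intent(colleges, ["SATAverage", "AverageCost"])` followed by displaying the frame would show the Lux widget centered on those two columns.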

In [2]:
! pip install lux-api
import lux  # importing lux enables the recommendations on DataFrames
import pandas as pd



In [6]:
df = pd.read_csv("https://raw.githubusercontent.com/lux-org/lux-datasets/master/data/college.csv")


In [7]:
df

Unnamed: 0,Name,PredominantDegree,HighestDegree,FundingModel,Region,Geography,AdmissionRate,ACTMedian,SATAverage,AverageCost,Expenditure,AverageFacultySalary,MedianDebt,AverageAgeofEntry,MedianFamilyIncome,MedianEarnings
0,Alabama A & M University,Bachelor's,Graduate,Public,Southeast,Mid-size City,0.8989,17,823,18888,7459,7079,19500.0,20.629999,29039.0,27000
1,University of Alabama at Birmingham,Bachelor's,Graduate,Public,Southeast,Mid-size City,0.8673,25,1146,19990,17208,10170,16250.0,22.670000,34909.0,37200
2,University of Alabama in Huntsville,Bachelor's,Graduate,Public,Southeast,Mid-size City,0.8062,26,1180,20306,9352,9341,16500.0,23.190001,39766.0,41500
3,Alabama State University,Bachelor's,Graduate,Public,Southeast,Mid-size City,0.5125,17,830,17400,7393,6557,15854.5,20.889999,24029.5,22400
4,The University of Alabama,Bachelor's,Graduate,Public,Southeast,Small City,0.5655,26,1171,26717,9817,9605,17750.0,20.770000,58976.0,39200
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1289,University of Connecticut-Avery Point,Bachelor's,Graduate,Public,New England,Mid-size Suburb,0.5940,24,1020,12946,11730,14803,18983.0,20.120001,86510.0,49700
1290,University of Connecticut-Stamford,Bachelor's,Graduate,Public,New England,Mid-size City,0.4107,21,1017,13028,4958,14803,18983.0,20.120001,86510.0,49700
1291,California State University-Channel Islands,Bachelor's,Graduate,Public,Far West,Mid-size Suburb,0.6443,20,954,22570,12026,8434,12500.0,24.850000,32103.0,35800
1292,DigiPen Institute of Technology,Bachelor's,Graduate,Private For-Profit,Far West,Small City,0.6635,28,1225,37848,5998,7659,19000.0,21.209999,68233.0,72800
