In [None]:
import datetime as dt
import pandas as pd

In [None]:
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

In this notebook we compare the top Auto EDA libraries, which greatly facilitate dataset analysis and visualization.

The main idea is to compare the time to plot and the results of each Auto EDA library, so you can select the best one for your work.
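Since every section below measures elapsed time by hand, here is a minimal sketch of how the timings could be collected in one place and ranked. The names `timed`, `summarize`, and `timings` are hypothetical helpers introduced only for illustration, and the two workloads are stand-ins for the real library calls.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock seconds spent inside the `with` block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def summarize(results):
    """Return (label, seconds) pairs sorted from fastest to slowest."""
    return sorted(results.items(), key=lambda item: item[1])


# Hypothetical registry: one entry per timed step.
timings = {}

# Stand-in workloads; in this notebook they would be calls such as
# plot(df) or ProfileReport(df).
with timed("fast step", timings):
    sum(range(1_000))
with timed("slow step", timings):
    time.sleep(0.05)

for label, seconds in summarize(timings):
    print(f"{label}: {seconds:.3f} s")
```

Wrapping each library call in `with timed("Dataprep", timings): ...` would replace the repeated start/finish cells. A ranking like this is only a fair comparison when every library is run on the same dataset.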

<a id='1'></a>
# <p style="background-color:skyblue; font-family:newtimeroman; font-size:150%; text-align:center">1. 📊 Dataprep 📚</p>

In [None]:
!pip install dataprep

In [None]:
from dataprep.eda import plot, plot_correlation, create_report, plot_missing

In [None]:
df = pd.read_csv('/kaggle/input/titanic/train.csv')

In [None]:
# Start of Dataprep process
start_time = dt.datetime.now()
print("Started at ", start_time)

In [None]:
plot(df)

In [None]:
create_report(df)

In [None]:
plot(df, "Age")

In [None]:
plot(df, "Age", "Embarked")

In [None]:
print('Dataprep finished!!')
finish_time = dt.datetime.now()
print("Finished at ", finish_time)
elapsed = finish_time - start_time
print("Elapsed time: ", elapsed)

<a id='2'></a>
# <p style="background-color:skyblue; font-family:newtimeroman; font-size:150%; text-align:center">2. 📊 AutoViz 📚</p>

AutoViz stands out among the freeware Python tools for rapid EDA automation because it is very fast, noticeably faster than close freeware rivals such as SweetViz or Pandas Profiling.

In [None]:
# GitHub no longer serves the unauthenticated git:// protocol, so install over HTTPS
!pip install git+https://github.com/AutoViML/AutoViz.git
!pip install xlrd

In [None]:
# Start of AutoViz process
start_time = dt.datetime.now()
print("Started at ", start_time)

In [None]:
df = pd.read_csv('/kaggle/input/creditcardfraud/creditcard.csv')
df.head()

In [None]:
from autoviz.AutoViz_Class import AutoViz_Class

AV = AutoViz_Class()
dftc = AV.AutoViz(
    filename='', 
    sep='' , 
    depVar='Class', 
    dfte=df, 
    header=0, 
    verbose=1, 
    lowess=False, 
    chart_format='png', 
    max_rows_analyzed=300000, 
    max_cols_analyzed=30
)

In [None]:
print('AutoViz finished!!')
finish_time = dt.datetime.now()
print("Finished at ", finish_time)
elapsed = finish_time - start_time
print("Elapsed time: ", elapsed)

<a id='3'></a>
# <p style="background-color:skyblue; font-family:newtimeroman; font-size:150%; text-align:center">3. 📊 Pandas Profiling 📚</p>

In [None]:
# Note: newer releases of this package are published as ydata-profiling (import ydata_profiling)
from pandas_profiling import ProfileReport

In [None]:
df = pd.read_csv('/kaggle/input/credit-card-customers/BankChurners.csv')

In [None]:
# Start of Pandas Profiling process
start_time = dt.datetime.now()
print("Started at ", start_time)

In [None]:
report = ProfileReport(df)

In [None]:
report

In [None]:
print('Pandas Profiling finished!!')
finish_time = dt.datetime.now()
print("Finished at ", finish_time)
elapsed = finish_time - start_time
print("Elapsed time: ", elapsed)

<a id='4'></a>
# <p style="background-color:skyblue; font-family:newtimeroman; font-size:150%; text-align:center">4. 📊 SweetViz 📚</p>

In [None]:
!pip install sweetviz

In [None]:
import sweetviz as sv

In [None]:
# Start of SweetViz process
start_time = dt.datetime.now()
print("Started at ", start_time)

In [None]:
df = pd.read_csv('/kaggle/input/credit-card-customers/BankChurners.csv').head(2000)

In [None]:
advert_report = sv.analyze([df, 'Data'])

In [None]:
advert_report.show_html()

![sweetviz.gif](attachment:sweetviz.gif)

In [None]:
print('SweetViz finished!!')
finish_time = dt.datetime.now()
print("Finished at ", finish_time)
elapsed = finish_time - start_time
print("Elapsed time: ", elapsed)

<a id='5'></a>
# <p style="background-color:skyblue; font-family:newtimeroman; font-size:150%; text-align:center">5. 📊 Lux 📚</p>

In [None]:
!pip install lux-api

In [None]:
import lux
import pandas as pd

In [None]:
df = pd.read_csv('/kaggle/input/titanic/train.csv')

In [None]:
df

<a id='6'></a>
# <p style="background-color:skyblue; font-family:newtimeroman; font-size:150%; text-align:center">6. 📊 D-Tale 📚</p>

In [None]:
!pip install dtale

# References

The references below may be helpful for a deeper dive into the Auto EDA libraries covered in this notebook:

* Pandas Profiling GitHub - https://github.com/pandas-profiling/pandas-profiling
* Dan Roth, AutoViz: A New Tool for Automated Visualization - https://towardsdatascience.com/autoviz-a-new-tool-for-automated-visualization-ec9c1744a6ad
* George Vyshnya, PROs and CONs of Rapid EDA Tools - https://medium.com/sbc-group-blog/pros-and-cons-of-rapid-eda-tools-e1ccd159ab07
* SweetViz - https://towardsdatascience.com/sweetviz-automated-eda-in-python-a97e4cabacde
* DataPrep - https://sfu-db.github.io/dataprep/user_guide/eda/plot.html