# SweetViz

Sweetviz is an open-source Python library that generates beautiful, high-density visualizations to kickstart EDA (Exploratory Data Analysis) with just two lines of code. Output is a fully self-contained HTML application.

Sweetviz is built around two tasks: visualizing how features relate to a target value, and comparing datasets. It is meant to speed up analysis of target characteristics, training versus testing data, and similar data characterization tasks.
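The training-versus-testing comparison mentioned above can be sketched with `sv.compare`, which takes two `[DataFrame, label]` pairs. The helper names `split_frame` and `build_comparison_report`, the toy data, and the 25% split are illustrative assumptions and not part of the original notebook. Sweetviz is imported inside the function so that the split helper works on its own.

```python
import pandas as pd


def split_frame(df, test_fraction=0.25, seed=0):
    """Return (train, test) as a random, non-overlapping split of df."""
    test = df.sample(frac=test_fraction, random_state=seed)
    train = df.drop(test.index)
    return train, test


def build_comparison_report(train, test, target=None):
    """Build a Sweetviz report comparing two DataFrames (requires sweetviz)."""
    import sweetviz as sv  # imported lazily; only needed when a report is built

    return sv.compare([train, "Training Data"], [test, "Test Data"], target_feat=target)


# Toy data standing in for a real dataset (illustrative only).
toy = pd.DataFrame({"x": range(8), "y": [0, 1] * 4})
train_df, test_df = split_frame(toy)
```

Calling `build_comparison_report(train_df, test_df).show_html('compare.html')` would then write the comparison report; the file name here is arbitrary.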

- https://pypi.org/project/sweetviz/
- https://www.analyticsvidhya.com/blog/2021/05/sweetviz-library-eda-in-seconds/

In [None]:
# Installation

# pip install sweetviz

In [1]:
import warnings
warnings.filterwarnings('ignore')

%matplotlib inline

# Necessary libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Automate EDA

import sweetviz as sv

## Seaborn Datasets

In [2]:
print(sns.get_dataset_names())

['anagrams', 'anscombe', 'attention', 'brain_networks', 'car_crashes', 'diamonds', 'dots', 'dowjones', 'exercise', 'flights', 'fmri', 'geyser', 'glue', 'healthexp', 'iris', 'mpg', 'penguins', 'planets', 'seaice', 'taxis', 'tips', 'titanic']


## Load car_crashes Dataset

In [3]:
df_car_crashes = sns.load_dataset('car_crashes')

In [4]:
df_car_crashes

|  | total | speeding | alcohol | not_distracted | no_previous | ins_premium | ins_losses | abbrev |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 18.8 | 7.332 | 5.64 | 18.048 | 15.04 | 784.55 | 145.08 | AL |
| 1 | 18.1 | 7.421 | 4.525 | 16.29 | 17.014 | 1053.48 | 133.93 | AK |
| 2 | 18.6 | 6.51 | 5.208 | 15.624 | 17.856 | 899.47 | 110.35 | AZ |
| 3 | 22.4 | 4.032 | 5.824 | 21.056 | 21.28 | 827.34 | 142.39 | AR |
| 4 | 12.0 | 4.2 | 3.36 | 10.92 | 10.68 | 878.41 | 165.63 | CA |
| 5 | 13.6 | 5.032 | 3.808 | 10.744 | 12.92 | 835.5 | 139.91 | CO |
| 6 | 10.8 | 4.968 | 3.888 | 9.396 | 8.856 | 1068.73 | 167.02 | CT |
| 7 | 16.2 | 6.156 | 4.86 | 14.094 | 16.038 | 1137.87 | 151.48 | DE |
| 8 | 5.9 | 2.006 | 1.593 | 5.9 | 5.9 | 1273.89 | 136.05 | DC |
| 9 | 17.9 | 3.759 | 5.191 | 16.468 | 16.826 | 1160.13 | 144.18 | FL |


### .show_notebook

In [12]:
# Build the report, display it inline in the notebook, and save it as an HTML file
sweetviz_report = sv.analyze(df_car_crashes)  # Optionally pass a target variable with target_feat='<column name>'
sweetviz_report.show_notebook(w=950,
                              h=800,
                              filepath='sweetviz_carcrash_report.html')

Report 'sweetviz_carcrash_report.html' was saved to storage.


### .show_html

In [13]:
# Save the report as an HTML file and open it in the default web browser
sweetviz_report = sv.analyze(df_car_crashes)  # Optionally pass a target variable with target_feat='<column name>'
sweetviz_report.show_html(filepath='sweetviz_carcrash_report.html')

Report sweetviz_carcrash_report.html was generated! NOTEBOOK/COLAB USERS: the web browser MAY not pop up, regardless, the report IS saved in your notebook/colab files.
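Because the browser may not open in notebook or Colab environments, it can help to skip the pop-up and keep track of where the file was written. The sketch below uses the `open_browser` argument of `show_html`; the wrapper name `save_report` is hypothetical and not part of Sweetviz.

```python
from pathlib import Path


def save_report(report, filepath, open_browser=False):
    """Write a Sweetviz report to disk and return the absolute path of the file."""
    path = Path(filepath).resolve()
    # show_html writes the HTML file; open_browser=False suppresses the pop-up.
    report.show_html(filepath=str(path), open_browser=open_browser)
    return path
```

With the report built earlier, `save_report(sweetviz_report, 'sweetviz_carcrash_report.html')` would return the full path of the saved file, which can then be opened manually.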


## IBM Dataset with Target Variable

In [14]:
df_ibm = pd.read_csv(r"C:\Users\ariel\OneDrive\Desktop\CCT College\GitHub\Automated EDA\Datasets\datasets\WA_Fn-UseC_-HR-Employee-Attrition.csv")

In [15]:
df_ibm

|  | Age | Attrition | BusinessTravel | DailyRate | Department | DistanceFromHome | Education | EducationField | EmployeeCount | EmployeeNumber | ... | RelationshipSatisfaction | StandardHours | StockOptionLevel | TotalWorkingYears | TrainingTimesLastYear | WorkLifeBalance | YearsAtCompany | YearsInCurrentRole | YearsSinceLastPromotion | YearsWithCurrManager |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 41 | Yes | Travel_Rarely | 1102 | Sales | 1 | 2 | Life Sciences | 1 | 1 | ... | 1 | 80 | 0 | 8 | 0 | 1 | 6 | 4 | 0 | 5 |
| 1 | 49 | No | Travel_Frequently | 279 | Research & Development | 8 | 1 | Life Sciences | 1 | 2 | ... | 4 | 80 | 1 | 10 | 3 | 3 | 10 | 7 | 1 | 7 |
| 2 | 37 | Yes | Travel_Rarely | 1373 | Research & Development | 2 | 2 | Other | 1 | 4 | ... | 2 | 80 | 0 | 7 | 3 | 3 | 0 | 0 | 0 | 0 |
| 3 | 33 | No | Travel_Frequently | 1392 | Research & Development | 3 | 4 | Life Sciences | 1 | 5 | ... | 3 | 80 | 0 | 8 | 3 | 3 | 8 | 7 | 3 | 0 |
| 4 | 27 | No | Travel_Rarely | 591 | Research & Development | 2 | 1 | Medical | 1 | 7 | ... | 4 | 80 | 1 | 6 | 3 | 3 | 2 | 2 | 2 | 2 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1465 | 36 | No | Travel_Frequently | 884 | Research & Development | 23 | 2 | Medical | 1 | 2061 | ... | 3 | 80 | 1 | 17 | 3 | 3 | 5 | 2 | 0 | 3 |
| 1466 | 39 | No | Travel_Rarely | 613 | Research & Development | 6 | 1 | Medical | 1 | 2062 | ... | 1 | 80 | 1 | 9 | 5 | 3 | 7 | 7 | 1 | 7 |
| 1467 | 27 | No | Travel_Rarely | 155 | Research & Development | 4 | 3 | Life Sciences | 1 | 2064 | ... | 2 | 80 | 1 | 6 | 0 | 3 | 6 | 2 | 0 | 3 |
| 1468 | 49 | No | Travel_Frequently | 1023 | Sales | 2 | 3 | Medical | 1 | 2065 | ... | 4 | 80 | 0 | 17 | 3 | 2 | 9 | 6 | 0 | 8 |
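In the rows displayed above, `EmployeeCount` and `StandardHours` show the same value on every row. Whether that holds for the full dataset is not confirmed here, so the sketch below checks for single-valued columns instead of assuming them, and then passes them to `sv.FeatureConfig(skip=...)` so they are left out of the report. The helper names `constant_columns` and `analyze_without` and the small demo frame are assumptions made for illustration.

```python
import pandas as pd


def constant_columns(df):
    """Return names of columns that hold a single distinct value (NaN counted)."""
    return [col for col in df.columns if df[col].nunique(dropna=False) <= 1]


def analyze_without(df, skip, target=None):
    """Run sv.analyze while skipping the given columns (requires sweetviz)."""
    import sweetviz as sv  # imported lazily; only needed when a report is built

    config = sv.FeatureConfig(skip=list(skip))
    return sv.analyze(df, target_feat=target, feat_cfg=config)


# Small stand-in frame (not the real IBM data).
demo = pd.DataFrame(
    {"Age": [41, 49, 37], "EmployeeCount": [1, 1, 1], "StandardHours": [80, 80, 80]}
)
skippable = constant_columns(demo)
```

On the real data this could be used as `analyze_without(df_ibm, constant_columns(df_ibm), target='Attrition')`.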


### .show_notebook

In [23]:
# Build the report, display it inline in the notebook, and save it as an HTML file
sweetviz_ibm = sv.analyze(df_ibm, target_feat='Attrition')  # 'Attrition' is set as the target variable
sweetviz_ibm.show_notebook(filepath='sweetviz_ibm_report.html')

Report 'sweetviz_ibm_report.html' was saved to storage.


### .show_html

In [24]:
# Save the report as an HTML file and open it in the default web browser
sweetviz_ibm = sv.analyze(df_ibm, target_feat='Attrition')  # 'Attrition' is set as the target variable
sweetviz_ibm.show_html(filepath='sweetviz_ibm_report.html')

Report sweetviz_ibm_report.html was generated! NOTEBOOK/COLAB USERS: the web browser MAY not pop up, regardless, the report IS saved in your notebook/colab files.
