## Automatic Visualization with AutoViz (Tabular Playground Series - Feb 2021)

Hello! I am sharing this notebook to show how automatic visualization can help you understand a dataset with very little effort.

![AutoViz](https://github.com/AutoViML/AutoViz/raw/master/logo.png)

> Automatically Visualize any dataset, any size with a single line of code. AutoViz performs automatic visualization of any dataset with one line. Give any input file (CSV, txt or json) and AutoViz will visualize it.


### For more information: [AutoViz](https://github.com/AutoViML/AutoViz)

## Install AutoViz inside a Kaggle notebook

In [None]:
!pip install xlrd # Dependency for autoviz class
!pip install autoviz

## Import libraries

In [None]:
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
from autoviz.AutoViz_Class import AutoViz_Class # AutoViz
from sklearn.preprocessing import LabelEncoder

## Parameters

In [None]:
TRAIN_PATH = "/kaggle/input/tabular-playground-series-feb-2021/train.csv"
TEST_PATH = "/kaggle/input/tabular-playground-series-feb-2021/test.csv"
SUBMISSION = "/kaggle/input/tabular-playground-series-feb-2021/sample_submission.csv"

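AUTO_VIZ_FILENAME = ""  # empty because the data is passed as a DataFrame (dfte), not as a file
AUTO_VIZ_SEP = ","  # field separator, relevant when reading from a file
AUTO_VIZ_DEP_VAR = "target"  # dependent (target) variable
AUTO_VIZ_HEADER = 0  # row number of the header line
AUTO_VIZ_VERBOSE = 1  # verbosity level
AUTO_VIZ_LOWESS = False  # do not draw LOWESS smoothing lines
AUTO_VIZ_CHAR_FORMAT = "svg"  # output format for the charts
AUTO_VIZ_MAX_ROWS_ANALYZED = 30000  # cap on the number of rows used for the charts
AUTO_VIZ_MAX_COLS_ANALYZED = 30  # cap on the number of columns used for the charts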

## Read the data

In [None]:
train = pd.read_csv(TRAIN_PATH, index_col='id')
train.head()

In [None]:
submission = pd.read_csv(SUBMISSION, index_col='id')

In [None]:
train.columns

In [None]:
test = pd.read_csv(TEST_PATH, index_col='id')

In [None]:
test.columns

In [None]:
test.head()

## Label Encoding

* Encode the categorical (`object` dtype) columns as integers, fitting each encoder on the train and test values together

In [None]:
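for c in train.columns:
    # Only string-valued (object dtype) columns need encoding
    if train[c].dtype == 'object':
        lbl = LabelEncoder()
        # Fit on train and test together so every category gets a code
        lbl.fit(list(train[c].values) + list(test[c].values))
        train[c] = lbl.transform(train[c].values)
        test[c] = lbl.transform(test[c].values)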
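
The loop above fits each encoder on the train and test values together, so a category that shows up only in the test set still receives a code. Here is a minimal pure-Python sketch of that idea. The column values are made up and the helper names (`fit_label_mapping`, `encode`) are hypothetical, not part of scikit-learn; like `LabelEncoder`, the sketch assigns codes in sorted order of the distinct values.

```python
# Minimal pure-Python sketch of label encoding (illustrative values only).

def fit_label_mapping(*value_lists):
    """Map each distinct value to an integer code, in sorted order."""
    classes = sorted(set().union(*value_lists))
    return {value: code for code, value in enumerate(classes)}


def encode(values, mapping):
    """Replace each value by its code; an unseen value raises KeyError."""
    return [mapping[v] for v in values]


train_cat = ["A", "B", "A", "C"]  # hypothetical train column
test_cat = ["B", "D"]             # "D" never appears in train

# Fit on train and test together, as the notebook does.
mapping = fit_label_mapping(train_cat, test_cat)
train_encoded = encode(train_cat, mapping)
test_encoded = encode(test_cat, mapping)

# Fitting on train alone would fail on the unseen category "D".
try:
    encode(test_cat, fit_label_mapping(train_cat))
    train_only_ok = True
except KeyError:
    train_only_ok = False
```

Fitting on the combined values avoids that failure, at the cost of letting the encoder see the test categories in advance, which is usually acceptable for a plain integer encoding like this one.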

## Create autoviz object

In [None]:
AV = AutoViz_Class()

## Let's see the magic! (Train Data)

In [None]:
%%time
dft = AV.AutoViz(filename=AUTO_VIZ_FILENAME,
                 sep=AUTO_VIZ_SEP,
                 depVar=AUTO_VIZ_DEP_VAR,
                 dfte=train,
                 header=AUTO_VIZ_HEADER,
                 verbose=AUTO_VIZ_VERBOSE,
                 lowess=AUTO_VIZ_LOWESS,
                 chart_format=AUTO_VIZ_CHAR_FORMAT,
                 max_rows_analyzed=AUTO_VIZ_MAX_ROWS_ANALYZED,
                 max_cols_analyzed=AUTO_VIZ_MAX_COLS_ANALYZED)

## Here I will fit a quick baseline model :)

In [None]:
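from sklearn.ensemble import RandomForestRegressor

# Separate the target from the features (pop removes the column from train)
target = train.pop("target")
target = target.values

# Small random forest trained on all remaining columns
model = RandomForestRegressor(n_estimators=50, n_jobs=-1)
model.fit(train, target)

# Predict on the test set and save the submission, with id as the index column
submission['target'] = model.predict(test)
submission.to_csv('random_forest.csv')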
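
The cell above fits the model and writes predictions, but it does not score them. One way to get a rough estimate before submitting is to hold out part of the training rows. The sketch below does that on synthetic data, which is an assumption standing in for the encoded `train` frame. The 80/20 split, the fixed `random_state`, and RMSE as the error measure are illustrative choices, not something this notebook uses.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded training data (assumption, not the real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=300)

# Hold out 20% of the rows for validation.
X_fit, X_valid, y_fit, y_valid = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_fit, y_fit)

# Root mean squared error on the held-out rows.
rmse = mean_squared_error(y_valid, model.predict(X_valid)) ** 0.5
print(f"validation RMSE: {rmse:.4f}")
```

With the real data, `X` and `y` would be the `train` frame and the `target` array from the cell above.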

## If you liked this interactive visualization, do not forget to like the notebook. Thanks! :)

You can also check out my other notebooks about automatic EDA:

* https://www.kaggle.com/rapela/tps-02-21-sweet-visualization
* https://www.kaggle.com/rapela/jane-street-sweet-visualization
* https://www.kaggle.com/rapela/jane-street-autoviz