# <a id='toc1_'></a>[Attribute Visualization](#toc0_)
This notebook demonstrates how to use the `ColumnVisualizer` class in `src/visualization/columns.py` to visualize the attributes present in the raw data.

**Table of contents**<a id='toc0_'></a>    
- [Attribute Visualization](#toc1_)    
  - [Setup](#toc1_1_)    
  - [Data Digestion and Preprocessing](#toc1_2_)    
  - [Visualization](#toc1_3_)    

<!-- vscode-jupyter-toc-config
	numbering=false
	anchor=true
	flat=false
	minLevel=1
	maxLevel=6
	/vscode-jupyter-toc-config -->
<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->

## <a id='toc1_1_'></a>[Setup](#toc0_)

In [None]:
import sys
from pathlib import Path

Start the Jupyter server from the notebook directory. The project root is then two levels above the current working directory, and the following cell should print the project root's directory name:

In [None]:
project_root = Path.cwd().resolve().parent.parent
print(f"Project root: {project_root.name}")
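
The `parent.parent` lookup above depends on where the server was started. As a minimal sketch of a less location-sensitive alternative, the helper below walks up from a starting directory until it finds a folder containing a marker directory. The function name `find_project_root` and the choice of `src` as the marker are assumptions for illustration only; they are not part of this project's code.

```python
from pathlib import Path


def find_project_root(start: Path, marker: str = "src") -> Path:
    """Return the nearest directory at or above `start` that contains `marker`.

    Hypothetical helper: `marker` defaults to "src" on the assumption that
    the project root is the directory holding the `src` package.
    """
    start = start.resolve()
    # Check the starting directory first, then each ancestor in turn.
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    raise FileNotFoundError(
        f"No directory at or above {start} contains a '{marker}' directory"
    )
```

If this approach suits the project, `project_root = find_project_root(Path.cwd())` could stand in for the fixed `parent.parent` expression, and it raises a clear error when the notebook is run from an unexpected location.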

In [None]:
if str(project_root) not in sys.path:
    sys.path.insert(0, str(project_root))

In [None]:
%load_ext autoreload
# Mode 1: before each cell runs, reload only the modules registered with %aimport.
%autoreload 1
%aimport src.visualization.columns, src.database.database_connection, \
    src.visualization.models, src.preprocess.preprocess

In [None]:
from src.visualization import ColumnVisualizer
from src.preprocess import Preprocess
from src.database import DatabaseConnection

## <a id='toc1_2_'></a>[Data Digestion and Preprocessing](#toc0_)

In [None]:
db_path = project_root / "data" / "slurm_data.db"
db_connection = DatabaseConnection(str(db_path.resolve()), anonymize=True)

jobs_df = db_connection.fetch_all_jobs()

In [None]:
clean_jobs_df = Preprocess().preprocess_data(jobs_df, min_elapsed_seconds=600, anonymize=True)
display(clean_jobs_df)
print(clean_jobs_df.shape)

## <a id='toc1_3_'></a>[Visualization](#toc0_)

In [None]:
visualizer = ColumnVisualizer(clean_jobs_df)

In [None]:
visualizer.visualize(
    output_dir_path=project_root / "data" / "visualizations",
    columns=None,
)