# EXPERIMENTAL: Viewing and analyzing work order trends in PyGWalker

#### This notebook imports and processes the same dataset used in notebook 1, then loads the data into PyGWalker, a newer and more interactive interface.

Note that I have not worked with this tool extensively. I am providing it only for your exploration, should you find it interesting and/or useful.

In [None]:
#Necessary imports: Numpy, Pandas, and datetime for data manipulation; PyGWalker for interactive visualization;
#unicodedata for text cleanup.
#(Numpy, Pandas, and PyGWalker are "packages" or "libraries" outside of Python's so-called "standard library" of built-in
    #functionality, so they must be installed separately; datetime and unicodedata ship with Python. For convenience, we give
    #some of them short aliases such as "np", "pd", and "dt".)
import numpy as np
import pandas as pd
import datetime as dt
import pygwalker
import unicodedata #For removing characters that cause trouble with PyGWalker

#We want to display all columns in the previews below, so we set the display.max_columns option of Pandas to None
pd.set_option('display.max_columns', None)

In [None]:
#Using the read_csv() function of Pandas, read in work order data as a Pandas DataFrame
t = pd.read_csv('Data/St_Nick_Vendor_WOs.csv', encoding='utf-8')

In [None]:
#Convert our date columns to Pandas' datetime format, for later manipulation
for field in ['ZZCREATEDATE', 'STATUSDATE', 'PARENTSTATUSDATE']:
    t[field] = pd.to_datetime(t[field])
    
#Using the strftime() method of Pandas' .dt accessor (which follows Python's built-in datetime format codes), create new
#fields representing the month in which work orders (and their parent work orders) were created and/or updated.
#Unlike a row-by-row apply(), this leaves missing dates as missing values instead of raising an error.
t['CREATEMONTH'] = t['ZZCREATEDATE'].dt.strftime('%Y%m')
t['STATUSMONTH'] = t['STATUSDATE'].dt.strftime('%Y%m')
t['PARENTSTATUSMONTH'] = t['PARENTSTATUSDATE'].dt.strftime('%Y%m')
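
To make the `'%Y%m'` format concrete, here is a minimal sketch using only Python's standard `datetime` module. The sample dates are hypothetical stand-ins for values in `ZZCREATEDATE`, not rows from the actual dataset.

```python
from datetime import datetime

# Hypothetical sample dates, standing in for values in ZZCREATEDATE
sample_dates = [datetime(2023, 11, 5), datetime(2022, 3, 17), datetime(2023, 2, 28)]

# '%Y' is the four-digit year and '%m' is the zero-padded month
month_keys = [d.strftime('%Y%m') for d in sample_dates]
print(month_keys)          # ['202311', '202203', '202302']

# Because the year comes first and the month is zero-padded,
# sorting the strings also sorts them chronologically
print(sorted(month_keys))  # ['202203', '202302', '202311']
```

This is why year-month strings in this form work well as a grouping or axis field: they sort in date order without any further conversion.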

### Introducing: PyGWalker

I have yet to explore this tool fully, but I recently discovered [PyGWalker](https://github.com/Kanaries/pygwalker), a promising new library for interactive, Tableau-style data analysis in Jupyter. I've loaded up our work orders DataFrame here for testing -- give it a try!

Note: the first two cells below remove so-called "control characters" (such as line breaks) from all text fields in the DataFrame. These characters cause trouble when loading data into PyGWalker.

In [None]:
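def remove_control_characters(s):
    #Keep only characters whose Unicode category does not start with "C" (control, format, and similar characters)
    return "".join(ch for ch in s if unicodedata.category(ch)[0]!="C")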
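
As a quick check of what this function does, here is a small sketch that applies it to a made-up string. The sample text is hypothetical; the function body repeats the logic of the cell above so the snippet runs on its own.

```python
import unicodedata

def remove_control_characters(s):
    # Same logic as the cell above: drop every character whose Unicode
    # category starts with "C" (control, format, and similar)
    return "".join(ch for ch in s if unicodedata.category(ch)[0] != "C")

# Hypothetical work order description with a line break, a tab, and a zero-width space
sample = "Leak in unit 4B\nCalled\tvendor\u200b"
cleaned = remove_control_characters(sample)
print(repr(cleaned))  # 'Leak in unit 4BCalledvendor'
```

Note that the characters are deleted rather than replaced, so the words on either side of a line break or tab run together. Ordinary spaces are kept.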

In [None]:
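#Collect the names of all text ("object") columns
obj_cols = t.select_dtypes(include='object').columns.to_list()

#Strip control characters from each text column, leaving missing (non-string) values untouched
#so that empty cells do not become the literal text "nan"
for col in obj_cols:
    t[col] = t[col].apply(lambda x: remove_control_characters(x) if isinstance(x, str) else x)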
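
A note on missing values when cleaning text columns: `str(float('nan'))` is the text `'nan'`, so wrapping every value in `str()` turns empty cells into that literal string. The sketch below shows the difference in plain Python. The sample values and the helper name `clean_value` are hypothetical, used only for illustration.

```python
import unicodedata

def remove_control_characters(s):
    return "".join(ch for ch in s if unicodedata.category(ch)[0] != "C")

def clean_value(x):
    # Hypothetical helper: clean genuine strings, pass everything else through
    return remove_control_characters(x) if isinstance(x, str) else x

# Hypothetical column values: one string, one missing value (NaN), one None
values = ["Replace\nfilter", float("nan"), None]

# Converting everything with str() first turns missing values into text
print([remove_control_characters(str(v)) for v in values])
# ['Replacefilter', 'nan', 'None']

# Cleaning only real strings keeps missing values as they were
cleaned = [clean_value(v) for v in values]
print(cleaned[0], cleaned[2])  # Replacefilter None
```

Whether to keep missing values as-is or convert them to text depends on how you want them to appear in PyGWalker; this is one possible approach, not a requirement of the library.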

In [None]:
#Launch the interactive PyGWalker interface on our cleaned DataFrame
gwalker = pygwalker.walk(t)