

# Example Usage for Interactive Functions (Still in Development)

Here are examples of how to use the interactive functions in the `wrangle` module.
They assume that you have a `pandas.DataFrame` called `data` ready for manipulation.

## 1. Interactive Data Cleaning (clean_data_interactive):

This function lets users interactively clean the dataset by selecting columns to remove and choosing how to handle NaN values.

Usage:

	•	Choose columns to remove (e.g., age).
	•	Select how to handle NaN values (e.g., fill with “mean”).
	•	Decide whether to apply the cleaning to rows or columns.
	•	Click the “Apply Cleaning” button to see the cleaned data.
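
For orientation, the widget choices above map roughly onto plain pandas calls. The sketch below is an assumption about what such a cleaning step might do, not the actual implementation of `clean_data_interactive`. The helper name `clean_data_sketch` and its parameters are hypothetical, as is the choice to drop rows or columns containing NaN when no fill method is selected.

```python
import pandas as pd


def clean_data_sketch(df, remove_columns=(), fill_with=None, apply_to="columns"):
    """Hypothetical non-interactive stand-in for the widget choices."""
    # Remove the selected columns first
    cleaned = df.drop(columns=list(remove_columns))
    if fill_with == "mean":
        # Fill NaN in each numeric column with that column's mean
        cleaned = cleaned.fillna(cleaned.mean(numeric_only=True))
    elif fill_with is None:
        # Assumption: with no fill method, drop the rows (or columns)
        # that still contain NaN
        axis = 0 if apply_to == "rows" else 1
        cleaned = cleaned.dropna(axis=axis)
    return cleaned


data = pd.DataFrame({
    'age': [25, 30, None, 40],
    'income': [50000, None, 55000, 60000],
    'city': ['NY', 'LA', 'SF', 'LA']
})

# Remove "city" and fill the remaining NaN values with column means
cleaned = clean_data_sketch(data, remove_columns=["city"], fill_with="mean")
print(cleaned)
```

The original `data` is left untouched because `drop`, `fillna`, and `dropna` each return a new DataFrame by default.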

In [3]:

import pandas as pd
from vistool.wrangle import (clean_data_interactive, 
                                     filter_data_interactive, 
                                     rename_columns_interactive, 
                                     label_encode_interactive)



# Example DataFrame
data = pd.DataFrame({
    'age': [25, 30, None, 40],
    'income': [50000, None, 55000, 60000],
    'city': ['NY', 'LA', 'SF', 'LA']
})


In [4]:

# Launch interactive data cleaning
clean_data_interactive(data)



SelectMultiple(description='Remove Columns:', options=('age', 'income', 'city'), value=())

Dropdown(description='Fill With:', options=('None', 'mean', 'average'), value='None')

Dropdown(description='Apply To:', options=('columns', 'rows'), value='columns')

Button(description='Apply Cleaning', style=ButtonStyle())



## 2. Interactive Data Filtering (filter_data_interactive):

This function allows users to interactively filter the dataset based on a specified condition.


Usage:

	•	Enter a filter condition (e.g., age > 30).
	•	Click the “Apply Filter” button to see the filtered data based on the condition.
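
A condition string such as `age > 30` can be evaluated directly with `DataFrame.query`. Whether `filter_data_interactive` uses `query` internally is an assumption; the sketch only illustrates how a condition of this form behaves on the example data.

```python
import pandas as pd

data = pd.DataFrame({
    'age': [25, 30, None, 40],
    'income': [50000, None, 55000, 60000],
    'city': ['NY', 'LA', 'SF', 'LA']
})

# query() evaluates the condition against the column names.
# Rows where age is NaN do not satisfy the comparison and are excluded.
filtered = data.query("age > 30")
print(filtered)
```

Column names in the condition are case-sensitive: for this DataFrame the condition must use `age`, and `Age > 30` would raise an error because no such column exists.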

In [5]:

# Launch interactive data filtering
filter_data_interactive(data)


Text(value='', description='Condition:', placeholder='Enter condition (e.g., Age > 30)')

Button(description='Apply Filter', style=ButtonStyle())

## 3. Interactive Column Renaming (rename_columns_interactive):

This function allows users to interactively rename columns in the dataset.

Usage:
    
	•	Enter the column mappings in the format: old_name:new_name (e.g., age:years,income:salary).
	•	Click the “Apply Rename” button to apply the column renaming.
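
The `old_name:new_name` format can be turned into a dictionary with a few lines of standard Python. The helper below, `parse_mappings`, is a hypothetical illustration of that parsing step and is not taken from the `vistool` source.

```python
def parse_mappings(text):
    """Turn 'old:new,old2:new2' into {'old': 'new', 'old2': 'new2'}."""
    mapping = {}
    for pair in text.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate stray or trailing commas
        old, sep, new = pair.partition(":")
        if not sep or not old.strip() or not new.strip():
            raise ValueError(f"Expected old_name:new_name, got {pair!r}")
        mapping[old.strip()] = new.strip()
    return mapping


mapping = parse_mappings("age:years, income:salary")
print(mapping)  # {'age': 'years', 'income': 'salary'}
```

A dictionary of this shape can be passed to pandas as `data.rename(columns=mapping)`.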



In [7]:

# Launch interactive column renaming
rename_columns_interactive(data)


Text(value='', description='Mappings:', placeholder='Enter mappings (e.g., old:new,age:years)')

Button(description='Apply Rename', style=ButtonStyle())


## 4. Interactive Label Encoding (label_encode_interactive):

This function allows users to interactively apply label encoding to a categorical column.

Usage:
    
	•	Select a categorical column from the dropdown (e.g., city).
	•	Click the “Apply Encoding” button to encode the column as numeric labels.
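
Label encoding replaces each distinct category with an integer. The sketch below assigns codes in sorted order, the scheme used by encoders such as scikit-learn's `LabelEncoder`. That `label_encode_interactive` follows the same scheme is an assumption, and the helper name `label_encode` is hypothetical.

```python
def label_encode(values):
    """Map each distinct value to an integer code, in sorted order."""
    classes = sorted(set(values))
    codes = {label: i for i, label in enumerate(classes)}
    return [codes[v] for v in values], classes


encoded, classes = label_encode(["NY", "LA", "SF", "LA"])
print(encoded)  # [1, 0, 2, 0]
print(classes)  # ['LA', 'NY', 'SF']
```

Keeping `classes` around makes it possible to map the integer codes back to the original labels later.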



In [6]:

# Launch interactive label encoding
label_encode_interactive(data)

Dropdown(description='Column:', options=('age', 'income', 'city'), value='age')

Button(description='Apply Encoding', style=ButtonStyle())

In [9]:
# Import load_csv and summarize_data
from vistool.download import load_csv, summarize_data



In [None]:
# Load the November 2024 Monthly_AE_Attendances file from the data folder and print the first 5 rows

df = load_csv('data/Monthly_AE_Attendances_Nov_2024.csv')
df.head(5)


File 'data/Monthly_AE_Attendances_Nov_2024.csv' loaded successfully!


Unnamed: 0,period,org_code,parent_org,org_name,ae_attendances_type_1,ae_attendances_type_2,ae_attendances_other_ae_department,ae_attendances_booked_appointments_type_1,ae_attendances_booked_appointments_type_2,ae_attendances_booked_appointments_other_department,...,attendances_over_4hrs_other_department,attendances_over_4hrs_booked_appointments_type_1,attendances_over_4hrs_booked_appointments_type_2,attendances_over_4hrs_booked_appointments_other_department,patients_who_have_waited_4-12_hs_from_dta_to_admission,patients_who_have_waited_12+_hrs_from_dta_to_admission,emergency_admissions_via_ae-type_1,emergency_admissions_via_ae-type_2,emergency_admissions_via_ae-other_AE_department,other_emergency_admissions
0,MSitAE-NOVEMBER-2024,NL7,NHS ENGLAND MIDLANDS,ASSURA VERTIS URGENT CARE CENTRES (BIRMINGHAM),0,0,4350,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,MSitAE-NOVEMBER-2024,RWY,NHS ENGLAND NORTH EAST AND YORKSHIRE,CALDERDALE AND HUDDERSFIELD NHS FOUNDATION TRUST,15295,0,0,0,0,0,...,0,0,0,0,1160,16,3104,0,0,316
2,MSitAE-NOVEMBER-2024,AAH,NHS ENGLAND SOUTH WEST,TETBURY HOSPITAL TRUST LTD,0,0,516,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,MSitAE-NOVEMBER-2024,AQN04,NHS ENGLAND SOUTH EAST,PHL LYMINGTON UTC,0,0,2593,0,0,12,...,1,0,0,0,0,0,0,0,0,0
4,MSitAE-NOVEMBER-2024,C82038,NHS ENGLAND MIDLANDS,LATHAM HOUSE MEDICAL PRACTICE,0,0,303,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [13]:
# Print a summary of the data
summarize_data(df)


    --- Data Overview ---
    Shape: 198 rows, 22 columns

    Numeric Columns: 18 columns (e.g., ae_attendances_type_1, ae_attendances_type_2, ae_attendances_other_ae_department...)
    Non-Numeric Columns: 4 columns (e.g., period, org_code, parent_org...)

    Missing Values: 0 missing values in total
    Duplicate Rows: 0 duplicate rows

    Categorical Columns: 2 columns (e.g., period, parent_org...)

    Correlation between numeric columns:
                                                        ae_attendances_type_1  \
ae_attendances_type_1                                            1.000000   
ae_attendances_type_2                                            0.980878   
ae_attendances_other_ae_department                               0.996996   
ae_attendances_booked_appointments_type_1                        0.985124   
ae_attendances_booked_appointments_type_2                        0.793002   
ae_attendances_booked_appointments_other_depart...               0.977264   
attend

In [15]:
# Only the last expression in a cell is displayed, so the output below is df.dtypes
df.columns
df.dtypes

period                                                        object
org_code                                                      object
parent_org                                                    object
org_name                                                      object
ae_attendances_type_1                                          int64
ae_attendances_type_2                                          int64
ae_attendances_other_ae_department                             int64
ae_attendances_booked_appointments_type_1                      int64
ae_attendances_booked_appointments_type_2                      int64
ae_attendances_booked_appointments_other_department            int64
attendances_over_4hrs_type_1                                   int64
attendances_over_4hrs_type_2                                   int64
attendances_over_4hrs_other_department                         int64
attendances_over_4hrs_booked_appointments_type_1               int64
attendances_over_4hrs_booked_appoi

In [16]:
# Launch interactive cleaning
# Choose a column, such as ae_attendances_type_1, to clean.
# Decide whether to drop rows or fill NaN values with the mean.
# Click the “Apply Cleaning” button to see the updated dataset.


clean_data_interactive(df)



Rows with NaN in columns ['ae_attendances_type_1'] were dropped.


Unnamed: 0,period,org_code,parent_org,org_name,ae_attendances_type_1,ae_attendances_type_2,ae_attendances_other_ae_department,ae_attendances_booked_appointments_type_1,ae_attendances_booked_appointments_type_2,ae_attendances_booked_appointments_other_department,...,attendances_over_4hrs_other_department,attendances_over_4hrs_booked_appointments_type_1,attendances_over_4hrs_booked_appointments_type_2,attendances_over_4hrs_booked_appointments_other_department,patients_who_have_waited_4-12_hs_from_dta_to_admission,patients_who_have_waited_12+_hrs_from_dta_to_admission,emergency_admissions_via_ae-type_1,emergency_admissions_via_ae-type_2,emergency_admissions_via_ae-other_AE_department,other_emergency_admissions
0,MSitAE-NOVEMBER-2024,NL7,NHS ENGLAND MIDLANDS,ASSURA VERTIS URGENT CARE CENTRES (BIRMINGHAM),0.0,0.0,4350.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,MSitAE-NOVEMBER-2024,RWY,NHS ENGLAND NORTH EAST AND YORKSHIRE,CALDERDALE AND HUDDERSFIELD NHS FOUNDATION TRUST,15295.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,1160.0,16.0,3104.0,0.0,0.0,316.0
2,MSitAE-NOVEMBER-2024,AAH,NHS ENGLAND SOUTH WEST,TETBURY HOSPITAL TRUST LTD,0.0,0.0,516.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,MSitAE-NOVEMBER-2024,AQN04,NHS ENGLAND SOUTH EAST,PHL LYMINGTON UTC,0.0,0.0,2593.0,0.0,0.0,12.0,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,MSitAE-NOVEMBER-2024,C82038,NHS ENGLAND MIDLANDS,LATHAM HOUSE MEDICAL PRACTICE,0.0,0.0,303.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
193,MSitAE-NOVEMBER-2024,R0A,NHS ENGLAND NORTH WEST,MANCHESTER UNIVERSITY NHS FOUNDATION TRUST,26244.0,2817.0,11453.0,3394.0,456.0,1013.0,...,850.0,1707.0,9.0,115.0,4172.0,263.0,5795.0,131.0,167.0,445.0
194,MSitAE-NOVEMBER-2024,RXF,NHS ENGLAND NORTH EAST AND YORKSHIRE,MID YORKSHIRE TEACHING NHS TRUST,16762.0,106.0,5926.0,120.0,0.0,84.0,...,396.0,2.0,0.0,0.0,2286.0,736.0,3492.0,0.0,153.0,897.0
195,MSitAE-NOVEMBER-2024,8J094,NHS ENGLAND MIDLANDS,BADGER LTD,0.0,0.0,0.0,0.0,0.0,2011.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
196,MSitAE-NOVEMBER-2024,Y03007,NHS ENGLAND MIDLANDS,ERDINGTON GP HEALTH & WELLBEING WIC,0.0,0.0,0.0,0.0,0.0,2520.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [17]:
# Launch interactive filtering
# For example, filter rows where ae_attendances_type_1 > 10

filter_data_interactive(df)

Text(value='', description='Condition:', placeholder='Enter condition (e.g., Age > 30)')

Button(description='Apply Filter', style=ButtonStyle())

In [18]:
# Launch interactive renaming
# For example, rename ae_attendances_type_1 to type_1_attendances
rename_columns_interactive(df)

Text(value='', description='Mappings:', placeholder='Enter mappings (e.g., old:new,age:years)')

Button(description='Apply Rename', style=ButtonStyle())

In [20]:



# Launch interactive label encoding
# For example, encode a categorical column such as org_name into numeric labels

label_encode_interactive(df)


Label encoding applied to column: org_name


Unnamed: 0,period,org_code,parent_org,org_name,ae_attendances_type_1,ae_attendances_type_2,ae_attendances_other_ae_department,ae_attendances_booked_appointments_type_1,ae_attendances_booked_appointments_type_2,ae_attendances_booked_appointments_other_department,...,attendances_over_4hrs_other_department,attendances_over_4hrs_booked_appointments_type_1,attendances_over_4hrs_booked_appointments_type_2,attendances_over_4hrs_booked_appointments_other_department,patients_who_have_waited_4-12_hs_from_dta_to_admission,patients_who_have_waited_12+_hrs_from_dta_to_admission,emergency_admissions_via_ae-type_1,emergency_admissions_via_ae-type_2,emergency_admissions_via_ae-other_AE_department,other_emergency_admissions
0,MSitAE-NOVEMBER-2024,NL7,NHS ENGLAND MIDLANDS,4,0,0,4350,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,MSitAE-NOVEMBER-2024,RWY,NHS ENGLAND NORTH EAST AND YORKSHIRE,21,15295,0,0,0,0,0,...,0,0,0,0,1160,16,3104,0,0,316
2,MSitAE-NOVEMBER-2024,AAH,NHS ENGLAND SOUTH WEST,154,0,0,516,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,MSitAE-NOVEMBER-2024,AQN04,NHS ENGLAND SOUTH EAST,116,0,0,2593,0,0,12,...,1,0,0,0,0,0,0,0,0,0
4,MSitAE-NOVEMBER-2024,C82038,NHS ENGLAND MIDLANDS,73,0,0,303,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
193,MSitAE-NOVEMBER-2024,R0A,NHS ENGLAND NORTH WEST,85,26244,2817,11453,3394,456,1013,...,850,1707,9,115,4172,263,5795,131,167,445
194,MSitAE-NOVEMBER-2024,RXF,NHS ENGLAND NORTH EAST AND YORKSHIRE,94,16762,106,5926,120,0,84,...,396,2,0,0,2286,736,3492,0,153,897
195,MSitAE-NOVEMBER-2024,8J094,NHS ENGLAND MIDLANDS,5,0,0,0,0,0,2011,...,0,0,0,0,0,0,0,0,0,0
196,MSitAE-NOVEMBER-2024,Y03007,NHS ENGLAND MIDLANDS,46,0,0,0,0,0,2520,...,0,0,0,0,0,0,0,0,0,0
