# Tableau Data Preparation

## Objectives

* Prepare and format data for Tableau visualization
* Export cleaned dataset in optimal format for Tableau analysis
* Create additional data transformations if needed for dashboard creation

## Inputs

* data/inputs/cleaned_bank_data.csv

## Outputs

* Tableau-ready dataset (CSV format)
* Data documentation for dashboard creation

## Additional Comment

* This notebook focuses on preparing data specifically for Tableau visualizations and dashboard creation

---

# Change working directory

* The notebooks are assumed to be stored in a subfolder of the project, so the working directory must be changed when running the notebook in the editor

We need to change the working directory from the notebook's folder to its parent folder
* os.getcwd() returns the current working directory

In [1]:
import os
current_dir = os.getcwd()
current_dir

'c:\\Users\\shema\\Documents\\VScodeProject\\GroupProject\\BankCustomerAttrition\\jupyter_notebooks'

We want to make the parent of the current directory the new working directory
* os.path.dirname() returns the parent directory of a path
* os.chdir() sets the new working directory

In [2]:
os.chdir(os.path.dirname(current_dir))
print("You set a new current directory")

You set a new current directory


Confirm the new current directory

In [3]:
current_dir = os.getcwd()
current_dir

'c:\\Users\\shema\\Documents\\VScodeProject\\GroupProject\\BankCustomerAttrition'
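
Re-running the directory-change cell moves the working directory up one more level each time, which breaks relative paths such as data/inputs/cleaned_bank_data.csv. The sketch below is one possible guard, not part of the original notebook: it only moves up when the current folder is the notebooks folder. The folder name jupyter_notebooks is taken from the path printed above, and the helper name project_root is hypothetical.

```python
import os

# Assumed name of the notebooks subfolder, taken from the path printed above.
NOTEBOOK_DIR_NAME = "jupyter_notebooks"


def project_root(path, notebook_dir_name=NOTEBOOK_DIR_NAME):
    """Return the parent of `path` if `path` is the notebooks folder, else `path` unchanged."""
    normalized = os.path.normpath(path)
    if os.path.basename(normalized) == notebook_dir_name:
        return os.path.dirname(normalized)
    return path


# Safe to run repeatedly: once at the project root, nothing changes.
os.chdir(project_root(os.getcwd()))
print(f"Working directory: {os.getcwd()}")
```

With this guard the cell can be executed any number of times without walking further up the folder tree.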

---

# Load and Prepare Data for Tableau

Load the cleaned dataset and prepare it for Tableau visualization

In [4]:
# Import necessary libraries
import pandas as pd
import numpy as np

In [5]:
# Load the cleaned dataset
df = pd.read_csv('data/inputs/cleaned_bank_data.csv')

# Display basic information about the dataset
print(f"Dataset shape: {df.shape}")
df.head()

Dataset shape: (10000, 19)


Unnamed: 0.1,Unnamed: 0,RowNumber,CustomerId,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited,Complain,SatisfactionScore,CardType,PointEarned,AgeGroup
0,0,1,15598695,619,France,Female,42,2,0.0,1,1,1,101348.88,1,1,2,DIAMOND,464,40-49
1,1,2,15649354,608,Spain,Female,41,1,83807.86,1,0,1,112542.58,0,1,3,DIAMOND,456,40-49
2,2,3,15737556,502,France,Female,42,8,159660.8,3,1,0,113931.57,1,1,3,DIAMOND,377,40-49
3,3,4,15671610,699,France,Female,39,1,0.0,2,0,0,93826.63,0,0,5,GOLD,350,30-39
4,4,5,15625092,850,Spain,Female,43,2,125510.82,1,1,1,79084.1,0,0,5,GOLD,425,40-49
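
The Unnamed: 0.1 and Unnamed: 0 columns in the preview look like index columns written by earlier saves that did not pass index=False. That reading is an assumption, not something the notebook states. If it holds, they carry no information for Tableau and can be dropped. A minimal sketch on a small stand-in frame (the helper name drop_index_artifacts is hypothetical):

```python
import pandas as pd


def drop_index_artifacts(frame):
    """Return `frame` without columns whose name starts with 'Unnamed:'."""
    keep = ~frame.columns.str.startswith("Unnamed:")
    return frame.loc[:, keep]


# Stand-in data using column names and values from the preview above.
sample = pd.DataFrame(
    {
        "Unnamed: 0.1": [0, 1],
        "Unnamed: 0": [0, 1],
        "RowNumber": [1, 2],
        "CreditScore": [619, 608],
    }
)

cleaned = drop_index_artifacts(sample)
print(list(cleaned.columns))
```

On the real data this would be applied as df = drop_index_artifacts(df) right after loading.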


---

# Data Transformations for Tableau

Apply any additional transformations needed for optimal Tableau visualization

In [6]:
# Add your data transformation code here
# Example: Create categorical variables, format dates, etc.
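
As one possible illustration of the kind of transformation meant here, the sketch below adds readable label columns for the 0/1 flags so that Tableau legends and filters show text instead of numbers. The new column names and label wording (for example ExitedLabel, "Retained") are hypothetical choices, not taken from the project. The sample values come from the first five rows shown earlier.

```python
import pandas as pd

# First five rows of the relevant columns, copied from the preview above.
sample = pd.DataFrame(
    {
        "Exited": [1, 0, 1, 0, 0],
        "HasCrCard": [1, 0, 1, 0, 1],
        "IsActiveMember": [1, 1, 0, 0, 1],
        "Balance": [0.0, 83807.86, 159660.8, 0.0, 125510.82],
    }
)


def add_tableau_labels(frame):
    """Return a copy of `frame` with text label columns for the binary flags."""
    out = frame.copy()
    yes_no = {1: "Yes", 0: "No"}
    # Hypothetical label wording; adjust to match the dashboard's vocabulary.
    out["ExitedLabel"] = out["Exited"].map({1: "Exited", 0: "Retained"})
    out["HasCrCardLabel"] = out["HasCrCard"].map(yes_no)
    out["IsActiveMemberLabel"] = out["IsActiveMember"].map(yes_no)
    # Flag customers holding a non-zero balance.
    out["HasBalance"] = (out["Balance"] > 0).map({True: "Yes", False: "No"})
    return out


labelled = add_tableau_labels(sample)
print(labelled[["ExitedLabel", "HasBalance"]])
```

Keeping the original numeric columns alongside the labels leaves them available for aggregation in Tableau.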

---

# Export Data for Tableau

Export the final dataset in a format optimized for Tableau

In [7]:
# Export the dataset for Tableau
# tableau_data = df.copy()  # Make any final modifications here
# tableau_data.to_csv('data/outputs/tableau_data.csv', index=False)
# print("Data exported successfully for Tableau!")
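
The commented-out export above assumes the data/outputs folder already exists. to_csv raises an error if it does not. The sketch below shows one way to make the export robust by creating the folder first and writing without the index. The helper name export_for_tableau is hypothetical, and the demonstration writes to a temporary folder so that it does not touch project files.

```python
import os
import tempfile

import pandas as pd


def export_for_tableau(frame, path):
    """Write `frame` to `path` as CSV without the index, creating the folder if needed."""
    folder = os.path.dirname(path)
    if folder:
        os.makedirs(folder, exist_ok=True)
    frame.to_csv(path, index=False)
    return path


# Stand-in data; values are taken from the preview earlier in the notebook.
sample = pd.DataFrame(
    {"CustomerId": [15598695, 15649354], "Geography": ["France", "Spain"]}
)

# Write to a temporary folder and read the file back to check the round trip.
with tempfile.TemporaryDirectory() as tmp:
    out_path = export_for_tableau(
        sample, os.path.join(tmp, "data", "outputs", "tableau_data.csv")
    )
    round_trip = pd.read_csv(out_path)

print(list(round_trip.columns))
```

In the notebook itself the call would be export_for_tableau(df, 'data/outputs/tableau_data.csv'). Writing with index=False also avoids adding another Unnamed column when the file is read again.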