This repository showcases Python scripts used in generating fake data and the rest of the steps taken behind my personal Tableau project - Email Marketing Campaign Dashboard. It is based on the client work I have done. The dashboard is available here.
Datasets available:
data.world
kaggle
I have started by installing Anaconda Navigator onto my local machine and setting up a Spyder IDE with Python 3.9 in order to start generating the scripts.
It was then followed by writing the Python scripts. There are three scripts for three different operations:
This script uses faker library to generate fake email data including customer name, email name, sent dates etc. as per below:
Full code available here.
The second script takes the output from the first script and filters it down. This was needed, because the first script generates 4 emails per customer, while I wanted differing numbers of emails for different customers, so this code, filters out some of the emails based on percentiles to better mimic real-world data.
Full code available here.
This script transforms the filtered data into the format suitable to build a Sankey Funnel Chart. The format is available in the screenshot below. Script filters the data down to the month of August and to the year 2023, later unioning the results to allow period filterability on the dashboard.
Full code available here.
*Some of the code was written with the help of Chat GPT to speed up the process.
The data model for this dashboard is relatively simple. It includes two tables with no relationship established between the two of them. One is the sankey dataset, which is used to build the sankey, and the other one is the email metrics dataset that was used to build the rest of the dashboard.
Sankey Funnel was built following the blog by Ken Flerlage. I have also used this YouTube tutorial to build gradient bar charts. The other Tableau functionality and features used are outlined below:
- Parameter and Select Actions
- Filters
- Navigation buttons
- Gradient images created using Figma
- Custom number formatting to display up/down indicators
- String formatting functions and Regex to format dynamic field values (e.g. currency, percentages, numbers etc.)
Generating fake data
name = faker.name()
account_number = faker.random_number(digits=8)
Selecting columns and filtering
customers_all_emails_in_2023 = df.groupby('name').filter(lambda group: all('2023' in date for date in group['sent_date']))
df_filtered_2023 = df[df['name'].isin(customers_all_emails_in_2023['name'])]
Defining custom functions
def filter_email_2(df):
return df[df['email_name'].str.contains('Email 1|Email 2|Email 3')]
def filter_email_3(df):
return df[df['email_name'].str.contains('Email 1|Email 2|Email 3|Email 4')]
For loops
for j in email_sequence:
if name not in emails_sent:
emails_sent[name] = [j]
elif len(emails_sent[name]) < max_emails_per_person and j not in emails_sent[name]:
emails_sent[name].append(j)
else:
continue
Merging and unioning data frames
merged = pd.merge(grouped_counts, model, on='Link')
df_union = pd.concat([data_2023, data_august], axis=0, ignore_index=True)
Pivoting data frames
df_pivoted = df.pivot(index='name', columns='email_name', values='email_name')
Applying functions to the data
num_emails = min(max_emails_per_person, 4)
len(emails_sent[name])
df[df['email_name'].str.contains('Email 2|Email 3')]
Changing the data types
percentile_index = int(total_customers * percentile) * 4
'sent_date': sent_datetime.strftime('%Y-%m-%d %H:%M:%S')
Importing different Python libraries
import random
from faker import Faker
import pandas as pd
from datetime import datetime, timedelta
Creating columns and assigning values
data_2023['Month/Year'] = "Year"
data_august['Month/Year'] = "Month"
Renaming columns
grouped_counts = grouped_counts.rename(columns={"Email 1": "Step 1", "Email 2": "Step 2", "Email 3": "Step 3", "Email 4": "Step 4"})
Reading and writing to Excel files
df = pd.read_csv(r'C:\Users\Marius\Desktop\Email Marketing\unfiltered_dataset.csv')
combined_result.to_csv(r'C:\Users\Marius\Desktop\Email Marketing\filtered_dataset.csv', index=False)
The dashboard was inspired by Chimdi Nwosu viz #RWFD NYC Community Service Requests which can be found here.