Zoei19/README.md

Hi, I'm Zoeisha 👋

MSc Data Analysis for Business Intelligence · Data Analyst · Manchester, UK 🇬🇧


👩🏽‍💻 About Me

I'm a data analyst with an MSc in Data Analysis for Business Intelligence (University of Leicester), passionate about turning complex datasets into clear, actionable insights.

  • 🔍 Specialising in time series analysis, clustering, data migration, and business intelligence dashboards
  • 🗄️ Strong SQL skills (intermediate–advanced) — joins, window functions, CTEs, query optimisation, stored procedures
  • ⚡ Currently learning PySpark on Databricks (Azure) — building big data processing skills
  • 🤖 Recently exploring AI agents and their real-world applications in social impact
  • 📊 Comfortable working across the full analytics pipeline — from raw data to polished visualisation
  • 🌍 Based in Manchester, UK — open to hybrid and remote roles
  • 💼 Actively seeking Data Analyst roles

🛠️ Tech Stack

Languages & Libraries

Python SQL PySpark Pandas NumPy Scikit-learn Matplotlib Seaborn

Visualisation & BI

Power BI Tableau Excel

Tools & Platforms

Azure Databricks Git GitHub Docker Google Cloud VS Code Jupyter SQL Server SSMS


🚀 Featured Projects

MSc Dissertation · University of Leicester

Analysed two years of FTSE 100 price data (2021–2022) using time series clustering and PCA to uncover hidden behavioural patterns across stocks. Applied KMeans, hierarchical, and DBSCAN clustering with full validation metrics.

Python yfinance Scikit-learn PCA KMeans DBSCAN Hierarchical Clustering Time Series


Power BI · IEA Open Data

Interactive dashboard visualising global electric vehicle adoption trends (sales, stock, and market penetration) across countries and years (2015–2023), built on IEA open data.

Power BI Data Visualisation IEA Data Business Intelligence


Tableau · Python · Sales Analytics

Analysed 6 months of coffee shop transaction data to uncover revenue drivers, peak trading hours, and category performance. Built an interactive Tableau dashboard enabling management to plan promotions and resource allocation.

Key findings: £504K total sales · 10 AM peak hour · Espresso drinks top category · Morning hours drive ~35% of daily revenue

Python Pandas Tableau Data Cleaning Sales Analytics Dashboard


SQL · MSSQLSERVER · SSMS · Business Intelligence

End-to-end SQL analysis of Brazil's Olist e-commerce dataset (2016–2018) covering 97,919 orders. Investigated sales trends, delivery performance, and customer satisfaction to surface actionable business insights.

Key findings: 76% on-time delivery · 4/5 avg review score · Electronics & beauty drive ~35% of revenue · 15% repeat customer rate
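As a rough illustration of how an on-time delivery rate like the one above could be computed, here is a minimal sketch. It is not taken from the project: it assumes "on time" means the customer delivery timestamp is no later than the estimated delivery date, it uses the column names from the public Olist orders table, and it runs the query on an in-memory SQLite database with four made-up rows as a stand-in for SQL Server.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        order_id TEXT PRIMARY KEY,
        order_delivered_customer_date TEXT,
        order_estimated_delivery_date TEXT
    )
    """
)

# Illustrative rows only, not real Olist data.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("a", "2018-01-10 12:00:00", "2018-01-15 00:00:00"),  # on time
        ("b", "2018-01-20 09:30:00", "2018-01-18 00:00:00"),  # late
        ("c", "2018-02-01 08:00:00", "2018-02-03 00:00:00"),  # on time
        ("d", None, "2018-02-10 00:00:00"),                   # never delivered
    ],
)

# Share of delivered orders that arrived by the estimated date.
# ISO-formatted timestamps compare correctly as text.
ON_TIME_SQL = """
SELECT ROUND(
    100.0 * SUM(CASE WHEN order_delivered_customer_date
                          <= order_estimated_delivery_date
                     THEN 1 ELSE 0 END) / COUNT(*),
    1
)
FROM orders
WHERE order_delivered_customer_date IS NOT NULL
"""

on_time_pct = conn.execute(ON_TIME_SQL).fetchone()[0]
print(f"On-time delivery: {on_time_pct}%")
```

Undelivered orders are excluded from the denominator in this sketch; counting them as late would be an equally defensible definition and would lower the rate.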

SQL MSSQLSERVER SSMS Data Analysis E-commerce Business Intelligence


AI Agents · Social Impact

Multi-agent AI pipeline that assesses risk and generates personalised action plans for vulnerable individuals — integrating postcode lookup, service recommendations, and HTML/PDF plan generation.

Python Google ADK AI Agents pdfkit Social Impact


Python · Exploratory Data Analysis · Healthcare

In-depth exploratory analysis of the UCI Heart Disease dataset (920 records, 14 variables) to identify clinical and demographic risk factors across four disease severity levels — including univariate, bivariate, and multivariate analysis with a full visualisation suite.

Key findings: Severity rises sharply after age 60 · Max heart rate strongly inversely correlated with severity · ST depression is the most predictive feature

Python Pandas Matplotlib Seaborn EDA Healthcare Analytics


Python · Data Wrangling · Pipeline Design

Professional-grade data cleaning pipeline for a real-world Brazilian healthcare appointments dataset. Standardised schema, corrected datatypes, handled datetime formatting with timezone awareness, and produced an analysis-ready dataset for downstream EDA, ML, and BI workflows.
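The cleaning steps described above (schema standardisation, datatype correction, timezone-aware datetimes) might look roughly like the sketch below. It is an illustration under assumptions, not the project's code: the column names `PatientId`, `Age`, and `ScheduledDay` and the sample row are hypothetical, naive timestamps are assumed to be UTC, and only the standard library is used so the example runs anywhere, whereas the project itself lists Pandas.

```python
import re
from datetime import datetime, timezone


def to_snake_case(name: str) -> str:
    """Convert headers such as 'PatientId' or 'No-show' to snake_case."""
    name = re.sub(r"[\s\-]+", "_", name.strip())
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    return name.lower()


def parse_utc(value: str) -> datetime:
    """Parse an ISO 8601 timestamp into a timezone-aware UTC datetime."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def clean_row(raw: dict) -> dict:
    """Standardise column names and correct datatypes for one record."""
    row = {to_snake_case(key): value for key, value in raw.items()}
    row["patient_id"] = int(float(row["patient_id"]))  # IDs exported as floats
    row["age"] = int(row["age"])
    row["scheduled_day"] = parse_utc(row["scheduled_day"])
    return row


# Hypothetical raw record, as it might arrive from a CSV export.
raw = {"PatientId": "1001.0", "Age": "62", "ScheduledDay": "2016-04-29T18:38:08Z"}
cleaned = clean_row(raw)
print(cleaned)
```

Normalising every timestamp to UTC at load time keeps later date arithmetic, such as the gap between scheduling and appointment, free of offset mismatches.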

Python Pandas Data Cleaning Data Engineering Healthcare


📬 Let's Connect

I'm actively looking for Data Analyst roles — if you're hiring or just want to chat data, feel free to reach out!

 

Pinned

  1. AI-Homelessness-Support-Planner (Public)

    Multi-agent AI pipeline generating personalised crisis action plans for vulnerable individuals using Google ADK

    Python

  2. coffee_shop_sales (Public)

    Tableau dashboard analysing 6 months of coffee shop sales, £504K revenue, peak hours & category performance

  3. E-commerce-sales (Public)

    End-to-end SQL analysis of 97,919 Olist e-commerce orders - sales trends, delivery performance & customer insights

  4. FTSE100_Stock_Market_Clustering (Public)

    Time series clustering of FTSE 100 stocks (2021–2022) using PCA, KMeans, Hierarchical & DBSCAN - MSc Dissertation

  5. global-ev-trends-dashboard (Public)

    Interactive Power BI dashboard tracking global EV sales, stock & adoption trends (2015–2023) using IEA open data

  6. Heart_Disease_EDA (Public)

    Exploratory data analysis of 920 patient records from the UCI Heart Disease dataset, clinical risk factor insights

    Jupyter Notebook