Skip to content

monroesolisdata/bi-platform

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

BI Platform

End-to-End Business Intelligence — Streamlit + Plotly + SQLite

Six analytical dashboards. Zero cloud dependencies. Fully interactive.


Streamlit Plotly pandas SQLite Python License


What It Does

A fully self-contained BI platform that demonstrates end-to-end dashboard engineering:

Dashboard Purpose
🏠 Overview KPI cards, monthly trend + MoM growth, region donut, category bars
📈 Revenue Monthly/quarterly trends, region area chart, tabbed drill-in
📦 Products Category bars, return-rate scatter, cohort category heatmap
👤 Sales Team Leaderboard (colour-coded by quota status), deal-size scatter
🔍 Drill-Down Region/category/month filters → raw order table + CSV export
🔄 Returns Return-rate heatmap by category × region

Quick Start

git clone https://github.com/YOUR_USERNAME/bi-platform.git
cd bi-platform
python -m venv venv && venv\Scripts\activate   # Windows
pip install -r requirements.txt

# Run tests (no API key / no cloud credentials needed)
pytest tests/ -v

# Seed the database and launch
python -m src.database            # seeds data/bi_platform.db
streamlit run src/app.py
# Open http://localhost:8501

Architecture

flowchart LR
    A["SQLite DB\n2,000 synthetic rows\n(auto-seeded)"] --> B["analytics.py\nQuery layer\n(pure pandas)"]
    B --> C["charts.py\nPlotly builders\n(pure figures)"]
    C --> D["app.py\nStreamlit multi-page\n(6 pages / tabs)"]
    D --> E["Browser\nInteractive dashboard"]
Loading

Three-layer design:

Layer File Role
Data src/database.py SQLite schema + deterministic seed (2 k rows, seed=42)
Analytics src/analytics.py SQL → pandas, KPIs, cohorts, returns
Presentation src/charts.py + src/app.py Plotly figures + Streamlit pages

Database Schema

products     (product_id PK, name, category, unit_price)
salespeople  (salesperson_id PK, name, region)
sales        (sale_id PK, sale_date, product_id FK, salesperson_id FK,
              region, quantity, unit_price, revenue, is_returned)
  • 2,000 rows generated deterministically (seed=42) — identical data on every machine
  • Q4 seasonal uplift baked in (40 % higher quantity multiplier)
  • 4 % return rate injected randomly

Key Techniques

  • Watermark-free SQLite — no ORM, raw sqlite3 + pandas read_sql_query for fast iteration
  • Three-layer separation — DB → analytics → charts layers are independently testable
  • Streamlit @st.cache_data — query results cached 5 min so navigation doesn't re-query
  • Plotly dual-axis — revenue bars + MoM growth % line share one figure (secondary y-axis)
  • Pivot cohort heatmapdf.pivot + row-wise normalisation → percentage share per month

What I Learned

  • Multi-page Streamlit — sidebar st.radio as a router, each page function is independently callable
  • Plotly dual y-axisyaxis2 with overlaying="y" and side="right" in update_layout
  • SQLite for BI prototyping — CASE expressions, sub-selects, and window-like patterns without a heavyweight DB
  • Test isolationtmp_path_factory session-scoped fixture gives one seeded DB for all tests, no I/O per test
  • Dependency inversiondb_path parameter on every analytics function enables test DBs without monkeypatching

License

MIT

Part of a 10-project Data Analyst portfolio

About

End-to-end BI platform in pure Python: six Streamlit dashboards powered by Plotly charts and a local SQLite backend - overview, revenue, products, sales team, drill-down, returns.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages