Merged
47 commits
af28118
feat: initialize local Supabase for Postgres development
William-Hill Feb 16, 2026
c8d5559
feat: add synthetic Bishop State data generation script and data files
William-Hill Feb 16, 2026
01629cc
feat: add Bishop State data merge script and merged dataset
William-Hill Feb 17, 2026
4ae162d
feat: migrate Python DB layer from pymysql/MariaDB to psycopg2/Postgres
William-Hill Feb 17, 2026
7ae2f90
feat: update ML pipeline for Bishop State data and Supabase Postgres
William-Hill Feb 17, 2026
9d80acd
feat: add shared Postgres pool module, swap mysql2 for pg
William-Hill Feb 17, 2026
b9d5ff5
feat: migrate dashboard API routes from mysql2 to pg (Postgres)
William-Hill Feb 18, 2026
6b1b664
Merge pull request #46 from devcolor/rebranding/task-4-python-db-migr…
William-Hill Feb 18, 2026
044b987
Merge pull request #47 from devcolor/rebranding/task-5-ml-pipeline
William-Hill Feb 18, 2026
4f99aa3
Merge pull request #48 from devcolor/rebranding/task-6-nextjs-pg-pool
William-Hill Feb 18, 2026
b1ca0b4
feat: migrate query API routes from mysql2/KCTCS to pg/Bishop State
William-Hill Feb 18, 2026
e507f6a
fix: correct cohort filter format in non-LLM prompt-analyzer fallback…
William-Hill Feb 18, 2026
7b418f1
fix: resolve merge conflicts with rebranding/bishop-state epic branch
William-Hill Feb 18, 2026
1921cea
Merge pull request #49 from devcolor/rebranding/task-7-dashboard-api-…
William-Hill Feb 18, 2026
fd779e3
fix: resolve merge conflicts with rebranding/bishop-state epic branch
William-Hill Feb 18, 2026
7f3607d
Merge pull request #50 from devcolor/rebranding/task-8-query-api-post…
William-Hill Feb 18, 2026
1630bb3
feat: rebrand frontend UI from KCTCS to Bishop State Community College
William-Hill Feb 19, 2026
2f4578c
Merge pull request #52 from devcolor/rebranding/task-9-frontend-ui
William-Hill Feb 19, 2026
7947d19
feat: migrate Docker setup from MariaDB to Postgres
William-Hill Feb 19, 2026
b913b4e
Merge pull request #53 from devcolor/rebranding/task-10-docker-config
William-Hill Feb 19, 2026
d5582c0
docs: rebrand documentation from KCTCS/MariaDB to Bishop State/Postgres
William-Hill Feb 19, 2026
1bdfdb5
Merge pull request #54 from devcolor/rebranding/task-11-update-docs
William-Hill Feb 19, 2026
8e892c9
chore: remove old KCTCS data files and merge script
William-Hill Feb 19, 2026
8deba7d
Merge pull request #55 from devcolor/rebranding/task-12-cleanup-kctcs…
William-Hill Feb 19, 2026
2b394e5
feat: final verification sweep - fix remaining KCTCS references (#56)
William-Hill Feb 19, 2026
b6c7115
chore: add .worktrees/ to .gitignore
William-Hill Feb 20, 2026
2af895b
fix: correct stale chart subtitles and KCTCS-era values in dashboard
William-Hill Feb 20, 2026
b15e963
docs: add readiness methodology with PDP citations and LLM feasibility
William-Hill Feb 20, 2026
f3b6a7d
docs: add readiness score as Model 9 to ML models guide
William-Hill Feb 20, 2026
cd132a3
feat: add PDP credit momentum and math placement to readiness rule en…
William-Hill Feb 20, 2026
745381b
feat: add optional LiteLLM narrative enrichment for medium/low readin…
William-Hill Feb 20, 2026
d1e2b9d
feat: add methodology page and nav link to dashboard
William-Hill Feb 20, 2026
f047e20
fix: add Metadata type annotation and fix citation list semantics
William-Hill Feb 20, 2026
0c74343
fix: lazy-import litellm, add CREATE TABLE IF NOT EXISTS for run log,…
William-Hill Feb 20, 2026
a7342b5
feat: readiness score rule engine (Option C) (#57)
William-Hill Feb 21, 2026
b8216ed
feat: merge readiness PDP alignment, methodology page, and LiteLLM en…
William-Hill Feb 21, 2026
77663a9
docs: add CHANGELOG.md with version history from v0.1.0-kctcs through…
William-Hill Feb 21, 2026
ea4ffd3
feat: update methodology page — remove impl refs, add worked examples…
William-Hill Feb 21, 2026
4a7a1c4
feat: SQL Query Interface UX — result table, chart layout, tooltips, …
William-Hill Feb 21, 2026
ac0dc24
fix: cast np.float64 metrics to float for psycopg2, force labels=[0,1…
William-Hill Feb 21, 2026
4c6c854
fix: upgrade Next.js 16.0.1 → 16.1.6 (CVE-2025-66478)
William-Hill Feb 21, 2026
5853c76
chore: add deploy script for Vercel until GitHub Actions are set up
William-Hill Feb 22, 2026
390dfa5
chore: update ML pipeline report from hosted Supabase run
William-Hill Feb 22, 2026
4f5ceb8
docs: update PRD for Bishop State focus and add Gamma slide deck upda…
William-Hill Feb 22, 2026
83aae90
Merge pull request #64 from devcolor/rebranding/task-17-update-docs-f…
William-Hill Feb 22, 2026
31ce6bb
docs: rewrite Gamma prompt with slide-by-slide instructions based on …
William-Hill Feb 22, 2026
47253f6
chore: merge main into epic — keep Bishop State/Postgres versions, ac…
William-Hill Feb 22, 2026
125 changes: 125 additions & 0 deletions .claude/docs/architectural_patterns.md
@@ -0,0 +1,125 @@
# Architectural Patterns

This document describes architectural patterns used consistently across this codebase.

## API Design Patterns

### Route Structure
All Next.js API routes export explicit HTTP method handlers. See:
- `codebenders-dashboard/app/api/analyze/route.ts:82-87`
- `codebenders-dashboard/app/api/dashboard/kpis/route.ts:22`
- `codebenders-dashboard/app/api/execute-sql/route.ts:33`

### Error Response Standardization
Consistent error structure: `{ error: string, details?: string }` with appropriate HTTP status codes (400 for bad requests, 404 for not found, 500 for server errors). See:
- `codebenders-dashboard/app/api/dashboard/kpis/route.ts:54-59`
- `codebenders-dashboard/app/api/execute-sql/route.ts:72-78`
- `codebenders-dashboard/app/api/analyze/route.ts:212-218`
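
A minimal sketch of the error shape described above. The helper names `errorBody` and `statusFor` are hypothetical and do not come from the referenced routes; only the `{ error, details? }` shape and the 400/404/500 codes are taken from the text.

```typescript
// Hypothetical helpers; the real route handlers may build responses differently.
type ApiError = { error: string; details?: string };

type ErrorKind = "bad_request" | "not_found" | "server_error";

const STATUS_BY_KIND: Record<ErrorKind, number> = {
  bad_request: 400,
  not_found: 404,
  server_error: 500,
};

function errorBody(error: string, details?: string): ApiError {
  // Leave `details` out entirely when it is not supplied.
  return details === undefined ? { error } : { error, details };
}

function statusFor(kind: ErrorKind): number {
  return STATUS_BY_KIND[kind];
}
```

In a route handler, the two values would typically be combined as `NextResponse.json(errorBody(...), { status: statusFor(...) })`.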

### Console Logging with Prefixes
Debug logs use module prefixes for traceability (e.g., `[analyze]`, `[v0]`). See:
- `codebenders-dashboard/app/api/analyze/route.ts:83-211`
- `codebenders-dashboard/lib/query-executor.ts:11-72`

## Database Access Patterns

### Connection Pooling (TypeScript)
Lazy-initialized singleton pg Pool prevents connection exhaustion:
- `codebenders-dashboard/lib/db.ts` - `getPool()` singleton

Key config: `max: 10`, pool error handler registered on init
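
The lazy-singleton idea can be sketched without a database as follows. `lazySingleton` and `FakePool` are illustrative names, not the contents of `lib/db.ts`; `FakePool` merely stands in for the `pg` pool so the sketch runs on its own.

```typescript
// Generic lazy singleton: the factory runs once, on first use.
function lazySingleton<T>(factory: () => T): () => T {
  let instance: T | undefined;
  let created = false;
  return () => {
    if (!created) {
      instance = factory();
      created = true;
    }
    return instance as T;
  };
}

// Stand-in for a real connection pool (assumption for illustration).
interface FakePool {
  max: number;
}

let poolsCreated = 0;

const getPool = lazySingleton<FakePool>(() => {
  poolsCreated += 1;
  return { max: 10 }; // mirrors the `max: 10` setting noted above
});
```

Every call to `getPool()` returns the same object, so repeated imports or hot reloads that go through the accessor do not open additional pools. With the real driver, the factory would presumably construct a `pg` `Pool` and register its `error` listener at the same point.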

### Connection Pooling (Python)
psycopg2 with connection pooling via SQLAlchemy:
- `operations/db_utils.py` - `get_connection()`, `get_sqlalchemy_engine()`
- `operations/db_config.py` - Centralized DB_CONFIG

### Parameterized Queries
All dynamic queries use `$1`, `$2` placeholders (Postgres style) with params arrays to prevent SQL injection:
- `codebenders-dashboard/app/api/dashboard/readiness/route.ts:47-96`
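
As a rough illustration of the placeholder pattern, the sketch below builds a query whose placeholder numbers track the params array. The filter names and the `cohort` and `readiness_level` columns are assumptions made for the example, not the actual readiness route.

```typescript
// Hypothetical filters and column names, for illustration only.
interface ReadinessFilters {
  cohort?: string;
  level?: string;
}

interface ParameterizedQuery {
  text: string;
  values: string[];
}

function buildReadinessQuery(filters: ReadinessFilters): ParameterizedQuery {
  const conditions: string[] = [];
  const values: string[] = [];

  // Push the value first so values.length is the 1-based placeholder index.
  if (filters.cohort !== undefined) {
    values.push(filters.cohort);
    conditions.push(`cohort = $${values.length}`);
  }
  if (filters.level !== undefined) {
    values.push(filters.level);
    conditions.push(`readiness_level = $${values.length}`);
  }

  const where = conditions.length > 0 ? ` WHERE ${conditions.join(" AND ")}` : "";
  return { text: `SELECT * FROM student_predictions${where}`, values };
}
```

User input only ever lands in `values`, never in `text`, so a value such as `'; DROP TABLE x; --` is sent to Postgres as data. The result can be passed to the driver as `pool.query(text, values)`.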

### Bulk Data Insertion
Chunked DataFrame insertion (1000 records/batch) with progress tracking:
- `operations/db_utils.py:49-117` - `save_dataframe_to_db()`

## React/Next.js Patterns

### Independent State Variables
Multiple `useState` hooks for different data domains instead of a single state object:
- `codebenders-dashboard/app/page.tsx:51-58`
- `codebenders-dashboard/app/query/page.tsx:27-32`

### Parallel Data Fetching
`Promise.all()` for concurrent API calls:
- `codebenders-dashboard/app/page.tsx:67-81`

### Component Loading States
Three-state rendering pattern: loading skeleton → error message → content:
- `codebenders-dashboard/components/kpi-card.tsx:19-59`
- `codebenders-dashboard/components/risk-alert-chart.tsx:47-84`

## ML Pipeline Patterns

### Feature Engineering Pipeline
Sequential stages: data loading → feature engineering → preprocessing → training → evaluation → storage:
- `ai_model/complete_ml_pipeline.py:91-179` - Target variable calculation
- `ai_model/complete_ml_pipeline.py:189-223` - Feature set definitions

### Preprocessing with Label Encoding
Centralized preprocessing: median imputation for numeric, "Unknown" for categorical, LabelEncoder for object types:
- `ai_model/complete_ml_pipeline.py:232-256` - `preprocess_features()`

### Model Performance Tracking
Metrics saved to `ml_model_performance` table after each training run:
- `operations/db_utils.py:159-210` - `save_model_performance()`

## Component Patterns

### TypeScript Props Interfaces
All components define explicit prop interfaces with optional loading/error fields:
- `codebenders-dashboard/components/kpi-card.tsx:5-16`
- `codebenders-dashboard/components/risk-alert-chart.tsx:13-17`

### Chart Color Mapping
Centralized color dictionaries mapping semantic values to hex colors:
- `codebenders-dashboard/components/risk-alert-chart.tsx:19-24`
- `codebenders-dashboard/components/retention-risk-chart.tsx:19-24`

Colors: LOW=#22c55e (green), MODERATE=#eab308 (yellow), HIGH=#f97316 (orange), URGENT=#ef4444 (red)
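
A small sketch of such a dictionary, using the four hex values listed above. The `colorFor` lookup and the gray fallback are assumptions added for the example; the chart components may handle unknown labels differently.

```typescript
type RiskLevel = "LOW" | "MODERATE" | "HIGH" | "URGENT";

// Hex values taken from the list above.
const RISK_COLORS: Record<RiskLevel, string> = {
  LOW: "#22c55e",
  MODERATE: "#eab308",
  HIGH: "#f97316",
  URGENT: "#ef4444",
};

// Hypothetical neutral gray for labels outside the four known levels.
const FALLBACK_COLOR = "#9ca3af";

function isRiskLevel(label: string): label is RiskLevel {
  // Own-property check so inherited keys such as "toString" do not match.
  return Object.prototype.hasOwnProperty.call(RISK_COLORS, label);
}

function colorFor(label: string): string {
  const key = label.toUpperCase();
  return isRiskLevel(key) ? RISK_COLORS[key] : FALLBACK_COLOR;
}
```

Typing the dictionary as `Record<RiskLevel, string>` makes the compiler flag a missing level if a new one is added to the union.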

### Multi-Format Export
Single component handles CSV, JSON, Markdown exports via `downloadFile()` utility:
- `codebenders-dashboard/components/export-button.tsx:25-209`

## Configuration Patterns

### Environment Variable Hierarchy
ENV vars with fallback defaults for development:
- `codebenders-dashboard/app/api/dashboard/readiness/route.ts:4-10`
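
One way the fallback pattern might look is sketched below. The `POSTGRES_*` names and default values are borrowed from `.docker.env.example` in this change set; `DB_HOST` and the `readDbConfig` helper are hypothetical, and the variables read by the actual route may differ.

```typescript
type Env = Record<string, string | undefined>;

interface DbConfig {
  host: string;
  port: number;
  user: string;
  database: string;
}

function readDbConfig(env: Env): DbConfig {
  // Fall back to the default port when the variable is missing or not numeric.
  const port = parseInt(env["POSTGRES_PORT"] ?? "", 10);
  return {
    host: env["DB_HOST"] ?? "localhost", // hypothetical variable name
    port: isNaN(port) ? 5432 : port,
    user: env["POSTGRES_USER"] ?? "postgres",
    database: env["POSTGRES_DB"] ?? "bishop_state",
  };
}
```

Taking the environment as a parameter, rather than reading `process.env` inside the function, keeps the helper easy to test; a caller would pass `process.env` in practice.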

### Schema Configuration Constants
Database schema metadata as constants for multi-institution support:
- `codebenders-dashboard/app/api/analyze/route.ts:22-79` - SCHEMA_INFO
- `codebenders-dashboard/lib/prompt-analyzer.ts:4-28` - SCHEMA_CONFIG

## Error Handling Patterns

### Try-Catch with Typed Errors
All API routes wrap operations in try-catch and return structured errors; stack traces are logged rather than returned:
- `codebenders-dashboard/app/api/analyze/route.ts:209-220`
- `codebenders-dashboard/app/api/execute-sql/route.ts:65-79`

### Python Status Reporting
Visual status indicators with `print()` statements:
- `operations/db_utils.py` - Uses `✓` for success, `✗` for failure
- Section headers with `"=" * 80`

## Data Transformation Patterns

### JSON Field Parsing
Parse JSON columns from DB, aggregate values, handle parse errors gracefully:
- `codebenders-dashboard/app/api/dashboard/readiness/route.ts:119-157`
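
The parse-aggregate-tolerate pattern could look roughly like this. The `risk_factors` column and its JSON string-array payload are assumptions for the example, not the schema used by the readiness route.

```typescript
// Hypothetical row shape: `risk_factors` holds a JSON-encoded string array.
interface Row {
  risk_factors: string | null;
}

interface FactorSummary {
  counts: Record<string, number>;
  skipped: number;
}

function countFactors(rows: Row[]): FactorSummary {
  const counts: Record<string, number> = {};
  let skipped = 0;

  for (const row of rows) {
    if (row.risk_factors === null) continue; // nothing stored for this row

    let parsed: unknown;
    try {
      parsed = JSON.parse(row.risk_factors);
    } catch {
      skipped += 1; // malformed JSON: count it and move on
      continue;
    }

    if (!Array.isArray(parsed)) {
      skipped += 1; // valid JSON but not the expected array shape
      continue;
    }

    for (const item of parsed) {
      if (typeof item === "string") {
        counts[item] = (counts[item] ?? 0) + 1;
      }
    }
  }

  return { counts, skipped };
}
```

Returning the `skipped` count alongside the aggregate lets the caller log or surface data-quality problems instead of failing the whole request on one bad row.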

### Query Plan to SQL Conversion
Semantic query plans translated to SQL using schema-aware column mapping:
- `codebenders-dashboard/lib/prompt-analyzer.ts:30-174`
19 changes: 19 additions & 0 deletions .docker.env.example
@@ -0,0 +1,19 @@
# PostgreSQL Configuration for Docker Compose
# Copy this file to .docker.env and update with your values

# Database user
POSTGRES_USER=postgres

# Database password
POSTGRES_PASSWORD=devcolor2025

# Database name
POSTGRES_DB=bishop_state

# Port mapping (host:container)
POSTGRES_PORT=5432

# pgAdmin configuration (optional)
PGADMIN_EMAIL=admin@bishopstate.edu
PGADMIN_PASSWORD=devcolor2025
PGADMIN_PORT=8080
14 changes: 14 additions & 0 deletions .gitignore
@@ -53,6 +53,11 @@ lerna-debug.log*
package-lock.json
yarn.lock
pnpm-lock.yaml

# Exceptions: dashboard source files that conflict with Python ignores above
!codebenders-dashboard/lib/
!codebenders-dashboard/lib/**

dist/
dist-ssr/
*.local
@@ -150,15 +155,24 @@ supabase/.env
# Docker (if used for local dev)
docker-compose.override.yml

# Git worktrees
.worktrees/

# Misc
.cache/
*.seed
*.pid
*.pid.lock
.terraform/

# Exceptions: dashboard source files that conflict with Python ignores above
!codebenders-dashboard/lib/
!codebenders-dashboard/lib/**

# Operations scripts (local utilities)
operations/fix_institution_id.py
operations/list_tables.py
operations/convert_institution_id_to_string.py
operations/verify_institution_id.py
.vercel
.env.deploy
96 changes: 96 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,96 @@
# Changelog

All notable changes to the Bishop State Student Success Dashboard are documented here.

Each version corresponds to a git tag. To compare any two versions:
```
git log v0.1.0-kctcs..v0.2.0-bishop-state-data --oneline
```

---

## [v0.6.0-readiness-engine] — 2026-02-20

Student readiness scoring, PDP-aligned methodology documentation, and optional LiteLLM narrative enrichment.

### Added
- Rule-based readiness score engine (`ai_model/generate_readiness_scores.py`) — deterministic, FERPA-safe, fully auditable
- Supabase migration creating `llm_recommendations` and `readiness_generation_runs` tables
- `docs/READINESS_METHODOLOGY.md` — full scoring formula, PDP alignment table, FERPA compliance notes, and 5 research citations (CCRC, CAPR, Bird et al. 2021)
- Model 9 (Readiness Score) section in `ML_MODELS_GUIDE.md`
- PDP credit momentum (12-credit Year 1 milestone) and math placement components to rule engine
- Optional `--enrich-with-llm` flag using LiteLLM for provider-agnostic narrative enrichment (OpenAI, Ollama, Anthropic, Azure)
- `/methodology` page in Next.js dashboard with scoring breakdown tables, PDP alignment, and references
- Methodology nav button in dashboard header

### Fixed
- Stale chart subtitles showing concatenated strings instead of student counts (PostgreSQL `COUNT(*)` bigint → string coercion)
- Data source footnote referencing wrong table name and student count

---

## [v0.5.0-docs-cleanup] — 2026-02-19

Final documentation sweep and removal of all remaining KCTCS references.

### Added
- Docker setup migrated from MariaDB to Postgres

### Changed
- All documentation rebranded from KCTCS/MariaDB to Bishop State/Postgres

### Removed
- Old KCTCS data files and merge script

### Fixed
- Final sweep removing remaining KCTCS references across codebase

---

## [v0.4.0-frontend-rebrand] — 2026-02-18

Frontend UI, dashboard API routes, and query API fully migrated to Bishop State Community College and Postgres.

### Changed
- Frontend UI rebranded from KCTCS to Bishop State Community College (institution name, colors, copy)
- Dashboard API routes migrated from `mysql2`/KCTCS to `pg`/Bishop State
- Query API routes migrated from `mysql2`/KCTCS to `pg`/Bishop State
- Shared Postgres connection pool module added for Next.js

### Fixed
- Cohort filter format in non-LLM prompt-analyzer fallback path

---

## [v0.3.0-postgres-migration] — 2026-02-17

Python ML pipeline database layer fully migrated from MariaDB to Postgres.

### Changed
- Python DB layer migrated from `pymysql`/MariaDB to `psycopg2`/Postgres
- ML pipeline updated for Bishop State data and Supabase Postgres

---

## [v0.2.0-bishop-state-data] — 2026-02-17

Bishop State data introduced and local Supabase development environment initialized.

### Added
- Local Supabase setup for Postgres development
- Synthetic Bishop State student data generation script
- Bishop State data merge script and merged dataset
- Updated ML models and documentation for Bishop State

---

## [v0.1.0-kctcs] — 2025-10-29

Baseline snapshot of the KCTCS/MariaDB codebase immediately before the Bishop State rebranding effort began.

### Context
This tag marks the final state of the project when it was built for Kentucky Community and Technical College System (KCTCS) using MariaDB. All subsequent versions represent the migration to Bishop State Community College and Postgres.

---

*Generated from git tags. See [GitHub Releases](https://github.com/devcolor/codebenders-datathon/releases) or use `git log <tag1>..<tag2>` for full commit details between versions.*
97 changes: 97 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,97 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Bishop State Student Success Prediction - Full-stack ML + web application predicting student outcomes for Bishop State Community College. Uses 5 ML models to generate retention predictions, early warnings, time-to-credential estimates, credential type forecasts, and GPA predictions for ~4K students.

## Tech Stack

| Layer | Technologies |
|-------|-------------|
| ML Pipeline | Python 3.8+, XGBoost, scikit-learn, pandas |
| Frontend | Next.js 16, React 19, TypeScript, Tailwind CSS |
| Charts | Recharts |
| UI Components | shadcn/ui (Radix UI) |
| Database | Postgres (Supabase), pg driver |
| AI Features | OpenAI (natural language query analysis) |
| Infrastructure | Docker Compose, Vercel |

## Key Directories

| Directory | Purpose |
|-----------|---------|
| `ai_model/` | Python ML pipeline - 5 models (XGBoost + Random Forest) |
| `codebenders-dashboard/` | Next.js web application |
| `codebenders-dashboard/app/` | App Router pages and API routes |
| `codebenders-dashboard/components/` | React components (shadcn/ui based) |
| `codebenders-dashboard/lib/` | Utilities: prompt-analyzer.ts, query-executor.ts |
| `operations/` | Database utilities and configuration |
| `data/` | CSV data files (~20K students, ~500K courses) |

## Essential Commands

### ML Pipeline
```bash
pip install -r requirements.txt # Install Python dependencies
cd ai_model && python complete_ml_pipeline.py # Run full pipeline
python -m operations.test_db_connection # Test DB connection
```

### Dashboard
```bash
cd codebenders-dashboard
npm install # Install dependencies
npm run dev # Dev server (localhost:3000)
npm run build # Production build
npm run lint # Lint check
```

### Docker
```bash
docker-compose up -d # Start Postgres + pgAdmin
docker-compose down -v # Stop and remove volumes
```

## Database Schema

Three main tables in the `bishop_state` Postgres database:
- `student_predictions` - Student-level predictions (~4K records)
- `course_predictions` - Course-level predictions (~100K records)
- `ml_model_performance` - Model metrics and training history

## Key Entry Points

| File | Purpose |
|------|---------|
| `ai_model/complete_ml_pipeline.py:1` | Main ML entry point |
| `codebenders-dashboard/app/page.tsx:1` | Dashboard home page |
| `codebenders-dashboard/app/query/page.tsx:1` | Query interface page |
| `codebenders-dashboard/lib/prompt-analyzer.ts:30` | LLM-powered SQL generation |
| `operations/db_config.py:8` | Database configuration |

## Python Environment

- **Always** use the project virtualenv at `venv/` when running Python commands.
- Activate with `source venv/bin/activate` or use `venv/bin/python` directly.
- Install dependencies into the venv, not globally.

## Git Commit Rules

- **Never** add `Co-Authored-By` lines to commit messages.

## Additional Documentation

Check these files for detailed information on specific topics:

| Topic | File |
|-------|------|
| Architectural patterns | `.claude/docs/architectural_patterns.md` |
| Project overview | `README.md` |
| Quick start guide | `QUICKSTART.md` |
| Data field descriptions | `DATA_DICTIONARY.md` |
| ML model details | `ML_MODELS_GUIDE.md` |
| Dashboard features | `codebenders-dashboard/DASHBOARD_README.md` |
| Database utilities | `operations/README.md` |
| Docker setup | `DOCKER_SETUP.md` |
6 changes: 3 additions & 3 deletions DASHBOARD_VISUALIZATIONS.md
@@ -1,7 +1,7 @@
# Dashboard Visualizations Guide
-## KCTCS Student Success Analytics & Predictive Models
+## Bishop State Community College Student Success Analytics & Predictive Models

-**Dataset**: `kctcs_student_level_with_predictions.csv`
+**Dataset**: `bishop_state_student_level_with_predictions.csv`
**Students**: 32,800
**Date**: October 28, 2025
**Purpose**: Comprehensive visualization guide for retention, graduation, and student success metrics
@@ -680,7 +680,7 @@ Average(risk_score) by Program_of_Study_Year_1

- **Model Performance Details**: See `ML_MODELS_GUIDE.md`
- **Data Dictionary**: See `DATA_DICTIONARY.md`
-- **Raw Data**: `kctcs_student_level_with_predictions.csv`
+- **Raw Data**: `bishop_state_student_level_with_predictions.csv`

---
