MU-Smart/DataVis

Project Overview

DataVis Client is a multi-sensor data analysis and visualization toolkit for health research. It's a full-stack Next.js application that processes CSV/Excel sensor data through Python preprocessing pipelines and renders interactive time series visualizations with AI-powered insights.

Commands

npm run dev              # Start Next.js dev server (Turbopack)
npm run genkit:dev       # Start Genkit AI development server
npm run build            # Production build
npm run lint             # ESLint
npm run typecheck        # TypeScript check (tsc --noEmit)

Python scripts are executed via Node.js child_process. Python 3 with pandas, matplotlib, seaborn, and openpyxl is required.

Architecture

Data Flow

  1. Upload/Import → User uploads CSV/Excel files or imports from Google Drive
  2. Preprocessing → /api/python/run executes python_scripts/preprocess.py, which routes to specialized preprocessors:
    • preprocessors/metawear.py - MetaWear pressure sensors
    • preprocessors/palanalysis.py - PALanalysis activity data
    • preprocessors/glucose.py - CGM glucose monitoring
    • preprocessors/k5.py - COSMED K5 metabolic cart data
    • preprocessors/generic.py - Auto-detection fallback
  3. State Management → Processed data stored as RawDataEntry[] in main page state
  4. Visualization → src/lib/csv.ts handles time coercion, downsampling, and interpolation; Recharts renders charts
  5. AI Analysis → Genkit flows in src/ai/flows/ provide outlier detection, question answering, and Python script generation

Key Directories

  • src/app/ - Next.js App Router pages and API routes
  • src/app/page.tsx - Main DataVis Client page (primary orchestrator component)
  • src/app/sandboxing/ - Visualization sandbox for batch processing
  • src/ai/flows/ - Genkit AI prompts and flows (Zod-validated)
  • src/components/ - React components including Shadcn/ui primitives in ui/
  • src/lib/csv.ts - CSV parsing, downsampling, timeline building, series interpolation
  • src/server/python/runner.ts - Python subprocess execution
  • python_scripts/ - Python preprocessing and export scripts

API Routes

  • POST /api/python/run - Execute preprocessing on uploaded files
  • POST /api/drive/process-folder - Google Drive folder import with retry/backoff
  • POST /api/sandbox/generate-script - LLM-generated visualization scripts
  • POST /api/run-export - Execute export scripts from python_scripts/
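The Drive import route above is described only as using retry/backoff; the actual schedule is not documented here. As an illustration, one common choice is capped exponential backoff. The sketch below computes such a delay schedule. The function name `backoffDelaysMs` and the default values (500 ms base, 8000 ms cap) are assumptions made for this example, not values taken from the repository.

```typescript
// Hypothetical helper: returns the wait (in milliseconds) before each retry.
// Delay doubles on every attempt and is clamped to capMs.
// The 500 ms base and 8000 ms cap are illustrative assumptions.
function backoffDelaysMs(
  maxRetries: number,
  baseMs: number = 500,
  capMs: number = 8000
): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(Math.min(capMs, baseMs * Math.pow(2, attempt)));
  }
  return delays;
}
```

A caller would wait for `delays[i]` before retry `i + 1` and give up once the list is exhausted. Adding random jitter to each delay is a frequent refinement when many clients retry at once.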

Data Conventions

  • Time values are coerced to elapsed seconds relative to first timestamp
  • Series names are mangled as ${metric} (${fileName}) for multi-file support
  • Charts downsample to max 1000 points for performance
  • buildTimeline() and interpolateSeriesValues() align multi-file data on a uniform time grid
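The conventions above can be sketched in a few small functions. This is a minimal illustration, not the code in src/lib/csv.ts: every name and signature below is hypothetical, timestamps are assumed to be epoch milliseconds, downsampling is assumed to be evenly strided sampling, and interpolation is assumed to be linear over series sorted by time.

```typescript
// All helpers here are hypothetical sketches of the documented conventions.
interface Point {
  t: number; // elapsed seconds
  v: number; // sample value
}

// Time coercion: elapsed seconds relative to the first timestamp.
// Assumes timestamps are given in epoch milliseconds.
function toElapsedSeconds(timestampsMs: number[]): number[] {
  if (timestampsMs.length === 0) return [];
  const t0 = timestampsMs[0];
  return timestampsMs.map((t) => (t - t0) / 1000);
}

// Series naming: qualify the metric with its source file.
function seriesName(metric: string, fileName: string): string {
  return `${metric} (${fileName})`;
}

// Downsampling: keep at most maxPoints evenly strided samples,
// always including the first and last point (assumed strategy).
function downsample<T>(points: T[], maxPoints: number = 1000): T[] {
  if (maxPoints <= 0) return [];
  if (points.length <= maxPoints) return points.slice();
  if (maxPoints === 1) return [points[0]];
  const step = (points.length - 1) / (maxPoints - 1);
  const out: T[] = [];
  for (let i = 0; i < maxPoints; i++) {
    out.push(points[Math.round(i * step)]);
  }
  return out;
}

// Uniform time grid from start to end (inclusive where it fits).
function buildTimelineSketch(start: number, end: number, step: number): number[] {
  const grid: number[] = [];
  if (step <= 0 || end < start) return grid;
  const n = Math.floor((end - start) / step);
  for (let i = 0; i <= n; i++) grid.push(start + i * step);
  return grid;
}

// Linear interpolation of a time-sorted series onto an ascending grid.
// Grid points outside the series' time range yield null.
function interpolateSketch(series: Point[], grid: number[]): (number | null)[] {
  const out: (number | null)[] = [];
  let j = 0;
  for (let g = 0; g < grid.length; g++) {
    const t = grid[g];
    if (series.length === 0 || t < series[0].t || t > series[series.length - 1].t) {
      out.push(null);
      continue;
    }
    // Advance to the segment [series[j], series[j + 1]] that contains t.
    while (j + 1 < series.length - 1 && series[j + 1].t < t) j++;
    const a = series[j];
    const b = j + 1 < series.length ? series[j + 1] : a;
    const span = b.t - a.t;
    out.push(span === 0 ? a.v : a.v + ((b.v - a.v) * (t - a.t)) / span);
  }
  return out;
}
```

For example, a series sampled every 2 seconds can be placed on a 1-second grid with `interpolateSketch(series, buildTimelineSketch(0, 5, 1))`, which lets series from different files share one x-axis. Returning null outside a series' range avoids extrapolating beyond the recorded data.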

Environment Variables

Required in .env or .env.local:

  • GEMINI_API_KEY - Google Gemini API
  • NEXT_PUBLIC_GOOGLE_CLIENT_ID - OAuth client for Drive
  • NEXT_PUBLIC_GOOGLE_API_KEY - Google API key
  • OPENAI_API_KEY - OpenAI API (optional, for script generation)
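A startup check can catch a missing key before a request fails at runtime. The sketch below is an assumption about how such a check could look; the helper `missingEnvVars` is hypothetical and not part of the repository. Only the variable names come from the list above.

```typescript
// Variable names are from the list above; the helper is a hypothetical sketch.
const REQUIRED_ENV: string[] = [
  "GEMINI_API_KEY",
  "NEXT_PUBLIC_GOOGLE_CLIENT_ID",
  "NEXT_PUBLIC_GOOGLE_API_KEY",
];

// OPENAI_API_KEY is optional, so it is deliberately not checked.
function missingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js context this could be called as `missingEnvVars(process.env)`, logging or throwing if the returned list is non-empty.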

Style

  • Dark mode default with light mode toggle
  • Colors: Steel blue (#4682B4) primary, Salmon (#FA8072) accent
  • Fonts: Inter (body), Source Code Pro (code)
  • Tailwind CSS with HSL color variables

About

This is the repository for the DataVis visualization tool
