Gesture Data Collection and Evaluation is a research prototype for collecting natural human gestures over computer interface screenshots and evaluating multimodal models on gesture-intent understanding.
The project has two main surfaces:
- A FastAPI + Vue web app with a researcher admin UI and a participant collection UI.
- A CLI evaluation runner that sends gesture videos and screenshots to target models, then scores predicted intent with a judge model.
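The target-then-judge flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `Sample`, `evaluate`, `mock_target`, and `mock_judge` are hypothetical names, and the exact-match judge is only a stand-in for a real judge model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Sample:
    # Hypothetical fields; the real dataset schema may differ.
    video_path: str
    screenshot_path: str
    reference_intent: str


def evaluate(
    sample: Sample,
    target: Callable[[str, str], str],
    judge: Callable[[str, str], float],
) -> dict:
    """Ask the target model for an intent, then let the judge score it."""
    predicted = target(sample.video_path, sample.screenshot_path)
    score = judge(predicted, sample.reference_intent)
    return {"predicted_intent": predicted, "score": score}


def mock_target(video_path: str, screenshot_path: str) -> str:
    # Stand-in for a real multimodal model call.
    return "archive the highlighted email"


def mock_judge(predicted: str, reference: str) -> float:
    # Stand-in for a judge model: exact match after normalization.
    return 1.0 if predicted.strip().lower() == reference.strip().lower() else 0.0


sample = Sample("clip.webm", "inbox.png", "Archive the highlighted email")
result = evaluate(sample, mock_target, mock_judge)
print(result)
```

Passing the target and judge as plain callables keeps the sketch independent of any particular provider.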
This public repository includes source code, tests, prompt templates, and small synthetic fixtures. It intentionally does not include local collection databases, real participant videos, private questionnaire responses, run outputs, or API keys.
- Participant collection flow at /collect.
- Researcher admin UI at /admin.
- Screenshot library and collection enable/disable controls.
- Gesture recording with target-region selection and free-text gesture/intent descriptions.
- Questionnaire templates and responses.
- Collection package preview/import support.
- Model evaluation modes: single, two_stage, video_only, both, all, and openrouter_only.
- Providers: mock, OpenAI/GPT, Gemini/Gemma, Qwen, DeepSeek judge, OpenRouter, OpenAI-compatible endpoints, and Ollama.
- JSONL dataset manifests and structured run outputs.
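A JSONL manifest holds one JSON object per line, so it can be read with the standard library alone. The sketch below assumes nothing about the project's real schema: `read_manifest` is a hypothetical helper, and the `id` and `intent` fields are made up for the demo.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterator


def read_manifest(path: Path) -> Iterator[dict]:
    """Yield one record per non-empty line of a JSONL file."""
    with path.open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_number}: invalid JSON") from exc


# Demo with hypothetical field names written to a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    manifest = Path(tmp) / "dataset.jsonl"
    manifest.write_text(
        '{"id": "s1", "intent": "archive the highlighted email"}\n'
        "\n"
        '{"id": "s2", "intent": "open the selected profile"}\n',
        encoding="utf-8",
    )
    records = list(read_manifest(manifest))

print([record["id"] for record in records])
```

Reporting the line number on a parse failure makes it easier to locate a malformed record in a long manifest.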
- gesture_eval/: Python package, FastAPI app, CLI runner, providers, database helpers, prompts, and evaluation core.
- frontend/src/: Vue 3 source for admin and collection pages.
- examples/: small synthetic fixtures for smoke tests and schema examples.
- prompts/: target-model and judge prompt templates.
- tests/: Python unittest coverage.
- scripts/: safe utility scripts.
- docs/images/: public-safe synthetic screenshots used in this README.
The following are intentionally ignored and should remain local:
- .env
- data/
- runs/
- output/
- .tmp-tests/
- .uv-cache/
- .venv/
- frontend/node_modules/
- gesture_eval/web/static/app/
- real collected videos, local SQLite databases, questionnaires, and API keys
The images below are synthetic AI-generated desktop/application screenshots used as public examples.
This repository also includes an English-only GitHub Pages showcase at docs/index.html. It is designed for direct browser launch and does not require the FastAPI server.
The Pages showcase links to two static demos that mirror the real /collect and /admin surfaces as closely as possible without a backend:
- docs/demo/index.html: participant collection, cross evaluation using the same browser-local clip, questionnaire, and simulated upload.
- docs/researcher/index.html: research console dashboard, Cloudflare tunnel panel, provider settings, collection analysis, evaluation setup, and run detail preview.
For GitHub Pages, publish the repository from the docs/ folder and open the Pages root. For local preview, open docs/index.html in a current desktop browser.
Install Python dependencies with uv:
UV_CACHE_DIR=.uv-cache uv sync
Install frontend dependencies:
cd frontend
npm install
Build the frontend before serving the production web app:
cd frontend
npm run build
From the repository root:
UV_CACHE_DIR=.uv-cache uv run python -m gesture_eval.web serve \
--db data/gesture_data.sqlite \
--host 127.0.0.1 \
--port 8765
Open:
- Admin UI: http://127.0.0.1:8765/admin
- Collection UI: http://127.0.0.1:8765/collect
The app creates local SQLite and media files under data/. That directory is ignored by Git because it may contain participant data.
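One common way to pair a SQLite database with media files on disk is to keep only relative paths in the table and resolve them against a media root when needed. The sketch below illustrates that idea with an in-memory database. The `recordings` table, its columns, and `MEDIA_ROOT` are assumptions for illustration and are not taken from the project's schema.

```python
import sqlite3
from pathlib import Path

# Hypothetical media root; the real layout under data/ may differ.
MEDIA_ROOT = Path("data/media")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recordings (id INTEGER PRIMARY KEY, video_path TEXT NOT NULL)"
)

# Store the relative path of the clip, not its bytes.
conn.execute(
    "INSERT INTO recordings (video_path) VALUES (?)",
    ("session-1/clip-001.webm",),
)
conn.commit()

(stored,) = conn.execute(
    "SELECT video_path FROM recordings WHERE id = 1"
).fetchone()

# Resolve the stored path against the media root when the file is needed.
full_path = MEDIA_ROOT / stored
print(full_path.as_posix())
conn.close()
```

Keeping paths relative lets the data directory be moved without rewriting rows.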
The helper scripts provide the same local flow:
./start.sh
Windows PowerShell:
.\start.ps1
For short-lived remote collection sessions, run the local FastAPI app and expose it with a temporary Cloudflare tunnel:
cloudflared tunnel --url http://127.0.0.1:8765
The admin dashboard also includes tunnel status/start/stop controls when cloudflared is available. Share only the /collect link with participants. Keep /admin protected with GESTURE_ADMIN_PASSWORD in any shared environment.
Copy .env.example to .env for local credentials:
cp .env.example .env
Supported environment variables include:
- OPENAI_API_KEY
- GEMINI_API_KEY
- GOOGLE_GEMINI_API_KEY
- OPENROUTER_API_KEY
- OPENROUTER_BASE_URL
- QWEN_API_KEY
- DEEPSEEK_API_KEY
- MULTIMODAL_API_KEY
- GESTURE_ADMIN_PASSWORD
- GESTURE_ADMIN_COOKIE_SECURE
The prototype can write provider settings to .env from the admin UI. Do not commit .env.
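Before starting a run, it can be useful to check which provider keys are actually set. The helper below is a hypothetical sketch, not part of the project: `configured_keys` simply reports which of the API key variables listed in this README have a non-empty value, and it never prints the values themselves.

```python
import os
from typing import Mapping

# API key variable names as listed in this README.
KEY_NAMES = (
    "OPENAI_API_KEY",
    "GEMINI_API_KEY",
    "GOOGLE_GEMINI_API_KEY",
    "OPENROUTER_API_KEY",
    "QWEN_API_KEY",
    "DEEPSEEK_API_KEY",
    "MULTIMODAL_API_KEY",
)


def configured_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names (never the values) of keys that are set and non-blank."""
    return [name for name in KEY_NAMES if env.get(name, "").strip()]


# Demo with an explicit mapping instead of the real environment.
demo_env = {"OPENAI_API_KEY": "placeholder", "QWEN_API_KEY": "   "}
print(configured_keys(demo_env))
```

Taking the environment as a parameter keeps the check easy to test without touching real credentials.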
Run the mock provider against the synthetic example dataset:
UV_CACHE_DIR=.uv-cache uv run python -m gesture_eval.cli \
--dataset examples/dataset.jsonl \
--target-model mock-vlm \
--provider mock \
--mode both \
--judge-provider mock \
--judge-model mock-judge \
--output-dir runs/smoke
Outputs include:
- call_logs.jsonl
- results.jsonl
- summary.json
- detailed_report.json
- paper_report.md
runs/ is ignored because real runs can include prompts, model responses, media paths, and local evaluation metadata.
Dry-run synthetic screenshot generation:
UV_CACHE_DIR=.uv-cache uv run python -m gesture_eval.screenshot_cli \
--core-intent "archive the highlighted email" \
--ui-category email \
--output-dir runs/generated_screenshots \
--mock
Real image generation requires an OpenAI-compatible Images API key:
MULTIMODAL_API_KEY=<your-key> UV_CACHE_DIR=.uv-cache uv run python -m gesture_eval.screenshot_cli \
--core-intent "open the selected user's profile" \
--ui-category people_grid \
--image-base-url https://example.com/v1 \
--use-prompt-model \
--count 3
Generated screenshots should be reviewed before becoming public examples.
- SQLite stores media paths, not video/image blobs.
- Public collection routes should expose only screenshots marked as collection-enabled.
- Collection sessions use server-generated tokens for session-scoped access.
- Admin routes and evaluation routes should be protected with GESTURE_ADMIN_PASSWORD when shared beyond local development.
- API keys may be stored in .env during prototype use.
- Real videos, questionnaires, user/session data, and run outputs are not part of this public repository.
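Two of the notes above, server-generated session tokens and password-protected admin routes, are commonly implemented with the Python standard library. The sketch below shows one such approach. It is an assumption about how this could be done, not a description of the project's code, and `new_session_token` and `password_matches` are hypothetical names.

```python
import hmac
import secrets


def new_session_token() -> str:
    """Generate an unguessable, URL-safe token from 32 random bytes."""
    return secrets.token_urlsafe(32)


def password_matches(supplied: str, expected: str) -> bool:
    """Compare two passwords in constant time to limit timing side channels."""
    return hmac.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))


token = new_session_token()
print(len(token) >= 43)  # 32 random bytes encode to at least 43 characters
print(password_matches("guess", "correct horse"))
```

In a real deployment the expected password would come from GESTURE_ADMIN_PASSWORD rather than a literal string.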
Python:
UV_CACHE_DIR=.uv-cache uv run python -m unittest discover -s tests -v
UV_CACHE_DIR=.uv-cache uv run python -m compileall gesture_eval tests
Frontend:
cd frontend
npm test -- --run
npm run build
Public release check:
UV_CACHE_DIR=.uv-cache uv run python scripts/check_public_release.py

