Natural language task management powered by GPT-4 + deterministic relevance scoring
MindFlow demonstrates an AI-first vertical slice integrating Custom GPT (Actions), Google Apps Script (API), and Google Sheets (data store) for intelligent task prioritization through conversation.
Goal: Enable users to manage tasks using natural language while maintaining transparent, explainable AI recommendations.
Tech Stack:
- Frontend: Custom GPT with Actions (OpenAI function calling)
- Backend: Google Apps Script (REST API)
- Database: Google Sheets (tasks + audit logs)
- Intelligence: GPT-4 + deterministic scoring algorithm
Key Features:
- 🎯 Ask "What should I do next?" → Get an intelligently ranked task with reasoning
- ✍️ Create/update tasks conversationally (no forms)
- 📊 Real-time updates visible in Google Sheet
- 🔍 Full audit trail of all operations
- 🧮 Explainable relevance scores (not ML black box)
┌─────────────────────────────────────────────────────────┐
│                 USER (Natural Language)                 │
│                "What should I do next?"                 │
└────────────────────────┬────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────┐
│             CUSTOM GPT (Intent Recognition)             │
│  • Parses user intent                                   │
│  • Calls API via Actions (function calling)             │
│  • Returns natural language response                    │
└────────────────────────┬────────────────────────────────┘
                         │ HTTPS (JSON)
        ┌────────────────┼────────────────┐
        ▼                ▼                ▼
     /create        /best-task        /complete
     /update         /snooze           /query
        │                │                │
        └────────────────┼────────────────┘
                         ▼
┌─────────────────────────────────────────────────────────┐
│           GOOGLE APPS SCRIPT (REST API Layer)           │
│  • Input validation                                     │
│  • Relevance scoring engine                             │
│  • CRUD operations on Sheets                            │
│  • Audit logging                                        │
└────────────────────────┬────────────────────────────────┘
                         │ Sheets API
         ┌───────────────┴───────────────┐
         ▼                               ▼
    ┌──────────┐                    ┌──────────┐
    │  Tasks   │                    │   Logs   │
    │  Sheet   │                    │  Sheet   │
    └──────────┘                    └──────────┘
         │
         ▼
┌──────────────────────────────────────┐
│  RELEVANCE SCORING (Deterministic)   │
│  score = 0.40×priority               │
│        + 0.35×urgency                │
│        + 0.15×context                │
│        + 0.10×momentum               │
└──────────────────────────────────────┘
| Component | Rationale |
|---|---|
| Custom GPT | Natural language interface, built-in function calling, no custom UI needed |
| Google Apps Script | Zero server ops, free tier, rapid prototyping, native Sheets integration |
| Google Sheets | Transparent data store, real-time collaboration, familiar UX, instant audit trail |
| Deterministic Scoring | Explainable (not ML black box), customizable weights, debuggable logic |
Sheet Name: tasks
| Column | Type | Required | Description | Example |
|---|---|---|---|---|
| `id` | UUID | ✓ | Unique identifier | a1b2c3d4-e5f6-... |
| `title` | String(256) | ✓ | Task description | Review Q4 metrics |
| `description` | String(1000) | ✗ | Detailed notes | Analyze quarterly performance... |
| `status` | Enum | ✓ | Current state | `pending`, `in_progress`, `completed`, `snoozed` |
| `priority` | Integer(1-5) | ✓ | Urgency level | 5 (most urgent) to 1 (low) |
| `due_date` | ISO8601 | ✗ | Deadline | 2025-11-15T17:00:00Z |
| `snoozed_until` | ISO8601 | ✗ | Hidden until | 2025-10-31T14:00:00Z |
| `created_at` | ISO8601 | ✓ | Creation time | 2025-10-30T10:00:00Z |
| `updated_at` | ISO8601 | ✓ | Last modified | 2025-10-30T11:00:00Z |
Validation Rules:
- `status` must be one of: `pending`, `in_progress`, `completed`, `snoozed`
- `priority` must be an integer 1-5 (inclusive)
- `title` must be non-empty, max 256 characters
- Date fields must be valid ISO8601 or empty
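For illustration, here is a sketch of those rules as a small validation helper that returns errors in the API's `{field, issue}` shape. This is Python with hypothetical names; the actual checks run inside the Apps Script layer (src/gas/Code.gs).

```python
from datetime import datetime

# Hypothetical helper mirroring the validation rules above; the real checks live in src/gas/Code.gs.
ALLOWED_STATUSES = {"pending", "in_progress", "completed", "snoozed"}

def validate_task(task: dict) -> list[dict]:
    """Return a list of {field, issue} dicts, matching the API's error format."""
    errors = []
    title = (task.get("title") or "").strip()
    if not title:
        errors.append({"field": "title", "issue": "Title is required"})
    elif len(title) > 256:
        errors.append({"field": "title", "issue": "Title exceeds 256 characters"})
    if task.get("status") not in ALLOWED_STATUSES:
        errors.append({"field": "status", "issue": "Invalid status"})
    priority = task.get("priority")
    if not isinstance(priority, int) or not 1 <= priority <= 5:
        errors.append({"field": "priority", "issue": "Priority must be an integer 1-5"})
    for field in ("due_date", "snoozed_until"):
        value = task.get(field)
        if value:  # empty is allowed
            try:
                datetime.fromisoformat(value.replace("Z", "+00:00"))
            except ValueError:
                errors.append({"field": field, "issue": "Must be valid ISO8601 or empty"})
    return errors
```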
Sheet Name: logs
| Column | Description | Example |
|---|---|---|
| `timestamp` | When the request occurred | 2025-10-30T11:00:00Z |
| `action` | Operation performed | GET_BEST_TASK, CREATE_TASK |
| `result` | Outcome | success, error |
| `status_code` | HTTP status | 200, 400, 500 |
| `request_id` | Tracing UUID | req-xyz-123 |
| `error_message` | Error details (if failed) | Missing required field: title |
Purpose: Full audit trail for debugging and compliance.
https://script.google.com/macros/s/{YOUR_SCRIPT_ID}/exec
Endpoint: POST /?action=create
Request:
{
"title": "Review Q4 metrics",
"description": "Analyze quarterly performance data",
"priority": 4,
"due_date": "2025-11-15T17:00:00Z"
}
Response (201):
{
"status": "success",
"code": 201,
"data": {
"id": "a1b2c3d4-e5f6-7890-g1h2-i3j4k5l6m7n8",
"title": "Review Q4 metrics",
"status": "pending",
"created_at": "2025-10-30T11:00:00Z"
}
}
Endpoint: GET /?action=best
Query Parameters:
- `timezone` (optional): IANA timezone, default `UTC`
Response (200):
{
"status": "success",
"code": 200,
"data": {
"id": "a1b2c3d4-e5f6",
"title": "Review Q4 metrics",
"priority": 4,
"due_date": "2025-11-15T17:00:00Z",
"score": 52,
"reasoning": "High priority (4) + due in 16 days"
}
}
Response (200 - No Tasks):
{
"status": "success",
"code": 200,
"data": {
"status": "no_tasks",
"message": "No active tasks"
}
}
Endpoint: POST /?action=update&id={task_id}
Request:
{
"status": "in_progress",
"priority": 5
}
Response (200):
{
"status": "success",
"code": 200,
"data": {
"id": "a1b2c3d4-e5f6",
"updated_at": "2025-10-30T11:05:00Z"
}
}
Endpoint: POST /?action=complete&id={task_id}
Response (200):
{
"status": "success",
"code": 200,
"message": "Task marked as completed"
}
Endpoint: POST /?action=snooze&id={task_id}
Request:
{
"snooze_duration": "2h"
}
Supported durations: `1h`, `2h`, `4h`, `1d`, `2d`, `1w`
Response (200):
{
"status": "success",
"code": 200,
"data": {
"snoozed_until": "2025-10-30T13:00:00Z"
}
}
Endpoint: GET /?action=query
Query Parameters:
- `status` (optional): Filter by status
- `priority` (optional): Filter by priority
- `limit` (optional): Max results (default 10, max 20)
Example:
GET /?action=query&status=pending&priority=5
Response (200):
{
"status": "success",
"code": 200,
"data": [
{
"id": "a1b2c3d4-e5f6",
"title": "Review Q4 metrics",
"status": "pending",
"priority": 4,
"due_date": "2025-11-15T17:00:00Z"
}
]
}
All errors follow this format:
{
"status": "error",
"code": 400,
"message": "Validation failed",
"errors": [
{
"field": "title",
"issue": "Title is required"
}
],
"requestId": "req-a1b2c3d4"
}
Common Error Codes:
- `400` – Validation failed (invalid input)
- `404` – Task not found
- `500` – Internal server error
The "best task right now" is determined using a weighted scoring model:
score = (0.40 × priority_score)
+ (0.35 × urgency_score)
+ (0.15 × context_score)
+ (0.10 × momentum_score)
priority_score = priority_level × 20
| Priority | Score | Meaning |
|---|---|---|
| 5 | 100 | Urgent (do now) |
| 4 | 80 | High (this week) |
| 3 | 60 | Normal |
| 2 | 40 | Low (can wait) |
| 1 | 20 | Nice-to-have |
Based on time remaining until due_date:
| Time Remaining | Urgency Score | Rationale |
|---|---|---|
| Overdue | 100 | Critical! |
| < 4 hours | 90 | Immediate attention |
| < 24 hours | 75 | Today's work |
| < 72 hours | 50 | This week |
| > 10 days | 0-40 | Linear decay |
Currently fixed at 50. Future enhancement: time-of-day awareness.
Planned Logic:
- Tasks tagged `morning` get +20 boost from 6am-12pm
- Tasks tagged `afternoon` get +20 boost from 12pm-6pm
- Tasks tagged `evening` get +20 boost from 6pm-11pm
Encourages task completion and focus:
| Status | Momentum Score | Rationale |
|---|---|---|
| `in_progress` | 80 | Don't lose focus on started work |
| `pending` (old) | 20-40 | Age encourages completion |
| `completed` | 0 | Filtered out |
| `snoozed` | 0 | Filtered out |
Task: "Review Q4 metrics"
- Priority: 4 (high)
- Due: 2025-11-15 (16 days away)
- Status: pending
- Created: 4 days ago
- Current Time: 2025-10-30 14:00 UTC
Calculation:
priority_score = 4 × 20 = 80
urgency_score = 30 (16 days away → light decay)
context_score = 50 (no time-of-day tag)
momentum_score = 20 (4 days old)
total_score = (0.40 × 80) + (0.35 × 30) + (0.15 × 50) + (0.10 × 20)
= 32 + 10.5 + 7.5 + 2
= 52 (moderate priority)
GPT Response:
"You should work on 'Review Q4 metrics' next. It's high priority (4) and due in 16 days, giving it a score of 52."
- Default: All timestamps stored in UTC (ISO8601)
- Query Support: `timezone` parameter for `/best` endpoint
- Apps Script: Uses `Session.getScriptTimeZone()` for server locale
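As a sketch of what the `timezone` parameter implies, a stored UTC timestamp can be rendered in the caller's IANA timezone with the standard-library `zoneinfo` (Python; the helper name is illustrative, not part of the Apps Script code):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def to_local(iso_utc: str, tz_name: str = "UTC") -> str:
    """Render a stored UTC ISO8601 timestamp in the caller's IANA timezone."""
    dt = datetime.fromisoformat(iso_utc.replace("Z", "+00:00"))
    return dt.astimezone(ZoneInfo(tz_name)).isoformat()

print(to_local("2025-11-15T17:00:00Z", "Europe/Berlin"))  # 2025-11-15T18:00:00+01:00
```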
- Task Creation: Unique `id` (UUID) prevents duplicates
- Updates: Based on `id`, safe to retry
- Completion: Idempotent (multiple calls = same result)
Every API request logs:
- Timestamp (when)
- Action (what)
- Result (success/error)
- Status code (HTTP)
- Request ID (tracing)
- Error message (if failed)
Query logs:
// Open Logs sheet
// Filter by status_code >= 400 to see errors
// Use request_id to trace user sessions

Current: None (demo only)
Production Recommendation: 60 requests/minute per user
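A minimal sketch of how that budget could be enforced per user in a future backend (Python, in-memory sliding window; the names and storage choice are assumptions and not part of the current GAS demo):

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60   # per-user window
MAX_REQUESTS = 60     # recommended production budget

_requests: dict[str, deque] = defaultdict(deque)

def allow_request(user_id: str, now: float | None = None) -> bool:
    """Return True if the user is still within 60 requests in the last 60 seconds."""
    now = time.monotonic() if now is None else now
    window = _requests[user_id]
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()          # drop timestamps outside the window
    if len(window) >= MAX_REQUESTS:
        return False              # reject; caller should return HTTP 429
    window.append(now)
    return True
```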
Fastest method using modern Python tooling:
# Install uv (if not installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install dependencies
uv sync --all-extras
# Seed test data (47 realistic tasks)
make seed
# Run tests
make test

Estimated Time: 10-15 minutes
Features:
- ✅ Modern Python package management with `uv` (blazing fast)
- ✅ Factory-generated realistic test data (60+ test cases)
- ✅ Comprehensive test coverage with pytest
- ✅ Make commands for easy workflow
- ✅ Full documentation with guides
Guides:
- `TESTING.md` - Testing guide with uv commands
- `CUSTOM_GPT_SETUP.md` - Custom GPT configuration
Step-by-step walkthrough for those who prefer full control:
Summary:
- Create Google Sheet with `tasks` and `logs` tabs
- Deploy Google Apps Script as web app
- Configure Custom GPT with Actions schema
- Test end-to-end flow
Estimated Time: 45 minutes
Guide: Follow DEPLOYMENT.md for complete manual instructions.
Choose manual setup if:
- You want to understand every step in detail
- You're deploying in a restricted environment
- You prefer manual configuration over automation
Use curl or Postman to test endpoints:
# Replace YOUR_SCRIPT_ID with your actual deployment ID
BASE_URL="https://script.google.com/macros/s/YOUR_SCRIPT_ID/exec"
# Test 1: Create a task
curl -X POST "${BASE_URL}?action=create" \
-H "Content-Type: application/json" \
-d '{
"title": "Test task",
"priority": 3,
"due_date": "2025-11-01T17:00:00Z"
}'
# Test 2: Get best task
curl "${BASE_URL}?action=best"
# Test 3: Query tasks
curl "${BASE_URL}?action=query&status=pending"Test Case 1: Empty State
User: What should I do next?
Expected: "You currently have no active tasks! Would you like to create one?"
Test Case 2: Task Creation
User: Create a task to prepare presentation by tomorrow, priority 5
Expected: [Task created, shows confirmation with ID]
Test Case 3: Best Task Query
User: What should I do next?
Expected: "You should work on 'Prepare presentation' - it's urgent (priority 5)
and due tomorrow (score: 92)."
Test Case 4: Task Completion
User: Mark that task as complete
Expected: "Done! I've marked 'Prepare presentation' as complete."
After each test:
- ✅ Check `tasks` sheet for correct data
- ✅ Check `logs` sheet for audit trail
- ✅ Verify `updated_at` timestamps change
- ✅ Confirm GPT responses match reality
| Decision | Assumption | Trade-off | Mitigation |
|---|---|---|---|
| Public GAS endpoint | Demo/testing only | Anyone can access | Use OAuth for production |
| Google Sheets as DB | < 1000 tasks, single user | Limited scalability | Clear migration path to Postgres |
| Deterministic scoring | Explainability > accuracy | No ML personalization | Can add ML layer later |
| No authentication | Trusted environment | Security risk | Document hardening steps |
| Synchronous API | Low concurrency | May timeout at scale | Use async/queue for production |
Reasons for rule-based approach:
- Explainability: Users see exact reasoning ("due today + high priority")
- Debuggability: Scores are reproducible and testable
- Transparency: No black box; weights are visible
- Speed: No model inference latency
- Simplicity: No training data needed
When to switch to ML:
- User feedback on recommendations
- Personalization (different users = different priorities)
- Context awareness (calendar integration, location, etc.)
- `/metrics` endpoint (tasks completed today, average completion time)
- Google Calendar integration (block focus time automatically)
- Recurring tasks (daily standup, weekly review)
- Task dependencies (can't start B until A is complete)
- Collaborative tasks (assign to others)
- OAuth 2.0 authentication
- Rate limiting (60 req/min per user)
- Input sanitization (XSS prevention)
- HTTPS enforcement
- Backup/restore functionality
Transition to FastAPI + Postgres:
# Equivalent FastAPI endpoint (illustrative sketch; db, Task, and score_task
# are assumed to come from the ported data and scoring layer)
from fastapi import FastAPI

app = FastAPI()

@app.get("/tasks/best")
async def get_best_task(timezone: str = "UTC"):
    # Fetch active tasks, score each with the same deterministic model, return the winner
    tasks = await db.query(Task).filter(Task.status != "completed")
    if not tasks:
        return {"status": "no_tasks", "message": "No active tasks"}
    scored = [(t, score_task(t)) for t in tasks]
    best = max(scored, key=lambda x: x[1])
    return {"id": best[0].id, "score": best[1], "reasoning": "..."}

Benefits:
- 1000+ QPS (vs GAS ~5-10 QPS)
- Rich SQL queries
- Multi-tenancy support
- Background jobs
- WebSocket support (real-time updates)
Migration Checklist:
- Implement Postgres schema
- Port GAS logic to FastAPI
- Deploy to Fly.io / Render
- Update OpenAPI schema (new URL)
- Sync data from Sheets → Postgres
- Update Custom GPT config
- Run integration tests
- Switch traffic
- Decommission GAS
- Custom GPT: Try MindFlow - Live conversational task manager
- Video Demo: Watch 5-minute walkthrough - See MindFlow in action
- Google Sheet: [Configure your own - see DEPLOYMENT.md]
- API Endpoint: [Deploy your own - see DEPLOYMENT.md]
- Deployment Guide: DEPLOYMENT.md
- Architecture Deep-Dive: docs/raw/architecture-1pager.md
- Testing Guide: TESTING.md
- Custom GPT Setup: CUSTOM_GPT_SETUP.md
- Google Apps Script: src/gas/Code.gs - Complete API implementation
- OpenAPI Schema: src/gas/openapi-schema-gpt.json - GPT-optimized (6 operations, response limits)
This is an MVP demonstration project. Contributions welcome for:
- Bug fixes
- Documentation improvements
- Test coverage
- Security hardening
- Performance optimizations
Not accepting:
- Major architectural changes (out of scope for MVP)
- ML/AI model integration (future phase)
MIT License - see LICENSE for details.
Built as a demonstration of AI-first architecture combining:
- OpenAI Custom GPTs (Actions)
- Google Apps Script (serverless backend)
- Google Sheets (transparent data store)
- Deterministic relevance scoring (explainable AI)
Questions? Open an issue or see DEPLOYMENT.md for setup help.