Minimal Django app to upload CSV/Excel/JSON datasets, generate an initial data profile with an OpenAI agent (Code Interpreter), and chat for exploratory analysis and charts.
```
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

Copy the env file and set keys:

```
cp .env.example .env
```

Set `OPENAI_API_KEY` and optionally `DJANGO_SECRET_KEY` in `.env`.
Default (no DB URL configured): local SQLite.
Recommended for staging/production: Supabase Postgres.
Set one of these in `.env`:

```
SUPABASE_DB_URL=postgresql://postgres.<project-ref>:<password>@<host>:6543/postgres
# or
DATABASE_URL=postgresql://...
```

Optional DB flags:

```
DB_SSL_REQUIRE=true
DB_CONN_MAX_AGE=600
```

Optional Stripe placeholders:

```
STRIPE_SECRET_KEY=
STRIPE_PUBLISHABLE_KEY=
STRIPE_WEBHOOK_SECRET=
STRIPE_PRICE_PRO_MONTHLY=
STRIPE_PRICE_TEAM_MONTHLY=
```

Optional async/email settings:

```
ANALYSIS_JOB_TIMEOUT_SECONDS=900
ANALYSIS_JOB_MAX_ATTEMPTS=3
EMAIL_BACKEND=django.core.mail.backends.console.EmailBackend
DEFAULT_FROM_EMAIL=noreply@dataanalystagent.local
EMAIL_HOST=localhost
EMAIL_PORT=1025
EMAIL_HOST_USER=
EMAIL_HOST_PASSWORD=
EMAIL_USE_TLS=false
EMAIL_USE_SSL=false
EMAIL_TIMEOUT=30
```

Run migrations and start the server:

```
python manage.py migrate
python manage.py runserver
```

Open http://127.0.0.1:8000/ and upload a dataset.
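The SQLite-vs-Postgres selection described above (local SQLite by default, Postgres when `SUPABASE_DB_URL` or `DATABASE_URL` is set, plus the `DB_SSL_REQUIRE`/`DB_CONN_MAX_AGE` flags) can be sketched in plain Python. This is an illustrative sketch, not the app's actual `settings.py`; the `database_config` helper and the `db.sqlite3` filename are assumptions:

```python
import os
from urllib.parse import urlparse

def database_config():
    """Build a Django DATABASES entry from env vars (hypothetical sketch)."""
    url = os.environ.get("SUPABASE_DB_URL") or os.environ.get("DATABASE_URL")
    if not url:
        # No DB URL configured: fall back to local SQLite.
        return {"ENGINE": "django.db.backends.sqlite3", "NAME": "db.sqlite3"}
    parsed = urlparse(url)
    cfg = {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": parsed.path.lstrip("/"),
        "USER": parsed.username,
        "PASSWORD": parsed.password,
        "HOST": parsed.hostname,
        "PORT": parsed.port or 5432,
        # Persistent connections, per DB_CONN_MAX_AGE (seconds).
        "CONN_MAX_AGE": int(os.environ.get("DB_CONN_MAX_AGE", "0")),
    }
    if os.environ.get("DB_SSL_REQUIRE", "").lower() == "true":
        cfg["OPTIONS"] = {"sslmode": "require"}
    return cfg
```

In real deployments a library such as `dj-database-url` typically does this parsing.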
The app now requires authentication (/signup, /login) and isolates datasets by user.
Pricing page supports plan switching in demo mode (Free/Pro/Team) with monthly usage limits.
If Stripe keys/prices are configured, paid plans can use Stripe Checkout.
Webhook endpoint is available at /billing/webhook/ (placeholder-friendly for local testing).
When DEBUG=false, webhook processing requires STRIPE_WEBHOOK_SECRET.
Billing management endpoint is available at /billing/portal/ (when Stripe is configured).
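The `STRIPE_WEBHOOK_SECRET` requirement in non-debug mode corresponds to verifying Stripe's `Stripe-Signature` header (`t=<timestamp>,v1=<hmac>`). A stdlib-only sketch of that check follows; the real app presumably uses the `stripe` library's built-in verification, and `verify_stripe_signature` and its `tolerance` default are assumptions here:

```python
import hashlib
import hmac
import time

def verify_stripe_signature(payload: bytes, sig_header: str, secret: str,
                            tolerance: int = 300) -> bool:
    """Check a Stripe-style webhook signature: HMAC-SHA256 of "<t>.<payload>"."""
    parts = dict(p.split("=", 1) for p in sig_header.split(","))
    timestamp = int(parts["t"])
    if abs(time.time() - timestamp) > tolerance:
        return False  # stale timestamp: reject to limit replay attacks
    signed = f"{timestamp}.".encode() + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    # Constant-time comparison against the v1 signature.
    return hmac.compare_digest(expected, parts["v1"])
```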
Run tests:

```
python manage.py test
```

- Uploaded files are stored in `media/`.
- Metadata, chat sessions, and artifacts are stored in the configured DB backend (SQLite by default, Supabase Postgres when a DB URL is set).
- The workspace now lists only the authenticated user's datasets and supports owner-only open/download/rename/replace/retention/delete actions.
- User datasets are intentionally not exposed through Django admin.
- Uploads and generated artifacts are now encrypted at rest in application storage and decrypted only through owner-authorized app flows.
- Set `FILE_ENCRYPTION_KEY` to a Fernet-compatible key for a dedicated storage encryption secret. If omitted, the app derives one from `DJANGO_SECRET_KEY`.
- When `DJANGO_DEBUG=false`, the app expects `FILE_ENCRYPTION_KEY` to be configured explicitly, enforced via Django system checks.
- Direct media URLs are no longer used for user files; charts are served through authenticated artifact routes, and exported HTML embeds chart data inline.
- Dataset management actions are recorded in an owner-scoped audit trail.
- Replace and delete actions attempt best-effort cleanup of the uploaded source file previously sent to OpenAI (`openai_file_id`).
- This is not zero-knowledge privacy: the backend can still decrypt files in order to process them and send analysis requests to OpenAI.
- Default retention mode is `ephemeral`. Use `save_analysis` at upload time to persist history.
- Monthly usage quotas are enforced for `analyze` and `chat` operations based on the current plan.
- Paid plan quotas require an entitlement status of `demo`, `active`, or `trialing`; otherwise limits fall back to Free.
- Dataset detail includes a downloadable executive HTML export.
- Dataset detail now includes executive highlights and suggested questions for faster analysis.
- Dataset detail includes domain playbooks (finance/retail/e-commerce one-click guided analyses).
- Dataset detail includes metric dictionary coverage (mapped vs missing metrics).
- Dataset detail supports manual metric mapping overrides persisted per dataset.
- Dataset detail supports recurring schedules for playbook runs (weekly/monthly) with async queueing.
- Dataset detail supports custom playbooks created by users and runnable on demand.
- Analysis, chat, and playbook execution now run as background jobs with status polling and cancellation support.
- Scheduled report runs are dispatched asynchronously and can send email via the configured Django email backend.
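The fallback mentioned above (deriving a Fernet-compatible key from `DJANGO_SECRET_KEY` when `FILE_ENCRYPTION_KEY` is unset) can be sketched with the standard library. The exact derivation the app uses is an assumption; the point is only the key format Fernet expects, namely 32 random-looking bytes encoded as url-safe base64:

```python
import base64
import hashlib

def derive_fernet_key(secret_key: str) -> bytes:
    """Derive a deterministic Fernet-format key from a Django secret key.

    Hypothetical sketch: hash the secret to 32 bytes, then encode as
    url-safe base64, which is the shape cryptography.fernet.Fernet accepts.
    """
    digest = hashlib.sha256(secret_key.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest)
```

A key produced this way can be passed directly to `cryptography.fernet.Fernet(key)` to encrypt uploads at rest; rotating `DJANGO_SECRET_KEY` would change the derived key, which is one reason a dedicated `FILE_ENCRYPTION_KEY` is preferable.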
- Cleanup command for expired ephemeral uploads:

  ```
  python manage.py cleanup_ephemeral_uploads --hours 24
  ```

- Run due scheduled reports locally:

  ```
  python manage.py run_scheduled_reports --limit 20
  ```

- Run the async job worker locally:

  ```
  python manage.py run_job_worker --once --limit 20
  # or continuously
  python manage.py run_job_worker --poll-interval 2
  ```

- Run the scheduler loop locally:

  ```
  python manage.py run_scheduler --poll-interval 30
  ```

- Encrypt legacy plaintext uploads/artifacts already stored on disk:

  ```
  python manage.py encrypt_stored_files
  ```
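The `run_job_worker` command's `--once`/`--poll-interval` behavior, together with `ANALYSIS_JOB_MAX_ATTEMPTS`, suggests a claim-retry worker loop. This is a simplified in-memory sketch of that pattern, not the app's actual implementation; `Job`, `run_worker`, and the status names are assumptions:

```python
import time
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    attempts: int = 0
    status: str = "queued"   # queued -> running -> done / failed

def run_worker(queue, handler, max_attempts=3, once=True, poll_interval=2):
    """Claim queued jobs one at a time, retrying each up to max_attempts."""
    while True:
        job = next((j for j in queue if j.status == "queued"), None)
        if job is None:
            if once:
                return  # --once mode: stop when the queue is drained
            time.sleep(poll_interval)  # continuous mode: poll for new jobs
            continue
        job.status = "running"
        job.attempts += 1
        try:
            handler(job)
            job.status = "done"
        except Exception:
            # Requeue until the attempt budget is exhausted, then fail.
            job.status = "queued" if job.attempts < max_attempts else "failed"
```

A production worker would claim jobs with a row-level DB lock (e.g. `select_for_update(skip_locked=True)`) so multiple workers never run the same job, and would enforce `ANALYSIS_JOB_TIMEOUT_SECONDS` per attempt.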