Ask a business question in plain English and get a detailed report back.
- Which articles kept readers on the page the longest last month?
- How has subscriber growth trended over the past 6 months?
- Which markets drove the biggest revenue increase in Q1 2026?
- Compare mobile vs desktop subs conversion rates for the last 90 days
No SQL knowledge required. No need to learn what a LEFT OUTER JOIN does!
Run this command in your terminal:
curl -fsSL https://raw.githubusercontent.com/ocelma/mega-djinn/main/install.sh | bash
The installer will:
- clone the repo
- install the Databricks CLI if not already present
- ask for your Databricks endpoint
- (optional) ask for your Alation token
- create .env
- create .venv
- install requirements.txt
For the full step-by-step setup, use the manual install flow below.
Once the mega-djinn agent is installed and configured (see Setup below), open a terminal, cd into the repo, and launch Claude Code (or Codex, Gemini CLI, or Cursor):
cd /path/to/mega-djinn
claude
🧞✨ Your Mega Djinn is officially out of the bottle and ready to grant your data wishes!
The agent will find the right tables, show you the SQL it plans to run, ask for your confirmation, execute it, and save the results as an HTML report in reports/.
The Mega Djinn AI agent works its magic and makes your questions come true!
- Takes a natural language question from a user (no SQL knowledge needed)
- Searches the local knowledge base (.ai/knowledge/) for verified queries and known pitfalls from teammates, at no network cost
- Retrieves relevant tables from Databricks Unity Catalog, plus approved queries, table metadata, and business glossary from Alation (if available)
- Generates SQL grounded in your org's conventions, using your preferred LLM
- Executes the query on Databricks and returns results, so there is no need to open Databricks
- Saves results as a styled HTML report in ./reports/
- Saves the verified SQL (or failure notes) to .ai/knowledge/ so teammates benefit from it on their next query
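The knowledge-base lookup in the steps above can be pictured as a plain local file search. The sketch below is not the agent's actual implementation: the function name `search_knowledge`, the assumption that knowledge files are Markdown (`*.md`), and the keyword-overlap ranking are all hypothetical, chosen only to show why this step needs no network access.

```python
from pathlib import Path


def search_knowledge(root: Path, question: str, min_hits: int = 2) -> list[tuple[int, Path]]:
    """Rank Markdown files under `root` by how many question keywords they contain."""
    # Keywords: lowercase words longer than 3 characters (a crude stop-word filter).
    words = {w.strip("?.,!").lower() for w in question.split()}
    keywords = {w for w in words if len(w) > 3}
    ranked = []
    for path in sorted(root.rglob("*.md")):  # assumption: notes are stored as .md files
        text = path.read_text(encoding="utf-8").lower()
        hits = sum(1 for kw in keywords if kw in text)
        if hits >= min_hits:
            ranked.append((hits, path))
    # Highest keyword overlap first.
    return sorted(ranked, key=lambda item: item[0], reverse=True)
```

Calling `search_knowledge(Path(".ai/knowledge"), "How has subscriber growth trended?")` would return the notes that mention at least two of the question's keywords, best match first.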
Caution
Always verify that the SQL queries and results are correct. Free-will Djinns are sometimes evil or mischievous!
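One cheap check to run before confirming a generated query is that it only reads data. The following is a minimal sketch of such a guard, not the project's own guardrail code: the function name `is_read_only` and the keyword list are assumptions, and the check is deliberately conservative (a write keyword inside a string literal is also rejected).

```python
import re

# Statements that modify data or schema; a read-only workflow should refuse them.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|merge|drop|alter|create|truncate|grant|revoke)\b",
    re.IGNORECASE,
)


def is_read_only(sql: str) -> bool:
    """Return True if `sql` is a single SELECT/WITH statement with no write keywords."""
    # Strip line and block comments so keywords cannot hide behind them.
    stripped = re.sub(r"--[^\n]*|/\*.*?\*/", " ", sql, flags=re.DOTALL).strip()
    body = stripped.rstrip(";").strip()
    if ";" in body:  # more than one statement
        return False
    if not re.match(r"(select|with)\b", body, re.IGNORECASE):
        return False
    return FORBIDDEN.search(body) is None
```

A guard like this does not replace reading the SQL and sanity-checking the results; it only catches the most obvious mistakes.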
                    User (natural language query)
                                  ▼
                             Agent (LLM)
                                  │  retrieves context from
          ┌───────────────────────┼───────────────────────┐
          ▼                       ▼                       ▼
   .ai/knowledge/     Databricks Unity Catalog   Alation (Optional)
 (verified queries,       (schemas, tables/      (approved queries,
   known pitfalls)        column metadata)       glossary, lineage)
          └───────────────────────┼───────────────────────┘
                                  ▼
                        Agent creates a Plan
                                  ▼
         Agent generates SQL (user agrees to run the query)
                                  ▼
                      Databricks SQL execution
                                  ▼
                               Results
                                  │
            ┌─────────────────────┼─────────────────────┐
            ▼                     ▼                     ▼
          User            HTML in ./reports      .ai/knowledge/
                                               (saves verified SQL
                                                or failure notes)
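The "HTML in ./reports" branch of the flow above amounts to rendering the result rows into a file. Here is a rough sketch under stated assumptions: the function name `save_report`, the `<date>-<slug>.html` naming scheme, and the bare unstyled table are hypothetical and stand in for whatever the real report template does.

```python
import html
import re
from datetime import date
from pathlib import Path


def save_report(title, columns, rows, out_dir=Path("reports"), today=None):
    """Write query results to a dated HTML file and return its path."""
    today = today or date.today()
    # Escape every cell so result values cannot inject markup into the page.
    head = "".join(f"<th>{html.escape(str(c))}</th>" for c in columns)
    body = "\n".join(
        "<tr>" + "".join(f"<td>{html.escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    page = (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{html.escape(title)}</title></head>\n'
        f"<body><h1>{html.escape(title)}</h1>\n"
        f"<table><thead><tr>{head}</tr></thead>\n<tbody>\n{body}\n</tbody></table>\n"
        "</body></html>\n"
    )
    # Hypothetical naming scheme: <ISO date>-<slugified title>.html
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{today.isoformat()}-{slug}.html"
    path.write_text(page, encoding="utf-8")
    return path
```

For example, `save_report("Q1 Revenue", ["market", "revenue"], [("US", 10)])` writes one file under `reports/` and returns its path.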
The project ships a Claude Code skill and two slash commands:
| Path | Purpose |
|---|---|
| .claude/skills/mega-djinn/SKILL.md | Full workflow, tables, SQL rules, safety guardrails; auto-loaded by Claude Code |
| .claude/commands/query.md | /query "<question>": runs the full plan → SQL → report workflow |
| .claude/commands/report.md | /report: regenerates the last result as a dated HTML report |
| CLAUDE.md | Project-level invariants (confirm before execute, read-only, report format) |
| .ai/knowledge/ | Shared knowledge base: verified queries, table notes, known pitfalls; grows via git push/pull |

| Path | Purpose |
|---|---|
| AGENTS.md | Codex reads this file to understand what Mega Djinn can do. |

| Path | Purpose |
|---|---|
| GEMINI.md | Project-level invariants for Gemini CLI (mirrors CLAUDE.md) |
| .gemini/skills | Symlink to .claude/skills; shares the same skill definitions as Claude Code |

| Path | Purpose |
|---|---|
| .cursor/rules/mega-djinn.mdc | Always-on workspace rule; mirrors the skill for Cursor |
| .cursor/rules/python-scripts.mdc | Auto-attaches when editing scripts/*.py |
| .cursor/mcp.json | Databricks MCP server config for Cursor |
| .cursorignore | Excludes .venv/, __pycache__ from Cursor indexing |
If you did not run Quick Install, follow the steps below.
- Python 3.13+
- Databricks CLI installed and authenticated. If not installed, run:
brew tap databricks/tap
brew install databricks
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
Run the helper script through the project virtualenv so the installed SDK dependencies are used:
.venv/bin/python scripts/execute.py --help
Auth is handled via the existing [dev] or [prod] profile in ~/.databrickscfg, using OAuth with auto-refresh. Make sure the file exists:
test -f ~/.databrickscfg && echo "Databricks Config File Exists" || echo "Error. Not found"
# ~/.databrickscfg
[dev]
host = https://your-org-dev.cloud.databricks.com
auth_type = databricks-cli
[prod]
host = https://your-org-prod.cloud.databricks.com
auth_type = databricks-cli
The SDK reads DATABRICKS_CONFIG_PROFILE from .env and automatically refreshes the OAuth token using the cached refresh token in ~/.databricks/token-cache.json.
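Because ~/.databrickscfg is an INI-style file, you can check that the profile named in .env actually exists before running anything. The snippet below is only an illustrative sketch using Python's standard `configparser`; the helper name `profile_host` is hypothetical and is not part of scripts/execute.py or the Databricks SDK.

```python
import configparser


def profile_host(cfg_text: str, profile: str) -> str:
    """Return the `host` configured for `profile` in a .databrickscfg-style file."""
    # interpolation=None keeps any '%' characters in values literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if not parser.has_section(profile):
        raise KeyError(f"profile [{profile}] not found")
    return parser.get(profile, "host")
```

In practice you would pass the contents of `~/.databrickscfg` (for example `Path.home().joinpath(".databrickscfg").read_text()`) together with the value of DATABRICKS_CONFIG_PROFILE.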
(Skip this step if Alation is not available; the agent will still work using Databricks Unity Catalog metadata).
If your organization uses Alation, connecting it gives the agent access to your company's accumulated data knowledge (business glossary definitions, approved SQL patterns, table documentation, and curated queries), so answers are grounded in your org's conventions rather than inferred from schema names alone. Access uses OAuth 2.0 client credentials (machine-to-machine). No browser login or manual token renewal required.
Caution
An Alation Admin (most likely this is NOT you!) must create an OAuth client application for the Agent:
- Alation Admin opens Settings (gear icon, top right) → Authentication → OAuth Client Applications → Add.
- Enter a name, set Access Token Duration in seconds (e.g. 3600 for 1 hour), and set User Role to Viewer.
- Click Save, then copy the Client ID and Client Secret immediately (the secret is shown only once) and add them to your .env (see section below).
scripts/execute.py fetches a JWT automatically on each run. No manual token refresh needed.
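For readers unfamiliar with the client-credentials grant, the token fetch boils down to one form-encoded POST. This sketch shows the generic OAuth 2.0 shape only: the function name `build_token_request` is hypothetical, the token URL is left as a parameter because the exact Alation endpoint is not given here, and sending the credentials in the request body (rather than in an HTTP Basic header) is an assumption to verify against the Alation documentation.

```python
import urllib.parse
import urllib.request


def build_token_request(token_url: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) an OAuth 2.0 client-credentials token request."""
    form = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,          # assumption: credentials go in the body
            "client_secret": client_secret,
        }
    ).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(req)` and reading `access_token` from the JSON response would yield the short-lived token; since it is fetched on every run, nothing needs to be stored or refreshed by hand.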
Copy .env.example to a new file named .env. Set the Databricks variables below. If you completed Alation Auth (optional) above, add ALATION_BASE_URL, ALATION_CLIENT_ID, and ALATION_CLIENT_SECRET (else leave them empty).
# Databricks Configuration
DATABRICKS_CONFIG_PROFILE=dev
DATABRICKS_AUTH_TYPE=databricks-cli
DATABRICKS_HOST=https://your-org-dev.cloud.databricks.com
# Optional: pin a specific SQL warehouse. If unset, execute.py auto-selects a running warehouse.
# DATABRICKS_WAREHOUSE=
# Alation Configuration. Optional (only if organization uses it)
ALATION_BASE_URL=https://your-org.alationcloud.com
ALATION_CLIENT_ID=
ALATION_CLIENT_SECRET=
.venv/bin/python scripts/execute.py --sql auto-selects a running SQL warehouse. Set DATABRICKS_WAREHOUSE in .env to pin a specific one.
The helper script defaults to DATABRICKS_AUTH_TYPE=databricks-cli so it uses the authenticated DATABRICKS_CONFIG_PROFILE CLI profile directly and avoids relying on automatic host metadata resolution.
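The "pin if set, otherwise pick a running warehouse" rule can be sketched in a few lines. This is an illustration, not the logic in scripts/execute.py: `parse_env` and `choose_warehouse` are hypothetical helpers, the warehouse list is a plain list of dicts rather than SDK objects, and the pinned value is treated as an opaque identifier because its exact format is not specified here.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and '#' comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def choose_warehouse(env: dict[str, str], warehouses: list[dict[str, str]]) -> str:
    """Prefer the pinned warehouse; otherwise return the first running one."""
    pinned = env.get("DATABRICKS_WAREHOUSE", "").strip()
    if pinned:
        return pinned
    for wh in warehouses:
        if wh["state"] == "RUNNING":
            return wh["id"]
    raise RuntimeError("no running SQL warehouse found")
```

With the commented-out `# DATABRICKS_WAREHOUSE=` line from the template above, `parse_env` yields no pin, so the first warehouse whose state is `RUNNING` is chosen.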
Caution
Never commit .env to git!
To re-authenticate with Databricks once the refresh token expires, run:
MY_PROFILE=$(grep "DATABRICKS_CONFIG_PROFILE" .env | cut -d '=' -f2)
databricks auth login --profile "$MY_PROFILE"
Caution
To use Mega Djinn you must be logged in to Databricks (see command above).
Databricks AI Dev Kit ships databricks-mcp-server, which exposes Databricks actions (SQL, table metadata, etc.) to Claude Code, Codex, or Cursor IDE. This repository already includes a root .mcp.json that starts that server using your DATABRICKS_CONFIG_PROFILE profile (same as Databricks CLI Auth).
You do not need MCP to use Mega Djinn, because .venv/bin/python scripts/execute.py --sql talks to Databricks CLI directly. MCP is only for agents inside your editor.
Prerequisites
- Finish Databricks CLI Auth (databricks auth login --profile $MY_PROFILE).
- Install uv if you do not have it; the AI Dev Kit installer uses it to create the MCP Python environment.
Install Databricks MCP server (step by step)
1. Run the official installer (macOS / Linux; can be run from any directory):
   curl -fsSL https://raw.githubusercontent.com/databricks-solutions/ai-dev-kit/main/install.sh | bash
2. Answer the prompts so that MCP gets installed. In particular:
   - Enable MCP when asked which components to install.
   - Select Cursor (and any other editors you use) if you want the kit to drop helper config into those tools.
   - When asked for the MCP server install path, accept the default ~/.ai-dev-kit unless you want to use a different location. This repo's .mcp.json assumes that default (see step 4 if you change it).
3. Wait for the script to finish. It will create:
   - ~/.ai-dev-kit/.venv/bin/python: interpreter for the MCP server
   - ~/.ai-dev-kit/repo/databricks-mcp-server/run_server.py: Databricks MCP entrypoint
4. Open .mcp.json and verify the paths point to $HOME/.ai-dev-kit/ (the default install location), or update them to the custom path where you installed it.
Verify paths exist (optional):
test -x "$HOME/.ai-dev-kit/.venv/bin/python" \
&& test -f "$HOME/.ai-dev-kit/repo/databricks-mcp-server/run_server.py" \
&& echo "MCP runtime OK"
For Windows OS: the committed .mcp.json uses /bin/sh and Unix-style paths. On native Windows, configure the Databricks MCP server in your editor with explicit paths to your Python and run_server.py, or use WSL and run the steps above inside WSL.
References:
- Databricks AI Dev Kit README
- Databricks MCP server
- Alation REST-API overview
- Alation Authenticate API calls with OAuth 2.0
- Alation Introspect token