ADE provides the structured context that transforms AI agents from code assistants into autonomous data engineering partners.
- Extract → Pull metadata from Databricks notebooks, jobs, and source code
- Search → Find any data asset across your platforms from Claude
- Trace → Analyze dependencies between notebooks and tables
- Understand → Let AI reason about your architecture without reading 100 files
In data engineering, context is scattered: SQL in views, DAX in Power BI, PySpark in notebooks, YAML in pipelines. There's no single repo to read.
ADE collects this context and makes it queryable — so AI agents can actually help.
# 1. Clone and install
git clone https://github.com/rbutinar/ade-core.git
cd ade-core
pip install -r requirements.txt
# 2. Test with demo data
python -m ade_app.mcp_server.server
# Server starts with pre-loaded synthetic demo data
# 3. Configure in Claude Code (see below)ADE works with any Claude client that supports MCP (Model Context Protocol).
Add to ~/.claude/mcp.json:
{
"mcpServers": {
"ade": {
"command": "python",
"args": ["-m", "ade_app.mcp_server.server"],
"cwd": "/path/to/ade-core"
}
}
}Add to your config file:
- Windows:
%APPDATA%\Claude\claude_desktop_config.json - Mac:
~/Library/Application Support/Claude/claude_desktop_config.json
{
"mcpServers": {
"ade": {
"command": "python",
"args": ["-m", "ade_app.mcp_server.server"],
"cwd": "C:\\path\\to\\ade-core"
}
}
}Restart Claude Desktop after saving.
"What notebooks do we have in the demo environment?"
"Show me the source code of the sales aggregation notebook"
"What tables does the 01_ingest_raw_sales notebook write to?"
| Tool | Description |
|---|---|
get_ade_overview() |
What is ADE and how to use it |
list_environments() |
See available environments |
set_environment(id) |
Switch to different environment |
get_environment_info() |
Current environment details |
search_catalog(query) |
Find objects by name |
get_object_details(name, platform) |
Full metadata with source code |
get_platform_stats() |
Object counts by platform |
get_notebook_lineage(name) |
Analyze notebook dependencies |
Extract metadata from your Databricks workspace:
python -m ade_app.platforms.databricks.extractor \
--host https://your-workspace.azuredatabricks.net \
--token your_databricks_token \
--output ade_data/my_env/extractions/databricksThen set the environment in Claude Code:
"Switch to my_env environment"
"What notebooks do I have?"
ade-core/
├── ade_app/
│ ├── mcp_server/ # MCP server for AI agents
│ └── platforms/
│ └── databricks/ # Databricks extractor
├── ade_data/
│ └── demo/ # Demo environment with synthetic data
│ └── extractions/
│ └── databricks/
│ ├── notebooks.json
│ └── jobs.json
├── requirements.txt
└── README.md
| Platform | Status | Features |
|---|---|---|
| Databricks | ✅ Ready | Notebooks, jobs, source code extraction |
| Power BI | 🔜 Coming | Datasets, measures, DAX |
| PostgreSQL | 🔜 Coming | Tables, views, SQL definitions |
Additional platforms available in the extended version:
| Platform | Features |
|---|---|
| Microsoft Fabric | Warehouses, lakehouses, pipelines, notebooks, semantic models |
| Talend | Jobs, components, data flows |
| Tableau | Workbooks, datasources, worksheets |
| SQL Server / SSIS | Packages, data flows, connections |
| Cloudera | Hive tables, Spark jobs |
| Synapse | Pools, procedures, views |
The extended version also includes:
Infrastructure:
- SQL Server metadata store with full lineage graph
- Multi-environment management with secure credential isolation
- Project templates and session tracking for continuity
Platform Skills (Claude commands):
- Databricks: deploy, run, status, query
- Fabric: notebook deploy, SQL deploy, stored procedure execution, warehouse testing
- Power BI: model creation and editing
Automation:
- Cross-platform impact analysis
- AI-powered documentation generation
- Platform backup and security scanning
- Streamlit dashboard for visual exploration
Interested? Contact: roberto.butinar@gmail.com
This project powers The Autonomous Data Engineer video series on YouTube, showing real agentic workflows for data platforms.
- YouTube Channel (coming soon)
Apache 2.0 — See LICENSE
Roberto Butinar — Data Engineer & AI Automation Specialist