
feat: Python backend (appkit-py) #274

Open
jamesbroadhead wants to merge 13 commits into databricks:main from
jamesbroadhead:feat/appkit-py

Conversation

@jamesbroadhead
Contributor

Summary

Python implementation of the AppKit backend using FastAPI, providing the same HTTP API surface as the TypeScript version. It is a drop-in replacement that works with the existing React frontend (appkit-ui).

Depends on #273 (inline Arrow IPC support) — merge that first.

What's included

  • 17+ HTTP endpoints: health, analytics (SSE query streaming + Arrow), files (11 routes), genie (3 SSE routes), reconnect (SSE test)
  • SSE streaming: Ring buffer reconnection, heartbeat, wire-format compatible with connectSSE()
  • Auth: Per-request OBO WorkspaceClient from x-forwarded-* headers via contextvars
  • Interceptors: Retry (exponential backoff), timeout (asyncio.wait_for), TTL cache
  • Connectors: databricks-sdk + pyarrow for SQL warehouse, files, genie
  • Format fallback: ARROW_STREAM → JSON → ARROW (matching TS _executeWithFormatFallback)
  • Static serving: SPA catch-all with <script id="__appkit__"> config injection
  • 89 tests: 48 unit + 41 integration (verified against live Databricks SQL Warehouse)

Security (ACE multi-model review: GPT 5.4 + Gemini 3.1 Pro + Claude)

  • Path traversal protection (SPA serving, query loading, file connector)
  • Streaming upload with size enforcement (no OOM)
  • Content-Disposition header injection prevention
  • Per-request OBO client isolation

Running

cd packages/appkit-py
pip install -e ".[dev]"
python -m appkit_py  # starts on :8000

# Tests
pytest tests/unit/ -v
uvicorn appkit_py.server:app --port 8000 &
APPKIT_TEST_URL=http://localhost:8000 pytest tests/integration/ -v

Test plan

  • 48 Python unit tests pass
  • 41 Python integration tests pass (12 skip without live Databricks)
  • JSON + ARROW_STREAM queries verified against live Databricks SQL Warehouse
  • ACE review: 14 findings fixed, 0 remaining critical/high
  • TS tests still pass (1566 tests)
  • Verify React frontend renders when served by Python backend

Related

This pull request was AI-assisted by Isaac.

Some serverless warehouses only support ARROW_STREAM with INLINE
disposition, but the analytics plugin only offered JSON_ARRAY (INLINE)
and ARROW_STREAM (EXTERNAL_LINKS). This adds a new "ARROW_STREAM"
format option that uses INLINE disposition, making the plugin
compatible with these warehouses.

Fixes databricks#242
Tests verify:
- ARROW_STREAM format passes INLINE disposition + ARROW_STREAM format
- ARROW format passes EXTERNAL_LINKS disposition + ARROW_STREAM format
- Default JSON format does not pass disposition or format overrides
The server-side ARROW_STREAM format added in the previous commit was
not exposed to the frontend or typegen:

- Add "ARROW_STREAM" to AnalyticsFormat in appkit-ui hooks
- Add "arrow_stream" to DataFormat in chart types
- Handle "arrow_stream" in useChartData's resolveFormat()
- Make typegen resilient to ARROW_STREAM-only warehouses by
  retrying DESCRIBE QUERY without format when JSON_ARRAY is rejected

Co-authored-by: Isaac
Signed-off-by: James Broadhead <jamesbroadhead@gmail.com>
…compatibility

ARROW_STREAM with INLINE disposition is the only format that works
across all warehouse types, including serverless warehouses that reject
JSON_ARRAY. Change the default from JSON to ARROW_STREAM throughout:

- Server: defaults.ts, analytics plugin request handler
- Client: useAnalyticsQuery, UseAnalyticsQueryOptions, useChartData
- Tests: update assertions for new default

JSON and ARROW formats remain available via explicit format parameter.

Co-authored-by: Isaac
Signed-off-by: James Broadhead <jamesbroadhead@gmail.com>
When using the default ARROW_STREAM format, the analytics plugin now
automatically falls back through formats if the warehouse rejects one:
ARROW_STREAM → JSON → ARROW.

This handles warehouses that only support a subset of format/disposition
combinations without requiring users to know their warehouse's
capabilities. Explicit format requests (JSON, ARROW) are respected
without fallback.

Co-authored-by: Isaac
Signed-off-by: James Broadhead <jamesbroadhead@gmail.com>
Previously, _transformDataArray unconditionally called updateWithArrowStatus
for any ARROW_STREAM response, which discards inline data and returns only
statement_id + status. This was designed for EXTERNAL_LINKS (where data is
fetched separately) but broke INLINE disposition where data is in data_array.

Changes:
- _transformDataArray now checks for data_array before routing to the
  EXTERNAL_LINKS path: if data_array is present, it falls through to the
  standard row-to-object transform.
- JSON format now explicitly sends JSON_ARRAY + INLINE rather than relying
  on connector defaults. This prevents the connector default format from
  leaking into explicit JSON requests.
- Connector defaults reverted to JSON_ARRAY for backward compatibility with
  classic warehouses (the analytics plugin sets formats explicitly).
- Added connector-level tests for _transformDataArray covering ARROW_STREAM
  + INLINE, ARROW_STREAM + EXTERNAL_LINKS, and JSON_ARRAY paths.
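The routing fix can be illustrated in a few lines (a Python sketch of the TS `_transformDataArray` behaviour described above; names and result shapes are illustrative):

```python
# INLINE results carry `data_array`; EXTERNAL_LINKS results carry only
# statement metadata, since data is fetched separately.
def transform_data_array(result: dict, columns: list[str]):
    if result.get("data_array") is not None:
        # INLINE disposition: transform rows into named objects directly.
        return [dict(zip(columns, row)) for row in result["data_array"]]
    # EXTERNAL_LINKS path: return the statement-id + status envelope.
    return {"statement_id": result["statement_id"], "status": result["status"]}
```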

Co-authored-by: Isaac
Signed-off-by: James Broadhead <jamesbroadhead@gmail.com>
Some serverless warehouses return ARROW_STREAM + INLINE results as base64
Arrow IPC in `result.attachment` rather than `result.data_array`. This adds
server-side decoding using apache-arrow's tableFromIPC to convert the
attachment into row objects, producing the same response shape as JSON_ARRAY
regardless of warehouse backend.

This abstracts a Databricks internal implementation detail (different
warehouses returning different response formats) so app developers get a
consistent `type: "result"` response with named row objects.

Changes:
- Add apache-arrow@21.1.0 as a server dependency (already used client-side)
- _transformDataArray detects `attachment` field and decodes via tableFromIPC
- Connector tests use real base64 Arrow IPC captured from a live serverless
  warehouse, covering: classic JSON_ARRAY, classic EXTERNAL_LINKS,
  serverless INLINE attachment, data_array fallback, and edge cases

Co-authored-by: Isaac
Signed-off-by: James Broadhead <jamesbroadhead@gmail.com>
Python implementation of the AppKit backend using FastAPI, providing
the same HTTP API surface as the TypeScript version for all plugins:
analytics (SSE query streaming), files (11 endpoints), and genie
(3 SSE endpoints). Includes full test suite (48 unit + 41 integration
tests), SSE streaming infrastructure with reconnection support,
contextvars-based user context, interceptor chain (retry/timeout/cache),
and Databricks SDK connector wiring.

Co-authored-by: Isaac
- Fix path traversal in SPA static file serving (use resolve() + prefix check)
- Fix upload endpoint OOM: stream body with running size counter
- Fix CacheInterceptor to actually use TTL (was storing forever)
- Fix StreamManager reconnection: persist EventRingBuffer per stream_id
- Fix _UserContextProxy: only wrap async methods, leave sync methods alone
- Fix _load_query path traversal: reject /, \, .. in query_key
- Fix Content-Disposition header injection: sanitize filename
- Fix format_buffered_event: apply sanitize_event_type on replay
- Fix ruff target-version to match requires-python (py312)
- Fix __main__.py: load dotenv, use APPKIT_HOST env var
- Add abort_all() implementation to StreamManager
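The SPA path-traversal fix in the first bullet (resolve() + prefix check) looks roughly like this (a sketch; the real helper name and error type in appkit-py may differ):

```python
from pathlib import Path

def safe_static_path(static_root: str, requested: str) -> Path:
    """Resolve a requested file and require it to stay under the static root."""
    root = Path(static_root).resolve()
    candidate = (root / requested.lstrip("/")).resolve()
    if not candidate.is_relative_to(root):  # Python 3.9+
        raise PermissionError(f"path escapes static root: {requested!r}")
    return candidate
```

`resolve()` collapses `..` segments and symlinks before the containment check, which is what defeats `../../etc/passwd`-style requests.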

Co-authored-by: Isaac
…es, path traversal

- Fix OBO: create per-request WorkspaceClient from x-forwarded-access-token
  instead of reusing global service-principal client for all routes
- Fix ARROW format: use EXTERNAL_LINKS disposition and emit arrow event
  with statement_id (matching TS FORMAT_CONFIGS)
- Fix SQL connector: check for FAILED/CANCELED/CLOSED states after polling
  and raise with error message instead of returning empty result
- Fix FilesConnector.resolve_path: reject path traversal (..) sequences
- Update all file/genie endpoints to use per-request user client

Co-authored-by: Isaac
…aceId

- Add pyarrow-based Arrow IPC attachment decoding (decode_arrow_attachment)
  matching TS _transformArrowAttachment for serverless warehouse support
- Implement get_arrow_data: download external link chunks via httpx
- Use transform_result() in analytics handler for unified result processing
- Add maxSize enforcement to FilesConnector.read()
- Auto-inject workspaceId parameter in process_query_params when query
  references :workspaceId
- Add pyarrow and httpx to runtime dependencies

Co-authored-by: Isaac
…→ ARROW)

Mirrors the TS _executeWithFormatFallback: when the default ARROW_STREAM
format is rejected by a warehouse (classic warehouses don't support
INLINE + ARROW_STREAM), automatically falls back through JSON then ARROW.
Verified working against live Databricks SQL Warehouse.

Co-authored-by: Isaac
@jamesbroadhead changed the title from "feat: Python backend (appkit-py) — 100% API compatible" to "feat: Python backend (appkit-py)" on Apr 15, 2026
Extract monolithic server.py into proper Plugin subclasses:
- AnalyticsPlugin: SQL query execution with format fallback, query file loading
- FilesPlugin: 11 routes with volume discovery, path validation, OBO
- GeniePlugin: 3 SSE routes with space alias resolution
- ServerPlugin: orchestrates plugin mounting, static serving, shutdown

Add create_app() factory matching TS createApp():
- Plugin phase ordering (core → normal → deferred)
- WorkspaceClient injection into plugins
- Plugin exports for programmatic API (appkit.analytics.query(...))
- Client config aggregation from all plugins

Plugin base class now has:
- execute() with interceptor chain (timeout → retry → cache)
- execute_stream() for SSE responses
- route() helper for endpoint registration and tracking
- to_plugin() factory matching TS toPlugin()

server.py is now a thin wrapper: create plugins → create_app() → return app.
All 89 tests pass. Live Databricks queries verified.

Co-authored-by: Isaac
@arsenyinfo

arsenyinfo commented Apr 15, 2026

Review: Python Backend (appkit-py)

Reviewed by nitpicker (multi-model debate: Claude Opus + Gemini Pro + Gemini Flash, 3 rounds) and manually.

Verdict: do not merge. Multiple critical bugs, security flaws, and pervasive JS-style patterns.


Critical Bugs

1. Infinite recursion in interceptor chain

File: plugin/plugin.py, execute() method

All lambdas capture interceptor and prev by reference. After the chain is built, every lambda sees the last-assigned values → TimeoutInterceptor.intercept(lambda_retry), where lambda_retry sees interceptor=TimeoutInterceptor and prev=itself → infinite recursion. The cache and retry interceptors are never invoked.

Fix: Use functools.partial — it captures by value at call time:

from functools import partial
current = partial(interceptor.intercept, prev)
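A minimal reproduction of the corrected chain construction, with stand-in interceptor classes (the real appkit-py interceptors have a different interface):

```python
from functools import partial

class Tag:
    """Stand-in interceptor: wraps the inner call's result with its name."""
    def __init__(self, name: str):
        self.name = name

    def intercept(self, next_fn):
        return f"{self.name}({next_fn()})"

def build_chain(handler, interceptors):
    current = handler
    for interceptor in interceptors:
        # partial binds *this* iteration's interceptor and current.
        # A plain `lambda: interceptor.intercept(current)` would capture
        # both names by reference and recurse on the last-assigned values.
        current = partial(interceptor.intercept, current)
    return current

chain = build_chain(lambda: "handler", [Tag("cache"), Tag("retry"), Tag("timeout")])
print(chain())  # timeout(retry(cache(handler)))
```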

2. resolve_path() allows arbitrary volume escape (Security)

File: connectors/files/client.py

if file_path.startswith("/Volumes/"):
    return file_path  # bypasses default_volume entirely

User can pass /Volumes/other_catalog/other_schema/other_vol/secret.txt to read/write/delete files in any volume accessible to the service principal.

Fix: Remove the /Volumes/ bypass; force all paths through scoped default_volume.
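The suggested fix could look like this sketch: every incoming path is joined under the configured volume and normalized, with no absolute-path bypass (names are illustrative):

```python
from posixpath import normpath

def resolve_path(default_volume: str, file_path: str) -> str:
    """Scope every path under default_volume; reject escapes via `..`."""
    joined = normpath(f"{default_volume}/{file_path.lstrip('/')}")
    if not joined.startswith(default_volume.rstrip("/") + "/"):
        raise PermissionError(f"path escapes volume: {file_path!r}")
    return joined
```

Stripping the leading `/` means even `/Volumes/other_catalog/...` inputs are treated as relative names inside the scoped volume rather than absolute bypasses.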

3. Unbounded memory leaks (OOM guaranteed)

  • CacheManager: raw dict with no max capacity, lazy-only eviction. cachetools>=5.3 declared as dep but never imported. Keys that are never re-requested stay forever.
  • StreamManager: UUID-keyed EventRingBuffer created per connection, never cleaned up on disconnect (only in abort_all() at shutdown).

Fix: Use cachetools.TTLCache(maxsize=N). Add on_disconnect cleanup for stream buffers.
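For illustration, a stdlib-only sketch of what `cachetools.TTLCache(maxsize=N, ttl=...)` provides (bounded capacity plus time-based expiry); in production the cachetools class should be used directly:

```python
import time
from collections import OrderedDict

class TinyTTLCache:
    """Illustrative bounded TTL cache, not a cachetools replacement."""
    def __init__(self, maxsize: int, ttl: float):
        self.maxsize, self.ttl = maxsize, ttl
        self._data: OrderedDict[str, tuple[float, object]] = OrderedDict()

    def set(self, key, value):
        self._data[key] = (time.monotonic() + self.ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # evict the oldest entry

    def get(self, key, default=None):
        item = self._data.get(key)
        if item is None:
            return default
        expires, value = item
        if time.monotonic() > expires:
            del self._data[key]  # expire lazily on access
            return default
        return value
```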

4. Silent privilege escalation on OBO failure

File: plugin/plugin.py

except Exception:
    pass  # falls back to service-principal client

If OBO WorkspaceClient creation fails, the exception is swallowed and the high-privilege service-principal client is returned. Auth errors must propagate as 401/403.

5. execute_stream() bypasses StreamManager entirely

File: plugin/plugin.py

All SSE handlers use execute_stream(), which creates a raw StreamingResponse inline. The StreamManager (heartbeats, ring-buffer replay, reconnection) is instantiated per-plugin but never used. last_event_id is read but ignored. The inner send() function is dead code.

6. ContextVar isolation failure — privilege escalation

File: plugin/plugin.py

_UserContextProxy wraps calls in run_in_user_context() (sets a contextvar), but get_workspace_client() never reads that contextvar — it only creates OBO client when request object is explicitly passed. Programmatic calls via as_user() always get service-principal client. Additionally, plugin = self.as_user(request) in analytics/plugin.py creates a proxy that's never used — handler continues with self.


High-Severity Issues

7. Silent error swallowing on request body parsing

Files: analytics/plugin.py, genie/plugin.py, files/plugin.py

body = {}
try:
    body = await request.json()
except Exception:
    pass

Malformed JSON silently treated as {}. Users get misleading downstream errors instead of 400 Bad Request.
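A sketch of the suggested behaviour, using `ValueError` as a framework-neutral stand-in for FastAPI's `HTTPException(status_code=400, ...)` (it also demonstrates the `raise ... from e` chaining called out in finding 11):

```python
import json

def parse_body(raw: bytes) -> dict:
    """Reject malformed JSON with an explicit 400 instead of treating it as {}."""
    if not raw:
        return {}
    try:
        body = json.loads(raw)
    except json.JSONDecodeError as e:
        # Chain the original exception so the traceback is preserved.
        raise ValueError(f"400 Bad Request: invalid JSON: {e}") from e
    if not isinstance(body, dict):
        raise ValueError("400 Bad Request: body must be a JSON object")
    return body
```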

8. files.exists() returns False on any exception

File: connectors/files/client.py

Network errors, 403 Forbidden, SDK bugs — all silently become "file not found". Permission errors must propagate.

9. Upload accumulates up to 5GB in memory

File: plugins/files/plugin.py

Despite claiming streaming, chunks are collected into a list and joined into bytes before upload. With _FILES_MAX_UPLOAD_SIZE = 5GB, this will OOM.
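True streaming with size enforcement can be sketched as an async generator that forwards each chunk immediately while tracking a running total (names and the 413-style error are illustrative):

```python
async def stream_with_limit(chunks, max_size: int):
    """Forward chunks without accumulating them; abort once the limit is hit."""
    total = 0
    async for chunk in chunks:
        total += len(chunk)
        if total > max_size:
            raise ValueError(f"413 Payload Too Large: exceeds {max_size} bytes")
        yield chunk  # hand straight to the upload sink, never buffered whole
```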

10. files.preview() silently swallows errors

File: connectors/files/client.py

except Exception: pass hides download failures.

11. No raise ... from e anywhere

Zero uses of exception chaining across the entire codebase. Original tracebacks lost on re-raises.


Pervasive JS/TS Patterns (not Python)

Type system failures

  • 60+ uses of dict[str, Any] as return types — should be @dataclass(frozen=True), TypedDict, or NamedTuple
  • Bare Any on parameters/returns: client: Any, request: Any, signal: Any
  • 26+ missing return type annotations on async handlers
  • No type checker config — no [tool.ty] or [tool.mypy] section despite mypy>=1.10 in dev deps. Recommend ty (from the ruff/astral team) for strict type checking.
  • # type: ignore[misc] masking real type issues instead of fixing signatures

JS option bags instead of keyword arguments

# Current (JS-style):
async def read(self, client, file_path, options: dict | None = None):
    max_size = (options or {}).get("maxSize")

# Should be:
async def read(self, client, file_path, *, max_size: int | None = None):

or "" defaults instead of early errors

self.default_volume = default_volume or ""  # silently accepts None, proceeds with empty string
self.config = config or {}
directory_path or ""

Python: validate at the boundary, fail early. Don't silently degrade.

camelCase everywhere internally

Dict keys ("conversationId", "messageId", "statusCode"), config keys ("autoStart", "maxUploadSize"), method names (asUser). Internal Python code should be snake_case; serialize to camelCase only at the HTTP boundary.

Java/JS naming and design

  • CacheManager.get_instance() / get_instance_sync() — module-level singleton or DI in Python
  • EventRingBuffer.has_event(), RingBuffer.has(), CacheManager.has() — should be __contains__ (then if x in buffer:)
  • Missing __len__, __repr__, __bool__ on buffer/cache classes
  • _UserContextProxy uses object.__setattr__ hack instead of proper __init__
  • Stateless classes (QueryProcessor, FilesConnector, GenieConnector, SQLWarehouseConnector) used as namespaces — Python prefers module-level functions
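The dunder suggestions above can be sketched on a minimal ring buffer (an illustrative class, not the actual EventRingBuffer):

```python
from collections import deque

class EventRingBuffer:
    """Bounded event buffer with Pythonic membership, length, and repr."""
    def __init__(self, capacity: int):
        self._events: deque[tuple[str, str]] = deque(maxlen=capacity)

    def append(self, event_id: str, data: str) -> None:
        self._events.append((event_id, data))

    def __contains__(self, event_id: str) -> bool:
        # Enables `if event_id in buffer:` instead of buffer.has_event(...)
        return any(eid == event_id for eid, _ in self._events)

    def __len__(self) -> int:
        return len(self._events)

    def __repr__(self) -> str:
        return f"EventRingBuffer(len={len(self)}, capacity={self._events.maxlen})"
```

With `deque(maxlen=...)` the oldest events are dropped automatically, so the buffer is bounded by construction.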

Missing Python features

  • contextvars.ContextVar for per-request state — partially used but broken (see finding 6 above)
  • @dataclass(frozen=True, slots=True) for value objects (UserContext, BufferedEvent) — currently mutable
  • __all__ exports in ~10 subpackage __init__.py files — all empty
  • __slots__ on ~20 classes
  • Lifespan context manager: app.on_event("startup") is deprecated in FastAPI
  • Protocol defined but unused: an ExecutionInterceptor Protocol exists but interceptors don't declare they implement it
  • Plugin class SRP violation — 286 lines mixing routing, interceptors, context, caching, streaming. Should be composed from smaller mixins or protocols.
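The lifespan pattern mentioned above can be sketched without importing FastAPI; the framework accepts any async context manager via `FastAPI(lifespan=...)` (the dict-based `app` and `StreamManager` stand-in are illustrative):

```python
import contextlib

@contextlib.asynccontextmanager
async def lifespan(app: dict):
    # Startup: replaces the deprecated @app.on_event("startup").
    app["stream_manager"] = object()  # e.g. StreamManager()
    try:
        yield
    finally:
        # Shutdown: replaces @app.on_event("shutdown"); e.g. abort_all().
        app.pop("stream_manager", None)

# With FastAPI this would be: app = FastAPI(lifespan=lifespan)
```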

Unused dependency

cachetools>=5.3 in pyproject.toml but never imported.


Required Before Merge

  1. Fix closures with functools.partial (not default-arg lambdas)
  2. Close resolve_path volume escape
  3. Replace raw dict cache with cachetools.TTLCache(maxsize=N)
  4. Remove all except Exception: pass — propagate or handle specifically
  5. Fix get_workspace_client() to read contextvar
  6. Wire execute_stream() through StreamManager
  7. Add ty strict config, run it, fix findings
  8. Replace dict[str, Any] returns with frozen dataclasses / TypedDict
  9. Stream uploads instead of accumulating in memory
  10. Add __contains__/__len__/__repr__ dunders

🔍 Reviewed by nitpicker

