feat: bundle React frontend in Python wheel (Streamlit-style) #275
Open
jamesbroadhead wants to merge 14 commits into databricks:main from
Some serverless warehouses only support ARROW_STREAM with INLINE disposition, but the analytics plugin only offered JSON_ARRAY (INLINE) and ARROW_STREAM (EXTERNAL_LINKS). This adds a new "ARROW_STREAM" format option that uses INLINE disposition, making the plugin compatible with these warehouses. Fixes databricks#242
Tests verify:
- ARROW_STREAM format passes INLINE disposition + ARROW_STREAM format
- ARROW format passes EXTERNAL_LINKS disposition + ARROW_STREAM format
- Default JSON format does not pass disposition or format overrides
The server-side ARROW_STREAM format added in the previous commit was not exposed to the frontend or typegen:
- Add "ARROW_STREAM" to AnalyticsFormat in appkit-ui hooks
- Add "arrow_stream" to DataFormat in chart types
- Handle "arrow_stream" in useChartData's resolveFormat()
- Make typegen resilient to ARROW_STREAM-only warehouses by retrying DESCRIBE QUERY without format when JSON_ARRAY is rejected

Co-authored-by: Isaac
Signed-off-by: James Broadhead <jamesbroadhead@gmail.com>
…compatibility

ARROW_STREAM with INLINE disposition is the only format that works across all warehouse types, including serverless warehouses that reject JSON_ARRAY. Change the default from JSON to ARROW_STREAM throughout:
- Server: defaults.ts, analytics plugin request handler
- Client: useAnalyticsQuery, UseAnalyticsQueryOptions, useChartData
- Tests: update assertions for new default

JSON and ARROW formats remain available via explicit format parameter.

Co-authored-by: Isaac
Signed-off-by: James Broadhead <jamesbroadhead@gmail.com>
When using the default ARROW_STREAM format, the analytics plugin now automatically falls back through formats if the warehouse rejects one: ARROW_STREAM → JSON → ARROW. This handles warehouses that only support a subset of format/disposition combinations without requiring users to know their warehouse's capabilities. Explicit format requests (JSON, ARROW) are respected without fallback. Co-authored-by: Isaac Signed-off-by: James Broadhead <jamesbroadhead@gmail.com>
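The fallback behaviour described in this commit can be sketched in a few lines of Python. This is only an illustration of the stated order (ARROW_STREAM → JSON → ARROW) and of the rule that explicit formats are not retried. `FormatRejected`, `execute_with_fallback`, and the `run` callback are hypothetical names, not the plugin's actual API.

```python
from typing import Any, Callable, Optional, Tuple

# Order taken from the commit message; the constant name is an assumption.
FALLBACK_ORDER = ["ARROW_STREAM", "JSON", "ARROW"]


class FormatRejected(Exception):
    """Hypothetical stand-in for the error a warehouse returns for an unsupported format."""


def execute_with_fallback(
    run: Callable[[str], Any], requested: Optional[str] = None
) -> Tuple[str, Any]:
    """Run a query, walking the fallback chain only when no format was requested."""
    # An explicit format is respected as-is, so a rejection propagates to the caller.
    if requested is not None:
        return requested, run(requested)

    last_error: Optional[Exception] = None
    for fmt in FALLBACK_ORDER:
        try:
            return fmt, run(fmt)
        except FormatRejected as exc:
            # Remember the rejection and try the next format in the chain.
            last_error = exc
    # Every format was rejected: surface the last rejection.
    assert last_error is not None
    raise last_error
```

Returning the format alongside the result lets the caller know which response shape it actually received.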
Previously, _transformDataArray unconditionally called updateWithArrowStatus for any ARROW_STREAM response, which discards inline data and returns only statement_id + status. This was designed for EXTERNAL_LINKS (where data is fetched separately) but broke INLINE disposition where data is in data_array.

Changes:
- _transformDataArray now checks for data_array before routing to the EXTERNAL_LINKS path: if data_array is present, it falls through to the standard row-to-object transform.
- JSON format now explicitly sends JSON_ARRAY + INLINE rather than relying on connector defaults. This prevents the connector default format from leaking into explicit JSON requests.
- Connector defaults reverted to JSON_ARRAY for backward compatibility with classic warehouses (the analytics plugin sets formats explicitly).
- Added connector-level tests for _transformDataArray covering ARROW_STREAM + INLINE, ARROW_STREAM + EXTERNAL_LINKS, and JSON_ARRAY paths.

Co-authored-by: Isaac
Signed-off-by: James Broadhead <jamesbroadhead@gmail.com>
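To illustrate the routing fix, here is a minimal Python sketch of the check described above: an ARROW_STREAM response is treated as EXTERNAL_LINKS only when no inline `data_array` is present. The response layout (`manifest.format`, `manifest.schema.columns`, `result.data_array`) and the returned `type` values are assumptions made for the example. The real logic is the TypeScript `_transformDataArray`.

```python
from typing import Any, Dict


def transform_result(response: Dict[str, Any]) -> Dict[str, Any]:
    """Route a statement response to the inline or external-links path (illustrative)."""
    result = response.get("result") or {}
    manifest = response.get("manifest") or {}
    data_array = result.get("data_array")

    # Only take the EXTERNAL_LINKS path when there is no inline data to transform.
    if manifest.get("format") == "ARROW_STREAM" and data_array is None:
        return {
            "type": "arrow",
            "statement_id": response.get("statement_id"),
            "status": response.get("status"),
        }

    # Standard row-to-object transform: zip column names with each row.
    columns = [col["name"] for col in manifest.get("schema", {}).get("columns", [])]
    rows = [dict(zip(columns, row)) for row in (data_array or [])]
    return {"type": "result", "data": rows}
```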
Some serverless warehouses return ARROW_STREAM + INLINE results as base64 Arrow IPC in `result.attachment` rather than `result.data_array`. This adds server-side decoding using apache-arrow's tableFromIPC to convert the attachment into row objects, producing the same response shape as JSON_ARRAY regardless of warehouse backend.

This abstracts a Databricks internal implementation detail (different warehouses returning different response formats) so app developers get a consistent `type: "result"` response with named row objects.

Changes:
- Add apache-arrow@21.1.0 as a server dependency (already used client-side)
- _transformDataArray detects `attachment` field and decodes via tableFromIPC
- Connector tests use real base64 Arrow IPC captured from a live serverless warehouse, covering: classic JSON_ARRAY, classic EXTERNAL_LINKS, serverless INLINE attachment, data_array fallback, and edge cases

Co-authored-by: Isaac
Signed-off-by: James Broadhead <jamesbroadhead@gmail.com>
Python implementation of the AppKit backend using FastAPI, providing the same HTTP API surface as the TypeScript version for all plugins: analytics (SSE query streaming), files (11 endpoints), and genie (3 SSE endpoints). Includes full test suite (48 unit + 41 integration tests), SSE streaming infrastructure with reconnection support, contextvars-based user context, interceptor chain (retry/timeout/cache), and Databricks SDK connector wiring. Co-authored-by: Isaac
- Fix path traversal in SPA static file serving (use resolve() + prefix check)
- Fix upload endpoint OOM: stream body with running size counter
- Fix CacheInterceptor to actually use TTL (was storing forever)
- Fix StreamManager reconnection: persist EventRingBuffer per stream_id
- Fix _UserContextProxy: only wrap async methods, leave sync methods alone
- Fix _load_query path traversal: reject /, \, .. in query_key
- Fix Content-Disposition header injection: sanitize filename
- Fix format_buffered_event: apply sanitize_event_type on replay
- Fix ruff target-version to match requires-python (py312)
- Fix __main__.py: load dotenv, use APPKIT_HOST env var
- Add abort_all() implementation to StreamManager

Co-authored-by: Isaac
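The two path-traversal fixes in this list follow a common pattern, sketched below under assumptions: resolve the requested path and confirm it still lives under the static root, and reject query keys containing `/`, `\`, or `..`. The function names `resolve_static_path` and `is_safe_query_key` are hypothetical, and the PR's own code may differ in detail.

```python
from pathlib import Path
from typing import Optional


def resolve_static_path(static_root: Path, request_path: str) -> Optional[Path]:
    """Return the file to serve, or None if the request escapes static_root."""
    root = static_root.resolve()
    # Resolve symlinks and ".." segments before comparing against the root.
    candidate = (root / request_path.lstrip("/")).resolve()
    if not candidate.is_relative_to(root):
        return None
    return candidate if candidate.is_file() else None


def is_safe_query_key(query_key: str) -> bool:
    """Reject keys that could point outside the query directory."""
    return bool(query_key) and not any(bad in query_key for bad in ("/", "\\", ".."))
```

Checking the resolved path rather than the raw string matters because `..` segments and symlinks only become visible after resolution.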
…es, path traversal

- Fix OBO: create per-request WorkspaceClient from x-forwarded-access-token instead of reusing global service-principal client for all routes
- Fix ARROW format: use EXTERNAL_LINKS disposition and emit arrow event with statement_id (matching TS FORMAT_CONFIGS)
- Fix SQL connector: check for FAILED/CANCELED/CLOSED states after polling and raise with error message instead of returning empty result
- Fix FilesConnector.resolve_path: reject path traversal (..) sequences
- Update all file/genie endpoints to use per-request user client

Co-authored-by: Isaac
…aceId

- Add pyarrow-based Arrow IPC attachment decoding (decode_arrow_attachment) matching TS _transformArrowAttachment for serverless warehouse support
- Implement get_arrow_data: download external link chunks via httpx
- Use transform_result() in analytics handler for unified result processing
- Add maxSize enforcement to FilesConnector.read()
- Auto-inject workspaceId parameter in process_query_params when query references :workspaceId
- Add pyarrow and httpx to runtime dependencies

Co-authored-by: Isaac
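As a rough illustration of the `:workspaceId` auto-injection, the sketch below adds the parameter only when the query text references it. The helper name `inject_workspace_id` and the choice to let a caller-supplied value win are assumptions for the example, not confirmed behaviour of `process_query_params`.

```python
import re
from typing import Any, Dict

# Match the named parameter but not longer identifiers such as :workspaceIdentifier.
_WORKSPACE_PARAM = re.compile(r":workspaceId\b")


def inject_workspace_id(query: str, params: Dict[str, Any], workspace_id: str) -> Dict[str, Any]:
    """Return a copy of params with workspaceId added when the query references it."""
    merged = dict(params)
    # Assumption: an explicitly supplied workspaceId is left untouched.
    if _WORKSPACE_PARAM.search(query) and "workspaceId" not in merged:
        merged["workspaceId"] = workspace_id
    return merged
```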
…→ ARROW) Mirrors the TS _executeWithFormatFallback: when the default ARROW_STREAM format is rejected by a warehouse (classic warehouses don't support INLINE + ARROW_STREAM), automatically falls back through JSON then ARROW. Verified working against live Databricks SQL Warehouse. Co-authored-by: Isaac
Extract monolithic server.py into proper Plugin subclasses:
- AnalyticsPlugin: SQL query execution with format fallback, query file loading
- FilesPlugin: 11 routes with volume discovery, path validation, OBO
- GeniePlugin: 3 SSE routes with space alias resolution
- ServerPlugin: orchestrates plugin mounting, static serving, shutdown

Add create_app() factory matching TS createApp():
- Plugin phase ordering (core → normal → deferred)
- WorkspaceClient injection into plugins
- Plugin exports for programmatic API (appkit.analytics.query(...))
- Client config aggregation from all plugins

Plugin base class now has:
- execute() with interceptor chain (timeout → retry → cache)
- execute_stream() for SSE responses
- route() helper for endpoint registration and tracking
- to_plugin() factory matching TS toPlugin()

server.py is now a thin wrapper: create plugins → create_app() → return app. All 89 tests pass. Live Databricks queries verified.

Co-authored-by: Isaac
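The plugin phase ordering (core → normal → deferred) can be pictured with a small sketch. `PluginSpec`, `PHASE_RANK`, and `order_plugins` are hypothetical names used only for illustration, and how `create_app()` actually orders plugins is not shown on this page.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Phase names come from the commit message; the numeric ranks are an assumption.
PHASE_RANK = {"core": 0, "normal": 1, "deferred": 2}


@dataclass
class PluginSpec:
    name: str
    phase: str = "normal"


def order_plugins(plugins: Iterable[PluginSpec]) -> List[PluginSpec]:
    """Sort plugins by phase; sorted() is stable, so registration order is kept within a phase."""
    return sorted(plugins, key=lambda plugin: PHASE_RANK[plugin.phase])
```

A stable sort is a reasonable default here because it keeps the user's registration order meaningful inside each phase.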
Python developers can now `pip install appkit-py` and get a working frontend without needing Node.js/npm. The pre-built React app (using appkit-ui components) is included as static assets in the wheel and served automatically when no user-provided frontend directory is found.

- Add frontend/ with standalone Vite+React app that dynamically discovers enabled plugins from window.__appkit__ config
- Add scripts/build_frontend.sh to compile frontend at release time
- Update pyproject.toml with package-data for static/**/*
- Update ServerPlugin._find_static_dir() to fall back to bundled assets
- Add MANIFEST.in for source distribution support
- Add unit tests for static file discovery logic

Co-authored-by: Isaac
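The static-directory fallback might look roughly like the sketch below: user-provided directories are checked first, and the bundled assets are used only when none of them contains a built frontend. The function `find_static_dir`, its parameters, and the `index.html` probe are assumptions for illustration. The actual `ServerPlugin._find_static_dir()` may use different criteria.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_static_dir(user_candidates: Iterable[Path], bundled_dir: Path) -> Optional[Path]:
    """Prefer a user-provided frontend build; otherwise fall back to bundled assets."""
    for candidate in user_candidates:
        # Assumption: a directory counts as a frontend build if it contains index.html.
        if (candidate / "index.html").is_file():
            return candidate
    if (bundled_dir / "index.html").is_file():
        return bundled_dir
    # Neither a user build nor bundled assets exist, so there is nothing to serve.
    return None
```

In the layout described here, `user_candidates` would include something like `Path("client/dist")`, and `bundled_dir` would be the `static/` directory inside the installed `appkit_py` package.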
Summary
- Adds a standalone Vite+React app (`packages/appkit-py/frontend/`) that dynamically discovers enabled plugins from the `window.__appkit__` config: pages for analytics, files, and genie appear automatically based on what the Python backend exposes
- Bundles the built assets into the wheel via `pyproject.toml` package-data, so `pip install appkit-py` gives Python devs a working UI without needing Node.js/npm (same pattern as Streamlit)
- `ServerPlugin._find_static_dir()` now falls back to the bundled `appkit_py/static/` when no user-provided frontend directory (e.g., `client/dist`) exists

Details
- `scripts/build_frontend.sh` builds `appkit-ui` then compiles the Vite+React app into `src/appkit_py/static/` (run at release time, not by end users)

Test plan
- `npm run build` produces correct output in `src/appkit_py/static/`
- `pytest tests/unit/`: all 51 tests pass
- Run `python -m appkit_py` without a `client/dist` dir and verify the bundled UI loads
- Run with a `client/dist` dir and verify it takes priority over bundled assets
- `pip wheel . && unzip -l *.whl | grep static`

Depends on #274.
This pull request was AI-assisted by Isaac.