Conversation
744d776 to
5200066
Compare
There was a problem hiding this comment.
Pull request overview
This PR introduces a comprehensive Files plugin for AppKit that enables Unity Catalog volume file operations. The plugin provides both HTTP routes and a programmatic API for common file operations including list, read, download, upload, delete, and preview. It integrates with AppKit's execution interceptor pipeline for caching, retry, and timeout handling.
The PR also modifies the server plugin to support bypassing JSON body parsing for specific routes (needed for file upload streaming), introducing a new skipBodyParsing route configuration option and getSkipBodyParsingPaths() plugin interface method.
Changes:
- New Files plugin with full CRUD operations for Unity Catalog volumes
- Server plugin modification to support raw body streaming for file uploads
- Comprehensive test coverage including unit, integration, and connector tests
- Documentation updates including README, API docs, and plugin guide
Reviewed changes
Copilot reviewed 32 out of 33 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
packages/appkit/src/plugins/files/plugin.ts |
Main FilesPlugin class with HTTP routes and programmatic API |
packages/appkit/src/plugins/files/types.ts |
TypeScript type definitions for file operations |
packages/appkit/src/plugins/files/defaults.ts |
Execution defaults for different operation tiers (read/download/write) |
packages/appkit/src/plugins/files/manifest.json |
Plugin manifest declaring resource requirements |
packages/appkit/src/connectors/files/client.ts |
FilesConnector implementing core file operations with telemetry |
packages/appkit/src/connectors/files/defaults.ts |
Content-type resolution and text detection helpers |
packages/appkit/src/plugins/server/index.ts |
Modified to support selective body parsing bypass |
packages/shared/src/plugin.ts |
Added getSkipBodyParsingPaths() to BasePlugin interface |
packages/appkit/src/plugin/plugin.ts |
Implementation of skipBodyParsing tracking and path registration |
packages/appkit/src/plugins/files/tests/* |
Comprehensive test suites for plugin, connector, and integration |
packages/appkit/src/plugins/server/tests/server.test.ts |
Tests for body parsing bypass functionality |
docs/docs/plugins.md |
Plugin guide with Files plugin documentation and examples |
packages/appkit/src/index.ts |
Export of files plugin and contentTypeFromPath helper |
apps/dev-playground/server/index.ts |
Dev playground integration |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
cd3d855 to
1c5805b
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 32 out of 33 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
e86e4a4 to
bb57226
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 32 out of 32 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 32 out of 32 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
The plugin looks awesome, great work! 👌
Could you please record a short demo how it works with the databricks apps init command and whether the apps deploy passes? Thanks in advance!
# in cli repo
make build
sudo ln -s "$(pwd)/cli" /usr/local/bin/dbx
# in appkit repo
DATABRICKS_APPKIT_TEMPLATE_PATH="/Users/pawel.kosiec/repositories/databricks-os/appkit/template" dbx apps init
packages/appkit/src/plugins/files/tests/plugin.integration.test.ts
Outdated
Show resolved
Hide resolved
fjakobs
left a comment
There was a problem hiding this comment.
Blocking - we need to solve access control before we can merge it
apps/dev-playground/server/index.ts
Outdated
| telemetryExamples(), | ||
| analytics({}), | ||
| lakebaseExamples(), | ||
| files({ defaultVolume: process.env.DATABRICKS_DEFAULT_VOLUME }), |
There was a problem hiding this comment.
how to we manage permissions. Can all users read and upload all the files?
docs/docs/plugins.md
Outdated
| | `GET` | `/api/files/preview?path=` | Get a file preview with text excerpt. | | ||
| | `POST` | `/api/files/upload?path=` | Upload a file (stream the request body). | | ||
| | `POST` | `/api/files/mkdir` | Create a directory (`{ path }` in body). | | ||
| | `POST` | `/api/files/delete` | Delete a file or directory (`{ path }` in body). | |
There was a problem hiding this comment.
We also need something like createSignedUrl() (see https://supabase.com/docs/reference/javascript/storage-from-createsignedurl) so we can download files that take longer than 60s to download.
| private async traced<T>( | ||
| operation: string, | ||
| attributes: Record<string, string>, | ||
| fn: (span: Span) => Promise<T>, | ||
| ): Promise<T> { | ||
| const startTime = Date.now(); | ||
| let success = false; | ||
|
|
||
| return this.telemetry.startActiveSpan( | ||
| `files.${operation}`, | ||
| { | ||
| kind: SpanKind.CLIENT, | ||
| attributes: { | ||
| "files.operation": operation, | ||
| ...attributes, | ||
| }, | ||
| }, | ||
| async (span: Span) => { | ||
| try { | ||
| const result = await fn(span); | ||
| success = true; | ||
| span.setStatus({ code: SpanStatusCode.OK }); | ||
| return result; | ||
| } catch (error) { | ||
| span.recordException(error as Error); | ||
| span.setStatus({ | ||
| code: SpanStatusCode.ERROR, | ||
| message: error instanceof Error ? error.message : String(error), | ||
| }); | ||
| throw error; | ||
| } finally { | ||
| span.end(); | ||
| const duration = Date.now() - startTime; | ||
| const metricAttrs = { | ||
| "files.operation": operation, | ||
| success: String(success), | ||
| }; | ||
| this.telemetryMetrics.operationCount.add(1, metricAttrs); | ||
| this.telemetryMetrics.operationDuration.record(duration, metricAttrs); | ||
| } | ||
| }, | ||
| { name: this.name, includePrefix: true }, | ||
| ); | ||
| } |
There was a problem hiding this comment.
this whole method seems like a nice abstraction we could have as a core part of plugins no? I see there are some files related stuff but maybe we could have something more generic as part of the core? Not blocking, just a suggestion / idea
| volume(volumePath: string): FilesConnector { | ||
| return new FilesConnector({ | ||
| defaultVolume: volumePath, | ||
| telemetry: false, | ||
| customContentTypes: this.customContentTypes, | ||
| }); | ||
| } |
There was a problem hiding this comment.
why do we return a full new instance of the class?
Move dev-playground route and UI to a separate branch (plugin/files-playground). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Atila Fassina <atila@fassina.eu>
… docs, remove redundancy
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* feat: create gateway and inject user context to every handler * docs: update API references * fix: register volume in `app.yml` for dev-playground * docs: update Class Plugin * chore: add files route and UI to dev-playground (#114) * chore: add files route and UI to dev-playground Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Atila Fassina <atila@fassina.eu> * docs: regen --------- Signed-off-by: Atila Fassina <atila@fassina.eu> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix: url path when uploading files + error handling/logging * Update packages/appkit/src/plugins/files/plugin.ts Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * chore: address code review comments * chore: remove unused import * fix: cache TTL should be in seconds * test: add integration tests for Files plugin * refactor: remove OBO Gateway * feat: add rate-limiter for uploads, cleanup code * chore: address docs and type gen --------- Signed-off-by: Atila Fassina <atila@fassina.eu> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
e7ecbe6 to
4d1d69d
Compare
| list: this.list.bind(this), | ||
| /** Read a file as a string. */ | ||
| read: this.read.bind(this), | ||
| /** Download a file as a readable stream. */ | ||
| download: this.download.bind(this), | ||
| /** Check whether a file exists. */ | ||
| exists: this.exists.bind(this), | ||
| /** Retrieve file metadata. */ | ||
| metadata: this.metadata.bind(this), | ||
| /** Upload a file. */ | ||
| upload: this.upload.bind(this), | ||
| /** Create a directory. */ | ||
| createDirectory: this.createDirectory.bind(this), | ||
| /** Delete a file or directory. */ | ||
| delete: this.delete.bind(this), | ||
| /** Get a file preview with text excerpt. */ | ||
| preview: this.preview.bind(this), |
There was a problem hiding this comment.
we don't need to bind things (it doesn't harm, but appkit takes care of it already 😄 )
fjakobs
left a comment
There was a problem hiding this comment.
We need to re-work the way we are handling absolute paths. I'm not comfortable serving files from volumes that the user didn't explicitly configure.
| - Upload size limits with streaming enforcement | ||
| - Automatic cache invalidation on write operations | ||
| - Custom content type mappings | ||
| - Per-user execution context (OBO) |
There was a problem hiding this comment.
not blocking: we need to test what happens when we hit the 60s request timeout.
|
|
||
| ## HTTP routes | ||
|
|
||
| Routes are mounted at `/api/files/*`. All routes except `/root` execute in user context via `asUser(req)`. |
| The `/raw` endpoint serves files inline for browser display but applies security headers: | ||
| - `X-Content-Type-Options: nosniff` | ||
| - `Content-Security-Policy: sandbox` | ||
| - Unsafe content types (HTML, JS, SVG) are forced to download via `Content-Disposition: attachment` |
There was a problem hiding this comment.
why not do this for all mime types? Then it's clear that it is always a download link.
|
|
||
| Paths can be **absolute** or **relative**: | ||
|
|
||
| - **Absolute** — starts with `/`, must begin with `/Volumes/` (e.g. `/Volumes/catalog/schema/vol/data.csv`) |
There was a problem hiding this comment.
does this mean that you can read files from volumes that are not explicitly configured?
I would prefer a model where we "mount" a folder in a volume into the app and no folder outside of this mount is accessible.
| @@ -0,0 +1,131 @@ | |||
| import { AlertCircle, ArrowLeft, FileIcon } from "lucide-react"; | |||
There was a problem hiding this comment.
the UI components look pretty straightforward. I wonder if they should rather be a skill instead of library code.
| (async function* () {})(), | ||
| ); | ||
|
|
||
| await connector.list(mockClient, "/Volumes/other/path"); |
There was a problem hiding this comment.
I'd consider this behaviour a security bug.
We explicitly configure the plugin to use /Volumes/catalog/schema/vol but now we can also read files from other volumes.
I know that the end user still needs permissions to read these files on the other volume but it's very unexpected and might still not be desireable.
| if (segments.some((s) => s === "..")) { | ||
| throw new Error('Path traversal ("../") is not allowed.'); | ||
| } | ||
| if (filePath.startsWith("/")) { |
There was a problem hiding this comment.
we should not allow reading files that are not in one of the configured volumes. I'd drop support for absolute paths entirely.
Alternatively we would also have to check the next folder level and make sure it matches the configured folder.
| const host = hostValue.startsWith("http") | ||
| ? hostValue | ||
| : `https://${hostValue}`; | ||
| const url = new URL(`/api/2.0/fs/files${resolvedPath}`, host); |
There was a problem hiding this comment.
Ideally this should go into the JS SDK but for now it's fine
This PR implements a first version of the Files API plugin according to the internal RFC.
Features
Important
this PR also touches the server plugin to bypass the
express.jsonmiddleware for upload routes, otherwise the stream does not passthrough