Conversation
@coderabbitai review

✅ Actions performed: Review triggered.

View Vercel preview at instant-www-js-streams-jsv.vercel.app.
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings.
📝 Walkthrough

Adds a bidirectional streaming subsystem across client and server: new InstantStream APIs and Reactor wiring, server-side stream model with gRPC bridge/server/client and Nippy serializers, reactive/store/session integration, storage flush logic and S3 test hooks, tests, and sandbox playground UIs.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Reactor
    participant InstantStream
    participant gRPCServer
    participant StreamModel
    participant Storage
    Client->>Reactor: createWriteStream(opts)
    Reactor->>InstantStream: createWriteStream(opts)
    InstantStream-->>Client: WritableStream
    Client->>WritableStream: write(chunk)
    WritableStream->>InstantStream: trySend(Start/Append)
    InstantStream->>gRPCServer: send Start/Append
    gRPCServer->>StreamModel: persist/append chunks
    StreamModel->>Storage: flush-to-file (on threshold)
    Storage-->>StreamModel: file metadata
    StreamModel->>gRPCServer: notify stream-flushed
    gRPCServer->>InstantStream: stream-flushed
    InstantStream->>WritableStream: confirm/clear buffer
    Client->>WritableStream: close()
    WritableStream->>InstantStream: finalize
    InstantStream->>gRPCServer: finalize stream
```

```mermaid
sequenceDiagram
    participant Client
    participant Reactor
    participant InstantStream
    participant gRPCClient
    participant gRPCServer
    participant StreamModel
    participant Storage
    Client->>Reactor: createReadStream({streamId})
    Reactor->>InstantStream: createReadStream(streamId)
    InstantStream-->>Client: ReadableStream
    Client->>ReadableStream: start reading
    ReadableStream->>InstantStream: register reader
    InstantStream->>gRPCClient: subscribe-to-instant-stream(req)
    gRPCClient->>gRPCServer: open bidi stream
    gRPCServer->>StreamModel: prepare StreamInit (files, offset)
    StreamModel->>Storage: get-stream-files
    gRPCServer->>gRPCClient: send StreamInit
    gRPCClient->>InstantStream: deliver init
    InstantStream->>ReadableStream: enqueue init/data
    loop updates
        StreamModel->>gRPCServer: stream-append(chunks, offset)
        gRPCServer->>gRPCClient: StreamAppend
        gRPCClient->>InstantStream: receive append
        InstantStream->>ReadableStream: enqueue append
        Client->>ReadableStream: read chunk
    end
    Client->>ReadableStream: cancel()
    ReadableStream->>InstantStream: unregister reader
    InstantStream->>gRPCClient: send unsubscribe
    gRPCClient->>gRPCServer: cancel subscription
```
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes

Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 2

❌ Failed checks (2 warnings)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches

🧪 Generate unit tests (beta)
Actionable comments posted: 20
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/test/instant/reactive/session_test.clj (1)
82-117: ⚠️ Potential issue | 🟡 Minor

Close `socket-3` in fixture cleanup to avoid leaked sessions.

`with-session` opens `socket-3`, but the `finally` block only closes two sockets.

🧹 Suggested cleanup fix

```diff
- (session/on-close store socket)
- (session/on-close store socket-2)
+ (session/on-close store socket)
+ (session/on-close store socket-2)
+ (session/on-close store socket-3)
```
🤖 Fix all issues with AI agents
In `@client/packages/core/src/Reactor.js`:
- Around line 983-999: The switch is reading the wrong key name
msg['orignal-event'] so none of the error cases are matched; update the key to
msg['original-event'] (preserve the optional chaining/access pattern used, e.g.,
msg['original-event']?.op) in the switch expression in Reactor.js so the cases
route to this._syncTable.onResyncError, this._syncTable.onStartSyncError and
this._instantStream.onRecieveError correctly; verify the surrounding switch
block that references these handlers is the one to change.
- Around line 352-357: Reactor currently directly passes global
WritableStream/ReadableStream into new InstantStream which breaks in RN/old
browsers; update Reactor to accept stream constructors via config (or polyfills)
and guard their availability before initializing _instantStream: check the
config first, then fallback to typeof WritableStream/ReadableStream !==
'undefined' before passing them to InstantStream (or throw a clear error if
neither exists), ensuring you still pass trySend: this._trySendAuthed.bind(this)
and log: this._log; reference symbols: InstantStream, _instantStream,
WritableStream, ReadableStream, _trySendAuthed, _log.
In `@client/packages/core/src/Stream.ts`:
- Around line 631-647: The start/restart callback maps (startStreamCbs,
restartStreamCbs) leak entries because onCreateStreamOk and onRestartStreamOk
never remove the callback after calling it; update both handlers
(onCreateStreamOk and onRestartStreamOk) to retrieve and then delete the entry
from their respective maps (delete this.startStreamCbs[msg['client-event-id']]
and delete this.restartStreamCbs[msg['client-event-id']]) before or immediately
after invoking cb, and ensure you still handle the missing-callback case (log
and return) exactly as now to avoid calling undefined.
- Around line 181-206: Closed/aborted write streams (handled by close() and
abort(reason)) are never unregistered from writeStreams so they still get
onConnectionReconnect and leak; update close() and abort() to mark the stream as
terminal (isDone: true) and also remove/unregister the stream from the
writeStreams registry (call the same unregister/remove used elsewhere), and
ensure any terminal flush path also unregisters; additionally, make
onConnectionReconnect skip or ignore streams marked done (check stream.isDone)
to avoid restarting completed streams; locate and modify the close, abort,
appendStream usage and the writeStreams/onConnectionReconnect handlers to add
the unregister call and the isDone check.
- Around line 272-415: The unsubscribe uses a different eventId than subscribe
because runStartStream creates its own eventId; fix by tracking and reusing the
same active subscribe event-id: remove the inner const eventId in runStartStream
(or accept an eventId param) and set a shared variable (e.g., activeEventId) to
the eventId used when calling startStream inside createReadStream (or just reuse
the outer eventId), then change cancel() to call cancelStream({ eventId:
activeEventId }) so the same event-id is sent for subscribe and unsubscribe
(references: createReadStream, runStartStream, cancel, startStream,
cancelStream, eventId).
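The shared event-id fix described above can be sketched in isolation. This is a hypothetical stand-in, not the actual Stream.ts code: `makeReadStreamController`, the message shapes, and the op names are all invented for illustration. The point is that `cancel()` replays the same event-id that the subscribe used.

```typescript
// Hypothetical sketch: remember the event-id used for subscribe so that
// cancel sends the SAME id, instead of minting a fresh one.
type Sent = { op: string; eventId: string };

function makeReadStreamController(send: (msg: Sent) => void) {
  let activeEventId: string | null = null;
  let counter = 0;
  return {
    start(): string {
      const eventId = `ev-${++counter}`;
      activeEventId = eventId; // track for later cancel
      send({ op: 'start-stream', eventId });
      return eventId;
    },
    cancel(): void {
      if (activeEventId === null) return; // already cancelled: no-op
      // unsubscribe with the exact event-id the subscribe used
      send({ op: 'cancel-stream', eventId: activeEventId });
      activeEventId = null;
    },
  };
}
```

The server can then correlate the unsubscribe with the original subscription by event-id alone.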
In `@server/deps.edn`:
- Around line 61-64: Upgrade to io.grpc versions (io.grpc/grpc-netty-shaded,
io.grpc/grpc-stub, io.grpc/grpc-core, io.grpc/grpc-services) at 1.79.0 requires
targeted regression testing and possibly a stepped upgrade path: run unit and
integration tests verifying JSON parsing behavior (ensure duplicate keys are
rejected where expected), validate large header/trailer handling to exercise
metadata size soft-limits, execute load/soak tests to catch Netty-related
corruption changes introduced by grpc-netty-shaded, and check code paths
referencing deprecated APIs (e.g., ProtoReflectionService) for removals; if any
issues surface, perform incremental upgrades through intermediate versions
(1.69.x, 1.75.x, etc.) to isolate behavioral changes and update tests or code
accordingly.
In `@server/src/instant/db/model/triple.clj`:
- Line 430: Remove the debug invocation tool/def-locals from the code (it
captures local bindings for development and can affect production); locate the
occurrence of (tool/def-locals) in triple.clj (the debug helper was added near
the top-level of the file) and delete that call so no debug tooling remains in
the merged codebase.
In `@server/src/instant/grpc.clj`:
- Around line 55-61: The MethodDescriptor named subscribe-method currently sets
(.setType MethodDescriptor$MethodType/SERVER_STREAMING) but the server and
client use bidi calls (ServerCalls/asyncBidiStreamingCall and
ClientCalls/asyncBidiStreamingCall); change the setType to
MethodDescriptor$MethodType/BIDI_STREAMING so subscribe-method's descriptor
matches the bidirectional streaming handlers.
In `@server/src/instant/jdbc/cache_evict.clj`:
- Around line 61-70: The function notify-stream-machine-id-changed (and the
similar code at lines 87-92) currently uses nth on the columns vector without a
default which can throw if columns is shorter or nil; change to guarded access
(e.g., use nth with a default nil or lookup by column name) and only parse and
call app-stream-model/notify-machine-id-changed when both stream-id and
machine-id are present and successfully parsed; ensure
parse-uuid/parse-json-uuid are only invoked on non-nil values to avoid
exceptions.
In `@server/src/instant/model/app_stream.clj`:
- Around line 467-500: The proxy currently creates a per-call executor that is
only shut down in onError/onCompleted, which can leak if neither occurs; modify
server-ify-observer to ensure executor is always cleaned up by either using a
shared executor pool (replace Executors/newSingleThreadExecutor with a shared
executor) or by wrapping the cancel handler to shut it down: when implementing
setOnCancelHandler in the proxy, store the provided runnable and replace it with
a wrapper that calls the provided runnable and then (.shutdown executor) and
also ensure the wrapper is invoked where onError/onCompleted currently set/reset
the on-cancel atom; reference the executor, setOnCancelHandler, onError,
onCompleted, closed? and on-cancel-atom symbols to locate and apply this change.
- Around line 337-343: The current double deref on the flush promise in the
app_stream logic can block forever; change the blocking dereferences of the
promise/value in the (when-let [p (:flush-promise updated)] ...) path to use
timed deref calls with sensible timeouts and a clear fallback path: call deref
with a milliseconds timeout for the outer promise (flush-promise) and then for
the inner future value returned by that promise, check for a timeout result and
handle it (log via your logger, set an error flag on stream-object, or throw a
controlled exception) before proceeding to (flush-to-file stream-object nil
on-flush-to-file). Ensure you reference the same symbols (flush-promise, p,
updated, flush-to-file, stream-object, on-flush-to-file) so the change replaces
the blocking derefs with timed derefs and explicit timeout handling.
- Around line 275-282: The concurrent-flush branch currently only logs an
exception via tracer/record-exception-span! but allows the code path to
continue, risking overlapping writes; modify the synchronization in the
flush-to-file flow (the update of :flush-promise and the use of flush-promise)
so that when (= p flush-promise) is false you either fail-fast by throwing an
IllegalStateException (or return an explicit error) to abort the new flush, or
use an atomic compare-and-set pattern to ensure only one flush promise is
installed and others are rejected; update the code around update :flush-promise
and the app-stream/flush-to-file entry point to either throw on concurrent
detection or return the existing p to prevent any overlapping write operations
and ensure callers handle the failure path.
- Around line 543-555: The error branches for :writer-disconnected,
:stream-missing-on-instance and :instance-missing currently return a retry map
only when `@machine-id-changed` is true, causing nil to be returned (so on-payload
is never called); update the handling in the case dispatch (the clauses that
reference `@machine-id-changed`) to always return an error payload (e.g. {:error
true :error-type :retry} or a distinct {:error true :error-type
:pending-machine-id}) when those cases occur, or explicitly return a non-nil
notification indicating the promise is not yet delivered, so that msg is truthy
and on-payload is invoked; ensure changes are made where these keywords are
matched (the :writer-disconnected, :stream-missing-on-instance and
:instance-missing clauses) and keep the original machine-id-changed check only
to adjust the error-type, not to suppress returning any message.
In `@server/src/instant/nippy.clj`:
- Around line 118-125: The code calls the macro tool/def-locals (in the let
block alongside nippy/thaw-from-in!, read-file, and read-chunks) but the
namespace does not require the tool alias; add a require for the namespace that
provides def-locals (e.g. [your-tool-namespace :as tool]) in the ns declaration
or remove the debug macro call; update the namespace header so tool/def-locals
resolves, then recompile to ensure instant.grpc/->StreamInit usage remains
unchanged.
- Around line 50-115: The custom Nippy serializers are inconsistent: code uses
private helpers (#'nippy/write-long, #'nippy/write-str) to write raw bytes but
reads with nippy/thaw-from-in! (which expects Nippy-framed values), causing
mismatched serialization for StreamRequest, StreamFile, StreamInit and the
helper functions write-file, read-file, write-chunks, read-chunks; fix by making
writes and reads symmetric—replace all uses of #'nippy/write-long and
#'nippy/write-str in write-file, write-chunks and the StreamRequest/StreamInit
extend-freeze blocks with nippy/freeze-to-out! (so fields like offset, size,
location-id, combined chunk byte-size are Nippy-encoded), and ensure
corresponding extend-thaw/read helpers use nippy/thaw-from-in! for those same
fields (or conversely switch to raw DataInput/.readLong/.readUTF and use
.writeLong/.writeUTF), but pick one consistent approach across functions (prefer
nippy/freeze-to-out! / nippy/thaw-from-in!) and update read-file and read-chunks
to match that chosen encoding.
In `@server/src/instant/reactive/session.clj`:
- Around line 739-741: The call to the macro tool/def-locals inside
handle-create-stream! is failing because the namespace alias tool is not
required; either add a require for the namespace that provides def-locals (e.g.,
require it as tool) in the ns declaration or remove/replace the debug macro
call; locate the handle-create-stream! function and the use of tool/def-locals
and update the ns requires to include the correct namespace alias (or delete the
tool/def-locals line) so the code compiles.
- Around line 950-967: The handler cancels the reader but never removes the
reader entity, leaving stale entries; after calling the cancel function
(reader-ent → :stream-reader/reader-object → :cancel with
rs/stream-unsubscribe-reason) you must delete the reader entity from the store
so resubscribe by event-id succeeds. Add a call to a store helper (e.g. a new
rs/remove-stream-reader! or store/remove-stream-reader! helper) immediately
after the cancel, passing the reader entity identifier (use reader-ent's id
field such as :db/id or :stream-reader/id) and sess-id/app-id as needed; keep
the validation/error flow (get-auth!, rs/get-stream-reader) and ensure the
delete is invoked even if cancel succeeds (consider wrapping cancel in
try/finally if necessary).
- Around line 1451-1454: The combine implementation for defmethod combine
[:append-stream :append-stream] currently merges chunks and offset but can drop
a done/abort signal from the earlier event; update the merge to preserve these
flags by carrying over :done and :abort from event1 into the resulting event
(e.g., set :done to (or (:done event1) (:done event2)) and likewise for :abort),
and if there is an abort reason on event1 propagate that as well (e.g., prefer
event1's :abort/:abort-reason when present) while keeping the existing update of
:chunks and assoc of :offset.
In `@server/src/instant/reactive/store.clj`:
- Around line 367-387: The tx-cleanup function uses the macro tool/def-locals
but the namespace alias tool is not required; add the proper require for the
namespace that defines def-locals (e.g., add the corresponding [<tool-namespace>
:as tool] entry to the ns declaration) so tool/def-locals resolves, or
remove/replace the debug macro call inside tx-cleanup if you don't want the
dependency; ensure the change references the tx-cleanup function and the
tool/def-locals symbol.
In `@server/test/instant/reactive/session_test.clj`:
- Around line 1081-1115: Remove the debug-only references to tool/* which are
unresolved: delete the call to tool/def-locals inside with-file-mock's
get-metadata and remove tool/inspect from read-full-stream; instead directly
dereference the result of (rs/get-stream-object-for-subscribe store app-id
stream-id) to access :buffer. Specifically, edit with-file-mock -> get-metadata
to just return {:size (alength ^bytes (get `@files` key))} without
tool/def-locals, and update read-full-stream to use (some->> (deref
(rs/get-stream-object-for-subscribe store app-id stream-id)) :buffer ...) or
equivalent direct deref/access so no tool/inspect is referenced.
🧹 Nitpick comments (7)
server/src/instant/db/permissioned_transaction.clj (1)
61-66: LGTM! The logic correctly extends the $files restriction to $streams. The conditional properly disallows attribute changes for both system namespaces, even for admins.
Minor suggestions:
Docstring update: The `prevent-system-column-updates` docstring (lines 93-95) only mentions `$files` as the special case. Consider updating it to also mention `$streams` for consistency.

💡 Optional: Use a set for multiple etype checks

```diff
- (if (or (= etype "$files")
-         (= etype "$streams"))
+ (if (#{"$files" "$streams"} etype)
    ;; under no circumstances do we let you change $files or $streams attrs
    false
```

This makes it easier to add more restricted namespaces in the future.
server/src/instant/model/app_stream.clj (3)
118-132: Remove debug artifact `tool/def-locals`.

`tool/def-locals` appears to be a debugging helper that should be removed before merging. It's present in multiple places throughout this file (lines 120, 283, 318, 324, 422, 449, 524, 593, 617).

🔧 Suggested fix

```diff
 (defn extract-part-num [stream-id file-path]
   (let [prefix (stream-file-name-prefix stream-id)
-        _ (tool/def-locals)
         _ (when-not (clojure.string/starts-with? file-path prefix)
```
448-449: Remove debug artifact `tool/inspect`.

`tool/inspect` appears to be a debugging helper that should be removed before merging.

🔧 Suggested fix

```diff
 (if (instance? StreamRequest v)
   (deliver cleanup (handle-subscribe store credits v observer))
-  ;; XXX: Where are our credits going?
-  (tool/inspect (swap! credits inc))))
+  (swap! credits inc))))
```
596-601: Clean up debug toggle comments.

The `true ;false ;true ;false` comments appear to be leftover debug toggles and should be cleaned up.

🔧 Suggested fix

```diff
 use-local? (or (not channel)
                (and (= machine-id config/machine-id)
-                    true ;false ;true ;false
                     ;; In dev, hit the grpc server half the time to exercise the route
                     (or (not (config/dev?))
                         (= 0 (rand-int 2)))))
```

client/sandbox/react-nextjs/pages/play/streams.tsx (2)
9-11: Add a fallback for `crypto.randomUUID` in older browsers.

This page will crash where the API is missing (e.g., older Safari/embedded WebViews). Consider a guarded fallback.

🔧 Example fallback

```diff
-function randomUUID() {
-  return crypto.randomUUID();
-}
+function randomUUID() {
+  return typeof crypto?.randomUUID === 'function'
+    ? crypto.randomUUID()
+    : `${Date.now()}-${Math.random().toString(16).slice(2)}`;
+}
```
22-23: Avoid exposing the database on `globalThis` in non-dev builds.

This is handy for debugging, but it can leak internals if this page ships. Consider gating behind a dev flag or removing.
client/packages/core/src/Stream.ts (1)
71-77: Remove or guard debug globals/logs.

`globalThis._stuff` and console logs will leak internals and spam logs in production. Consider gating by a dev flag or removing.

Also applies to: 175-176
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@client/packages/core/src/Stream.ts`:
- Around line 924-944: The start-stream and subscribe-stream error handlers
currently invoke the callback/iterator but leave entries in the maps, causing
memory leaks; after calling cb(...) for the start-stream case (use
startWriteStreamCbs[eventId]) remove the callback entry (delete
startWriteStreamCbs[eventId]) and after pushing the error to the
readStreamIterators[eventId] iterator, ensure you close or finalize the iterator
(call its close/return method if available) and remove the map entry (delete
readStreamIterators[eventId]); make these changes inside the corresponding case
blocks ('start-stream' and 'subscribe-stream') so the resources are cleaned up
on error paths.
- Around line 478-495: The RN arrayBuffer branch currently trims the first chunk
using discardLen but never updates discardLen, causing later files to lose
bytes; in the block handling res.arrayBuffer() (variables: chunk, discardLen,
seenOffset, decoder, controller, fetchAbort, canceled) capture the original
chunk byte length before subarray, then after creating the trimmed chunk
decrement discardLen by (originalLength - chunk.byteLength) (clamped to zero)
just like the body/content paths do, so subsequent iterations use the correct
remaining discardLen and seenOffset remains accurate.
- Around line 545-549: The start() callback passed into the new RStream is
currently invoked but its returned promise is ignored, so async failures won't
propagate to the stream; update the start handler inside the RStream constructor
to return the value of start(controller) (i.e., change the start(controller)
invocation to return start(controller)) so any promise rejection will be handled
by the ReadableStream and error the stream correctly (look for the RStream
instantiation and its start(controller) call in Stream.ts).
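The retrieve-then-delete pattern the first comment in this group asks for (look up the callback, remove the map entry, then invoke, so error and success paths never leak entries) can be sketched generically. `StreamCbRegistry` and its method names are hypothetical, not the actual Stream.ts API.

```typescript
// Hypothetical sketch of the cleanup pattern: completed or errored
// entries are deleted from the map before the callback runs.
type Cb = (msg: Record<string, unknown>) => void;

class StreamCbRegistry {
  // keyed by event-id, mirroring maps like startWriteStreamCbs
  private cbs: Record<string, Cb> = {};

  register(eventId: string, cb: Cb): void {
    this.cbs[eventId] = cb;
  }

  // Called from a response/error handler for the given event-id.
  dispatch(eventId: string, msg: Record<string, unknown>): boolean {
    const cb = this.cbs[eventId];
    if (!cb) {
      // missing-callback case: report failure, never call undefined
      return false;
    }
    delete this.cbs[eventId]; // remove the entry so it cannot leak
    cb(msg);
    return true;
  }

  size(): number {
    return Object.keys(this.cbs).length;
  }
}
```

The same shape applies on error paths: deliver the error, then drop (and, for iterators, close) the registered entry.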
In `@server/src/instant/model/app_stream.clj`:
- Around line 149-162: The unlink-file! function is retracting the "$files" ref
using (str file-id) which mismatches the original ref type (UUID) created by
link-file!; inside the transact! vector in unlink-file! replace the stringified
file-id with the original file-id value (i.e., remove the (str ...) wrapper) so
the retract-triple uses the same value type as link-file! (inspect the transact!
call inside update-op in unlink-file! to make this change), ensuring the retract
matches and the old link is removed.
In `@server/src/instant/reactive/session.clj`:
- Around line 731-772: The reconnect path in handle-start-stream! fails to
persist the current machine id to DB; after calling
app-stream-model/new-stream-object-from-stream (in the branch where
app-stream-model/get-stream returned a stream) call
app-stream-model/update-machine-id! with the existing stream id and
config/machine-id before rs/register-stream so the database’s machineId is
updated to the current node; locate the reconnect branch in
handle-start-stream!, insert a call to app-stream-model/update-machine-id!
(using the stream’s id) prior to rs/register-stream to ensure persistence.
```ts
import { InstantError } from './InstantError.ts';

export type WritableStreamCtor = {
  new <W = any>(
```
stopachka left a comment:
Beautiful work @dwwoelfel !
@coderabbitai review

✅ Actions performed: Review triggered.
🤖 Fix all issues with AI agents
Verify each finding against the current code and only fix it if needed.
In `@server/src/instant/config.clj`:
- Around line 299-302: Update get-grpc-server-port to allow an environment
override: if the GRPC_PORT env var is set (parse it as an integer, handling
invalid values gracefully), return that value; otherwise fall back to (+
(get-hz-port) grpc-port-offset). Reference the existing grpc-port-offset var and
the get-grpc-server-port function when making the change so container/platform
deployments can independently map gRPC port via GRPC_PORT.
In `@server/src/instant/reactive/session.clj`:
- Around line 731-779: In handle-start-stream! the newly created stream branch
returns the map key :udpate-machine-id (typo) instead of :update-machine-id,
which mismatches the destructured {:keys [stream-object update-machine-id]} and
the later (when update-machine-id ...) check; fix the typo by changing the
returned key in the create! branch from :udpate-machine-id to :update-machine-id
so the local binding update-machine-id correctly reflects whether to call
app-stream-model/update-machine-id! and rs/register-stream uses the intended
flag.
In `@server/src/instant/storage/s3.clj`:
- Around line 135-137: The current call uses ((:upload mock) ...) when *s3-mock*
is truthy but does not check that (:upload mock) exists or is a function,
causing opaque NPEs; update the branch around *s3-mock* to first check that (fn?
(:upload mock)) (or similar) before invoking it, and if missing/invalid either
throw a clear exception with contextual info (including the mock value and
bucket-name) or fall back to s3-util/upload-stream-to-s3; reference the symbols
*s3-mock*, :upload, upload-stream-to-s3 and s3-async-client when making the
guard and error message so tests fail with an actionable message.
- Around line 150-152: The mock branch returns already-formatted metadata while
the real branch passes the raw head-object through format-object, causing
inconsistent shapes in tests; change the mock handling in the code around
*s3-mock* so that the value returned by ((:get-object-metadata mock) bucket-name
object-key) is passed into format-object (i.e., normalize both branches through
format-object), or alternatively require mocks to return the raw
s3-util/head-object shape and call format-object on that result; update
callers/tests accordingly to expect the formatted shape from format-object and
keep symbols format-object, *s3-mock*, (:get-object-metadata mock),
s3-util/head-object and s3-client as the integration points.
🧹 Nitpick comments (4)
server/src/instant/storage/s3.clj (2)
135-137: Add a guard for missing mock upload fn to avoid opaque NPEs.
This keeps test failures actionable if a mock is partially defined.

♻️ Suggested guard

```diff
- (if-let [mock *s3-mock*]
-   ((:upload mock) bucket-name ctx* file)
+ (if-let [mock *s3-mock*]
+   (if-let [upload-fn (:upload mock)]
+     (upload-fn bucket-name ctx* file)
+     (throw (ex-info "S3 mock missing :upload fn" {:mock-keys (keys mock)})))
    (s3-util/upload-stream-to-s3 (s3-async-client) bucket-name ctx* file))
```
150-152: Keep mock and real return shapes consistent.
Right now the mock branch bypasses `format-object`; that can mask shape differences in tests. Consider normalizing through the same formatter (and require mocks to return the raw head-object shape).

♻️ Suggested normalization

```diff
- (if-let [mock *s3-mock*]
-   ((:get-object-metadata mock) bucket-name object-key)
-   (format-object (s3-util/head-object (s3-client) bucket-name object-key))))))
+ (format-object
+  (if-let [mock *s3-mock*]
+    ((:get-object-metadata mock) bucket-name object-key)
+    (s3-util/head-object (s3-client) bucket-name object-key))))))
```

server/src/instant/config.clj (1)
299-302: Consider allowing a GRPC_PORT override for deploy flexibility.

Tying gRPC to the HZ port is fine for defaults, but an env override helps when ports must be mapped independently (e.g., container platforms).

♻️ Suggested change

```diff
 (def grpc-port-offset 100)

 (defn get-grpc-server-port []
-  (+ (get-hz-port) grpc-port-offset))
+  (or (env-integer "GRPC_PORT")
+      (+ (get-hz-port) grpc-port-offset)))
```

server/src/instant/reactive/session.clj (1)
731-779: Fix the typo in the map key to avoid confusion.

The new-stream branch returns `:udpate-machine-id` (typo), which makes the intent harder to read and easy to mis-use later.

♻️ Suggested fix
```diff
- {:stream-object (app-stream-model/new-stream-object app-id id)
-  :udpate-machine-id false}))]
+ {:stream-object (app-stream-model/new-stream-object app-id id)
+  :update-machine-id false}))]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/src/instant/reactive/session.clj` around lines 731 - 779, In handle-start-stream! the newly created stream branch returns the map key :udpate-machine-id (typo) instead of :update-machine-id, which mismatches the destructured {:keys [stream-object update-machine-id]} and the later (when update-machine-id ...) check; fix the typo by changing the returned key in the create! branch from :udpate-machine-id to :update-machine-id so the local binding update-machine-id correctly reflects whether to call app-stream-model/update-machine-id! and rs/register-stream uses the intended flag.
Adds support for streams in both the client and admin SDKs.
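Both halves of the API hand back standard Web Streams. A minimal runnable sketch of the writer and reader contracts, using in-memory stand-ins in place of the real `db.streams.createWriteStream` / `db.streams.createReadStream` calls (this example does not touch the SDK itself):

```typescript
// In-memory stand-in for the WritableStream that createWriteStream would
// return: the sink just collects chunks so the example is self-contained.
const written: string[] = [];
const writable = new WritableStream<string>({
  write(chunk) {
    written.push(chunk); // the real sink would ship the chunk to the server
  },
});

const writer = writable.getWriter();
await writer.write("hello ");
await writer.write("world");
await writer.close();

// In-memory stand-in for the ReadableStream that createReadStream would
// return: a fixed source replaying what the writer produced.
const readable = new ReadableStream<string>({
  start(controller) {
    for (const chunk of written) controller.enqueue(chunk);
    controller.close();
  },
});

let received = "";
const reader = readable.getReader();
for (;;) {
  const { done, value } = await reader.read();
  if (done) break;
  received += value;
}
// received === "hello world"
```

In the SDK, the same `getWriter()` / `getReader()` calls apply to the streams returned by the `db.streams` APIs; only the underlying sink and source differ.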
To create a WritableStream, you use `db.streams.createWriteStream` with a unique `clientId`:

To create a ReadableStream, you use `db.streams.createReadStream` with an existing `clientId` (if the `clientId` doesn't exist, we return an error when you read from the stream):

You can write permission rules for the streams. The `create` rule will run when a non-admin calls `createWriteStream` and will get a data object with just `clientId`. The `view` rule will run for `createReadStream` and gets the full stream object.

Implementation
The streams live in memory on the machine that the writer is connected to. Once a stream reaches a certain length (configurable via a feature flag), we flush it to a $file. If you have a very large stream, it will be continually flushed to files.
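The flush-on-threshold behavior can be sketched as follows, in TypeScript for illustration (the real implementation is server-side Clojure; the class and field names here are invented, and the `flushed` array stands in for data written to a $file):

```typescript
// Buffers appended chunks in memory and "flushes" them once the
// buffered byte count crosses a threshold, mimicking flush-to-$file.
class StreamBuffer {
  private chunks: Uint8Array[] = [];
  private buffered = 0;
  flushed: Uint8Array[] = []; // stands in for data persisted to a $file

  constructor(private flushThreshold: number) {}

  append(chunk: Uint8Array) {
    this.chunks.push(chunk);
    this.buffered += chunk.byteLength;
    if (this.buffered >= this.flushThreshold) this.flush();
  }

  private flush() {
    for (const c of this.chunks) this.flushed.push(c);
    this.chunks = [];
    this.buffered = 0;
  }
}

const buf = new StreamBuffer(8);
buf.append(new Uint8Array(5)); // below threshold, stays in memory
buf.append(new Uint8Array(5)); // total 10 >= 8, triggers a flush
```

A very large stream keeps crossing the threshold, so it is flushed repeatedly, matching the "continually flushed to files" behavior above.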
When a reader connects to the stream, we figure out which instance holds the stream (via the `machineId` field on the $stream object), and we subscribe to the stream via a grpc call. The grpc call sends messages to the subscriber with the stream data, and we forward that over the websocket (or sse connection) to the subscriber.

A session might look something like this:
I used grpc to sync the stream between machines because it keeps all of the writes on the node with the writer's session. We could have used hazelcast, but it has more overhead when the data node is not on the writer node; we've encountered issues when trying to use it with the single invalidator. We create a subscription per subscriber, so if a single stream gets popular it will be more expensive, but we can move to a single subscriber per node that fans out to all subscribers on that node if that becomes an issue.
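The per-reader routing decision described above can be sketched like this (names and types are illustrative; the server code is Clojure):

```typescript
// A $stream record carries the machineId of the node holding the live stream.
type StreamRecord = { id: string; machineId: string };

// If the stream lives on this node, read directly from memory;
// otherwise open a grpc subscription to the owning node.
function routeReader(
  stream: StreamRecord,
  localMachineId: string,
): "local" | "grpc-subscribe" {
  return stream.machineId === localMachineId ? "local" : "grpc-subscribe";
}

const route = routeReader({ id: "s1", machineId: "m2" }, "m1");
// → "grpc-subscribe"
```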
React native
React native doesn't have built-in web streams support, so we ship a polyfill. The polyfill is required in a `.native.ts` file, so it should only increase the bundle size for native builds.

Resiliency
The nice thing about streams is that they have a single writer. If you keep track of the number of bytes you've processed, then you know where you are in the stream. We use this feature to keep the streams alive across connection drops.
When the writer loses its connection, it buffers its writes, and then sends all of the data to the server when it reconnects. Whenever we flush the writes to a $file, we send an offset to the writer and it knows that it can discard anything before the offset.
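The writer-side bookkeeping can be sketched as follows (a hypothetical `WriteAheadBuffer`; the SDK's actual internal names may differ):

```typescript
// Every write is buffered with its byte offset; when the server reports
// a flushed offset, anything wholly before it can be discarded.
class WriteAheadBuffer {
  private pending: { offset: number; data: string }[] = [];
  private nextOffset = 0;

  write(data: string) {
    this.pending.push({ offset: this.nextOffset, data });
    this.nextOffset += data.length;
  }

  // Called when the server confirms bytes up to `offset` are durable.
  ack(offset: number) {
    this.pending = this.pending.filter((w) => w.offset + w.data.length > offset);
  }

  // On reconnect, replay everything not yet confirmed.
  replay(): string[] {
    return this.pending.map((w) => w.data);
  }
}

const buf = new WriteAheadBuffer();
buf.write("abc");
buf.write("defg");
buf.ack(3); // "abc" is durable, discard it; replay() now returns ["defg"]
```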
When a reader reconnects, it sends its byte offset and we start its subscription from where it left off.
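Conceptually, resuming a reader is just a slice from its reported offset (a toy sketch over an in-memory string; the real server replays from memory or from flushed $files):

```typescript
// Replay only the bytes at or after the offset the reader reports.
function resumeFrom(streamBytes: string, readerOffset: number): string {
  return streamBytes.slice(readerOffset);
}

const remaining = resumeFrom("abcdefgh", 5);
// remaining === "fgh"
```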
When something weird happens to the stream, we just tell the reader or writer to reconnect with their last offset.
Admin sdk
We created an sse connection for subscribing to queries via the admin sdk, but it creates one sse connection per query. This PR makes it possible to handle multiple requests on the same sse connection (the same way we use the ws/sse connection on the client). Right now it only handles streams, but we could run subscribe-query, and even query/transact, through it later.
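A sketch of the multiplexing idea: each message on the shared connection carries an op id, and a dispatcher routes it to the handler registered for that id (the message shape and names here are assumptions, not the actual wire format):

```typescript
type Handler = (payload: unknown) => void;

// One logical connection carrying many requests, keyed by op id.
class MuxConnection {
  private handlers = new Map<string, Handler>();

  register(opId: string, handler: Handler) {
    this.handlers.set(opId, handler);
  }

  // Called for every message arriving on the single shared connection.
  dispatch(msg: { opId: string; payload: unknown }) {
    this.handlers.get(msg.opId)?.(msg.payload);
  }
}

const conn = new MuxConnection();
const seen: unknown[] = [];
conn.register("stream-1", (p) => seen.push(p));
conn.dispatch({ opId: "stream-1", payload: "chunk-a" }); // routed to handler
conn.dispatch({ opId: "other", payload: "ignored" });    // no handler, dropped
```

The same dispatcher could later route subscribe-query or query/transact messages by registering more op ids on the shared connection.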
Future work
I think this is a pretty robust streams implementation with some unique features, but there is still a lot of room for improvement: