feat(paste): precheck src existence before allocating a task#327
Merged
Conversation
PasteMethod used to hand out a task id (HTTP 200) the instant request
parsing succeeded; src existence was only ever discovered later, deep
inside each task phase. The failure modes varied by route: Rsync /
rclone surfaced "no such file" mid-task, GetFromSyncFileCount returned
(0, nil) for a missing sync dirent and let SyncCopy / DownloadFromSync
silently "succeed", and DownloadFromFiles treated an empty cross-node
tree stream as a clean completion. From the client every variant
looked like a phantom Completed (or briefly Running -> Failed) task
with zero bytes moved.
Pre-validation
==============
New pkg/drivers/precheck encapsulates a node-aware src-existence
probe:
- sync -> seafile RPC GetRepo + GetFileIdByPath / GetDirIdByPath
(file vs dir picked strictly from src.Path's trailing slash; no
more silent type coercion that papered over caller bugs)
- cloud (awss3 / tencent / google / dropbox) -> rclone GetFilesSize
- drive (Home/Data) -> local os.Stat (master node only by routing)
- cache / external / internal / smb / usb / hdd -> local os.Stat
when src.Extend matches the current node; otherwise a GET against
the owning files-pod's /api/resources/<fsType>/<extend><path> via
the integration manager (same pod that DownloadFromFiles already
targets)
share=1 requests honor SrcOwner so the existence probe runs as the
share grantor, mirroring the owner-rewrite the task layer applies
downstream.
PasteMethod calls precheck.SourceExists right after PasteParam is
fully populated (including resolved SrcSharePath / DstSharePath). All
precheck failures collapse to a single { "code": 1, "msg":
ErrorMessagePasteSrcNotExists } at HTTP 500, matching the existing
ErrorMessagePasteWrongSourceShare / ErrorMessagePasteSourceExpired
phrasing in pkg/common/constant.go. The raw error chain lands in
klog.Warningf with src/owner/action for ops debugging. The task
pipeline below is unchanged and keeps its own defensive checks.
Related fixes batched in
========================
* sync-to-sync rename: HandleBatchCopy / HandleBatchMove previously
hard-coded the destination filename to equal the source's,
silently dropping any caller-supplied rename target (and
nullifying AddVersionSuffix's conflict-resolution suffix). The
seaserv.CopyFile / MoveFile RPC already accepts independent
src/dst filenames; this wires it through with a trailing
dstDirents []string parameter that falls back to srcDirents
when nil/empty, so the no-rename path stays byte-for-byte
identical for existing callers.
* TrimShareId: the previous implementation truncated any string
longer than 36 chars to its first 36 chars, so URLs like
?path_id=<uuid>asdasd silently normalised back to "<uuid>" and
hit the legitimate share_paths row. Replace with an isKnownNode
lookup (production: global.GlobalNode.CheckNodeExists) and only
strip the suffix when it is exactly "_<a-real-cluster-node-name>".
Bare UUIDs, trailing garbage, and unknown node names are returned
untouched, so the downstream "id = ?" query either matches the
real row or returns an empty set. Updates all 19 call sites
(12 in share_service.go, 4 in router/middleware.go, 1 each in
search_service.go, dynamic_hls_controller.go,
streaming_helpers.go) and pins the contract with TestTrimShareId.
Adds structural unit tests for precheck (nil guards, unsupported
file-type branch, share-owner override path) that run without
seafile / rclone / k8s wired up.
Co-authored-by: Cursor <cursoragent@cursor.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
PasteMethodused to return a task id (HTTP 200) the instant request parsing succeeded; src existence was only ever discovered later, deep inside each task phase, where the failure modes varied per route and several looked like phantom Completed tasks to the client.pkg/drivers/precheck— a node-aware src-existence probe that runs immediately afterPasteParamis fully resolved, covering sync, cloud, drive, cache, external/internal/smb/usb/hdd via local stat or a cross-pod/api/resourcesGET.seaserv.CopyFile / MoveFileso sync-to-sync rename works; tightenTrimShareIdso trailing garbage on a share id no longer normalises back to a legitimate row.Problem
Paste failure modes the client saw before this change:
GetFromSyncFileCountreturns(0, nil),SyncCopy/DownloadFromSync"succeed" with zero bytesDownloadFromFilestreats an empty tree stream as a clean completionFrom the UI every variant looked the same: a phantom Completed (or briefly Running → Failed) task with zero bytes moved. There was no early signal that the operation could not possibly succeed.
Pre-validation (
pkg/drivers/precheck)Probes src existence per backend, with node-awareness baked in:
src.FileTypesyncGetRepo+GetFileIdByPath/GetDirIdByPath(file vs dir picked strictly fromsrc.Path's trailing slash; no silent type coercion)awss3/tencent/google/dropboxGetFilesSizedrive(Home / Data)os.Stat(master node only by routing)cache/external/internal/smb/usb/hddos.Statwhensrc.Extendmatches the current node; otherwise a GET against the owning files-pod's/api/resources/<fsType>/<extend><path>via the integration manager (same podDownloadFromFilesalready targets)share=1requests honourSrcOwnerso the existence probe runs as the share grantor, mirroring the owner-rewrite the task layer applies downstream.PasteMethodcallsprecheck.SourceExistsright afterPasteParamis fully populated (including resolvedSrcSharePath/DstSharePath). All precheck failures collapse to a single{ "code": 1, "msg": "..." }at HTTP 500, using a new
ErrorMessagePasteSrcNotExistsconstant whose phrasing matches the existingErrorMessagePasteWrongSourceShare/ErrorMessagePasteSourceExpired. The raw Go error chain stays inklog.Warningfwithsrc/owner/actionfor ops debugging.The task pipeline below is unchanged and keeps its own defensive checks.
Related fixes
sync-to-syncrenameHandleBatchCopy/HandleBatchMovepreviously hard-coded the destination filename to equal the source's, silently dropping any caller-supplied rename target (and nullifyingAddVersionSuffix's conflict-resolution suffix). The underlyingseaserv.CopyFile/MoveFilealready accept independent src/dst filenames; this wires the dst name through with a trailingdstDirents []stringparameter that falls back tosrcDirentswhen nil/empty — the no-rename path stays byte-for-byte identical for existing callers.TrimShareIdtighteningThe previous implementation truncated any string longer than 36 chars to its first 36 chars, so URLs like
?path_id=<uuid>asdasdsilently normalised back to<uuid>and hit the legitimateshare_pathsrow.Replaced with an
isKnownNodelookup (production:global.GlobalNode.CheckNodeExists) that strips the suffix only when it is exactly_<a-real-cluster-node-name>. Bare UUIDs, trailing garbage, and unknown node names are returned untouched, so the downstreamid = ?query either matches the real row or returns an empty set.All 19 call sites updated (12 in
share_service.go, 4 inrouter/middleware.go, 1 each insearch_service.go,dynamic_hls_controller.go,streaming_helpers.go). Contract pinned withTestTrimShareIdcovering bare UUIDs, decorated ids with known / unknown / multi-underscore node names, the<uuid>asdasdregression case, lone trailing_, and a nil callback.Test plan
gofmtcleango vet ./...go test ./pkg/common/...—TestTrimShareIdcovers the regressiongo test ./pkg/drivers/precheck/...— structural tests pass without seafile/rclone/k8s wired up500 { "code": 1, "msg": ErrorMessagePasteSrcNotExists }, no task allocated, no phantom Completed entry in the task listAddVersionSuffixif there's a name conflict)?path_id=<uuid>asdasdand?path_id=<uuid>_<unknown-node>→ both miss the real row instead of silently matching it;?path_id=<uuid>_<known-node>still resolves