
feat(paste): precheck src existence before allocating a task#327

Merged
lovehunter9 merged 1 commit into main from fix/for_cli on May 21, 2026

@lovehunter9 lovehunter9 commented May 21, 2026

Summary

  • PasteMethod used to return a task id (HTTP 200) the instant request parsing succeeded; src existence was only ever discovered later, deep inside each task phase, where the failure modes varied per route and several looked like phantom Completed tasks to the client.
  • Introduce pkg/drivers/precheck — a node-aware src-existence probe that runs immediately after PasteParam is fully resolved, covering sync, cloud, drive, cache, external/internal/smb/usb/hdd via local stat or a cross-pod /api/resources GET.
  • Two related fixes batched in: pass dst basename through to seaserv.CopyFile / MoveFile so sync-to-sync rename works; tighten TrimShareId so trailing garbage on a share id no longer normalises back to a legitimate row.

Problem

Paste failure modes the client saw before this change:

| Backend | Symptom for a missing src |
| --- | --- |
| Rsync / rclone | task surfaces "no such file" mid-phase |
| Sync (seafile) | GetFromSyncFileCount returns (0, nil), SyncCopy / DownloadFromSync "succeed" with zero bytes |
| Cross-node files-pod | DownloadFromFiles treats an empty tree stream as a clean completion |

From the UI every variant looked the same: a phantom Completed (or briefly Running → Failed) task with zero bytes moved. There was no early signal that the operation could not possibly succeed.

Pre-validation (pkg/drivers/precheck)

Probes src existence per backend, with node-awareness baked in:

| src.FileType | Probe |
| --- | --- |
| sync | seafile RPC GetRepo + GetFileIdByPath / GetDirIdByPath (file vs dir picked strictly from src.Path's trailing slash; no silent type coercion) |
| awss3 / tencent / google / dropbox | rclone GetFilesSize |
| drive (Home / Data) | local os.Stat (master node only by routing) |
| cache / external / internal / smb / usb / hdd | local os.Stat when src.Extend matches the current node; otherwise a GET against the owning files-pod's /api/resources/<fsType>/<extend><path> via the integration manager (same pod DownloadFromFiles already targets) |

share=1 requests honour SrcOwner so the existence probe runs as the share grantor, mirroring the owner-rewrite the task layer applies downstream.
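
As a rough illustration of the routing in the table above, the probe selection could be sketched as follows. This is not the code in pkg/drivers/precheck: chooseProbe, syncLookupIsDir, and the returned labels are hypothetical names, and the real package calls seafile, rclone, os.Stat, or the files-pod instead of returning a string.

```go
package main

import (
	"fmt"
	"strings"
)

// chooseProbe returns a label for the kind of existence probe that would run
// for a given src file type. Hypothetical helper: the labels stand in for the
// real seafile RPC, rclone, os.Stat, and files-pod GET calls.
func chooseProbe(fileType, extend, currentNode string) (string, error) {
	switch fileType {
	case "sync":
		return "seafile-rpc", nil
	case "awss3", "tencent", "google", "dropbox":
		return "rclone-size", nil
	case "drive":
		// Routed to the master node only, so a local stat is enough.
		return "local-stat", nil
	case "cache", "external", "internal", "smb", "usb", "hdd":
		if extend == currentNode {
			return "local-stat", nil
		}
		// The src lives on another node: ask the owning files-pod.
		return "remote-get", nil
	default:
		return "", fmt.Errorf("unsupported src file type %q", fileType)
	}
}

// syncLookupIsDir decides file vs dir strictly from the trailing slash,
// with no fallback to the other lookup.
func syncLookupIsDir(path string) bool {
	return strings.HasSuffix(path, "/")
}

func main() {
	probe, _ := chooseProbe("external", "node-b", "node-a")
	fmt.Println(probe)                          // remote-get
	fmt.Println(syncLookupIsDir("/docs/"))      // true
	fmt.Println(syncLookupIsDir("/docs/a.txt")) // false
}
```

Keeping the node comparison inside the dispatch means callers never need to know whether the probe was local or cross-pod.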

PasteMethod calls precheck.SourceExists right after PasteParam is fully populated (including resolved SrcSharePath / DstSharePath). All precheck failures collapse to a single

{ "code": 1, "msg": "..." }

at HTTP 500, using a new ErrorMessagePasteSrcNotExists constant whose phrasing matches the existing ErrorMessagePasteWrongSourceShare / ErrorMessagePasteSourceExpired. The raw Go error chain stays in klog.Warningf with src / owner / action for ops debugging.

The task pipeline below is unchanged and keeps its own defensive checks.

Related fixes

sync-to-sync rename

HandleBatchCopy / HandleBatchMove previously hard-coded the destination filename to equal the source's, silently dropping any caller-supplied rename target (and nullifying AddVersionSuffix's conflict-resolution suffix). The underlying seaserv.CopyFile / MoveFile already accept independent src/dst filenames; this wires the dst name through with a trailing dstDirents []string parameter that falls back to srcDirents when nil/empty — the no-rename path stays byte-for-byte identical for existing callers.
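
The fallback rule can be illustrated with a small helper. resolveDstDirents is a hypothetical name used only for this sketch; the actual HandleBatchCopy / HandleBatchMove signatures are not shown in this description, and the sketch does not cover what happens when the two slices have different lengths.

```go
package main

import "fmt"

// resolveDstDirents picks the destination names for a batch copy/move.
// When the caller supplies none (nil or empty), the source names are reused,
// which is the pre-existing no-rename behaviour.
func resolveDstDirents(srcDirents, dstDirents []string) []string {
	if len(dstDirents) == 0 {
		return srcDirents
	}
	return dstDirents
}

func main() {
	src := []string{"report.txt"}
	fmt.Println(resolveDstDirents(src, nil))                       // [report.txt]
	fmt.Println(resolveDstDirents(src, []string{"report(1).txt"})) // [report(1).txt]
}
```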

TrimShareId tightening

The previous implementation truncated any string longer than 36 chars to its first 36 chars, so URLs like ?path_id=<uuid>asdasd silently normalised back to <uuid> and hit the legitimate share_paths row.

Replaced with an isKnownNode lookup (production: global.GlobalNode.CheckNodeExists) that strips the suffix only when it is exactly _<a-real-cluster-node-name>. Bare UUIDs, trailing garbage, and unknown node names are returned untouched, so the downstream id = ? query either matches the real row or returns an empty set.

All 19 call sites updated (12 in share_service.go, 4 in router/middleware.go, 1 each in search_service.go, dynamic_hls_controller.go, streaming_helpers.go). Contract pinned with TestTrimShareId covering bare UUIDs, decorated ids with known / unknown / multi-underscore node names, the <uuid>asdasd regression case, lone trailing _, and a nil callback.
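
A minimal sketch of the tightened contract, assuming share ids are canonical 36-character UUIDs optionally followed by _<node-name>. trimShareID here is a hypothetical reimplementation, not the code in pkg/common, and returning the input untouched for a nil callback is an assumption about what the real test pins.

```go
package main

import "fmt"

// uuidLen is the length of a canonical hyphenated UUID string.
const uuidLen = 36

// trimShareID strips a "_<node>" suffix only when <node> is a known cluster
// node. Everything else (bare UUIDs, trailing garbage, unknown nodes, a lone
// trailing underscore) is returned unchanged.
func trimShareID(id string, isKnownNode func(string) bool) string {
	// Assumption: a nil callback means "no node is known".
	if isKnownNode == nil {
		return id
	}
	// Need at least "<uuid>_x" and an underscore right after the UUID.
	if len(id) <= uuidLen+1 || id[uuidLen] != '_' {
		return id
	}
	if !isKnownNode(id[uuidLen+1:]) {
		return id
	}
	return id[:uuidLen]
}

func main() {
	const uuid = "123e4567-e89b-12d3-a456-426614174000"
	known := func(name string) bool { return name == "node-1" }

	fmt.Println(trimShareID(uuid+"_node-1", known) == uuid)        // true
	fmt.Println(trimShareID(uuid+"asdasd", known) == uuid)         // false
	fmt.Println(trimShareID(uuid+"_ghost", known) == uuid+"_ghost") // true
}
```

Injecting the lookup as a callback is what lets a test like TestTrimShareId exercise known and unknown node names without a live cluster.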

Test plan

  • gofmt clean
  • go vet ./...
  • go test ./pkg/common/... (TestTrimShareId covers the regression)
  • go test ./pkg/drivers/precheck/... — structural tests pass without seafile/rclone/k8s wired up
  • Manual: paste with a deleted src on each backend (drive / sync / cloud / external on another node / share) → request fails with 500 { "code": 1, "msg": ErrorMessagePasteSrcNotExists }, no task allocated, no phantom Completed entry in the task list
  • Manual: paste a sync file/dir to itself under a new name → renamed copy/move lands with the new basename (and with AddVersionSuffix if there's a name conflict)
  • Manual: hit ?path_id=<uuid>asdasd and ?path_id=<uuid>_<unknown-node> → both miss the real row instead of silently matching it; ?path_id=<uuid>_<known-node> still resolves
  • Regression: existing in-cluster paste flows (drive → drive, sync → drive, drive → sync, cross-node external) — precheck passes and task completes as before

PasteMethod used to hand out a task id (HTTP 200) the instant request
parsing succeeded; src existence was only ever discovered later, deep
inside each task phase. The failure modes varied by route: Rsync /
rclone surfaced "no such file" mid-task, GetFromSyncFileCount returned
(0, nil) for a missing sync dirent and let SyncCopy / DownloadFromSync
silently "succeed", and DownloadFromFiles treated an empty cross-node
tree stream as a clean completion. From the client every variant
looked like a phantom Completed (or briefly Running -> Failed) task
with zero bytes moved.

Pre-validation
==============

New pkg/drivers/precheck encapsulates a node-aware src-existence
probe:

- sync  -> seafile RPC GetRepo + GetFileIdByPath / GetDirIdByPath
  (file vs dir picked strictly from src.Path's trailing slash; no
  more silent type coercion that papered over caller bugs)
- cloud (awss3 / tencent / google / dropbox) -> rclone GetFilesSize
- drive (Home/Data) -> local os.Stat (master node only by routing)
- cache / external / internal / smb / usb / hdd -> local os.Stat
  when src.Extend matches the current node; otherwise a GET against
  the owning files-pod's /api/resources/<fsType>/<extend><path> via
  the integration manager (same pod that DownloadFromFiles already
  targets)

share=1 requests honor SrcOwner so the existence probe runs as the
share grantor, mirroring the owner-rewrite the task layer applies
downstream.

PasteMethod calls precheck.SourceExists right after PasteParam is
fully populated (including resolved SrcSharePath / DstSharePath). All
precheck failures collapse to a single { "code": 1, "msg":
ErrorMessagePasteSrcNotExists } at HTTP 500, matching the existing
ErrorMessagePasteWrongSourceShare / ErrorMessagePasteSourceExpired
phrasing in pkg/common/constant.go. The raw error chain lands in
klog.Warningf with src/owner/action for ops debugging. The task
pipeline below is unchanged and keeps its own defensive checks.

Related fixes batched in
========================

* sync-to-sync rename: HandleBatchCopy / HandleBatchMove previously
  hard-coded the destination filename to equal the source's,
  silently dropping any caller-supplied rename target (and
  nullifying AddVersionSuffix's conflict-resolution suffix). The
  seaserv.CopyFile / MoveFile RPC already accepts independent
  src/dst filenames; this wires it through with a trailing
  dstDirents []string parameter that falls back to srcDirents
  when nil/empty, so the no-rename path stays byte-for-byte
  identical for existing callers.

* TrimShareId: the previous implementation truncated any string
  longer than 36 chars to its first 36 chars, so URLs like
  ?path_id=<uuid>asdasd silently normalised back to "<uuid>" and
  hit the legitimate share_paths row. Replace with an isKnownNode
  lookup (production: global.GlobalNode.CheckNodeExists) and only
  strip the suffix when it is exactly "_<a-real-cluster-node-name>".
  Bare UUIDs, trailing garbage, and unknown node names are returned
  untouched, so the downstream "id = ?" query either matches the
  real row or returns an empty set. Updates all 19 call sites
  (12 in share_service.go, 4 in router/middleware.go, 1 each in
  search_service.go, dynamic_hls_controller.go,
  streaming_helpers.go) and pins the contract with TestTrimShareId.

Adds structural unit tests for precheck (nil guards, unsupported
file-type branch, share-owner override path) that run without
seafile / rclone / k8s wired up.

Co-authored-by: Cursor <cursoragent@cursor.com>
@lovehunter9 lovehunter9 merged commit b75f2cd into main May 21, 2026
1 check passed