Replace queue-based indexing with pipeline coordinator #180

Merged
SmilyOrg merged 10 commits into main from split/pipeline-core
May 23, 2026
@SmilyOrg
Owner

Summary

Replaces the old queue-based indexing system (globalTasks sync.Map, metadataQueue, contentsQueue) with a new internal/image/pipeline package that coordinates indexing as a sequential, prioritised pipeline.

Changes

Backend

  • internal/image/pipeline/ — new package with clean stage separation:

    • coordinator.go — sequential task runner with priority ordering (files → metadata → contents); deduplicates concurrent requests; supports cancellation via DELETE /tasks/{id}
    • files.go — stage 1: walk directories, add/remove DB records
    • metadata.go — stage 2: extract EXIF/metadata with configurable workers
    • contents.go — stage 3: generate thumbnails, extract prominent colour and AI embeddings
    • source.go — file sourcing helpers (force vs. missing-only)
    • thumbnail.go — thumbnail load/generate abstraction
    • progress.go — structured progress logging with rate/ETA
    • types.go — pipeline-internal stage types
    • coordinator_test.go — sequential execution, duplicate prevention, priority ordering, shutdown tests
  • internal/task/ — new task lifecycle package extracted from main.go

  • internal/ai/ — renamed from internal/clip (embedding types/functions)

  • main.go — wires pipeline.Coordinator into HTTP handlers; removes globalTasks, indexCollection, and queue-backed task stubs; adds DELETE /tasks/{id} handler

  • api.yaml — adds force parameter to task creation, INDEX_ALL task type, DELETE /tasks/{id} endpoint; deprecates INDEX_CONTENTS_COLOR / INDEX_CONTENTS_AI

  • internal/image/database.go — adds CountMissing, GetDirsCount helpers used by pipeline stages

Frontend

  • Replace deprecated Reindex * buttons with Index metadata, Index color & AI, Index all
  • Add Force reindex checkbox (passes force: true to the API)
  • Add stop button (×) on each running task (calls DELETE /tasks/{id})
  • api.js: add stopTask(), update createTask() to accept an options object

SmilyOrg added 7 commits May 21, 2026 21:15
- Rename internal/clip to internal/ai
- Add internal/image/pipeline package with sequential coordinator
- Add internal/task package with typed task constants
- Remove old queue-based indexFiles/indexMetadata/indexContents methods
- Add new HTTP task API (POST /tasks, DELETE /tasks/{id})
- Update api.yaml and regenerate openapi bindings
- Add PIPELINE_DESIGN.md
- Update coordinator.go AddFiles comment
- Update coordinator_test.go priority order test comment
- Rewrite pipeline README.md removing Stage 3 (faces), ForceFaces,
  FaceWorkers, FaceDetector, HasFaces/WriteFaces, face performance notes
- Migrate unique content (channel return pattern, no EXIF thumbs rationale)
- Remove duplicate sections from README
- Delete top-level PIPELINE_DESIGN.md
- Add stopTask() API call (DELETE /tasks/{id})
- Update createTask() to accept force option
- Replace deprecated Reindex buttons with Index metadata / color+AI / all
- Add force reindex checkbox to CollectionDebug
- Add stop button to each task in TaskList
- Forward force parameter through CollectionPanel and App reload handler
- Fix stale NewFacesTask reference in NewIndexTask deprecation comment
Contributor

Copilot AI left a comment

Pull request overview

This PR replaces the legacy queue-based indexing flow with a new sequential, prioritized indexing pipeline (internal/image/pipeline) coordinated by a single in-process coordinator, and updates the API + UI to support “force reindex”, “index all”, and task cancellation (DELETE /tasks/{id}).

Changes:

  • Introduces internal/image/pipeline stages (files → metadata → contents) with a coordinator that deduplicates tasks and runs them sequentially.
  • Extracts task lifecycle into internal/task and wires the coordinator into main.go handlers (including task stop endpoint).
  • Updates UI task controls (new indexing buttons, “Force reindex” toggle, and stop button) and updates the tasks API schema/SDK.

Reviewed changes

Copilot reviewed 31 out of 32 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| ui/src/components/TaskList.vue | Adds per-task stop button calling DELETE /tasks/{id}. |
| ui/src/components/CollectionPanel.vue | Updates reload event forwarding to include (type, force). |
| ui/src/components/CollectionDebug.vue | Replaces old reindex buttons with new task types + force checkbox. |
| ui/src/App.vue | Updates reload handler to pass {force:true} to task creation. |
| ui/src/api.js | Adds stopTask() and extends createTask() to accept options (e.g. force). |
| SPLIT_PLAN.md | Documents plan to split the broader refactor branch into multiple PRs. |
| main.go | Removes global task queues; wires pipeline coordinator into /tasks handlers and adds DELETE /tasks/{id}. |
| internal/task/task.go | New task + registry types used by the pipeline coordinator. |
| internal/render/scene.go | Renames clip import/types to internal/ai. |
| internal/openapi/api.gen.go | Regenerated OpenAPI server/types (force flag, INDEX_ALL, DELETE /tasks/{id}). |
| internal/image/source.go | Removes queue-backed indexing; adds accessors used by the pipeline; renames clip→ai. |
| internal/image/search.go | Renames clip→ai and updates similarity helpers. |
| internal/image/pipeline/types.go | Defines typed stage payload structs for the pipeline. |
| internal/image/pipeline/thumbnail.go | Implements thumbnail load/generate stage with worker pool + progress. |
| internal/image/pipeline/source.go | Implements DB-sourced file candidate streams (forced vs missing-only) + metadata batching. |
| internal/image/pipeline/README.md | Adds design/usage documentation for the new pipeline. |
| internal/image/pipeline/progress.go | Adds structured progress logging with rate/ETA. |
| internal/image/pipeline/metadata.go | Implements metadata extraction stage writing directly to DB. |
| internal/image/pipeline/files.go | Implements directory walk + DB add/remove logic for file scanning. |
| internal/image/pipeline/coordinator.go | Implements sequential coordinator, priority ordering, and task stop support. |
| internal/image/pipeline/coordinator_test.go | Adds coordinator tests (sequential run, dedupe, priority order, shutdown). |
| internal/image/pipeline/contents.go | Implements color extraction + AI embedding from thumbnails. |
| internal/image/indexMetadata.go | Deletes old queue-worker metadata indexer. |
| internal/image/indexFiles.go | Deletes old file walk helper (replaced by pipeline stage). |
| internal/image/indexContents.go | Deletes old queue-worker contents indexer. |
| internal/image/database.go | Renames clip→ai and adds CountMissing helper for pipeline stage totals. |
| internal/collection/collection.go | Renames clip→ai in similarity API. |
| internal/ai/embedding.go | New embedding type + similarity/dot product helpers (renamed from clip). |
| internal/ai/client.go | New AI client (embeddings + faces API client code) under internal/ai. |
| go.mod | Adjusts prometheus client_model requirement classification (now indirect). |
| config.go | Renames config type from clip.AI to ai.AI. |
| api.yaml | Adds force flag, INDEX_ALL task type, and DELETE /tasks/{id} endpoint; wraps POST /tasks responses in {items:[...]}. |
Files not reviewed (1)
  • internal/openapi/api.gen.go: Language not supported

Comment thread internal/image/pipeline/files.go Outdated
Comment on lines +20 to +52
func walkFiles(dir string, extensions []string, maxFiles int) <-chan string {
    out := make(chan string)
    go func() {
        finished := metrics.Elapsed(fmt.Sprintf("index %s", dir))
        defer finished()

        progress := newProgress("files", 0)
        defer progress.Done()

        files := 0
        err := godirwalk.Walk(dir, &godirwalk.Options{
            Unsorted: true,
            Callback: func(path string, _ *godirwalk.Dirent) error {
                if strings.Contains(path, "@eaDir") {
                    return filepath.SkipDir
                }

                matched := false
                for _, ext := range extensions {
                    if strings.HasSuffix(strings.ToLower(path), ext) {
                        matched = true
                        break
                    }
                }
                if !matched {
                    return nil
                }

                files++
                progress.Inc(1)
                out <- path
                if maxFiles > 0 && files >= maxFiles {
                    return errSkip
Comment on lines +88 to +93
        candidates = c
    } else {
        // Query for missing contents
        log.Println("index source files missing contents")
        candidates = db.ListMissing(dirs, maxPhotos, img.Missing{Color: true, Embedding: true})
    }
Comment on lines +100 to +132
batchSize := 100
ids := make([]img.ImageId, 0, batchSize)
idToPath := make(map[img.ImageId]string)
idToMissing := make(map[img.ImageId]img.Missing)

candidateCount := 0
for candidate := range candidates {
    candidateCount++
    ids = append(ids, candidate.Id)
    idToPath[candidate.Id] = candidate.Path
    idToMissing[candidate.Id] = candidate.Missing

    // Process batch when full
    if len(ids) >= batchSize {
        log.Printf("index source batch %d files\n", len(ids))
        results := db.GetBatch(ids)
        resultCount := 0
        for result := range results {
            resultCount++
            missingInfo := idToMissing[result.Id]
            select {
            case out <- fileWithMeta{
                fileRef: fileRef{ID: result.Id, Path: idToPath[result.Id]},
                Info:    result.Info,
                Tags:    nil,
                Missing: missingInfo,
            }:
            case <-ctx.Done():
                return
            }
        }
        ids = ids[:0]
    }
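
The batching pattern in the excerpt (collect IDs, process when the batch is full) also needs a final flush for a partially filled batch once the input ends; the excerpt stops before that point, so whether and how the real code does this is not shown here. A generic sketch of the pattern, with a hypothetical helper name batches:

```go
package main

// batches groups items from in into slices of up to size elements and emits
// a final partial batch when in is closed.
func batches[T any](in <-chan T, size int) <-chan []T {
	out := make(chan []T)
	go func() {
		defer close(out)
		buf := make([]T, 0, size)
		for item := range in {
			buf = append(buf, item)
			if len(buf) >= size {
				out <- buf
				// Allocate a fresh slice: the receiver now owns the old one.
				buf = make([]T, 0, size)
			}
		}
		if len(buf) > 0 {
			out <- buf // flush the remainder
		}
	}()
	return out
}
```

With five inputs and a size of 2, the consumer receives batches of 2, 2 and 1 elements. Unlike the excerpt, this sketch allocates a new slice per batch instead of reusing `ids[:0]`, which is only safe when the previous batch has been fully consumed before the next append.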
Comment thread main.go Outdated
ThumbnailGenerators: pipelineThumbGens,
ThumbnailSink: imageSource.ThumbSink(),
AIService: imageSource.Clip,
ImageDecoder: imageSource.ThumbSink(),
Comment thread internal/task/task.go
Comment on lines +41 to +49
// Counter returns a channel for incrementing the task's Done counter
func (t *Task) Counter() chan<- int {
    counter := make(chan int, 10)
    go func() {
        for add := range counter {
            t.Done += add
        }
    }()
    return counter
Comment on lines +32 to +38
<ui-icon-button
  v-if="task.id"
  icon="close"
  class="stop-button"
  title="Stop task"
  @click="stopTask(task.id)"
></ui-icon-button>
Comment thread internal/image/pipeline/README.md Outdated
Comment on lines +343 to +346
All stages return error channels. Coordinator collects all errors:
- Logs first 10 errors
- Continues processing remaining items
- Returns error if any occurred
Comment on lines +118 to +131
} else {
    if count, ok := cfg.DB.CountMissing(dirs, img.Missing{Color: true, Embedding: true}); ok {
        if maxPhotos > 0 && count > maxPhotos {
            count = maxPhotos
        }
        t.Total = count
        if count > 0 {
            log.Printf("index contents extract %d files\n", count)
        }
    }
}

metaOut := fileSourceWithMetadata(ctx, cfg.DB, dirs, maxPhotos, force)

…ty, and UI stop handling

Agent-Logs-Url: https://github.com/SmilyOrg/photofield/sessions/12500dec-3761-4a3f-ab32-3f791d94d6ab

Co-authored-by: SmilyOrg <1451391+SmilyOrg@users.noreply.github.com>
Contributor

Copilot AI commented May 21, 2026

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

Warning

Firewall rules blocked me from connecting to one or more addresses. I tried to connect to the following addresses, but was blocked by firewall rules (dns block):

  • git.sr.ht
  • go.uber.org
  • gonum.org
  • google.golang.org
  • modernc.org
  • zombiezen.com


SmilyOrg added 2 commits May 22, 2026 00:50
- Add TaskRunner hook to Config for test injection
- Rewrite TestCoordinatorDuplicatePrevention to verify in-flight dedup
  using a blocking TaskRunner with started/block channels
- Rewrite TestCoordinatorList to assert post-completion empty state
  instead of a mid-flight snapshot
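
The started/block channel pattern mentioned in the commit message can be sketched as follows. The test holds the first run open so that the duplicate request is guaranteed to arrive while the task is still in flight. The inflight guard and checkDedup helper are hypothetical and stand in for the coordinator and its TaskRunner hook.

```go
package main

import "sync"

// inflight is a hypothetical dedup guard keyed by task name.
type inflight struct {
	mu      sync.Mutex
	running map[string]bool
}

func newInflight() *inflight {
	return &inflight{running: make(map[string]bool)}
}

// TryStart runs fn in a new goroutine unless key is already in flight.
func (f *inflight) TryStart(key string, fn func()) bool {
	f.mu.Lock()
	if f.running[key] {
		f.mu.Unlock()
		return false
	}
	f.running[key] = true
	f.mu.Unlock()
	go func() {
		defer func() {
			f.mu.Lock()
			delete(f.running, key)
			f.mu.Unlock()
		}()
		fn()
	}()
	return true
}

// checkDedup starts a blocking run, attempts a duplicate while the first is
// still in flight, then releases the first run and waits for it to finish.
func checkDedup(f *inflight) (first, dup bool) {
	started := make(chan struct{})
	block := make(chan struct{})
	done := make(chan struct{})
	first = f.TryStart("index", func() {
		close(started) // signal that the task is running
		<-block        // stay in flight until released
		close(done)
	})
	<-started
	dup = f.TryStart("index", func() {})
	close(block)
	<-done
	return first, dup
}
```

Waiting on started before issuing the duplicate removes the timing dependence that a sleep-based or mid-flight snapshot test would have.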
@SmilyOrg SmilyOrg merged commit 533220a into main May 23, 2026
5 checks passed
@SmilyOrg SmilyOrg deleted the split/pipeline-core branch May 23, 2026 15:24
3 participants