[TEST] feat/pullfactory — PullFactory extension point + widened LLM payload overrides #47
Merged
What changed
- `PullFactory` extension point in `@bytebell/ingest-github`, mirroring the existing `SourceFactory` for index jobs. `registerGithubWorkers({ sourceFactory, pullFactory })` now accepts both; OSS leaves both undefined and falls back to the default disk-backed clone + `git diff` path.
- `runPull` (`pipeline/pull.ts`) refactored:
  - When `pullFactory` is provided, it returns `{ source, diff, targetCommit, archiveSink? }` and the worker skips the local `ensureReposRoot` / `syncRepository` / `materialiseEndpoints` / `assertReachableFromBranch` / `computePullDiff` / `checkoutCommit` / `createDiskSourceReader` chain entirely.
- `analyseChangedFiles` now takes a `SourceReader` (and optional `ArchiveSink`) instead of a raw `repoDir`, so the pull path no longer assumes a local checkout exists.
- `strategies/flat-folder/analyse-changed.ts` updated to read through `SourceReader` instead of the filesystem directly.
- Exported from `@bytebell/ingest-github`'s `index.ts` so downstream wrappers can compose their own runner/handlers:
  - `createPipelineRunner` + `CreatePipelineRunnerDeps`
  - `createGithubIngestHandler`, `createLocalIngestHandler`, `IngestJobHandlerDeps`
  - `runPull`, `reposRoot`
  - `IngestRunnerDeps`, `IngestRunnerInput`
  - `PullFactory`, `PullFactoryInput`, `PullFactoryResult`
  - `DiffResult`, `RenamedFile`
- `@bytebell/types` `PayloadLlmOverrides`:
  - `llmProvider` widened from the closed union `"openrouter" | "ollama"` to `string` so downstream taxonomies (`"anthropic"`, `"gemini"`, `"mistral"`, …) round-trip through the payload. OSS still only routes on `openrouter` / `ollama`; other values are ignored by the OSS client.
  - `llmKeyId`: opaque to OSS, used by downstream resolvers as an audit pointer back to the source of truth.
- `GithubPullPayload` gains an optional `orgId` (parallel to `GithubIndexPayload.orgId`). OSS leaves it unset and reads `Config.OrgId` (`"local"`) as before; multi-tenant wrappers stamp it from the request.
- `context.md` updated everywhere touched: `ingest-github/`, `ingest-github/src/`, `pipeline/`, `strategies/flat-folder/`, `types/`, plus `@bytebell/types` package + src.

Why
The per-call LLM credentials work landed the data path; this PR lands the matching control path for pulls. Downstream multi-tenant deployments need to:
- […] `api-streaming` instead of a local `git clone` per worker, or a pre-materialised mount).

Without `PullFactory`, every downstream had to fork `runPull`. Without `llmKeyId` / widened `llmProvider` / `orgId`, downstream payloads couldn't round-trip cleanly through the OSS type. OSS standalone behaviour is unchanged: all new fields are optional and the default branch in `runPull` is the old code path verbatim.

Builds on top of `feat/per-call-llm-creds`, hence the base.
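
For reviewers who want a concrete picture of the control path described above, here is a minimal TypeScript sketch. It is not the code in this PR: the type names match the ones listed under "What changed", but every field other than `source`, `diff`, `targetCommit` and `archiveSink?` is an assumption, and `runPullSketch`, `resolveOssProvider`, `effectiveOrgId` and `inMemoryPullFactory` are hypothetical helpers made up for illustration.

```typescript
// Hypothetical shapes. The PR names these types; the fields marked
// "assumed" are illustrative guesses, not the real definitions.
interface SourceReader {
  readFile(path: string): Promise<string>; // assumed method
}
interface ArchiveSink {
  write(path: string, content: string): Promise<void>; // assumed method
}
interface DiffResult {
  changed: string[]; // assumed field
}
interface PullFactoryInput {
  repo: string; // assumed field
  targetCommit: string; // assumed field
  orgId?: string;
}
interface PullFactoryResult {
  source: SourceReader;
  diff: DiffResult;
  targetCommit: string;
  archiveSink?: ArchiveSink;
}
type PullFactory = (input: PullFactoryInput) => Promise<PullFactoryResult>;

interface PayloadLlmOverrides {
  llmProvider?: string; // widened from "openrouter" | "ollama"
  llmKeyId?: string; // opaque audit pointer, never interpreted here
}

// OSS routes only on the two providers it knows; any other value is ignored.
function resolveOssProvider(
  overrides: PayloadLlmOverrides,
): "openrouter" | "ollama" | undefined {
  const provider = overrides.llmProvider;
  if (provider === "openrouter" || provider === "ollama") return provider;
  return undefined;
}

// An unset orgId falls back to the single-tenant default ("local").
function effectiveOrgId(payload: { orgId?: string }): string {
  return payload.orgId ?? "local";
}

// If a factory is supplied it replaces the whole local clone + diff chain;
// otherwise the default disk-backed path runs unchanged.
async function runPullSketch(
  input: PullFactoryInput,
  defaultDiskPull: PullFactory,
  pullFactory?: PullFactory,
): Promise<PullFactoryResult> {
  return pullFactory ? pullFactory(input) : defaultDiskPull(input);
}

// A toy factory that serves files from memory instead of a checkout.
function inMemoryPullFactory(files: Record<string, string>): PullFactory {
  return async (input) => ({
    source: {
      readFile: async (path) => {
        const content = files[path];
        if (content === undefined) throw new Error(`not found: ${path}`);
        return content;
      },
    },
    diff: { changed: Object.keys(files) },
    targetCommit: input.targetCommit,
  });
}
```

With this shape, a multi-tenant wrapper would pass its own factory, while an OSS deployment would pass nothing and keep the disk-backed behaviour. Note that `llmKeyId` is carried through the payload but never read by the routing helper, which matches the "opaque to OSS" intent.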