refactor(admin-api): generalize scheduler trait over job kinds #1842

Merged
LNSD merged 1 commit into main from lnsd/refactor-admin-api-job-kinds on Feb 23, 2026

@LNSD commented Feb 20, 2026

Separate job descriptor construction from scheduling so the scheduler becomes a generic job registration service, pushing dataset-kind awareness to the API handler where request context lives.

  • Introduce JobDescriptor bridge type, decoupling the scheduler trait from typed worker crates and metadata-db storage
  • Rename schedule_dataset_sync_job to schedule_job accepting JobDescriptor
  • Move raw vs. derived job descriptor construction into the deploy handler
  • Shift amp-worker-datasets-{raw,derived} deps from controller to admin-api
  • Remove SerializeJobDescriptor error variant from scheduler
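
To make the new shape concrete, here is a minimal sketch of a scheduler that only registers pre-built descriptors. Only `schedule_job` and `JobDescriptor` are names from this PR; `Scheduler`, `JobId`, `InMemoryScheduler`, the synchronous infallible signature, and the string payload are assumptions for illustration. The real trait is likely async and fallible.

```rust
use std::cell::RefCell;

/// Type-erased job payload. Assumption: modelled here as opaque JSON text.
#[derive(Debug, Clone, PartialEq)]
pub struct JobDescriptor(String);

impl JobDescriptor {
    pub fn from_json(json: impl Into<String>) -> Self {
        Self(json.into())
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Hypothetical job identifier returned on registration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct JobId(pub u64);

/// Generic job registration: nothing dataset-kind specific in the signature.
pub trait Scheduler {
    fn schedule_job(&self, descriptor: JobDescriptor) -> JobId;
}

/// In-memory stand-in for the real scheduler (hypothetical).
pub struct InMemoryScheduler {
    jobs: RefCell<Vec<JobDescriptor>>,
}

impl InMemoryScheduler {
    pub fn new() -> Self {
        Self { jobs: RefCell::new(Vec::new()) }
    }

    /// Returns the descriptor stored for `id`, if any.
    pub fn stored(&self, id: JobId) -> Option<JobDescriptor> {
        self.jobs.borrow().get(id.0 as usize).cloned()
    }
}

impl Scheduler for InMemoryScheduler {
    fn schedule_job(&self, descriptor: JobDescriptor) -> JobId {
        // The scheduler stores whatever it is handed; it never inspects the kind.
        let mut jobs = self.jobs.borrow_mut();
        jobs.push(descriptor);
        JobId(jobs.len() as u64 - 1)
    }
}
```

Because the descriptor arrives already built, a scheduler shaped like this has no serialization step of its own, which is consistent with dropping the `SerializeJobDescriptor` variant.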

Note

Medium Risk
Touches the job scheduling interface and how job descriptors are constructed/stored, so mis-conversion or wrong dataset-kind selection could schedule incorrect jobs despite being largely a refactor.

Overview
Refactors job scheduling so the scheduler is a generic job-registration service: schedule_dataset_sync_job is replaced by schedule_job which accepts a pre-built JobDescriptor.

Dataset-kind specific descriptor construction (raw vs derived) is moved into the dataset deploy handler, and a new JobDescriptor bridge type is introduced to convert between typed worker descriptors and metadata-db JSON storage (including adding JobDescriptorRaw::into_inner). Dependencies are shifted accordingly (worker descriptor crates move from controller to admin-api), and the scheduler error surface is simplified by removing JSON serialization failures.
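
As a rough illustration of the storage side of that bridge, the sketch below converts between a scheduler-facing `JobDescriptor` and a `JobDescriptorRaw` storage wrapper. The names `JobDescriptor`, `JobDescriptorRaw`, and `into_inner` come from this PR; the JSON-string representation, the `from_json`, `new`, and `as_json` helpers, and the signature of `into_inner` are assumptions.

```rust
/// Stand-in for the metadata-db storage type.
/// Assumption: it wraps raw JSON text.
#[derive(Debug, Clone, PartialEq)]
pub struct JobDescriptorRaw(String);

impl JobDescriptorRaw {
    /// Hypothetical constructor from JSON text.
    pub fn from_json(json: String) -> Self {
        Self(json)
    }

    /// Consumes the wrapper and returns the stored JSON (signature assumed).
    pub fn into_inner(self) -> String {
        self.0
    }
}

/// Scheduler-facing bridge type; also JSON text in this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct JobDescriptor(String);

impl JobDescriptor {
    pub fn new(json: impl Into<String>) -> Self {
        Self(json.into())
    }

    pub fn as_json(&self) -> &str {
        &self.0
    }
}

/// Bridge -> storage: hand the payload to the database layer unchanged.
impl From<JobDescriptor> for JobDescriptorRaw {
    fn from(descriptor: JobDescriptor) -> Self {
        JobDescriptorRaw::from_json(descriptor.0)
    }
}

/// Storage -> bridge: unwrap the stored payload without copying it.
impl From<JobDescriptorRaw> for JobDescriptor {
    fn from(raw: JobDescriptorRaw) -> Self {
        JobDescriptor(raw.into_inner())
    }
}
```

Both directions move the payload rather than re-encoding it, so neither conversion can fail in this sketch.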

Written by Cursor Bugbot for commit 9936ef2.

@LNSD self-assigned this Feb 21, 2026
@LNSD force-pushed the lnsd/refactor-admin-api-job-kinds branch 5 times, most recently from 5c143ed to 17cc612, February 23, 2026 11:12
@LNSD marked this pull request as ready for review February 23, 2026 11:13

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

@shiyasmohd shiyasmohd left a comment

LGTM ✅

Comment on lines +150 to +167
let job_descriptor = if dataset.kind() == DerivedDatasetKind {
    MaterializeDerivedDatasetJobDescriptor {
        end_block: end_block.into(),
        dataset_namespace: reference.namespace().clone(),
        dataset_name: reference.name().clone(),
        manifest_hash: reference.hash().clone(),
    }
    .into()
} else {
    MaterializeRawDatasetJobDescriptor {
        end_block: end_block.into(),
        max_writers: parallelism,
        dataset_namespace: reference.namespace().clone(),
        dataset_name: reference.name().clone(),
        manifest_hash: reference.hash().clone(),
    }
    .into()
};

Can we move this logic into a JobDescriptor::materialise_dataset() fn?

@LNSD (author), Feb 23, 2026

Keeping this in the handler intentionally.

JobDescriptor is a thin, type-erased bridge. It shouldn't know about MaterializeRawDatasetJobDescriptor or MaterializeDerivedDatasetJobDescriptor internal details. Construction must use the actual typed worker descriptors at the call site where the dataset-kind context lives.

Eventually, these will be extracted directly from the request body, so the handler is the right place for this logic.
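
A simplified sketch of that split: the typed descriptors know how to turn themselves into the bridge type via `From`, and the handler-side function picks which one to build. The descriptor struct names, `end_block`, `max_writers`, and `dataset_name` mirror the snippet under review; the `DatasetKind` enum, `build_job_descriptor`, the reduced field set, the primitive field types, and the key/value map standing in for JSON are all assumptions.

```rust
use std::collections::BTreeMap;

/// Type-erased bridge. Assumption: a flat key/value map stands in for JSON.
#[derive(Debug, PartialEq)]
pub struct JobDescriptor(BTreeMap<&'static str, String>);

impl JobDescriptor {
    pub fn get(&self, key: &str) -> Option<&str> {
        self.0.get(key).map(String::as_str)
    }
}

/// Typed worker descriptors (fields reduced and simplified for the sketch).
pub struct MaterializeRawDatasetJobDescriptor {
    pub end_block: u64,
    pub max_writers: u16,
    pub dataset_name: String,
}

pub struct MaterializeDerivedDatasetJobDescriptor {
    pub end_block: u64,
    pub dataset_name: String,
}

impl From<MaterializeRawDatasetJobDescriptor> for JobDescriptor {
    fn from(d: MaterializeRawDatasetJobDescriptor) -> Self {
        let mut fields = BTreeMap::new();
        fields.insert("kind", "raw".to_string());
        fields.insert("dataset_name", d.dataset_name);
        fields.insert("end_block", d.end_block.to_string());
        fields.insert("max_writers", d.max_writers.to_string());
        JobDescriptor(fields)
    }
}

impl From<MaterializeDerivedDatasetJobDescriptor> for JobDescriptor {
    fn from(d: MaterializeDerivedDatasetJobDescriptor) -> Self {
        let mut fields = BTreeMap::new();
        fields.insert("kind", "derived".to_string());
        fields.insert("dataset_name", d.dataset_name);
        fields.insert("end_block", d.end_block.to_string());
        JobDescriptor(fields)
    }
}

/// Hypothetical dataset-kind tag available at the call site.
#[derive(Clone, Copy, PartialEq)]
pub enum DatasetKind {
    Raw,
    Derived,
}

/// Handler-side construction: the only place that branches on dataset kind.
pub fn build_job_descriptor(
    kind: DatasetKind,
    dataset_name: &str,
    end_block: u64,
    parallelism: u16,
) -> JobDescriptor {
    if kind == DatasetKind::Derived {
        MaterializeDerivedDatasetJobDescriptor {
            end_block,
            dataset_name: dataset_name.to_string(),
        }
        .into()
    } else {
        MaterializeRawDatasetJobDescriptor {
            end_block,
            max_writers: parallelism,
            dataset_name: dataset_name.to_string(),
        }
        .into()
    }
}
```

In this arrangement `JobDescriptor` exposes no constructor that names a dataset kind, so adding a new job kind means adding a typed descriptor and a `From` impl without touching the bridge's own API.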

Separate job descriptor construction from scheduling so the scheduler becomes a generic job registration service, pushing dataset-kind awareness to the API handler where request context lives.

- Introduce `JobDescriptor` bridge type decoupling the scheduler trait from typed worker crates and metadata-db storage
- Rename `schedule_dataset_sync_job` to `schedule_job` accepting `JobDescriptor`
- Move raw vs. derived job descriptor construction into the deploy handler
- Shift `amp-worker-datasets-{raw,derived}` deps from controller to admin-api
- Remove `SerializeJobDescriptor` error variant from scheduler

Signed-off-by: Lorenzo Delgado <lorenzo@edgeandnode.com>
@LNSD force-pushed the lnsd/refactor-admin-api-job-kinds branch from 17cc612 to 9936ef2, February 23, 2026 12:30
@LNSD merged commit e6af81f into main Feb 23, 2026
8 checks passed
@LNSD deleted the lnsd/refactor-admin-api-job-kinds branch February 23, 2026 12:44