[ML] Migrate to Backstage for BuildKite pipeline definitions #2497

Merged
edsavage merged 2 commits into elastic:main from edsavage:buildkite_backstage_migration on May 31, 2023

Conversation

Contributor

@edsavage edsavage commented May 30, 2023

Add a "catalog-info.yaml" file that contains definitions of the ml-cpp BuildKite pipelines. These will take precedence over those defined in the CI repo. Once the catalog is merged to main and detected by the CI Backstage system, the old ml-cpp* pipeline definition files in the CI repo can be deleted.

Also note that the specification of the pipeline files has been changed to refer to the Python scripts directly. This takes advantage of the fact that the CI infrastructure will run such executable scripts if detected.
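
For illustration, an executable pipeline script of this kind might look like the minimal sketch below. It assumes the CI infrastructure runs the script and treats whatever it prints to stdout as the pipeline definition; that mechanism is an assumption, as are the function name `build_pipeline`, the step label, and the command. None of them are taken from the ml-cpp repository.

```python
#!/usr/bin/env python3
"""Hypothetical pipeline generator: prints a Buildkite pipeline as JSON."""
import json
import sys


def build_pipeline(branch):
    """Return a pipeline definition as a plain dict.

    The single step below is a placeholder; a real script would
    generate the actual build and test steps for the branch.
    """
    return {
        "steps": [
            {
                "label": f"Build ({branch})",
                "command": "echo 'placeholder build step'",
            }
        ]
    }


def main():
    # Assumption: the CI system consumes the pipeline from stdout.
    json.dump(build_pipeline("main"), sys.stdout, indent=2)
    sys.stdout.write("\n")


if __name__ == "__main__":
    main()
```

Generating the pipeline from a script rather than a static file lets the steps vary with the branch or build type, at the cost of the script needing to be marked executable in the repository.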

@edsavage
Contributor Author

retest

@droberts195 droberts195 left a comment

LGTM

Comment thread catalog-info.yaml Outdated
branch: '7.17'
cronline: 30 03 * * *
message: Daily SNAPSHOT build for 7.17
Daily 8_7:

8.7 can be deleted now that 8.8 is released.

@edsavage
Contributor Author

Merging this now, without waiting for Jenkins CI, as these changes by their nature do not relate to Jenkins builds.

@edsavage edsavage merged commit f58cd7f into elastic:main May 31, 2023
@edsavage edsavage deleted the buildkite_backstage_migration branch June 2, 2023 12:36
edsavage added a commit to edsavage/ml-cpp that referenced this pull request Apr 12, 2026
Build elastic#2497 timed out because a HuggingFace model download stalled at
0% for 58 minutes (unauthenticated rate limiting). Two fixes:

1. Add HF_TOKEN injection to the validation step via post-checkout hook,
   reading from vault (secret/ci/elastic-ml-cpp/huggingface/hf_token).
   Authenticated requests get higher rate limits and more reliable
   downloads from HuggingFace Hub.

2. Add per-model timeout (default 10 minutes, configurable via
   --model-timeout) using SIGALRM. Models that can't be downloaded
   and traced within the timeout are skipped rather than consuming
   the entire step timeout. This prevents a single stalled download
   from failing the whole validation run.

Made-with: Cursor
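
The per-model timeout described in the commit message above can be sketched as follows. This is a minimal illustration and not the code from that commit: `ModelTimeout`, `time_limit`, `validate_all`, and `validate_one` are hypothetical names. Only the SIGALRM mechanism, the `--model-timeout` flag, and the 10 minute default come from the message. Note that SIGALRM is available on Unix only and that the handler must be installed from the main thread.

```python
import argparse
import signal
import time
from contextlib import contextmanager


class ModelTimeout(Exception):
    """Raised when one model exceeds its time budget (hypothetical name)."""


@contextmanager
def time_limit(seconds):
    """Raise ModelTimeout if the enclosed block runs longer than `seconds`."""
    def _on_alarm(signum, frame):
        raise ModelTimeout(f"exceeded {seconds}s")

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)  # whole seconds only
    try:
        yield
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


def validate_all(models, validate_one, timeout_s=600):
    """Run validate_one(name) for each model, skipping any that time out."""
    passed, skipped = [], []
    for name in models:
        try:
            with time_limit(timeout_s):
                validate_one(name)
            passed.append(name)
        except ModelTimeout:
            skipped.append(name)  # skip instead of failing the whole run
    return passed, skipped


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--model-timeout", type=int, default=600,
                        help="per-model timeout in seconds")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()

    def fake_validate(name):
        # Stand-in for download + trace; "stalled" simulates a hung download.
        if name == "stalled":
            time.sleep(args.model_timeout + 5)

    print(validate_all(["ok", "stalled"], fake_validate, args.model_timeout))
```

Because the alarm is cancelled and the previous handler restored in the `finally` clause, a model that finishes early does not leave a stray alarm to interrupt the next one.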