Contributor

@Camyll Camyll commented Mar 11, 2025

Addresses issue pytorch/pytorch#143041

Adds a scaleupchron job that checks the queue for jobs that have been queued for a long time and directly calls scale-up for them.

Cherry-picked from Zain's original PR #6018.
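
The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `QueuedJobs` record shape, the function names `select_for_scale_up` and `run_once`, and the use of an inclusive (`>=`) threshold are all assumptions made for the example. Only the 30-minute value comes from the `SCALE_UP_MIN_QUEUE_TIME_MINUTES` setting in the diff.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Mirrors SCALE_UP_MIN_QUEUE_TIME_MINUTES = 30 from the diff in this PR.
MIN_QUEUE_TIME_MINUTES = 30


@dataclass
class QueuedJobs:
    """Hypothetical record shape; the real API response is not shown in this PR."""
    runner_label: str
    num_queued_jobs: int
    max_queue_time_minutes: int


def select_for_scale_up(
    records: Iterable[QueuedJobs],
    min_queue_minutes: int = MIN_QUEUE_TIME_MINUTES,
) -> List[QueuedJobs]:
    """Keep only records whose longest-waiting job meets the threshold.

    The inclusive comparison is an assumption for this sketch.
    """
    return [r for r in records if r.max_queue_time_minutes >= min_queue_minutes]


def run_once(
    records: Iterable[QueuedJobs],
    scale_up: Callable[[QueuedJobs], None],
) -> int:
    """Call scale_up once per long-queued record and return the number of calls."""
    selected = select_for_scale_up(records)
    for record in selected:
        scale_up(record)
    return len(selected)


if __name__ == "__main__":
    # Illustrative data only; the runner labels are made up.
    sample = [
        QueuedJobs("runner-a", num_queued_jobs=4, max_queue_time_minutes=45),
        QueuedJobs("runner-b", num_queued_jobs=2, max_queue_time_minutes=10),
    ]
    requested: List[QueuedJobs] = []
    run_once(sample, requested.append)
    print([r.runner_label for r in requested])
```

Passing the scale-up action in as a callable keeps the filtering logic testable without touching any real scaling service.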


vercel bot commented Mar 11, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment

| Name | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| torchci | ⬜️ Ignored (Inspect) | Visit Preview | Mar 11, 2025 6:12pm |

@facebook-github-bot added the CLA Signed label Mar 11, 2025
SCALE_CONFIG_REPO = var.scale_config_repo
SCALE_CONFIG_REPO_PATH = var.scale_config_repo_path
SCALE_UP_MIN_QUEUE_TIME_MINUTES = 30
SCALE_UP_RECORD_QUEUE_URL = "https://hud.pytorch.org/api/clickhouse/queued_jobs_aggregate?parameters=%5B%5D"
Contributor Author

TODO: need to get actual URL from Zain

@Camyll force-pushed the camyllh/scale_up_chron branch from a4d879b to 1ab676a on March 11, 2025 17:49
@Camyll force-pushed the camyllh/scale_up_chron branch from 1ab676a to d53b9bc on March 11, 2025 18:12
@Camyll closed this May 5, 2025