Upgrade Dockerfiles to Python 3.13 and handlers 2.0.0b8#433
Merged
Move do_relinquish() from except to finally block in both the BlockedJoin close path and the probe_pending() Closing path so the Broker is notified regardless of whether the Authority close RPC succeeds or fails. do_relinquish() is idempotent so the duplicate call in update_lease(CloseWait) is safe.
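The except-to-finally move can be sketched as follows. This is a minimal illustration, not the project's code: `close_and_relinquish`, `close_rpc`, and the `notified` set are hypothetical stand-ins for the close path, the Authority close RPC, and the Broker's view of relinquished reservations.

```python
def close_and_relinquish(close_rpc, do_relinquish, errors):
    """Attempt the close RPC, then relinquish whether or not it succeeded."""
    try:
        close_rpc()
    except Exception as exc:
        # Record the failure; the relinquish below still runs.
        errors.append(str(exc))
    finally:
        # Runs on both the success and the failure path.
        do_relinquish()


# Hypothetical Broker state; a set makes repeated notifications idempotent.
notified = set()


def failing_close():
    raise RuntimeError("authority unreachable")


errors = []
close_and_relinquish(failing_close, lambda: notified.add("rsv-1"), errors)
close_and_relinquish(lambda: None, lambda: notified.add("rsv-1"), errors)
```

Because the stand-in relinquish only adds to a set, the second call is harmless, which mirrors why a duplicate call is safe when the operation is idempotent.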
occupied_node_capacity() now defaults to start=now, end=now when no time range is specified. Previously, all Ticketed reservations (including future advance reservations) were counted as currently occupied, causing cores_allocated to exceed cores_capacity at sites like UCSD (657 > 640). Calendar/scheduling callers that pass explicit start/end are unaffected.
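A rough sketch of the new default, assuming reservations are plain dicts with `start`, `end`, and `cores` keys. The function name `occupied_cores`, the `now` parameter, and the data layout are hypothetical; the real `occupied_node_capacity()` signature is not shown in this PR.

```python
from datetime import datetime, timedelta, timezone


def occupied_cores(reservations, start=None, end=None, now=None):
    """Sum cores of reservations overlapping [start, end].

    When no range is given, the window collapses to the current instant,
    so future advance reservations are not counted as occupied.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    if start is None and end is None:
        start = end = now
    return sum(
        r["cores"]
        for r in reservations
        if r["start"] <= end and r["end"] >= start
    )


now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
reservations = [
    # Currently active reservation.
    {"start": now - timedelta(hours=1), "end": now + timedelta(hours=1), "cores": 64},
    # Future advance reservation.
    {"start": now + timedelta(days=1), "end": now + timedelta(days=2), "cores": 17},
]

current = occupied_cores(reservations, now=now)
windowed = occupied_cores(reservations, start=now, end=now + timedelta(days=3))
```

Callers that pass an explicit window, as the calendar code does, still see both reservations.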
…work into resource-calendar
…cate broker queries
When PeriodicProcessor starts a refresh, concurrent user requests were bypassing the cache and each sending expensive queries to the broker. Now non-forced requests return the stale cached model while a refresh is already in progress, eliminating redundant broker round-trips.
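The serve-stale decision can be expressed as a small predicate. This is only a sketch under assumptions: `should_query_broker` and its four flags are hypothetical names, and the handling of forced requests and of an empty cache is assumed rather than taken from the change.

```python
def should_query_broker(force, refresh_in_progress, has_cached_model, cache_expired):
    """Return True if this request should send its own query to the broker."""
    if force:
        # Assumption: forced requests always go to the broker.
        return True
    if not has_cached_model:
        # Assumption: with nothing cached there is nothing stale to serve.
        return True
    if refresh_in_progress:
        # A refresh is already running: serve the stale model instead.
        return False
    return cache_expired


# Non-forced request arriving while a refresh runs: served from cache.
during_refresh = should_query_broker(
    force=False, refresh_in_progress=True, has_cached_model=True, cache_expired=True
)
# Same request with no refresh running: triggers a broker query.
no_refresh = should_query_broker(
    force=False, refresh_in_progress=False, has_cached_model=True, cache_expired=True
)
```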
Add timeout-based recovery to BqmWrapper.can_refresh(): if refresh_in_progress has been True longer than refresh_interval_in_seconds, treat it as a failed refresh and allow a new one. This prevents the cache from freezing permanently when save() is never called due to an exception, hung broker query, or thread issue.
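A minimal sketch of the timeout recovery, assuming the wrapper records when a refresh started. `RefreshGate`, `begin_refresh`, `refresh_started_at`, and the injectable `clock` are hypothetical; only `can_refresh`, `save`, `refresh_in_progress`, and `refresh_interval_in_seconds` are names from the change description.

```python
import time


class RefreshGate:
    """Hypothetical stand-in for the refresh bookkeeping in BqmWrapper."""

    def __init__(self, refresh_interval_in_seconds, clock=time.monotonic):
        self.refresh_interval_in_seconds = refresh_interval_in_seconds
        self.clock = clock
        self.refresh_in_progress = False
        self.refresh_started_at = None

    def begin_refresh(self):
        self.refresh_in_progress = True
        self.refresh_started_at = self.clock()

    def save(self):
        # Normal completion path: clears the in-progress flag.
        self.refresh_in_progress = False
        self.refresh_started_at = None

    def can_refresh(self):
        if not self.refresh_in_progress:
            return True
        elapsed = self.clock() - self.refresh_started_at
        if elapsed > self.refresh_interval_in_seconds:
            # Stuck longer than one interval: treat as failed and recover.
            self.refresh_in_progress = False
            return True
        return False


# Fake clock so the example does not need to sleep.
fake_now = [0.0]
gate = RefreshGate(60, clock=lambda: fake_now[0])
gate.begin_refresh()          # save() is never called, simulating a hang
fake_now[0] = 30.0
blocked = gate.can_refresh()
fake_now[0] = 61.0
recovered = gate.can_refresh()
```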
In plug_produce_bqm_summary(), component allocations were looked up per-device and accumulated per type/model, causing workers with N devices of the same type/model to report N× the actual allocation. Split into two passes: first accumulate capacity per device, then set allocations once per type/model from DB query results—matching the pattern already used in plug_produce_bqm().
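The two-pass split can be illustrated with plain dicts. The function `summarize`, the device fields, and the `allocated_by_key` mapping are hypothetical; the mapping stands in for the DB query results keyed by type/model.

```python
from collections import defaultdict


def summarize(devices, allocated_by_key):
    """Accumulate capacity per device, then set allocation once per type/model."""
    summary = defaultdict(lambda: {"capacity": 0, "allocated": 0})

    # Pass 1: capacity is a per-device quantity, so it is summed.
    for dev in devices:
        summary[(dev["type"], dev["model"])]["capacity"] += dev["units"]

    # Pass 2: the allocation is already a per-type/model total, so it is
    # assigned once rather than added once per device.
    for key, entry in summary.items():
        entry["allocated"] = allocated_by_key.get(key, 0)

    return dict(summary)


# Four devices of the same type/model on one worker, two of them allocated.
devices = [{"type": "GPU", "model": "A30", "units": 1} for _ in range(4)]
result = summarize(devices, {("GPU", "A30"): 2})
```

Adding the allocation inside the first loop would have reported 8 allocated units here, which is the N× inflation the fix removes.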
Read-only methods (get_reservations, get_components, get_links, etc.) were not rolling back the session on exception. Since sessions are cached per thread, a failed query left PostgreSQL in a failed transaction state, causing all subsequent queries on the same thread to fail with InFailedSqlTransaction until the process was restarted.
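One way to picture the rollback fix is a context manager wrapped around each read. This is a sketch with a fake session so it runs without a database; `read_session` and `FakeSession` are hypothetical, and the fake only imitates the failed-transaction behavior described above.

```python
from contextlib import contextmanager


class FakeSession:
    """Imitates a per-thread cached session that sticks in a failed state."""

    def __init__(self):
        self.failed = False
        self.rollbacks = 0

    def query(self, sql):
        if self.failed:
            raise RuntimeError("InFailedSqlTransaction")
        if sql == "bad":
            self.failed = True
            raise ValueError("query error")
        return ["row"]

    def rollback(self):
        self.failed = False
        self.rollbacks += 1


@contextmanager
def read_session(session):
    """Roll back on any exception so the cached session stays usable."""
    try:
        yield session
    except Exception:
        session.rollback()
        raise


session = FakeSession()
try:
    with read_session(session) as s:
        s.query("bad")
except ValueError:
    pass

# Without the rollback this would raise InFailedSqlTransaction.
rows = session.query("good")
```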
- Base image: python:3.11.0 → python:3.12.12
- Create /opt/venv in all Dockerfiles to comply with PEP 668
- Update all python/pip paths to use /opt/venv/bin
- AM handlers: 1.9.1 → 2.0.0b2
- ansible: unversioned → 13.6.0
- fabric_fss_utils: 1.6.2 → 1.7.0
- fabrictestbed: 2.0.6 → 2.0.7
- requires-python: >=3.9 → >=3.12
- Version bump to 2.0.0b1
Install community.libvirt, openstack.cloud, and cisco.nso collections that are not bundled with ansible 13.6.0 but required by AM handlers.
paul-ruth approved these changes on May 20, 2026
Summary
- fabric-am-handlers to 2.0.0b8 (ansible 13.6.0 / ansible-core 2.20)
- openstack.cloud:<2.0.0 for compatibility with openstacksdk 0.61.0 on remote hosts
- community.libvirt ansible collection in Dockerfile-auth
- fabric-cf to 2.0.0b2

Test plan