Upgrade Dockerfiles to Python 3.13 and handlers 2.0.0b8 by kthare10 · Pull Request #433 · fabric-testbed/ControlFramework

kthare10 · 2026-05-20T16:15:14Z

Summary

Upgrade all Dockerfiles from Python 3.12.12 to 3.13.13
Bump fabric-am-handlers to 2.0.0b8 (ansible 13.6.0 / ansible-core 2.20)
Pin openstack.cloud:<2.0.0 for compatibility with openstacksdk 0.61.0 on remote hosts
Install community.libvirt ansible collection in Dockerfile-auth
Bump fabric-cf to 2.0.0b2

Test plan

VM create/delete tested on UKY site
Build and verify all Docker images (auth, broker, orchestrator, cf)
End-to-end slice provisioning test

Move do_relinquish() from except to finally block in both the BlockedJoin close path and the probe_pending() Closing path so the Broker is notified regardless of whether the Authority close RPC succeeds or fails. do_relinquish() is idempotent so the duplicate call in update_lease(CloseWait) is safe.
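The control flow described above can be sketched as follows. This is a minimal illustration, not the ControlFramework code: `Broker`, `relinquish`, and `close_reservation` are hypothetical stand-ins, and the set-based broker is only an assumption used to make the idempotency visible.

```python
class Broker:
    """Hypothetical stand-in for the Broker side of the protocol."""

    def __init__(self):
        self.relinquished = set()

    def relinquish(self, rid):
        # Adding to a set is idempotent, so a second call is harmless.
        self.relinquished.add(rid)


def close_reservation(rid, authority_close, broker):
    """Attempt the Authority close RPC and always notify the Broker."""
    try:
        authority_close(rid)
        return True
    except Exception:
        # The close RPC failed; report it, but do not skip the Broker.
        return False
    finally:
        # Runs on both the success and the failure path.
        broker.relinquish(rid)
```

With the notification in `finally`, a failing `authority_close` still results in the Broker being told, and calling `relinquish` again later for the same reservation changes nothing.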

occupied_node_capacity() now defaults to start=now, end=now when no time range is specified. Previously, all Ticketed reservations (including future advance reservations) were counted as currently occupied, causing cores_allocated to exceed cores_capacity at sites like UCSD (657 > 640). Calendar/scheduling callers that pass explicit start/end are unaffected.
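A small sketch of the windowing idea, assuming a reservation counts as occupying capacity when its interval overlaps the query window. The `Reservation` type, the `occupied_cores` name, the `now` parameter, and the core counts are all illustrative assumptions, not the actual `occupied_node_capacity()` signature.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Reservation:
    cores: int
    start: datetime
    end: datetime


def occupied_cores(reservations, start=None, end=None, now=None):
    """Sum cores of reservations overlapping [start, end].

    With no window given, the window collapses to the current instant,
    so future advance reservations are not counted.
    """
    now = now or datetime.now(timezone.utc)
    if start is None and end is None:
        start = end = now
    elif start is None or end is None:
        # Assumption for this sketch: a half-open window is rejected.
        raise ValueError("pass both start and end, or neither")
    return sum(r.cores for r in reservations
               if r.start <= end and r.end >= start)
```

Callers that pass an explicit `start` and `end` get the same overlap test over their own window, which matches the statement that calendar and scheduling callers are unaffected.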

…cate broker queries

When PeriodicProcessor starts a refresh, concurrent user requests were bypassing the cache and each sending expensive queries to the broker. Now non-forced requests return the stale cached model while a refresh is already in progress, eliminating redundant broker round-trips.
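The behaviour can be illustrated with a minimal cache sketch. `CachedModel`, `get`, and `fetch` are hypothetical names chosen for the example; the real cache also decides when a model is fresh enough, which is omitted here.

```python
import threading


class CachedModel:
    """Hypothetical cache that serves stale data during a refresh."""

    def __init__(self, model=None):
        self._lock = threading.Lock()
        self.model = model
        self.refresh_in_progress = False

    def get(self, fetch, force=False):
        with self._lock:
            # Someone else is already refreshing: hand back the stale
            # model instead of sending another query to the broker.
            if self.refresh_in_progress and not force and self.model is not None:
                return self.model
            self.refresh_in_progress = True
        try:
            fresh = fetch()  # the expensive broker query
            with self._lock:
                self.model = fresh
            return fresh
        finally:
            with self._lock:
                self.refresh_in_progress = False
```

Only forced requests, or requests arriving when no model has been cached yet, reach `fetch` while the flag is set.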

Add timeout-based recovery to BqmWrapper.can_refresh(): if refresh_in_progress has been True longer than refresh_interval_in_seconds, treat it as a failed refresh and allow a new one. This prevents the cache from freezing permanently when save() is never called due to an exception, hung broker query, or thread issue.
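A sketch of the timeout check, under the assumption that the wrapper records when a refresh started. `RefreshState`, `start_refresh`, and the injectable `clock` are hypothetical; only `can_refresh`, `save`, `refresh_in_progress`, and `refresh_interval_in_seconds` come from the commit message, and the real `can_refresh()` likely has further conditions.

```python
import time


class RefreshState:
    """Hypothetical, reduced model of the refresh bookkeeping."""

    def __init__(self, refresh_interval_in_seconds, clock=time.monotonic):
        self.refresh_interval_in_seconds = refresh_interval_in_seconds
        self.refresh_in_progress = False
        self._started_at = None
        self._clock = clock  # injectable so the timeout can be tested

    def start_refresh(self):
        self.refresh_in_progress = True
        self._started_at = self._clock()

    def save(self):
        # Normal completion path: clears the in-progress flag.
        self.refresh_in_progress = False
        self._started_at = None

    def can_refresh(self):
        if not self.refresh_in_progress:
            return True
        elapsed = self._clock() - self._started_at
        if elapsed > self.refresh_interval_in_seconds:
            # The refresh overran its interval: treat it as failed
            # and let a new one start.
            self.refresh_in_progress = False
            return True
        return False
```

If `save()` is never reached, the flag no longer blocks refreshes forever; it expires after one refresh interval.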

In plug_produce_bqm_summary(), component allocations were looked up per-device and accumulated per type/model, causing workers with N devices of the same type/model to report N× the actual allocation. Split into two passes: first accumulate capacity per device, then set allocations once per type/model from DB query results—matching the pattern already used in plug_produce_bqm().

Read-only methods (get_reservations, get_components, get_links, etc.) were not rolling back the session on exception. Since sessions are cached per thread, a failed query left PostgreSQL in a failed transaction state, causing all subsequent queries on the same thread to fail with InFailedSqlTransaction until the process was restarted.
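One way to picture the fix is a decorator that rolls back the cached session whenever a read fails. Everything here is a hypothetical sketch: `FakeSession` only imitates the "failed transaction" behaviour described above, and `rollback_on_error` and `Database` are illustrative names, not the project's API.

```python
import functools


class FakeSession:
    """Stand-in for a per-thread cached DB session (hypothetical)."""

    def __init__(self):
        self.in_failed_transaction = False
        self.rollbacks = 0

    def query(self, fail=False):
        if self.in_failed_transaction:
            # Imitates the state in which every later query fails.
            raise RuntimeError("current transaction is aborted")
        if fail:
            self.in_failed_transaction = True
            raise ValueError("bad query")
        return ["row"]

    def rollback(self):
        self.in_failed_transaction = False
        self.rollbacks += 1


def rollback_on_error(method):
    """Roll back the session if the wrapped method raises."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        try:
            return method(self, *args, **kwargs)
        except Exception:
            self.session.rollback()
            raise
    return wrapper


class Database:
    def __init__(self, session):
        self.session = session

    @rollback_on_error
    def get_reservations(self, fail=False):
        return self.session.query(fail=fail)
```

After a failed call the original exception still propagates, but the next query on the same session succeeds because the transaction was rolled back.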

- Base image: python:3.11.0 → python:3.12.12
- Create /opt/venv in all Dockerfiles to comply with PEP 668
- Update all python/pip paths to use /opt/venv/bin
- AM handlers: 1.9.1 → 2.0.0b2
- ansible: unversioned → 13.6.0
- fabric_fss_utils: 1.6.2 → 1.7.0
- fabrictestbed: 2.0.6 → 2.0.7
- requires-python: >=3.9 → >=3.12
- Version bump to 2.0.0b1

kthare10 added 20 commits April 9, 2026 13:33

Merge branch 'main' of https://github.com/fabric-testbed/ControlFrame…

542acf8

…work into resource-calendar

Add over-allocation warning logging to BQM summary and graph paths

d5ba99c

extend slivers

0fb48c2

update export

656b210

Add ansible-galaxy collection installs and bump handlers to 2.0.0b3

dc69aae

Install community.libvirt, openstack.cloud, and cisco.nso collections that are not bundled with ansible 13.6.0 but required by AM handlers.

Bump AM handlers to 2.0.0b4 for ansible-core 2.20 callback fix

2775336

Bump AM handlers to 2.0.0b5

8ab5df9

Pin openstack.cloud<2.0.0 for compatibility with openstacksdk 0.61.0 …

b6240b6

…on remote hosts

Remove cisco.nso collection, bump handlers to 2.0.0b6

ec1e976

Upgrade all Dockerfiles to Python 3.13.13 and bump handlers to 2.0.0b7

61e7e83

Bump to 2.0.0b2 and update handlers to 2.0.0b8

695ab7f

Add cisco.nso collection to Dockerfile-auth for nso_config FQCN

ef34e7b

Bump handlers to 2.0.0b9 in Dockerfile-auth

9599c78

kthare10 requested a review from paul-ruth May 20, 2026 16:26

kthare10 self-assigned this May 20, 2026

paul-ruth approved these changes May 20, 2026

kthare10 merged commit 93fdd0f into main May 20, 2026
4 checks passed

kthare10 deleted the resource-calendar branch May 20, 2026 18:43

kthare10 merged 20 commits into main from resource-calendar
