2.8.0rc2: Bug fixes and documentation updates
Pre-release
What's Changed
- [2.8] Clear retained training result after accept events by @pcnudde in #4525
- [2.8] Align CLI Python support with 3.10-3.14 by @pcnudde in #4533
- [2.8] Add run-scoped tensor offload temp cleanup by @pcnudde in #4534
- [doc][2.8] Backport LLM-friendly docs outputs by @pcnudde in #4539
- [2.8] Fix job pods stuck in certain states by @IsaacYangSLA in #4527
- [2.8] Add parent pod python path by @IsaacYangSLA in #4526
- [2.8] Clarify simulator and API doc signposts [skip ci] by @YuanTingHsieh in #4535
- [2.8] Validate k8s deploy namespace by @YuanTingHsieh in #4529
- [2.8] Add warnings for missing study data mappings by @YuanTingHsieh in #4528
- [2.8] Avoid k8s pod name collisions across sites by @pcnudde in #4547
- [2.8] Warn on deploy prepare storage rewrites by @YuanTingHsieh in #4530
- [2.8] Remove Auto-FL example from release branch by @holgerroth in #4553
- [2.8] Refresh stale integration tests by @YuanTingHsieh in #4549
- [2.8] Fix job timeout status in list_jobs by @IsaacYangSLA in #4552
- [2.8] Update deploy prepare launcher docs by @YuanTingHsieh in #4538
- [2.8] Add clean_up parameter to recipe Run.get_result() by @YuanTingHsieh in #4550
- [2.8] Update brev user guide and scripts to follow deploy prepare command by @IsaacYangSLA in #4562
- [2.8] Clarify remove_client token cleanup semantics by @YuanTingHsieh in #4561
- [2.8] Fix _LogTailProducer EXECUTION_EXCEPTION ERROR at job startup by @nvidianz in #4558
- [2.8] Cyclic CIFAR10 example: set start_task_timeout=300 to avoid 10s… by @nvshaxie in #4567
- [2.8] ccwf swarm: fix SwarmServerConfig.min_clients default + start_t… by @nvshaxie in #4568
- [2.8] Respect visible GPUs in resource manager by @YuanTingHsieh in #4563
- [2.8] Fix StorageException after every job from empty SystemLogStreamer context by @nvidianz in #4559
- [2.8] Skip auth filter for cellnet protocol-level bye messages by @nvshaxie in #4569
- [2.8] Fix a potential issue on the server by @IsaacYangSLA in #4575
- [2.8] Fix Docker SJ workspace tmpfs permissions by @YuanTingHsieh in #4574
- [2.8] Reorg devops folder by @IsaacYangSLA in #4571
- [2.8] Fix poc cli 22 sec timeout issue by @IsaacYangSLA in #4579
- [2.8] Narrow client failure reporting by @YuanTingHsieh in #4576
- Backport job metadata path containment to 2.8 by @pcnudde in #4581
- [2.8] Fix tracking recipe integration test by @YuanTingHsieh in #4583
- [2.8] Add Flower run_config regression test by @holgerroth in #4588
- [2.8] Fix JobLogReceiver not registering in server-job subprocess by @nvidianz in #4586
- [2.8] Fix Auto-FedRL CIFAR10 paths by @holgerroth in #4589
- [2.8] Fix FedBPT CMA fobs registration by @holgerroth in #4590
- [2.8] Rework integration test CI layout by @YuanTingHsieh in #4585
- [2.8] Fix BioNeMo inference recipe by @holgerroth in #4591
- [2.8] Map running job delete to JOB_NOT_DONE by @YuanTingHsieh in #4596
- [2.8] Add premerge license check by @pcnudde in #4606
- [2.8] Fix PT persistor bootstrap from empty checkpoint by @holgerroth in #4602
- [2.8] Bind auth tokens to runtime origins by @pcnudde in #4605
- [2.8] Backport FLARE-2952 aborted job download race fix by @pcnudde in #4607
- [2.8] Use uv for pre-merge unit test installs by @pcnudde in #4617
- [2.8] Make K8s server service name configurable by @pcnudde in #4616
- [2.8] Cherry pick of PR 4608 (Mock more classes/functions) by @IsaacYangSLA in #4618
- [2.8] Fix session.list_job_components() error by @IsaacYangSLA in #4622
- [2.8] Backport job not found admin error normalization by @pcnudde in #4623
- [2.8] Fix aborted job status publication race by @YuanTingHsieh in #4613
- [2.8] Fix admin CLI login rejection handling by @pcnudde in #4624
- [2.8] Allow CellPipe stream aliases for auth binding by @pcnudde in #4627
- [2.8] Add FedBPT Job API entry point by @holgerroth in #4629
- [2.8] Update integration CI suite setup by @YuanTingHsieh in #4626
- [2.8] Fix XGBoost review diagnostics by @YuanTingHsieh in #4635
- [2.8] Add 2.8.0 release notes [skip ci] by @chesterxgchen in #4630
Full Changelog: 2.8.0rc1...2.8.0rc2