
Sync master into feature/trusted-certs #6903

Merged
changlei-li merged 107 commits into feature/trusted-certs from master
Feb 10, 2026

Conversation

@changlei-li
Contributor

No description provided.

Stephen Cheng and others added 30 commits January 15, 2026 10:33
DNF5 logs command-line arguments to /var/log/dnf5.log, exposing
proxy_password when passed via `dnf config-manager setopt`.

Write proxy credentials directly to the .repo file (mode 0o400) and
remove them after sync completes to avoid password exposure in logs.

Signed-off-by: Stephen Cheng <stephen.cheng@citrix.com>
Signed-off-by: Stephen Cheng <stephen.cheng@citrix.com>
Signed-off-by: Lin Liu <lin.liu01@citrix.com>
…ne (#6836)

DNF5 logs command-line arguments to /var/log/dnf5.log, exposing
proxy_password when passed via `dnf config-manager setopt`.

Write proxy credentials directly to the .repo file (mode 0o400) and
remove them after sync completes to avoid password exposure in logs.
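
For illustration, the shape of the fix might look like this sketch (the helper names and config layout are assumptions, not the actual xapi code):
```
(* Hedged sketch of the approach above, not the actual xapi code:
   write the proxy credentials into the .repo file (created with mode
   0o400), run the sync, then rewrite the file without them, so the
   password never appears on the dnf command line or in dnf5.log.
   The 0o400 mode only applies when the file is created. *)
let sync_with_proxy ~repo_file ~base_config ~proxy_creds ~do_sync =
  let write contents =
    let oc =
      open_out_gen [Open_wronly; Open_creat; Open_trunc] 0o400 repo_file
    in
    Fun.protect ~finally:(fun () -> close_out oc) (fun () ->
        output_string oc contents)
  in
  write (base_config ^ proxy_creds) ;
  (* strip the credentials again even if the sync fails *)
  Fun.protect ~finally:(fun () -> write base_config) do_sync
```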

Tested:
4530865 CloudRepoUpdate
With the fix, there's no password.
```
# grep password /var/log/dnf5*
[root@genus-34-01d ~]#
```

4530867 CloudRepoUpdate
Without the fix, there are passwords in the logs.
```
# grep password /var/log/dnf5*
/var/log/dnf5.log:2026-01-14T09:44:42+0000 [33030] INFO --- DNF5 launched with arguments: "/usr/bin/dnf config-manager setopt remote-49e2ea36-154c-9fd4-51bb-4b319d25d05a.proxy=http://10.62.50.94:3128 remote-49e2ea36-154c-9fd4-51bb-4b319d25d05a.proxy_username=debian remote-49e2ea36-154c-9fd4-51bb-4b319d25d05a.proxy_password=JwNm8I2rAxlB" ---
/var/log/dnf5.log.1:2026-01-14T09:44:42+0000 [33023] INFO --- DNF5 launched with arguments: "/usr/bin/dnf config-manager setopt remote-9b300a0d-765a-3426-5ad2-0384bc427c28.proxy=http://10.62.50.94:3128 remote-9b300a0d-765a-3426-5ad2-0384bc427c28.proxy_username=debian remote-9b300a0d-765a-3426-5ad2-0384bc427c28.proxy_password=JwNm8I2rAxlB" ---
```
This patch adds a helper to compute the free space on an SR. It is used to
check that the suspend SR has enough space when creating a snapshot with
memory. If there is not enough space, SR_SUSPEND_SPACE_INSUFFICIENT is raised.

Signed-off-by: Guillaume <guillaume.thouvenin@vates.tech>
If you do not know the amount of space required when calling this
function, you can pass None.

This patch allows us to detect, before attempting to save the VM state,
that there is not enough available space on the SR.
We have seen failures for some customers when snapshotting with memory,
and it was not easy to determine that the SR did not have enough space,
because it was not the SR on which the VM was resident. With this patch,
the error is clearer and is raised much earlier.
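
A rough sketch of what such a check looks like (illustrative types and exception name, not the actual xapi definitions):
```
(* Hedged sketch of the space check described above; the record and
   exception here are illustrative, not the actual xapi definitions.
   Passing [None] means the required amount is unknown, so nothing is
   checked. *)
exception Sr_suspend_space_insufficient of string

type sr = {uuid: string; physical_size: int64; physical_utilisation: int64}

let free_space sr = Int64.sub sr.physical_size sr.physical_utilisation

let assert_enough_space sr = function
  | None -> () (* required amount unknown: nothing to check *)
  | Some required ->
      if free_space sr < required then
        raise (Sr_suspend_space_insufficient sr.uuid)
```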
Adding a new CPU RRD metric, "numa_node_nonaffine_vcpus", per domain:
the fraction of vCPU time spent running outside of the vCPU's affinity.

Signed-off-by: Marcus Granado <marcus.granado@cloud.com>
Merge branch 'cp-310822-upstream-patchqueue-1-rrd3' into cp-310822-upstream-patchqueue

Signed-off-by: Marcus Granado <marcus.granado@cloud.com>
Now the numa node needs to be passed. A special value of (~0U) is used to
signify that no node is meant to be used. Since this is arch-dependent,
and contained in a long on x86_64, an int is used to encode the value.

Also remove the exception that was guarding the codepath that uses this case.

Signed-off-by: Pau Ruiz Safont <pau.ruizsafont@cloud.com>
Signed-off-by: Christian Lindig <christian.lindig@citrix.com>
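
For illustration, the OCaml side of that encoding might look like this (a sketch; the real binding is a C stub in xenctrl-ext):
```
(* Hedged sketch of the sentinel encoding: (~0U) on the C side is
   represented as -1 on the OCaml side, which the stub turns back into
   an all-ones unsigned value. Names are illustrative. *)
let no_numa_node = -1

let node_arg = function
  | None -> no_numa_node (* no node is meant to be used *)
  | Some node -> node
```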
…rsion-in-xen-4.21-wi

Merge branch 'cp-310822-upstream-patchqueue-2-cp-53658' into cp-310822-upstream-patchqueue

Signed-off-by: Marcus Granado <marcus.granado@cloud.com>
…odes (CA-411684)

Free memory is now properly accounted for because the memory pages are claimed
within the NUMA mutex, so there's no need to have double tracking.

On top of that, this code never increased the free memory, which means that it
always reached a point where it was impossible to allocate a domain into a
single numa node.

Signed-off-by: Pau Ruiz Safont <pau.ruizsafont@cloud.com>
…k-of-free-memory-when

Merge branch 'cp-310822-upstream-patchqueue-3-ca-411684' into cp-310822-upstream-patchqueue

Signed-off-by: Marcus Granado <marcus.granado@cloud.com>
Signed-off-by: Marcus Granado <marcus.granado@cloud.com>
…node

Signed-off-by: Marcus Granado <marcus.granado@cloud.com>
Signed-off-by: Marcus Granado <marcus.granado@cloud.com>
Merge branch 'cp-310822-upstream-patchqueue-4-cp-54238' into cp-310822-upstream-patchqueue

Signed-off-by: Marcus Granado <marcus.granado@cloud.com>
Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
This avoids a link error on systems that don't have this function:

/usr/bin/ld:
ocaml/libs/xenctrl-ext/libxenctrl_ext_stubs.a(xenctrlext_stubs.o): in
function `stub_xenctrlext_domain_claim_pages':
xen-api/_build/default/ocaml/libs/xenctrl-ext/xenctrlext_stubs.c:688:
undefined reference to `xc_domain_claim_pages_node'

Set errno to ENOSYS if not defined, to keep behaviour
consistent with other stubs in this file.

Keep CAMLparam4 in both cases to avoid unused-parameter warnings.

Signed-off-by: Marcus Granado <marcus.granado@cloud.com>
Signed-off-by: Marcus Granado <marcus.granado@cloud.com>
Signed-off-by: Marcus Granado <marcus.granado@cloud.com>
The target pool has left AD, and the joining host leaves AD as well.
However, the AD status is somehow corrupt:
- external_auth_type is empty, which is expected
- external_auth_service_name is still a valid domain

This confuses pool.join, which thinks AD is not enabled even though the
host is somehow still joined to a domain.

- A normal domain leave does not resolve the issue, and it does not
join the domain
- Joining the domain again (which fails) does not resolve it either, as
xapi restores the pre-join values when the join fails

This commit introduces a force option to the host.disable_external_auth
API to force a cleanup and recover the host.

Note that the current code already tries to keep these fields
consistent, but not atomically.

Signed-off-by: Lin Liu <lin.liu01@citrix.com>
This PR contains the same patches for XS9 that were present in the
patchqueue.

I organised the PR so that each merge corresponds to an entry in the
patchqueue:
* 0004-rrd3.patch
* 0003-CP-53658-adapt-claim_pages-to-version-in-xen-4.21-wi.patch
* 0005-xenopsd-xc-do-not-try-keep-track-of-free-memory-when.patch
* 0005-rrd4.patch

The behaviour is unchanged compared to the patchqueue (the code is the
same).

Tested with BVT 233166.
Verified that the RRD metrics in the patches are preserved as expected,
using:
```
# xe vm-list params=all| grep numa
                        numa-optimised ( RO): true
                            numa-nodes ( RO): 1
                      numa-node-memory (MRO): 0: 2149572608; 1: 0

# rrd2csv AVERAGE:::numa_nodes
timestamp, AVERAGE:vm:5d9459d1-7209-ee36-e404-c6b537aa2cd3:numa_nodes
2026-01-19T13:30:15Z, 1
2026-01-19T13:30:20Z, 1
^C
# rrd2csv AVERAGE:::memory_numa_node
timestamp, AVERAGE:vm:5d9459d1-7209-ee36-e404-c6b537aa2cd3:memory_numa_node_1, AVERAGE:vm:5d9459d1-7209-ee36-e404-c6b537aa2cd3:memory_numa_node_0
2026-01-19T13:30:25Z, 0, 2149572608
2026-01-19T13:30:30Z, 0, 2149572608
2026-01-19T13:30:35Z, 0, 2149572608
^C
```
When a CDR is removed from an ISO SR the corresponding VDI is deleted.
So far we relied on the DB GC to mark the VBD as empty. This creates a
window for a race where the VDI/CD is reported as present when in fact
it is not. So mark the VBD as empty as early as possible.

Signed-off-by: Christian Lindig <christian.lindig@citrix.com>
When a CDR is removed from an ISO SR the corresponding VDI is deleted.
So far we relied on the DB GC to mark the VBD as empty. This creates a
window for a race where the VDI/CD is reported as present when in fact
it is not. So mark the VBD as empty as early as possible.
Also improve the logging a little (e.g. log "suspend" rather than
"shutdown" when suspending).

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
In the event we add a package to one of the yum repository groups that
has no packages requiring it, dnf5 will not install it by default (as it
has not implemented the `upgrade_group_objects_upgrade` configuration
option - see
https://dnf5.readthedocs.io/en/latest/dnf5.conf-todo.5.html).

We do not wish to rely on ensuring that all new packages are required by
an existing package, so (at least until the above option is implemented
in dnf5) we trigger a `group upgrade` in addition to the `upgrade`.

Signed-off-by: Alex Brett <alex.brett@citrix.com>
The error can be produced while doing a suspend, not only on
checkpoints. Remove the last sentence so it applies in both cases.

Signed-off-by: Pau Ruiz Safont <pau.safont@vates.tech>
Signed-off-by: Alex Brett <alex.brett@citrix.com>
liulinC and others added 25 commits February 3, 2026 06:10
… for master branch (#6875)

Some XCP-ng customers have observed xapi recording duplicate XOStore SM
backends.

In these cases the `host_pending_feature` maps of the two SM objects
recorded by xapi were different: one was empty, and the other had an
empty entry for each of the hosts in the pool.

This PR tries to address this by detecting the latter case and
transforming it into the former. It also changes the startup process,
where the processing of the SM objects could previously miss updating
some of them; it now handles SM objects and SM types separately, to
avoid missing as many cases as possible.

Since I've been unable to reproduce the duplicate issue, code has also
been added to detect and log duplicates once the startup processing has
finished, to better identify when the situation happens.

Port of #6873 to master
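
A minimal sketch of that normalisation, assuming the map is represented as a host/features association list (the real xapi types differ):
```
(* Hedged sketch: treat a host_pending_feature map consisting solely of
   empty entries the same as an empty map. The association-list
   representation is an assumption for illustration. *)
let normalise_pending_features = function
  | m when List.for_all (fun (_host, features) -> features = "") m -> []
  | m -> m
```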
Unfortunately, the SR size and utilisation fields are only set after an
SR scan, and the new space check has been incorrectly rejecting suspends
in some cases.

Fix this by just catching the error from `VDI.create` and raising the
new error to make it clear that it is the suspend SR that is out of
space.

Fixes 77c6bf3.

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
…y in node for new VM (#6867)

The available memory in a node that can actually still be claimed for a
new VM needs to be calculated as (meminfo.memfree - meminfo.claimed).
Considering only meminfo.memfree ignores VMs that may have claimed
memory in the node but not yet finished allocating the claimed amount.

A new debug line during the calculation makes it easier to understand
the different node memory values being considered and calculated for the
VMs being created, e.g.:
```
2026-01-27T00:03:44.788754+00:00 orca xenopsd-xc: [debug||22 |Async.VM.start_on R:4533f11cb6bf|xenops]
mem_claimable_for_new_vm: NUMA nodeid=0, domid=69: memfree=281696272384 memsize=825707462656 claimed=249459507200: available=32236765184
2026-01-27T00:03:44.788770+00:00 orca xenopsd-xc: [debug||22 |Async.VM.start_on R:4533f11cb6bf|xenops]
mem_claimable_for_new_vm: NUMA nodeid=1, domid=69: memfree=336155111424 memsize=824633720832 claimed=303146598400: available=33008513024
...
```
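
The arithmetic in these debug lines can be checked directly. A minimal sketch, with field names mirroring the logged values:
```
(* Hedged sketch of the calculation: memory still claimable for a new
   VM is the node's free memory minus what other VMs have already
   claimed but not yet finished allocating. *)
type node_meminfo = {memfree: int64; claimed: int64}

let mem_claimable_for_new_vm m = Int64.sub m.memfree m.claimed

(* matches the first debug line above:
   281696272384 - 249459507200 = 32236765184 *)
let () =
  assert (
    mem_claimable_for_new_vm {memfree= 281696272384L; claimed= 249459507200L}
    = 32236765184L )
```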

Tested with:
* SR-233407 (xs9 numa functional regression test)
* job 4541679, where this change calculates the correct nodes to start
the VMs on a host with 8 nodes and high contention of VMs starting
simultaneously.
rrdp-squeezed currently has a threshold of >4096 pages to count the
number of nodes. This is to ignore the small number of internal data
structures that xen or other kernel devices may sometimes allocate for
the VM outside the node where the VM's main memory is allocated.

This is a temporary fix until we account, in CP-311303, for these small
numbers of pages that sometimes appear outside the VM's main node. In
experiments, it is usually a single-digit number like 1; the maximum
observed was around ~2200 pages.

Without this fix, the VM.numa_nodes calculation differs from the one
returned by rrdp-squeezed, and VM.numa_nodes is over-sensitive to these
small numbers of pages in other nodes.

Signed-off-by: Marcus Granado <marcus.granado@cloud.com>
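
For illustration, the thresholded count amounts to something like this sketch (per-node page counts as an assumed input):
```
(* Hedged sketch of the thresholded node count: nodes holding no more
   than 4096 pages for the VM are ignored, so stray xen-internal
   allocations don't inflate VM.numa_nodes. *)
let threshold_pages = 4096

let numa_nodes pages_per_node =
  List.length
    (List.filter (fun pages -> pages > threshold_pages) pages_per_node)
```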
This is an optimisation for the calculation of the number of nodes until
we fix all the small xen internal allocations for the VM outside the
main guest's memory. This synchronises the behaviour with the RRD
VM.numa_nodes calculation.

Signed-off-by: Edwin Török <edwin.torok@citrix.com>
Signed-off-by: Edwin Török <edwin.torok@citrix.com>
…ng built

When a domain takes a long time to be built (e.g. one with >1TiB of memory) then squeezed might run and attempt to change maxmem, causing the domain build to fail to complete.

```
2026-02-04T13:59:12.915844+00:00 orca squeezed: [debug||9 ||squeeze_xen] Xenctrl.domain_setmaxmem domid=717 max=6370254848 (was=0)
2026-02-04T13:59:22.878301+00:00 orca squeezed: [debug||3 ||squeeze_xen] Xenctrl.domain_setmaxmem domid=717 max=2075287552 (was=6370254848)
```

Squeezed shouldn't change the maxmem setting on domains that have never been run (other than to initialize it if 0).
In fact another module in Squeezed had code to detect whether a domain has ever
been run, which has been replaced with checking whether it has an active
balloon driver (if it hasn't reported a balloon driver it is still not very
safe to change it too early).

But that check missed one place that was still setting maxmem, ignoring the
balloon driver's presence. Fix this (hopefully last!) place: if there is no
balloon driver and we attempt to decrease maxmem then just log a message
instead.

Fixes: 9819bdb ("CA-32810: prevent the memory ballooning daemon capping a domain's memory usage before it has written feature-balloon.")

Signed-off-by: Edwin Török <edwin.torok@citrix.com>
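
A minimal sketch of the guard being added (illustrative names, not the actual squeezed code):
```
(* Hedged sketch: never lower maxmem on a domain that has not yet
   reported an active balloon driver; just log a message instead.
   Names and types are illustrative. *)
let maybe_set_maxmem ~has_balloon_driver ~current ~requested set =
  if (not has_balloon_driver) && requested < current then
    Printf.eprintf
      "no balloon driver yet; not lowering maxmem from %Ld to %Ld\n"
      current requested
  else set requested
```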
When calling `VDI.copy` or `VDI.pool_migrate` with the `vm_power_admin` role,
xapi may forward the operation to a remote host. In this case, xapi creates
a pool session on the remote host and creates a new task. When the
operation completes, `try_internal_async` uses the user's session to
destroy the task that was created by an internal pool session, but the
user doesn't have the permission to destroy another user's task
(task.destroy/any), so it fails.

Solution:
This is an internal cleanup operation, so it doesn't need user RBAC
restrictions or checks. Ignore RBAC when destroying internal tasks by
calling Db_actions.DB_Action.Task.destroy directly.

Signed-off-by: Bengang Yuan <bengang.yuan@citrix.com>
This will be reused by some test code.

Signed-off-by: Edwin Török <edwin.torok@citrix.com>
Got some `String.blit` exceptions, which could've happened if the calculated
time was negative, and thus exceeded 10 digits when printed with `eta`.

Use monotonic time instead, so that we don't crash when the clock is adjusted.

Signed-off-by: Edwin Török <edwin.torok@citrix.com>
The ETA is copied into a preallocated bytes buffer of fixed length.
When the ETA exceeds 99h, drop the seconds and report just `hh:mm`.
If it still doesn't fit, replace it with the static string '++:++:++'
(this would only happen if the ETA is >11 years, although it could also
happen if an operation became stuck; the operation could still recover,
and then the ETA would print a number again).

This could also happen if a task has made near-zero progress by the time
we attempt to print it, which would result in a near-infinite ETA.

Signed-off-by: Edwin Török <edwin.torok@citrix.com>
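
A sketch of the formatting rule described above, assuming an 8-character field (the exact widths are assumptions):
```
(* Hedged sketch of the ETA formatting for a fixed 8-character field:
   hh:mm:ss normally; past 99h drop the seconds; if even that cannot
   fit, fall back to a static placeholder. *)
let format_eta seconds =
  let seconds = max 0 seconds in (* monotonic time should prevent this *)
  let h = seconds / 3600
  and m = seconds mod 3600 / 60
  and s = seconds mod 60 in
  if h <= 99 then Printf.sprintf "%02d:%02d:%02d" h m s
  else if h <= 99_999 then Printf.sprintf "%5d:%02d" h m (* drop seconds *)
  else "++:++:++" (* ETA absurdly large, e.g. a stuck operation *)
```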
Currently the total time is printed as `hh:mm:ss`, which will be
00:00:00 if an operation takes <1s. That is confusing when an operation
takes, for example, 0.9s.

Use Mtime.Span.pp instead, which prints values at an appropriate scale
based on their magnitude.

Signed-off-by: Edwin Török <edwin.torok@citrix.com>
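
For illustration, measuring and printing with the mtime library looks roughly like this:
```
(* Hedged sketch: measure with a monotonic counter and print via
   Mtime.Span.pp, which picks a unit appropriate to the magnitude
   (e.g. "900ms" rather than "00:00:00"). Requires the mtime package. *)
let () =
  let counter = Mtime_clock.counter () in
  Unix.sleepf 0.9 ; (* stands in for the real work *)
  Format.printf "Total time %a@." Mtime.Span.pp (Mtime_clock.count counter)
```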
I was trying to debug a Xenctrl exception in xenopsd, but all I was
seeing was the exception, not the location where it was raised.
With these 2 changes I was able to see the full stacktrace.

Here is what a working stacktrace looks like:
```
 Raised Server_error(INTERNAL_ERROR, [ xenopsd internal error: Domain.Domain_build_pre_failed("Calling 'NUMA placement' failed: Xenctrlext.Unix_error(12, \\"Error when trying to claim memory pages\\")")
 1/60 xenopsd-xc Raised at file ocaml/xenopsd/xc/domain.ml, line 1159
 2/60 xenopsd-xc Called from file ocaml/xenopsd/xc/domain.ml, line 1210
 3/60 xenopsd-xc Called from file ocaml/xenopsd/xc/domain.ml, line 1441
 4/60 xenopsd-xc Called from file ocaml/xenopsd/xc/xenops_server_xen.ml, line 2589
 5/60 xenopsd-xc Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
 6/60 xenopsd-xc Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 39
 7/60 xenopsd-xc Called from file ocaml/xenopsd/xc/xenops_server_xen.ml, line 2622
 8/60 xenopsd-xc Called from file ocaml/xenopsd/xc/xenops_server_xen.ml, line 2653
 9/60 xenopsd-xc Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
 10/60 xenopsd-xc Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 39
 11/60 xenopsd-xc Called from file ocaml/xenopsd/lib/xenops_task.ml, line 103
 12/60 xenopsd-xc Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 39
 13/60 xenopsd-xc Called from file ocaml/xenopsd/lib/xenops_task.ml, line 103
 14/60 xenopsd-xc Called from file ocaml/xenopsd/lib/xenops_task.ml, line 112
 15/60 xenopsd-xc Called from file ocaml/xenopsd/lib/xenops_server.ml, line 2575
 16/60 xenopsd-xc Called from file list.ml, line 121
 17/60 xenopsd-xc Called from file ocaml/xenopsd/lib/xenops_server.ml, line 2568
 18/60 xenopsd-xc Called from file ocaml/xenopsd/lib/xenops_server.ml, line 2727
 19/60 xenopsd-xc Called from file ocaml/xenopsd/lib/xenops_task.ml, line 103
 20/60 xenopsd-xc Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 39
 21/60 xenopsd-xc Called from file ocaml/xenopsd/lib/xenops_task.ml, line 103
 22/60 xenopsd-xc Called from file ocaml/xenopsd/lib/xenops_task.ml, line 112
 23/60 xenopsd-xc Called from file ocaml/xenopsd/lib/xenops_server.ml, line 2575
 24/60 xenopsd-xc Called from file list.ml, line 121
 25/60 xenopsd-xc Called from file ocaml/xenopsd/lib/xenops_server.ml, line 2568
 26/60 xenopsd-xc Called from file ocaml/xenopsd/lib/xenops_server.ml, line 2727
 27/60 xenopsd-xc Called from file ocaml/xenopsd/lib/xenops_task.ml, line 103
 28/60 xenopsd-xc Called from file ocaml/xenopsd/lib/xenops_task.ml, line 112
 29/60 xenopsd-xc Called from file ocaml/xenopsd/lib/xenops_server.ml, line 3428
 30/60 xenopsd-xc Called from file ocaml/xenopsd/lib/xenops_server.ml, line 3438
 31/60 xenopsd-xc Called from file ocaml/xenopsd/lib/xenops_server.ml, line 3459
 32/60 xenopsd-xc Called from file ocaml/xapi-idl/lib/task_server.ml, line 192
 33/60 xapi Called from file ocaml/xapi/xapi_xenops.ml, line 3522
 34/60 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
 35/60 xapi Called from file ocaml/xapi/xapi_xenops.ml, line 3921
 36/60 xapi Called from file ocaml/xapi/xapi_xenops.ml, line 3531
 37/60 xapi Called from file lib/backtrace.ml, line 210
 38/60 xapi Called from file ocaml/xapi/xapi_xenops.ml, line 3537
 39/60 xapi Called from file ocaml/xapi/context.ml, line 565
 40/60 xapi Called from file ocaml/xapi/context.ml, line 572
 41/60 xapi Called from file ocaml/xapi/xapi_xenops.ml, line 3928
 42/60 xapi Called from file ocaml/xapi/xapi_xenops.ml, line 3531
 43/60 xapi Called from file ocaml/xapi/xapi_xenops.ml, line 3640
 44/60 xapi Called from file ocaml/xapi/context.ml, line 565
 45/60 xapi Called from file ocaml/xapi/context.ml, line 572
 46/60 xapi Called from file ocaml/xapi/xapi_vm.ml, line 347
 47/60 xapi Called from file ocaml/xapi/message_forwarding.ml, line 141
 48/60 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
 49/60 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 39
 50/60 xapi Called from file ocaml/xapi/message_forwarding.ml, line 1990
 51/60 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
 52/60 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 39
 53/60 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
 54/60 xapi Called from file ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml, line 39
 55/60 xapi Called from file ocaml/xapi/message_forwarding.ml, line 1974
 56/60 xapi Called from file ocaml/xapi/rbac.ml, line 228
 57/60 xapi Called from file ocaml/xapi/rbac.ml, line 238
 58/60 xapi Called from file ocaml/xapi/server_helpers.ml, line 78
 59/60 xapi Called from file ocaml/xapi/server_helpers.ml, line 97
 60/60 xapi Called from file ocaml/libs/log/debug.ml, line 270

```
cli_progress_bar is used by `xe --progress`, and I've reused it in my
test code in #6858. However, >90% of my test runs failed on various
machines due to a `String.blit` exception from `cli_progress_bar`.

There are 2 possible causes. I'm not sure which one triggered the
failure, but I've fixed both, and now I have a lot more green tests (the
remaining failures are due to actual bugs in the product, not bugs in
the progress bar):
* if the printed ETA would be >99h (even just temporarily) then we'd
overflow the buffer's size and raise an exception. `%02d` means at least
2 digits, not at most!
* if time goes backwards then we'd get a negative ETA, try to print a
`-`, overflow the buffer size again, and raise an exception. Replaced it
with monotonic time.

This also contains an improvement I've made on the other PR to print the
total time in `ms` (to avoid having to resolve rebase conflicts twice in
the 2 PRs). This avoids printing awkward-looking lines like `Total time
00:00:00` when the operation actually took, say, 0.9s.
Xenopsd reraises some exceptions in a different, simplified form.
But this needs to retain the stacktrace from the original place that raised the
first exception, otherwise it might be hard to debug.

Signed-off-by: Edwin Török <edwin.torok@citrix.com>
Xenopsd reraises some exceptions in a different, simplified form. But
this needs to retain the stacktrace from the original place that raised
the first exception, otherwise it might be hard to debug.

This is a followup to [the previous PR](#6891); I found a few more
places that would lose a stacktrace while working on another code change
in the area.

It is recommended to review with 'ignore whitespaces' enabled.

To avoid losing the backtrace we need to mark it with
`Backtrace.is_important` and use `Backtrace.reraise` instead of
`raise`.

(Alternatively, we could do this using purely the stdlib with
`Printexc.get_raw_backtrace ()` and `Printexc.raise_with_backtrace`, but
the rest of the code around here already uses Backtrace, so for
consistency we use that.)
External authentication using Active Directory can use a cache to
improve performance. The corresponding fields on the pool object were so
far hidden. This patch makes them visible and, as a consequence, more
easily accessible from the CLI and its autocompletion.

Signed-off-by: Christian Lindig <christian.lindig@citrix.com>
This allows it to proceed in parallel with the parse side, not deadlocking on
the filled pipe.

Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
…ng… (#6890)

… built

changlei-li merged commit 90dc197 into feature/trusted-certs Feb 10, 2026
55 checks passed