[SANIBEL] Fix memory utilisation reporting (migration race condition) #1528

Merged: 1 commit into xapi-project:sanibel-lcm on Mar 11, 2014

simonjbeaumont commented Oct 30, 2013

Fixes a race condition in pool migration whereby the memory-actual field may be incorrectly set to a spuriously low value and never updated.

The change is described in commit f0f94eb.

This was tested by adding a delay on the sender side of the migration, just before the domain is destroyed, and by spoofing a memory value on the sender side, triggered by a FIST file.
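For readers unfamiliar with this style of testing, here is a minimal sketch of a file-triggered fault-injection point. It is an illustration only, not the code used in this PR: the `fist_` file prefix, the temp-directory location, and the names `fist_enabled` and `reported_memory` are all assumptions made for the example.

```ocaml
(* Hypothetical fault-injection helper: a test point is "armed" when a
   marker file exists. The path scheme is an assumption for illustration. *)
let fist_enabled name =
  Sys.file_exists
    (Filename.concat (Filename.get_temp_dir_name ()) ("fist_" ^ name))

(* Report a spoofed, spuriously low memory value while the test point is
   armed; otherwise report the real value. *)
let reported_memory ~fist ~actual =
  if fist_enabled fist then 16 else actual
```

Creating the marker file before a migration and removing it afterwards lets a tester force the low-value path on demand without rebuilding.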

This will likely also need fixing on trunk, where it will not be a simple forward-port because of the rrdd and xenopsd disaggregation.

@simonjbeaumont CA-112880: Fix race condition writing memory-actual
There was a race condition between the sender and receiver during migration.
This is due to the interplay with the monitor threads. Monitor_rrds.update_rrds
is run every 5 seconds and checks for changes. This module also has a cache of
values and, to avoid excessive database writes, only writes to the database if
the cache is dirtied. There is one such monitor thread running on each host.

The monitor thread on the sender may wake and write a value to the database
after the domain has been resumed on the receiver but before it is destroyed on
the sender. This could be a spuriously low value in the case where Xen is
reclaiming the memory on the sender side during the domain destroy.

The receiver, however, may have already updated its cache and written its value
to the database. It never writes on subsequent runs of the monitor thread, since
it believes the value in its cache (the correct value) is already in the
database (which it is not).

To remedy this we need to ensure that the monitor for a given host only updates
the VM_metrics in the database for VMs that are resident on that host. We also
provide a means to mark the cache as dirty when the VM.resident_on field
changes since the receiver may have its cache updated before the resident_on
field is set.

Signed-off-by: Si Beaumont <simon.beaumont@citrix.com>
f0f94eb
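
The race and the remedy described in the commit message can be modelled with a small, self-contained OCaml simulation. This is a sketch under stated assumptions, not the code from f0f94eb: the `db` and `monitor` types, the `update` and `mark_dirty` functions, and the memory values 1024 and 16 are all hypothetical, and "marking dirty" is modelled simply as dropping the cached entry.

```ocaml
(* Toy model; none of these names are taken from xapi. *)

(* Shared database: memory-actual and resident_on, keyed by VM name. *)
type db = {
  memory_actual : (string, int) Hashtbl.t;
  resident_on : (string, string) Hashtbl.t;
}

(* One monitor per host, with a cache of the last value it observed. *)
type monitor = { host : string; cache : (string, int) Hashtbl.t }

let make_db () =
  { memory_actual = Hashtbl.create 8; resident_on = Hashtbl.create 8 }

let make_monitor host = { host; cache = Hashtbl.create 8 }

(* One monitor pass for one VM. The database is only touched when the
   observed value differs from the cache. With ~guard:true the write is
   additionally skipped unless the VM is resident on this monitor's host. *)
let update ~guard db m vm observed =
  if Hashtbl.find_opt m.cache vm <> Some observed then begin
    Hashtbl.replace m.cache vm observed;
    let resident = Hashtbl.find_opt db.resident_on vm = Some m.host in
    if (not guard) || resident then
      Hashtbl.replace db.memory_actual vm observed
  end

(* Dirty the cache so that the next pass is treated as a change. *)
let mark_dirty m vm = Hashtbl.remove m.cache vm

(* The race: the sender writes after the receiver already has. *)
let run_race ~guard =
  let db = make_db () in
  let sender = make_monitor "sender" in
  let receiver = make_monitor "receiver" in
  Hashtbl.replace db.resident_on "vm" "receiver"; (* resumed on receiver *)
  update ~guard db receiver "vm" 1024; (* receiver writes correct value *)
  update ~guard db sender "vm" 16;     (* sender sees memory being reclaimed *)
  update ~guard db receiver "vm" 1024; (* cache unchanged: no rewrite *)
  Hashtbl.find db.memory_actual "vm"

(* The receiver caches a value before resident_on points at it. *)
let run_late_resident_on ~mark_dirty_on_change =
  let db = make_db () in
  let receiver = make_monitor "receiver" in
  Hashtbl.replace db.resident_on "vm" "sender";
  Hashtbl.replace db.memory_actual "vm" 16;  (* stale value in database *)
  update ~guard:true db receiver "vm" 1024;  (* cached, but not written *)
  Hashtbl.replace db.resident_on "vm" "receiver";
  if mark_dirty_on_change then mark_dirty receiver "vm";
  update ~guard:true db receiver "vm" 1024;
  Hashtbl.find db.memory_actual "vm"

let () =
  Printf.printf "race, unguarded: %d\n" (run_race ~guard:false);
  Printf.printf "race, guarded:   %d\n" (run_race ~guard:true);
  Printf.printf "late resident_on, no dirty mark: %d\n"
    (run_late_resident_on ~mark_dirty_on_change:false);
  Printf.printf "late resident_on, dirty mark:    %d\n"
    (run_late_resident_on ~mark_dirty_on_change:true)
```

In this model the unguarded run leaves the low value 16 in the database, while the guarded run keeps 1024. The second scenario shows why the residency check alone is not enough: without the dirty mark, the receiver's cached value is never written once resident_on changes.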

simonjbeaumont merged commit 13ea4d3 into xapi-project:sanibel-lcm Mar 11, 2014

1 check passed

default Merged build finished.