branch: go-faster-damn…
Commits on Jul 26, 2012
  1. CA-85940: avoid spinning in the cancel_test

    It will suffice to check for state stabilisation every 1s or so.
    We could eventually convert this to use events from xen, xenops
    and xapi but it would be a lot more code for little return.
    Signed-off-by: David Scott <>
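    The polling approach described in this commit could look roughly like the
    sketch below. It is illustrative only: `get_state`, `max_polls` and the
    "two identical reads in a row" rule are assumptions, not taken from the
    actual cancel_test.

```python
import time

def wait_for_stable_state(get_state, interval=1.0, max_polls=60, sleep=time.sleep):
    """Poll get_state() every `interval` seconds instead of spinning.

    The state is treated as stable once two consecutive reads agree
    (an assumption made for this sketch).
    """
    last = get_state()
    for _ in range(max_polls):
        sleep(interval)              # back off rather than busy-wait
        current = get_state()
        if current == last:
            return current           # unchanged since the previous poll
        last = current
    raise TimeoutError("state did not stabilise")
```

    Passing `sleep` as a parameter keeps the helper easy to exercise without
    real delays.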
Commits on Jul 25, 2012
  1. CA-85940: fix a race between deleting the guest metrics and updating it

    If the VM exposes guest agent data very early then we will delete
    the old guest metrics (because a reboot has happened) but fail to
    notice the new values are present and fail to create the new guest metrics.
    We now check the data from the guest agent immediately after we
    spot a reboot and delete the old guest agent data.
    Signed-off-by: David Scott <>
Commits on Jul 24, 2012
  1. CA-85940: extract a common 'wait_for_all_tasks' function in a new "Tasks" module

    Also remove the knowledge of the structure of the event.from token
    from code that links with the client -- all clients should treat this
    as opaque.
    Signed-off-by: David Scott <>
  2. CA-85940: add an 'events from xapi' in parallel with an 'events from …

    Rename 'Events.wait' to 'Events_from_xenops.wait' and add
    'Events_from_xapi.wait' which injects an event into the pool event
    queue and waits for it to be processed.
    Signed-off-by: David Scott <>
  3. CA-83940: generalise "with_migrating_away" to "with_events_suppressed"

    This function temporarily stops events from xenopsd from being processed
    for a particular VM. It is useful to prevent races between a foreground
    thread and the background event thread e.g.
    In Xapi_xenops.start:
      1. VBDs are set to currently_attached = true
      2. metadata is pushed to xenopsd
         <- at this point an event on the xenopsd VM would cause
            the VBD currently_attached field to be set to false so
            we block events
      3. the VM is started
         <- at this point it is safe to re-enable syncing.
    Signed-off-by: David Scott <>
    Whenever we want to stop events from xenopsd for a particula
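    As a rough illustration of the idea (the real function is OCaml inside
    xapi; every name below is hypothetical), event suppression for one VM can
    be modelled as a context manager that the background event handler
    consults before applying an update:

```python
import threading
from contextlib import contextmanager

_lock = threading.Lock()
_suppressed = set()   # VM ids whose events are currently ignored

@contextmanager
def with_events_suppressed(vm_id):
    """Ignore background events for vm_id while the block runs.

    Nested suppression of the same VM is not handled in this sketch.
    """
    with _lock:                      # waits for any in-flight event to finish
        _suppressed.add(vm_id)
    try:
        yield
    finally:
        with _lock:
            _suppressed.discard(vm_id)

def handle_event(vm_id, apply):
    """Called by the background event thread; returns True if applied."""
    with _lock:
        if vm_id in _suppressed:
            return False             # foreground thread owns this VM for now
        apply(vm_id)
        return True
```

    In the Xapi_xenops.start example above, steps 1 to 3 would sit inside the
    `with` block, and syncing resumes once the block exits.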
  4. CA-85940: add event.inject(session, class, objref) which generates an event on a specific object

    The token returned by event.inject compares strictly less than, under
    lexicographic ordering, the token of any event received by event.from
    which is either this event or a strictly later one.
    So to make sure all events have been flushed through your event.from loop,
    it suffices to call event.inject, and wait for event.from to return a
    strictly greater token.
    Signed-off-by: David Scott <>
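    The flush pattern can be sketched against a toy in-memory event source.
    Nothing here is the real XenAPI: `FakeEvents`, its zero-padded token
    format and the `flush` helper are invented for illustration, and a real
    client should treat tokens as opaque apart from the ordering described
    above.

```python
class FakeEvents:
    """Stand-in for an event source; NOT the real XenAPI."""

    def __init__(self):
        self._gen = 0
        self._log = []               # list of (token, objref)

    @staticmethod
    def _token(gen):
        return "%020d" % gen         # zero-padded so string order == numeric order

    def touch(self, ref):
        """An ordinary modification: one generation bump, one event."""
        self._gen += 1
        self._log.append((self._token(self._gen), ref))

    def inject(self, ref):
        """Bump twice, emit an event, return a token just below that event."""
        self._gen += 2
        self._log.append((self._token(self._gen), ref))
        return self._token(self._gen - 1)

    def event_from(self, token):
        """Return (events newer than token, newest token seen)."""
        events = [(t, r) for (t, r) in self._log if t > token]
        return events, (events[-1][0] if events else token)

def flush(events, ref, token=""):
    """Inject a barrier event, then read until the token passes it."""
    barrier = events.inject(ref)
    seen = []
    while not token > barrier:
        batch, token = events.event_from(token)
        seen.extend(r for _, r in batch)
    return seen, token
```

    Once `flush` returns, every event generated before the inject call has
    been delivered to the loop.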
  5. CA-85940: Add a database-level operation to bump the generation count on a row

    This is useful for the side-effect of generating an extra event on an object.
    Note that we bump the generation count twice, which means that the new
    count - 1 is guaranteed to compare less than an event generated
    afterwards with the new count.
    Signed-off-by: David Scott <>
  6. CA-85940: For all XenAPI calls with wire name "Class.method", add an alias "Class_method"

    This provides an easy way for python's xmlrpclib to call functions
    whose names clash with builtin keywords e.g.
      x = xmlrpclib.Server(url)
      x.event_from(session, [ "VM/%s" % vm ], "", 60.)
    Signed-off-by: David Scott <>
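    A minimal sketch of how such aliases might be registered on the server
    side. The `register` and `needs_alias` helpers are hypothetical and are
    not xapi's actual dispatch code; they only show why "event.from" is
    awkward for Python attribute access and how one handler can answer to
    both names.

```python
import keyword

def register(table, wire_name, handler):
    """Expose handler under "Class.method" and the alias "Class_method"."""
    table[wire_name] = handler
    table[wire_name.replace(".", "_", 1)] = handler

def needs_alias(wire_name):
    """True if some part of the wire name is a Python keyword (e.g. 'from')."""
    return any(keyword.iskeyword(part) for part in wire_name.split("."))
```

    With the alias in place a client can write `x.event_from(...)`, whereas
    `x.event.from(...)` is a syntax error in Python.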
  7. CA-85940: Use xenstore rather than the VmExtra for active_{vbds,vifs}

    Changing the VmExtra effectively changes the migrate protocol, which
    is a bad idea at this point in the development cycle.
    We also automatically delete the VBD and VIF metadata from xenopsd
    when currently_attached is set to false. To make this work we:
      1. set currently_attached to true in Xapi_xenops.start
      2. only push down currently_attached VBDs and VIFs in the Xenopsd_metadata
    Signed-off-by: David Scott <>
  8. Remove leaks in the xenopsd <-> xapi caches

    We have two caches:
      Xapi_cache: contains the latest pool VM metadata, translated into
         xenopsd format. Whenever the user changes the VM's name_label
         (for example), this is used to update the xenopsd metadata for
         next reboot.
      Xenops_cache: contains the latest runtime VM state, as queried
         from xenopsd. This is used to minimise the number of writes
         to the pool database.
    The cache lifetime is tied to the lifetime of the metadata in xenopsd:
    whenever we add to xenopsd we initialise the cache; whenever we remove
    from xenopsd we remove from the cache.
    Signed-off-by: David Scott <>
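    The lifetime rule can be pictured with a toy model. The class and method
    names below are assumptions made for illustration and do not mirror the
    OCaml modules.

```python
class MetadataStore:
    """Toy model: cache entries live exactly as long as the xenopsd metadata."""

    def __init__(self):
        self.xenopsd = {}        # metadata held by xenopsd
        self.xapi_cache = {}     # latest pool metadata, in xenopsd format
        self.xenops_cache = {}   # latest runtime state read from xenopsd

    def add(self, vm_id, metadata):
        self.xenopsd[vm_id] = metadata
        self.xapi_cache[vm_id] = metadata     # initialise both caches on add
        self.xenops_cache[vm_id] = None       # no runtime state seen yet

    def remove(self, vm_id):
        self.xenopsd.pop(vm_id, None)
        self.xapi_cache.pop(vm_id, None)      # nothing left behind, so no leak
        self.xenops_cache.pop(vm_id, None)

    def update_state(self, vm_id, state, write_db):
        """Write to the pool database only when the runtime state changed."""
        if vm_id not in self.xenops_cache or self.xenops_cache[vm_id] == state:
            return False
        self.xenops_cache[vm_id] = state
        write_db(vm_id, state)
        return True
```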
Commits on Jul 23, 2012
  1. CA-85940: update following Xenctrl interface change

    Signed-off-by: David Scott <>
  2. CA-85940: when "cancelling" a xenguest subprocess, raise the Cancelled exception

    Signed-off-by: David Scott <>
  3. CA-85940: add the concept of 'active' VBDs and VIFs to xenopsd

    During a VM.start we set all VBDs and VIFs to active. Within
    xenopsd a {VBD,VIF}.plug will be ignored if the device is not
    active. This means that on resume we bring back exactly
    those devices which were marked as "active" at the time of
    the preceding VM.start.
    We now distinguish between a VBD_plug as part of a compound
    operation and a user-requested VBD_hotplug, where the latter
    also makes the device active.
    We set XenAPI {VIF,VBD}.currently_attached = active || plugged
     -- this means that on suspend, currently_attached stays true
        since we never set active to "false"
    Unfortunately the active flags get wiped in the middle of a
    reboot, so a reboot behaves the same way as a start i.e. it
    connects all devices, not just those which were connected before
    the shutdown.
    Signed-off-by: David Scott <>
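    The interaction between the two flags can be sketched as follows. This is
    a simplified model with hypothetical function names, not xenopsd's
    implementation; it only encodes the rules stated in this commit message.

```python
from dataclasses import dataclass

@dataclass
class Device:
    active: bool = False
    plugged: bool = False

def plug(dev):
    """Plug as part of a compound operation: ignored unless the device is active."""
    if dev.active:
        dev.plugged = True

def hotplug(dev):
    """User-requested hotplug: also makes the device active."""
    dev.active = True
    dev.plugged = True

def vm_start(devices):
    for dev in devices:
        dev.active = True
        plug(dev)

def suspend(devices):
    for dev in devices:
        dev.plugged = False      # 'active' is deliberately left untouched

def resume(devices):
    for dev in devices:
        plug(dev)                # only active devices come back

def currently_attached(dev):
    return dev.active or dev.plugged
```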
  4. CA-85940: add a test-case to check each "cancel point"

    The test-case measures the number of "cancel points" traversed by
    each lifecycle operation. It then checks that, if a "cancel point"
    is triggered, the system ends up in a valid state.
    In particular it looks for:
     * xen domains and devices
     * xenopsd VM metadata
     * xapi VM metadata
    all being in sync.
    Run the test on an arbitrary test VM as follows (in dom0):
    /root/cancel_tests -pw <password> -vm <VM name>
    Signed-off-by: David Scott <>
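    The shape of such a test can be sketched in a few lines. Everything below
    is a hypothetical stand-in (the `Task` class, the toy operation and the
    consistency check); the real cancel_tests binary talks to xapi and
    xenopsd.

```python
class Cancelled(Exception):
    pass

class Task:
    """Counts cancel points and optionally cancels at the n-th one."""

    def __init__(self, cancel_at=None):
        self.cancel_at = cancel_at
        self.seen = 0

    def check_cancel(self):
        self.seen += 1
        if self.seen == self.cancel_at:
            raise Cancelled("cancel point %d" % self.seen)

def toy_start(task, state):
    """Toy two-step operation with a cancel point before each step."""
    try:
        task.check_cancel()
        state["domain"] = True
        task.check_cancel()
        state["devices"] = True
    except Cancelled:
        state.update(domain=False, devices=False)   # roll back to a valid state
        raise

def consistent(state):
    return state["domain"] == state["devices"]

def check_every_cancel_point(operation):
    """Measure the cancel points, then trigger each one and verify the state."""
    probe = Task()
    operation(probe, {"domain": False, "devices": False})
    for n in range(1, probe.seen + 1):
        state = {"domain": False, "devices": False}
        try:
            operation(Task(cancel_at=n), state)
        except Cancelled:
            pass
        assert consistent(state), "inconsistent after cancel point %d" % n
    return probe.seen
```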
  5. CA-85940: never directly remove VIF/VBD metadata in failure paths

    A background thread constantly synchronises XenAPI metadata with xenopsd
    metadata, so it doesn't make sense to remove objects which will be
    immediately re-added.
    It is ok to 'refresh' the metadata in a {VIF,VBD}.plug operation.
    Redefine what 'plugged' means for VIFs so that a VIF is always
    plugged while there is still pending cleanup action (e.g. in xenstore).
    This avoids a cancelled unplug leaving stale state lying around.
    Clarify the "epoch_{begin,end}" APIs -- these need to know the disk
    but not the whole Vbd.t
    Signed-off-by: David Scott <>
  6. CA-85940: VIF.unplug now passes the cancellation test

    This required a redefinition of what "plugged" means.
    We now consider a VIF to be "plugged" if the resources
    are still allocated by the host, as reflected by
    in xenstore.
    When the udev remove event fires and the "hotplug-status"
    node is removed, we consider the device as needing an
    explicit unplug via the "device action request" mechanism.
    Signed-off-by: David Scott <>
  7. CA-85940: omit spurious error List.failure("hd") when a VM isn't running

    We expect there to be no domids when a VM isn't running -- this is a
    common case and doesn't deserve to generate an exception and a warning.
    Signed-off-by: David Scott <>
  8. CA-85940: on error in suspend, clean up the xapi metadata and force-shutdown the domain if it has already 'suspended'

    The xenops event thread in xapi now handles the 'power_state = Suspended' case by
    pulling the metadata from xenopsd and calling VM.remove.
    If a failure happens during suspend, xapi will wait for the event sync
    and check the power-state. If the VM is suspended then it will be
    force_shutdown because the failure will have left the VM in an invalid
    state (probably through a missing or truncated suspend VDI). If the
    VM is in any other state then it can be safely left.
    Signed-off-by: David Scott <>
  9. CA-85940: on xapi start, startup storage before syncing with xenopsd

    The xenopsd sync will require access to the storage layer, to detach
    and deactivate VDIs.
    Signed-off-by: David Scott <>
  10. CA-85940: xenopsd can now be set to trigger the cancellation of a task at a numbered cancellation point

    This allows a test harness to verify that cancellation is safe.
    Signed-off-by: David Scott <>
  11. CA-85940: Copy HTTP headers "X-Http-other-config-K=V" into Task.other…

    This is the first step towards allowing a XenAPI client to propagate
    arbitrary information (tokens, debug configuration, fault injection)
    through the whole stack.
    Signed-off-by: David Scott <>
  12. CA-85940: add a Task.debug_info (string * string list) and return the number of "cancel points" seen

    The Task.debug_info can be used to return arbitrary data without
    officially extending (and polluting) the interface.
    The number of "cancel points" refers to the number of times the
    task could have been cancelled.
    Signed-off-by: David Scott <>
  13. CA-85940: rename Task.debug_info to Task.dbg for consistency

    We use the argument name "dbg" to refer to the debug token passed
    along with each API call. This value is then stored in Task.dbg.
    Signed-off-by: David Scott <>
  14. CA-85940: add "on failure" code to all the VM lifecycle operations

    After a partial failure (e.g. arbitrary process restart) xenopsd always
    leaves each VM in a state which is either
      * clearly marked as broken ("domain action request = poweroff")
      * intact (Running, Paused, Suspended etc)
    Therefore all we need to do after detecting a failure (or a restart)
    is to trigger a {VM,VBD,VIF,PCI}_check_state operation, which will
    then enqueue the appropriate corrective action i.e.
      * VM_poweroff if broken
      * nothing if intact
    In particular we should not try to perform bespoke, per-operation
    cleanup (e.g. VM_start -> VM_poweroff). Instead we simply invoke
    the correct trigger.
    Signed-off-by: David Scott <>
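    In outline, the single "check state" trigger replaces per-operation
    cleanup. The sketch below assumes a dictionary representation of VM state
    and a plain list as the work queue; both are illustrative and not
    xenopsd's data structures.

```python
def check_state(vm_state):
    """Return the corrective action for a VM, or None if it is intact."""
    if vm_state.get("domain_action_request") == "poweroff":
        return "VM_poweroff"         # clearly marked as broken
    return None                      # Running, Paused, Suspended etc.

def on_failure(queue, vm_id, vm_state):
    """Called after any failure or restart, whichever operation was running."""
    action = check_state(vm_state)
    if action is not None:
        queue.append((action, vm_id))
```

    The same `on_failure` is invoked from every lifecycle operation, so no
    operation needs to know how to undo itself.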
  15. CA-85940: log when we're about to raise a task Cancelling exception

    Signed-off-by: David Scott <>
  16. CA-85940: in VM.{start,resume} rely on the xapi<->xenops event sync in the error path

    If an error happens during VM.start then we wish
    1. xenopsd to be in a consistent state: a VM should be either completely
       intact or shutdown (see separate commit)
    2. xapi to synchronise its state with xenopsd, not try to override it
    To cause (2), we:
    1. ensure that VM.resident_on is set in both success and failure cases
       -- this activates the event synchronisation
    2. inject a barrier into the event queue to block the XenAPI thread
       until the synchronisation completes.
    If an error happens during VM.resume we don't set resident_on because we
    wish to protect the power_state, currently_attached fields, so we can try
    again later.
    Signed-off-by: David Scott <>