Commits on Aug 1, 2012
  1. CA-87459: if an event from xapi is received for a VM which has been destroyed, skip and don't pause for 15s

    David Scott committed Aug 1, 2012
    It's possible for a VM to be shut down and uninstalled before all of its
    events have been processed. If this happens then we log the
    event and avoid calling "Thread.delay 15."
    When a VM is actually shut down, we trigger a xapi event re-registration
    since we don't need to receive events on it any more.
    Signed-off-by: David Scott <>
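    The skip-and-log behaviour described above might look roughly like the
    following. This is a minimal sketch in Python rather than the project's
    own code; handle_vm_event, known_vms and update_vm are hypothetical names
    used only for illustration.

```python
import logging

log = logging.getLogger("xapi_events_sketch")


def handle_vm_event(vm_id, known_vms, update_vm):
    """Process one event; return True if handled, False if skipped.

    known_vms is the set of VM ids that still exist (an assumption of this
    sketch); update_vm is whatever normally reacts to the event.
    """
    if vm_id not in known_vms:
        # The VM was destroyed before its events drained: log the event and
        # move on immediately instead of sleeping before a retry.
        log.info("ignoring event for destroyed VM %s", vm_id)
        return False
    update_vm(vm_id)
    return True
```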
  2. CA-87324: when starting a PV VM, don't "plug" empty VBDs

    David Scott committed Aug 1, 2012
    We respect the convention that empty VBDs are currently_attached = false
    for PV guests.
    Signed-off-by: David Scott <>
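    As an illustration of the convention above, a start path could filter
    the devices it plugs along these lines. This is a sketch under assumed
    data shapes: the "empty" key and the vbds_to_plug helper are hypothetical.

```python
def vbds_to_plug(vbds, hvm):
    """Return the VBDs that should be plugged when a VM starts.

    For PV guests (hvm=False) empty VBDs are left out, so they stay
    currently_attached = false. HVM guests keep every VBD.
    """
    return [vbd for vbd in vbds if hvm or not vbd["empty"]]
```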
Commits on Jul 31, 2012
  1. CA-87324: simulate VBD eject/insert as plug/unplug in xapi rather than xenopsd

    David Scott committed Jul 31, 2012
    Simulating this in xenopsd leads to states where a VDI has been
    ejected but the device is still 'active' and therefore 'currently_attached=true'.
    Although this is self-consistent, it confuses xapi and the regression
    tests using the XenAPI. Instead we fall back to transforming eject into
    unplug and insert into plug at the XenAPI level.
    Signed-off-by: David Scott <>
  2. CA-87324: return whether a VM is running HVM in VM.stat

    David Scott committed Jul 31, 2012
    Signed-off-by: David Scott <>
  3. CA-87324: a non-existent xenstore device doesn't require any maintenance action

    David Scott committed Jul 31, 2012
    So 'device_action_request' should return None rather than throwing an exception.
    Signed-off-by: David Scott <>
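    The change in contract (an absent result rather than an exception) can be
    sketched as follows. The dictionary standing in for xenstore and the
    "action-request" key are assumptions made for this example, not the real
    xenstore layout.

```python
from typing import Optional


def device_action_request(xenstore: dict, device_path: str) -> Optional[str]:
    """Return the pending maintenance action for a device, or None.

    xenstore is modelled as a plain dict from device path to a dict of keys.
    """
    device = xenstore.get(device_path)
    if device is None:
        # A device that is absent from xenstore needs no maintenance, so
        # report "nothing to do" instead of raising.
        return None
    return device.get("action-request")
```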
Commits on Jul 30, 2012
  1. Merge pull request #784 from amarao/master

    jonludlam committed Jul 30, 2012
    Fix to xe-edit-bootloader
  2. Merge pull request #734 from santoshj/master

    jonludlam committed Jul 30, 2012
    Make vif script robust by building XENBUS_PATH from domain and device ids when it's not set.
  3. Merge pull request #774 from djs55/CA-85940

    jonludlam committed Jul 30, 2012
    CA-85940: bugs in cancellation are hereby cancelled
Commits on Jul 26, 2012
  1. CA-85940: avoid spinning in the cancel_test

    David Scott committed Jul 26, 2012
    It will suffice to check for state stabilisation every 1s or so.
    We could eventually convert this to use events from xen, xenops
    and xapi but it would be a lot more code for little return.
    Signed-off-by: David Scott <>
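    A polling loop of the kind described, checking once per interval rather
    than spinning, might be sketched like this. The function name and its
    parameters (needed, max_polls, the injectable sleep) are hypothetical.

```python
import time


def wait_for_stable(get_state, interval=1.0, needed=2, max_polls=60,
                    sleep=time.sleep):
    """Poll get_state() until it returns the same value `needed` times in a row.

    Sleeps `interval` seconds between polls and raises TimeoutError after
    `max_polls` polls without stabilising.
    """
    last = get_state()
    repeats = 1
    polls = 0
    while repeats < needed:
        if polls >= max_polls:
            raise TimeoutError("state did not stabilise")
        sleep(interval)
        polls += 1
        current = get_state()
        # Count consecutive identical observations; reset on any change.
        repeats = repeats + 1 if current == last else 1
        last = current
    return last
```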
  2. CA-85940: fix a race between deleting the guest metrics and updating it

    David Scott committed Jul 25, 2012
    If the VM exposes guest agent data very early then we will delete
    the old guest metrics (because a reboot has happened) but fail to
    notice the new values are present and fail to create the new guest metrics.
    We now check the data from the guest agent immediately after we
    spot a reboot and delete the old guest agent data.
    Signed-off-by: David Scott <>
  3. CA-85940: extract a common 'wait_for_all_tasks' function in a new "Tasks" module

    David Scott committed Jul 23, 2012
    Also remove the knowledge of the structure of the event.from token
    from code that links with the client -- all clients should treat this
    as opaque.
    Signed-off-by: David Scott <>
  4. CA-85940: add an 'events from xapi' in parallel with an 'events from …

    David Scott committed Jul 20, 2012
    Rename 'Events.wait' to 'Events_from_xenops.wait' and add
    'Events_from_xapi.wait' which injects an event into the pool event
    queue and waits for it to be processed.
    Signed-off-by: David Scott <>
  5. CA-83940: generalise "with_migrating_away" to "with_events_suppressed"

    David Scott committed Jul 20, 2012
    This function temporarily stops events from xenopsd from being processed
    for a particular VM. It is useful to prevent races between a foreground
    thread and the background event thread e.g.
    In Xapi_xenops.start:
      1. VBDs are set to currently_attached = true
      2. metadata is pushed to xenopsd
         <- at this point an event on the xenopsd VM would cause
            the VBD currently_attached field to be set to false so
            we block events
      3. the VM is started
         <- at this point it is safe to re-enable syncing.
    Signed-off-by: David Scott <>
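    The idea of temporarily suppressing events for one VM can be sketched as
    a context manager. This is an illustration in Python, not the OCaml
    implementation; should_process is a hypothetical helper, and nested
    suppression of the same VM is not handled here.

```python
import threading
from contextlib import contextmanager

_lock = threading.Lock()
_suppressed = set()


@contextmanager
def with_events_suppressed(vm_id):
    """Stop background event processing for vm_id while the body runs."""
    with _lock:
        _suppressed.add(vm_id)
    try:
        yield
    finally:
        # Re-enable processing even if the body raised.
        with _lock:
            _suppressed.discard(vm_id)


def should_process(vm_id):
    """Called by the background event thread before acting on an event."""
    with _lock:
        return vm_id not in _suppressed
```

    A foreground operation would wrap the window between pushing metadata and
    starting the VM in "with with_events_suppressed(vm_id):".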
  6. CA-85940: add event.inject(session, class, objref) which generates an event on a specific object

    David Scott committed Jul 20, 2012
    The return result of the event.inject will compare as strictly less-than
    under a lexicographic sort of any event received by event.from which
    is either this event or a strictly later one.
    So to make sure all events have been flushed through your event.from loop,
    it suffices to call event.inject, and wait for event.from to return a
    strictly greater token.
    Signed-off-by: David Scott <>
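    The flush procedure described above can be sketched as a small loop. The
    two callables stand in for event.inject and event.from and are
    simplified for illustration: the real calls take a session and further
    arguments, and event.from returns more than a token.

```python
def wait_until_flushed(inject, event_from, token=""):
    """Block until the event.from loop has passed the injected event.

    inject() returns the token of a freshly injected event; event_from(token)
    blocks and returns the next token. Tokens are compared as strings, i.e.
    lexicographically, and are otherwise treated as opaque.
    """
    barrier = inject()
    # Keep consuming events until the returned token is strictly greater
    # than the one handed back by the injection.
    while token <= barrier:
        token = event_from(token)
    return token
```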
  7. CA-85940: Add a database-level operation to bump the generation count on a row

    David Scott committed Jul 20, 2012
    This is useful for the side-effect of generating an extra event on an object.
    Note that we bump the generation count twice, which means that the new
    count - 1 is guaranteed to compare less than an event generated
    afterwards with the new count.
    Signed-off-by: David Scott <>
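    The double bump can be illustrated with a toy model. This sketch assumes
    a single global generation counter that stamps each row on write; the
    Database class and its method names are hypothetical.

```python
class Database:
    """Toy database in which every write stamps the row with a new generation."""

    def __init__(self):
        self.generation = 0
        self.rows = {}

    def touch(self, ref):
        """An ordinary write: advance the counter and stamp the row."""
        self.generation += 1
        self.rows[ref] = self.generation
        return self.generation

    def bump(self, ref):
        """Advance the generation twice and return (new count - 1).

        The returned value is strictly below the row's new stamp, so it also
        compares below any event generated afterwards.
        """
        self.touch(ref)
        new_count = self.touch(ref)
        return new_count - 1
```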
  8. CA-85940: For all XenAPI calls with wire name "Class.method", add an alias "Class_method"

    David Scott committed Jul 20, 2012
    This provides an easy way for python's xmlrpclib to call functions
    whose names clash with builtin keywords e.g.
      x = xmlrpclib.Server(url)
      x.event_from(session, [ "VM/%s" % vm ], "", 60.)
    Signed-off-by: David Scott <>
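    To see why the alias helps, note that "from" is a reserved word in
    Python, so an attribute-style call such as x.event.from(...) does not
    parse. The helpers below are a hypothetical sketch of the naming rule
    only; they do not contact a server.

```python
import keyword


def alias_for(wire_name):
    """Map a wire name such as "event.from" to its "Class_method" alias."""
    cls, method = wire_name.split(".", 1)
    return f"{cls}_{method}"


def needs_alias(wire_name):
    """True if any dotted component of the name is a Python reserved word."""
    return any(keyword.iskeyword(part) for part in wire_name.split("."))
```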
  9. CA-85940: Use xenstore rather than the VmExtra for active_{vbds,vifs}

    David Scott committed Jul 19, 2012
    Changing the VmExtra effectively changes the migrate protocol, which
    is a bad idea at this point in the development cycle.
    We also automatically delete the VBD and VIF metadata from xenopsd
    when currently_attached is set to false. To make this work we:
      1. set currently_attached to true in Xapi_xenops.start
      2. only push down currently_attached VBDs and VIFs in the Xenopsd_metadata
    Signed-off-by: David Scott <>
  10. Remove leaks in the xenopsd <-> xapi caches

    David Scott committed Jul 19, 2012
    We have two caches:
      Xapi_cache: contains the latest pool VM metadata, translated into
         xenopsd format. Whenever the user changes the VM's name_label
         (for example), this is used to update the xenopsd metadata for
         next reboot.
      Xenops_cache: contains the latest runtime VM state, as queried
         from xenopsd. This is used to minimise the number of writes
         to the pool database.
    The cache lifetime is tied to the lifetime of the metadata in xenopsd:
    whenever we add to xenopsd we initialise the cache; whenever we remove
    from xenopsd we remove from the cache.
    Signed-off-by: David Scott <>
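    The lifetime rule for the two caches might be modelled as below. This is
    a sketch with hypothetical method names; the real caches hold xenopsd
    data structures rather than plain Python values.

```python
class MetadataCaches:
    """Two caches whose entries live exactly as long as the xenopsd metadata."""

    def __init__(self):
        self.xapi_cache = {}    # latest pool metadata, in xenopsd format
        self.xenops_cache = {}  # latest runtime state seen from xenopsd

    def on_add(self, vm_id, metadata):
        """Called when metadata is added to xenopsd: initialise both caches."""
        self.xapi_cache[vm_id] = metadata
        self.xenops_cache[vm_id] = None

    def on_remove(self, vm_id):
        """Called when metadata is removed from xenopsd: drop both entries."""
        self.xapi_cache.pop(vm_id, None)
        self.xenops_cache.pop(vm_id, None)

    def update_runtime_state(self, vm_id, state, write_db):
        """Write to the pool database only when the runtime state changed."""
        if vm_id not in self.xenops_cache or self.xenops_cache[vm_id] == state:
            return False
        self.xenops_cache[vm_id] = state
        write_db(vm_id, state)
        return True
```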
  11. CA-85940: update following Xenctrl interface change

    David Scott committed Jul 17, 2012
    Signed-off-by: David Scott <>
  12. CA-85940: when "cancelling" a xenguest subprocess, raise the Cancelled exception

    David Scott committed Jul 17, 2012
    Signed-off-by: David Scott <>
  13. CA-85940: add the concept of 'active' VBDs and VIFs to xenopsd

    David Scott committed Jul 17, 2012
    During a VM.start we set all VBDs and VIFs to active. Within
    xenopsd a {VBD,VIF}.plug will be ignored if the device is not
    active. This means that on resume we bring back exactly
    those devices which were marked as "active" at the time of
    the preceding VM.start.
    We now distinguish between a VBD_plug as part of a compound
    operation and a user-requested VBD_hotplug, where the latter
    also makes the device active.
    We set XenAPI {VIF,VBD}.currently_attached = active || plugged
     -- this means that on suspend, currently_attached stays true
        since we never set active to "false"
    Unfortunately the active flags get wiped in the middle of a
    reboot, so a reboot behaves the same way as a start i.e. it
    connects all devices, not just those which were connected before
    the shutdown.
    Signed-off-by: David Scott <>
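    The interaction of the two flags can be sketched with a small model.
    The Device class and its method names are hypothetical; only the rule
    currently_attached = active || plugged is taken from the text above.

```python
class Device:
    """Toy VBD/VIF tracking the 'active' and 'plugged' flags."""

    def __init__(self):
        self.active = False
        self.plugged = False

    def plug(self):
        """Plug as part of a compound operation: ignored unless active."""
        if self.active:
            self.plugged = True

    def hotplug(self):
        """User-requested plug: also marks the device active."""
        self.active = True
        self.plug()

    def suspend(self):
        """Suspend tears the device down but never clears 'active'."""
        self.plugged = False

    @property
    def currently_attached(self):
        return self.active or self.plugged
```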
  14. CA-85940: add a test-case to check each "cancel point"

    David Scott committed Jul 17, 2012
    The test-case measures the number of "cancel points" traversed by
    each lifecycle operation. It then checks that, if a "cancel point"
    is triggered, the system ends up in a valid state.
    In particular it looks for:
     * xen domains and devices
     * xenopsd VM metadata
     * xapi VM metadata
    all being in sync.
    Run the test on an arbitrary test VM as follows (in dom0):
    /root/cancel_tests -pw <password> -vm <VM name>
    Signed-off-by: David Scott <>
  15. CA-85940: never directly remove VIF/VBD metadata in failure paths

    David Scott committed Jul 16, 2012
    A background thread constantly synchronises XenAPI metadata with xenopsd
    metadata, so it doesn't make sense to remove objects which will be
    immediately re-added.
    It is ok to 'refresh' the metadata in a {VIF,VBD}.plug operation.
    Redefine what 'plugged' means for VIFs so that a VIF is always
    plugged while there is still pending cleanup action (e.g. in xenstore).
    This avoids a cancelled unplug leaving stale state lying around.
    Clarify the "epoch_{begin,end}" APIs -- these need to know the disk
    but not the whole Vbd.t
    Signed-off-by: David Scott <>
  16. CA-85940: VIF.unplug now passes the cancellation test

    David Scott committed Jul 16, 2012
    This required a redefinition of what "plugged" means.
    We now consider a VIF to be "plugged" if the resources
    are still allocated by the host, as reflected
    in xenstore.
    When the udev remove event fires and the "hotplug-status"
    node is removed, we consider the device as needing an
    explicit unplug via the "device action request" mechanism.
    Signed-off-by: David Scott <>
  17. CA-85940: omit spurious error List.failure("hd") when a VM isn't running

    David Scott committed Jul 16, 2012
    We expect there to be no domids when a VM isn't running -- this is a
    common case and doesn't deserve to generate an exception and a warning.
    Signed-off-by: David Scott <>
  18. CA-85940: on error in suspend, clean up the xapi metadata and force-shutdown the domain if it has already 'suspended'

    David Scott committed Jul 16, 2012
    The xenops event thread in xapi now handles the 'power_state = Suspended' case by
    pulling the metadata from xenopsd and calling VM.remove.
    If a failure happens during suspend, xapi will wait for the event sync
    and check the power-state. If the VM is suspended then it will be
    force_shutdown because the failure will have left the VM in an invalid
    state (probably through a missing or truncated suspend VDI). If the
    VM is in any other state then it can be safely left.
    Signed-off-by: David Scott <>
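    The failure handling described above could be sketched as follows. All
    four parameters are hypothetical callables standing in for xapi
    internals, and the power state is modelled as a plain string.

```python
def on_suspend_failure(vm_id, wait_for_event_sync, get_power_state,
                       force_shutdown):
    """React to a failed suspend; return True if the VM was force-shutdown."""
    # Let the background event thread catch up before trusting power_state.
    wait_for_event_sync(vm_id)
    if get_power_state(vm_id) == "Suspended":
        # The suspend image is probably missing or truncated, so the
        # suspended VM cannot be trusted: shut it down.
        force_shutdown(vm_id)
        return True
    # Any other state can safely be left alone.
    return False
```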
  19. CA-85940: on xapi start, startup storage before syncing with xenopsd

    David Scott committed Jul 16, 2012
    The xenopsd sync will require access to the storage layer, to detach
    and deactivate VDIs.
    Signed-off-by: David Scott <>
  20. CA-85940: xenopsd can now be set to trigger the cancellation of a task at a numbered cancellation point

    David Scott committed Jul 14, 2012
    This allows a test harness to verify that cancellation is safe.
    Signed-off-by: David Scott <>
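    A test harness built on numbered cancellation points might work roughly
    as sketched here. The Task class, check_cancelling and
    exercise_cancel_points are hypothetical names, and numbering cancel
    points from 1 is an assumption of this example.

```python
class Cancelled(Exception):
    """Raised when a task is cancelled at a cancel point."""


class Task:
    def __init__(self, cancel_at=None):
        self.cancel_at = cancel_at  # 1-based point to cancel at, or None
        self.cancel_points_seen = 0

    def check_cancelling(self):
        """Called by an operation wherever it could safely be cancelled."""
        self.cancel_points_seen += 1
        if self.cancel_points_seen == self.cancel_at:
            raise Cancelled(f"cancel point {self.cancel_at}")


def exercise_cancel_points(operation, state_is_valid):
    """Cancel `operation` at each of its cancel points in turn.

    Returns the number of cancel points and the list of points after which
    state_is_valid() reported an inconsistent system.
    """
    probe = Task()
    operation(probe)  # first run: just count the cancel points
    total = probe.cancel_points_seen
    failures = []
    for n in range(1, total + 1):
        try:
            operation(Task(cancel_at=n))
        except Cancelled:
            pass
        if not state_is_valid():
            failures.append(n)
    return total, failures
```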
  21. CA-85940: Copy HTTP headers "X-Http-other-config-K=V" into Task.other…

    David Scott committed Jul 14, 2012
    This is the first step towards allowing a XenAPI client to propagate
    arbitrary information (tokens, debug configuration, fault injection)
    through the whole stack.
    Signed-off-by: David Scott <>
  22. CA-85940: add a Task.debug_info (string * string list) and return the number of "cancel points" seen

    David Scott committed Jul 14, 2012
    The Task.debug_info can be used to return arbitrary data without
    officially extending (and polluting) the interface.
    The number of "cancel points" refers to the number of times the
    task could have been cancelled.
    Signed-off-by: David Scott <>
  23. CA-85940: rename Task.debug_info to Task.dbg for consistency

    David Scott committed Jul 14, 2012
    We use the argument name "dbg" to refer to the debug token passed
    along with each API call. This value is then stored in the Task.dbg field.
    Signed-off-by: David Scott <>
  24. CA-85940: add "on failure" code to all the VM lifecycle operations

    David Scott committed Jul 14, 2012
    After a partial failure (e.g. arbitrary process restart) xenopsd always
    leaves each VM in a state which is either
      * clearly marked as broken ("domain action request = poweroff")
      * intact (Running, Paused, Suspended etc)
    Therefore all we need to do after detecting a failure (or a restart)
    is to trigger a {VM,VBD,VIF,PCI}_check_state operation, which will
    then enqueue the appropriate corrective action i.e.
      * VM_poweroff if broken
      * nothing if intact
    In particular we should not try to perform bespoke, per-operation
    cleanup (e.g. VM_start -> VM_poweroff). Instead we simply invoke
    the correct trigger.
    Signed-off-by: David Scott <>
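    The single recovery rule can be sketched as a function from observed
    state to corrective actions. The dictionary shape and the
    "domain_action_request" key are assumptions made for illustration.

```python
def check_state(vm):
    """Return the corrective actions to enqueue for a VM after a failure.

    A VM marked as broken (domain action request = poweroff) gets a
    VM_poweroff; an intact VM needs nothing.
    """
    if vm.get("domain_action_request") == "poweroff":
        return ["VM_poweroff"]
    return []
```

    Every lifecycle operation would call the same trigger on failure instead
    of carrying its own cleanup path.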
  25. CA-85940: log when we're about to raise a task Cancelling exception

    David Scott committed Jul 14, 2012
    Signed-off-by: David Scott <>
  26. CA-85940: in VM.{start,resume} rely on the xapi<->xenops event sync in the error path

    David Scott committed Jul 14, 2012
    If an error happens during VM.start then we wish
    1. xenopsd to be in a consistent state: a VM should be either completely
       intact or shutdown (see separate commit)
    2. xapi to synchronise its state with xenopsd, not try to override it
    To cause (2), we:
    1. ensure that VM.resident_on is set in both success and failure cases
       -- this activates the event synchronisation
    2. inject a barrier into the event queue to block the XenAPI thread
       until the synchronisation completes.
    If an error happens during VM.resume we don't set resident_on because we
    wish to protect the power_state, currently_attached fields, so we can try
    again later.
    Signed-off-by: David Scott <>