Skip to content

Fix races in event.from and add a test case #1719

Closed
wants to merge 2 commits into from

4 participants

@djs55
Xapi Project member
djs55 commented Apr 30, 2014
  • CloudStack uses 1 session and multiple threads, each calling event.from. Unfortunately each thread shared state on the server, causing some of these threads to spuriously fail with SESSION_INVALID (triggering task cleanup and cascading errors).
  • CloudStack expects an empty event set to signify an error. Unfortunately a badly-timed Modify followed by a Delete could cause this to happen

The new test case captures the first failure triggered by CloudStack. It fails without the fix and passes afterwards.

This code needs some careful review from @jonludlam

David Scott added some commits Apr 29, 2014
David Scott event.from: prevent interference between parallel calls
Since the new event.from API implementation was based on top
of the existing event.next implementation it was possible
for one thread to interfere with another because each event.from
*shared* the 'subscription' record. In particular CloudStack
suffered from the following interleaving:

thread 1: event.from ["vm/"]
thread 2: event.from ["task/"]
 <- event on vm
thread 1: return successfully
thread 2: fails with SESSION_INVALID because the subscription
  is marked as invalid in the on_session_deleted cleanup function

In this patch we cleanly separate the event.next implementation
(in the submodule Next) from the event.from implementation
(in the submodule From). It's now clear that all calls to
event.from use *separate* state, sharing only the session_id
which is used to allow session.logout to be used to unblock
blocked threads.

This patch strengthens the event.from postcondition: the only time
when an empty event set is returned is when the timeout has fired.
Previously empty sets could be triggered by the race:

1.     object is modified
2. event.from sees the modification
3.     object is destroyed
4. event.from cannot find the snapshot, omits the modify event

Signed-off-by: David Scott <dave.scott@eu.citrix.com>
abb5327
David Scott quicktest: add a test case for parallel calls to event.from
Sophisticated clients like CloudStack will issue parallel calls to
event.from, looking for different conditions. These calls should
not interfere.

Signed-off-by: David Scott <dave.scott@eu.citrix.com>
08dffe7
@johnelse johnelse added the For hotfix label Jul 8, 2014
@johnelse
Xapi Project member

While this is a good fix, xapi is relying on the buggy behaviour in Clearwater, so this fix will have to go in at the same time:

#1953

@simonjbeaumont
Xapi Project member

@jonludlam can you advise whether this is still required? It mentions it will need an ACK from you. Do we have any associated tickets for this, otherwise it is going to be pointless putting it in -bugfix since it will never be cherry picked out.

@djs55
Xapi Project member
djs55 commented Feb 27, 2015

It looks like the patch is in trunk:

commit 75a4e079ef915650bf0b3d8c0c48cde3b6ba1a69
Author: David Scott <dave.scott@eu.citrix.com>
Date:   Thu May 1 18:25:32 2014 +0100

    quicktest: add a test case for parallel calls to event.from

    Sophisticated clients like CloudStack will issue parallel calls to
    event.from, looking for different conditions. These calls should
    not interfere.

    Signed-off-by: David Scott <dave.scott@eu.citrix.com>

commit 6675872e7a201ec38b1d76fa0d272e1a9bb35741
Author: David Scott <dave.scott@eu.citrix.com>
Date:   Thu May 1 18:23:10 2014 +0100

    event.from: prevent interference between parallel calls

I believe cloudstack worked around this by reverting their use of the event system :( but this means there's no point making a hotfix

@djs55 djs55 closed this Feb 27, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.