Fix races in event.from and add a test case #1719

Closed
wants to merge 2 commits into
from

Conversation

Projects
None yet
4 participants
@djs55
Collaborator

djs55 commented Apr 30, 2014

  • CloudStack uses 1 session and multiple threads, each calling event.from. Unfortunately each thread shared state on the server, causing some of these threads to spuriously fail with SESSION_INVALID (triggering task cleanup and cascading errors).
  • CloudStack expects an empty event set to signify an error. Unfortunately a badly-timed Modify followed by a Delete could cause this to happen

The new test case captures the first failure triggered by CloudStack. It fails without the fix and passes afterwards.

This code needs some careful review from @jonludlam

David Scott added some commits Apr 29, 2014

David Scott
event.from: prevent interference between parallel calls
Since the new event.from API implementation was based on top
of the existing event.next implementation it was possible
for one thread to interfere with another because each event.from
*shared* the 'subscription' record. In particular CloudStack
suffered from the following interleaving:

thread 1: event.from ["vm/"]
thread 2: event.from ["task/"]
 <- event on vm
thread 1: return successfully
thread 2: fails with SESSION_INVALID because the subscription
  is marked as invalid in the on_session_deleted cleanup function

In this patch we cleanly separate the event.next implementation
(in the submodule Next) from the event.from implementation
(in the submodule From). It's now clear that all calls to
event.from use *separate* state, sharing only the session_id
which is used to allow session.logout to be used to unblock
blocked threads.

This patch strengthens the event.from postcondition: the only time
when an empty event set is returned is when the timeout has fired.
Previously empty sets could be triggered by the race:

1.     object is modified
2. event.from sees the modification
3.     object is destroyed
4. event.from cannot find the snapshot, omits the modify event

Signed-off-by: David Scott <dave.scott@eu.citrix.com>
David Scott
quicktest: add a test case for parallel calls to event.from
Sophisticated clients like CloudStack will issue parallel calls to
event.from, looking for different conditions. These calls should
not interfere.

Signed-off-by: David Scott <dave.scott@eu.citrix.com>
@johnelse

This comment has been minimized.

Show comment
Hide comment
@johnelse

johnelse Oct 22, 2014

Collaborator

While this is a good fix, xapi is relying on the buggy behaviour in Clearwater, so this fix will have to go in at the same time:

#1953

Collaborator

johnelse commented Oct 22, 2014

While this is a good fix, xapi is relying on the buggy behaviour in Clearwater, so this fix will have to go in at the same time:

#1953

@simonjbeaumont

This comment has been minimized.

Show comment
Hide comment
@simonjbeaumont

simonjbeaumont Feb 27, 2015

Collaborator

@jonludlam can you advise whether this is still required? It mentions it will need an ACK from you. Do we have any associated tickets for this, otherwise it is going to be pointless putting it in -bugfix since it will never be cherry picked out.

Collaborator

simonjbeaumont commented Feb 27, 2015

@jonludlam can you advise whether this is still required? It mentions it will need an ACK from you. Do we have any associated tickets for this, otherwise it is going to be pointless putting it in -bugfix since it will never be cherry picked out.

@djs55

This comment has been minimized.

Show comment
Hide comment
@djs55

djs55 Feb 27, 2015

Collaborator

It looks like the patch is in trunk:

commit 75a4e079ef915650bf0b3d8c0c48cde3b6ba1a69
Author: David Scott <dave.scott@eu.citrix.com>
Date:   Thu May 1 18:25:32 2014 +0100

    quicktest: add a test case for parallel calls to event.from

    Sophisticated clients like CloudStack will issue parallel calls to
    event.from, looking for different conditions. These calls should
    not interfere.

    Signed-off-by: David Scott <dave.scott@eu.citrix.com>

commit 6675872e7a201ec38b1d76fa0d272e1a9bb35741
Author: David Scott <dave.scott@eu.citrix.com>
Date:   Thu May 1 18:23:10 2014 +0100

    event.from: prevent interference between parallel calls

I believe cloudstack worked around this by reverting their use of the event system :( but this means there's no point making a hotfix

Collaborator

djs55 commented Feb 27, 2015

It looks like the patch is in trunk:

commit 75a4e079ef915650bf0b3d8c0c48cde3b6ba1a69
Author: David Scott <dave.scott@eu.citrix.com>
Date:   Thu May 1 18:25:32 2014 +0100

    quicktest: add a test case for parallel calls to event.from

    Sophisticated clients like CloudStack will issue parallel calls to
    event.from, looking for different conditions. These calls should
    not interfere.

    Signed-off-by: David Scott <dave.scott@eu.citrix.com>

commit 6675872e7a201ec38b1d76fa0d272e1a9bb35741
Author: David Scott <dave.scott@eu.citrix.com>
Date:   Thu May 1 18:23:10 2014 +0100

    event.from: prevent interference between parallel calls

I believe cloudstack worked around this by reverting their use of the event system :( but this means there's no point making a hotfix

@djs55 djs55 closed this Feb 27, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment