Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring of waitset #341

Closed
12 of 13 tasks
orecham opened this issue Nov 9, 2020 · 6 comments
Closed
12 of 13 tasks

Refactoring of waitset #341

orecham opened this issue Nov 9, 2020 · 6 comments
Assignees
Labels
enhancement New feature
Milestone

Comments

@orecham
Copy link
Contributor

orecham commented Nov 9, 2020

Brief feature description

  • The current wait set API does not provide any mechanism for wait sets to identify the type of conditions that caused them to wake up.

  • The thread safety of the waitset is unclear. Do we support attach/detachCondition while another tread is doing the wait()?

  • to be discussed, Would it make sense to have the conditions coupled to an (custom) ID and return a container of IDs instead of a container of conditions

  • Destroying a condition that is still attached to a waitset results in a error handler call. Shall we make this more robust and have a callable stored in the condition for things like detaching in d'tor?

  • The condition variable must be cleaned up in shared memory when the waitset is deleted (Ports are using the destroy() call from the BasePort and are cleaned up in the discovery loop, we need something similar for the condition variable), see also Clean up all shared memory resources when corresponding user objects go out of scope #369

  • Attach classes to a waitset which are copy- and/or movable.

  • Decrease the size of TriggerState to 8 bytes (if possible)

  • WaitSet triggerCapacity as template parameter

  • overload attachTo in subscriber and user trigger so that a callback can be attached without providing a trigger id

  • rename triggerCapacity into capacity in waitset

  • rename attachTo in subscriber into attachEvent

  • rename acquireTrigger into acquireTriggerHandle in waitset

  • +1 for usertrigger, add this to WaitSet readme

Detailed information

Current wait set usage is as follows:

auto vectorOfFulfilledConditions = myWaitSet.wait();
for(auto& element : vectorOfFulfilledConditions)
{
     if(element == &mySubscriber1)
     {
         // Subscriber1 has received new data
         ChunkHeader myData;
         mySubscriber1.tryGetChunk(myData);
         doSomeThingWithTheNewData(myData);
    }
}

Here you can check each individual condition, but this can quickly become tedious if there are many attached conditions.
It would be beneficial to be able to identify the type of condition so that generic handling logic can be implemented.

@orecham orecham self-assigned this Nov 9, 2020
@orecham orecham changed the title Identify triggered conditions when using waitsets Identify the type of triggered conditions when using waitsets Nov 9, 2020
@budrus
Copy link
Contributor

budrus commented Nov 9, 2020

There are some more topics to discuss for the waitset:

  • thread safety for attach/detachCondition vs wait. I.e. is it allowed to attach or detach conditions while another thread is executing a wait loop. I feel we need this if we would use the waitset for handling callbacks in a posh runtime thread.
  • Allow the usage of custom IDs to optimize the way how the waitset is processed. I have the idea to e.g. use function refs as IDs for a waitset. Then we do not need to search for the function ref that maps to a condition when the wait returns but just can call the returned function refs directly.
  • Currently the d'tor of a condition calls the error handler if it is still attached to a waitset. Should we have something like a function ref that the waitset can set and which the condition calls in the d'tor to detach from the waitset? When I have a look on the current example, I see no detach and we somehow rely on the order of object destruction.

@ithier Would it be fine for you to extend the scope of this issue to a more general waitset issue?

@budrus budrus added this to To do in v1.0 Nov 9, 2020
@budrus budrus added the enhancement New feature label Nov 9, 2020
@budrus budrus changed the title Identify the type of triggered conditions when using waitsets Refactoring of waitset Nov 10, 2020
@budrus budrus assigned budrus and unassigned orecham Nov 10, 2020
@orecham
Copy link
Contributor Author

orecham commented Nov 10, 2020

I have the idea to e.g. use function refs as IDs for a waitset. Then we do not need to search for the function ref that maps to a condition when the wait returns but just can call the returned function refs directly.
How would this look exactly ?

When I read this, I got the idea that the condition types could allow users to provide some callback via a setCallback() function or something which can be directly called from the function refs returned by the waitset (i.e. by implementing the call operator of the condition to call the provided callback).
e.g.

auto subscriberCondition = Subscriber<ValueType>();
subscriberCondition.setCallback([](Condition condition){
    auto subscriber = static_cast<Subscriber<ValueType>>(condition);
    subscriber->take()
        .and_then([](Sample<ValueType> sample){
            // Work with sample.
        })
        .if_empty([]{
            // Handle no new data.
        })
        .or_else([](iox::popo::ChunkReceiveError){
            // Handle error.
        });
});

auto waitset = WaitSet();
waitset.attachCondition(subscriberCondition);
auto triggeredConditions = waitset.wait();
for(auto& callback : triggeredConditions)
{
    callback();
}

I might have gone in the complete wrong direction with this though, am I close ?
My one worry is that the scoping of the callback might be a problem. I can't think of any concrete examples though. Maybe it's not actually an issue.

@budrus
Copy link
Contributor

budrus commented Nov 10, 2020

Hmm, when the subscriber stores a callback reference it maybe makes no sense. Then we could do something like subscriber.executeCallback() and do not need an extra ID in the waitset. I was more thinking about callbacks and subscribers stored individually and that the searching/mapping could be saved. But make it makes more sense to have callback refs stored in the subscribers or subscriber ports.

@budrus budrus added this to the Prio 1 milestone Nov 11, 2020
@elfenpiff elfenpiff assigned elfenpiff and unassigned budrus Nov 11, 2020
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Nov 12, 2020
Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Nov 12, 2020
… conditions are not depending on each other anymore

Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Nov 12, 2020
…cope

Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Nov 12, 2020
…mpty tests

Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Nov 12, 2020
Signed-off-by: Christian Eltzschig <me@elchris.org>
@MatthiasKillat
Copy link
Contributor

MatthiasKillat commented Nov 13, 2020

While I was thinking about a more general to implement a waitset I encountered the following issue. Note that it may not apply here, but you may have to consider it even here.

Assume we have a waitset W and have registered conditions, say C1 and C2 (which are all objects here).
Now we call W.wait(); and hence wait (blocking) until notified. In another thread (or process) some actions are taken and condition C2 becomes true. The thread also notifies the waitset W by calling trigger() (I think, but the exact method name is of no concern).
Now the problem arises. The wait call wakes up and checks the conditions. However, the condition can still change concurrently. That does not only include waking up and seeing the condition to be false. In principle, given sufficiently complex conditions we could read entirely invalid states (especially depending on multiple variables, think x == y for two integers but even for a single bool it is in general not guaranteed to not have torn or outdated reads) .

This is why conditions are supposed to be only manipulated under some kind of mutex protection to ensure proper operation. The bad part is of course that the window of time were those problems can happen is very small, so usually you will not observe this. This makes bugs related to it very hard to find. The consequences of this can be false positives and especially false negatives on wake-up.

Now, there may be hope to not need a mutex under special circumstances, such as having "monotonic" conditions which may be set to true by other contexts, but can only be reset by the waitset itself. This may not be enough though, I have not fully thought about it. However, having non-monotonic conditions will surely cause these subtle problems.

Just keep this in mind when you define the semantics of the waitset and implement it.
If this is not an issue, let me know (ideally with a short reason why not).

elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Nov 13, 2020
Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Nov 13, 2020
…prints a warning since you violate best practice

Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Nov 13, 2020
…was triggered

Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Nov 13, 2020
…condition was triggered

Signed-off-by: Christian Eltzschig <me@elchris.org>
@elfenpiff elfenpiff moved this from To do to In progress in v1.0 Nov 13, 2020
@MatthiasKillat
Copy link
Contributor

MatthiasKillat commented Nov 14, 2020

After further thought this is what I came up with. Note that this is only a very rough prototype but illustrates the use case as I see it https://github.com/MatthiasKillat/concurrency_primitives/blob/master/test_waitset_pub_sub.cpp

A goal was to not need an explicit mutex, there may be an implicit one in a semaphore.

One of my main gripes with the current waitset is unnecessary inheritance with those conditions (making it inflexible) and potential races (not sure!) in the condition checking. I believe it can be simplified but the use case needs to be better understood.

I extracted the core functionality but designed it in a more generic way. My assumption is: we have one waiter (more is a misuse for now but can probably be extended), and our conditions are essentially monotonic (e.g. if there was data and nobody took it, then it is still there, it cannot dissapear).

Now, my approach is to just say a waitset can register conditions bool(void) and optionally a callback void(void) for each of them. Doing so returns a Token from the Waitset, essentially a proxy object which allows interaction with the waitset but carries a bit more internal information about the specific condition it belongs to etc. A token has a unique id, which the holder of the token can query.

The waitset may call wait in some thread T1, were it will block until it is notified.
It can be notified via a token or directly by calling notify on itself, in the token case it is only really notified if the condition asscoiated with the token is actually true, otherwise nothing happens.

Assume the waitset was notifed. When it wakes up it has to check which internal conditions became true (by iterating, this seems unavoidable if one uses just one semaphore/condition variable internally for all conditions of the waitset). Here it is crucial that these conditions are monotonic, if they can become false concurrently (e.g. while iterating) we would need a mutex to protect anything that can change the value of the conditions(!).
It now knows which conditions were true, stores their ids in a result vector (called WakeUpSet) and resets them (here it is important thatwe have only one waiter who can do this).
It then proceeds to execute associated callbacks (if any) and returns from the wait with the ids of the conditions it found to be true. The ids can in principle be used to identify the condition, I wanted to return something very lightweight. If everything is done in the callbacks nothing needs to be returned...

In pseudocode:

Waitset w;
Token t1 = w.add(condition1, callback1);
Token t2 = w.add(condition2, callback2);

Thread1:
wakeUpSet = w.wait(); 
//when it wakes up and finds conditions that are true, 
//it executes the callbacks and returns the ids in the wakeUpSet
//we could consider returning something heavier, like references to the tokens or similar
//in principle the user has all information about which conditions caused the wake up
...
Thread2: 
t1.notify(); //causes the waitset to wake up and check conditions only if condition1 is true
...
Thread3:
t2.notify(); //causes the waitset to wake up and check conditions only if condition2 is true
...
Thread4:
w.notify(); //causes the waitset to wake up and check conditions

The interface is not perfect/final, but I think that is all there is to do, no inheritance needed. Currently removal of conditions is not possible and dynamic memory is used, both can be changed (the conditions must be a bit more restricted in this case, std::function is not an option then).

A subscriber can now simply hold a token (optionally) and interact with the waitset. This holds for any object that wants to work with a waitset.

Remember:
Prefer composition over inheritance.
Inheritance ideally only conveys a is_a relation, and gets weird if used incorrectly (I know, you can do interface enrichment etc. but it should be used sparingly.)

I only wanted to sketch the use scenario which I consider ideal, without losing generality if possible. :-)

elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 17, 2020
…cumented

Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 17, 2020
…hing works only via waitset

Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 17, 2020
…er to provide default arguments

Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 17, 2020
Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 17, 2020
Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 17, 2020
Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 17, 2020
… fixed an issue in the gateway example

Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 17, 2020
…tset destructor

Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 17, 2020
Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 17, 2020
… renamed trigger left overs into event and adjusted docu

Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 17, 2020
…lot of Us

Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 17, 2020
Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 17, 2020
Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit that referenced this issue Dec 17, 2020
iox-#341 implemented condition variable cleanup in waitset destructor
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 17, 2020
Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 17, 2020
Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 17, 2020
Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 18, 2020
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 18, 2020
Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 18, 2020
Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 18, 2020
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 18, 2020
Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit to ApexAI/iceoryx that referenced this issue Dec 18, 2020
Signed-off-by: Christian Eltzschig <me@elchris.org>
elfenpiff added a commit that referenced this issue Dec 18, 2020
…for-better-performance

Iox #341 improve waitset wait performance by returning TriggerInfo pointer
@budrus budrus moved this from Review to In progress in Sprints Jan 15, 2021
@elfenpiff
Copy link
Contributor

Attach classes to a waitset which are copy- and/or movable.

At the moment this is not a use case and I would pursue this when the need arises.

v1.0 automation moved this from In progress to Done Jan 19, 2021
Sprints automation moved this from In progress to Done Jan 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature
Projects
No open projects
v1.0
  
Done
Development

No branches or pull requests

4 participants