Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring of event queue #390

Merged
merged 20 commits into from
Apr 11, 2024
Merged

Refactoring of event queue #390

merged 20 commits into from
Apr 11, 2024

Conversation

byeonggiljun
Copy link
Collaborator

@byeonggiljun byeonggiljun commented Mar 8, 2024

In this PR, the event queue stores events using pqueue_tag to sort them in tagged order. Consequently, events scheduled with a tag with a non-zero microstep are handled the same as other events (next_q is deleted).

Some functionalities in pqueue_tag.c and pqueue_base.c are also changed.

The motivation for this refactoring is:

  • to prevent a federate sends NET (0, 1) when the federate has received T_MSG (0, 0). Normally, we can infer that the next event's tag is one microstep later than the current tag if the next event's time is the same as the current time. For example, if a next event's time is 1 sec and the current tag is (1 sec, 1), that event needs to be executed at (1 sec, 2) because every event at (1 sec, 1) must be popped before advancing to (1 sec, 1). The code below (in get_next_event_tag) shows that inference. However, an exception occurs at the start tag. At the start tag (0, 0), the event queue can contain an event with (0, 0). We still infer that the event's tag is (0, 1) and send NET (0, 1), which is wrong. If events are sorted by tag, not time, we can use each event's tag as the target tag.

if (next_tag.time == env->current_tag.time) {
LF_PRINT_DEBUG("Earliest event matches current time. Incrementing microstep. Event is dummy: %d.",
event->is_dummy);
next_tag.microstep = env->current_tag.microstep + 1;

  • to make the scheduling process simpler. Because the events are sorted by time, not tag, we had to put chained dummy events to implement an event with a positive microstep. After this refactoring, we can schedule events with a tag because events are sorted by their tag. Thus, we can remove those chained dummy events and the next queue.

@byeonggiljun byeonggiljun changed the title Refactoring of event_q and scheduling for tags with non-zero microsteps Refactoring of event queue Mar 13, 2024
@byeonggiljun byeonggiljun marked this pull request as ready for review March 13, 2024 03:01
Copy link
Member

@hokeun hokeun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

lib/schedule.c Outdated Show resolved Hide resolved
@byeonggiljun byeonggiljun marked this pull request as draft March 19, 2024 15:28
@byeonggiljun
Copy link
Collaborator Author

Convert this to draft because mac-os tests are failing.

@byeonggiljun byeonggiljun added mac and removed mac labels Apr 1, 2024
@petervdonovan
Copy link
Contributor

petervdonovan commented Apr 2, 2024

The test that is failing has both decentralized execution and physical actions. It is very hard to make such a test both useful (in the sense of catching bugs) and non-flaky. In particular, DistributedLoopedPhysicalActionDecentralized checks that messages from the sender are processed in the correct order, which won't (necessarily) be true if there are STP violations.

I don't have a strong opinion about how to fix this. We have discussed pipe-dream kinds of improvements to the testing framework that could solve this sort of thing, but in the near term, I guess we can choose from a few unappealing options, such as moving the test to failing, not checking that messages are received in the right order, or deferring the check until the end after the tardy message has been received. I suppose I would lean toward the second option. The third of these could be rather complex, and it is already difficult to interpret the tests.

@byeonggiljun
Copy link
Collaborator Author

byeonggiljun commented Apr 2, 2024

Thank you, @petervdonovan! I also agree that the test doesn't have to check the order of messages. I'll make a simple LF PR if it's the reason for the error.

Update: I found an error and fixed it. The problem was not in the test. And I think now this PR is ready for merging.

@byeonggiljun byeonggiljun marked this pull request as ready for review April 3, 2024 15:06
Copy link
Contributor

@edwardalee edwardalee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to review this more carefully, but I have one high-level question:
You changed the semantics of the cmppri comparison functions.
They used to return 0 for a < b and non-zero otherwise. You now have them returning -1 for a < b, 0 for a==b, and 1 for a > b.

Why did you make this change? The pqueue implementation doesn't need this, I think, and this change is what resulted in most of the changes in the base classes. Is this change necessary?

core/utils/pqueue_tag.c Outdated Show resolved Hide resolved
@byeonggiljun
Copy link
Collaborator Author

byeonggiljun commented Apr 10, 2024

Why did you make this change? The pqueue implementation doesn't need this, I think, and this change is what resulted in most of the changes in the base classes. Is this change necessary?

Thank you for starting the review, @edwardalee! I have a reason for that change although I'm not sure that change is the only solution.

Please look at the code below. This is a part of find_equal_same_priority in pqueue_base.c. This function is needed here (lib/schedule.c#L201-L205) to check if there is an already scheduled event with the same trigger and the same tag.

if (q->getpri(curr) == q->getpri(e) && q->eqelem(curr, e)) {

We use the == operator to check whether two priorities are identical. This works for the time-ordered queue. Nonetheless, pqueue_tag's priority is a pointer to a tag and the == operator does not work. Also, currently the cmp_pri (thiz, that) function only checks whether thiz > that or not. This is why I made the cmp_pri function return 0 if two priorities are the same. As a result, we can check the equality of the two priorities as below.

if (q->cmppri(q->getpri(curr), q->getpri(e)) == 0 && q->eqelem(curr, e)) {

Copy link
Contributor

@edwardalee edwardalee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very nice piece of work! I love that quite a bit complicated code is removed. I've made some suggestions for further code removal and some comment updates. All of these are optional.

core/utils/pqueue_support.h Outdated Show resolved Hide resolved
include/core/utils/pqueue_base.h Show resolved Hide resolved
include/core/utils/pqueue_base.h Outdated Show resolved Hide resolved
include/core/utils/pqueue_base.h Outdated Show resolved Hide resolved
include/core/utils/pqueue_base.h Outdated Show resolved Hide resolved
lib/schedule.c Outdated Show resolved Hide resolved
include/core/reactor_common.h Outdated Show resolved Hide resolved
include/core/lf_types.h Outdated Show resolved Hide resolved
core/utils/pqueue_tag.c Outdated Show resolved Hide resolved
core/utils/pqueue_tag.c Outdated Show resolved Hide resolved
@byeonggiljun
Copy link
Collaborator Author

Seems this PR is ready to be merged. Let me put this into the merge queue.

@byeonggiljun byeonggiljun added this pull request to the merge queue Apr 11, 2024
Merged via the queue into main with commit 64c3f01 Apr 11, 2024
30 checks passed
@byeonggiljun byeonggiljun deleted the event-queue-refactoring branch April 11, 2024 23:21
Copy link
Collaborator

@erlingrj erlingrj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @byeonggiljun and @edwardalee, I have a few questions about this. This seems like a good design, but there is something weird about how the priority queue is implemented where we are forced to cast between points and integers and it is not really portable.

I think we need to have pqueue_pri_t as a void *, this would work with no problem for the pqueue_tag.c as you have implemented it here, but for the pqueue.c that is used for the reaction queue, it would not work directly. But it should also adapt this design where reactions are sorted based on the values of their fields and never based on the integer values of the pointers.

Comment on lines +41 to +65
//////////////////
// Local functions, not intended for use outside this file.

/**
* @brief Callback function to determine whether two events have the same trigger.
* This function is used by event queue and recycle.
* Return 1 if the triggers are identical, 0 otherwise.
* @param event1 A pointer to an event.
* @param event2 A pointer to an event.
*/
static int event_matches(void* event1, void* event2) {
return (((event_t*)event1)->trigger == ((event_t*)event2)->trigger);
}

/**
* @brief Callback function to print information about an event.
* This function is used by event queue and recycle.
* @param element A pointer to an event.
*/
static void print_event(void* event) {
event_t* e = (event_t*)event;
LF_PRINT_DEBUG("tag: " PRINTF_TAG ", trigger: %p, token: %p", e->base.tag.time, e->base.tag.microstep,
(void*)e->trigger, (void*)e->token);
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These function definition seem a little out-of-place here in environment.c why not have them in pqueue.c or pqueue_tag.c?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rationale for this is to avoid creating unnecessary dependencies.
Neither pqueue.c nor pqueue_tag.c have any dependence on event_t.
They are more generic. It is in evironment.c that pqueue_tag is adapted to contain items of type event_t. The way it does that is by providing these two functions as arguments to pqueue_tag_init_customize.
These functions are made static, so this design of the event queue is entirely defined in environment.c.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I see, thanks for the explanation.

@@ -75,11 +64,21 @@ static void pqueue_tag_print_element(void* element) {
//////////////////
// Functions defined in pqueue_tag.h.

int pqueue_tag_compare(pqueue_pri_t priority1, pqueue_pri_t priority2) {
return (lf_tag_compare(((pqueue_tag_element_t*)priority1)->tag, ((pqueue_tag_element_t*)priority2)->tag));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here an unsigned long long is casted to an event_t * this will result in warnings on any architecture where sizeof(void *) != sizeof(unsigned long long)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that pqueue_base should be changed so pqueue_pri_t is a void*. This, however, requires redesigning the reaction queue, which has a (probably premature) optimization that combines the deadline and the level into a single index that it uses for ordering reactions. I think this should be done as a separate PR. Do you want to open an issue? Probably we should benchmark the performance impact. There is likely to be a cost.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is covered by this issue: #388

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants