Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update scheduling policy in C++ target #236

Closed
cmnrd opened this issue Nov 12, 2020 · 23 comments
Closed

Update scheduling policy in C++ target #236

cmnrd opened this issue Nov 12, 2020 · 23 comments
Assignees
Labels
bug Something isn't working cpp Related to C++ target runtime Related to the runtime implementation

Comments

@cmnrd
Copy link
Collaborator

cmnrd commented Nov 12, 2020

If an action is scheduled twice at the same tag, the C++ runtime currently overwrites the already scheduled event with the new event. This should be updated to schedule the second event (and any subsequent events) with an additional microstep delay in order to align with the policy implemented in the C target.

@cmnrd cmnrd added the bug Something isn't working label Nov 12, 2020
@cmnrd cmnrd self-assigned this Nov 12, 2020
@lhstrh
Copy link
Member

lhstrh commented Nov 12, 2020

Sidenote: this is not the full picture. We also need support for minimum spacing enforcement. Here's an updated algorithm:

image

@edwardalee
Copy link
Collaborator

edwardalee commented Nov 13, 2020 via email

@cmnrd
Copy link
Collaborator Author

cmnrd commented Nov 13, 2020

Minimum spacing only applies to physical actions, right? I was only referring to logical actions here. However, physical actions also need to be adjusted.

@lhstrh
Copy link
Member

lhstrh commented Nov 13, 2020 via email

@lhstrh
Copy link
Member

lhstrh commented Nov 13, 2020

@Soroosh129 and I just constructed an example that we think illustrates why the current implementation of schedule may be considered problematic.

It consists of two source reactors that create short bursts of outputs in superdense time such that their outputs never coincide. A downstream reactor, a logical AND, prints its output. Aside from the tags changing when an after is added (which is to be expected), the result is now a sequence of 1s instead of a sequence of 0s.

Here's the output without the after:

---- Start execution at time Fri Nov 13 14:15:43 2020
---- plus 794518999 nanoseconds.
output: 0, tag: (0, 0)
output: 0, tag: (0, 1)
output: 0, tag: (0, 2)
output: 0, tag: (0, 3)
output: 0, tag: (0, 4)
output: 0, tag: (0, 5)
output: 0, tag: (0, 6)
output: 0, tag: (0, 7)
output: 0, tag: (0, 8)
output: 0, tag: (0, 9)
output: 0, tag: (100000000, 0)
output: 0, tag: (100000000, 1)
output: 0, tag: (100000000, 2)
output: 0, tag: (100000000, 3)
output: 0, tag: (100000000, 4)
output: 0, tag: (100000000, 5)
output: 0, tag: (100000000, 6)
output: 0, tag: (100000000, 7)
output: 0, tag: (100000000, 8)
output: 0, tag: (100000000, 9)
output: 0, tag: (200000000, 0)
---- Elapsed logical time (in nsec): 200,000,000
---- Elapsed physical time (in nsec): 200,094,736

And with the after:

---- Start execution at time Fri Nov 13 14:16:19 2020
---- plus 403409050 nanoseconds.
output: 1, tag: (1000000, 0)
output: 1, tag: (1000000, 1)
output: 1, tag: (1000000, 2)
output: 1, tag: (1000000, 3)
output: 1, tag: (1000000, 4)
output: 1, tag: (101000000, 0)
output: 1, tag: (101000000, 1)
output: 1, tag: (101000000, 2)
output: 1, tag: (101000000, 3)
output: 1, tag: (101000000, 4)
WARNING: Token being freed that has already been freed: 0x55fdff7ed0a0
---- Elapsed logical time (in nsec): 200,000,000
---- Elapsed physical time (in nsec): 200,153,940

@edwardalee
Copy link
Collaborator

I don't really see why this is problematic. This is the semantics of "after" as realized now. If you are going to mix microstep delays with time delays, you need to understand the semantics. Before we jump overboard here, we need to understand the implications of the alternative. I suspect we will find ourselves having to expose microsteps in everything, adding considerable complexity to the language (and possibly a significant performance hit). Are we willing to pay that price? The tutorial intro to Lingua Franca will have include a complete explanation of and defense of superdense time. Programmers will have to understand what this means, for example:

  a.out -> b.in after (10 msec, -3);

@lhstrh
Copy link
Member

lhstrh commented Nov 13, 2020

To me, it is strange that a construct that is parameterized only by a time value would not only affect the time value but also the microstep. How do we justify this? Why would it be reasonable to expect this unrequested side-effect?

I'm probably missing something here, but why do you suppose that preserving the microstep implies that we need microsteps to be featured in any of our API functions or syntax? I've never had no such intention at all, and I've certainly never proposed to allow negative microstep adjustments...

@Soroosh129
Copy link
Contributor

Soroosh129 commented Nov 13, 2020

I would argue that our current semantics might be even harder to explain without microsteps because by adding after, you could get a different logic (i.e., logical simultaneity where there isn't one) in your program, as the example shows. With a different logic that preserves microsteps, the behavior of the program with respect to microstep ordering will not alter by adding an after.

@edwardalee
Copy link
Collaborator

I'm probably missing something here, but why do you suppose that preserving the microstep implies that we need microsteps to be featured in any of our API functions or syntax? I've never had no such intention at all, and I've certainly never proposed to allow negative microstep adjustments...

How do you get back to microstep 0? How do prevent microsteps from diverging?

@lhstrh
Copy link
Member

lhstrh commented Nov 14, 2020

Microstep indexes will creep up whenever a reactor makes iterations in superdense time by repeatedly scheduling a logical action with zero delay. This is a problem in any program where unbounded iterations in superdense time happen. I fail to see why inserting an after should be considered a cure for this. I think it makes more sense to warn against doing this and encourage the use of a physical action in scenarios where such iterations are necessary, in which case it is also abundantly clear that there should be no presumption of simultaneity.

Also notice that, in our example, if the producers take no "rest time," then no output will be produced at all; the messages will never be seen by the downstream consumer and the program will accumulate events in the event queue until it runs out of memory. Subtle problems arise with superdense time iterations no matter what semantics of schedule we choose.

@edwardalee
Copy link
Collaborator

I don't understand the second paragraph. Can you give an example?

On the first paragraph, I think the key difference is in the current semantics, microsteps are more relative than absolute. If you have an iteration created by repeated calls to schedule with zero delay, the microstep increases, but only one step at a time. With the proposed revision, what concerns me is the divergence of microsteps. At logical time t, I could have an event at microsteps 0, 1042, and 2,456,562.

@Soroosh129
Copy link
Contributor

Soroosh129 commented Nov 14, 2020

I might be able to explain because we discovered the issue @lhstrh raised in the second paragraph together. Using the same Logic.lf example, the source sends messages at each microstep until microstep 9, then takes a break for 100 msec:

reactor Source(even:bool(true)) {
    output out:bool;
    ...
    state count:int(0);
    logical action next;
    reaction(startup, next) -> next, out {=
        ...
        // Set out every other microstep
        ...
        if (self->count < 9) {
            self->count++;
            schedule(next, 0);         // Increment the microstep
        } else {
            schedule(next, MSEC(100)); // Take a break
            self->count = 0;
        }
    =}
}

This is similar to the sender in DistributedLoopedAction.lf. In this case, the receiver will see 10 microsteps (0-9) if there is no after on the connection and 5 microsteps (0-4) if there is an after delay.

However, what happens if the sender doesn't take a break? In the case of no after delay, the downstream reactor will be triggered as expected, seeing microsteps 0 to infinity. However, when we use an after delay, the downstream reactor will never get a chance to see the output of the Sender because the outputs are in the future (e.g., by 1 msec in the example), and thus the program will freeze. Interestingly, it doesn't even timeout because the logical time does not make it past 0.

@edwardalee
Copy link
Collaborator

That is a stuttering Zeno program. You can make such a program under the proposed semantics just as easily. In fact, any reactor that reschedules indefinitely with zero delay will block the advancement of time under either semantics.

@lhstrh
Copy link
Member

lhstrh commented Nov 14, 2020 via email

@edwardalee
Copy link
Collaborator

I’m confused. I thought this issue was trying to come to a resolution on whether microsteps should be preserved by after and schedule (proposed “new semantics”). I agree that doing fancy things with microsteps requires understanding the semantics. That is true with either semantics.

In Ptolemy II, microsteps are most commonly used in modal models. The designer of the modal model infrastructure had to thoroughly understand the semantics of microsteps, but not the users of modal models. I think the proposed semantics would change that. In effect, a modal model in some mode A may have events at microstep N, where N is the number of mode changes that have been taken since the program started. This would occur, for example, if each mode uses either schedule or a feedback loop with after. Suppose each mode also has a timer. Then it will have events at microstep 0 and N. Events whose timing is determined by a feedback loop with after will not be simultaneous with timer events except in the initial mode, and the spacing between these events will diverge.

One way to fix this problem would be to say that a mode change takes more than zero time. In effect, this discards the major advantage of microsteps. Another way to fix it would to allow negative microstep delays on after and schedule. But this won’t work for modal models because you want the negative delay only at the first tag after entering a mode. This means after would have a conditional delay. E.g.,

A.out -> B.in after if just_entered (10 msec, -1) else (10 msec, 0);

Where “just_entered” is some built in variable.

Every writer of modal models would have understand these nuances...

It is certainly possible in Ptolemy II to build odd models with unexpected behavior by using microsteps combined with logical delays. But I think this will be true of the proposed semantics as well. The price we pay with the proposed semantics, I think, is that the default behavior is unlikely to be what you want in most cases, and that getting what you want will require new, complex syntax. And it will require introducing the nuances of microsteps right from the start.

One possible solution is to keep the current semantics, but augment the syntax to allow explicit tag_intervals as arguments to after and schedule. E.g.

A.out -> B.in after 10 msec; // Discards microstep.
A.out -> B.in after (10 msec, 0); // Preserves microstep.
A.out -> B.in after (10 msec, -1); // Modifies microstep.

I could live with this, although I have no use cases. Since it’s unlikely to be used, we probably don’t need to change skip events to carry a payload. I’m not sure anyone would ever see the resulting performance hit. (Actually, I’m not even sure skip events with payloads will perform better than what we have now.)

Note that, fundamentally, this is syntactic sugar. If want to preserve microsteps, that can be done by hand with the current semantics by carrying the microstep in the payload of the message and writing the receiving reaction to reschedule the event until the microstep matches. This is a manual version of our skip events.

Also, currently, skip events are only used for federated communication. The proposed semantics will result in them being used for all executions, federated or not. Will this be a performance hit?

@Soroosh129
Copy link
Contributor

Soroosh129 commented Nov 15, 2020

Then it will have events at microstep 0 and N. Events whose timing is determined by a feedback loop with after will not be simultaneous with timer events except in the initial mode, and the spacing between these events will diverge.

I understand why it is important to lineup with microstep 0. Nevertheless, in doing so, the current semantics would also collapse the N states of the modal model into an arbitrarily smaller set downstream depending on the structure of the upstream reactors. This in turn will create logical simultaneity where there isn't one. I think this problem outweighs the nicety of aligning with 0.

In addition to that, Lingua Franca is based on a reactive model, meaning that even with the proposed new semantics, downstream reactions will get triggered at the desired logical time. Therefore, without knowing the nuances of microsteps, the proposed new semantics would work as expected. I don't think a programmer can reasonably expect the following two reactions to occur simultaneously:

reaction(timer) {=
    // Do something at microstep 0
=}
reaction(port_from_upstream) {=
   // Do something at microstep x
=}

Even with a reaction(timer, port_from_upstream) you wouldn't (and shouldn't) expect both triggers to be present at microstep 0. I do realize however that if, for some reason, you would want timer and port_from_upstream to be simultaneous, you would need to have an after delay with negative microsteps. But is this a use-case that should be cared for in a reactive model?

Also, currently, skip events are only used for federated communication. The proposed semantics will result in them being used for all executions, federated or not. Will this be a performance hit?

Currently, we do not increment the logical time one by one. Rather, we pop an event from the event queue and directly advance the current logical time to the time of that event. There is no technical limitation that would prevent us from doing so for microsteps.

@edwardalee
Copy link
Collaborator

These are valid points. I guess the question is, how much do we want to be “all in” on super dense time? This discussion reminds me a bit of the debate we had over how to handle time in modal models in Ptolemy II. The multi form model we ended up with, I think, was not expected by anyone when we started...

@cmnrd
Copy link
Collaborator Author

cmnrd commented Nov 16, 2020

Since we agreed we want to be more disciplined about how we organize our discussions, I want to point out that this is not the right issue for having a general discussion about LF scheduling semantics. The issue was intended as a reminder for myself to eventually bring the C++ runtime up to date with the current semantics. However, I don't think GitHub provides a mechanism for moving posts around, so I will also add my bits here...

Is there a concrete proposal on how to solve the issue? As far as I can see, any mechanism that preserves microsteps will require a full API that allows to specify the precise logical time and micro step delays in all places. I think if we go done this path, our time type actually needs to always include the signed microstep. Without going all in, there will be other examples that behave weirdly and where we are not able to express the simultaneity of certain events. However, making microstep a first order citizen in LF is a very high price to pay for an alleged gain in expressiveness.

I cannot see anything wrong with the current semantics. Although the result might be unexpected to some people, it follows clear rules. Apart from a contract outside the LF semantics, the two source reactors schedule events independently and send messages at certain tags. Due to the after, these messages lead to the scheduling of new events. Thereby it can happen that two new logical simultaneous events are scheduled, although the original events were not simultaneous. I don't see anything wrong with that. Actually, this could be seen as an optimization, as simultaneous events are cheaper to handle than two consecutive events. Only because we specify a contract outside of LF that puts the two events in relation to each other, we consider the behavior to be weird. I argue that microsteps simply should not be used in that way.

I actually think that the example Marten posted is erroneous. It assumes that not setting an output port carries information, which it clearly does not under current semantics. The program would work as expected if it actively sets the output port to false in the off state. This is something you would need to do in any other language that I can think of, especially in an HDL implementing logic circuits. So why should we want something different?

If we really want to preserve the information of absent messages, we could do so with another mechanism that does not require an API for scheduling events with a microstep delay. We could simply send null messages when a reaction does not set a port which it lists under its effects. While this certainly comes with an overhead, it would naturally solve the problem at hand without introducing new problems in other places (at least as far as I can see).

@cmnrd
Copy link
Collaborator Author

cmnrd commented Nov 16, 2020

I don't think a programmer can reasonably expect the following two reactions to occur simultaneously:

reaction(timer) {=
    // Do something at microstep 0
=}
reaction(port_from_upstream) {=
   // Do something at microstep x
=}

If we cannot expect logical simultaneity in the case above, why can we expect it in the case below?

reaction(port_a_from_upstream) {=
    // Do something at microstep x
=}
reaction(port_b_from_upstream) {=
   // Do something at microstep y
=}

@lhstrh
Copy link
Member

lhstrh commented Nov 16, 2020

Here's an updated algorithm:
image

@lhstrh lhstrh added cpp Related to C++ target runtime Related to the runtime implementation labels Jan 14, 2021
@cmnrd cmnrd added this to the RC 0.1 milestone Mar 23, 2021
@cmnrd cmnrd removed this from the 0.1.0 milestone Feb 25, 2022
@cmnrd cmnrd added this to the 0.2.0 milestone Feb 25, 2022
@lhstrh
Copy link
Member

lhstrh commented Oct 11, 2022

@cmnrd is there still anything actionable with respect to this issue, or can we close this?

@cmnrd
Copy link
Collaborator Author

cmnrd commented Oct 11, 2022

The posted scheduling algorithm is still not implemented in C++... but I also don't know how it can be implemented efficiently without either searching the entire event queue or keeping a local copy of all events that trigger the particular action. The strategy currently used in the C++ runtime is still to overwrite any existing events with the same tag instead of deferring to the next microstep. To me overwriting the event still appears to be the more intuitive (and predictable) way of resolving conflicts, but this is a whole other discussion...

@cmnrd
Copy link
Collaborator Author

cmnrd commented Jun 14, 2024

Closing this. The discussion in #1464 concluded that the current replace behavior should be the default.

@cmnrd cmnrd closed this as completed Jun 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cpp Related to C++ target runtime Related to the runtime implementation
Projects
None yet
Development

No branches or pull requests

4 participants