Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Output semantics and implementation #26

Open
j-hui opened this issue Jun 21, 2021 · 3 comments
Open

Output semantics and implementation #26

j-hui opened this issue Jun 21, 2021 · 3 comments

Comments

@j-hui
Copy link
Collaborator

j-hui commented Jun 21, 2021

I realize that we talk about I/O implementation as somehow one topic, but the implementation for input vs output turn out to be rather orthogonal. Thus, I'm starting this issue to specifically discuss output variables (and will open another issue to discuss input variables when the issues arise).

We want SSM programs (and their generated code) to be I/O agnostic, but it leads to this question: what should the behavior of the following program be:

f :: Ref Bool -> SSM ()
f o = do
  o <~ False

g :: Ref Bool -> SSM ()
g o = do
  o <~ True

m :: Ref Bool -> SSM ()
m o = fork [ f o, g o ]

Within SSM's semantics, o should have the value True at the end of the instant where m is called (assuming nothing else assigns to o), since the priority of g is lower than f due to how the two child processes are forked by m.

However, it's somewhat unclear what the behavior should be if o is an output variable. If we were to naively overload the assignment operation and eagerly transmit its value to the external environment, the environment would briefly see a glitch value of False before seeing the value settling to True, even though the semantics of the language say that these updates are happening at the same logical time. Worse, if the output variable has stream/continuous semantics (e.g., it represents a socket file descriptor), both False and True would be emitted (rather than overwriting one another), apparently at the same time.

(Note, though, that this does not happen for delayed assignments, because SSM's semantics specify that there can be at most one scheduled update in the event queue for each scheduled variable.)

To me, the intuitive behavior of only emitting True at the end of the instant suggests to me that we should take a "lazy" approach to output, where the intra-instant updates are accumulated in the scheduled variable, and only emitted at the end of the instant. But this incurs more latency than necessary, especially in the presence of long-running, lower priority continuations; that latency is less than desirable for a real-time language.

A more extreme option is to make instantaneous assignments to output completely non-effectful, which I am interested in entertaining. After all, what does it even mean to instantly compute and emit output?

@j-hui
Copy link
Collaborator Author

j-hui commented Jun 21, 2021

On the subject of implementing output variables, I came up with what I think is a reasonable eager output implementation that doesn't rely on vtables or special output ref types, i.e., the type implementations and codegen remain entirely agnostic to whether a variable happens to affect the external environment.

Here's some pseudocode for what the tick function looks, currently:

# start tick by performing all delayed assignments for current time
while event_queue.head().time is now:
  let sv = event_queue.dequeue_head()

  # perform update
  sv.update() # virtual function

  # wake up listening processes
  for t in sv.triggers:
    act_queue.enqueue(t.act)

# etc..

What we want is the value update to be "felt" by the external environment only if an event is an output variable, by calling out to some platform-specific output handler. So far, implementation suggestions have focused on pushing that into the sv's update function, or deferring the output handler to take place in a continuation (i.e., after # etc..).

My suggestion is to add the output handler to a scheduled variable's trigger list, and then invoke that output handler while iterating through the trigger list:

  # wake up listening processes
  for t in sv.triggers:
    case t:
      Act (a) -> act_queue.enqueue(a)
      OutputHandler (o) -> o()

This runs before any continuations during that instant, meaning it is still eager, but doesn't rely on update being virtual. Notably, this means we can do the exact same thing during instantaneous assignment, without needing to make assign virtual. In fact, this requires no modifications to the data type implementations. Also, it doesn't require any extra memory: with some careful bit hacking, we can union the struct act * and the output handler function pointer in the same field in struct trigger.

The cost we pay is an extra branch to check each trigger, though I think that can be optimized too (we should only need to check for an output handler once). We also still need to figure out from what interface we obtain these output references, since those will need to add this special trigger to the trigger list, but that's somewhat orthogonal to this discussion.

@Rewbert
Copy link
Collaborator

Rewbert commented Jun 21, 2021

The solution you pose sounds cool, and I think it would be fun to see how such an implementation looks like, but I am thinking about the problem you initially posed.

In the example program, where m forks functions f & g, is that really a compiler issue? I would think that that's an application error, i.e a poorly written program. All other programming languages surely have the same problem? Or maybe I am missing the point. It seems like we have been too careless with our resources, and now we have two processes that wish to modify the same resource at the same time.

@j-hui
Copy link
Collaborator Author

j-hui commented Jun 21, 2021

Other "asynchronous" languages usually leave the behavior of data races undefined, or at least nondeterministic, but only because they don't have the total ordering between processes we have (which, to me, is the point of our programming model).

My worry is that, with an eager implementation, an external effect is duplicated, for the same logical time, since the output handler may be invoked multiple times. It just seems inconsistent to say that this behavior is somehow undefined when we give such strong guarantees about the behavior of "races" on variables that aren't output variables.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants