
refactor: New API for work-stealing from QueueExecutionContext #3853

Closed
wants to merge 8 commits

Conversation

@WojciechMazur (Contributor)

Based on the discussion in #3144 about a new API for NativeExecutionContext that shall remain maintainable under future changes to the Scala standard library.
Currently we'd like to add only the work-stealing part, so threads can help with completing tasks from the QueueExecutionContext; this is especially important in the context of #3852, where we can have multiple task producers.

I'm not sure if we should add a scheduling capability at this point; it can probably be added in a separate PR.

  • Add a dedicated trait implemented and exposed by NativeExecutionContext.queue
  • Deprecate runtime.loop; use NativeExecutionContext.queue.helpComplete instead
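For illustration, a minimal sketch of what such a dedicated trait could look like. The method names (stealWork, helpComplete, hasAvailableTasks) are taken from the discussion below; the queue-backed implementation is a toy stand-in, not the actual NativeExecutionContext:

```scala
import scala.collection.mutable
import scala.concurrent.ExecutionContext

// Hypothetical shape of the work-stealing API discussed in this PR.
trait WorkStealing { self: ExecutionContext =>
  // Run up to `maxSteals` queued tasks on the calling thread.
  def stealWork(maxSteals: Int): Unit
  // Keep executing tasks until the queue is drained.
  def helpComplete(): Unit
  // True if there is at least one task ready to run.
  def hasAvailableTasks: Boolean
}

// Toy single-threaded, queue-backed executor for demonstration only.
object QueueEC extends ExecutionContext with WorkStealing {
  private val queue = mutable.Queue.empty[Runnable]
  def execute(r: Runnable): Unit = queue.enqueue(r)
  def reportFailure(t: Throwable): Unit = t.printStackTrace()
  def hasAvailableTasks: Boolean = queue.nonEmpty
  def stealWork(maxSteals: Int): Unit = {
    var steals = 0
    while (steals < maxSteals && hasAvailableTasks) {
      queue.dequeue().run()
      steals += 1
    }
  }
  def helpComplete(): Unit = while (hasAvailableTasks) stealWork(1)
}
```

A thread that wants to help would submit work via execute and then call stealWork or helpComplete to run pending tasks itself.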

@catap (Contributor) commented Mar 28, 2024

@WojciechMazur this still doesn't allow implementing an integration with an event loop. An example implementation before your refactoring:

  // Run one queued SN task if present, then poll libuv without blocking;
  // with nothing queued, block on libuv until the next event arrives.
  private def loopRunOnce(): Boolean = {
    if (queue.nonEmpty) {
      val runnable = queue.remove(0)
      try runnable.run()
      catch { case t: Throwable => ExecutionContext.global.reportFailure(t) }
      if (uv_loop_alive(uvLoop) != 0) {
        uv_update_time(uvLoop)
        uv_run(uvLoop, uv_run_mode.NOWAIT)
      }
    } else if (uv_loop_alive(uvLoop) != 0) uv_run(uvLoop, uv_run_mode.ONCE)
    (uv_loop_alive(uvLoop) != 0) || queue.nonEmpty
  }

Here I need to know whether the executor has anything queued to run, or whether I may block until an event arrives on the loop.

With this PR it is still impossible to do.

Comment on lines 61 to 70
  override def stealWork(maxSteals: Int): Unit = {
    if (maxSteals <= 0) ()
    else {
      var steals = 0
      while (steals < maxSteals && hasAvailableTasks) {
        doStealWork()
        steals += 1
      }
    }
  }
@catap (Contributor) commented Mar 28, 2024

Suggested change
  // Original:
  override def stealWork(maxSteals: Int): Unit = {
    if (maxSteals <= 0) ()
    else {
      var steals = 0
      while (steals < maxSteals && hasAvailableTasks) {
        doStealWork()
        steals += 1
      }
    }
  }
  // Suggested:
  override def stealWork(maxSteals: Int): Boolean = {
    if (maxSteals > 0) {
      var steals = 0
      while (steals < maxSteals && hasAvailableTasks) {
        doStealWork()
        steals += 1
      }
    }
    queue.hasAvailableTasks
  }

Like this it allows implementing the libuv integration without any issue.

@WojciechMazur (Author)

Maybe it's the issue of looking at this at 2 AM, but I don't really get how returning a boolean helps here. Wouldn't an additional call to a public method returning the same value be feasible? The previous snippet with libuv usage especially seems to confirm that, because I assume it could be rewritten as follows:

  val queue = QueueExecutionContext
  private def loopUV(): Boolean = {
    queue.stealWork(1) // execute first or noop
    if (queue.hasAvailableTasks) {
      if (uv_loop_alive(uvLoop) != 0) {
        uv_update_time(uvLoop)
        uv_run(uvLoop, uv_run_mode.NOWAIT)
      }
    } else if (uv_loop_alive(uvLoop) != 0) uv_run(uvLoop, uv_run_mode.ONCE)
    (uv_loop_alive(uvLoop) != 0) || queue.hasAvailableTasks
  }

Disclaimer: I lack context about the usage of libuv; I have no experience with it and don't really know how you're integrating it into the rest of the code.
My main issue with making stealWork return a Boolean is that it might be confusing. I'd rather expect the boolean to signal that it has stolen some work, instead of informing that it can steal more tasks.

@catap (Contributor)

Yes, making hasAvailableTasks public helps here.

Regarding the original issue, some context: libuv is just an example event loop. You may think about it as poll or select. The goal is to mix poll/select with the Scala Native event loop using trivial logic:

  • if SN has something else to run, run the multiplexer without blocking, to fetch what it already has;
  • if SN hasn't got anything else, block on it until something happens.
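To make the intended interleaving concrete, here is a pure-Scala sketch of those two rules. ExternalLoop is a toy stand-in for libuv, not a real binding: alive plays the role of uv_loop_alive, and runOnce(blocking) of uv_run with NOWAIT/ONCE:

```scala
import scala.collection.mutable

// Toy external event source; a real one would block in runOnce(true).
final class ExternalLoop {
  val events = mutable.Queue.empty[Runnable]
  def alive: Boolean = events.nonEmpty
  def runOnce(blocking: Boolean): Unit =
    if (events.nonEmpty) events.dequeue().run()
}

final class MixedLoop(sn: mutable.Queue[Runnable], ext: ExternalLoop) {
  // One iteration: prefer SN tasks and poll the external loop without
  // blocking; with nothing queued, block on the external loop instead.
  def runOnce(): Boolean = {
    if (sn.nonEmpty) {
      sn.dequeue().run()
      if (ext.alive) ext.runOnce(blocking = false)
    } else if (ext.alive) ext.runOnce(blocking = true)
    sn.nonEmpty || ext.alive
  }
}
```

Driving `while (loop.runOnce()) ()` then drains both sources, never blocking while SN still has work.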

@armanbilge (Member)

Thanks for working on this API!

I'm not sure if we should add scheduling capability at this point, probably this can be added in a separate PR

I think we should think about this now before this API is finalized.

For example, consider that the runtime supports the enqueuing of timers. What should the timer-stealing API look like? If there is a timer scheduled, but it will not fire until some time in the future, does isWorkStealingPossible return true or false? How can the user determine how long to wait for the timer?

For any user application built on top of this API to be backwards-compatible when scheduling capabilities are introduced, we must put in the hooks now.

Similarly, what about I/O? It should be possible to support AsynchronousSocketChannel within this framework but we need to add APIs to allow stealing of I/O events.

@WojciechMazur (Author)

Ok, I have to admit that with the remaining parts it might not be so trivial.

What should the timer-stealing API look like?

If we don't want to force any particular dependencies, then for timers we can simply add a dedicated queue like you do in CE, and then execute or move timed-out timers to the front of the compute queue. We can inject this at the beginning of the stealWork calls, though it's not the most efficient. I'm not really sure we could allow determining how long to wait for the timers. Up to this point the native ExecutionContext was immutable and opaque, so we might expose a mutable config to set up defaults.

If there is a timer scheduled, but it will not fire until some time in the future, does isWorkStealingPossible return true or false?

Tough question. If we assume the EC is just computation-bound, then it shall return false. Maybe an additional def untilNextScheduled(): Option[FiniteDuration] would allow the user to determine whether awaiting the next event makes sense?
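As a sketch of that idea, here is a hypothetical timer queue that feeds expired timers to the front of a compute queue, with untilNextScheduled reporting the remaining delay. All names are assumed from the discussion; the injectable clock exists only to make the sketch deterministic:

```scala
import scala.collection.mutable
import scala.concurrent.duration._

// Hypothetical timer queue feeding the compute queue, as proposed above.
final class TimerQueue(now: () => Long) {
  private case class Timer(deadline: Long, task: Runnable)
  // Earliest deadline at the head (PriorityQueue dequeues the "max").
  private val timers =
    mutable.PriorityQueue.empty[Timer](Ordering.by((t: Timer) => -t.deadline))

  def schedule(delay: FiniteDuration, task: Runnable): Unit =
    timers.enqueue(Timer(now() + delay.toMillis, task))

  // Time until the next timer fires, if any timer is scheduled.
  def untilNextScheduled(): Option[FiniteDuration] =
    timers.headOption.map(t => (t.deadline - now()).max(0L).millis)

  // Move expired timers to the front of the compute queue.
  def transferExpired(compute: mutable.ArrayDeque[Runnable]): Unit =
    while (timers.headOption.exists(_.deadline <= now()))
      compute.prepend(timers.dequeue().task)
}
```

stealWork would then call transferExpired first, and a caller can use untilNextScheduled to decide how long it may safely block elsewhere.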

Similarly, what about I/O?

At that point we'd also need to provide full support based on epoll, kqueue, or overlapped operations on Windows. Doable, but it would take time. And we're already behind schedule.

As long as we're dealing only with a computation-bound queue it's fine, but when we dig into I/O and async stuff we quickly get to the point where we need a fully functional event loop. I'll try to dig into these areas tomorrow, although I hoped we would be able to make a release soon.
I'd appreciate any drafts or more info about what kind of operations CE would request, to make integration easier.

@armanbilge (Member)

Doable, but it would take time. And we're already behind schedule.

I agree, and as I have said before I do not think that there is an urgency:

Cats Effect might also benefit from this, but we have also been successful without it—it is not a blocker.

#3144 (comment)

Until Scheduling and I/O APIs land in the Scala stdlib it might be difficult to know exactly how this should look.


As long as we're dealing with only computation-bound queue it's fine, but when we dig into the I/O, async stuff then we quickly get to the point, where we need to have a fully functional event loop.

Yes, indeed. That is the point of the interfaces that we are proposing to add to the stdlib: to create an event loop abstraction. The power of this abstraction is that it would be cross-platform (JVM / JS / {single,multi}-threaded Native) and async-framework agnostic (Future / IO / ZIO / Gears / Ox), the same way that ExecutionContext is a universal API for compute tasks today.


Cats Effect might also benefit from this, but we have also been successful without it—it is not a blocker.

Along these lines, I think it is important to take a step back and consider why we are even adding these APIs anyway. The currently proposed use case for these APIs is to make it possible to interleave two event loops (QueueExecutionContext and an external loop, e.g. libuv). But if an application is using libuv, then why is it submitting tasks to the QueueExecutionContext at all, instead of to e.g. a UvExecutionContext?

In Scala, hard-coding an ExecutionContext instead of taking it as a configurable (implicit) parameter is generally considered bad practice. So any library that has hard-coded QueueExecutionContext or ExecutionContext.global is following bad practices and should be fixed. In that case, the stealing APIs proposed here are not necessary at all, as demonstrated by Cats Effect.

@catap (Contributor) commented Mar 31, 2024

#3144 (comment)

Until Scheduling and I/O APIs land in Scala stdlib it might be difficult to know exactly how it should look like.

As a guy who has implemented Scheduling and async I/O for Scala Native via libuv, and who is struggling with porting this to 0.5.0: I plan to release the mail repository I'm working on under the AGPL quite soon (I really hope, really-really), but this part isn't critical for the solution and I can extract it under the public domain, which allows using it everywhere, like I did for SHA and Blake3.

libuv can be avoided, but it requires some work to reimplement support for different systems, with tons of compatibility bugs that first have to be discovered and then fixed.

@JD557 (Contributor) commented Mar 31, 2024

Until Scheduling and I/O APIs land in Scala stdlib it might be difficult to know exactly how it should look like.

Is there any work actually being done on this, or is this just hypothetical? The last comment on the contributors thread is more than one year old.

Maybe it's OK to just go ahead with this change and then, if needed, implement the Scheduling API in 0.6.x?

I think this is the 3rd time this feature has been implemented, so there's clearly some interest in it, and I fear that we are letting perfect be the enemy of good.

@catap (Contributor) left a comment

@WojciechMazur thanks for the new API. It makes me very happy that the ugly hack with rawPtr is finally removed.

@armanbilge (Member) commented Mar 31, 2024

Is there any work actually being done on this, or is this just an hypothetical? The last comment on the contributors thread is more than one year old.

Yes. The necessary changes are being made to tooling so that the stdlib is open for additions. As soon as it is, I intend to propose these new interfaces. See Discord thread.


I think it's the 3rd time this feature is implemented, so there's clearly some interest in this,

Right, but interest does not justify adding it. As I wrote in #3853 (comment), are we sure that these APIs are necessary or just a workaround for bad practices?

Basically I would like to understand if there is a better explanation than "libraries should be able to hard-code ExecutionContext.global even though it is a bad practice and other libraries should be able to work around that via these APIs". Because in my opinion this is a poor reason to add these APIs.

@catap thoughts?

@catap (Contributor) commented Mar 31, 2024

Is there any work actually being done on this, or is this just an hypothetical? The last comment on the contributors thread is more than one year old.

Yes. The necessary changes are being made to tooling so that the stdlib is open for additions. As soon as it is, I intend to propose these new interfaces. See Discord thread.

Well, if we move to implementing NIO, we definitely need something as a dummy inside Scala.js to make the ecosystem complete.

I do have some wrappers around NIO which work with SJS and SN, but they're quite limited.

Frankly speaking, this task is quite amazing and huge, especially when it comes to debugging it and making it stable enough.

I think it's the 3rd time this feature is implemented, so there's clearly some interest in this,

Right, but interest does not justify adding it. As I wrote in #3853 (comment), are we sure that these APIs are necessary or just a workaround for bad practices?

Basically I would like to understand if there is a better explanation than "libraries should be able to hard-code ExecutionContext.global even though it is a bad practice and other libraries should be able to work around that via these APIs". Because in my opinion this is a poor reason to add these APIs.

@catap thoughts?

Without this "work stealing" API, it is impossible to mix the SN event loop and an event loop for sockets via libuv or a similar library.

When multithreading works and is available, an event loop can be moved to a dedicated thread and there is no need for such an API.

But not all platforms support multithreading now, and some new platforms may not support it.

So, we need a way to allow it on such platforms at least.

@armanbilge (Member)

Without this "stealing work" API, it is impossible to mix SN event loop and event loop for socket via libuv or similar library.

I agree. However, my question is: why is it necessary to mix the event loops to begin with?

@catap (Contributor) commented Mar 31, 2024 via email

@armanbilge (Member) commented Mar 31, 2024

Without such mixing it is impossible to create any server which.

@catap Have you looked at the Typelevel stack? We've had a working HTTP stack since 2022 without using any "hacks" or mixing event loops. You may also be interested to study the implementation because it supports NIO without libuv.

@catap (Contributor) commented Mar 31, 2024 via email

@armanbilge (Member)

In my case it works with standard execution context, like scala-native-loop does, which seems cleaner and more user friendly.

Right, so if I understand correctly, you are confirming what I already said in #3853 (comment):

Basically I would like to understand if there is a better explanation than "libraries should be able to hard-code ExecutionContext.global even though it is a bad practice and other libraries should be able to work around that via these APIs". Because in my opinion this is a poor reason to add these APIs.

As I said, I believe this API is being introduced as a workaround for poor library design that does not follow best practices. I do not think that is a good justification to introduce this API.

@armanbilge (Member)

standard execution context

@catap just to clarify: by "standard" do you mean "global", or do you mean ExecutionContext in general? For the record, it is 100% possible to build an ExecutionContext that is backed by the libuv event loop.

@armanbilge (Member)

And the hack is simple moved into it.

What hack? There is no hack.

@catap (Contributor) commented Mar 31, 2024 via email

@catap (Contributor) commented Mar 31, 2024

standard execution context

@catap just to clarify: by "standard" do you mean "global", or do you mean ExecutionContext in general? For the record, it is 100% possible to build an ExecutionContext that is backed by the libuv event loop.

As "standrd" I mean scala.concurrent.ExecutionContext.Implicits.global

@armanbilge (Member)

Let's assume that we have two libraries L1 and L2, and both require their own ExecutionContext. We are doomed, aren't we?

Yes. This is because the libraries have not followed best practices. Best practice is to never hardcode an ExecutionContext, including ExecutionContext.global. Best practice is to instead take an implicit ec: ExecutionContext as a parameter. See for example the design of Future.

If L1 and L2 both accept an ExecutionContext as a parameter, then you can share the same ExecutionContext between them.
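A tiny sketch of that wiring. The library names L1/L2 follow the hypothetical example in this thread, and fetch/store are made-up methods; the point is only that neither library hard-codes an executor:

```scala
import scala.concurrent.{ExecutionContext, Future}

// Neither library hard-codes an ExecutionContext; both take it implicitly.
object L1 {
  def fetch(x: Int)(implicit ec: ExecutionContext): Future[Int] =
    Future(x + 1)
}

object L2 {
  def store(x: Int)(implicit ec: ExecutionContext): Future[Int] =
    Future(x * 2)
}
```

The application then picks a single ExecutionContext (global, queue-based, or a libuv-backed one) and passes the same instance to both libraries.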

@catap (Contributor) commented Mar 31, 2024 via email

@armanbilge (Member) commented Mar 31, 2024

L1 requires to use L1ExecutionContext, L2 requires L2ExecutionContext.

I'm not sure what you mean by "require". Can you give an example?

ExecutionContext is an abstraction:

trait ExecutionContext {
  def execute(runnable: Runnable): Unit
  def reportFailure(cause: Throwable): Unit
}

Suppose that L1ExecutionContext and L2ExecutionContext both add extra methods to the interface, for example:

trait L1ExecutionContext extends ExecutionContext {
  def doL1Thing(runnable: Runnable): Unit
}

trait L2ExecutionContext extends ExecutionContext {
  def doL2Thing(runnable: Runnable): Unit
}

In this case, I completely agree with you that it would be impossible for L1 or L2 to use the other library's ExecutionContext. You are indeed "doomed".

But if that is the case you are in, I do not see how the changes in this PR will help you at all. The fundamental problem is that you cannot mix two event loops in a single-threaded application. There is no way around that.

@catap (Contributor) commented Mar 31, 2024 via email

@armanbilge (Member)

and scala.scalanative.runtime.NativeExecutionContext is just another implementation of ExecutionContext which adds some methods so it can be easily integrated into someone's event loop or similar thing.

Yes, and this is not necessary. If some library L1 is able to use the "standard" global ExecutionContext abstraction but library L2 needs a special L2ExecutionContext, then the user should simply pass L2ExecutionContext as a parameter to L1. Then there will be only one ExecutionContext in the application and everything will work properly.

Frankly, I don't understand the reason of this discussion.

Me neither :) I'm afraid we are talking past each other. I understand your goal is to implement a single-threaded server. This is possible without the changes in this PR and without the hack you are currently using. I know this because this is exactly what we've done in Typelevel.

@JD557 (Contributor) commented Apr 1, 2024

I was thinking a bit more about this last night, and I still have some questions regarding rejecting this:

First, SIP-51 only proposes dropping forwards binary compatibility for the 2.13 stdlib, so even if the Schedule API lands, Scala Native will still need to support an implementation without Schedule until 2.12 support is dropped.

Second, regarding "libraries should be able to hard-code ExecutionContext.global even though it is a bad practice and other libraries should be able to work around that via these APIs", I think this PR actually allows the opposite.
With this, libraries that are now hard-coding NativeExecutionContext.queue can request an implicit ec: ExecutionContext with WorkStealingExecutor, and then it's up to the application to pass the execution context (which might usually be NativeExecutionContext.queue, but could be something else).

On that note, maybe it would be nice if QueueExecutionContext was a class instead of a private object (or at least had some public constructor), kind of like ExecutionContext.fromExecutorService(null) creates an alternative to the default global EC.
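The first suggestion can be sketched like this. WorkStealingExecutor and its method name are hypothetical, following the comment above, and the toy queue stands in for NativeExecutionContext.queue:

```scala
import scala.collection.mutable
import scala.concurrent.ExecutionContext

// Hypothetical capability trait a library could request via an
// intersection type instead of hard-coding NativeExecutionContext.queue.
trait WorkStealingExecutor { def stealWork(maxSteals: Int): Unit }

// Toy queue-backed implementation standing in for the real one.
object ToyQueueEC extends ExecutionContext with WorkStealingExecutor {
  private val tasks = mutable.Queue.empty[Runnable]
  def execute(r: Runnable): Unit = tasks.enqueue(r)
  def reportFailure(t: Throwable): Unit = t.printStackTrace()
  def stealWork(maxSteals: Int): Unit = {
    var n = 0
    while (n < maxSteals && tasks.nonEmpty) { tasks.dequeue().run(); n += 1 }
  }
}

object Lib {
  // The library requests the refined type; the application decides
  // which concrete executor to pass.
  def runWithHelp(body: => Unit)(
      implicit ec: ExecutionContext with WorkStealingExecutor): Unit = {
    ec.execute(() => body)
    ec.stealWork(Int.MaxValue) // drain pending tasks on the calling thread
  }
}
```

Because the requirement is expressed as a type, the application stays in control of which event loop actually runs the tasks.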

@WojciechMazur (Author)

On that note, maybe it would be nice if QueueExecutionContext was a class instead of a private object (or at least had some public constructor), kind of like ExecutionContext.fromExecutorService(null) creates an alternative to the default global EC.

Yeah, I was thinking about it, but then we'd need to somehow track all instances of QueueExecutionContext and create a loop to iterate over all of them; otherwise the SN runtime would never be able to advance them, so it would be the user's responsibility to always remember to iterate the loop.

However, maybe we should not expose any API at all? What if the iteration over the internal QueueExecutionContext happened automatically in some well-known parts of the Java standard library, e.g. Thread.sleep, Thread.join, LockSupport.park, Object.wait? A call to any of these gives an opportunity to iterate over the queue execution context.

LockSupport.park is used to implement Await.result/ready, so it would be enough to cover the basic needs.

@catap (Contributor) commented Apr 1, 2024 via email

@catap (Contributor) commented Apr 1, 2024

To be clear: https://github.com/scala-native/scala-native-loop is quite official, and releasing the next version of scala-native which breaks it, without any way to fix it, is insane, isn't it?

@JD557 (Contributor) commented Apr 1, 2024

Yeah, I was thinking about it, but then we'd need to somehow track all instances of QueueExecutionContext and create a loop to iterate over all of them; otherwise the SN runtime would never be able to advance them, so it would be the user's responsibility to always remember to iterate the loop.

This was actually one of the use cases which I think would be great. This would effectively work as a better parasitic.

  • You can start the computation when you want
  • You can limit the computation (e.g. instead of running helpComplete(), you would run stealWork(timeout))

However, maybe we should not expose any API at all? What if the iteration over the internal QueueExecutionContext happened automatically in some well-known parts of the Java standard library, e.g. Thread.sleep, Thread.join, LockSupport.park, Object.wait? A call to any of these gives an opportunity to iterate over the queue execution context.

I don't have a strong opinion about this, although, for reference, I have some code that relies on SDL_sleep instead of Thread.sleep because of #2850. I wonder if cases like that could also be covered 🤔

Move to scalanative.concurrent package
@WojciechMazur (Author)

I've prepared the next iteration of the API, with 2 main additions:

  • Scheduling - a separate queue of scheduled timers, transferred to the head of the computeQueue in the stealWork methods
  • EventHandlers - the idea is to have a handle to a computation that can be resumed when some external event (thread, signal handler, callback) signals readiness to continue execution. At that point the task is moved to the head of the compute queue. Handlers are modeled as stateful tasks (based on CE3 handlers), allowing repeated execution. When integrating with libuv, the signal to the handler can be issued e.g. from the uv_fs_open or uv_fs_read callback.
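A rough sketch of the EventHandlers idea. All names here are assumed, the queue is a toy model of the compute queue, and signal models what e.g. a uv_fs_read callback would invoke:

```scala
import scala.collection.mutable

// Toy model of the compute queue: signalled tasks go to the head.
final class ComputeQueue {
  val tasks = mutable.ArrayDeque.empty[Runnable]
  def prependTask(r: Runnable): Unit = tasks.prepend(r)
  def runAll(): Unit = while (tasks.nonEmpty) tasks.removeHead().run()
}

// A stateful, repeatable handler: each signal from an external event
// source (thread, signal handler, libuv callback) re-enqueues the step
// at the head of the compute queue for the next loop iteration.
final class EventHandler(queue: ComputeQueue)(step: () => Unit) {
  def signal(): Unit = queue.prependTask(() => step())
}
```

Unlike a one-shot Runnable, the same handler can be signalled repeatedly, matching the "stateful tasks allowing repeated execution" description above.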

@armanbilge (Member) commented Apr 2, 2024

Second, regarding "libraries should be able to hard-code ExecutionContext.global even though it is a bad practice and other libraries should be able to work around that via these APIs", I think this PR actually allows the opposite.
With this, libraries that are now hard-coding NativeExecutionContext.queue can request an implicit ec: ExecutionContext with WorkStealingExecutor, and then it's up to the application to pass the execution context (which might usually be NativeExecutionContext.queue, but could be something else).

@JD557 Any library that is currently hard-coding NativeExecutionContext.queue can change to request an implicit ec: ExecutionContext today and just work without us making any changes to Scala Native. Meanwhile, the application can pass any ExecutionContext whether it is NativeExecutionContext.queue or UvExecutionContext or whatever.

This is precisely my point :) I'm not convinced that anybody actually needs these "stealing" APIs unless they are hard-coding an ExecutionContext instead of taking it as a parameter.

There are really two kinds of libraries we are talking about:

  1. Libraries that need to schedule async tasks, these should take an ExecutionContext parameter
  2. Libraries that are implementing ExecutionContext as an event loop that possibly also supports timers/scheduling and I/O.

When wiring up an application, the user just needs to pass the EC from (2) to (1).


To be clear: https://github.com/scala-native/scala-native-loop is quite official, and releasing the next version of scala-native which breaks it, without any way to fix it, is insane, isn't it?

@catap I am confident there is a way to fix it. Again, I strongly encourage you to look at Cats Effect where we have already implemented a working event loop without these APIs and without any hacks. I don't know what else to say, it really is possible.


@WojciechMazur I appreciate the iteration and I'll take a look, but based on this discussion I've become less convinced that we need this API at all, unless we are deliberately supporting the hard-coding use case. If we insist that we are not supporting that use case, then I no longer see the motivation for these changes.

@catap (Contributor) commented Apr 2, 2024

To be clear: https://github.com/scala-native/scala-native-loop is quite official, and releasing the next version of scala-native which breaks it, without any way to fix it, is insane, isn't it?

@catap I am confident there is a way to fix it. Again, I strongly encourage you to look at Cats Effect where we have already implemented a working event loop without these APIs and without any hacks. I don't know what else to say, it really is possible.

I need a layer which allows using the same code on the JVM, Native, and JS, for both DOM and Node. I try to avoid any dependencies here, and frankly speaking it has only a few dependencies.

Adding Cats Effect seems a bit overkill here, especially since that code base is stable enough and I don't see any reason to move it to something else; frankly speaking, I'd rather make one more RawPtr hack without touching it again, to stabilize it.

Thus, the right move is implementing the NIO API inside SN and SJS, but as far as I recall that won't happen for the latter project, so I'm stuck with my wrapper anyway.

@WojciechMazur (Author)

When it comes to hard-coded execution contexts in the core Scala Native repo, there is only 1 usage of them: the testing interface. When running tests we use either a ForkJoinPool-based execution context when multithreading is enabled, or the queue execution context when running in single-threaded mode. Because we're spawning a new process for each native test runner, we need to support this hard-coded queue.
The only alternative I can think of would be to use a ServiceProvider to specify a custom execution context, although it would currently require configuration on the user side, unless we came up with a solution to support service providers that are referenced in the code instead of being explicitly enabled. This way test framework/library authors could hint about using a different execution context underneath.
Otherwise, we might leave the simplified API allowing only iteration over compute-bounded tasks in the queue execution context (WorkStealing) to make integration possible. This part makes the most sense to me, as it would work as a safety hatch when integrating with different executors/event loops, allowing interleaved execution of tasks between the two.

@catap (Contributor) commented Apr 2, 2024

Otherwise, we might leave the simplified API allowing only iteration over compute-bounded tasks in the queue execution context (WorkStealing) to make integration possible. This part makes the most sense to me, as it would work as a safety hatch when integrating with different executors/event loops, allowing interleaved execution of tasks between the two.

Frankly speaking, I really think that it should be backported to upstream Scala.

@armanbilge (Member)

When it comes to hard-coded execution contexts in the core Scala Native repo there is only 1 usage of them - testing interface.

Yes, that's right. Fortunately this is not a problem in practice, because these test tasks happen at the "end of the world" i.e. they do not need to be interleaved with other user tasks submitted during testing. This is why we could integrate successfully in Cats Effect.


Otherwise, we might leave the simplified API allowing only iteration over compute-bounded tasks in the queue execution context (WorkStealing) to make integration possible.

I still have not seen a good reason that this needs to be a public API.

This part makes the most sense to me, as it would work as a safety hatch when integrating with different executors/event loops, allowing interleaved execution of tasks between the two.

In my opinion, this safety hatch is not necessary, and it encourages poor library designs that hard-code the ExecutionContext instead of taking it as a parameter.

Just for the record, can anyone here provide an example of a library that does that, and why it cannot be fixed to take an ExecutionContext as a parameter?

I'm sorry, I don't mean to be obstructive: I just want us to be confident that there is a real-world motivation for making this change. Implementing event loops is an advanced topic and I am concerned that this API is being added out of confusion instead of because it is necessary.

@ekrich (Member) commented Apr 2, 2024

I'm sorry, I don't mean to be obstructive: I just want us to be confident that there is a real-world motivation for making this change. Implementing event loops is an advanced topic and I am concerned that this API is being added out of confusion instead of because it is necessary.

I am tending to side with passing an execution context. I think this is different in kind from what Scala Native has to do internally to manage single- and multi-threaded modes. Although I am certainly not an expert in this area, it seems to be the approach that library creators should use, and it is what is taught to users as well.

I do also agree with @catap that changes that apply equally to all three platforms should be implemented upstream where that makes sense.
