Enumerator fiber yield #2002

ioquatix · 2018-11-03T08:54:17Z

This makes it possible to call Fiber.yield within an Enumerator block.

This is breaking async when non-blocking IO is used in an Enumerator: socketry/async#23

I'm not sure if this is the best solution, but it feels like the right approach.

The general idea is that the user should not have to worry about how Enumerator is implemented; Fiber.yield should work as expected.

funny-falcon · 2018-11-05T10:30:33Z

The second commit just cancels the first, and it tries to solve the problem in the wrong way...

Async should use transfer because, logically, transfer is for switching between fibers, and yield is for diving into a fiber.

funny-falcon · 2018-11-05T10:40:01Z

Fiber.resume/yield is for diving into a fiber as into an asymmetric coroutine. Enumerator is an asymmetric coroutine, which is why Fiber.resume/yield is suitable for it.

Fiber.transfer is for switching between symmetric coroutines, and "lightweight threads" are symmetric coroutines.

Fiber.resume/yield is just Fiber.transfer with the call stack maintained for you. Therefore, you can always use Fiber.transfer instead of Fiber.yield if you maintain the call stack yourself. But you don't need to.
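A minimal sketch of the distinction drawn above, for readers less familiar with the two styles. The method names asymmetric_demo and symmetric_demo are made up for illustration and do not come from the patch; the sketch only uses Fiber.new, Fiber#resume, Fiber.yield, Fiber#transfer and Fiber.current.

```ruby
require 'fiber' # Fiber#transfer lived in this library before Ruby 3.0

# Asymmetric style: resume dives into the fiber, and Fiber.yield always
# returns to whoever resumed it.
def asymmetric_demo
  generator = Fiber.new do
    Fiber.yield 1
    Fiber.yield 2
    :done
  end
  [generator.resume, generator.resume, generator.resume]
end

# Symmetric style: every switch names its target explicitly; there is no
# implicit "caller" to return to.
def symmetric_demo
  trace = []
  main = Fiber.current
  ping = pong = nil

  ping = Fiber.new do
    trace << :ping1
    pong.transfer
    trace << :ping2
    pong.transfer
  end

  pong = Fiber.new do
    trace << :pong1
    ping.transfer
    trace << :pong2
    main.transfer # control must be handed back by name
  end

  ping.transfer
  trace
end

p asymmetric_demo # => [1, 2, :done]
p symmetric_demo  # => [:ping1, :pong1, :ping2, :pong2]
```

In the symmetric version neither fiber finishes; both are simply left suspended once control has been transferred back to the main fiber.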

ioquatix · 2018-11-05T12:15:19Z

Ignoring Async for a moment, and thinking purely about Enumerator - do you think that the internal detail of how Enumerator is working should change the behaviour of user code?

Because, to me, I think behaviour of Fiber.yield should not change whether Enumerator is "internal" or "external". So, the 2nd commit is a way to achieve that.

That being said, my first approach was to use transfer, but it turns out to be impossible to use transfer because there is no way to resume the correct fiber. Consider the following:

#!/usr/bin/env ruby

require 'fiber'

class Fiberator
	def initialize(&block)
		@caller = nil
		@fiber = Fiber.new(&block)
	end
	
	def next
		return nil unless @fiber.alive?
		
		@caller = Fiber.current
		
		return @fiber.transfer(self)
	end
	
	def << value
		@caller.transfer(value)
	end
end

e = Fiberator.new do |y|
	while true
		Fiber.yield
		
		y << 10
	end
end

f = Fiber.new do
	puts e.next
	puts e.next
end

f.resume
f.resume # double resume

Once you transfer to another fiber, you MUST transfer back. Otherwise, your assumptions about the fiber stack are wrong and you don't know which fiber to resume.

So, using Fiber.transfer in Enumerator is impossible if we want to allow users to call Fiber.yield predictably.

Therefore, the only solution is the 2nd commit, which captures Fiber.yield, and correctly forwards it to user code.

Taking into account async, unfortunately it depends on nested resume/yield, so unless we implement our own stack, it's impossible to simply use transfer. It has similar issues to the above design too.

I did try implementing it here: https://github.com/socketry/async/blob/fiber-transfer/lib/async/scheduler.rb and I might have another go at trying to make it work, but it was tricky to get the right behaviour. I'm also concerned about performance of tracking that in "interpreted" code since by design fiber context switch needs to be fast. I'd rather pay a small cost in Enumerator than a big cost in async for every context switch.

funny-falcon · 2018-11-06T05:32:33Z

Impossible? I did it once, and it worked quite well for me: https://gist.github.com/funny-falcon/2023354
But today I really think Enumerator should not use Fiber.transfer.

The fact "async" uses nested Fiber.yield is a design mistake of "async", and it should not lead to bad decisions in Ruby.

I built some things on top of EventMachine that were quite close to "async" in features, and all attempts to use nested Fiber.yield led to errors. Using EM.next_tick always led to a much more composable and manageable solution, because symmetric coroutines should be scheduled with a symmetric mechanism.

ioquatix · 2018-11-06T05:34:26Z

I am interested in your patch, I will try it out.

> The fact "async" uses nested Fiber.yield is a design mistake of "async", and it should not lead to bad decisions in Ruby.

Can you explain why you think using Fiber.yield is a design mistake?

funny-falcon · 2018-11-06T05:46:31Z

I've already explained. But I will repeat:

Symmetric coroutines should not use an asymmetric control switch between them. An asymmetric control switch should only be used between a coroutine and the scheduler. A direct switch between coroutines should only be symmetric.

Async's Condition, Notification, Queue and Semaphore should not use Fiber#resume to dive into "tasks", but rather Reactor#<< to schedule "tasks" for future execution.

funny-falcon · 2018-11-06T05:58:30Z

> An asymmetric control switch should only be used between a coroutine and the scheduler.

But since Fiber.yield and Fiber#resume are already occupied by Enumerator, "async" has to implement its own asymmetric control switch on top of Fiber#transfer.

It was a big mistake to hide Fiber#transfer in a library and not expose it in core.

ioquatix · 2018-11-06T08:28:02Z

> It was a big mistake to hide Fiber#transfer in a library and not expose it in core.

You mean, require 'fiber'?

> Async's Condition, Notification, Queue and Semaphore should not use Fiber#resume to dive into "tasks", but rather Reactor#<< to schedule "tasks" for future execution.

I thought about this design. I wouldn't say it's better or worse. In some ways, it's better, in some ways, it's worse. I understand now what you are talking about though.

funny-falcon · 2018-11-06T08:51:16Z

> You mean, require 'fiber'?

Yep. Because of that, many people don't consider transfer at all.

> > Async's Condition, Notification, Queue and Semaphore should not use Fiber#resume to dive into "tasks", but rather Reactor#<< to schedule "tasks" for future execution.
> I thought about this design. I wouldn't say it's better or worse.

It is certainly better, because it uses the right tool for the task. As I've said, symmetric coroutines should be switched using only a symmetric mechanism. Reactor#<< is symmetric.

Think of it another way: when one uses an operating system's synchronization primitives, does the operating system switch tasks immediately? No, it schedules them for execution. And besides simplicity of implementation, this provides better composability.
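As a rough illustration of "schedule, don't dive", here is a small sketch. TinyReactor and TinyCondition are hypothetical names invented for this example and are not async's actual classes; the only asymmetric switch is between the reactor loop and a task, and signal merely enqueues waiters.

```ruby
require 'fiber' # Fiber#alive? and Fiber.current needed this before Ruby 3.1

# Hypothetical reactor: a run queue of fibers that are ready to continue.
class TinyReactor
  def initialize
    @ready = []
  end

  # Schedule a fiber for later execution instead of resuming it right now.
  def <<(fiber)
    @ready << fiber
    self
  end

  # The only place where fibers are resumed.
  def run
    while (fiber = @ready.shift)
      fiber.resume if fiber.alive?
    end
  end
end

# Hypothetical condition variable built on top of the run queue.
class TinyCondition
  def initialize(reactor)
    @reactor = reactor
    @waiting = []
  end

  def wait
    @waiting << Fiber.current
    Fiber.yield # back to the reactor loop
  end

  def signal
    @waiting.each { |fiber| @reactor << fiber }
    @waiting.clear
  end
end

def run_queue_demo
  log = []
  reactor = TinyReactor.new
  condition = TinyCondition.new(reactor)

  waiter = Fiber.new do
    log << :waiter_waits
    condition.wait
    log << :waiter_woke
  end

  signaler = Fiber.new do
    log << :signal_start
    condition.signal
    log << :signal_end # logged before the waiter wakes: signal did not switch fibers
  end

  reactor << waiter << signaler
  reactor.run
  log
end

p run_queue_demo # => [:waiter_waits, :signal_start, :signal_end, :waiter_woke]
```

The cost is that a woken task runs only on a later pass through the loop, not at the moment of the signal.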

funny-falcon · 2018-11-06T09:44:30Z

Probably, both Enumerator and async should use Fiber#transfer with their own stack maintenance. This way Fiber.yield will always mean "user-intended call into a coroutine".
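To make "Fiber#transfer with their own stack maintenance" concrete, here is a hedged sketch of an asymmetric coroutine built only from transfer. TransferCoroutine, its CALLERS stack, and the call/suspend method names are all hypothetical; this is not how Enumerator or async is implemented.

```ruby
require 'fiber' # Fiber#transfer lived in this library before Ruby 3.0

# Hypothetical helper: resume/yield semantics rebuilt from Fiber#transfer
# plus an explicit stack of waiting callers.
class TransferCoroutine
  CALLERS = [] # fibers waiting for a coroutine to hand a value back

  def initialize(&block)
    @fiber = Fiber.new do |*args|
      result = block.call(*args)
      # Emulate "return to caller on exit" by transferring explicitly.
      CALLERS.pop.transfer(result)
    end
  end

  # Plays the role of Fiber#resume.
  def call(*args)
    CALLERS.push(Fiber.current)
    @fiber.transfer(*args)
  end

  # Plays the role of Fiber.yield.
  def self.suspend(value = nil)
    CALLERS.pop.transfer(value)
  end
end

def transfer_stack_demo
  counter = TransferCoroutine.new do |start|
    step = TransferCoroutine.suspend(start)
    TransferCoroutine.suspend(start + step)
    :finished
  end
  [counter.call(10), counter.call(5), counter.call]
end

p transfer_stack_demo # => [10, 15, :finished]
```

Because this helper never touches Fiber#resume or Fiber.yield, those remain available to user code, but every participant has to go through the same explicit stack for the bookkeeping to stay correct.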

ioquatix · 2018-11-06T10:12:03Z

I understand your explanation.

> Think of it another way: when one uses an operating system's synchronization primitives, does the operating system switch tasks immediately? No, it schedules them for execution. And besides simplicity of implementation, this provides better composability.

I understand this. I agree with your reasoning and I think it's a valid concurrency model that is very common. For me, however, another thing to consider is determinism.

I think coroutines provide determinism which OS threads cannot. This is a major benefit because we schedule IO when it's possible, unlike the OS, which doesn't always know what to do next (i.e. which thread to resume). So, in theory, it's more efficient, because when we call resume we go directly to the code which should execute next. By putting it back into the reactor, we lose determinism and we also incur an overhead because every operation must go through the scheduler.

I don't really believe one can say which approach is better. They have different trade-offs IMHO.

ioquatix · 2018-11-06T10:26:54Z

> Probably, both Enumerator and async should use Fiber#transfer with their own stack maintenance. This way Fiber.yield will always mean "user-intended call into a coroutine".

Unfortunately, if you call Fiber.yield after Fiber.transfer, it's almost impossible to know what to call resume on. So, it cannot be composed together. At least, that was my experience when trying to implement it.

funny-falcon · 2018-11-06T11:00:57Z

Ah? If you call Fiber.yield after transfer, you return to the fiber that called Fiber#resume on the fiber that called yield.

transfer acts as a switch between independent control flows, and resume+yield acts as "calling into a coroutine"; therefore, resume always returns from the fiber it called into.

ioquatix · 2018-11-06T11:02:23Z

> Ah? If you call Fiber.yield after transfer, you return to the fiber that called Fiber#resume on the fiber that called yield.
>
> transfer acts as a switch between independent control flows, and resume+yield acts as "calling into a coroutine"; therefore, resume always returns from the fiber it called into.

Yes, that's right, but after transfer, then yield, what do you call resume on to get back?

funny-falcon · 2018-11-06T11:28:42Z

> Yes, that's right, but after transfer, then yield, what do you call resume on to get back?

That depends on what you mean by "back". There are many "backs".

require 'fiber'
queue = []
sched = Fiber.new do
  while fib = queue.shift
    puts "Schedule #{fib}"
    if f=fib.transfer
      f.resume # finish fiber
    end
  end
  puts "No tasks to execute"
end

task = lambda do |n|
  Fiber.new do
    subcoro = Fiber.new do |k|
      k = Fiber.yield "#{n}-#{k}-1"
      #blocking call
      queue << Fiber.current
      sched.transfer
      #resume
      Fiber.yield "#{n}-#{k}-2"
    end

    puts "task#{n} #{subcoro.resume 1}"
    puts "task#{n} #{subcoro.resume 2}"
    # task exit
    sched.transfer Fiber.current
  end
end

queue << task[1] << task[2]
sched.resume

Schedule #<Fiber:0x00005604af70e4f0@fibb.rb:12 (created)>
task1 1-1-1
Schedule #<Fiber:0x00005604af70e3b0@fibb.rb:12 (created)>
task2 2-1-1
Schedule #<Fiber:0x00005604af70dca8@fibb.rb:13 (suspended)>
task1 1-2-2
Schedule #<Fiber:0x00005604af70d5f0@fibb.rb:13 (suspended)>
task2 2-2-2
No tasks to execute

ioquatix · 2018-11-06T11:38:28Z

Thanks for the example, it's late, I will take a look tomorrow.

funny-falcon · 2018-11-07T05:28:31Z

Fixed the example a bit: added fiber finalization (f.resume # finish fiber)

funny-falcon · 2018-11-07T05:47:19Z

Looks like there is a need for a Fiber.transfer_on_exit(target) method that would mimic the "switch back to caller on exit" behavior of fiber.resume, but in a more flexible way.

ioquatix · 2018-11-07T07:49:42Z

> Looks like there is a need for a Fiber.transfer_on_exit(target) method that would mimic the "switch back to caller on exit" behavior of fiber.resume, but in a more flexible way.

Yep, I understand. Otherwise, transfer makes (predictable) resume impossible.

ko1 · 2018-11-07T07:58:50Z

Just FYI (I haven't read this thread completely because there is a lot of English text...), Fiber#transfer is not well supported because it causes critical bugs in some situations (I forget the exact example...). This is why it is not supported. It is in the same position as callcc (which also has some critical issues).

Generally speaking, I think Fiber#transfer is difficult for ordinary Ruby programmers to use. This is another reason why Fiber#transfer is not supported without requiring the fiber library.

ioquatix · 2018-11-07T07:59:54Z

Thanks for that @ko1 it's really helpful to understand the historical context and how it integrates with the rest of the system.

funny-falcon · 2018-11-07T08:09:03Z

@ko1

> Fiber#transfer is not well supported because it causes critical bugs in some situations (I forget the exact example...).

But Fiber.yield is not enough. Yes, transfer is low-level, but it is unavoidable for building new functionality, because Fiber.yield cannot be orthogonal to itself.

Therefore, either transfer is used, or there should be a keyed resume_with+yield_for to maintain an orthogonal fiber nesting stack.

Off-topic: for me, the critical bug is "ensure can be ignored if a fiber never returns", i.e. a fiber can be forgotten and garbage collected despite a pending ensure block.

ioquatix · 2018-11-07T08:11:05Z

In my C++ implementation, if a fiber goes out of scope but it's not finished, it's automatically resumed and terminated.

https://github.com/kurocha/concurrent/blob/master/source/Concurrent/Fiber.cpp#L29-L44

ko1 · 2018-11-07T08:38:17Z

> But Fiber.yield is not enough. Yes, transfer is low-level, but it is unavoidable for building new functionality, because Fiber.yield cannot be orthogonal to itself.

Yes. Fiber without transfer is designed as a semi-coroutine, not a coroutine in the computer science sense. This design comes from Python's generators (maybe...). This limitation is intentional, and I understand it is not enough for some users.

> Therefore, either transfer is used, or there should be a keyed resume_with+yield_for to maintain an orthogonal fiber nesting stack.
>
> Off-topic: for me, the critical bug is "ensure can be ignored if a fiber never returns", i.e. a fiber can be forgotten and garbage collected despite a pending ensure block.

Off-topic too. Yes. I want to solve this issue, but it is difficult to solve, in terms of both implementation and compatibility....

ioquatix · 2018-11-07T08:39:49Z

The new coroutine implementation can solve this problem. The next step is pooled fibers, with explicit scope.

funny-falcon · 2018-11-08T06:33:31Z

The new coroutine implementation will not solve the case of an external enumerator that iterates over a File.open {} file, IMHO, unless the external enumerator itself becomes a coroutine.

ioquatix · 2018-11-08T06:42:14Z

The coroutine implementation exposes a consistent API on which Fibers and other abstractions can be implemented. It can help us solve some of these issues. For example, Enumerator might not use Fiber; it could still use a coroutine, but it wouldn't affect the fiber stack in any way.

ioquatix · 2019-07-28T10:01:49Z

@elct9620 here is the sample code we worked on

#!/Users/samuel/Documents/ruby/ruby/build/miniruby

def sequence
	yield 1
	yield 2
	yield 3
end

def things
	yield "cats"
	Fiber.yield "dogs"
end

f = Fiber.new do
	#e = enum_for(:things)
	
	#puts "next: #{e.next}"
	#puts "next: #{e.next}"
	
	# things do |item|
	to_enum(:sequence).each.zip(to_enum(:things)) do |item|
		puts "each: #{item}"
	end
end

puts "resume: #{f.resume}"
#f.resume

ioquatix · 2019-08-05T00:33:57Z

Another example/repro:

#!/usr/bin/env ruby

def things
	yield :cat
	Fiber.yield :dog
	yield :fish
end

fiber = Fiber.new do
	iterator = to_enum(:things)
	
	puts "iterator.next: #{iterator.next}"
	puts "iterator.next: #{iterator.next}"
end

puts "fiber.resume: #{fiber.resume}"

k0kubun · 2019-08-17T05:09:57Z

It seems to have a conflict now. Could you rebase this from master?

ioquatix · 2020-04-14T11:21:43Z

While this PR still has value, in https://www.codeotaku.com/journal/2020-04/ruby-concurrency-final-report/index I define blocking and non-blocking fibers. Naturally, this avoids the problem because Enumerator's fiber can be defined as blocking. By doing this, no scheduling operation should occur during Enumerator#next etc.

Considering all possible options, I think modelling blocking/non-blocking makes more sense when we are adding implicit context switches, which is the reason why this was an issue in the first place. In any case, we can revisit this problem in the future if necessary.
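For reference, a tiny sketch of the blocking/non-blocking distinction as it is exposed on Fiber itself. It assumes Ruby 3.0 or later, where Fiber.new accepts the blocking: keyword and Fiber#blocking? exists; the blocking_flags method name is made up for illustration, and the sketch says nothing about which kind of fiber Enumerator actually uses.

```ruby
# Assumes Ruby 3.0+ (Fiber.new(blocking:) and Fiber#blocking?).
def blocking_flags
  default_fiber  = Fiber.new { :unused }                 # non-blocking by default
  blocking_fiber = Fiber.new(blocking: true) { :unused } # opts out of scheduler hooks
  [default_fiber.blocking?, blocking_fiber.blocking?]
end

p blocking_flags # => [false, true]
```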

ioquatix · 2024-04-07T05:37:43Z

@funny-falcon not sure if I said this elsewhere, but you were totally correct: the fiber scheduler should use transfer only. Well, it can be done either way, but it's technically better to use #transfer to avoid disturbing user flow control.

funny-falcon · 2024-04-07T06:07:13Z

@ioquatix, thanks.

ioquatix mentioned this pull request Jan 10, 2019

streaming responses socketry/falcon#43

Closed

Fiber.yield handled transparently in Enumerator.

03440f6

ioquatix force-pushed the enumerator-fiber-yield branch from 9d686fb to 03440f6 Compare July 28, 2019 10:01

k0kubun changed the base branch from trunk to master August 15, 2019 17:26

ioquatix mentioned this pull request Sep 30, 2019

Non-blocking redis calls using redis-async lib 3scale/apisonator#96

Merged

ioquatix closed this Apr 14, 2020

ioquatix mentioned this pull request Apr 7, 2024

Enumerator should use a non-blocking fiber. #10478

Merged

ioquatix mentioned this pull request Apr 7, 2024

Prefer to use Fiber#transfer in scheduler implementation. #10479

Merged
