Use Fiber.transfer instead of Fiber.yield #23

funny-falcon · 2018-07-04T16:52:20Z

Fiber.yield is used by Enumerator, so if enumerator will call to some io, it will return to calling fiber instead of being scheduled.
But if scheduler will use fiber.transfer, than it will play nicely with Enumerator.

The text was updated successfully, but these errors were encountered:

ioquatix · 2018-07-04T22:28:26Z

I will try it

ioquatix · 2018-07-04T22:36:08Z

I will probably work on this in the next week, but if you are interested, do you think you can make a failing test case to demonstrate the problem? I got the general idea from Ruby bug tracker.

funny-falcon · 2018-07-04T22:36:37Z

I will try.

ioquatix · 2018-07-04T22:59:48Z

Don't try to fix this issue as it's pretty deeply embedded in the current API, but if you can provide a failing test case using enumerator it gives us a goal to work towards.

funny-falcon · 2018-07-05T08:17:02Z

I was not quite right: Enumerator uses Fiber.yield only in "external iterator" mode.
Still zip method uses this mode, for example.
External iterator mode could be used for merging two sequences (ie merge-sort, or other kind of merges).
And some times, it is desirable to use Fiber.resume+Fiber.yield in user code directly.

I added test in #24

ioquatix · 2018-07-05T08:18:23Z

Thank you so much this is awesome!

ioquatix · 2018-07-05T10:39:40Z

I will make separate branch for this. There might be some limitations of transfer because it has a direct relationship between reactor and task, but in some cases the relationship is a bit more complex.

https://github.com/socketry/async/tree/fiber-transfer

ioquatix · 2018-07-05T10:53:50Z

Okay, so I've played around with this a bit.

There are some areas where it might be feasible to use #transfer, and I'll try to get your specs passing.

However, now that I think about it, why doesn't Enumerator use transfer? Because it has all the state, it should be trivial to do so, right? Then, this is a non-issue because normal Fiber.yield can't affect it? I'd have to play with the implementation to say for certain, but I wonder if you thought about it.

ioquatix · 2018-07-05T11:07:18Z

Here is an example of why this might be impossible:

Async.run do |parent| # Fiber.new.resume
	parent.async do # Fiber.new.resume
		# Two kind of implementation:
		reactor.sleep # Implemented by Fiber.yield -> back to parent
		reactor.sleep # Implemented by Reactor.transfer -> back to reactor, parent never resumed.
		
		# Another example:
		input.read # Implemented by Fiber.yield -> back to parent
		input.read # Implemented by Reactor.transfer -> back to reactor, parent never resumed.
	end
end

In order to use transfer successfully, every call to transfer would need to know what to transfer back to. Because of nesting, and the expectation about order of execution, it would be tricky to change async, it would be a breaking change to API.

Let me know if you have any thoughts.

ioquatix · 2018-07-05T11:15:35Z

Here is my attempt to make an Enumerator which uses transfer:

#!/usr/bin/env ruby

require 'fiber'

module Async
	class Enumerator
		def initialize(&block)
			@fiber = Fiber.new do
				block.call do |*args|
					self.transfer(*args)
				end
			end
		end
		
		def transfer(*args)
			@back.transfer(*args)
		end
		
		def next
			@back = Fiber.current
			
			if @fiber.alive?
				@fiber.transfer
			else
				raise StopIteration
			end
		end
	end
end

e = Async::Enumerator.new do |&block|
	block.call(1)
	Fiber.yield 2
	block.call(3)
end

3.times do |i|
	puts "#{i} #{e.next}"
end

The result I got was not what I expected. Fiber.yield travelled back through transfer. I didn't think this was acceptable behaviour. But after checking documentation, it appears it's okay. Hah :)

ioquatix · 2018-07-05T11:47:09Z

It seems like it is possible to emulate fiber.resume and Fiber.yield using your own "stack":

#!/usr/bin/env ruby

require 'fiber'

module Async
	class Reactor
		def initialize
			@fiber = nil
			@ready = []
		end
		
		def yield
			@fiber.transfer
		end
		
		def resume(fiber)
			previous = @fiber
			
			@fiber = Fiber.current
			fiber.transfer
			
			@fiber = previous
		end
		
		def async(&block)
			fiber = Fiber.new do
				block.call
				self.yield
			end
			
			resume(fiber)
		end
		
		# Wait for some event...
		def wait
			@ready << Fiber.current
			self.yield
		end
		
		def run
			while @ready.any?
				fiber = @ready.pop
				resume(fiber)
			end
		end
	end
end

reactor = Async::Reactor.new

reactor.async do
	puts "Hello World"
	reactor.async do
		puts "Goodbye World"
		reactor.wait
		puts "I'm back!"
	end
	puts "Foo Bar"
end

puts "Running"
reactor.run
puts "Finished"

It prints out

Hello World
Goodbye World
Foo Bar
Running
I'm back!
Finished

Which is the expected order, using transfer, but still support nesting yield/resume like semantics.

funny-falcon · 2018-07-05T15:24:55Z

Here is my attempt to make an Enumerator which uses transfer:

Here is my old attempt: https://gist.github.com/funny-falcon/2023354
But you are right, it is not acceptable.

It seems like it is possible to emulate fiber.resume and Fiber.yield using your own "stack"

I think you picked the right way.
I've tried to modify you example to use the fact, that current fiber could be transferred with @fiber.transfer Fiber.current, but looks like it has no sense.

ioquatix · 2018-07-06T00:00:23Z

I almost got this working, but it require some breaking changes to async so unfortunately it requires more work.

I think there will be a performance hit too.

There is one other solution. To make special async enumerator which behaves like normal enumerator but preserves asynchronous semantics.

The Thread.scheduler shows that this is possible in a general case. Maybe just not with the current design of async - it may require minor breaking API changes.

My idea to make this change manageable is to introduce new concept Reactor.scheduler. It has method for async, resume and yield. The current implementation would use fiber.resume and Fiber.yield. But it should be possible to make a different implementation using fiber.transfer. Because the API is simple, it should be possible to test and design it in isolation, once it is working, it would be drop in replacement. It would probably require a major version bump, so async v2.0.0. Ideally by that point we also know if Thread.scheduler will be merged, which improves this situation too. I think I will wait to see what direction Ruby goes in and then react accordingly, but I the mean time I will try to get the branch working with the scheduler idea proposed above.

ioquatix · 2018-07-06T06:27:41Z

Okay, I tried several more options including overriding Enumerator. Unfortunately Enumerator doesn't accept duck type.

I think the only solution to this is as above, waiting for Thread.scheduler to land. That will make it possible to implement this because the fiber within the enumerator will be capable of asynchronous scheduling.

funny-falcon · 2018-07-06T09:41:34Z

Can you at least mention, which difficulties and backward incompatibilities you've found?
Even if I couldn't find the way to deal with, at least they will be documented here.

ioquatix · 2018-07-06T09:50:07Z

When enumerator creates it's own Fiber, it isn't associated with an Async::Task so when you try to do an asynchronous operation, it will fail because Task.current fails. It's a design issue of async, but it's one that could be alleviated with the Thread.scheduler PR because then all Fibers are able to be scheduled.
Making a class which looks like an enumerator isn't enough to work, it appears to need to be a sub-class of Enumerator and the constructor is strict on what it accepts and how it works. If that was possible, at least we would have a workable solution.
I tried two approaches to overriding Enumerator, the one which I think almost worked was using two fibers and transferring from within the task to the enumerator and back again. But it's hard to synchronise the creation of the different parts of the enumerator and task.

module Async
	class Enumerator < ::Enumerator
		def initialize(method, *args)
			@method = method
			@args = args
			
			current = Task.current
			@task = @yielder = nil
			
			puts "Calling super"
			super do |yielder|
				puts "Setting up @yielder"
				@yielder = Fiber.current
				
				puts "Starting task"
				current.async do |task|
					puts "Setting up @task"
					@task = Fiber.current
					
					task.yield while @yielder.nil?
					
					@method.call(task, *@args) do |*args|
						@yielder.transfer(args)
					end
					
					@yielder.transfer(nil)
				end
				
				while value = @task.transfer
					yielder << value.first
				end
			end
		end
		
		def dup
			Enumerator.new(@method, *@args)
		end
	end
end

ioquatix · 2018-07-06T09:51:22Z

I hope this help you understand a bit more what I was trying to do.

I had another idea for a solution but I will implement example on Thread.selector PR as proof of concept as it depends on that working correctly.

ioquatix · 2018-07-06T10:51:36Z

Just for the xref: ruby/ruby@e938076

funny-falcon · 2018-07-06T10:55:10Z

I don't see, why Thread.current.thread_variable_set and Thread.current.thread_variable_get could not be used to correctly substitute Task.current usage.
But probably I don't see whole picture.

ioquatix · 2018-07-06T10:57:36Z

Task.current is per-fiber not per-thread, so that cannot work. There is a 1-1 relationship between task and fiber. Fiber itself is a bit too limited, but in the future perhaps Task can be subclass of Fiber. It's an area to explore.

ioquatix · 2018-08-21T00:12:49Z

@funny-falcon I've been thinking more about this issue.

Do you think it would make sense for Enumerator to also use transfer?

funny-falcon · 2018-08-21T06:19:33Z

I don't know :-( It is hard to decide for already existed feature.

ioquatix · 2018-11-03T05:10:39Z

I've started working on updating Enumerator to use transfer. With this change, this should no longer be a problem.

ioquatix · 2018-11-03T05:10:50Z

Hopefully it would be out in Ruby 2.6.

ioquatix · 2018-11-03T07:27:04Z

Okay.

So, it's more tricky than I imagined.

Essentially, just using transfer is impossible, and composing a reactor and enumerator together which use transfer is, I think, impossible. Because internally, if you use transfer, you must some how manage transferring back. That being said, it's possible to make Enumerator hide the fiber it's using by carefully handling resume/yield. Here is a simple POC:

#!/usr/bin/env ruby

require 'fiber'

class Fiberator
	def initialize(&block)
		@caller = nil
		@buffer = []
		@fiber = Fiber.new(&block)
	end
	
	def next
		return nil unless @fiber.alive?
		
		@caller = Fiber.current
		@value = nil
		
		while @fiber.alive? and @buffer.empty?
			*result = @fiber.resume(self)
			
			if @buffer.any?
				return @buffer.shift
			else
				Fiber.yield(*result) if @fiber.alive?
			end
		end
	end
	
	def << value
		@buffer << value
		Fiber.yield
	end
end

e = Fiberator.new do |s|
	s << 10
	Fiber.yield 30
	s << 20
end

f = Fiber.new do
	while v = e.next
		puts "next: #{v.inspect}"
	end

	puts "done."
	
	40
end

2.times do
	puts "f.resume: #{f.resume}"
end

Firstly, you seem pretty knowledgeable about this stuff, so feel free to correct me.

The way this work is Enumerator carefully handles << and Fiber.yield. If it gets Fiber.yield in it's nested code, it invokes Fiber.yield in context of caller.

Essentially, Enumerator is hiding the use of Fiber.

ioquatix · 2018-11-03T07:28:37Z

@nobu just FYI, I am hoping to make a PR to make this work with ::Enumerator.

ioquatix · 2018-11-03T08:55:17Z

Okay, I made a basic PR against MRI to make this work. It implements in C, the same as the Ruby pseudo-code above.

Feel free to give me your feedback.

If you think you have a better idea, please let me know.

funny-falcon · 2018-11-05T10:39:47Z

Fiber.resume/yield is to dive into fiber as into assymetric coroutine. Enumerator is assymetric coroutune, that is why Fiber.resume/yield is suitable for.

Fiber.transfer is for swithing between symmetric coroutines, and "lightweight threads" are symmetric coroutines.

Fiber.resume/yield is just Fiber.transfer with call stack maintained. Therefore, you can always use Fiber.transfer instead of Fiber.yield if you maintain call stack by your self. But you don't need to.

ioquatix · 2018-11-05T11:56:19Z

Fiber.resume/yield is just Fiber.transfer with call stack maintained. Therefore, you can always use Fiber.transfer instead of Fiber.yield if you maintain call stack by your self.

Yes, agreed.

But you don't need to.

Unfortunately, async is not that simple, and it does use yield/resume for flow control between different tasks, not just to reactor and back.

funny-falcon · 2018-11-05T15:21:25Z

it does use yield/resume for flow control between different tasks, not just to reactor and back.

That is huge mistake, and it will bite you.
Never use asymetric mechanism to communicate symmetric coroutines.

ioquatix · 2018-11-05T21:09:27Z

That is huge mistake, and it will bite you.
Never use asymetric mechanism to communicate symmetric coroutines.

Why?

funny-falcon · 2018-11-06T05:05:11Z

You break your own abstractions. It will inevitably lead to puzzles. Attempts to solve that puzzles will lead to hard to follow code. And finally, there will be unsolvable one. (Like this one with Enumerator. But I claim, there will be more).

Probably I missed something, and your abstractions are more complex. But doubdfully more complex abstractions lead to better architecture. Such thing happens, but quite rare.

ioquatix · 2018-11-06T05:06:54Z

The puzzle here is that Enumerator changes the behaviour of Fiber.yield. It's not unsolvable, there is a proposed solution that solves the issue entirely.

The abstraction is not any more complex than Fiber resume/yield working as expected.

funny-falcon · 2018-11-06T05:13:08Z

I hope you right. But I don't believe.

ioquatix · 2018-11-06T05:32:08Z

I invite you to try the code I posted above, and I invite you to try and solve the Enumerator problem. We don't need to depend on beliefs, but we can solve these problems with actual code and discuss the merits of different solutions. The reality is, Enumerator is a problem for any code which uses Fiber.yield. I think you can ignore async - maybe it's design should change, but it's a separate issue. Let's try to fix Enumerator so that it doesn't leak it's implementation.

ioquatix · 2018-12-31T13:10:18Z

After further discussion with @ko1 I found that Enumerator with Fiber is leaking resources. Maybe there is something wrong with the design or problem with Ruby language itself. I will be working on this over the next couple of months.

ioquatix · 2020-07-05T00:01:01Z

This issue should be fixed in Ruby 3 using the thread scheduler as the Enumerator will become a blocking context to preserve existing behaviour. We may introduce some kind of "non-blocking" enumerator (or make that the default).

MatheusRich · 2020-12-27T22:55:02Z

@ioquatix any plans for this now that ruby 3 is out?

ioquatix · 2021-02-22T01:40:46Z

After considering this issue further, I think we need to explore using transfer for the scheduler interface.

ioquatix · 2021-04-25T05:08:36Z

Ruby 3 now allows us to mix resume/yield for user flow control and transfer for scheduling operations. I've got a local branch which I hope to push soon which uses this approach and so far it looks acceptable. More evaluation is required, but I think it's the right approach.

@funny-falcon thanks for your original discussion here. I think we've finally come full circle.

Ruby 3 allows mixing `Fiber#resume/#yield` and `Fiber#transfer`. We can take advantage of that to minimise the impact of non-blocking operations on user flow control. Previously, non-blocking operations would invoke `Fiber.yield` and this was a user-visible side-effect. We did take advantage of it, but it also meant that integration of Async with existing usage of Fiber could be problematic. We tracked the most obvious issues in `enumerator_spec.rb`. Now, non-blocking operations transfer directly to the scheduler fiber and thus don't impact other usage of resume/yield.

nevans · 2021-04-25T15:12:25Z

Ruby 3 allows mixing Fiber#resume/#yield and Fiber#transfer.

FWIW, I've been using a fun little "trampoline fiber" trick for this in my projects since at least 2.4 (ruby 3.0 makes it unnecessary). I'd looked into submitting a PR several times, but concluded that switching async from a yield/resume scheduler to a transferring scheduler was a bigger change than I had time for. But I'll take a look at your commit this week. If you've done all of that architectural work, then I have a feeling we can probably make it work with Ruby 2.x with maybe a dozen lines of fiber trampoline code.

nevans · 2021-04-25T15:29:18Z

Essentially:

The scheduler always (with one major caveat) transfers into a task's continuation fiber (I make a distinction between a task's root fiber and its continuation fiber, but this could also be modeled as a separate new task).
detect when you are in a "scheduled" fiber: scheduled fibers use a straight transfer to/from scheduler
unscheduled fibers resume a new trampoline fiber (which sets its return fiber to the unscheduled fiber)
Store that trampoline fiber as the "continuation" for whatever async operation has suspended. It could be stored as the "continuation" for the last scheduled task or it could become a new task on which the earlier task is implicitly waiting.
transfer out from the trampoline into the scheduler
transferring back into the trampoline "breaks" it for resume/yield purposes, but that's okay. It still has its original "return fiber" and It only returns once, when it terminates, back into the original unscheduled fiber.

This entire trampoline-continuation dance allows us to transfer out of and back into a resumed fiber, without "breaking" it (marking it as transferring). IOW, it provides similar semantics to ruby 3.0 resume/transfer.

The ruby is shorter/simpler than the English. 😉

ioquatix · 2021-04-25T21:06:10Z

This is not a bad idea and I've actually implemented a similar idea for Enumerator to avoid breaking it when calling Fiber.yield. However, I'm also considering a clean break for Ruby 3+. If you have time and/or thoughts it would be most helpful to discuss it #112.

Ruby 3 allows mixing `Fiber#resume/#yield` and `Fiber#transfer`. We can take advantage of that to minimise the impact of non-blocking operations on user flow control. Previously, non-blocking operations would invoke `Fiber.yield` and this was a user-visible side-effect. We did take advantage of it, but it also meant that integration of Async with existing usage of Fiber could be problematic. We tracked the most obvious issues in `enumerator_spec.rb`. Now, non-blocking operations transfer directly to the scheduler fiber and thus don't impact other usage of resume/yield.

funny-falcon mentioned this issue Jul 5, 2018

Test for Enumerator as external iterator, and explicit Fiber communication #24

Merged

ioquatix closed this as completed in f84df8d Jul 6, 2018

ioquatix reopened this Nov 3, 2018

ioquatix mentioned this issue Nov 3, 2018

Enumerator fiber yield ruby/ruby#2002

Closed

ioquatix added enhancement question and removed question labels Nov 6, 2018

davidor mentioned this issue Feb 27, 2019

Non-blocking Redis calls: preparation (import libs, write async client, adapt tests, etc.) 3scale/apisonator#77

Merged

ioquatix closed this as completed Jul 5, 2020

ioquatix reopened this Apr 25, 2021

ioquatix mentioned this issue Apr 25, 2021

Moving forward with Ruby 3+ only implementation. #112

Closed

ioquatix closed this as completed in 00de67c Jul 16, 2021

Use Fiber.transfer instead of Fiber.yield #23

Use Fiber.transfer instead of Fiber.yield #23

Comments

funny-falcon commented Jul 4, 2018

ioquatix commented Jul 4, 2018

ioquatix commented Jul 4, 2018

funny-falcon commented Jul 4, 2018

ioquatix commented Jul 4, 2018

funny-falcon commented Jul 5, 2018

ioquatix commented Jul 5, 2018

ioquatix commented Jul 5, 2018

ioquatix commented Jul 5, 2018

ioquatix commented Jul 5, 2018 • edited

ioquatix commented Jul 5, 2018 • edited

ioquatix commented Jul 5, 2018 • edited

funny-falcon commented Jul 5, 2018

ioquatix commented Jul 6, 2018 • edited

ioquatix commented Jul 6, 2018

funny-falcon commented Jul 6, 2018

ioquatix commented Jul 6, 2018 • edited

ioquatix commented Jul 6, 2018

ioquatix commented Jul 6, 2018

funny-falcon commented Jul 6, 2018

ioquatix commented Jul 6, 2018

ioquatix commented Aug 21, 2018

funny-falcon commented Aug 21, 2018

ioquatix commented Nov 3, 2018

ioquatix commented Nov 3, 2018

ioquatix commented Nov 3, 2018

ioquatix commented Nov 3, 2018

ioquatix commented Nov 3, 2018

funny-falcon commented Nov 5, 2018

ioquatix commented Nov 5, 2018

funny-falcon commented Nov 5, 2018

ioquatix commented Nov 5, 2018

funny-falcon commented Nov 6, 2018

ioquatix commented Nov 6, 2018

funny-falcon commented Nov 6, 2018

ioquatix commented Nov 6, 2018

ioquatix commented Dec 31, 2018 • edited

ioquatix commented Jul 5, 2020

MatheusRich commented Dec 27, 2020

ioquatix commented Feb 22, 2021

ioquatix commented Apr 25, 2021

nevans commented Apr 25, 2021 • edited

nevans commented Apr 25, 2021

ioquatix commented Apr 25, 2021

ioquatix commented Jul 5, 2018 •

edited

ioquatix commented Jul 5, 2018 •

edited

ioquatix commented Jul 5, 2018 •

edited

ioquatix commented Jul 6, 2018 •

edited

ioquatix commented Jul 6, 2018 •

edited

ioquatix commented Dec 31, 2018 •

edited

nevans commented Apr 25, 2021 •

edited