Task copying: #4085 #15078

Closed
wants to merge 7 commits into from

@yebai yebai commented Feb 14, 2016

This PR contains a working implementation of task copying. More specifically:

  • This implementation allows users to fork tasks (aka coroutines): newt = copy(t::Task).
  • The new task newt has an independent stack from the original task t, i.e. newt runs independently from the original task t.
  • This implementation shallowly copies heap objects referenced from the stack.

Here is a simple test script for this feature:

# test case 1: stack allocated objects are deep copied.
function f()
  t = 0;
  while true
    produce(t)
    t = 1 + t
  end
end

t = Task(f)

consume(t); # produce 0
a = copy(t);
consume(a); # produce 1
consume(t);  # produce 1 again



# test case 2: heap allocated objects are shallowly copied.
function f()
  t = [0];
  while true
    produce(t[1])
    t[1] = 1 + t[1]
  end
end

t = Task(f)

consume(t); # produce 0
a = copy(t);
consume(a); # produce 1
consume(t);  # produce 2, as expected.
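
The contrast between the two test cases can be mimicked outside Julia. The following is a minimal C sketch and not the code in this PR: frame_t, copy_frame and demo_shallow_copy are hypothetical names, and the struct only stands in for a saved task stack that is copied byte for byte.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a task's saved stack: one local held by value
   and one local that is a pointer to a heap cell. */
typedef struct {
    int counter;     /* stored in the frame itself, so a copy duplicates it */
    int *heap_cell;  /* only the pointer is in the frame; the cell is on the heap */
} frame_t;

/* Byte-wise copy of the frame, analogous to copying a saved stack buffer. */
static frame_t *copy_frame(const frame_t *src)
{
    frame_t *dst = malloc(sizeof *dst);
    assert(dst != NULL);
    memcpy(dst, src, sizeof *dst);
    return dst;
}

/* Returns 1 if the copy has an independent counter but shares the heap cell. */
static int demo_shallow_copy(void)
{
    int *cell = malloc(sizeof *cell);
    assert(cell != NULL);
    *cell = 0;

    frame_t original = { .counter = 0, .heap_cell = cell };
    frame_t *copy = copy_frame(&original);

    copy->counter += 1;     /* changes only the copy's local */
    *copy->heap_cell += 1;  /* visible through both frames */

    int ok = original.counter == 0
          && copy->counter == 1
          && original.heap_cell == copy->heap_cell
          && *original.heap_cell == 1;

    free(copy);
    free(cell);
    return ok;
}
```

Calling demo_shallow_copy() from a main function returns 1 when the by-value local diverges (as in test case 1) while the heap cell stays shared (as in test case 2).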

newt
end


Contributor

If you are finished testing, most of these should move to the C file.

@yebai yebai mentioned this pull request Feb 15, 2016
if (t->stkbuf){
newt->ssize = t->ssize; // size of saved piece
newt->bufsz = t->bufsz;
newt->stkbuf = allocb(t->bufsz);
Contributor

This is missing a GC root for the new task and a NULL initialization of the stkbuf before filling it in.

Author

The Julia documentation page says JL_GC_PUSH is needed when performing jl_... calls. Since there is no jl_... call inside this function, is a GC root still needed? Thanks!

EDIT: sorry - I missed the jl_gc_wb_back call.

Contributor

It is needed around every call that can allocate. All the exported ones start with jl_, which is why the doc says it that way. Here allocb allocates, and therefore the newly allocated task needs to be rooted and valid.

Author

This should be fixed now. Thanks!

@yebai
Author

yebai commented Feb 16, 2016

This function in gc.c suggests there are two different data structures for GC, namely one for JIT's code and one for native C code. GC for Julia's C side is managed through JL_GC_PUSH and JL_GC_POP calls.
I am curious how GC works for JIT'ed code. Many thanks!

@yuyichao
Contributor

The JIT code effectively generates JL_GC_PUSH and JL_GC_POP to interact with the GC. The format is a little different from the fixed-argument ones in C, since the object reference is stored directly rather than the address of the stack-allocated object. The format used by the JIT is identical to the one used in JL_GC_PUSHARGS.

@yebai
Author

yebai commented Feb 16, 2016

This design makes it easy to track C variables on the stack. Does the JIT keep a list of objects/variables (created via JIT code) on the stack? If not, how difficult is it to make the JIT track them?

EDIT: the documentation of Julia internals could be improved by mentioning these key GC design choices :-)

@yuyichao
Contributor

This design makes it easy to track C variables on the stack.

No, you don't track all of them.

Does JIT keep a list of objects/variables (created via JIT code) on the stack?

No. (only in debug_info)

If not, how difficult is it to make JIT track them?

Impossible.

@StefanKarpinski
Sponsor Member

Since this approach seems to be working well for @yebai's project, would it be possible for us to provide enough hooks that people can do this in their own code, together with dire warnings that the world is likely to end if they do so and don't follow certain very strict rules?

@yuyichao
Contributor

Since this approach seems to be working well for @yebai's project

That doesn't sound like a good reason to add code that is otherwise unused, untested, dangerous, breaks the standard library (when used) and deprecated.

@StefanKarpinski
Sponsor Member

I agree with that, but if we had this, we would want to test it; for example, having the Julia-side code necessary to exercise it in the test suite would address that issue. Maybe it's a bad idea. I just hate to force @yebai to maintain a fork to do this, but maybe that's just the situation.

@yuyichao yuyichao added the kind:breaking and status:won't change labels Oct 21, 2016
@yebai
Author

yebai commented Aug 11, 2018

Hi, @JeffBezanson here is an updated version of the task copying code I mentioned:

https://github.com/TuringLang/Turing.jl/blob/master/deps/task.c

It works for Julia 0.3 through 0.6 without issues. I haven't tested it against 0.7, but it should work as long as the COPY_STACK mechanism is not changed.

It would be great if Julia could support this feature, even as an option that a user can choose to enable or disable.

@yebai
Author

yebai commented Sep 11, 2018

UPDATE: the code contained in this PR has been registered as a standalone Julia package, see

https://github.com/TuringLang/Libtask.jl

@StefanKarpinski
Sponsor Member

StefanKarpinski commented Sep 11, 2018

It's interesting to me that this works and is useful. That goes against what has been claimed about copying stacks not being feasible or a good idea. It seems like at least some limited form of task copying could be officially supported and tested.

@vtjnash
Sponsor Member

vtjnash commented Sep 11, 2018

That goes against what has been claimed about copying stacks not being feasible

That's not what I've said. I've said that merging this may prevent us from doing some more aggressive allocation optimizations, which may be far more critical and useful.

@yuyichao
Contributor

yuyichao commented Sep 11, 2018

this works

Works, as in being incompatible with the compiler in an unfixable way? Working, as in not crashing in simple test cases, is trivial. This is exactly what undefined behavior means, and it's the same as all the missing GC root bugs.

some limited form

So no C code (runtime included) allowed on the stack. No LLVM-level allocation optimization allowed. No pass-by-reference arguments allowed on the call stack. Basically, no one on the call stack is allowed to take a stack address and pass it around. (edit: Calling functions that use these features is fine, but all of them should have returned by the time you copy the task.) The current system implements and heavily relies on all of the above, and that's why I said it's unfixable.

@JeffBezanson
Sponsor Member

Right; fully supporting this would require flags to change compiler behavior in various ways. It's possible (as long as you don't interleave julia and C code before copying as Yichao said) but would take a concerted effort to develop and maintain.

@yuyichao
Contributor

Actually, I just realized that it's even worse than this... This is basically a fork() without the explicit full address space isolation. It means that everything on the call stack is escaping, and any optimization that is based on escape analysis (of Julia objects) has to be disabled, or the user will be able to see a behavior change caused by the optimization. This includes allocation optimization in type inference. It's not as bad as crashing, but it's the one thing the optimizer is not supposed to do.

@JeffBezanson
Sponsor Member

I'm not sure that's true --- objects only referenced by locals are still only referenced by the same locals. Can you give an example?

@yuyichao
Contributor

Can you give an example?

a = Ref(0)
if fork()
    a[] = 1
    exit_task()
end
wait_for_the_other_task_to_finish()
a[]

For the actual C fork, a will be copied, so the read on the last line is independent of the assignment made before the other task ends.
However, copying the task cannot be defined as copying all the memory, so a must still be shared between the two tasks and it cannot be placed on the stack or optimized out.

This is what I mean by escaping. The variable cannot be accessed from a different scope but that scope can have multiple executions that need to have shared states. Similar to other return-twice bugs but worse.
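
The process-level half of this comparison can be checked directly with POSIX fork(). The following is a minimal sketch, assuming a POSIX system; read_after_fork is a hypothetical helper name, and the code only illustrates the address-space isolation that a copied task does not get.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the value the parent reads from `a` after the child has written
   to its own copy, or -1 on error. */
static int read_after_fork(void)
{
    int *a = malloc(sizeof *a);
    if (a == NULL)
        return -1;
    *a = 0;

    pid_t pid = fork();
    if (pid < 0) {
        free(a);
        return -1;
    }
    if (pid == 0) {
        *a = 1;    /* writes to the child's private copy of the heap */
        _exit(0);  /* leave the child without running parent cleanup */
    }

    /* Parent: wait for the child, then read its own, untouched copy. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        free(a);
        return -1;
    }
    int seen = *a;
    free(a);
    return seen;
}
```

With a real fork() the parent's read is unaffected by the child's write, because the whole address space is duplicated. A copied task shares the heap with the original, which is why the same pattern there forces a to stay a shared, heap-allocated object.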

@JeffBezanson
Sponsor Member

This doesn't implement fork() though; it implements copy(::Task). The difference is crucial, since to make a task you need a closure, and then the Ref would already escape via the escaping closure.

@yuyichao
Contributor

yuyichao commented Sep 11, 2018

This doesn't implement fork() though; it implements copy(::Task)

It's not a fork, but it introduces a "fork point" that can be externally forked?

Or, in other words, just change the example above to:

function f()
  t = Ref(0);
  while true
    produce(t[])
    t[] = 1 + t[]
  end
end

And what should this do?

t = Task(f)

consume(t); # produce 0
a = copy(t);
consume(a); # produce 1
consume(t);  # produce 1 or 2 ?

@JeffBezanson
Sponsor Member

That's a better example. Indeed in this mode I think we'd have to turn off alloc-elim of mutable objects in functions that might yield.

@yuyichao
Contributor

And by fork I just mean something along the lines of t = copy(current_task()) (which could be yielding and letting another task do the actual copying).

@yebai yebai changed the title from "WIP: implemented task copying: #4085" to "Task copying: #4085" Sep 13, 2018
@vtjnash vtjnash closed this Apr 22, 2021