@vtjnash vtjnash commented Aug 16, 2024

Move the registers onto the stack, so that they are present only while the Task is actually switched out. This saves memory when the Task has not started running yet or has already finished. The change is mostly just a huge renaming job.

On Linux x86_64 this reduces the size of a Task from 376 bytes to 184 bytes.
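To illustrate where the saving comes from, here is a minimal sketch in C. It is not the actual runtime code: the struct names, field names, and helper functions are all hypothetical, and `jmp_buf` merely stands in for whatever register context the runtime really saves. The first layout embeds the register buffer in every task; the second keeps only a pointer, which is set while the task is switched out and points at a buffer living in a stack frame.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical "before" layout: the register buffer is part of every task,
 * even one that has not started yet or has already finished. */
typedef struct {
    void *stack_base;
    size_t stack_size;
    jmp_buf regs;
} task_embedded_t;

/* Hypothetical "after" layout: the task only holds a pointer to a register
 * buffer, which is NULL unless the task is currently switched out. */
typedef struct {
    void *stack_base;
    size_t stack_size;
    jmp_buf *regs;
} task_indirect_t;

/* Bytes saved per task object by not embedding the register buffer. */
size_t bytes_saved_per_task(void)
{
    return sizeof(task_embedded_t) - sizeof(task_indirect_t);
}

/* Record that a task is switched out. The context buffer is expected to
 * live in a stack frame that stays alive until the task is resumed. */
void mark_switched_out(task_indirect_t *t, jmp_buf *ctx_on_stack)
{
    assert(t->regs == NULL); /* a task cannot be switched out twice */
    t->regs = ctx_on_stack;
}

/* Record that a task is running again; it no longer needs a saved context. */
void mark_resumed(task_indirect_t *t)
{
    t->regs = NULL;
}
```

The exact number returned by `bytes_saved_per_task()` depends on the platform's `jmp_buf` and on struct padding, so it should not be read as a reproduction of the figures above.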

This has some additional advantages too. For example, copy_stack tasks (e.g. with always_copy_stacks) can migrate to other threads before starting if they are not sticky.

Also fixes a variable that got mixed up by #54639, which caused always_copy_stacks to abort because the stack limits were wrong.

This also now fixes #43124, though I am not yet confident enough in the fix to re-enable that test.

@vtjnash vtjnash requested a review from JeffBezanson August 16, 2024 22:27
@vtjnash vtjnash force-pushed the jn/task-size branch 2 times, most recently from e8f7d12 to b13fe75 Compare August 19, 2024 21:23
@vtjnash vtjnash merged commit a2b1b4e into master Aug 20, 2024
@vtjnash vtjnash deleted the jn/task-size branch August 20, 2024 20:42
@oscardssmith
Do we have any benchmarks showing the improvement here?

KristofferC pushed a commit that referenced this pull request Sep 12, 2024
Development

Successfully merging this pull request may close these issues.

Profiling many spawn and sync does not work on Windows