Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add metadata to tasks #33

Merged
merged 7 commits into from
Nov 6, 2022
Merged

Add metadata to tasks #33

merged 7 commits into from
Nov 6, 2022

Conversation

notgull
Copy link
Member

@notgull notgull commented Oct 2, 2022

In the course of resolving #31, I realized that users may want to associate other things with tasks, like string names, priorities, debugger information, and otherwise. This PR adds a new generic field to both Task and Runnable, M. By default, M is the unit type, (), in order to keep consistency with the previous operation of this crate. This metadata is stored in the allocation and can be accessed via the metadata() function on both Task and Runnable.

In order to create new tasks with metadata, I've introduced a new Builder structure. This builder expands on the previous spawn_* family of functions by providing a metadata() function that creates the task using a generic piece of metadata. Builder may be useful later on for solving #24 once allocator APIs are stable.

This is a minor version bump; Tasks use () as their default M parameter to ensure that the logic is kept consistent.

Discussion questions:

  • Since M is essentially stored in an Arc, I've added M: Send + Sync bounds to the impl Send/Sync for both the Task and the Runnable. Is this right?
  • In order to make the metadata more useful, there may be a way to somehow make the metadata available to the current future. However, I'm not sure if there's a safe way of doing this.

Closes #31

@taiki-e
Copy link
Collaborator

taiki-e commented Oct 7, 2022

At first glance, I feel that generic metadata is a bit overkill/complicated, but I have no strong opinion on this.

@taiki-e
Copy link
Collaborator

taiki-e commented Oct 7, 2022

@smol-rs/admins Any thoughts on this?

@zeenix
Copy link
Member

zeenix commented Oct 7, 2022

At first glance, I feel that generic metadata is a bit overkill/complicated, but I have no strong opinion on this.

I don't think it's overkill but seems to go beyond the scope. #31 is about debugging and makes perfect sense. If users need to associate other metadata, they can create custom types that contain the task along with any other data they need and pass it around. Unless I am missing something?

A lot more useful thing to have would be task-local data, which async-std and tokio provide an API for.

@notgull
Copy link
Member Author

notgull commented Oct 7, 2022

Technically, you're right about metadata. The metadata could be stored in both the scheduler function and alongside the task handle. You could do something like this for the priority queue example I wrote:

let priority = 3usize;
let schedule = move |runnable| {
    QUEUE.lock().unwrap().push((priority, runnable));
};

let (task, runnable) = spawn(..., schedule);

...and then priority is accessible to both the scheduling function context and the task handle context. However, for more complicated cases, it may need to be passed around via an Arc. For instance, if you wanted to change the priority of the task, you'd need an AtomicUsize stored in an Arc. At that point, you now have two heap allocations. It makes more sense to reuse that same heap allocation for the metadata as well, which is what this PR sets out to do.

Overall I would be opposed to task-local variables. We could emulate them by storing a Mutex<HashMap> in the metadata. However, if this crate itself were to do this, I'm not sure it would be possible without using Mutex (which would make us not no_std) or relying on dashmap.

@zeenix
Copy link
Member

zeenix commented Oct 7, 2022

Overall I would be opposed to task-local variables. We could emulate them by storing a Mutex<HashMap> in the metadata. However, if this crate itself were to do this, I'm not sure it would be possible without using Mutex (which would make us not no_std) or relying on dashmap.

It doesn't necessarily have to be part of this crate and if it's part of this crate, it can be feature-gated.

@zeenix
Copy link
Member

zeenix commented Oct 7, 2022

It makes more sense to reuse that same heap allocation for the metadata as well, which is what this PR sets out to do.

Thanks for explaining. I guess the only questions are that which you asked in the description:

  • Since M is essentially stored in an Arc, I've added M: Send + Sync bounds to the impl Send/Sync for both the Task and the Runnable. Is this right?

Seems fine to me since it's not an API break.

  • In order to make the metadata more useful, there may be a way to somehow make the metadata available to the current future. However, I'm not sure if there's a safe way of doing this.

Indeed this would make the metadata api a lot more appealing.

@notgull
Copy link
Member Author

notgull commented Oct 7, 2022

I've implemented a new strategy. Instead of the future, the Builder takes an FnOnce(&'a M) -> Fut where Fut: 'a. This allows the future to incorporate the reference to the metadata so that it can be within the future. This would allow, i.e, a HashMap to be stored inside in order to contain task-local variables. The future could then take that reference to the HashMap and use it. Since incorporating a lifetime makes a future !'static, this means that the unsafe API must be used.

Please let me know if there is a more idiomatic way of going about this safely.

@fogti
Copy link
Member

fogti commented Oct 7, 2022

Looks mostly good to me, do we ensure that the Output of the future can't hold onto a &'a M reference?

@notgull
Copy link
Member Author

notgull commented Oct 7, 2022

Looks mostly good to me, do we ensure that the Output of the future can't hold onto a &'a M reference?

I think that's assured by this line in spawn_unchecked's documentation:

If future is not 'static, borrowed variables must outlive its Runnable.

Since the borrowed metadata is a borrowed variable, returning it would mean outliving the Runnable.

For the safe cases, the Future is 'static so it can't contain the metadata refrence.

@notgull
Copy link
Member Author

notgull commented Nov 6, 2022

@smol-rs/admins Are there any blockers to this being approved for merge? It seems like this was blocked for further discussion.

@notgull notgull merged commit d5bd8be into smol-rs:master Nov 6, 2022
@notgull notgull deleted the metadata branch November 6, 2022 17:52
@notgull notgull mentioned this pull request Dec 6, 2022
@notgull notgull mentioned this pull request Mar 24, 2023
notgull added a commit that referenced this pull request Mar 24, 2023
- Ensure that the allocation doesn't exceed `isize::MAX` (#32)
- Add `FallibleTask::is_finished()` (#34)
- Add a metadata generic parameter to tasks (#33)
- Add panic propagation to tasks (#37)
- Add a way to tell if the task was woken while running from the schedule function (#42)
@khoover
Copy link
Contributor

khoover commented Aug 1, 2023

I think the docs around spawn_unchecked's safety could be worded better. For one, "If future’s output is not 'static, borrowed variables must outlive its Runnable." We could replace "future's output" with Fut, since it took me a second read to realize it's not Fut::Output being referred to. Also, we should clarify whether or not captures of the metadata reference "outlive" the runnable; again, after a few reads and some mental processing, I can be pretty sure they do, but clarity is key for unsafe.

Second, "If schedule is not 'static, borrowed variables must outlive the task’s Waker." Again, when I read it again, it's clear it refers to any instances of Waker returned by Runnable::waker, but we should just say that instead. When I first read "the task's Waker" I thought it was referring to the waker for another future .awaiting the Task instance.

@notgull
Copy link
Member Author

notgull commented Aug 1, 2023

I think the docs around spawn_unchecked's safety could be worded better. For one, "If future’s output is not 'static, borrowed variables must outlive its Runnable." We could replace "future's output" with Fut, since it took me a second read to realize it's not Fut::Output being referred to. Also, we should clarify whether or not captures of the metadata reference "outlive" the runnable; again, after a few reads and some mental processing, I can be pretty sure they do, but clarity is key for unsafe.

Second, "If schedule is not 'static, borrowed variables must outlive the task’s Waker." Again, when I read it again, it's clear it refers to any instances of Waker returned by Runnable::waker, but we should just say that instead. When I first read "the task's Waker" I thought it was referring to the waker for another future .awaiting the Task instance.

I think this is a good idea! My bandwidth is limited at the moment, do you mind writing a PR?

@khoover
Copy link
Contributor

khoover commented Aug 1, 2023

I could, sure, but I actually don't know whether it's fine for Fut's lifetime to be exactly 'a or not (i.e. it captures the metadata reference). I'm pretty sure it is, since the metadata lives for the union of the task and runnable lifetimes, whereas the future itself only lives for the runnable's lifetime, but I can't say for sure.

EDIT: Ok, after reading some more, yeah, guaranteed the Future is dropped before the ref count is decremented on the raw task, and the metadata reference is valid for the lifetime of the raw task. So capturing references to metadata is safe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Add names to tasks
5 participants