Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

status_quo: add "variant Go", "alan builds a task scheduler" #209

Merged
merged 1 commit into from
Jul 9, 2021

Conversation

gyuho
Copy link
Contributor

@gyuho gyuho commented May 22, 2021

For #150.

Any feedback would be appreciated!

Thanks in advance.

@gyuho gyuho force-pushed the alan-go branch 9 times, most recently from 85092ed to 1168d7f Compare May 23, 2021 02:19
Comment on lines +131 to +135
let mut _mu;
match self.requests.try_write() {
Some(guard) => _mu = guard,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this seemed so weird? Like _mu = guard.

Comment on lines +138 to +141
if _mu.get(&id).is_none() {
_mu.insert(id, request_tx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is used, why does it need to have underscore? Also, why is _mu which is a mutex guard named as mutex? Shouldn't it be writer or requests?

Comment on lines +223 to +232
"create" => Ok(format!(
"SUCCESS create {}",
to_string(req.message.to_owned())
)),
"delete" => Ok(format!(
"SUCCESS delete {}",
to_string(req.message.to_owned())
)),
Copy link
Contributor

@pickfire pickfire May 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stringly-typed API is not a good idea in rust, we should switch this to something else. Like Create and Delete within an enum. And why to_string?

use async_std::{sync::Arc, sync::RwLock};

pub struct Manager {
mu: Arc<RwLock<()>>,
Copy link
Contributor

@pickfire pickfire May 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is mu? Why is it mu? Can it have a better name to let readers understand what this does? If it's a mutex, it should be wrapping something, that something should be the name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note this is an attempt to transliterate Go code to Rust, where it's common to define an empty sync.Mutex field in struct with a variable name mu. I will name it better for this example :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes I noticed that. But I think a better approach would be to translate module by module based on functionality if you are translating from go to rust. Go mental model does not fit rust very well, if you are translating from python to rust then it may be a good idea to translate line by line but not quite a good idea for go. If you are translating, thinking in terms of how data flow in rust would ease the translation, like what data should a struct contains and such. But yeah, what you should could the the first version of translation. Maybe I am looking at the status quo differently, now it seemed like the hard part is the translation part from go to rust as the memory model is different.


1. If `Mutex` itself had to be protected, why `Arc` is not unified into a single type? Is the flexibility of having different types really worth the less safety guarantee?
2. Rust claims unparalleled safety. Is it still true for async programming? Rust compiler not complaining about the missing `Arc` means `Mutex` is still safe without `Arc`?
3. What happens if the code went into production without `Arc`? Would the code have race conditions?
Copy link
Contributor

@pickfire pickfire May 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If rust does not complain then it should not have race condition (unless deadlock), if you faced this you might want to give an example.

EDIT: I think this is a general question for someone new to rust, everyone will probably experience this themselves. Should we keep this question here?

1. If `Mutex` itself had to be protected, why `Arc` is not unified into a single type? Is the flexibility of having different types really worth the less safety guarantee?
2. Rust claims unparalleled safety. Is it still true for async programming? Rust compiler not complaining about the missing `Arc` means `Mutex` is still safe without `Arc`?
3. What happens if the code went into production without `Arc`? Would the code have race conditions?
4. Does having `Arc` make code slower? Did I just introduce extra runtime cost?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing is it does not always require Arc, sometimes just Rc will do. But most of the async ecosystem put an assumption that async may be using multiple threads so it is almost always wrapped in Arc even when it is not necessary (like actix-net which uses single-thread). Yes, both Arc and Rc have a cost but Arc have a higher cost.

Copy link

@anguslees anguslees May 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arc has various memory barriers for cross-cpu synchronisation; Rc does not and thus can be reordered/optimised more aggressively, including not writing out to memory at all if the compiler can prove it's equivalent from a thread-local pov. In the worst contended case, Arc is similar to a mutex - you're thrashing a cpu cache line between cpus.


This raises multiple questions for Alan:

1. If `Mutex` itself had to be protected, why `Arc` is not unified into a single type? Is the flexibility of having different types really worth the less safety guarantee?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, this is not a question in async. Both Arc and Mutex are solutions to different problems, just that in this case using both is required. I think this is also for sync and the book should have show some examples about these. I am not sure if there are any benefits unifying it into a single type.

Talking about that here feels off-topic to me at least, because mainly this comes from sync and I don't think there is any issue with this. Unless separating them is a perf issue, but otherwise others would have merged it already.

2. Rust claims unparalleled safety. Is it still true for async programming? Rust compiler not complaining about the missing `Arc` means `Mutex` is still safe without `Arc`?
3. What happens if the code went into production without `Arc`? Would the code have race conditions?
4. Does having `Arc` make code slower? Did I just introduce extra runtime cost?
5. Which one is safe for async programming: `std::sync::Arc` and `async_std::sync::Arc`?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I think most people stumbled across this. Both is safe when rust complains. But the issue is when, I think people are usually confused here.

Copy link

@anguslees anguslees May 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Afaics async_std::sync::Arc is std::sync::Arc. They're the same symbol, just re-exposed under async_std as well, presumably because it's also part of the async_std MutexGuardArc public API: https://docs.rs/async-std/1.9.0/src/async_std/sync/mod.rs.html#177

...
```

Alan is satisfied with the compilation success for the moment, but doesn't feel confident about the production readiness of Rust async.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this part is related to async rust. It's just how rust, ownership system works. It is the same in sync rust. And it is not related to production readiness, if you use rust concepts and port it directly to go, it will be roughly the same.

}
```

For Rust, Alan is faced with multiple choices again:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case I think rust is better here. I think web server handling is too complex to have a "standard" way of doing things. Having choices is better than being forced with a single choice.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remember that these stories are meant to reflect lived experiences. It's not necessarily relevant whether having choice is better than having a standard way of doing things. If this caused Alan confusion or hesitation, it should be noted.

- [rocket](https://github.com/SergioBenitez/Rocket)
- [tide](https://github.com/http-rs/tide)

Alan strongly believes in Go's minimal dependency approach, and thereby chooses "hyper" for its low-level API. While "hyper" is meant to be a low-level building block, implementing a simple request handler in "hyper" still requires four different external dependencies. Alan is not surprised anymore, and rather accepts the status quo of split Rust ecosystem:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am surprised by this since hyper would be my last choice. hyper already mentioned that they are "low-level" and therefore is not recommended for normal use. Did you faced the same situation in the past but still chose hyper? Can I know if having a minimum amount of dependency was the reason? But yes, I agree that there is no single official comparison or a guide to choosing rust frameworks is a bad thing, things would have been easier if there is sort of feature comparison and review between them to compare, cargo-crev might help but it does not provide comparison.

@pickfire
Copy link
Contributor

@gyuho Thanks for sending the pull request. Yes, this may likely be someone experienced with go but new to rust, like Alan. Most of these issues arise that rust have a different mental memory model and how types are used to make application type safe, having a high-level API but giving low-level control which leads to its complexity. Essentially, forcing a good design and best practices to a new developer to ensure nothing is wrong later on (like no race conditions).

Yes, I do agree to those async frameworks split, but what you showed is just a glimpse, the worse parts are tokio vs async-std vs smol split which affects even the high-level frameworks. What was faced here are mainly due to complexity but at least there are no "unexpected" behaviors (like async footguns).

I wonder if having something similar to cargo-crev but review across crates on high-level API, targets, design and such would make someone new to rust make choices like these easier since there are lots of choices.

Copy link
Contributor

@nikomatsakis nikomatsakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave this a skim -- this is great, thanks! I'm inclined not to do too many edits at this point, since we're trying to reach closure on the vision doc, and I found that I could largely follow what was happening.

@gyuho
Copy link
Contributor Author

gyuho commented May 24, 2021

@anguslees made some improvements to my Rust code (see gyuho/task-scheduler-examples#1). I am still new, so let me know if we need to put some disclaimer that the example is not the most idiomatic Rust code :)

@nikomatsakis nikomatsakis added the status-quo-story-ideas "Status quo" user story ideas label May 24, 2021
Copy link
Member

@rylev rylev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good view into potential confusion when writing async Rust from an experienced Go developers POV. A lot of the details here aren't necessarily specific to async Rust (e.g., atomics, Arc, etc.), but it's still all relevant.

I did see a few misunderstandings of basic Rust types (e.g., Arc). Perhaps it would be relevant to note how Alan learned Rust. Was this Alan's first project in Rust?


## The story

A core component of [`DistriData`][distridata], called `TaskScheduler`, is in charge of (1) receiving client requests via an HTTP server, (2) serializing them in a task queue, (3) relaying each task to the applier, and (4) returning the result back to the client.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I got hung up on the term "applier". Is that a well known term that I'm just not familiar with? If not, maybe replacing it with something a tiny bit more descriptive would help.

}
```

Accustomed to an opinionated way of doing concurrency in Go, Alan loses confidence in Rust async support, as he sees fragmented but specialized solutions in Rust async ecosystem.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting although it's not related to async per say. Atomics are also heavily used in synchronous code. I think we should leave this here though.

src/vision/status_quo/alan_builds_a_task_scheduler.md Outdated Show resolved Hide resolved
}
```

The code compiles and thus must be safe. However, after discussing the code with [Barbara][], Alan learns that `std::sync::Mutex` protects data from concurrent access, but `std::sync::Mutex` itselt must be also protected between threads. And this is where [`std::sync::Arc`](https://doc.rust-lang.org/std/sync/struct.Arc.html) comes in to provide safe multi-threaded access to the `Mutex`. Deeply confused, Alan made a quick fix to wrap `Mutex` with `Arc`:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code compiles and thus must be safe. However, after discussing the code with [Barbara][], Alan learns that std::sync::Mutex protects data from concurrent access, but std::sync::Mutex itselt must be also protected between threads.

Hmm this isn't really true, or at least it's phrased in a potentially confusing way. Arc must be used to make the type useable from several threads. The code will not compile if you try to use it from multiple threads. This is relevant, but perhaps there was a misunderstanding in the communication between Barbara and Alan?

This raises multiple questions for Alan:

1. If `Mutex` itself had to be protected, why `Arc` is not unified into a single type? Is the flexibility of having different types really worth the less safety guarantee?
2. Rust claims unparalleled safety. Is it still true for async programming? Rust compiler not complaining about the missing `Arc` means `Mutex` is still safe without `Arc`?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the code will not compile once you try to use the type in more than thread.

}
```

For Rust, Alan is faced with multiple choices again:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remember that these stories are meant to reflect lived experiences. It's not necessarily relevant whether having choice is better than having a standard way of doing things. If this caused Alan confusion or hesitation, it should be noted.

@gyuho gyuho force-pushed the alan-go branch 3 times, most recently from 5652d2a to 7c68699 Compare May 25, 2021 12:29
@pickfire
Copy link
Contributor

@gyuho Maybe can you write down the experience that you have as Alan, like what kind of knowledge did you have in rust before writing this to let others know by not knowing what material can face this kind of issues. Like the follow.

  • Other than Go, did you know anything else? If that is the only one, how familiar are you with Go?
  • Did you read the rust book? Perhaps just a few chapters?
  • If you did not go through that, maybe can you share other resources you had learned?

I think with these we can know like to which extent newcomers will face if they did not go through some material or understand the concepts. I wonder what gives the knowledge gap on the Arc<Mutex<_>> part because only someone who went through the book or have read some articles about rust only they have seen it.

It would be good to be able to skip the rust book but it seemed necessary for anyone now because rust takes in concepts from many places which makes it complicated. Which I feel contributed to this issue.

Comment on lines +607 to +614
let cloned_uri = req.uri().clone();
let path = cloned_uri.path();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better if we don't need to clone this. We can just do req.uri().path() below.

Side tip, people will usually try to avoid clone() and unwrap() in rust, since it's not always necessary, but sometimes it will make life easier.

Comment on lines +625 to +658
let mut success = false;
let mut req = apply::Request::new();
match path {
"/echo" => match echo::parse_request(&body) {
Ok(bb) => {
req.echo_request = Some(bb);
success = true;
}
Err(e) => {
resp = Response::builder()
.status(StatusCode::INTERNAL_SERVER_ERROR)...
}
},
_ => {
println!("unknown path {}", path);
resp = Response::builder()
.status(StatusCode::INTERNAL_SERVER_ERROR)...
}
}
if success {
match applier.apply(req).await {
Ok(rs) => resp = Response::new(Body::from(rs)),
Err(e) => {
resp = Response::builder()
.status(StatusCode::INTERNAL_SERVER_ERROR)...
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be structured in a better way. Like we don't need to keep a separate variable for success.

@gyuho
Copy link
Contributor Author

gyuho commented May 25, 2021

Maybe can you write down the experience that you have as Alan, like what kind of knowledge did you have in rust before writing this to let others know by not knowing what material can face this kind of issues. Like the follow.

Great callout. Given the sheer number of concepts in Rust (compared to other languages), Alan (me, personally) did not want to wait till the end of the book (which may take multiple months): The Arc<Mutex> is mentioned in the chapter 16 https://doc.rust-lang.org/book/ch16-03-shared-state.html. Yes, https://doc.rust-lang.org/std/sync/struct.Mutex.html explains Arc but if you are given with multiple mutex options and happen to choose async_std, there's no mention of Arc here https://docs.rs/async-std/0.99.7/async_std/sync/struct.RwLock.html. If Alan didn't have to worry about choosing between multiple mutex libraries and just went with std mutex, he would have better understanding by looking at the official doc. I believe this is pretty typical to other new comers as well, and contributed to Alan's lack of understanding on some Rust concepts. I added more details on the "The story" section.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
@nikomatsakis nikomatsakis merged commit 8b7bb1d into rust-lang:master Jul 9, 2021
@nikomatsakis
Copy link
Contributor

Hey @gyuho -- I merged this, not sure if you had further changes in mind, but if so, follow-up PRs welcome!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status-quo-story-ideas "Status quo" user story ideas
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants