Parallelize importer #50
Comments
This is the primary relevant file for this issue right?
Are there any others I should be mindful of as well? @dabreegster |
That's the main one. The other modules in importer are worth a look too, but hopefully they're pretty simple. I think the only easy opportunity for parallelization is the The only potential race conditions would be attempting to produce the same file at the same time. This could happen by two callers trying to do (By the way, if you know of a simple way to express a dependency graph of tasks and be idempotent / detect work that's already been done, there may be a much better way to express everything in importer.) |
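One way to rule out the "same file produced twice at the same time" race is to route the shared step through `std::sync::Once`. This is a minimal sketch, not the importer's code: `produce_shared_file`, `ensure_shared_file`, and `run_callers` are hypothetical stand-ins, and the counter only exists to show how many times the step actually ran.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;
use std::thread;

// Hypothetical bookkeeping: counts how often the expensive step really ran.
static RUNS: AtomicUsize = AtomicUsize::new(0);
// Guards the shared step so it executes at most once per process.
static SHARED_INPUT: Once = Once::new();

/// Hypothetical placeholder for a step that writes a file several jobs need.
fn produce_shared_file() {
    RUNS.fetch_add(1, Ordering::SeqCst);
}

/// Safe to call from any number of threads. The first caller runs the step;
/// concurrent callers block until it has finished, later callers return at once.
fn ensure_shared_file() {
    SHARED_INPUT.call_once(produce_shared_file);
}

/// Spawns `n` threads that all ask for the shared file, then reports how
/// many times the step ran.
fn run_callers(n: usize) -> usize {
    let handles: Vec<_> = (0..n).map(|_| thread::spawn(ensure_shared_file)).collect();
    for handle in handles {
        handle.join().expect("caller panicked");
    }
    RUNS.load(Ordering::SeqCst)
}

fn main() {
    println!("producer ran {} time(s)", run_callers(4));
}
```

This only makes the step idempotent within one process; it does not detect work already done by an earlier run.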
Okay, so from looking at it, it seems like the only jobs that can be run in parallel within
let mut was_ensure_popdat_exists_called = false;
for name in names {
    // Leave job.osm_to_raw alone
    if job.osm_to_raw {
        match job.city.as_ref() {
            "austin" => austin::osm_to_raw(&name),
            "los_angeles" => los_angeles::osm_to_raw(&name),
            "seattle" => seattle::osm_to_raw(&name),
            x => panic!("Unknown city {}", x),
        }
    }
    // Spawn a thread to manage this
    if job.raw_to_map {
        utils::raw_to_map(&name, job.use_fixes);
    }
    if job.scenario {
        assert_eq!(job.city, "seattle");
        // Make sure ensure_popdat_exists is only called once
        if !was_ensure_popdat_exists_called {
            seattle::ensure_popdat_exists(job.use_fixes);
            was_ensure_popdat_exists_called = true;
        }
        // Spawn a thread to manage this
        let mut timer = abstutil::Timer::new(format!("Scenario for {}", name));
        let map = map_model::Map::new(abstutil::path_map(&name), job.use_fixes, &mut timer);
        soundcast::make_weekday_scenario(&map, &mut timer).save();
    }
    if job.scenario_everyone {
        assert_eq!(job.city, "seattle");
        // Same guard as above, so the shared step still runs only once
        if !was_ensure_popdat_exists_called {
            seattle::ensure_popdat_exists(job.use_fixes);
            was_ensure_popdat_exists_called = true;
        }
        // Spawn a thread to manage this
        let mut timer = abstutil::Timer::new(format!("Scenario for {}", name));
        let map = map_model::Map::new(abstutil::path_map(&name), job.use_fixes, &mut timer);
        soundcast::make_weekday_scenario_with_everyone(&map, &mut timer).save();
    }
}
// Wait for all threads to complete
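The "spawn a thread" and "wait for all threads" comments in the plan above could be sketched with scoped threads from the standard library (`std::thread::scope`, available since Rust 1.63). This is a rough sketch under stated assumptions: `ensure_shared_input` and `convert_map` are hypothetical stand-ins for `seattle::ensure_popdat_exists` and the per-map work, and `import_all` is an invented name.

```rust
use std::thread;

/// Hypothetical stand-in for one-time setup such as `seattle::ensure_popdat_exists`.
fn ensure_shared_input() {}

/// Hypothetical stand-in for the per-map work; returns the name of its output.
fn convert_map(name: &str) -> String {
    format!("{}.bin", name)
}

/// Runs the shared setup once, then converts every map on its own thread.
fn import_all(names: &[String]) -> Vec<String> {
    // The shared step happens before any thread exists, so no flag is needed.
    ensure_shared_input();

    thread::scope(|s| {
        // One thread per map; each borrows its name from `names`.
        let handles: Vec<_> = names
            .iter()
            .map(|name| s.spawn(move || convert_map(name)))
            .collect();

        // Wait for all threads; results come back in input order.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker panicked"))
            .collect()
    })
}

fn main() {
    let names = vec!["a".to_string(), "b".to_string()];
    println!("{:?}", import_all(&names));
}
```

Hoisting the setup out of the loop replaces the `was_ensure_popdat_exists_called` flag, which would otherwise have to be shared between threads.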
This should almost work. The only problem is that I haven't done much parallelism in Rust before, but it could be nice to use the type system to statically enforce correctness. What if we had some marker type that does not implement Also, the generic dependency graph executor I was thinking of is https://github.com/salsa-rs/salsa. Probably overkill for this tool, though. |
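The trait named in the comment above did not survive, so the following is only one possible reading of the marker-type idea: a token that can be obtained solely by running the shared setup, and that every dependent job must receive. All names here (`popdat::InputReady`, `ensure_input_exists`, `make_scenario`) are hypothetical.

```rust
mod popdat {
    /// Proof that the shared input exists. The private field means code
    /// outside this module cannot construct one by hand.
    pub struct InputReady(());

    /// Hypothetical stand-in for the setup step; the only way to get a token.
    pub fn ensure_input_exists() -> InputReady {
        // (the real work of producing the shared file would happen here)
        InputReady(())
    }
}

/// Hypothetical per-map job. It cannot be called without a token, so
/// forgetting the setup step becomes a compile error instead of a runtime bug.
fn make_scenario(name: &str, _proof: &popdat::InputReady) -> String {
    format!("scenario for {}", name)
}

fn main() {
    let proof = popdat::ensure_input_exists();
    println!("{}", make_scenario("example", &proof));
}
```

Passing the token by shared reference lets several jobs use the same proof without the token needing to be copyable.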
Agreed that the dependency graph executor is probably a bad move. Plus, it seems unstable and I imagine adding unstable dependencies wouldn't be healthy for this project. I'll take a stab at a marker type but other than that I'll look to execute the plan from my earlier comment. |
Maybe I'm missing something, but I keep getting the following compile error on the latest revision of the repo, @dabreegster:
error[E0599]: no associated item named `MAX` found for type `f64` in the current scope
--> geom/src/bounds.rs:16:25
|
16 | min_x: f64::MAX,
| ^^^ associated item not found in `f64`
|
= help: items from traits can only be used if the trait is in scope
= note: the following trait is implemented but not in scope; perhaps add a `use` for it:
`use rand::distributions::weighted::alias_method::Weight;`
help: you are looking for the module in `std`, not the primitive type
|
16 | min_x: std::f64::MAX,
| ^^^^^^^^^^^^^
error[E0599]: no associated item named `MAX` found for type `f64` in the current scope
--> geom/src/bounds.rs:17:25
|
17 | min_y: f64::MAX,
| ^^^ associated item not found in `f64`
|
= help: items from traits can only be used if the trait is in scope
= note: the following trait is implemented but not in scope; perhaps add a `use` for it:
`use rand::distributions::weighted::alias_method::Weight;`
help: you are looking for the module in `std`, not the primitive type
|
17 | min_y: std::f64::MAX,
| ^^^^^^^^^^^^^
error[E0599]: no associated item named `MIN` found for type `f64` in the current scope
--> geom/src/bounds.rs:18:25
|
18 | max_x: f64::MIN,
| ^^^ associated item not found in `f64`
|
help: you are looking for the module in `std`, not the primitive type
|
18 | max_x: std::f64::MIN,
| ^^^^^^^^^^^^^
error[E0599]: no associated item named `MIN` found for type `f64` in the current scope
--> geom/src/bounds.rs:19:25
|
19 | max_y: f64::MIN,
| ^^^ associated item not found in `f64`
|
help: you are looking for the module in `std`, not the primitive type
|
19 | max_y: std::f64::MIN,
| ^^^^^^^^^^^^^
error[E0599]: no associated item named `MAX` found for type `f64` in the current scope
--> geom/src/bounds.rs:96:27
|
96 | min_lon: f64::MAX,
| ^^^ associated item not found in `f64`
|
= help: items from traits can only be used if the trait is in scope
= note: the following trait is implemented but not in scope; perhaps add a `use` for it:
`use rand::distributions::weighted::alias_method::Weight;`
help: you are looking for the module in `std`, not the primitive type
|
96 | min_lon: std::f64::MAX,
| ^^^^^^^^^^^^^
error[E0599]: no associated item named `MAX` found for type `f64` in the current scope
--> geom/src/bounds.rs:97:27
|
97 | min_lat: f64::MAX,
| ^^^ associated item not found in `f64`
|
= help: items from traits can only be used if the trait is in scope
= note: the following trait is implemented but not in scope; perhaps add a `use` for it:
`use rand::distributions::weighted::alias_method::Weight;`
help: you are looking for the module in `std`, not the primitive type
|
97 | min_lat: std::f64::MAX,
| ^^^^^^^^^^^^^
error[E0599]: no associated item named `MIN` found for type `f64` in the current scope
--> geom/src/bounds.rs:98:27
|
98 | max_lon: f64::MIN,
| ^^^ associated item not found in `f64`
|
help: you are looking for the module in `std`, not the primitive type
|
98 | max_lon: std::f64::MIN,
| ^^^^^^^^^^^^^
error[E0599]: no associated item named `MIN` found for type `f64` in the current scope
--> geom/src/bounds.rs:99:27
|
99 | max_lat: f64::MIN,
| ^^^ associated item not found in `f64`
|
help: you are looking for the module in `std`, not the primitive type
|
99 | max_lat: std::f64::MIN,
| ^^^^^^^^^^^^^
error: aborting due to 8 previous errors
For more information about this error, try `rustc --explain E0599`.
error: could not compile `geom`. |
I upgraded to Rust 1.43 on April 23, which made those imports unnecessary. Try |
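For context, a small sketch of the difference behind those errors: Rust 1.43 added `MAX` and `MIN` as associated constants on the primitive float types, so `f64::MAX` compiles there, while older toolchains need the `std::f64` module path that the compiler suggests. The function names are hypothetical; the "start min at MAX, max at MIN" pattern mirrors the lines quoted from `geom/src/bounds.rs`.

```rust
/// Starting values for a bounding box, written with the associated
/// constants that need Rust 1.43 or newer.
fn inverted_bounds() -> (f64, f64) {
    (f64::MAX, f64::MIN)
}

/// The same values through the older module-level constants, which also
/// compile on toolchains before 1.43.
#[allow(deprecated)]
fn inverted_bounds_legacy() -> (f64, f64) {
    (std::f64::MAX, std::f64::MIN)
}

fn main() {
    // Both spellings name the same values.
    println!("{}", inverted_bounds() == inverted_bounds_legacy());
}
```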
The importer tool converts one map at a time. Easy opportunity to make use of all CPUs. Simple change in one layer of the code. Makes logging harder to read, but eh. Only trick is that ensure_popdat_exists has to be called before the parallelization starts.
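To keep the thread count tied to the number of CPUs rather than the number of maps, one option is a small worker pool pulling names from a shared queue. A minimal sketch using only the standard library; `convert_map` and `import_with_pool` are hypothetical names, and tagging each log line with its map name is just one way to keep interleaved output readable.

```rust
use std::sync::Mutex;
use std::thread;

/// Hypothetical per-map job; returns a log line tagged with the map name.
fn convert_map(name: &str) -> String {
    format!("[{}] done", name)
}

/// Converts all maps using at most `workers` threads and returns the
/// collected log lines, sorted so the result is deterministic.
fn import_with_pool(names: Vec<String>, workers: usize) -> Vec<String> {
    let queue = Mutex::new(names);
    let logs = Mutex::new(Vec::new());

    thread::scope(|s| {
        for _ in 0..workers.max(1) {
            s.spawn(|| loop {
                // Hold the queue lock only long enough to take one name.
                let next = queue.lock().unwrap().pop();
                match next {
                    Some(name) => {
                        let line = convert_map(&name);
                        logs.lock().unwrap().push(line);
                    }
                    None => break,
                }
            });
        }
    });

    let mut out = logs.into_inner().unwrap();
    out.sort();
    out
}

fn main() {
    // Any shared setup (such as ensure_popdat_exists) would run here, once,
    // before the pool starts.
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let names = vec!["a".to_string(), "b".to_string(), "c".to_string()];
    for line in import_with_pool(names, workers) {
        println!("{}", line);
    }
}
```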