Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate origin/destination subpoints #11

Merged
merged 8 commits into from Feb 8, 2022
Merged

Separate origin/destination subpoints #11

merged 8 commits into from Feb 8, 2022

Conversation

dabreegster
Copy link
Owner

This splits the --subpoints-path flag into separate --subpoints-origins-path and --subpoints-destinations-path flags. If either one isn't specified, the tool falls back to picking random points instead.

No support for weighted subpoints yet; I'll do that separately. There was some other cleanup to do first, so this PR is already big

Copy link
Owner Author

@dabreegster dabreegster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to merge and keep going. Next steps:

  1. Do the renaming you proposed
  2. Add weighted subpoint support
  3. Make a separate command for full disaggregation, with a different output format

@@ -67,7 +67,8 @@ With reference to the test data in this repo, you can run the `jitter` command l
```{bash}
odjitter --od-csv-path data/od.csv \
--zones-path data/zones.geojson \
--subpoints-path data/road_network.geojson \
--subpoints-origins-path data/road_network.geojson \
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only updated this file; I'm having lots of issues regenerating the .md...

  1. The bash commands all rely on an odjitter command existing in my $PATH, but if I'm developing that command, the new copy is in ./target/{release,debug}/odjitter instead. I can temporarily change that, but that's a very clunky workflow. Or is the intention to do cargo install from my local development copy?
  2. I had to Rscript -e 'install.packages("od")' and for readr, and also install pandoc. It very much weirds me out that these dependencies aren't more explicit, get installed in some system-wide place, and aren't pinned to a version. But we've discussed this philosophical difference before, it's fine. :)
  3. When I regenerated the .md, there were a bunch of spurious diffs in the output with whitespace, and it left random output_max50.geojson files without any cleanup

Since there are more changes coming, I'm going to leave the Rmd and md out of sync and sort this out at the end.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the heads up and apologies for the clunky nature of it. There should be absolutely minimal dependencies. Plan: I can take care of the docs. Thinking of porting the .Rmd to .qmd which should remove the R dependency totally. Sound like a plan?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be worth trying, but doesn't quarto turn around and use R or something else underneath for all the interactive script running? Why don't we just write a regular .md without any magic at all? Embedding command output in the README is an unusual practice for crates.io. Maybe the README could stay really narrowly focused, and the interactive walk-through style guide is better suited for a vignette of a future R package that interfaces with odjitter?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be worth trying, but doesn't quarto turn around and use R or something else underneath for all the interactive script running? Why don't we just write a regular .md without any magic at all? Embedding command output in the README is an unusual practice for crates.io. Maybe the README could stay really narrowly focused, and the interactive walk-through style guide is better suited for a vignette of a future R package that interfaces with odjitter?

I don't think so. Quarto does not depend on R (only if you try to compile to .pdf it by default tries to install a minimal LaTeX installation via R I think). Open minded and happy to look at reducing the length of the README although currently it provides context and usage about the odjitter crate not the (currently non existent) R/Python interfaces.

// TODO Crashes on NaNs, infinity
Value::Number(serde_json::Number::from_f64(x / repeat).unwrap())
// Scale all of the numeric values, unless all_key for this row is 0
let scaled = if repeat == 0.0 { x } else { x / repeat };
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's part of the problem we discussed yesterday. We only transform numeric values if all_key / max_per_od is non-zero now.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable. In my mind repeat would be 1 when the threshold_key is 0 but this makes that explicit (x / 1 equals x).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at other commits and comments, maybe repeat could be set as 1 or more on creation (not zero). But that does matter so ignore these comments!

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6f42425
You're right, setting to 1 is simpler. Thanks!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Feel I'm taking baby steps in understanding Rust code that will make me a better programmer all round. Many thanks for quick implementation and glad the comment led to a positive change in the code base 🙏

Ok(points)
}

fn geometry_to_points(geom: geo_types::Geometry<f64>) -> Vec<Point<f64>> {
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Michael clued me in I can probably use https://docs.rs/geo/latest/geo/algorithm/coords_iter/trait.CoordsIter.html, I'll try this separately

}
}

#[test]
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unit test code is very verbose and awkward; I'm not happy with it at all, will keep iterating.

@dabreegster dabreegster merged commit 14756dd into main Feb 8, 2022
@dabreegster
Copy link
Owner Author

Oops, forgot to associate this with #7

@dabreegster dabreegster deleted the weighted_subp branch February 8, 2022 11:31
@Robinlovelace Robinlovelace mentioned this pull request Feb 8, 2022
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants