New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Separate origin/destination subpoints #11
Conversation
(Test fails, because of incorrect handling of rows with 0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to merge and keep going. Next steps:
- Do the renaming you proposed
- Add weighted subpoint support
- Make a separate command for full disaggregation, with a different output format
@@ -67,7 +67,8 @@ With reference to the test data in this repo, you can run the `jitter` command l | |||
```{bash} | |||
odjitter --od-csv-path data/od.csv \ | |||
--zones-path data/zones.geojson \ | |||
--subpoints-path data/road_network.geojson \ | |||
--subpoints-origins-path data/road_network.geojson \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only updated this file; I'm having lots of issues regenerating the .md
...
- The bash commands all rely on an
odjitter
command existing in my$PATH
, but if I'm developing that command, the new copy is in./target/{release,debug}/odjitter
instead. I can temporarily change that, but that's a very clunky workflow. Or is the intention to docargo install
from my local development copy? - I had to
Rscript -e 'install.packages("od")'
and forreadr
, and also install pandoc. It very much weirds me out that these dependencies aren't more explicit, get installed in some system-wide place, and aren't pinned to a version. But we've discussed this philosophical difference before, it's fine. :) - When I regenerated the .md, there were a bunch of spurious diffs in the output with whitespace, and it left random
output_max50.geojson
files without any cleanup
Since there are more changes coming, I'm going to leave the Rmd and md out of sync and sort this out at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the heads up and apologies for the clunky nature of it. There should be absolutely minimal dependencies. Plan: I can take care of the docs. Thinking of porting the .Rmd to .qmd which should remove the R dependency totally. Sound like a plan?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be worth trying, but doesn't quarto turn around and use R or something else underneath for all the interactive script running? Why don't we just write a regular .md
without any magic at all? Embedding command output in the README is an unusual practice for crates.io. Maybe the README could stay really narrowly focused, and the interactive walk-through style guide is better suited for a vignette of a future R package that interfaces with odjitter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be worth trying, but doesn't quarto turn around and use R or something else underneath for all the interactive script running? Why don't we just write a regular
.md
without any magic at all? Embedding command output in the README is an unusual practice for crates.io. Maybe the README could stay really narrowly focused, and the interactive walk-through style guide is better suited for a vignette of a future R package that interfaces with odjitter?
I don't think so. Quarto does not depend on R (only if you try to compile to .pdf it by default tries to install a minimal LaTeX installation via R I think). Open minded and happy to look at reducing the length of the README although currently it provides context and usage about the odjitter
crate not the (currently non existent) R/Python interfaces.
// TODO Crashes on NaNs, infinity | ||
Value::Number(serde_json::Number::from_f64(x / repeat).unwrap()) | ||
// Scale all of the numeric values, unless all_key for this row is 0 | ||
let scaled = if repeat == 0.0 { x } else { x / repeat }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's part of the problem we discussed yesterday. We only transform numeric values if all_key / max_per_od
is non-zero now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable. In my mind repeat
would be 1 when the threshold_key
is 0 but this makes that explicit (x / 1
equals x
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at other commits and comments, maybe repeat
could be set as 1 or more on creation (not zero). But that does matter so ignore these comments!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
6f42425
You're right, setting to 1 is simpler. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! Feel I'm taking baby steps in understanding Rust code that will make me a better programmer all round. Many thanks for quick implementation and glad the comment led to a positive change in the code base 🙏
Ok(points) | ||
} | ||
|
||
fn geometry_to_points(geom: geo_types::Geometry<f64>) -> Vec<Point<f64>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Michael clued me in I can probably use https://docs.rs/geo/latest/geo/algorithm/coords_iter/trait.CoordsIter.html, I'll try this separately
} | ||
} | ||
|
||
#[test] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The unit test code is very verbose and awkward; I'm not happy with it at all, will keep iterating.
Oops, forgot to associate this with #7 |
This splits the
--subpoints-path
flag into separate--subpoints-origins-path
and--subpoints-destinations-path
flags. If either one isn't specified, the tool falls back to picking random points instead.No support for weighted subpoints yet; I'll do that separately. There was some other cleanup to do first, so this PR is already big