Revisit geom validity checks #1

Open
dabreegster opened this issue Feb 9, 2023 · 6 comments

Comments

@dabreegster
Contributor

CC @michaelkirk and @BudgieInWA as FYI

Background

Objects in the geom library perform some validity checks upfront:

  • a Line can't be zero-length
  • a PolyLine can't have any duplicate adjacent points (aka, no internal zero-length line segments)
  • a PolyLine can't double back on itself and repeat points
  • a Ring is like a PolyLine, but the first and last point must match
  • It's unimplemented, but I've wanted to have even stronger checks for PolyLines and Rings -- the polylines should never cross themselves / self-intersect.

The definition of "zero length" depends on this rounding behavior. Everything in geom is meant to logically use fixed-precision arithmetic, so that serialization is idempotent. (Aka, if we calculate some geometry, serialize it, and deserialize it, the result in-memory should be exactly the same as the original thing, so we don't have subtle differences add up later in the traffic simulation. Worth noting that a-b-street/abstreet#689 means something here has been broken for ages, though!)
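A minimal sketch of how rounding and the zero-length check interact. The 4-decimal precision and the names `trim`, `Pt`, and `Line` here are assumptions for illustration, not necessarily what geom does:

```rust
/// Hypothetical precision: keep 4 decimal places (an assumption).
const SCALE: f64 = 10_000.0;

/// Round a value to the fixed precision. Applying it twice changes nothing,
/// which is what makes a serialize/deserialize round trip stable.
fn trim(x: f64) -> f64 {
    (x * SCALE).round() / SCALE
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct Pt {
    x: f64,
    y: f64,
}

impl Pt {
    fn new(x: f64, y: f64) -> Pt {
        Pt { x: trim(x), y: trim(y) }
    }

    fn dist_to(self, other: Pt) -> f64 {
        trim((self.x - other.x).hypot(self.y - other.y))
    }
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct Line {
    pt1: Pt,
    pt2: Pt,
}

impl Line {
    /// Fail fast: reject a line whose *rounded* length is zero.
    fn new(pt1: Pt, pt2: Pt) -> Result<Line, String> {
        if pt1.dist_to(pt2) == 0.0 {
            return Err(format!("Line from {:?} to {:?} is zero-length", pt1, pt2));
        }
        Ok(Line { pt1, pt2 })
    }

    fn length(&self) -> f64 {
        self.pt1.dist_to(self.pt2)
    }
}
```

Under these assumptions, two points that differ by less than half of the last kept decimal place collapse to the same `Pt`, so "zero length" really means "zero after rounding".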

The intention behind upfront checks is to fail fast, instead of letting problems propagate and show up later, which makes debugging harder.

Problem

So so many bugs have been crashes from too-small lines somewhere. a-b-street/abstreet#1005 and a-b-street/abstreet#1051 are recent examples. And over in a GTFS viewer project, I recently switched over to a variation of PolyLine thickening that ignores these types of problems (https://github.com/dabreegster/bus_spotting/blob/7c3a0e8cf716e49d4291b3aa2930c64a90938620/ui/src/multiday/viewer.rs#L349).

In practice, when trying to resolve problems like this, the crash is totally unhelpful. I wind up disabling checks to just get something showing in the UI, so I can try to see what's going on. Sometimes there's no visible problem at all! (a-b-street/abstreet#1051 (comment) and the GTFS viewer case)

Past approach: double down and fix the root cause

The point of loudly crashing upfront is to force us to deal with whatever the root cause is. Sometimes that's been useful -- in a-b-street/abstreet#860, it made it obvious that shifting a polyline multiple times is more dangerous than shifting once from some relative point. (But I think to debug that, I probably disabled all the assertions temporarily...)

Geometry comes from raw input (OSM, GTFS) and often has various problems. We generally try to smooth or dedupe points coming in. Lots of other geometry is derived (like tracing around the block for the LTN tool).

To fix some of the current round of bugs, we could keep trying to find the root cause. But for these specific cases, I tried a bit and couldn't. So, I'm tired of this approach; I don't think it's been helpful.

Short-term proposal: Remove validity checks

When we construct a Line with points too close to each other, just spam a warning to STDOUT or don't even care at all. This would "paper over" many of the current problems. There will be some downstream consequences -- like, what's the angle of a zero-length line segment? But I think most callers don't use the angle or care about it. For example, Ring::get_shorter_slice_between winds up internally iterating over line segments just to check length, and angle doesn't matter at all.
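A rough sketch of what the lenient constructor could look like. `new_lenient`, `EPSILON`, and `total_length` are hypothetical names, and the threshold is an assumption:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Pt {
    x: f64,
    y: f64,
}

#[derive(Clone, Copy, Debug)]
struct Line {
    pt1: Pt,
    pt2: Pt,
}

/// Hypothetical threshold for "too short".
const EPSILON: f64 = 0.0001;

impl Line {
    /// Never fails; only warns when the two points are too close together.
    fn new_lenient(pt1: Pt, pt2: Pt) -> Line {
        let line = Line { pt1, pt2 };
        if line.length() < EPSILON {
            println!("Warning: near-zero-length line at {:?}", pt1);
        }
        line
    }

    fn length(&self) -> f64 {
        (self.pt2.x - self.pt1.x).hypot(self.pt2.y - self.pt1.y)
    }
}

/// A caller that only needs lengths: degenerate segments contribute 0 and
/// the angle is never consulted.
fn total_length(pts: &[Pt]) -> f64 {
    pts.windows(2)
        .map(|w| Line::new_lenient(w[0], w[1]).length())
        .sum()
}
```

A length-only caller like `total_length` keeps working on input with a duplicated point; only code that asks for an angle would have to think about the degenerate case.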

Long-term proposal: Rewrite Line, PolyLine, and Ring APIs

#2 was about cleaning up many of the Polygon methods and always constructing something with valid Rings. I think it's time to attempt a big rewrite of the line-based stuff too, and just use georust wherever possible. PolyLine::dist_along_of_point can be replaced by LineLocatePoint, which isn't so obnoxiously picky about points that're very slightly past the start or end.

georust doesn't have everything we need today (projecting polylines left/right or slicing/clipping them to a [start, end] distance), but the ideal direction forward should be to contribute those algorithms there directly.

@dabreegster
Contributor Author

Ah, a-b-street/abstreet#980 walks through the same reasoning. Time to actually do it!

dabreegster referenced this issue in a-b-street/abstreet Feb 9, 2023
…d a case with the speed controls rendering. #1061
@michaelkirk

> Ah, a-b-street/abstreet#980 walks through the same reasoning. Time to actually do it!

I consider it a good sign that your internal monologue is consistent. 🤣

As far as whether this is a good idea, I'm not really sure either way. Your rationale seems sound, and no one is better positioned to make the judgement call than you.

One significant difference between abstreet:geom and geo::geometry is that geo doesn't have a Ring type, it only has LineStrings, which may or may not be closed. I'm not sure if that has implications for abstreet.

> projecting polylines left/right

That may be this PR: georust/geo#935, though it's been kind of quiet lately (Oh! I see you already left a comment in there at one point).

> but the ideal direction forward should be to contribute those algorithms there directly.

❤️

@BudgieInWA

It certainly is tricky to decide where/how those kinds of constraints should be represented. If typical uses are unwrapping all over the place and crashing, that's an indication that the constraints on the type are too strong (at least for some of those uses).

I don't think the base geometry types should be as fussy as they currently are. For example, using a PolyLine to represent a list of GPS samples seems reasonable, and sometimes I don't care if a Line has zero length. Anything that's not going to crash rendering code seems ok to me.

georust/geo seems pretty light on the checks: geo::geometry::Triangle states that "the three vertices must not be collinear and they must be distinct", but doesn't verify it anywhere; geo::geometry::Line::slope might divide by zero, and there are debug_assert!s all over the implementation. It does have HasDimensions which can be used to detect degenerate Lines etc.

I would prefer to use the type system to provide more safety than that.

E.g. for a Line type that doesn't enforce distinct points, I prefer safe implementations for methods that depend on that, like Line::angle(&self) -> Option<T>. Providing Line::is_proper(&self) -> bool and optionally Line::angle_unchecked(&self) -> T might help the ergonomics in some situations. (I really like how shift_right returns a Result for this reason).
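A sketch of that API shape, with `f64` standing in for `T`. The struct layout is an assumption; only the method names come from the comment above:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Pt {
    x: f64,
    y: f64,
}

/// A line that does NOT enforce distinct endpoints.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Line {
    pt1: Pt,
    pt2: Pt,
}

impl Line {
    /// True when the endpoints are distinct.
    fn is_proper(&self) -> bool {
        self.pt1 != self.pt2
    }

    /// Angle in radians, or None for a degenerate (zero-length) line.
    fn angle(&self) -> Option<f64> {
        if self.is_proper() {
            Some(self.angle_unchecked())
        } else {
            None
        }
    }

    /// Skips the check; the result is meaningless for a degenerate line.
    fn angle_unchecked(&self) -> f64 {
        (self.pt2.y - self.pt1.y).atan2(self.pt2.x - self.pt1.x)
    }
}
```

Callers that never ask for the angle pay nothing for degenerate input, and callers that do are forced by the `Option` to decide what a zero-length line means for them.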

Another idea is to provide wrapper types or newtypes around the basic types that enforce additional constraints. This would be super helpful for users, because they can have fields of type ProperLine and NonOverlappingLineString as applicable. Ring is an example of this that I have found useful for its semantics.

This would be pretty easy to implement if the user is expected to cast to the base type in order to work with the geometry (using Into or AsRef or whatever). For extra credit, there's also the possibility for a super ergonomic API that encodes the limitation of all the operations. Like, ProperLine::angle(&self) -> T would be safe, and Ring could have grow(dist) -> Ring which can't fail, but shrink(dist) -> Option<Ring>, because the ring might shrink into nothing.
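One way the newtype idea could look, as a sketch. The `Line` fields and the `String` error type are assumptions; `ProperLine` is the name suggested above:

```rust
use std::convert::TryFrom;

/// Base type: no constraints on the endpoints.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Line {
    x1: f64,
    y1: f64,
    x2: f64,
    y2: f64,
}

/// Newtype that can only be built from a Line with distinct endpoints.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ProperLine(Line);

impl TryFrom<Line> for ProperLine {
    type Error = String;

    fn try_from(line: Line) -> Result<Self, Self::Error> {
        if line.x1 == line.x2 && line.y1 == line.y2 {
            Err("endpoints are not distinct".to_string())
        } else {
            Ok(ProperLine(line))
        }
    }
}

/// Cheap access to the base type for everything that doesn't need the constraint.
impl AsRef<Line> for ProperLine {
    fn as_ref(&self) -> &Line {
        &self.0
    }
}

impl ProperLine {
    /// Infallible, because the constructor already ruled out zero length.
    fn angle(&self) -> f64 {
        let l = &self.0;
        (l.y2 - l.y1).atan2(l.x2 - l.x1)
    }
}
```

The check happens once, at the conversion, and a struct field of type `ProperLine` then documents the constraint for everyone who reads it.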

@BudgieInWA

> georust doesn't have everything we need today (projecting polylines left/right or slicing/clipping them to a [start, end] distance), but the ideal direction forward should be to contribute those algorithms there directly.

We should consider implementing our own traits and trait impls for the georust type for these algorithms. Ideally they eventually land upstream, but we get the nice interface right away.
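A sketch of the extension-trait pattern, using slicing to a [start, end] distance as the example. To stay self-contained, `LineString` below is a local stand-in for the georust type, and `SliceByDistance` / `slice_by_distance` are hypothetical names. Because the trait is defined locally, Rust's orphan rule would allow the same impl on a type from another crate:

```rust
type Coord = (f64, f64);

/// Stand-in for a line string type owned by another crate.
struct LineString(Vec<Coord>);

/// Locally defined trait, so it can be implemented for foreign types.
trait SliceByDistance {
    /// Portion of the line between `start` and `end` (distances along it).
    /// None if the range is empty, negative, or runs past the end.
    fn slice_by_distance(&self, start: f64, end: f64) -> Option<Vec<Coord>>;
}

fn dist(a: Coord, b: Coord) -> f64 {
    (b.0 - a.0).hypot(b.1 - a.1)
}

fn lerp(a: Coord, b: Coord, t: f64) -> Coord {
    (a.0 + t * (b.0 - a.0), a.1 + t * (b.1 - a.1))
}

impl SliceByDistance for LineString {
    fn slice_by_distance(&self, start: f64, end: f64) -> Option<Vec<Coord>> {
        // Written this way so NaN inputs are rejected too.
        if !(start >= 0.0 && start < end) {
            return None;
        }
        let mut out = Vec::new();
        let mut walked = 0.0;
        for pair in self.0.windows(2) {
            let (a, b) = (pair[0], pair[1]);
            let len = dist(a, b);
            let seg_end = walked + len;
            // Skip degenerate segments and segments entirely before `start`.
            if len > 0.0 && seg_end > start {
                if out.is_empty() {
                    out.push(lerp(a, b, (start - walked) / len));
                }
                if seg_end >= end {
                    out.push(lerp(a, b, (end - walked) / len));
                    return Some(out);
                }
                out.push(b);
            }
            walked = seg_end;
        }
        None // `end` is past the total length
    }
}
```

For an L-shaped line from (0, 0) to (10, 0) to (10, 10), slicing from 5 to 15 yields (5, 0), (10, 0), (10, 5). This version is strict about out-of-range distances; whether it should clamp instead is a separate design choice.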

@dabreegster
Contributor Author

> E.g. for a Line type that doesn't enforce distinct points, I prefer safe implementations for methods that depend on that, like Line::angle(&self) -> Option<T>

This makes sense to me -- the upfront check of a non-zero length line is too strict, but if something later on actually cares about angle, zero-length does matter.

A similar piece of unnecessarily strict checks is dist_along_of_point, which I've been trying to change to use https://docs.rs/geo/latest/geo/algorithm/line_locate_point/trait.LineLocatePoint.html. If we feed in a distance that's slightly negative or slightly past the end of the polyline, currently we fail, and this occasionally causes weirdness in intersection geometry code.
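For reference, a std-only sketch of the forgiving behavior being aimed for: locate a point along a polyline as a fraction of its length, clamping instead of failing when the point projects slightly before the start or past the end. This is a hand-written illustration, not the georust implementation; `locate_point` is a hypothetical name, and non-finite input is not handled:

```rust
type Pt = (f64, f64);

fn seg_len(a: Pt, b: Pt) -> f64 {
    (b.0 - a.0).hypot(b.1 - a.1)
}

/// Fraction in [0, 1] of the polyline's length at which the point closest
/// to `p` lies. None only when the polyline has no length at all.
fn locate_point(pts: &[Pt], p: Pt) -> Option<f64> {
    let total: f64 = pts.windows(2).map(|w| seg_len(w[0], w[1])).sum();
    if total == 0.0 {
        return None;
    }
    let mut best_dist = f64::INFINITY;
    let mut best_along = 0.0;
    let mut walked = 0.0;
    for w in pts.windows(2) {
        let (a, b) = (w[0], w[1]);
        let len = seg_len(a, b);
        if len > 0.0 {
            // Project p onto the segment, clamping to its endpoints.
            let dot = (p.0 - a.0) * (b.0 - a.0) + (p.1 - a.1) * (b.1 - a.1);
            let t = (dot / (len * len)).clamp(0.0, 1.0);
            let q = (a.0 + t * (b.0 - a.0), a.1 + t * (b.1 - a.1));
            let d = seg_len(p, q);
            if d < best_dist {
                best_dist = d;
                best_along = walked + t * len;
            }
        }
        walked += len;
    }
    Some(best_along / total)
}
```

A point just past either end maps to 0.0 or 1.0 rather than producing an error, and zero-length segments are simply skipped.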

Some specific tasks I'll try to tackle:

  • Use LineLocatePoint instead of dist_along_of_point, and maybe make the API infallible
  • Revisit duplicate point checks in all the constructors, and maybe adjust things like angle to return Option
  • Try to replace some of the line/polyline intersection math with georust things
  • Rewrite the polyline slicing operations more carefully, and try to upstream in georust

@BudgieInWA

> If we feed in a distance that's slightly negative or slightly past the end of the polyline, currently we fail...

A version of the method that "clamps" to the line (i.e. using 0 if a negative dist is passed in) would be ergonomic for some use cases.
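A sketch of such a clamping variant, on a bare list of points. `point_at_dist_clamped` is a hypothetical name, not an existing geom method:

```rust
type Pt = (f64, f64);

/// Point at `dist` along the polyline. A negative `dist` is treated as 0,
/// and a `dist` past the end returns the last point. None only for empty input.
fn point_at_dist_clamped(pts: &[Pt], dist: f64) -> Option<Pt> {
    let mut last = *pts.first()?;
    let mut remaining = dist.max(0.0);
    for pair in pts.windows(2) {
        let (a, b) = (pair[0], pair[1]);
        let len = (b.0 - a.0).hypot(b.1 - a.1);
        // Zero-length segments are skipped rather than treated as errors.
        if len > 0.0 && remaining <= len {
            let t = remaining / len;
            return Some((a.0 + t * (b.0 - a.0), a.1 + t * (b.1 - a.1)));
        }
        remaining -= len;
        last = b;
    }
    Some(last)
}
```

With this shape, a distance of -0.001 or 10.001 on a 10-unit line returns an endpoint instead of an error, which is the behavior the intersection geometry code seems to want.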

dabreegster referenced this issue in a-b-street/abstreet Feb 13, 2023
11f37f1 (#1061).

This is breaking more things in the map importer pipeline. Need to
rethink this change and do it more cautiously.
dabreegster referenced this issue in a-b-street/abstreet Feb 13, 2023
dabreegster referenced this issue in a-b-street/abstreet Feb 13, 2023
- new osm2streets default lane widths
- some big duplicate osm.pbf input files, now that we never clip a
  boundary near a Geofabrik boundary
- most maps are still clipped to an old Geofabrik boundary, because the
  .osm file was still around
- downtown Seattle starts infinite-looping for blockfinding (will fix
  later)
- hopefully a large class of geometry crashes go away, because
  serialized geometry matches the original in-memory version (see #1061)
@dabreegster dabreegster transferred this issue from a-b-street/abstreet Nov 27, 2023
3 participants