-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
wkt reader #20
wkt reader #20
Conversation
b7f0e0a
to
6610393
Compare
4a0814e
to
18320e0
Compare
use std::str::FromStr; | ||
// PERF: it would be good to avoid copying data into this string when we already | ||
// have a string as input. Maybe the wkt crate needs a from_reader implementation. | ||
let mut wkt_string = String::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of this file was copied from the geojson_reader. Like there, we copy the input into a String, which seems wasteful, but I don't know of another way around it.
I don't think the wkt crate currently exposes anything other than a from_str constructor.
Does this seem reasonable for now?
edit: the eventual solution might be similar to the work proposed in https://github.com/georust/geozero/pull/16/files
geozero/src/wkt/wkt_reader.rs
Outdated
if let Some(ref coord) = point.0 { | ||
process_coord(coord, multi_dim, 0, processor)?; | ||
} else { | ||
return Err(GeozeroError::Geometry( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK WKT is the only format with an "empty point".
I think erroring here is the safest thing to do here given that there is no support for it in other formats, so it's not really defined how they should handle it.
Another option would be to add
pub trait GeomProcessor {
...
fn empty_coordinate() -> Result {
Err(GeozeroError::Geometry("The WKT Point was empty, but Points cannot be empty in other formats".to_string()))
}
}
And then only override it in the WktReader/WktWriter, but I'm not sure if it would be useful for anything. WDYT @pka?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Paul Ramsey from PostGIS calls ignoring empty geometries strategic laziness (https://gis.stackexchange.com/a/106942). In the meantime PostGIS supports empty geometries and so does GEOS AFAIK. I would continue with strategic laziness and choose the cheapest solution. What happens, if we emit no coordinate and don't return an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried being as lazy as possible with dc6a63a
But I didn't like the results.
#[test]
fn empty_point() {
let wkt = WktStr("POINT EMPTY");
let actual = wkt.to_geo().unwrap();
// This is weird - but it's not obvious to me what we should do in this case.
let expected: geo_types::Geometry<f64> = point!(x: 0.0, y: 0.0).into();
assert_eq!(expected, actual);
}
We can't just "skip" an empty point - since the return type is not optional. It strikes me as pretty bad to return (0, 0).
I've doubled back and reimplemented a "return error" approach when trying to process an empty point into anything other than that WKTWriter.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about representing empty points as Geometry<f64> = point!(x: f64::NaN, y: f64::NaN)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
geo
algorithms assume the absence of NaN values in order to produce correct results, and they don't check for it (for performance reasons), so this would be a source of catastrophic downstream errors (what's the Euclidean distance between Point(0.0, 0.0)
and Point(NaN, NaN)
) and / or a catastrophic perf regression (e.g. checking 10e7 points for NaN values)
geozero/src/wkt/wkt_reader.rs
Outdated
if let Some(ref coord) = point.0 { | ||
processor.point_begin(idx)?; | ||
process_coord(coord, multi_dim, 0, processor)?; | ||
processor.point_end(idx) | ||
} else { | ||
return Err(GeozeroError::Geometry( | ||
"The WKT Point was empty, but Points cannot be empty in other formats".to_string(), | ||
)); | ||
// skip processing of POINT EMPTY for now since no other formats support it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is
processor.point_begin(idx)?;
if let Some(ref coord) = point.0 {
process_coord(coord, multi_dim, 0, processor)?;
} else {
// POINT EMPTY
}
processor.point_end(idx)
causing problems? It looks like the correct representation of an empty geometry to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the spelling is POINT EMPTY
not POINT ()
700bc8f
to
0915431
Compare
0915431
to
ed660a1
Compare
ed660a1
to
9e1d46d
Compare
Looks like we can't avoid going into that rabbit hole. At least we're not alone (locationtech/jts#567 https://trac.osgeo.org/geos/ticket/1005). Looks like your solution with an explicit Anyway let's merge this and improve handling of empty geometries seperately. Thanks a lot! |
Luckily, I think I've expounded on this here: #21 |
Fixes #19
I have a couple things I was hoping to get input on before submitting for actual review... questions inline.Update: OK! Ready for review!
Definitely some weirdness around handling
POINT EMPTY
, but I went with what felt to me like a reasonable compromise of trying to:POINT EMPTY
where possible. Currently this is just Wkt->Wkt which isn't very likely to be used, but I suppose if someone wrote a custom processor, they might want to have the info.EMPTY
in a MultiPoint.Err
when encountering aPOINT EMPTY
. I couldn't find a way to "be lazy" and ignore aPOINT EMPTY
for a top level geometry - we have to return something. And havingWkt("POINT EMPTY).to_geo() -> point!(x: 0, y: 0)
seems likely to be a bad surprise.