Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Geojson improvements #658

Merged
merged 5 commits into from
Mar 16, 2022
Merged

Conversation

lognaturel
Copy link
Member

@lognaturel lognaturel commented Mar 7, 2022

What has been done to verify that this works as intended?

Added tests.

Why is this the best possible solution? Were any other approaches considered?

I was initially going to address the issue with a proguard rule in Collect but I realized that really we should be validating the type as we do other similar validations.

While validating the type, I realized that properties could be empty. This won't work for selects because we need properties for label and underlying value but we shouldn't crash. While I was there, I implemented support for toplevel id which we'd marked as an extension.

How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?

This is isolated to the geojson secondary instance implementation so the only regression risks are there. Test coverage is high so it should be quite safe.

Do we need any specific form for testing your changes? If so, please attach one.

See tests.

Does this change require updates to documentation? If so, please file an issue here and include the link below.

getodk/docs#1423

Will later fail if there's an attempt to use it as a choice for a select. We don't know which properties will be used as underlying value and label so we can't validate at this time.
From GeoJSON spec: "If a Feature has a commonly used identifier, that identifier SHOULD be included as a member of the Feature object with the name "id", and the value of this member is either a JSON string or number." https://datatracker.ietf.org/doc/html/rfc7946\#section-3.2

@Test
public void parse_ignoresUnknownToplevelProperties() throws IOException {
TreeElement featureCollection = GeoJsonExternalInstance.parse("id", r("feature-collection-extra-toplevel.geojson").toString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did this crash before or is there a missing assertion?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and yes. 😫 Fixed and force pushed the last commit. Please review that one again, it was really a disaster.

0.5
]
},
"foo": "bar",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to make this not exactly the same as the property in properties that we do want. It took me a good 5 minutes to understand what was happening here!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. It almost looks like I wandered off before finishing this last commit, sorry. Renamed the property to ignored and force pushed.

@@ -28,6 +28,8 @@
private GeojsonGeometry geometry;
private Map<String, String> properties;

private String id;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the goal of parsing this out? Is it so it can be referenced in forms (questions/choice filters etc)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's to support a top-level id because that has meaning in the geojson spec. Other top-level properties are ignored (per that dreadful commit). Commit details:

From GeoJSON spec: "If a Feature has a commonly used identifier, that identifier SHOULD be included as a member of the Feature object with the name "id", and the value of this member is either a JSON string or number." https://datatracker.ietf.org/doc/html/rfc7946#section-3.2

We give priority to the toplevel id since it's part of the spec but fall back to properties/id since it's also commonly used (see qgis/QGIS#40876, for example).

The id node in JavaRosa's internal representation (which could come from top level or properties/id) is going to be the default underlying value for selects: XLSForm/pyxform#591 But yeah, you could also use it arbitrarily in a form.

@Test
public void parse_addsFeaturesWithNoProperties() throws IOException {
TreeElement featureCollection = GeoJsonExternalInstance.parse("id", r("feature-collection-no-properties.geojson").toString());
assertThat(featureCollection.getChildAt(0).getNumChildren(), is(1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this result in a crash in Collect if the name/label is missing or is there a graceful error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error is caught but I wouldn't call it graceful:

device-2022-03-10-100306

Let me think about improving that in another PR.

import java.io.IOException;
import java.util.Map;
import org.javarosa.core.model.data.UncastData;
import org.javarosa.core.model.instance.TreeElement;

@JsonAutoDetect(fieldVisibility = JsonAutoDetect.Visibility.ANY)
@JsonIgnoreProperties(ignoreUnknown = true)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this, unknown properties would cause a crash.

Copy link
Member

@seadowg seadowg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I've left one comment on a tweak that would be nice to sneak in before merging.

public void parse_prioritizesTopLevelId() throws IOException {
TreeElement featureCollection = GeoJsonExternalInstance.parse("id", r("feature-collection-id-twice.geojson").toString());
assertThat(featureCollection.getChildAt(0).getNumChildren(), is(4));
assertThat(featureCollection.getChildAt(0).getChild("id", 0).getValue().getValue(), is("fs87b"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using more test-tailored values here like "top-level-id" instead of a realistic one ("fs87b") would really help the readability of the test given that the JSON lives externally. That goes for the above test as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, a nice improvement!

@lognaturel lognaturel merged commit 2e50ae5 into getodk:master Mar 16, 2022
@lognaturel lognaturel deleted the geojson-improvements branch March 16, 2022 17:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error when attempting to parse geojson file with release build
2 participants