Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GeoJSON to flat binary arrays #690

Merged
merged 27 commits into from Mar 19, 2020
Merged

Conversation

kylebarron
Copy link
Collaborator

@kylebarron kylebarron commented Mar 14, 2020

@ibgreen

Ref #685

The return data is of the format:

{
  points: {
    // Array of x, y or x, y, z positions
    positions: {value: Float32Array, size: coordLength},
    // Array of original feature indexes by vertex
    objectIds: {value: Uint32Array, size: 1},
  },
  lines: {
    // Indices within positions of the start of each individual LineString
    pathIndices: {value: Uint32Array, size: 1},
    // Array of x, y or x, y, z positions
    positions: {value: Float32Array, size: coordLength},
    // Array of original feature indexes by vertex
    objectIds: {value: Uint32Array, size: 1},
  },
  polygons: {
    // Indices within positions of the start of each complex Polygon
    polygonIndices: {value: Uint32Array, size: 1},
    // Indices within positions of the start of each primitive Polygon/ring
    primitivePolygonIndices: {value: Uint32Array, size: 1},
    // Array of x, y or x, y, z positions
    positions: {value: Float32Array, size: coordLength},
    // Array of original feature indexes by vertex
    objectIds: {value: Uint32Array, size: 1},
  }
}

I suppose that positionFormat of either XY or XYZ should also be returned.

  • The data type used for positions defaults to Float32Array, but can optionally be changed to Float64Array
  • I'm bad at naming objects/functions; better name suggestions are welcomed
  • I figured splitting functions within secondPass would be helpful to keep code cleaner and test them individually, but currently functions are impure, which feels wrong. I'm not sure the best way to fix that.
  • If any coordinate is 3D, all coordinates are assumed to be 3D. So if coordinates are a mix of 2D/3D, I think the third coordinate of a 2D position would stay as the TypedArray's default.
  • If the general structure of the code is ok, I can add some tests. Is it possible to test non-exported functions? Or should I just test the exported top-level function?
  • I still haven't touched feature properties at all
  • I didn't spend too much effort on the scaffolding, so that could be improved.

Copy link
Collaborator

@ibgreen ibgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kylebarron This looks very good indeed!

I'd land this right now if maintainers weren't prepping for the 2.1 release.

@xintongxia as soon as you have cut 2.1-release, let's land this.

We need to add

  • modules/gis/docs/api-reference/features-to-binary.md
  • At least some trivial test cases (just something trivial that at least makes sure this code is not "dead on arrival", e.g. find an existing geojson snippet in some modules/*/test/data folder and reuse it).

modules/gis/package.json Outdated Show resolved Hide resolved
modules/gis/src/lib/geojson-to-binary.js Outdated Show resolved Hide resolved
modules/gis/src/lib/geojson-to-binary.js Outdated Show resolved Hide resolved
modules/gis/src/lib/geojson-to-binary.js Outdated Show resolved Hide resolved
modules/gis/docs/README.md Outdated Show resolved Hide resolved
modules/gis/src/lib/geojson-to-binary.js Show resolved Hide resolved
modules/gis/docs/README.md Show resolved Hide resolved
@coveralls
Copy link

coveralls commented Mar 16, 2020

Coverage Status

Coverage increased (+0.6%) to 58.378% when pulling daa2281 on kylebarron:geojson-to-binary into a9fe9c6 on uber-web:master.

@kylebarron
Copy link
Collaborator Author

I'll try to add some tests later tonight.

Do you have any other suggestions for the properties? When this is integrated with MVTLoader, will we still need to return normal arrays/objects for properties?

Something like this? Where we remove the coordinates from the geometry object?

features: {
  "type": "Feature",
  "properties": {...},
  "geometry": {"type": "LineString"}
}

Or do you have in mind returning all properties in typed arrays, where the value is repeated for each vertex? Even strings? That might become hard to work with.

@ibgreen
Copy link
Collaborator

ibgreen commented Mar 17, 2020

If you want to provide properties with the binary arrays

  • one option would be to return one array of property objects per binary bundle (points, lines, polygons).
  • since you are splitting into separate arrays seems there is no need to keep the feature wrapper, just keep the properties objects in the arrays?

But yes, I was thinking that we would generate additional binary arrays (for numerical properties only).

@kylebarron
Copy link
Collaborator Author

I've been thinking about how I'd use the returned binary data, and in order to apply styling in Deck.gl, I'd need access to the feature properties, including string types. Theoretically you could encode UTF-8 characters in a binary array, but I'm not sure if that's worth it.

To make sure I'm on the same page; you're thinking something like (properties in normal objects):

{
  points: {
    // Array of x, y or x, y, z positions
    positions: {value: Float32Array, size: coordLength},
    // Array of original feature indexes by vertex
    objectIds: {value: Uint32Array, size: 1},
	// Array of property objects
	properties: [
		{...}
	]
  },
  ...
}

(binary numeric properties and string properties in objects)

{
  points: {
    // Array of x, y or x, y, z positions
    positions: {value: Float32Array, size: coordLength},
    // Array of original feature indexes by vertex
    objectIds: {value: Uint32Array, size: 1},
	// Object with numeric properties
	numericProperties: {
		propertyName: {value: Float32Array, size: 1}
	}
	// Array of string property objects
	properties: [
		{...}
	]
  },
  ...
}

modules/gis/src/lib/geojson-to-binary.js Outdated Show resolved Hide resolved
modules/gis/src/lib/geojson-to-binary.js Outdated Show resolved Hide resolved
@ibgreen
Copy link
Collaborator

ibgreen commented Mar 18, 2020

To make sure I'm on the same page; you're thinking something like (properties in normal objects)...

Yes this looks good. For what its worth, I tend to think in terms of returning "columnar tables", so I would personally choose not to have any substructure for properties, but have all binary columns at the top level.

In addition we could opt for glTF style accessor names for positions: POSITION which would probably reduce name conflict risk a bit.

{
  points: {
    // Array of x, y or x, y, z positions
    POSITION: {value: Float32Array, size: coordLength},
    // Array of original feature indexes by vertex
    objectIds: {value: Uint32Array, size: 1},
    // optionally - Object with numeric properties
    numericProperty1: {value: Float32Array, size: 1}

    // optionally - Array of original property objects - though this is accessible through objectIds
    properties: [{...}]
  },
  ...
}

To reduce the risk of conflicts, you could name the generated numeric properties as "properties.<propName>".

Some options:

  • binaryProperties: Include numeric properties as binary arrays

@ibgreen
Copy link
Collaborator

ibgreen commented Mar 18, 2020

@kylebarron loaders.gl 2.1 has released and branched off so we can land this on master as soon as you are ready and iterate in additional PRs.

@kylebarron
Copy link
Collaborator Author

I'm struggling with the test scaffolding. What's the preferred way to load a JSON object into the test? require works for node but not the browser tests. I tried to use fetchFile but it's not finding the test file:

ENOENT: no such file or directory, open '@loaders.gl/gis/test/data/2d_features.json

Code here:
https://github.com/uber-web/loaders.gl/pull/690/files#diff-a03bb68043a3ddc75142220a628f5601R10-R13

@kylebarron
Copy link
Collaborator Author

A couple questions regarding pathIndices:

Deck.gl's docs say

an array data.startIndices that describes the vertex index at the start of each path. For example, if there are 3 paths of 2, 3, and 4 vertices each, startIndices should be [0, 2, 5, 9].

  1. I noticed that currently pathIndices is the start index using the number of coordinates, not the number of vertices. So it would currently be [0, 4, 10].
  2. Currently the length of pathIndices is the same as the number of paths, not n + 1. I.e [0, 2, 5], and not [0, 2, 5, 9]. The last number is the same as the number of vertices; should I include it?

@ibgreen
Copy link
Collaborator

ibgreen commented Mar 19, 2020

I noticed that currently pathIndices is the start index using the number of coordinates, not the number of vertices. So it would currently be [0, 4, 10].

Yes agree that number of vertices is better. Also this means that the array should not change if we switch between 2D and 3D coordinates.

Currently the length of pathIndices is the same as the number of paths, not n + 1. I.e [0, 2, 5], and not [0, 2, 5, 9]. The last number is the same as the number of vertices; should I include it?

Let me say that from the perspective of returning a "columnar table", it is desirable that all columns with fixed-length elements are of the same length (as in number of rows).

If e.g. deck.gl currently needs a different format, I'd make that an option.

@kylebarron
Copy link
Collaborator Author

Thanks, I'll update that. Do you have a suggestion for fixing the file not found error on the geojson test data?

@ibgreen
Copy link
Collaborator

ibgreen commented Mar 19, 2020

Do you have a suggestion for fixing the file not found error on the geojson test data?

Yes sorry, this PR is becoming too big, hard to catch all the questions...

fetchFile relies on an alias system to resolve loaders.gl module names on the local file system. Since this is a new module you need to add it to that file. I believe it is this file: https://github.com/uber-web/loaders.gl/blob/master/test/aliases.js

@kylebarron kylebarron marked this pull request as ready for review March 19, 2020 02:02
@kylebarron
Copy link
Collaborator Author

kylebarron commented Mar 19, 2020

Sorry to drag it out! I did discover that mixed coordinates are currently handled incorrectly, i.e. if only some of the coordinates have 3 dimensions, some values in positions will be off by 1. I'm happy to handle that in the next PR along with properties if you'd prefer.

It looks like tests are failing on Node 10 because I use Array.flat(), which apparently was added in Node 11.

@xintongxia
Copy link

Sorry to drag it out! I did discover that mixed coordinates are currently handled incorrectly, i.e. if only some of the coordinates have 3 dimensions, some values in positions will be off by 1. I'm happy to handle that in the next PR along with properties if you'd prefer.

It looks like tests are failing on Node 10 because I use Array.flat(), which apparently was added in Node 11.

You can add a simple helper function flatten to flat the nested coordinates.

@kylebarron
Copy link
Collaborator Author

Sorry got distracted. Finally green!

@ibgreen ibgreen merged commit 4ccf151 into visgl:master Mar 19, 2020
@kylebarron kylebarron deleted the geojson-to-binary branch March 20, 2020 21:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants