Update documentation about using binary data (#3998)

visgl · Dec 12, 2019 · 80b23c2
1 parent 000cdc7
commit 80b23c2

Showing 5 changed files with 214 additions and 36 deletions.
diff --git a/docs/api-reference/layer.md b/docs/api-reference/layer.md
@@ -62,7 +62,24 @@ deck.gl layers typically expect `data` to be one of the following types:
 - `Promise`: the resolved value will be used as the value of the `data` prop.
 - `AsyncIterable`: an [async iterable](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol/asyncIterator) object that yields data in batches. The default implementation expects each batch to be an array of data objects; one may change this behavior by supplying a custom `dataTransform` callback.
 
-Remarks:
+**data.attributes**
+
+When using a non-iterable `data` object, the object may optionally contain a field `attributes`, if the application wishes to supply binary buffers directly to the layer. This use case is discussed in detail in the [performance developer guide](/docs/developer-guide/performance.md#supply-attributes-directly).
+
+Each key in `data.attributes` is the name of the [accessor](/docs/developer-guide/using-layers.md#accessors) that the binary data replaces, for example `getPosition` or `getColor`. See each layer's documentation for available accessor props.
+
+Each value in `data.attributes` may be one of the following formats:
+
+- luma.gl [Buffer](https://luma.gl/docs/api-reference/webgl/buffer) instance
+- A typed array, which will be used to create a `Buffer`
+- An object containing the following optional fields. For more information, see [WebGL vertex attribute API](https://developer.mozilla.org/en-US/docs/Web/API/WebGLRenderingContext/vertexAttribPointer).
+  + `buffer` ([Buffer](https://luma.gl/docs/api-reference/webgl/buffer))
+  + `value` (TypedArray)
+  + `size` (Number) - the number of elements per vertex attribute.
+  + `offset` (Number) - offset of the first vertex attribute into the buffer, in bytes
+  + `stride` (Number) - the offset between the beginning of consecutive vertex attributes, in bytes
+
+**Remarks**
 
 * Some layers may accept alternative data formats. For example, the [GeoJsonLayer](/docs/layers/geojson-layer.md) supports any valid GeoJSON object as `data`. These exceptions, if any, are documented in each layer's documentation.
 * When an iterable value is passed to `data`, every accessor function receives the current element as the first argument. When a non-iterable value (any object with a `length` field) is passed to `data`, the accessor functions are responsible of interpreting the data format. The latter is often used with binary inputs. Read about this in [accessors](/docs/developer-guide/using-layers.md#accessors).
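
As a rough illustration of the `data.attributes` formats described above, the sketch below checks that a tightly packed attribute holds exactly `size` elements per object before the data is handed to a layer. `checkAttribute` is a hypothetical helper written for this example, not part of deck.gl, and it assumes the `{value, size}` descriptor form with no `offset` or `stride`.

```javascript
// Hypothetical helper (not a deck.gl API): verify that a tightly packed
// attribute descriptor contains exactly `size` elements per object.
function checkAttribute(name, {value, size}, length) {
  if (!ArrayBuffer.isView(value)) {
    throw new TypeError(`${name}: value must be a typed array`);
  }
  if (value.length !== length * size) {
    throw new RangeError(`${name}: expected ${length * size} elements, got ${value.length}`);
  }
  return true;
}

// A non-iterable data object describing two points
const data = {
  length: 2,
  attributes: {
    getPosition: {value: new Float32Array([0, 0, 0, 1, 1, 1]), size: 3},
    getColor: {value: new Uint8Array([255, 0, 0, 0, 255, 0]), size: 3}
  }
};

// Throws if any attribute does not match data.length
for (const [name, attribute] of Object.entries(data.attributes)) {
  checkAttribute(name, attribute, data.length);
}
```

A check like this is cheap compared to attribute generation and can catch a mismatched `length` before it shows up as missing or garbled geometry.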

diff --git a/docs/developer-guide/performance.md b/docs/developer-guide/performance.md
@@ -28,7 +28,7 @@ frequently (e.g. animations), "stutter" can be visible even for layers with just
 
 Some good places to check for performance improvements are:
 
-* **Has `data` really changed?**
+#### Avoid unnecessary shallow changes in data prop
 
   The layer does a shallow comparison between renders to determine if it needs to regenerate buffers. If
   nothing has changed, make sure you supply the *same* data object every time you render. If the data object has to change shallowly for some reason, consider using the `dataComparator` prop to supply a custom comparison logic.
@@ -79,7 +79,7 @@ Some good places to check for performance improvements are:
   }
   ```
 
-* **Is the change internal to each object?**
+#### Use updateTriggers
 
   So `data` has indeed changed. Do we have an entirely new collection of objects? Or did just certain fields changed in each row? Remember that changing `data` will update *all* buffers, so if, for example, object positions have not changed, it will be a waste of time to recalculate them.
 
@@ -141,7 +141,7 @@ Some good places to check for performance improvements are:
   }
   ```
 
-* **Is the data change incremental?**
+#### Handle incremental data loading
 
   A common technique for handling big datasets on the client side is to load data in chunks. We want to update the visualization whenever a new chunk comes in. If we append the new chunk to an existing data array, deck.gl will recalculate the whole buffers, even for the previously loaded chunks where nothing have changed:
 
@@ -216,7 +216,7 @@ Some good places to check for performance improvements are:
 
   See [Layer properties](/docs/api-reference/layer.md#basic-properties) for details.
 
-* **When to remove a layer**
+#### Favor layer visibility over addition and removal
 
   Removing a layer will lose all of its internal states, including generated buffers. If the layer is added back later, all the WebGL resources need to be regenerated again. In the use cases where layers need to be toggled frequently (e.g. via a control panel), there might be a significant perf penalty:
 
@@ -282,7 +282,7 @@ Some good places to check for performance improvements are:
 
 99% of the CPU time that deck.gl spends in updating buffers is calling the accessors you supply to the layer. Since they are called on every data object, any performance issue in the accessors is amplified by the size of your data.
 
-* **Use constants before callback functions**
+#### Favor constants over callback functions
 
   Most accessors accept constant values as well as functions. Constant props are extremely cheap to update in comparison. Use `ScatterplotLayer` as an example, the following two prop settings yield exactly the same visual outcome:
 
@@ -342,7 +342,7 @@ Some good places to check for performance improvements are:
   ```
 
 
-* **Use trivial functions as accessors**
+#### Use trivial functions as accessors
 
   Whenever possible, make the accessors trivial functions and utilize pre-defined and/or pre-computed data.
 
@@ -427,11 +427,15 @@ Some good places to check for performance improvements are:
   }
   ```
 
-### On Using Binary Data
+### Use Binary Data
 
-When creating data-intensive applications, it is often desirable to offload client-side data processing to the server or web workers. To transfer data efficiently between threads, some binary format is likely involved.
+When creating data-intensive applications, it is often desirable to offload client-side data processing to the server or web workers.
 
-* **Supply binary data to the `data` prop**
+The server can send data to the client more efficiently using binary formats, e.g. [protobuf](https://developers.google.com/protocol-buffers), [Arrow](https://arrow.apache.org/) or simply a custom binary blob.
+
+Some deck.gl applications use web workers to load data and generate attributes to get the processing off the main thread. Modern worker implementations allow ownership of typed arrays to be [transferred directly](https://developer.mozilla.org/en-US/docs/Web/API/Worker/postMessage#Parameters) between threads at virtually no cost, bypassing serialization and deserialization of JSON objects.
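
To make the transfer semantics concrete, here is a minimal sketch that uses the standard `structuredClone` function with a `transfer` list, which applies the same structured-clone rules as `postMessage`. It assumes a runtime that provides `structuredClone` (recent browsers, or Node.js 17 and later). No worker is involved; the point is only to show that the buffer is moved rather than copied.

```javascript
const positions = new Float32Array([0, 0, 0, 1, 1, 1]);

// Transfer ownership of the underlying ArrayBuffer instead of copying it.
// postMessage(message, [positions.buffer]) treats the buffer the same way.
const received = structuredClone({positions}, {transfer: [positions.buffer]});

// The receiver sees the full array
console.log(received.positions.length); // 6
// The sender's array is detached and can no longer be used
console.log(positions.length); // 0
```

Because the sender loses access to the data, a worker that still needs the values after posting them has to keep its own copy.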
+
+#### Supply binary blobs to the data prop
 
   Assume we have the data source encoded in the following format:
 
@@ -512,23 +516,39 @@ When creating data-intensive applications, it is often desirable to offload clie
   })
   ```
 
-* **Supplying attributes directly**
+#### Supply attributes directly
+
+  While the built-in attribute generation functionality is a major part of a `Layer`'s functionality, it can become a major performance bottleneck because it runs on the CPU in the main thread. If the application needs to push many data changes frequently, for example to render animations, data updates can block rendering and user interaction. In this case, the application should consider precalculating attributes on the back end or in web workers.
+
+  deck.gl layers accept external attributes as either a typed array or a WebGL buffer. Such attributes, if prepared carefully, can be directly utilized by the GPU, thus bypassing the CPU-bound attribute generation completely.
+
+  This technique offers the maximum performance possible in terms of data throughput, and is commonly used in heavy-duty, performance-sensitive applications.
 
-  While the built-in attribute generation functionality is a major part of a `Layer`s functionality, it is possible for applications to bypass it, and supply the layer with precalculated attributes.
+  To generate an attribute buffer for a layer, take the results returned from each object by the `get*` accessors and flatten them into a typed array. For example, consider the following layers:
 
-  Some deck.gl applications use workers to load data and generate attributes to get the processing off the main thread. Modern worker implementations allow ownership of typed arrays to be transferred between threads which takes care of about half of the biggest performance problem with workers (deserialization of calculated data when transferring it between threads).
+  ```js
+  // Calculate attributes on the main thread
+  new PointCloudLayer({
+    // data format: [{position: [0, 0, 0], color: [255, 0, 0]}, ...]
+    data: POINT_CLOUD_DATA,
+    getPosition: d => d.position,
+    getColor: d => d.color,
+    getNormal: [0, 0, 1]
+  })
+  ```
 
-  To generate attributes for the `PointCloudLayer`:
+  If we move the attribute generation to a web worker, the accessor results are flattened into typed arrays there:
 
   ```js
   // Worker
+  // positions can be sent as either float32 or float64, depending on precision requirements
   // point[0].x, point[0].y, point[0].z, point[1].x, point[1].y, point[1].z, ...
-  const positions = new Float32Array(...);
-  // point[0].r, point[0].g, point[0].b, point[0].a, point[1].r, point[1].g, point[1].b, point[1].a, ...
-  const colors = new Uint8ClampedArray(...);
+  const positions = new Float64Array(POINT_CLOUD_DATA.flatMap(d => d.position));
+  // point[0].r, point[0].g, point[0].b, point[1].r, point[1].g, point[1].b, ...
+  const colors = new Uint8Array(POINT_CLOUD_DATA.flatMap(d => d.color));
 
   // send back to main thread
-  postMessage({pointCount, positions, colors}, [positions.buffer, colors.buffer]);
+  postMessage({pointCount: POINT_CLOUD_DATA.length, positions, colors}, [positions.buffer, colors.buffer]);
   ```
 
   ```js
@@ -539,44 +559,54 @@ When creating data-intensive applications, it is often desirable to offload clie
       // this is required so that the layer knows how many points to draw
       length: data.pointCount,
       attributes: {
-        instancePositions: data.positions,
-        instanceColors: data.colors,
+        getPosition: {value: data.positions, size: 3},
+        getColor: {value: data.colors, size: 3},
       }
     },
     // constant accessor works without raw data
     getNormal: [0, 0, 1]
   });
   ```
 
-  It is also possible to use interleaved or custom layout external buffers by supplying a descriptor instead of typed array to each attribute:
+  Note that instead of `getPosition`, we supply a `data.attributes.getPosition` object. This object defines the buffer from which `PointCloudLayer` should read its position data. See the base `Layer` class's [data prop](/docs/api-reference/layer.md#basic-properties) for details.
+
+  It is also possible to use interleaved or custom layout external buffers:
+
+  ```js
+  // Worker
+  // point[0].x, point[0].y, point[0].z, point[0].r, point[0].g, point[0].b, point[1].x, point[1].y, point[1].z, point[1].r, point[1].g, point[1].b, ...
+  const positionsAndColors = new Float32Array(POINT_CLOUD_DATA.flatMap(d => [
+    d.position[0],
+    d.position[1],
+    d.position[2],
+    // colors must be normalized if sent as floats
+    d.color[0] / 255,
+    d.color[1] / 255,
+    d.color[2] / 255
+  ]));
+
+  // send back to main thread
+  postMessage({pointCount: POINT_CLOUD_DATA.length, positionsAndColors}, [positionsAndColors.buffer]);
+  ```
 
   ```js
+  import {Buffer} from '@luma.gl/core';
+  const buffer = new Buffer(gl, {data: data.positionsAndColors});
+
   new PointCloudLayer({
     data: {
       length : data.pointCount,
       attributes: {
-        instancePositions: data.positions,
-        // point[0].r, point[0].g, point[0].b, point[1].r, point[1].g, point[1].b, ...
-        instanceColors: {value: data.colors, size: 3, stride: 3},
+        getPosition: {buffer, size: 3, offset: 0, stride: 24},
+        getColor: {buffer, size: 3, offset: 12, stride: 24},
       }
     },
-    // tell the layer that `instanceColors` does not contain alpha channel
-    colorFormat: 'RGB',
     // constant accessor works without raw data
     getNormal: [0, 0, 1]
   });
   ```
 
-  Each value in `data.attributes` may be one of the following formats:
-
-  - luma.gl `Buffer` instance
-  - A typed array
-  - An object containing the following optional fields. For more information, see [WebGL vertex attribute API](https://developer.mozilla.org/en-US/docs/Web/API/WebGLRenderingContext/vertexAttribPointer).
-    + `buffer` (Buffer)
-    + `value` (TypedArray)
-    + `size` (Number) - the number of elements per vertex attribute.
-    + `offset` (Number) - offset of the first vertex attribute into the buffer, in bytes
-    + `stride` (Number) - the offset between the beginning of consecutive vertex attributes, in bytes
+  Note that external attributes only work with primitive layers, not composite layers, because composite layers often need to preprocess the data before passing it to their sublayers. Some layers that deal with variable-width data, such as `PathLayer` and `SolidPolygonLayer`, require additional information to be passed along with `data.attributes`. Consult each layer's documentation before use.
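
The `offset` and `stride` values in the interleaved example above are byte counts derived from the element type and the number of components per attribute. The following sketch shows one way to compute them; `interleavedLayout` is a hypothetical helper for illustration, not part of deck.gl or luma.gl, and it assumes all attributes share one typed array with no padding between records.

```javascript
// Hypothetical helper (not a deck.gl or luma.gl API): derive byte offsets and
// stride for attributes interleaved in a single typed array.
// `fields` maps each accessor name to its number of components, in record order.
function interleavedLayout(TypedArray, fields) {
  const bytes = TypedArray.BYTES_PER_ELEMENT;
  const layout = {};
  let offset = 0;
  for (const [name, size] of Object.entries(fields)) {
    layout[name] = {size, offset};
    offset += size * bytes;
  }
  // Every attribute shares the same stride: the byte length of one full record
  for (const name of Object.keys(layout)) {
    layout[name].stride = offset;
  }
  return layout;
}

const layout = interleavedLayout(Float32Array, {getPosition: 3, getColor: 3});
// layout.getPosition -> {size: 3, offset: 0, stride: 24}
// layout.getColor    -> {size: 3, offset: 12, stride: 24}
```

With a `Float32Array`, each record of three position and three color components occupies 6 × 4 = 24 bytes, which matches the numbers used in the example.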
 
 
 ## Layer Rendering Performance

diff --git a/docs/layers/icon-layer.md b/docs/layers/icon-layer.md
@@ -274,6 +274,13 @@ The rotating angle  of each object, in degrees.
 - If a function is provided, it is called on each object to retrieve its angle.
 
 
+## Use binary attributes
+
+This section is about the special requirements when [supplying attributes directly](/docs/developer-guide/performance.md#supply-attributes-directly) to an `IconLayer`.
+
+Because the value of `data.attributes.getIcon` can only be a typed array, `iconMapping` must use integers as keys when this attribute is supplied.
+
+
 ## Source
 
 [modules/layers/src/icon-layer](https://github.com/uber/deck.gl/tree/master/modules/layers/src/icon-layer)
diff --git a/docs/layers/path-layer.md b/docs/layers/path-layer.md
@@ -175,6 +175,67 @@ The width of each path, in units specified by `widthUnits` (default meters).
 * If a number is provided, it is used as the width for all paths.
 * If a function is provided, it is called on each path to retrieve its width.
 
+
+## Use binary attributes
+
+This section is about the special requirements when [supplying attributes directly](/docs/developer-guide/performance.md#supply-attributes-directly) to a `PathLayer`.
+
+Because each path has a different number of vertices, when `data.attributes.getPath` is supplied, the layer also requires an array `data.startIndices` that describes the vertex index at the start of each path. For example, if there are 3 paths of 2, 3, and 4 vertices each, `startIndices` should be `[0, 2, 5, 9]`.
+
+Additionally, all other attributes (`getColor`, `getWidth`, etc.), if supplied, must contain the same layout (number of vertices) as the `getPath` buffer.
+
+To truly realize the performance gain from using binary data, the app likely wants to skip all data processing in this layer. Set the `_pathType` prop (for example to `'open'`) to skip normalization.
+
+Example use case:
+
+```js
+// USE PLAIN JSON OBJECTS
+const PATH_DATA = [
+  {
+    path: [[-122.4, 37.7], [-122.5, 37.8], [-122.6, 37.85]],
+    name: 'Richmond - Millbrae',
+    color: [255, 0, 0]
+  },
+  ...
+];
+
+new PathLayer({
+  data: PATH_DATA,
+  getPath: d => d.path,
+  getColor: d => d.color
+})
+```
+
+Convert to using binary attributes:
+
+```js
+// USE BINARY
+// Flatten the path vertices
+// [-122.4, 37.7, -122.5, 37.8, -122.6, 37.85, ...]
+const positions = new Float64Array(PATH_DATA.map(d => d.path).flat(2));
+// The color attribute must supply one color for each vertex
+// [255, 0, 0, 255, 0, 0, 255, 0, 0, ...]
+const colors = new Uint8Array(PATH_DATA.map(d => d.path.map(_ => d.color)).flat(2));
+// The "layout" that tells PathLayer where each path starts
+const startIndices = new Uint16Array(PATH_DATA.reduce((acc, d) => {
+  const lastIndex = acc[acc.length - 1];
+  acc.push(lastIndex + d.path.length);
+  return acc;
+}, [0]));
+
+new PathLayer({
+  data: {
+    length: PATH_DATA.length,
+    startIndices: startIndices, // this is required to render the paths correctly!
+    attributes: {
+      getPath: {value: positions, size: 2},
+      getColor: {value: colors, size: 3}
+    }
+  },
+  _pathType: 'open' // this instructs the layer to skip normalization and use the binary as-is
+})
+```
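
The `startIndices` rule described in this section can be checked in isolation. The sketch below reproduces the example of three paths with 2, 3, and 4 vertices; `getStartIndices` is a hypothetical helper, not a deck.gl API. The choice between `Uint16Array` and `Uint32Array` is an assumption added here, since a `Uint16Array` cannot represent vertex indices above 65535.

```javascript
// Hypothetical helper (not a deck.gl API): compute the vertex index at which
// each path starts, followed by the total vertex count.
function getStartIndices(paths) {
  const startIndices = [0];
  for (const path of paths) {
    startIndices.push(startIndices[startIndices.length - 1] + path.length);
  }
  return startIndices;
}

// Three paths of 2, 3, and 4 vertices
const paths = [
  [[0, 0], [1, 1]],
  [[0, 0], [1, 1], [2, 2]],
  [[0, 0], [1, 1], [2, 2], [3, 3]]
];
const startIndices = getStartIndices(paths);
// startIndices -> [0, 2, 5, 9]

// Assumption: pick an index type wide enough for the total vertex count
const totalVertices = startIndices[startIndices.length - 1];
const IndexArray = totalVertices > 65535 ? Uint32Array : Uint16Array;
const typedStartIndices = new IndexArray(startIndices);
```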
+
 ## Source
 
 [modules/layers/src/path-layer](https://github.com/uber/deck.gl/tree/master/modules/layers/src/path-layer)
diff --git a/docs/layers/solid-polygon-layer.md b/docs/layers/solid-polygon-layer.md
@@ -163,6 +163,69 @@ Only applies if `extruded: true`.
 * If a number is provided, it is used as the elevation for all polygons.
 * If a function is provided, it is called on each object to retrieve its elevation.
 
+
+## Use binary attributes
+
+This section is about the special requirements when [supplying attributes directly](/docs/developer-guide/performance.md#supply-attributes-directly) to a `SolidPolygonLayer`.
+
+Because each polygon has a different number of vertices, when `data.attributes.getPolygon` is supplied, the layer also requires an array `data.startIndices` that describes the vertex index at the start of each polygon. For example, if there are 3 polygons of 3, 4, and 5 vertices each (including the end vertex that overlaps with the first vertex to close the loop), `startIndices` should be `[0, 3, 7, 12]`. *Polygons with holes are not supported when using precalculated attributes.*
+
+Additionally, all other attributes (`getFillColor`, `getElevation`, etc.), if supplied, must contain the same layout (number of vertices) as the `getPolygon` buffer.
+
+To truly realize the performance gain from using binary data, the app likely wants to skip all data processing in this layer. Set the `_normalize` prop to `false` to skip normalization.
+
+Example use case:
+
+```js
+// USE PLAIN JSON OBJECTS
+const POLYGON_DATA = [
+  {
+     contour: [[-122.4, 37.7], [-122.4, 37.8], [-122.5, 37.8], [-122.5, 37.7], [-122.4, 37.7]],
+     population: 26599
+  },
+  ...
+];
+
+new SolidPolygonLayer({
+  data: POLYGON_DATA,
+  getPolygon: d => d.contour,
+  getElevation: d => d.population,
+  getFillColor: [0, 100, 60, 160]
+})
+```
+
+Convert to using binary attributes:
+
+```js
+// USE BINARY
+// Flatten the polygon vertices
+// [-122.4, 37.7, -122.4, 37.8, -122.5, 37.8, -122.5, 37.7, -122.4, 37.7, ...]
+const positions = new Float64Array(POLYGON_DATA.map(d => d.contour).flat(2));
+// The elevation attribute must supply one value for each vertex
+// [26599, 26599, 26599, 26599, 26599, ...]
+// Use a float array: a Uint8Array cannot hold values above 255
+const elevations = new Float32Array(POLYGON_DATA.map(d => d.contour.map(_ => d.population)).flat());
+// The "layout" that tells SolidPolygonLayer where each polygon starts
+const startIndices = new Uint16Array(POLYGON_DATA.reduce((acc, d) => {
+  const lastIndex = acc[acc.length - 1];
+  acc.push(lastIndex + d.contour.length);
+  return acc;
+}, [0]));
+
+new SolidPolygonLayer({
+  data: {
+    length: POLYGON_DATA.length,
+    startIndices: startIndices, // this is required to render the polygons correctly!
+    attributes: {
+      getPolygon: {value: positions, size: 2},
+      getElevation: {value: elevations, size: 1}
+    }
+  },
+  _normalize: false, // this instructs the layer to skip normalization and use the binary as-is
+  getFillColor: [0, 100, 60, 160]
+})
+```
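
Since every non-geometry attribute must follow the same per-vertex layout as the `getPolygon` buffer, a per-object value has to be repeated once for each vertex of its polygon. The sketch below shows this expansion; `expandPerVertex` is a hypothetical helper, not a deck.gl API. It writes into a `Float32Array` on the assumption that values such as a population of 26599 are too large for an 8-bit array.

```javascript
// Hypothetical helper (not a deck.gl API): repeat one per-object value
// for each of that object's vertices.
function expandPerVertex(objects, getVertices, getValue) {
  const vertexCount = objects.reduce((sum, d) => sum + getVertices(d).length, 0);
  const result = new Float32Array(vertexCount);
  let i = 0;
  for (const d of objects) {
    const count = getVertices(d).length;
    // Fill the slice belonging to this object with its value
    result.fill(getValue(d), i, i + count);
    i += count;
  }
  return result;
}

// Two closed triangles (4 vertices each, including the closing vertex)
const polygons = [
  {contour: [[0, 0], [0, 1], [1, 0], [0, 0]], population: 26599},
  {contour: [[2, 2], [2, 3], [3, 2], [2, 2]], population: 300}
];
const elevations = expandPerVertex(polygons, d => d.contour, d => d.population);
// elevations -> Float32Array [26599, 26599, 26599, 26599, 300, 300, 300, 300]
```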
+
+
 ## Remarks
 
 * This layer only renders filled polygons. If you need to render polygon