Commit

Update documentation about using binary data (#3998)
Pessimistress committed Dec 12, 2019
1 parent 000cdc7 commit 80b23c2
Showing 5 changed files with 214 additions and 36 deletions.
19 changes: 18 additions & 1 deletion docs/api-reference/layer.md
Expand Up @@ -62,7 +62,24 @@ deck.gl layers typically expect `data` to be one of the following types:
- `Promise`: the resolved value will be used as the value of the `data` prop.
- `AsyncIterable`: an [async iterable](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol/asyncIterator) object that yields data in batches. The default implementation expects each batch to be an array of data objects; one may change this behavior by supplying a custom `dataTransform` callback.

Remarks:
**data.attributes**

When using a non-iterable `data` object, the object may optionally contain a field `attributes`, if the application wishes to supply binary buffers directly to the layer. This use case is discussed in detail in the [performance developer guide](/docs/developer-guide/performance.md#supply-attributes-directly).

Each key in `data.attributes` is the name of the [accessor](/docs/developer-guide/using-layers.md#accessors) that the binary data replaces, for example `getPosition` or `getColor`. See each layer's documentation for available accessor props.

Each value in `data.attributes` may be one of the following formats:

- luma.gl [Buffer](https://luma.gl/docs/api-reference/webgl/buffer) instance
- A typed array, which will be used to create a `Buffer`
- An object containing the following optional fields. For more information, see [WebGL vertex attribute API](https://developer.mozilla.org/en-US/docs/Web/API/WebGLRenderingContext/vertexAttribPointer).
+ `buffer` ([Buffer](https://luma.gl/docs/api-reference/webgl/buffer))
+ `value` (TypedArray)
+ `size` (Number) - the number of elements per vertex attribute.
+ `offset` (Number) - offset of the first vertex attribute into the buffer, in bytes
+ `stride` (Number) - the offset between the beginning of consecutive vertex attributes, in bytes

**Remarks**

* Some layers may accept alternative data formats. For example, the [GeoJsonLayer](/docs/layers/geojson-layer.md) supports any valid GeoJSON object as `data`. These exceptions, if any, are documented in each layer's documentation.
* When an iterable value is passed to `data`, every accessor function receives the current element as the first argument. When a non-iterable value (any object with a `length` field) is passed to `data`, the accessor functions are responsible for interpreting the data format. The latter is often used with binary inputs. Read about this in [accessors](/docs/developer-guide/using-layers.md#accessors).
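
As a rough illustration of the non-iterable case, the sketch below mimics how an accessor can read from a flat buffer by row index. The `binaryData` object, its `src` field and the `readPositions` helper are hypothetical names introduced for this example. The `(object, {index, data})` accessor arguments follow the pattern described in the accessors guide, and the loop only stands in for what a layer does internally; it is not deck.gl's actual code.

```js
// Hypothetical non-iterable data: a flat buffer plus a `length` field.
// Layout assumption: [x0, y0, x1, y1, ...]
const binaryData = {
  src: new Float32Array([-122.4, 37.7, -122.5, 37.8]),
  length: 2
};

// With non-iterable data there is no per-row object, so the accessor
// interprets the buffer itself using the row index.
const getPosition = (object, {index, data}) => [
  data.src[index * 2],
  data.src[index * 2 + 1]
];

// Stand-in for the layer's internal loop (illustration only).
function readPositions(data, accessor) {
  const result = [];
  for (let index = 0; index < data.length; index++) {
    result.push(accessor(null, {index, data}));
  }
  return result;
}

const positions = readPositions(binaryData, getPosition);
```

The accessor never touches its first argument here, which is why a plain object with a `length` field is enough to drive it.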
100 changes: 65 additions & 35 deletions docs/developer-guide/performance.md
Expand Up @@ -28,7 +28,7 @@ frequently (e.g. animations), "stutter" can be visible even for layers with just

Some good places to check for performance improvements are:

* **Has `data` really changed?**
#### Avoid unnecessary shallow change in data prop

The layer does a shallow comparison between renders to determine if it needs to regenerate buffers. If
nothing has changed, make sure you supply the *same* data object every time you render. If the data object has to change shallowly for some reason, consider using the `dataComparator` prop to supply a custom comparison logic.
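
The following sketch shows why a shallow comparison can report a change when the content is identical, and what a custom comparison might look like. The `rows` data and the `sameElements` helper are hypothetical. It assumes the comparator receives the new and old `data` values and returns `true` when they should be treated as equal; check the `dataComparator` documentation before relying on that.

```js
// Hypothetical row objects; `id` and `value` are illustrative fields.
const rows = [{id: 1, value: 10}, {id: 2, value: 20}];

// Filtering inside every render produces a new array each time,
// so a shallow (identity) comparison always reports a change.
const renderA = rows.filter(d => d.value > 0);
const renderB = rows.filter(d => d.value > 0);
const shallowEqual = renderA === renderB; // false

// A custom comparator returns true when the two values should be
// treated as equal. This one checks length and element identity.
function sameElements(newData, oldData) {
  if (newData === oldData) return true;
  if (newData.length !== oldData.length) return false;
  return newData.every((d, i) => d === oldData[i]);
}

const treatedAsEqual = sameElements(renderA, renderB); // true
```

A comparator like this walks the whole array, so it only pays off when that walk is cheaper than regenerating the buffers. Keeping a stable reference to the same array is usually the simpler option.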
Expand Down Expand Up @@ -79,7 +79,7 @@ Some good places to check for performance improvements are:
}
```

* **Is the change internal to each object?**
#### Use updateTriggers

So `data` has indeed changed. Do we have an entirely new collection of objects? Or did just certain fields change in each row? Remember that changing `data` will update *all* buffers, so if, for example, object positions have not changed, it will be a waste of time to recalculate them.

Expand Down Expand Up @@ -141,7 +141,7 @@ Some good places to check for performance improvements are:
}
```

* **Is the data change incremental?**
#### Handle incremental data loading

A common technique for handling big datasets on the client side is to load data in chunks. We want to update the visualization whenever a new chunk comes in. If we append the new chunk to an existing data array, deck.gl will recalculate the entire buffers, even for the previously loaded chunks where nothing has changed:

Expand Down Expand Up @@ -216,7 +216,7 @@ Some good places to check for performance improvements are:

See [Layer properties](/docs/api-reference/layer.md#basic-properties) for details.

* **When to remove a layer**
#### Favor layer visibility over addition and removal

Removing a layer discards all of its internal state, including generated buffers. If the layer is added back later, all the WebGL resources need to be regenerated. In use cases where layers need to be toggled frequently (e.g. via a control panel), there might be a significant performance penalty:

Expand Down Expand Up @@ -282,7 +282,7 @@ Some good places to check for performance improvements are:

99% of the CPU time that deck.gl spends in updating buffers is calling the accessors you supply to the layer. Since they are called on every data object, any performance issue in the accessors is amplified by the size of your data.

* **Use constants before callback functions**
#### Favor constants over callback functions

Most accessors accept constant values as well as functions. Constant props are extremely cheap to update in comparison. Using `ScatterplotLayer` as an example, the following two prop settings yield exactly the same visual outcome:

Expand Down Expand Up @@ -342,7 +342,7 @@ Some good places to check for performance improvements are:
```


* **Use trivial functions as accessors**
#### Use trivial functions as accessors

Whenever possible, make the accessors trivial functions and utilize pre-defined and/or pre-computed data.

Expand Down Expand Up @@ -427,11 +427,15 @@ Some good places to check for performance improvements are:
}
```

### On Using Binary Data
### Use Binary Data

When creating data-intensive applications, it is often desirable to offload client-side data processing to the server or web workers. To transfer data efficiently between threads, some binary format is likely involved.
When creating data-intensive applications, it is often desirable to offload client-side data processing to the server or web workers.

* **Supply binary data to the `data` prop**
The server can send data to the client more efficiently using binary formats, e.g. [protobuf](https://developers.google.com/protocol-buffers), [Arrow](https://arrow.apache.org/) or simply a custom binary blob.

Some deck.gl applications use web workers to load data and generate attributes to get the processing off the main thread. Modern worker implementations allow ownership of typed arrays to be [transferred directly](https://developer.mozilla.org/en-US/docs/Web/API/Worker/postMessage#Parameters) between threads at virtually no cost, bypassing serialization and deserialization of JSON objects.
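
To make the ownership transfer concrete, here is a minimal sketch that uses the global `structuredClone` function (available in Node.js 17+ and modern browsers) in place of a real worker. Its `transfer` option follows the same transfer-list semantics as `postMessage`; the variable names are illustrative only.

```js
// Requires a runtime with a global `structuredClone`.
const positions = new Float32Array([0, 0, 0, 1, 1, 1]);
const byteLengthBefore = positions.buffer.byteLength; // 24

// Listing the buffer in `transfer` moves ownership instead of copying,
// the same mechanism `postMessage` uses for its transfer list.
const received = structuredClone({positions}, {transfer: [positions.buffer]});

// The sender's buffer is now detached and empty...
const byteLengthAfter = positions.buffer.byteLength; // 0
// ...while the receiver holds the original bytes.
const receivedLength = received.positions.length; // 6
```

Because the sending side loses access to the buffer, anything it still needs from the array has to be read before the transfer.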

#### Supply binary blobs to the data prop

Assume we have the data source encoded in the following format:

Expand Down Expand Up @@ -512,23 +516,39 @@ When creating data-intensive applications, it is often desirable to offload clie
})
```

* **Supplying attributes directly**
#### Supply attributes directly

While the built-in attribute generation functionality is a major part of a `Layer`'s functionality, it can become a major bottleneck in performance since it is done on CPU in the main thread. If the application needs to push many data changes frequently, for example to render animations, data updates can block rendering and user interaction. In this case, the application should consider precalculating attributes on the back end or in web workers.

Deck.gl layers accept external attributes as either a typed array or a WebGL buffer. Such attributes, if prepared carefully, can be directly utilized by the GPU, thus bypassing the CPU-bound attribute generation completely.

This technique offers the maximum performance possible in terms of data throughput, and is commonly used in heavy-duty, performance-sensitive applications.

While the built-in attribute generation functionality is a major part of a `Layer`s functionality, it is possible for applications to bypass it, and supply the layer with precalculated attributes.
To generate an attribute buffer for a layer, take the results returned from each object by the `get*` accessors and flatten them into a typed array. For example, consider the following layers:

Some deck.gl applications use workers to load data and generate attributes to get the processing off the main thread. Modern worker implementations allow ownership of typed arrays to be transferred between threads which takes care of about half of the biggest performance problem with workers (deserialization of calculated data when transferring it between threads).
```js
// Calculate attributes on the main thread
new PointCloudLayer({
// data format: [{position: [0, 0, 0], color: [255, 0, 0]}, ...]
data: POINT_CLOUD_DATA,
getPosition: d => d.position,
getColor: d => d.color,
getNormal: [0, 0, 1]
})
```

To generate attributes for the `PointCloudLayer`:
If we move the attribute generation to a web worker:

```js
// Worker
// positions can be sent as either float32 or float64, depending on precision requirements
// point[0].x, point[0].y, point[0].z, point[1].x, point[1].y, point[1].z, ...
const positions = new Float32Array(...);
// point[0].r, point[0].g, point[0].b, point[0].a, point[1].r, point[1].g, point[1].b, point[1].a, ...
const colors = new Uint8ClampedArray(...);
const positions = new Float64Array(POINT_CLOUD_DATA.flatMap(d => d.position));
// point[0].r, point[0].g, point[0].b, point[1].r, point[1].g, point[1].b, ...
const colors = new Uint8Array(POINT_CLOUD_DATA.flatMap(d => d.color));

// send back to main thread
postMessage({pointCount, positions, colors}, [positions.buffer, colors.buffer]);
postMessage({pointCount: POINT_CLOUD_DATA.length, positions, colors}, [positions.buffer, colors.buffer]);
```

```js
Expand All @@ -539,44 +559,54 @@ When creating data-intensive applications, it is often desirable to offload clie
// this is required so that the layer knows how many points to draw
length: data.pointCount,
attributes: {
instancePositions: data.positions,
instanceColors: data.colors,
getPosition: {value: data.positions, size: 3},
getColor: {value: data.colors, size: 3},
}
},
// constant accessor works without raw data
getNormal: [0, 0, 1]
});
```
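
For larger datasets, the flattening step can also be written with preallocated typed arrays. The sketch below is one possible variant, not part of the documented API: `points` is made-up sample data in the same shape as `POINT_CLOUD_DATA`, and `buildAttributes` is a hypothetical helper.

```js
// Sample input in the same shape as POINT_CLOUD_DATA (values are made up).
const points = [
  {position: [0, 0, 0], color: [255, 0, 0]},
  {position: [1, 2, 3], color: [0, 128, 255]}
];

// Preallocate the typed arrays and write into them directly, which
// avoids the intermediate plain array that `flatMap` creates.
function buildAttributes(data) {
  const positions = new Float64Array(data.length * 3);
  const colors = new Uint8Array(data.length * 3);
  data.forEach((d, i) => {
    positions.set(d.position, i * 3);
    colors.set(d.color, i * 3);
  });
  return {pointCount: data.length, positions, colors};
}

const attributes = buildAttributes(points);
```

The returned object has the same fields as the worker message in the example, so it could be posted with `positions.buffer` and `colors.buffer` in the transfer list.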

It is also possible to use interleaved or custom layout external buffers by supplying a descriptor instead of typed array to each attribute:
Note that instead of `getPosition`, we supply a `data.attributes.getPosition` object. This object defines the buffer from which `PointCloudLayer` should access its positions data. See the base `Layer` class' [data prop](/docs/api-reference/layer.md#basic-properties) for details.

It is also possible to use interleaved or custom layout external buffers:

```js
// Worker
// point[0].x, point[0].y, point[0].z, point[0].r, point[0].g, point[0].b, point[1].x, point[1].y, point[1].z, point[1].r, point[1].g, point[1].b, ...
const positionsAndColors = new Float32Array(POINT_CLOUD_DATA.flatMap(d => [
d.position[0],
d.position[1],
d.position[2],
// colors must be normalized if sent as floats
d.color[0] / 255,
d.color[1] / 255,
d.color[2] / 255
]));

// send back to main thread
postMessage({pointCount: POINT_CLOUD_DATA.length, positionsAndColors}, [positionsAndColors.buffer]);
```

```js
import {Buffer} from '@luma.gl/core';
const buffer = new Buffer(gl, {data: data.positionsAndColors});

new PointCloudLayer({
data: {
length : data.pointCount,
attributes: {
instancePositions: data.positions,
// point[0].r, point[0].g, point[0].b, point[1].r, point[1].g, point[1].b, ...
instanceColors: {value: data.colors, size: 3, stride: 3},
getPosition: {buffer, size: 3, offset: 0, stride: 24},
getColor: {buffer, size: 3, offset: 12, stride: 24},
}
},
// tell the layer that `instanceColors` does not contain alpha channel
colorFormat: 'RGB',
// constant accessor works without raw data
getNormal: [0, 0, 1]
});
```
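
The `offset` and `stride` values in the interleaved example follow from the byte size of each field. As a sketch, a hypothetical helper such as `interleavedLayout` (not a deck.gl or luma.gl function) can derive them for fields packed back to back in a single typed array:

```js
// Compute byte offsets and stride for fields packed back to back in one typed array.
// `fields` is a hypothetical list of {name, size}; size is the number of elements per vertex.
function interleavedLayout(fields, bytesPerElement) {
  let offset = 0;
  const layout = {};
  for (const {name, size} of fields) {
    layout[name] = {size, offset};
    offset += size * bytesPerElement;
  }
  // Every field shares the same stride: the byte size of one full vertex.
  for (const name of Object.keys(layout)) {
    layout[name].stride = offset;
  }
  return layout;
}

const layout = interleavedLayout(
  [{name: 'getPosition', size: 3}, {name: 'getColor', size: 3}],
  Float32Array.BYTES_PER_ELEMENT
);
// layout.getPosition → {size: 3, offset: 0, stride: 24}
// layout.getColor → {size: 3, offset: 12, stride: 24}
```

With six 4-byte floats per point, each vertex occupies 24 bytes and the color starts 12 bytes in, matching the descriptors passed to `PointCloudLayer`.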

Each value in `data.attributes` may be one of the following formats:

- luma.gl `Buffer` instance
- A typed array
- An object containing the following optional fields. For more information, see [WebGL vertex attribute API](https://developer.mozilla.org/en-US/docs/Web/API/WebGLRenderingContext/vertexAttribPointer).
+ `buffer` (Buffer)
+ `value` (TypedArray)
+ `size` (Number) - the number of elements per vertex attribute.
+ `offset` (Number) - offset of the first vertex attribute into the buffer, in bytes
+ `stride` (Number) - the offset between the beginning of consecutive vertex attributes, in bytes
Note that external attributes only work with primitive layers, not composite layers, because composite layers often need to preprocess the data before passing it to the sub layers. Some layers that deal with variable-width data, such as `PathLayer` and `SolidPolygonLayer`, require additional information passed along with `data.attributes`. Consult each layer's documentation before use.


## Layer Rendering Performance
7 changes: 7 additions & 0 deletions docs/layers/icon-layer.md
Expand Up @@ -274,6 +274,13 @@ The rotating angle of each object, in degrees.
- If a function is provided, it is called on each object to retrieve its angle.


## Use binary attributes

This section is about the special requirements when [supplying attributes directly](/docs/developer-guide/performance.md#supply-attributes-directly) to an `IconLayer`.

If `data.attributes.getIcon` is supplied, since its value can only be a typed array, `iconMapping` can only use integers as keys.
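
As an illustration, string icon names can be converted to integer ids before building the typed array. This is a sketch under assumptions: `ICON_DATA`, the icon names and the atlas coordinates are made up, and the choice of `Uint8Array` is not confirmed by the layer documentation.

```js
// Hypothetical data: each object names its icon with a string.
const ICON_DATA = [{icon: 'marker'}, {icon: 'warning'}, {icon: 'marker'}];

// Assign each distinct icon name an integer id, in order of first appearance.
const iconIds = new Map();
for (const d of ICON_DATA) {
  if (!iconIds.has(d.icon)) iconIds.set(d.icon, iconIds.size);
}

// One integer per object, ready to be placed in a typed array.
// Uint8Array limits this sketch to 256 distinct icons.
const icons = new Uint8Array(ICON_DATA.map(d => iconIds.get(d.icon)));

// The mapping is keyed by the same integers; the atlas coordinates are made up.
const iconMapping = {
  0: {x: 0, y: 0, width: 64, height: 64},
  1: {x: 64, y: 0, width: 64, height: 64}
};
```

The `icons` array would then be supplied as `data.attributes.getIcon`, with `iconMapping` passed to the layer as usual.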


## Source

[modules/layers/src/icon-layer](https://github.com/uber/deck.gl/tree/master/modules/layers/src/icon-layer)
61 changes: 61 additions & 0 deletions docs/layers/path-layer.md
Expand Up @@ -175,6 +175,67 @@ The width of each path, in units specified by `widthUnits` (default meters).
* If a number is provided, it is used as the width for all paths.
* If a function is provided, it is called on each path to retrieve its width.


## Use binary attributes

This section is about the special requirements when [supplying attributes directly](/docs/developer-guide/performance.md#supply-attributes-directly) to a `PathLayer`.

Because each path has a different number of vertices, when `data.attributes.getPath` is supplied, the layer also requires an array `data.startIndices` that describes the vertex index at the start of each path. For example, if there are 3 paths of 2, 3, and 4 vertices each, `startIndices` should be `[0, 2, 5, 9]`.

Additionally, all other attributes (`getColor`, `getWidth`, etc.), if supplied, must contain the same layout (number of vertices) as the `getPath` buffer.

To truly realize the performance gain from using binary data, the app likely wants to skip all data processing in this layer. Set the `_pathType` prop (for example `'open'`, as in the example below) to skip normalization.

Example use case:

```js
// USE PLAIN JSON OBJECTS
const PATH_DATA = [
{
path: [[-122.4, 37.7], [-122.5, 37.8], [-122.6, 37.85]],
name: 'Richmond - Millbrae',
color: [255, 0, 0]
},
...
];

new PathLayer({
data: PATH_DATA,
getPath: d => d.path,
getColor: d => d.color
})
```

Convert to using binary attributes:

```js
// USE BINARY
// Flatten the path vertices
// [-122.4, 37.7, -122.5, 37.8, -122.6, 37.85, ...]
const positions = new Float64Array(PATH_DATA.map(d => d.path).flat(2));
// The color attribute must supply one color for each vertex
// [255, 0, 0, 255, 0, 0, 255, 0, 0, ...]
const colors = new Uint8Array(PATH_DATA.map(d => d.path.map(_ => d.color)).flat(2));
// The "layout" that tells PathLayer where each path starts
const startIndices = new Uint16Array(PATH_DATA.reduce((acc, d) => {
const lastIndex = acc[acc.length - 1];
acc.push(lastIndex + d.path.length);
return acc;
}, [0]));

new PathLayer({
data: {
length: PATH_DATA.length,
startIndices: startIndices, // this is required to render the paths correctly!
attributes: {
getPath: {value: positions, size: 2},
getColor: {value: colors, size: 3}
}
},
_pathType: 'open' // this instructs the layer to skip normalization and use the binary as-is
})
```
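
The `startIndices` layout can also be computed directly from per-path vertex counts. The sketch below uses a hypothetical `computeStartIndices` helper and a `Uint32Array`, chosen here so that the indices do not wrap around once the total vertex count exceeds 65535.

```js
// Build the start index of each path, plus a final entry for the total vertex count.
function computeStartIndices(vertexCounts) {
  const startIndices = new Uint32Array(vertexCounts.length + 1);
  for (let i = 0; i < vertexCounts.length; i++) {
    startIndices[i + 1] = startIndices[i] + vertexCounts[i];
  }
  return startIndices;
}

// Three paths of 2, 3 and 4 vertices, as in the example described above.
const startIndices = computeStartIndices([2, 3, 4]);
// Array.from(startIndices) → [0, 2, 5, 9]
```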

## Source

[modules/layers/src/path-layer](https://github.com/uber/deck.gl/tree/master/modules/layers/src/path-layer)
63 changes: 63 additions & 0 deletions docs/layers/solid-polygon-layer.md
Expand Up @@ -163,6 +163,69 @@ Only applies if `extruded: true`.
* If a number is provided, it is used as the elevation for all polygons.
* If a function is provided, it is called on each object to retrieve its elevation.


## Use binary attributes

This section is about the special requirements when [supplying attributes directly](/docs/developer-guide/performance.md#supply-attributes-directly) to a `SolidPolygonLayer`.

Because each polygon has a different number of vertices, when `data.attributes.getPolygon` is supplied, the layer also requires an array `data.startIndices` that describes the vertex index at the start of each polygon. For example, if there are 3 polygons of 3, 4, and 5 vertices each (including the end vertex that overlaps with the first vertex to close the loop), `startIndices` should be `[0, 3, 7, 12]`. *Polygons with holes are not supported when using precalculated attributes.*

Additionally, all other attributes (`getFillColor`, `getElevation`, etc.), if supplied, must contain the same layout (number of vertices) as the `getPolygon` buffer.

To truly realize the performance gain from using binary data, the app likely wants to skip all data processing in this layer. Set the `_normalize` prop to `false` to skip normalization.

Example use case:

```js
// USE PLAIN JSON OBJECTS
const POLYGON_DATA = [
{
contour: [[-122.4, 37.7], [-122.4, 37.8], [-122.5, 37.8], [-122.5, 37.7], [-122.4, 37.7]],
population: 26599
},
...
];

new SolidPolygonLayer({
data: POLYGON_DATA,
getPolygon: d => d.contour,
getElevation: d => d.population,
getFillColor: [0, 100, 60, 160]
})
```

Convert to using binary attributes:

```js
// USE BINARY
// Flatten the polygon vertices
// [-122.4, 37.7, -122.4, 37.8, -122.5, 37.8, -122.5, 37.7, -122.4, 37.7, ...]
const positions = new Float64Array(POLYGON_DATA.map(d => d.contour).flat(2));
// The elevation attribute must supply one value for each vertex
// [26599, 26599, 26599, 26599, 26599, ...]
// Float32Array is used because population values exceed the 0-255 range of Uint8Array
const elevations = new Float32Array(POLYGON_DATA.map(d => d.contour.map(_ => d.population)).flat());
// The "layout" that tells SolidPolygonLayer where each polygon starts
const startIndices = new Uint16Array(POLYGON_DATA.reduce((acc, d) => {
const lastIndex = acc[acc.length - 1];
acc.push(lastIndex + d.contour.length);
return acc;
}, [0]));

new SolidPolygonLayer({
data: {
length: POLYGON_DATA.length,
startIndices: startIndices, // this is required to render the polygons correctly!
attributes: {
getPolygon: {value: positions, size: 2},
getElevation: {value: elevations, size: 1}
}
},
_normalize: false, // this instructs the layer to skip normalization and use the binary as-is
getFillColor: [0, 100, 60, 160]
})
```
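
Because every per-object value has to be repeated for each vertex of its polygon, a small helper can do the expansion without building nested arrays. This is a sketch with made-up input; `expandPerVertex` is a hypothetical name, and `Float32Array` is chosen so that values above 255 are stored without wrapping.

```js
// Hypothetical input: two polygons with 5 and 4 vertices (closing vertex included).
const polygons = [
  {contour: [[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]], population: 26599},
  {contour: [[2, 2], [2, 3], [3, 2], [2, 2]], population: 120}
];

// Repeat each object's value once per vertex of that object.
function expandPerVertex(data, getVertices, getValue) {
  const vertexCount = data.reduce((n, d) => n + getVertices(d).length, 0);
  const out = new Float32Array(vertexCount);
  let i = 0;
  for (const d of data) {
    const count = getVertices(d).length;
    out.fill(getValue(d), i, i + count);
    i += count;
  }
  return out;
}

const elevations = expandPerVertex(polygons, d => d.contour, d => d.population);
```

The resulting array has one entry per vertex, so it lines up with the flattened `getPolygon` buffer and the `startIndices` layout.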


## Remarks

* This layer only renders filled polygons. If you need to render polygon