The correct way to hide scatterplot items and practical count limits? #3068

bclinkinbeard · 2019-05-06T21:18:26Z

I've got a Scatterplot that hides items in response to a time scrubber, which I've implemented by returning 0 from getRadius for items outside the active time range. getRadius is specified in updateTriggers as well, but I'm wondering if this is the most performant way to do this.

Would setting the alpha to 0 or filtering the items out of the data ahead of time be a better approach? I'm also curious if smoothly scrubbing through 1M+ points is a realistic goal or not. I've rendered up to about 7M points but things get pretty bogged down.

The text was updated successfully, but these errors were encountered:

ibgreen · 2019-05-06T22:00:55Z

@bclinkinbeard When you change the radius of points (and fire the updateTrigger) you force a JavaScript recalculation of the radius instance array which will be 7 million items when you have 7 million points.

To get really good performance with millions of points you need to do computations on the GPU (i.e. in the shaders). In this case, this probably requires writing a custom subclass and making some addition to the existing scatterplot layer shader code.

If you think you are up to that it we could give you a few ideas for how to start.

bclinkinbeard · 2019-05-06T23:40:12Z

Hey Ib, that would be great. I'm definitely willing to give it a shot so any pointers are much appreciated. Thanks!

ibgreen · 2019-05-07T00:35:40Z

So the GPU first runs a vertex shader on every vertex in the geometry and then a fragment shader on every rendered pixel.

One trick is to use the discard keyword in the fragment shader to prevent rendering of pixels that belong to currently invisible objects.
In your case that is determined by checking if the time of the current data object is outside of the range.
You would set up two uniforms to represent the start and end time of the interval.
And generate a new attribute containing the time stamps of all the data objects.

To understand where to start, do any of these statements make sense?

Look through existing shaders and see if you can see similar patterns.

bclinkinbeard · 2019-05-07T13:02:49Z

Hey Ib, that does make sense. I actually have my timestamps in a TypedArray already, in two different ways. I have a "vanilla" overlay and an Apache Arrow overlay, so I have both row and column encodings.

I've started looking at the ScatterplotLayer source, can I just subclass that and provide my own shaders?

ibgreen · 2019-05-07T14:58:22Z

I actually have my timestamps in a TypedArray already, in two different ways. I have a "vanilla" overlay and an Apache Arrow overlay, so I have both row and column encodings.

Not quite sure what you mean with "overlay", but yes you want contiguous columnar data, i.e. just one timestamp per object. For Arrow I expect you'd just do table.getColumn('timestamps').toArray().

I've started looking at the ScatterplotLayer source, can I just subclass that and provide my own shaders?

Yes, there is a getShaders method you can supply to override the shaders.

Make sure you have read this page https://github.com/uber/deck.gl/blob/master/docs/developer-guide/custom-layers/subclassed-layers.md

The TripsLayer source code likely also has some timestamp handling that might serve as inspiration.

bclinkinbeard · 2019-05-14T14:19:04Z

Hey Ib, with #3076 resolved I've come back to this and made some progress. I'm successfully filtering in the shader, but I'm not sure how to trigger updates in a performant way.

I followed your suggestions and added minMs and maxMs uniforms to my ScatterplotLayer subclass, and an instance property instanceTime. My vertex shader sets a varying float vTime with the instance value vTime = instanceTime;, which I learned from the TripsLayer source. My fragment shader then discards based on that value.

if (vTime < minMs || vTime > maxMs) {
  discard;
}

Where I'm a bit stuck is how to trigger re-renders without incurring a bunch of CPU work that makes everything lag. I think the recommended approach is to call this.deck.setProps({layers: [myCustomScatterplotLayer]});, but that calls all of the accessors again, resulting in jank. How do I tell the layer to only check the uniforms when re-rendering? Thanks!

bclinkinbeard · 2019-05-14T14:25:38Z

To clarify a bit, the getPosition and getTime (from my new instance attribute) accessors are currently using the index property to look up their values in an Arrow column. You mentioned table.getColumn('timestamps').toArray(), but I'm not sure how to use that in the layer updates.

bclinkinbeard · 2019-05-15T05:09:41Z

OK, after fighting with this all day (and night) I have something functional, but I hope it's not right. I'm still seeing significant jank with less than 1M points, even though I've confirmed my accessors are not being rerun. When my time scrubber changes I do the following.

layer.state.model.draw({
    uniforms: {
      minMs: this.minMs,
      maxMs: this.maxMs,
    }
});

The performance is actually worse than what I've gotten out of the CPU before. Hopefully I'm just holding it wrong?

ibgreen · 2019-05-15T13:31:57Z

You mentioned table.getColumn('timestamps').toArray(), but I'm not sure how to use that in the layer updates

Arrow's toArray method returns a typed array representing that column. If that array has the right format (type and number of components per element) you can feed it directly as an attribute to your shader via a top-level prop.

attribute float timestamps;

const timestamps = table.getColumn('timestamps').toArray();

new Layer({
  timestamps, // Will use this typed array instead of generating it.
  ...
});

ibgreen · 2019-05-15T13:33:47Z

The performance is actually worse than what I've gotten out of the CPU before.

No you should not be able to notice any degradation from that GPU code.

As a first step, try setting deck.log.priority to values between 1 and 4 and see if it can help you trace what is happening in the update cycle.

Just set the value in the console prompt in the debugger.

bclinkinbeard · 2019-05-15T15:24:50Z

you can feed it directly as an attribute to your shader via a top-level prop

Thanks, I was able to do this for my custom attribute but not for getPosition. I tried passing a Float32Array that had the correct values but they don't seem to be used and nothing renders.

As a first step, try setting deck.log.priority

Setting it to 1 doesn't output anything, and anything above 1 produces the error Cannot read property 'totalCount' of undefined.

I think my primary question at this point is what do I do when my custom uniforms change? Should it really just be something like this?

this._layer = new CustomScatterplotLayer({
  data: {length: this.eventsTable.count()},
  getPosition: this.getPosition, // can hopefully replace this with TypedArray ref
  getTime: this.times, // Uint32Array of second precision timestamps
  minMs: this.minMs, // updated uniform
  maxMs: this.maxMs, // updated uniform
  radiusScale: 2 ** (15 - this.fractionalZoom),
});
this.deck.setProps({layers: [this._layer]});

ibgreen · 2019-05-15T15:38:16Z

As a first step, try setting deck.log.priority

Setting it to 1 doesn't output anything, and anything above 1 produces the error Cannot read property 'totalCount' of undefined.

We improved metrics collection in v7, perhaps it introduced a bug with logging.

ibgreen · 2019-05-15T15:46:03Z

Should it really just be something like this?

Yes but I would start by making sure it works without using binary data. As long as you are not changing the data array every render, using binary data should just optimize initialization time, not animation time.

uniforms: {
  minMs: this.minMs,
  maxMs: this.maxMs,
}

});```

Should that be this.props.minMs etc?

bclinkinbeard · 2019-05-15T15:57:41Z

start by making sure it works without using binary data

Everything works, even with binary, I'm just trying to get the performance gains I was expecting.

using binary data should just optimize initialization time, not animation time.

Hmm, but I moved my filtering to the shader specifically so I could animate (show/hide) large numbers of points without touching the CPU and incurring lag. Is it reasonable to expect deck to support that in a performant manner with 1M+ points? More?

ibgreen · 2019-05-15T16:14:07Z

On our MacBook pros we usually see rendering performance start to fall somewhere between 1-10M points. 1M is usually very good. We have examples on our website that loads close to 1M points (e.g. the scatterplot example) and perform quite well, if they perform well for when dragging and tilting then things should work.

Your if statement in the shader should not affect performance much, which is why I am coming back to the suggestion that something is triggering deck.gl to do other types of work every time you render the layer, such data generation or refreshes.

Since the code you showed me using binary presumably doesn't work (you said the submitting the binary positions didn't work for you) I suspected that you were actually using some other code and that code triggers some deck.gl update.

Pessimistress · 2019-05-15T16:52:23Z

@bclinkinbeard A few ideas:

Make sure you are using the latest release (a perf issue regarding typed array props was patched in 7.0.1)
Make sure your data prop is shallowly changed every render. From your snippet:

new CustomScatterplotLayer({
  data: {length: this.eventsTable.count()},  // a new object is created
  ...
})

To pass typed arrays directly to the attribute, you need to use the attribute name instead of the accessor name:

new CustomScatterplotLayer({
  instancePositions: <Float32Array>,
  instanceTime: <Uint32Array>,
  ...
});

bclinkinbeard · 2019-05-15T17:18:11Z

Thanks, I am using the 7.0.6 bundle.

I switched the data prop to a stable ref with data: this.data and replaced getPosition and getTime with instancePositions and instanceTimes. Unfortunately that just gives me a TypeError: Cannot read property 'position' of undefined error and nothing rendered.

I do think it's possible some extra calls are happening so I'm going to investigate that.

Pessimistress · 2019-05-15T18:17:05Z

Unfortunately that just gives me a TypeError: Cannot read property 'position' of undefined error and nothing rendered.

Ah that is probably from the instancePositions64xyLow calculation: https://github.com/uber/deck.gl/blob/master/modules/layers/src/scatterplot-layer/scatterplot-layer.js#L179

Just to verify, try construct an empty Float32Array of length: count * 2 and pass it to a prop named instancePositions64xyLow.

bclinkinbeard · 2019-05-15T18:55:37Z

That worked, thanks! My layer definition now looks like this.

this._layer = new CustomScatterplotLayer({
  numInstances: this.data.length,
  instancePositions: this.positions,
  instanceTimes: this.seconds,
  instancePositions64xyLow: this.instancePositions64xyLow,
  minMs: this.minMs,
  maxMs: this.maxMs,
  radiusScale: 2 ** (15 - this.fractionalZoom),
});
this.deck.setProps({layers: [this._layer]});

However, that generates a warning of Attribute prop instanceTimes is casted to Float32Array on first render, and then .WebGL-0x7fb1199f2e00]GL ERROR :GL_INVALID_OPERATION : glDrawArraysInstancedANGLE: attempt to access out of range vertices in attribute 3 on subsequent renders. (this.seconds is a Uint32Array)

Pessimistress · 2019-05-15T19:05:13Z

The warning should be non-blocking. What is the size of your instancePositions array?

bclinkinbeard · 2019-05-15T19:29:03Z

The warning should be non-blocking.

Cool, went ahead and made it a Float32Array, just wanted to make sure I didn't mess up.

What is the size of your instancePositions array?

It varies, but just now the first render was 600K, the second 1.2M

Pessimistress · 2019-05-15T19:33:05Z

What is the size of your instancePositions array?

Are you packing [x, y] instead of [x, y, z]?

bclinkinbeard · 2019-05-15T19:47:12Z

No, I just have similarly sized data chunks that I'm loading in succession. If I log out the lengths while rendering one particular dataset I get the following output.

1942635 instancePositions
647545 instanceTimes
647545 numInstances

If I then load another dataset of roughly the same size the counts change as I would expect and I see this output.

3835947 instancePositions
1278649 instanceTimes
1278649 numInstances

Immediately after that, though, I see the .WebGL-0x7fb1246cac00]GL ERROR :GL_INVALID_OPERATION : glDrawArraysInstancedANGLE: attempt to access out of range vertices in attribute 3 error and everything disappears.

Pessimistress · 2019-05-15T20:01:49Z

This may be caused by picking colors. Can you create a new data object whenever numInstances change?

bclinkinbeard · 2019-05-15T20:30:47Z

That fixed it, thank you!!

I will close this issue once I do a bit more verification. 🚀 🎉

Pessimistress · 2019-05-15T20:37:10Z

The reason we don't document this approach is, as you have seen, there are a lot of implicit expectations for externally provided attributes, and coordinating their change with internal attribute updates can get tricky. We work with arrow data internally too and we are working on new API designs that make it easier.

bclinkinbeard · 2019-05-15T21:39:40Z

Awesome, I poked around in the 7.x binary RFC docs a bit. Are there issues or branches I should watch for early access to the new APIs? We are just getting started with Arrow but I expect we'll be using it extensively in the near future. Thanks again!

bclinkinbeard added the question label May 6, 2019

ibgreen added the bug label May 15, 2019

bclinkinbeard closed this as completed May 18, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

The correct way to hide scatterplot items and practical count limits? #3068

The correct way to hide scatterplot items and practical count limits? #3068

bclinkinbeard commented May 6, 2019

ibgreen commented May 6, 2019

bclinkinbeard commented May 6, 2019

ibgreen commented May 7, 2019

bclinkinbeard commented May 7, 2019

ibgreen commented May 7, 2019 •

edited

bclinkinbeard commented May 14, 2019

bclinkinbeard commented May 14, 2019

bclinkinbeard commented May 15, 2019

ibgreen commented May 15, 2019

ibgreen commented May 15, 2019 •

edited

bclinkinbeard commented May 15, 2019

ibgreen commented May 15, 2019

ibgreen commented May 15, 2019

bclinkinbeard commented May 15, 2019

ibgreen commented May 15, 2019

Pessimistress commented May 15, 2019

bclinkinbeard commented May 15, 2019

Pessimistress commented May 15, 2019

bclinkinbeard commented May 15, 2019

Pessimistress commented May 15, 2019

bclinkinbeard commented May 15, 2019

Pessimistress commented May 15, 2019

bclinkinbeard commented May 15, 2019

Pessimistress commented May 15, 2019

bclinkinbeard commented May 15, 2019

Pessimistress commented May 15, 2019

bclinkinbeard commented May 15, 2019

The correct way to hide scatterplot items and practical count limits? #3068

The correct way to hide scatterplot items and practical count limits? #3068

Comments

bclinkinbeard commented May 6, 2019

ibgreen commented May 6, 2019

bclinkinbeard commented May 6, 2019

ibgreen commented May 7, 2019

bclinkinbeard commented May 7, 2019

ibgreen commented May 7, 2019 • edited

bclinkinbeard commented May 14, 2019

bclinkinbeard commented May 14, 2019

bclinkinbeard commented May 15, 2019

ibgreen commented May 15, 2019

ibgreen commented May 15, 2019 • edited

bclinkinbeard commented May 15, 2019

ibgreen commented May 15, 2019

ibgreen commented May 15, 2019

bclinkinbeard commented May 15, 2019

ibgreen commented May 15, 2019

Pessimistress commented May 15, 2019

bclinkinbeard commented May 15, 2019

Pessimistress commented May 15, 2019

bclinkinbeard commented May 15, 2019

Pessimistress commented May 15, 2019

bclinkinbeard commented May 15, 2019

Pessimistress commented May 15, 2019

bclinkinbeard commented May 15, 2019

Pessimistress commented May 15, 2019

bclinkinbeard commented May 15, 2019

Pessimistress commented May 15, 2019

bclinkinbeard commented May 15, 2019

ibgreen commented May 7, 2019 •

edited

ibgreen commented May 15, 2019 •

edited