Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The correct way to hide scatterplot items and practical count limits? #3068

Closed
bclinkinbeard opened this issue May 6, 2019 · 27 comments
Closed

Comments

@bclinkinbeard
Copy link
Contributor

I've got a Scatterplot that hides items in response to a time scrubber, which I've implemented by returning 0 from getRadius for items outside the active time range. getRadius is specified in updateTriggers as well, but I'm wondering if this is the most performant way to do this.

Would setting the alpha to 0 or filtering the items out of the data ahead of time be a better approach? I'm also curious if smoothly scrubbing through 1M+ points is a realistic goal or not. I've rendered up to about 7M points but things get pretty bogged down.

@ibgreen
Copy link
Collaborator

ibgreen commented May 6, 2019

@bclinkinbeard When you change the radius of points (and fire the updateTrigger) you force a JavaScript recalculation of the radius instance array which will be 7 million items when you have 7 million points.

To get really good performance with millions of points you need to do computations on the GPU (i.e. in the shaders). In this case, this probably requires writing a custom subclass and making some addition to the existing scatterplot layer shader code.

If you think you are up to that it we could give you a few ideas for how to start.

@bclinkinbeard
Copy link
Contributor Author

Hey Ib, that would be great. I'm definitely willing to give it a shot so any pointers are much appreciated. Thanks!

@ibgreen
Copy link
Collaborator

ibgreen commented May 7, 2019

So the GPU first runs a vertex shader on every vertex in the geometry and then a fragment shader on every rendered pixel.

  • One trick is to use the discard keyword in the fragment shader to prevent rendering of pixels that belong to currently invisible objects.
  • In your case that is determined by checking if the time of the current data object is outside of the range.
  • You would set up two uniforms to represent the start and end time of the interval.
  • And generate a new attribute containing the time stamps of all the data objects.

To understand where to start, do any of these statements make sense?

Look through existing shaders and see if you can see similar patterns.

@bclinkinbeard
Copy link
Contributor Author

Hey Ib, that does make sense. I actually have my timestamps in a TypedArray already, in two different ways. I have a "vanilla" overlay and an Apache Arrow overlay, so I have both row and column encodings.

I've started looking at the ScatterplotLayer source, can I just subclass that and provide my own shaders?

@ibgreen
Copy link
Collaborator

ibgreen commented May 7, 2019

I actually have my timestamps in a TypedArray already, in two different ways. I have a "vanilla" overlay and an Apache Arrow overlay, so I have both row and column encodings.

Not quite sure what you mean with "overlay", but yes you want contiguous columnar data, i.e. just one timestamp per object. For Arrow I expect you'd just do table.getColumn('timestamps').toArray().

I've started looking at the ScatterplotLayer source, can I just subclass that and provide my own shaders?

Yes, there is a getShaders method you can supply to override the shaders.

Make sure you have read this page https://github.com/uber/deck.gl/blob/master/docs/developer-guide/custom-layers/subclassed-layers.md

The TripsLayer source code likely also has some timestamp handling that might serve as inspiration.

@bclinkinbeard
Copy link
Contributor Author

Hey Ib, with #3076 resolved I've come back to this and made some progress. I'm successfully filtering in the shader, but I'm not sure how to trigger updates in a performant way.

I followed your suggestions and added minMs and maxMs uniforms to my ScatterplotLayer subclass, and an instance property instanceTime. My vertex shader sets a varying float vTime with the instance value vTime = instanceTime;, which I learned from the TripsLayer source. My fragment shader then discards based on that value.

if (vTime < minMs || vTime > maxMs) {
  discard;
}

Where I'm a bit stuck is how to trigger re-renders without incurring a bunch of CPU work that makes everything lag. I think the recommended approach is to call this.deck.setProps({layers: [myCustomScatterplotLayer]});, but that calls all of the accessors again, resulting in jank. How do I tell the layer to only check the uniforms when re-rendering? Thanks!

@bclinkinbeard
Copy link
Contributor Author

To clarify a bit, the getPosition and getTime (from my new instance attribute) accessors are currently using the index property to look up their values in an Arrow column. You mentioned table.getColumn('timestamps').toArray(), but I'm not sure how to use that in the layer updates.

@bclinkinbeard
Copy link
Contributor Author

OK, after fighting with this all day (and night) I have something functional, but I hope it's not right. I'm still seeing significant jank with less than 1M points, even though I've confirmed my accessors are not being rerun. When my time scrubber changes I do the following.

layer.state.model.draw({
    uniforms: {
      minMs: this.minMs,
      maxMs: this.maxMs,
    }
});

The performance is actually worse than what I've gotten out of the CPU before. Hopefully I'm just holding it wrong?

@ibgreen
Copy link
Collaborator

ibgreen commented May 15, 2019

You mentioned table.getColumn('timestamps').toArray(), but I'm not sure how to use that in the layer updates

Arrow's toArray method returns a typed array representing that column. If that array has the right format (type and number of components per element) you can feed it directly as an attribute to your shader via a top-level prop.

attribute float timestamps;
const timestamps = table.getColumn('timestamps').toArray();

new Layer({
  timestamps, // Will use this typed array instead of generating it.
  ...
});

@ibgreen
Copy link
Collaborator

ibgreen commented May 15, 2019

The performance is actually worse than what I've gotten out of the CPU before.

No you should not be able to notice any degradation from that GPU code.

As a first step, try setting deck.log.priority to values between 1 and 4 and see if it can help you trace what is happening in the update cycle.

Just set the value in the console prompt in the debugger.

@bclinkinbeard
Copy link
Contributor Author

you can feed it directly as an attribute to your shader via a top-level prop

Thanks, I was able to do this for my custom attribute but not for getPosition. I tried passing a Float32Array that had the correct values but they don't seem to be used and nothing renders.

As a first step, try setting deck.log.priority

Setting it to 1 doesn't output anything, and anything above 1 produces the error Cannot read property 'totalCount' of undefined.

I think my primary question at this point is what do I do when my custom uniforms change? Should it really just be something like this?

this._layer = new CustomScatterplotLayer({
  data: {length: this.eventsTable.count()},
  getPosition: this.getPosition, // can hopefully replace this with TypedArray ref
  getTime: this.times, // Uint32Array of second precision timestamps
  minMs: this.minMs, // updated uniform
  maxMs: this.maxMs, // updated uniform
  radiusScale: 2 ** (15 - this.fractionalZoom),
});
this.deck.setProps({layers: [this._layer]});

@ibgreen ibgreen added the bug label May 15, 2019
@ibgreen
Copy link
Collaborator

ibgreen commented May 15, 2019

As a first step, try setting deck.log.priority

Setting it to 1 doesn't output anything, and anything above 1 produces the error Cannot read property 'totalCount' of undefined.

We improved metrics collection in v7, perhaps it introduced a bug with logging.

@ibgreen
Copy link
Collaborator

ibgreen commented May 15, 2019

Should it really just be something like this?

Yes but I would start by making sure it works without using binary data. As long as you are not changing the data array every render, using binary data should just optimize initialization time, not animation time.

uniforms: {
  minMs: this.minMs,
  maxMs: this.maxMs,
}

});```

Should that be this.props.minMs etc?

@bclinkinbeard
Copy link
Contributor Author

start by making sure it works without using binary data

Everything works, even with binary, I'm just trying to get the performance gains I was expecting.

using binary data should just optimize initialization time, not animation time.

Hmm, but I moved my filtering to the shader specifically so I could animate (show/hide) large numbers of points without touching the CPU and incurring lag. Is it reasonable to expect deck to support that in a performant manner with 1M+ points? More?

@ibgreen
Copy link
Collaborator

ibgreen commented May 15, 2019

On our MacBook pros we usually see rendering performance start to fall somewhere between 1-10M points. 1M is usually very good. We have examples on our website that loads close to 1M points (e.g. the scatterplot example) and perform quite well, if they perform well for when dragging and tilting then things should work.

Your if statement in the shader should not affect performance much, which is why I am coming back to the suggestion that something is triggering deck.gl to do other types of work every time you render the layer, such data generation or refreshes.

Since the code you showed me using binary presumably doesn't work (you said the submitting the binary positions didn't work for you) I suspected that you were actually using some other code and that code triggers some deck.gl update.

@Pessimistress
Copy link
Collaborator

@bclinkinbeard A few ideas:

  • Make sure you are using the latest release (a perf issue regarding typed array props was patched in 7.0.1)

  • Make sure your data prop is shallowly changed every render. From your snippet:

new CustomScatterplotLayer({
  data: {length: this.eventsTable.count()},  // a new object is created
  ...
})
  • To pass typed arrays directly to the attribute, you need to use the attribute name instead of the accessor name:
new CustomScatterplotLayer({
  instancePositions: <Float32Array>,
  instanceTime: <Uint32Array>,
  ...
});

@bclinkinbeard
Copy link
Contributor Author

Thanks, I am using the 7.0.6 bundle.

I switched the data prop to a stable ref with data: this.data and replaced getPosition and getTime with instancePositions and instanceTimes. Unfortunately that just gives me a TypeError: Cannot read property 'position' of undefined error and nothing rendered.

I do think it's possible some extra calls are happening so I'm going to investigate that.

@Pessimistress
Copy link
Collaborator

Unfortunately that just gives me a TypeError: Cannot read property 'position' of undefined error and nothing rendered.

Ah that is probably from the instancePositions64xyLow calculation: https://github.com/uber/deck.gl/blob/master/modules/layers/src/scatterplot-layer/scatterplot-layer.js#L179

Just to verify, try construct an empty Float32Array of length: count * 2 and pass it to a prop named instancePositions64xyLow.

@bclinkinbeard
Copy link
Contributor Author

That worked, thanks! My layer definition now looks like this.

this._layer = new CustomScatterplotLayer({
  numInstances: this.data.length,
  instancePositions: this.positions,
  instanceTimes: this.seconds,
  instancePositions64xyLow: this.instancePositions64xyLow,
  minMs: this.minMs,
  maxMs: this.maxMs,
  radiusScale: 2 ** (15 - this.fractionalZoom),
});
this.deck.setProps({layers: [this._layer]});

However, that generates a warning of Attribute prop instanceTimes is casted to Float32Array on first render, and then .WebGL-0x7fb1199f2e00]GL ERROR :GL_INVALID_OPERATION : glDrawArraysInstancedANGLE: attempt to access out of range vertices in attribute 3 on subsequent renders. (this.seconds is a Uint32Array)

@Pessimistress
Copy link
Collaborator

The warning should be non-blocking. What is the size of your instancePositions array?

@bclinkinbeard
Copy link
Contributor Author

The warning should be non-blocking.

Cool, went ahead and made it a Float32Array, just wanted to make sure I didn't mess up.

What is the size of your instancePositions array?

It varies, but just now the first render was 600K, the second 1.2M

@Pessimistress
Copy link
Collaborator

What is the size of your instancePositions array?

Are you packing [x, y] instead of [x, y, z]?

@bclinkinbeard
Copy link
Contributor Author

No, I just have similarly sized data chunks that I'm loading in succession. If I log out the lengths while rendering one particular dataset I get the following output.

1942635 instancePositions
647545 instanceTimes
647545 numInstances

If I then load another dataset of roughly the same size the counts change as I would expect and I see this output.

3835947 instancePositions
1278649 instanceTimes
1278649 numInstances

Immediately after that, though, I see the .WebGL-0x7fb1246cac00]GL ERROR :GL_INVALID_OPERATION : glDrawArraysInstancedANGLE: attempt to access out of range vertices in attribute 3 error and everything disappears.

@Pessimistress
Copy link
Collaborator

This may be caused by picking colors. Can you create a new data object whenever numInstances change?

@bclinkinbeard
Copy link
Contributor Author

That fixed it, thank you!!

I will close this issue once I do a bit more verification. 🚀 🎉

@Pessimistress
Copy link
Collaborator

The reason we don't document this approach is, as you have seen, there are a lot of implicit expectations for externally provided attributes, and coordinating their change with internal attribute updates can get tricky. We work with arrow data internally too and we are working on new API designs that make it easier.

@bclinkinbeard
Copy link
Contributor Author

Awesome, I poked around in the 7.x binary RFC docs a bit. Are there issues or branches I should watch for early access to the new APIs? We are just getting started with Arrow but I expect we'll be using it extensively in the near future. Thanks again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants