Getting Good Performance From ANGLE

Austin Kinross edited this page Mar 23, 2018 · 8 revisions

Overview

  • This wiki page contains a lot of advice to help you get good performance out of ANGLE.
  • Most of the advice applies to all ANGLE apps.
  • Some advice applies to Windows Store only. Some applies to certain hardware (e.g. Windows Phones).

- Use the "fast present path" (a.k.a. render-to-backbuffer) flags

Note: ANGLE doesn't use these flags by default, but they are set in our Windows Store templates.

  • Due to differences between the GL and D3D11 coordinate systems, ANGLE usually renders its content upside-down onto an offscreen texture, before inverting this and rendering it to the screen.
  • This inversion is expensive on mobile GPUs. For example, it takes 20% of each frame on a Lumia 630.
  • We’ve implemented an OpenGL ES extension to render the content directly to the screen, the correct way up. This is a big performance win!
  • Note that the extension technically breaks OpenGL ES conformance in certain corner-cases, but in practice this is unlikely to be an issue.

Please see the EGL_ANGLE_experimental_present_path spec for more details.


- Use Vertex Buffer Objects (VBOs) and static data

  • If possible, you should avoid copying vertex/index data from the CPU to the GPU every frame.
  • Use Vertex Buffer Objects (VBOs) to avoid this.
  • Use GL_STATIC_DRAW in these VBOs wherever possible to further improve performance.

- Run your app at a lower resolution

  • In OpenGL ES, the Window Surface size typically matches the screen resolution of the device.
  • Some devices have high-resolution screens but weak GPUs, which can cause performance issues.
  • The built-in hardware scaler allows your app to render at a lower resolution than the screen and have its content scaled up to the screen’s resolution at no extra cost.
  • This is configured using "IPropertySet", which is passed to eglCreateWindowSurface.
  • This is only supported in the Windows Store versions of ANGLE.

See this wiki page for more details.


- Make fewer, larger draw calls

  • D3D11 drivers are typically optimized to handle few draw calls, each with lots of geometry.
  • Some OpenGL ES drivers are optimized to handle lots of draw calls, each with smaller amounts of geometry.
  • ANGLE uses D3D11 drivers, so apps using ANGLE should try to minimize the number of draw calls.
  • Making lots of draw calls will work; it just isn’t the most efficient way to drive D3D.
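
One common way to reduce the draw call count is to merge meshes that share the same textures, shaders and render state into a single vertex/index buffer on the CPU, then draw the merged buffer once. The sketch below is only an illustration: `Mesh` and `appendMesh` are hypothetical names that are not part of ANGLE, and it assumes tightly packed float vertices with 16-bit triangle-list indices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU-side mesh (not an ANGLE type).
struct Mesh {
    std::vector<float> vertices;         // floatsPerVertex floats per vertex
    std::vector<std::uint16_t> indices;  // triangle list
};

// Appends `src` to `batch`, rebasing the indices of `src` so that they
// refer to the vertices' new positions inside the merged vertex array.
void appendMesh(Mesh& batch, const Mesh& src, std::size_t floatsPerVertex) {
    assert(floatsPerVertex > 0);
    assert(src.vertices.size() % floatsPerVertex == 0);
    const std::size_t base = batch.vertices.size() / floatsPerVertex;
    const std::size_t added = src.vertices.size() / floatsPerVertex;
    assert(base + added <= 65536);  // must stay addressable with 16-bit indices

    batch.vertices.insert(batch.vertices.end(),
                          src.vertices.begin(), src.vertices.end());
    for (std::uint16_t i : src.indices) {
        batch.indices.push_back(static_cast<std::uint16_t>(base + i));
    }
}
```

The merged `vertices` and `indices` arrays can then be uploaded once and drawn with a single glDrawElements call instead of one call per mesh.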

- Cache compiled program binaries

  • Shaders and programs are expensive to compile at runtime in standard OpenGL ES.
  • They are particularly expensive in ANGLE, since we have to translate ESSL -> HLSL before compiling the shaders.
  • One possible way to work around this is to use the OES_get_program_binary extension.
  • Instead of compiling shaders every time the app is run, the extension allows your app to compile them once (e.g. during the app’s first run), save the compiled binary to disk, and simply reload it from disk next time the app is run.
  • This can be a lot faster than compiling the shaders from source.

See this wiki page for more details and example code.
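
As a rough illustration of the "save to disk, reload next run" step, the sketch below covers only the file I/O side. `ProgramBinary`, `saveProgramBinary` and `loadProgramBinary` are hypothetical names; the format and bytes are assumed to come from glGetProgramBinaryOES and to be passed back to glProgramBinaryOES after loading. The on-disk layout (format, size, blob) is an arbitrary choice for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical container for the values returned by glGetProgramBinaryOES.
struct ProgramBinary {
    std::uint32_t format = 0;  // binaryFormat (a GLenum)
    std::vector<char> bytes;   // opaque compiled program blob
};

// Writes [format][size][bytes] to `path`. Returns false on any I/O error.
bool saveProgramBinary(const std::string& path, const ProgramBinary& bin) {
    assert(bin.bytes.size() <= UINT32_MAX);
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    const std::uint32_t size = static_cast<std::uint32_t>(bin.bytes.size());
    out.write(reinterpret_cast<const char*>(&bin.format), sizeof(bin.format));
    out.write(reinterpret_cast<const char*>(&size), sizeof(size));
    out.write(bin.bytes.data(), size);
    return static_cast<bool>(out);
}

// Reads a file written by saveProgramBinary. Returns false if the file is
// missing or truncated, in which case the app should compile from source.
bool loadProgramBinary(const std::string& path, ProgramBinary& bin) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    std::uint32_t size = 0;
    in.read(reinterpret_cast<char*>(&bin.format), sizeof(bin.format));
    in.read(reinterpret_cast<char*>(&size), sizeof(size));
    if (!in) return false;
    bin.bytes.resize(size);
    in.read(bin.bytes.data(), size);
    return static_cast<bool>(in);
}
```

A loaded binary can still be rejected at runtime (for example after a driver update), so an app using this approach should check the link status after glProgramBinaryOES and fall back to compiling from source if it fails.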


- Avoid triggering buffer format conversion

  • Certain OpenGL ES vertex and index formats don't have equivalents in D3D11.
  • This is especially true for D3D11 Feature Level 9_3 (e.g. Windows Phone devices).
  • Using these formats causes ANGLE to convert the data into the nearest D3D11 format, which is expensive.
  • Avoid these formats where it is easy to do so.
  • If the conversion only happens once (e.g. during app initialization), the cost isn’t too high.

Vertex Buffer Formats to avoid:

  • GL_BYTE, GL_UNSIGNED_BYTE, GL_SHORT and GL_UNSIGNED_SHORT with 3 components. Use 4 components instead.
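
If existing assets use 3-component data, one option is to pad it to 4 components on the CPU before uploading it. The helper below is a minimal sketch; `padToFourComponents` is a hypothetical name and it assumes tightly packed, non-interleaved attribute data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expands tightly packed 3-component attributes (x, y, z) into 4-component
// attributes (x, y, z, w), so the buffer can be specified with size = 4.
template <typename T>
std::vector<T> padToFourComponents(const std::vector<T>& src, T w) {
    assert(src.size() % 3 == 0);
    std::vector<T> dst;
    dst.reserve(src.size() / 3 * 4);
    for (std::size_t i = 0; i < src.size(); i += 3) {
        dst.push_back(src[i]);
        dst.push_back(src[i + 1]);
        dst.push_back(src[i + 2]);
        dst.push_back(w);  // padding component
    }
    return dst;
}
```

OpenGL ES supplies w = 1.0 for attributes specified with fewer than 4 components, so choosing a padding value that also maps to 1.0 (1 for non-normalized data, the type's maximum value for normalized data) should keep what the shader sees unchanged.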

Index Buffer Formats to avoid:

  • GL_UNSIGNED_BYTE. Use GL_UNSIGNED_SHORT or GL_UNSIGNED_INT instead.
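
Existing 8-bit index data can be widened once on the CPU before it is uploaded. This is a small sketch with a hypothetical helper name, `widenIndices`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts 8-bit indices into 16-bit indices suitable for drawing with
// GL_UNSIGNED_SHORT.
std::vector<std::uint16_t> widenIndices(const std::uint8_t* src, std::size_t count) {
    assert(src != nullptr || count == 0);
    return std::vector<std::uint16_t>(src, src + count);
}
```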

Vertex Buffer formats to avoid on Feature Level 9_3:

  • GL_BYTE. Use GL_SHORT instead.
  • GL_UNSIGNED_BYTE with fewer than 4 components. Use 4 components instead.
  • GL_SHORT with 1 component. Use GL_SHORT with 2 components.
  • GL_UNSIGNED_SHORT. Use another format.
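
The vertex format lists above can be encoded as a small debug-time check, for example to flag problematic attributes in an asset pipeline. This is only a sketch of those lists: `AttribType` and `isOnAvoidList` are hypothetical names, and the enum values are assumed to mirror the standard GL type constants.

```cpp
#include <cassert>

// Values mirror the standard OpenGL ES type enums.
enum AttribType : unsigned {
    kByte = 0x1400,          // GL_BYTE
    kUnsignedByte = 0x1401,  // GL_UNSIGNED_BYTE
    kShort = 0x1402,         // GL_SHORT
    kUnsignedShort = 0x1403  // GL_UNSIGNED_SHORT
};

// Returns true if (type, components) appears on one of the vertex buffer
// "formats to avoid" lists above.
bool isOnAvoidList(AttribType type, int components, bool featureLevel9_3) {
    assert(components >= 1 && components <= 4);
    // 3 components should be avoided for all four types on every feature level.
    if (components == 3) return true;
    if (!featureLevel9_3) return false;
    // Additional restrictions on Feature Level 9_3.
    switch (type) {
        case kByte:          return true;
        case kUnsignedByte:  return components < 4;
        case kShort:         return components == 1;
        case kUnsignedShort: return true;
    }
    return false;
}
```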

- Follow our advice for textures

  • Avoid redefining old textures. Ideally use immutable textures.
  • Load DXT compressed textures via the DXT extension ANGLE_texture_compression_dxt
    • This improves texture loading times, since DXT formats can be decoded by the hardware driver (instead of on the CPU like other image formats)
    • It also reduces memory consumption, since DXT textures are compressed in GPU memory
  • Avoid corner-cases on D3D11 Feature Level 9_3 which will trigger additional memory usage.
    • Case 1: Create a texture without mipmaps, bind it to a framebuffer, render to the framebuffer, then call glGenerateMipmap() on the texture
      • An easy workaround for this is to create the texture with empty mipmaps before binding it to the framebuffer
    • Case 2: Create a texture without mipmaps, sample from it to render a scene, then call glGenerateMipmap() on the texture
    • Case 3: Create a texture with mipmaps, then disable mipmaps on the texture

- Use Instancing

  • Instancing allows an app to cheaply render the same content lots of times.
  • Instancing is exposed via an extension to OpenGL ES 2.0, ANGLE_instanced_arrays.
  • The extension is implemented in ANGLE, and is supported on D3D11 Feature Level 9_3.
  • The instancing extension graduated to OpenGL ES 3.0 with minor changes.
  • There are lots of tutorials available online for Instancing on ES 3.0. These should work when using the extension too.

- Use glDiscardFramebufferEXT

  • The EXT_discard_framebuffer extension helps ANGLE know when it has to keep framebuffer data, and when it can discard it.
  • This helps performance, particularly on tile-based architectures (such as most Windows Phones).

- Avoid glDrawElements with GL_POINTS on 9_3

  • This triggers an expensive emulation path on Feature Level 9_3.
  • Use glDrawArrays with GL_POINTS if possible.
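
If point data is currently indexed, one possible way to switch to glDrawArrays is to expand the indexed vertices into a flat array once on the CPU. The sketch below assumes 16-bit indices; `expandIndexed` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Builds a flat, non-indexed vertex array by looking up each index in turn.
// The result can be drawn with glDrawArrays(GL_POINTS, 0, result.size()).
template <typename Vertex>
std::vector<Vertex> expandIndexed(const std::vector<Vertex>& vertices,
                                  const std::vector<std::uint16_t>& indices) {
    std::vector<Vertex> out;
    out.reserve(indices.size());
    for (std::uint16_t i : indices) {
        assert(i < vertices.size());
        out.push_back(vertices[i]);
    }
    return out;
}
```

This duplicates any vertex that is referenced more than once, so it trades some memory for avoiding the emulation path.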

- Avoid certain primitive types

  • Avoid GL_TRIANGLE_FAN. Use GL_TRIANGLE_STRIP or GL_TRIANGLES instead.
  • Avoid GL_LINE_LOOP. Use GL_LINES or GL_LINE_STRIP instead.
  • These types don't have equivalents in D3D11, so they require conversion to other types.
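
Index data for these primitive types can be rewritten ahead of time so that no conversion is needed at draw time. The helpers below are a minimal sketch of that idea; `fanToTriangles` and `loopToStrip` are hypothetical names and 16-bit indices are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// GL_TRIANGLE_FAN (v0, v1, ..., vN-1) -> GL_TRIANGLES
// (v0, v1, v2), (v0, v2, v3), ... The winding order is preserved.
std::vector<std::uint16_t> fanToTriangles(const std::vector<std::uint16_t>& fan) {
    std::vector<std::uint16_t> tris;
    if (fan.size() < 3) return tris;  // not enough vertices for a triangle
    tris.reserve((fan.size() - 2) * 3);
    for (std::size_t i = 1; i + 1 < fan.size(); ++i) {
        tris.push_back(fan[0]);
        tris.push_back(fan[i]);
        tris.push_back(fan[i + 1]);
    }
    assert(tris.size() == (fan.size() - 2) * 3);
    return tris;
}

// GL_LINE_LOOP -> GL_LINE_STRIP: repeat the first vertex at the end so the
// strip closes the loop.
std::vector<std::uint16_t> loopToStrip(const std::vector<std::uint16_t>& loop) {
    std::vector<std::uint16_t> strip(loop);
    if (loop.size() >= 2) strip.push_back(loop.front());
    return strip;
}
```

For example, a fan with indices 0, 1, 2, 3 becomes the triangle list 0, 1, 2, 0, 2, 3, which can be drawn with GL_TRIANGLES.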