Change Notes

richgel999 edited this page Mar 23, 2014 · 36 revisions

3/21/14: KHR_debug support, added "-msaa X" option to replayer to control the default framebuffer's multisample settings, added a few more API's to the display list whitelist.

New API's added to replayer whitelist: DebugMessageControl, DebugMessageInsert, DebugMessageCallback, GetDebugMessageLog, PushDebugGroup, PopDebugGroup, ObjectLabel, GetObjectLabel, ObjectPtrLabel, GetObjectPtrLabel

GetPointerv() is not supported yet, but tracing/replaying call streams with this API call will work fine.

Right now we just remap any handles or pointers (as needed) and pass on the labels to the driver. We'll be adding the debug labels to our snapshot system soon (after that we can plumb them up into the UI).

3/20/14: VoglEditor Features At The Time It Went Public

  • Loading multi-frame trace files
  • Obtaining a GL state snapshot at any API call that is not within a glBegin/glEnd block
  • Save / Load debug sessions
    • Saves a JSON file which links together the base trace file and all collected state snapshots
    • Saves all collected state snapshots to disk
  • Traces can be replayed from within vogleditor
  • CPU-based timeline
    • Indicates when frame boundaries occur
    • Shows API call execution and cost
    • Most expensive call is shown in red and all other calls are scaled between Red -> Green based on relative execution time
  • API call hierarchy
    • Shows frames and API calls hierarchically
    • Clicking on an API call will launch the replayer and collect a state snapshot after that call executes
    • Icons indicate which API calls have a snapshot
    • Supports searching entrypoints and parameters for a supplied string, and navigating to prev / next search match
    • Supports jumping to prev / next draw call
    • Supports jumping to prev / next snapshot
  • State snapshot panel
    • Viewing all GL state within a snapshot
    • Automatic diff'ing of state between two snapshots
    • GL object explorers will default to displaying currently bound / active objects
    • Visualization of all existing framebuffers, renderbuffers, and textures
      • All visualizations can be scrolled and zoomed
      • Ability to view RGBA (with customizable alpha blend color), RGB, individual color components, 1-component, and 1/component
      • Support for viewing individual samples
      • Support for Y-flipping the image
    • Viewing of all created shader objects
    • Viewing of all created program objects and their linked / attached shaders
      • Linked shaders can be edited and saved back into the snapshot so that the changes affect the trace replay
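The timeline coloring described in the list above (most expensive call in red, everything else scaled between red and green by relative execution time) can be sketched roughly as follows. This is only an illustration in Python, not vogleditor's actual code: the function name `timeline_colors`, the linear interpolation, and the 8-bit RGB output are all assumptions.

```python
def timeline_colors(durations):
    """Map each call duration to an (r, g, b) tuple with 0..255 channels.

    Hypothetical helper: the slowest call becomes pure red, and cheaper
    calls are blended linearly toward green.
    """
    if not durations:
        return []
    slowest = max(durations)
    colors = []
    for d in durations:
        # Relative cost in [0, 1]; guard against an all-zero timeline.
        t = d / slowest if slowest > 0 else 0.0
        red = round(255 * t)
        green = round(255 * (1.0 - t))
        colors.append((red, green, 0))
    return colors
```

With this mapping, a call that takes as long as the slowest one is drawn pure red, and a call with zero cost is drawn pure green.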

3/19/14: Bug fixes for snapshotting Cube 2, bug fix for replaying NV traces on AMD (optimized out uniforms could cause us to pass down bogus locs), couple bug fixes for Steam 2ft/10ft client which is now supported (on AMD - still waiting for driver fixes from NV)

A big thanks to blackout24 for filing the Cube 2 snapshot issues, and for testing our fixes.

3/14/14: Should be pushed to github tonight or tomorrow at the latest:

  • All changes needed to get the Steam 10ft UI tracing (through apitrace right now) and full-stream replaying are in. I've only just begun testing Steam 10ft with vogl, there could be other problems although I think full-stream is OK. Snapshotting 10ft is next. I'm currently debugging a 10ft VR rendering issue for Joe Ludwig, but to do so I need to finish testing snapshotting and natively tracing 10ft with vogl.

PBO's used on various texture API's are now supported (both PIXEL_PACK and PIXEL_UNPACK buffers).

Snapshotting while buffers are mapped during replaying is now supported (but still not during tracing - that will take some more thought).

Unfortunately, playing back Steam 10ft's call streams on NVidia crashes the GPU (both a 480 and 780). No idea why really, but my guess is the constant context switching to multiplex the 5 GL contexts 10ft uses down to 1 playback thread triggers a bug somewhere. This causes massive per-frame 10-15 sec delays on my primary devbox while NVidia's driver recovers, the dmesg log is interesting, and top reports 100% CPU utilization. I tried a few drivers (331 and 334) with no luck. We've submitted a full repro to NVidia.

  • vogl_gl_replayer::process_entrypoint_msg_print_detailed_context() - added a NULL ptr check on m_pCur_gl_packet (we made some changes at the very last minute to allow packets to be pushed into the replayer individually, and we're still fixing some regressions due to these changes)

  • Fixed asserts while snapshotting or restoring genned but never used display lists

  • Genned but never used display lists are now properly serialized/deserialized (discovered because apitrace doesn't support glXUseXFont(), and glxspheres uses this GLX API to fill in its display lists - so we would never see the glXUseXFont() calls)

  • Tested all the new GL code written over the previous few weeks on AMD for the first time - fixed some core profile issues. The default framebuffer's front buffer and depth/stencil surfaces are failing to snapshot on fglrx. I don't know why; the same code works fine on NVidia. Still investigating, and I just started the ball rolling with AMD.

  • Fixing core profile issues on g-truc samples - NV accepts some bad glGet's we were doing that AMD doesn't like.

  • gl-320-fbo-depth32 does not play back at all on AMD, so I've removed it from the regression test for now. This test doesn't work on AMD when I run it directly either. I've started the ball rolling with AMD on getting a driver fix.

  • Got our regression test suite to work on AMD (several fixes incoming)

  • Adding darwinia traces to the regression test, so we get some coverage of display list replaying and snapshotting. I'll be adding more game traces and tests, and we'll hopefully be pushing the regression test itself to github very soon.

  • Couple fixes for g-truc sample snapshotting on AMD

  • Replayer's function handler now doesn't horribly die when the packet calls a missing GL function (typically these are extensions present on one driver but not another - happens with Doom3 when traced on NV but played back on AMD).
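To illustrate the last item, here is a minimal sketch of a replay loop that skips packets whose GL entrypoint is missing on the playback driver instead of dying. It is written in Python purely for illustration; the real replayer is not structured this way, and `replay_packets`, the `(name, args)` packet shape, and the `gl_functions` lookup table are all hypothetical.

```python
def replay_packets(packets, gl_functions, log=print):
    """Replay (name, args) packets, skipping entrypoints the driver lacks.

    `gl_functions` is an assumed dict mapping entrypoint names to callables
    that the current driver actually exposes.
    Returns the names of the entrypoints that were skipped.
    """
    skipped = []
    for name, args in packets:
        func = gl_functions.get(name)
        if func is None:
            # Typically an extension present on the tracing driver only.
            log(f"warning: {name} is not available on this driver, skipping")
            skipped.append(name)
            continue
        func(*args)
    return skipped
```

Returning the skipped names lets a caller decide afterwards whether the missing calls matter for the trace being replayed.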

Summary of Major Changes Since Steam Dev Days and Before Going Public:

These notes cover the non-UI aspects of the project:

  • We've now got the beginnings of a tracer/replayer correctness and smoke test process, currently under bin_richg. The test currently plays back a bunch of prerecorded apitraces using glretrace, traces this with our tracer, replays, then trims and replays again. Backbuffer CRC's (or per-component checksums with a comparison threshold for titles using MSAA) are checked all throughout the process to ensure everything went as expected.

Mike is going to be tightening up the regression test soon by moving it into a better location and rewriting it in C (vs. random bash scripts). We run this test whenever the tracer or replayer is modified, and I continue to expand the test suite and add game/sample traces.

Next, I plan on adding tests to make sure binary->JSON conversion (and back) doesn't regress.
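The two kinds of backbuffer comparison mentioned above (an exact CRC, or per-component checksums with a threshold for MSAA titles) could look something like the sketch below. It is only an illustration under stated assumptions: the function names, the tightly packed 8-bit RGBA layout, and the use of simple per-channel sums with an absolute threshold are not taken from the actual test scripts.

```python
import zlib


def component_sums(pixels: bytes, components: int = 4):
    """Sum each channel (e.g. R, G, B, A) over a tightly packed 8-bit image."""
    return [sum(pixels[c::components]) for c in range(components)]


def frames_match(expected: bytes, actual: bytes, msaa: bool = False,
                 threshold: int = 0, components: int = 4) -> bool:
    """Compare two backbuffer readbacks (hypothetical helper).

    Non-MSAA frames must match exactly, so a CRC comparison is enough.
    MSAA frames may differ slightly between runs, so each per-component
    sum only has to agree within `threshold`.
    """
    if not msaa:
        return zlib.crc32(expected) == zlib.crc32(actual)
    if len(expected) != len(actual):
        return False
    exp = component_sums(expected, components)
    act = component_sums(actual, components)
    return all(abs(e - a) <= threshold for e, a in zip(exp, act))
```

The threshold trades strictness for robustness: too small and harmless MSAA resolve differences fail the test, too large and real rendering regressions slip through.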

  • All g-truc 3.x samples are now traceable/replayable. I'm going to work on GL 4.x support after 3.x is solid. I'm seeing very few GL 4.x games in the wild right now, honestly.

We can now snapshot and restore the following new state:

  • Transform feedback

  • The contents of default framebuffers (except for MSAA default framebuffers - on the todo list). Blitted to a GL texture internally, then saved.

  • Renderbuffer contents (both MSAA and regular). Also blitted to a GL texture internally, then saved.

  • Full support for lossless save/restore of MSAA textures (both 2D and 2D_ARRAY). We now support all GL 4.x texture types for save/restore except for cubemap arrays, which should be easy.

  • Saving and restoring regular and MSAA stencil buffers is now fully supported. Saving/restoring stencil and MSAA stencil in GL 3.x requires a lot of quad passes, but it works. We use a temporary separate context (and sharelists) to do the blits and quad passes to avoid perturbing the app's context state.

  • UI now supports viewing depth/stencil buffers. Currently, the UI can only display the individual bytes of the depth/stencil texture data - LunarG is working on fixing this for us.

  • Arbitrary trace packets can now be composed on the fly and individually supplied to the GL replayer class. (Before, it only would read packets from a trace stream.)

  • Just added today: Support for pixel unpack buffers (for glTexImage2D, etc. calls) to tracer and replayer for Steam 10ft. Will check this in tomorrow or Wednesday.

  • Support for multisampled texture arrays is in but still needs to be tested. I just started working on that, but got sidetracked by Steam 10ft UI support.

  • Improved GLenum to string conversion. Still some work to do here: the GLenum spec data which guides the conversion is crappy. Regal probably has better code to do this, which we might grab at some point.