Skip to content

(Xbox 1) Fixed low volume issue#29

Merged
inactive123 merged 13 commits intolibretro:masterfrom
freakdave:master
Jul 22, 2012
Merged

(Xbox 1) Fixed low volume issue#29
inactive123 merged 13 commits intolibretro:masterfrom
freakdave:master

Conversation

@freakdave
Copy link
Copy Markdown
Contributor

(Xbox 1) Fixed low volume issue

inactive123 pushed a commit that referenced this pull request Jul 22, 2012
(Xbox 1) Fixed low volume issue
@inactive123 inactive123 merged commit 3ddb4a9 into libretro:master Jul 22, 2012
inactive123 pushed a commit that referenced this pull request Apr 15, 2018
LibretroAdmin added a commit that referenced this pull request Apr 21, 2026
The RGUI menu blit path in drm_gfx.c and sunxi_gfx.c (audit
findings #29 and #34, byte-for-byte clones of each other) had
several bugs that compounded into wrong menu colors, an
unbounded VLA, and -- on drm_gfx -- a malloc per menu frame
with no matching free.

Bugs in the bit extraction
==========================

The conversion treated the uint16_t source as if its layout
were RGB565 with naive 8-bit-channel shifts:

    R = (src_pix << 8) & 0x00FF0000;
    G = (src_pix << 4) & 0x0000FF00;
    B = (src_pix << 0) & 0x000000FF;
    line[j] = R | G | B;

Two problems with that:

1. The RGUI default output on these platforms is RGBA4444, not
   RGB565 (rgui_set_pixel_format_function in menu/drivers/rgui.c
   selects RGBA4444 for the unmatched-driver-name fallback that
   covers both drm and sunxi). The channel layout is
   R=15..12, G=11..8, B=7..4, A=3..0.

2. Even taken as RGB565, the shifts are wrong: they place 4-bit
   channel data into the low nibble of each 8-bit destination
   byte, so e.g. 0x000F (RGBA black, fully opaque) decodes to
   0x0000000F -- alpha bleeds into the blue byte. 0x00FF (RGBA
   pure blue) decodes to 0x00000FFF, leaking alpha across the
   bottom of green into blue.

Concretely, with the old code:

    rgba4444 input | old output  | correct XRGB8888
    ---------------+-------------+-----------------
    0xF00F (red)   | 0x00F0000F  | 0x00FF0000
    0x0F0F (green) | 0x000FF00F  | 0x0000FF00
    0x00FF (blue)  | 0x00000FFF  | 0x000000FF
    0x000F (black) | 0x0000000F  | 0x00000000
    0x888F (gray)  | 0x0088888F  | 0x00888888
    0xFFFF (white) | 0x00FFFFFF  | 0x00FFFFFF  (only correct case)

Pure white was the one input that came out right, which is
likely why the bug survived: a quick visual sanity check on
fully-saturated menu chrome could look plausible while every
other color was off and had stray bits in the unused alpha
byte of XRGB8888.

Fix: treat src_pix as RGBA4444, extract each 4-bit channel,
and expand to 8 bits via nibble replication (n | (n << 4)),
which is the standard 4-bit-to-8-bit upscaling that maps 0
to 0x00 and 0xF to 0xFF linearly.

The (currently ignored) rgb32 parameter
=======================================

Both functions accepted a `bool rgb32` from the caller and
discarded it. The set_texture_frame contract is: when rgb32
is true, the source is XRGB8888 already and no conversion
is required. The new code branches on rgb32 and does a
per-row memcpy in that case. RGUI does not currently exercise
this path on these drivers, but other set_texture_frame
callers do, and the parameter is part of the interface.

Per-frame malloc leak (drm only)
================================

drm_set_texture_frame did:

    char *frame_output = malloc(dst_pitch * height);
    /* ... fill it ... */
    drm_surface_update(_drmvars, frame_output, _drmvars->menu_surface);

drm_surface_update memcpys row-by-row from frame_output into
the dumb buffer at surface->pages[surface->flip_page].buf.map
(drm_gfx.c:328-331) and returns. frame_output is never freed
-- one full XRGB8888-frame leak per menu refresh.

The intermediate buffer is unnecessary. The destination is a
CPU-mapped DRM dumb buffer; we can write the converted pixels
directly into it. surface->pitch is the visible bytes per row
(src_width * bpp, set at drm_gfx.c:243) and matches the dst
stride for the menu surface, which was created with
pitch == width * 4 == surface->pitch. After writing in place,
call drm_page_flip directly -- this is the same sequence
drm_surface_update performs minus the now-redundant memcpy.

Stack-allocated VLA
===================

Both files declared

    uint32_t line[dst_width];

as a per-row scratch buffer, where dst_width was the menu
output width (drm) or the display xres (sunxi). On a 4K-wide
output that's 16 KiB on the stack per call, on platforms
(embedded ARM Linux) where thread stacks may be small. The
direct-into-mapped-buffer rewrite removes the need for any
scratch line entirely; pixels are produced one at a time
into their final destination slot.

sunxi: stale-stack read on width < xres
=======================================

sunxi_set_texture_frame additionally had a smaller bug: the
inner loop only filled line[0..width-1], but the row memcpy
copied dst_pitch == xres*4 bytes from line[]. When the source
was narrower than the display (e.g. RGUI rendering at a lower
resolution upscaled by the layer), the copy read uninitialized
stack bytes for pixels [width..xres-1] and pushed them into
the framebuffer. With the new direct-write path, only the
`width` actually-meaningful pixels per row are written; pixels
beyond `width` retain whatever was previously in the dumb
buffer (typically the prior frame's same-row content, which is
the correct behavior for a partial-width menu blit).

Defensive clamps on width/height
================================

The destination buffers (drm dumb buffer, sunxi framebuffer
page) are sized once and never resized while the menu is
active. The original drm code was structured around an
intermediate malloc whose size was driven by the surface's
src_height and total_pitch, which silently truncated rows if
the caller passed a larger frame than the surface was created
for. The original sunxi code allocated nothing and would have
walked off the end of pages[0] in the same scenario.

To preserve the no-overrun property of the new direct-write
path, both functions now clamp `width` and `height` against
the destination's known bounds (drm: surface->src_width /
surface->src_height; sunxi: sunxi_disp->xres / src_height)
before the loops. In normal operation these clamps are no-ops;
they exist purely so a future caller passing oversized
dimensions writes a truncated frame rather than overrunning
the mapped region.

Validation
==========

- Both functions compile cleanly under gcc -Wall -Wextra in
  isolation (no unused vars, no unused parameters beyond the
  pre-existing `alpha`).
- Conversion math validated against a tabulated reference for
  the six characteristic RGBA4444 inputs above, including the
  diagnostic that pure-white is the unique input where the old
  and new implementations agree.
- drm path: surface->pitch and surface->total_pitch semantics
  verified against drm_surface_setup (drm_gfx.c:222-278) and
  drm_surface_update (316-338). For the menu surface, both are
  width*4 because drm_surface_setup is called with
  pitch=width*4, bpp=4, src_width=width.
- drm flip semantics verified: drm_page_flip uses
  pages[flip_page] then toggles flip_page, so writing into
  pages[flip_page].buf.map before calling drm_page_flip puts
  the new pixels on the page that is about to be committed,
  matching what drm_surface_update did.
- sunxi flip semantics unchanged: same
  sunxi_layer_set_rgb_input_buffer call follows the in-place
  write, just as before.
- Behavioral validation via a synthetic harness that mocks the
  DRM dumb-buffer and sunxi page interfaces, drops in both the
  pre- and post-patch versions of each function side-by-side,
  and runs them against identical inputs (41 assertions across
  12 cases, including: exhaustive 4096-combination nibble
  matrix, 1000-frame leak counter, red-zone canaries to detect
  oversized-frame overruns, and direct reproduction of the
  original alpha-bleed and stack-garbage bugs against the
  unfixed code). All assertions pass on the post-patch code;
  the original-code reproductions all fail in the expected
  ways.
- Not testable in this sandbox: actual on-device rendering on
  Raspberry Pi (drm) or Allwinner SoC (sunxi). Both drivers
  are conditionally compiled and have no automated tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants