Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iconv/wctomb: fix memory corruption related to CURRENT_UTF8 #511

Merged

Conversation

Mrmaxmeier
Copy link
Contributor

@Mrmaxmeier Mrmaxmeier commented Jul 8, 2024

Hi,

I found an issue with wasi-libc's iconv() implementation: This reproducer shows that iconv() performs an out-of-bounds write to icv_out[-1]:

#include <stdio.h>
#include <stdint.h>
#include <iconv.h>

int main(int argc, char **argv)
{
    struct scratch
    {
        uint32_t crumplezone;
        uint32_t outbuf[1];
    } scratch = {0};

    unsigned char Data[] = {
        0x00, 0x00, 0xff, 0xff,
        0x00, 0x00, 0x00, 0x41};
    char *icv_in = (char *)Data;
    size_t icv_inlen = sizeof(Data);

    char *icv_out = (char *)&scratch.outbuf;
    size_t icv_outlen = sizeof(scratch.outbuf);

    iconv_t cd = iconv_open("UTF-8", "UCS-4");

    printf("crumple = 0x%x\n", scratch.crumplezone);
    printf("iconv(cd=%p, icv_in=%p, icv_inlen=%ld, icv_out=%p, icv_outlen=%ld)\n",
           cd, icv_in, icv_inlen, icv_out, icv_outlen);

    int ret = iconv(cd, &icv_in, &icv_inlen, &icv_out, &icv_outlen);

    printf("   -> ret=%d,      icv_in=%p, icv_inlen=%ld, icv_out=%p, icv_outlen=%ld\n",
           ret, icv_in, icv_inlen, icv_out, icv_outlen);
    printf("crumple = 0x%x\n", scratch.crumplezone);
}
$ wasi-sdk-22.0/bin/clang repro.c && wasmtime run a.out
crumple = 0x0
iconv(cd=0x33730, icv_in=0x336a8, icv_inlen=8, icv_out=0x336d4, icv_outlen=12)
   -> ret=0,      icv_in=0x336b0, icv_inlen=0, icv_out=0x336d4, icv_outlen=12
crumple = 0x41000000

The underlying issue here is an interaction between iconv(), wctomb(), and wasi-libc's inconsistent implementation of the CURRENT_UTF8 macro. iconv calls out to wctomb here and where it doesn't properly handle the error case:

case UTF_8:
  if (*outb < 4) {
    char tmp[4];
    k = wctomb_utf8(tmp, c);
    if (*outb < k) goto toobig;
    memcpy(*out, tmp, k);
  } else k = wctomb_utf8(*out, c); // c = 0xffff
  *out += k; // k = -1

The call to wctomb is intended to never fail, but wctomb's behavior depends on the current locale. iconv always sets this locale to UTF8:

locale_t *ploc = &CURRENT_LOCALE, loc = *ploc;
if (!in || !*in || !*inb) return 0;
*ploc = UTF8_LOCALE;
// [...] all of iconv's logic
*ploc = loc;
return x;

wctomb and others access the current locale via CURRENT_UTF8. The implementation in wasi-libc does not work properly though. CURRENT_UTF8 doesn't reflect updates to CURRENT_LOCALE:

#define CURRENT_LOCALE \
    (*({ \
        if (!libc.current_locale) { \
            libc.current_locale = &libc.global_locale; \
        } \
        &libc.current_locale; \
    }))

#define CURRENT_UTF8 (!!libc.global_locale.cat[LC_CTYPE])

This PR fixes that issue, though it would be nice to refactor iconv to not rely on a global "CURRENT_LOCALE" variable..

Thanks!

NOTE: While most of the affected code for this issue is from upstream musl, the broken CURRENT_UTF8 macro logic is only present in wasi-libc. The reproducer above works fine on native musl, glibc and emscripten-based WASM builds.

@Mrmaxmeier Mrmaxmeier force-pushed the fix-iconv-wctomb-locale-issue branch from d5ed1c2 to f946043 Compare July 8, 2024 13:59
@sunfishcode sunfishcode merged commit 3f43ea9 into WebAssembly:main Jul 8, 2024
8 checks passed
@sunfishcode
Copy link
Member

Thanks!

@Mrmaxmeier Mrmaxmeier deleted the fix-iconv-wctomb-locale-issue branch July 8, 2024 18:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants