@Cao-Wuhui Cao-Wuhui commented Sep 24, 2025

Summary

  • Add ASan interceptors for wcscpy/wcsncpy on all platforms.
  • Enable wcscat/wcsncat on Windows (already enabled on POSIX via sanitizer_common).

Motivation

  • Use of wchar string APIs is common on Windows; improve parity with char* string checks.

Changes

  • Implement wcscpy/wcsncpy in asan_interceptors.cpp; check overlap and mark read/write ranges in bytes.
  • wcsncpy: compute write size in bytes (size * sizeof(wchar_t)) to avoid missed overflows when sizeof(wchar_t) != 1.
  • Use MaybeRealWcsnlen when available to bound reads.
  • Register Windows static thunk for wcscpy/wcsncpy/wcscat/wcsncat; rely on sanitizer_common interceptors for wcscat/wcsncat.
  • Tests: add wcscpy/wcsncpy/wcscat/wcsncat; flush stdout before crash; use resilient FileCheck patterns (reuse [[ADDR]], wildcard for function suffixes and paths, flexible line numbers).
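The byte-size arithmetic described for wcsncpy in the list above can be sketched in isolation. This is a minimal standalone illustration, not the interceptor code: `BoundedWcsLen`, `WcsncpyReadBytes`, and `WcsncpyWriteBytes` are hypothetical helper names, and the hand-written bounded loop only stands in for the runtime's `MaybeRealWcsnlen`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the runtime's bounded length helper:
// counts wide characters up to the terminator, never looking past maxlen.
static std::size_t BoundedWcsLen(const wchar_t *s, std::size_t maxlen) {
  assert(s != nullptr);
  std::size_t n = 0;
  while (n < maxlen && s[n] != L'\0')
    ++n;
  return n;
}

// Bytes read from the source: the string plus its terminator,
// capped at `count` elements, then scaled from elements to bytes.
static std::size_t WcsncpyReadBytes(const wchar_t *from, std::size_t count) {
  return std::min(count, BoundedWcsLen(from, count) + 1) * sizeof(wchar_t);
}

// Bytes written to the destination: always `count` elements, because
// wcsncpy pads the remainder with L'\0'. Scaling by sizeof(wchar_t) is
// the step that keeps the checked range from being too small.
static std::size_t WcsncpyWriteBytes(std::size_t count) {
  return count * sizeof(wchar_t);
}
```

On a platform where `wchar_t` is 4 bytes, a call with a count of 15 gives a 60-byte write range, which is consistent with the `WRITE of size 60` line in the Darwin report quoted later in this thread.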

Testing

  • AArch64 Linux: new tests pass with check-asan locally.

Follow-up to and based on prior work in PR #90909 (author: branh, Microsoft); builds on that work and addresses review feedback. Thanks!


Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Member

llvmbot commented Sep 24, 2025

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Yixuan Cao (Cao-Wuhui)

Full diff: https://github.com/llvm/llvm-project/pull/160493.diff

8 Files Affected:

  • (modified) compiler-rt/lib/asan/asan_interceptors.cpp (+42)
  • (modified) compiler-rt/lib/asan/asan_interceptors.h (+1)
  • (modified) compiler-rt/lib/asan/asan_win_static_runtime_thunk.cpp (+4)
  • (modified) compiler-rt/lib/sanitizer_common/sanitizer_platform_interceptors.h (+1-1)
  • (added) compiler-rt/test/asan/TestCases/wcscat.cpp (+26)
  • (added) compiler-rt/test/asan/TestCases/wcscpy.cpp (+23)
  • (added) compiler-rt/test/asan/TestCases/wcsncat.cpp (+27)
  • (added) compiler-rt/test/asan/TestCases/wcsncpy.cpp (+25)
diff --git a/compiler-rt/lib/asan/asan_interceptors.cpp b/compiler-rt/lib/asan/asan_interceptors.cpp
index 7c9a08b9083a2..2eb02fe4f0d87 100644
--- a/compiler-rt/lib/asan/asan_interceptors.cpp
+++ b/compiler-rt/lib/asan/asan_interceptors.cpp
@@ -65,6 +65,15 @@ static inline uptr MaybeRealStrnlen(const char *s, uptr maxlen) {
   return internal_strnlen(s, maxlen);
 }
 
+static inline uptr MaybeRealWcsnlen(const wchar_t *s, uptr maxlen) {
+#if SANITIZER_INTERCEPT_WCSNLEN
+  if (REAL(wcsnlen)) {
+    return REAL(wcsnlen)(s, maxlen);
+  }
+#endif
+  return internal_wcsnlen(s, maxlen);
+}
+
 void SetThreadName(const char *name) {
   AsanThread *t = GetCurrentThread();
   if (t)
@@ -570,6 +579,20 @@ INTERCEPTOR(char *, strcpy, char *to, const char *from) {
   return REAL(strcpy)(to, from);
 }
 
+INTERCEPTOR(wchar_t *, wcscpy, wchar_t *to, const wchar_t *from) {
+  void *ctx;
+  ASAN_INTERCEPTOR_ENTER(ctx, wcscpy);
+  if (!TryAsanInitFromRtl())
+    return REAL(wcscpy)(to, from);
+  if (flags()->replace_str) {
+    uptr from_size = (internal_wcslen(from) + 1) * sizeof(wchar_t);
+    CHECK_RANGES_OVERLAP("wcscpy", to, from_size, from, from_size);
+    ASAN_READ_RANGE(ctx, from, from_size);
+    ASAN_WRITE_RANGE(ctx, to, from_size);
+  }
+  return REAL(wcscpy)(to, from);
+}
+
 // Windows doesn't always define the strdup identifier,
 // and when it does it's a macro defined to either _strdup
 // or _strdup_dbg, _strdup_dbg ends up calling _strdup, so
@@ -633,6 +656,20 @@ INTERCEPTOR(char*, strncpy, char *to, const char *from, usize size) {
   return REAL(strncpy)(to, from, size);
 }
 
+INTERCEPTOR(wchar_t *, wcsncpy, wchar_t *to, const wchar_t *from, uptr size) {
+  void *ctx;
+  ASAN_INTERCEPTOR_ENTER(ctx, wcsncpy);
+  AsanInitFromRtl();
+  if (flags()->replace_str) {
+    uptr from_size =
+        Min(size, MaybeRealWcsnlen(from, size) + 1) * sizeof(wchar_t);
+    CHECK_RANGES_OVERLAP("wcsncpy", to, from_size, from, from_size);
+    ASAN_READ_RANGE(ctx, from, from_size);
+    ASAN_WRITE_RANGE(ctx, to, size * sizeof(wchar_t));
+  }
+  return REAL(wcsncpy)(to, from, size);
+}
+
 template <typename Fn>
 static ALWAYS_INLINE auto StrtolImpl(void *ctx, Fn real, const char *nptr,
                                      char **endptr, int base)
@@ -809,6 +846,11 @@ void InitializeAsanInterceptors() {
   ASAN_INTERCEPT_FUNC(strncat);
   ASAN_INTERCEPT_FUNC(strncpy);
   ASAN_INTERCEPT_FUNC(strdup);
+
+  // Intercept wcs* functions.
+  ASAN_INTERCEPT_FUNC(wcscpy);
+  ASAN_INTERCEPT_FUNC(wcsncpy);
+
 #  if ASAN_INTERCEPT___STRDUP
   ASAN_INTERCEPT_FUNC(__strdup);
 #endif
diff --git a/compiler-rt/lib/asan/asan_interceptors.h b/compiler-rt/lib/asan/asan_interceptors.h
index 3e2386eaf8092..33d4210b5815c 100644
--- a/compiler-rt/lib/asan/asan_interceptors.h
+++ b/compiler-rt/lib/asan/asan_interceptors.h
@@ -129,6 +129,7 @@ DECLARE_REAL(char*, strchr, const char *str, int c)
 DECLARE_REAL(SIZE_T, strlen, const char *s)
 DECLARE_REAL(char*, strncpy, char *to, const char *from, SIZE_T size)
 DECLARE_REAL(SIZE_T, strnlen, const char *s, SIZE_T maxlen)
+DECLARE_REAL(SIZE_T, wcsnlen, const wchar_t *s, SIZE_T maxlen)
 DECLARE_REAL(char*, strstr, const char *s1, const char *s2)
 
 #  if !SANITIZER_APPLE
diff --git a/compiler-rt/lib/asan/asan_win_static_runtime_thunk.cpp b/compiler-rt/lib/asan/asan_win_static_runtime_thunk.cpp
index 4a69b66574039..4cf6214d1c4e7 100644
--- a/compiler-rt/lib/asan/asan_win_static_runtime_thunk.cpp
+++ b/compiler-rt/lib/asan/asan_win_static_runtime_thunk.cpp
@@ -63,6 +63,10 @@ INTERCEPT_LIBRARY_FUNCTION_ASAN(strpbrk);
 INTERCEPT_LIBRARY_FUNCTION_ASAN(strspn);
 INTERCEPT_LIBRARY_FUNCTION_ASAN(strstr);
 INTERCEPT_LIBRARY_FUNCTION_ASAN(strtok);
+INTERCEPT_LIBRARY_FUNCTION(wcscat);
+INTERCEPT_LIBRARY_FUNCTION(wcscpy);
+INTERCEPT_LIBRARY_FUNCTION(wcsncat);
+INTERCEPT_LIBRARY_FUNCTION(wcsncpy);
 INTERCEPT_LIBRARY_FUNCTION_ASAN(wcslen);
 INTERCEPT_LIBRARY_FUNCTION_ASAN(wcsnlen);
 
diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_platform_interceptors.h b/compiler-rt/lib/sanitizer_common/sanitizer_platform_interceptors.h
index 29987decdff45..5173389e6a14d 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_platform_interceptors.h
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_platform_interceptors.h
@@ -551,7 +551,7 @@ SANITIZER_WEAK_IMPORT void *aligned_alloc(__sanitizer::usize __alignment,
 #define SANITIZER_INTERCEPT_MALLOC_USABLE_SIZE (!SI_MAC && !SI_NETBSD)
 #define SANITIZER_INTERCEPT_MCHECK_MPROBE SI_LINUX_NOT_ANDROID
 #define SANITIZER_INTERCEPT_WCSLEN 1
-#define SANITIZER_INTERCEPT_WCSCAT SI_POSIX
+#define SANITIZER_INTERCEPT_WCSCAT 1
 #define SANITIZER_INTERCEPT_WCSDUP SI_POSIX
 #define SANITIZER_INTERCEPT_SIGNAL_AND_SIGACTION (!SI_WINDOWS && SI_NOT_FUCHSIA)
 #define SANITIZER_INTERCEPT_BSD_SIGNAL SI_ANDROID
diff --git a/compiler-rt/test/asan/TestCases/wcscat.cpp b/compiler-rt/test/asan/TestCases/wcscat.cpp
new file mode 100644
index 0000000000000..dcdff88c18ef1
--- /dev/null
+++ b/compiler-rt/test/asan/TestCases/wcscat.cpp
@@ -0,0 +1,26 @@
+// RUN: %clangxx_asan -O0 %s -o %t && not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK
+// RUN: %clangxx_asan -O1 %s -o %t && not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK
+// RUN: %clangxx_asan -O2 %s -o %t && not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK
+// RUN: %clangxx_asan -O3 %s -o %t && not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK
+
+#include <stdio.h>
+#include <wchar.h>
+
+int main() {
+  wchar_t *start = L"X means ";
+  wchar_t *append = L"dog";
+  wchar_t goodDst[12];
+  wcscpy(goodDst, start);
+  wcscat(goodDst, append);
+
+  wchar_t badDst[9];
+  wcscpy(badDst, start);
+  printf("Good so far.\n");
+  // CHECK: Good so far.
+  fflush(stdout);
+  wcscat(badDst, append); // Boom!
+  // CHECK: ERROR: AddressSanitizer: stack-buffer-overflow on address [[ADDR:0x[0-9a-f]+]] at pc {{0x[0-9a-f]+}} bp {{0x[0-9a-f]+}} sp {{0x[0-9a-f]+}}
+  // CHECK: WRITE of size {{[0-9]+}} at [[ADDR:0x[0-9a-f]+]] thread T0
+  // CHECK: #0 [[ADDR:0x[0-9a-f]+]] in wcscat{{.*}}sanitizer_common_interceptors.inc:{{[0-9]+}}
+  printf("Should have failed with ASAN error.\n");
+}
\ No newline at end of file
diff --git a/compiler-rt/test/asan/TestCases/wcscpy.cpp b/compiler-rt/test/asan/TestCases/wcscpy.cpp
new file mode 100644
index 0000000000000..414d83303a960
--- /dev/null
+++ b/compiler-rt/test/asan/TestCases/wcscpy.cpp
@@ -0,0 +1,23 @@
+// RUN: %clangxx_asan -O0 %s -o %t && not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK
+// RUN: %clangxx_asan -O1 %s -o %t && not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK
+// RUN: %clangxx_asan -O2 %s -o %t && not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK
+// RUN: %clangxx_asan -O3 %s -o %t && not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK
+
+#include <stdio.h>
+#include <wchar.h>
+
+int main() {
+  wchar_t *src = L"X means dog";
+  wchar_t goodDst[12];
+  wcscpy(goodDst, src);
+
+  wchar_t badDst[7];
+  printf("Good so far.\n");
+  // CHECK: Good so far.
+  fflush(stdout);
+  wcscpy(badDst, src); // Boom!
+  // CHECK:ERROR: AddressSanitizer: stack-buffer-overflow on address [[ADDR:0x[0-9a-f]+]] at pc {{0x[0-9a-f]+}} bp {{0x[0-9a-f]+}} sp {{0x[0-9a-f]+}}
+  // CHECK: WRITE of size {{[0-9]+}} at [[ADDR:0x[0-9a-f]+]] thread T0
+  // CHECK: #0 [[ADDR:0x[0-9a-f]+]] in wcscpy{{.*}}asan_interceptors.cpp:{{[0-9]+}}
+  printf("Should have failed with ASAN error.\n");
+}
\ No newline at end of file
diff --git a/compiler-rt/test/asan/TestCases/wcsncat.cpp b/compiler-rt/test/asan/TestCases/wcsncat.cpp
new file mode 100644
index 0000000000000..3ab7fc8f55d63
--- /dev/null
+++ b/compiler-rt/test/asan/TestCases/wcsncat.cpp
@@ -0,0 +1,27 @@
+// RUN: %clangxx_asan -O0 %s -o %t && not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK
+// RUN: %clangxx_asan -O1 %s -o %t && not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK
+// RUN: %clangxx_asan -O2 %s -o %t && not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK
+// RUN: %clangxx_asan -O3 %s -o %t && not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK
+
+#include <stdio.h>
+#include <wchar.h>
+
+int main() {
+  wchar_t *start = L"X means ";
+  wchar_t *append = L"dog";
+  wchar_t goodDst[15];
+  wcscpy(goodDst, start);
+  wcsncat(goodDst, append, 5);
+
+  wchar_t badDst[11];
+  wcscpy(badDst, start);
+  wcsncat(badDst, append, 1);
+  printf("Good so far.\n");
+  // CHECK: Good so far.
+  fflush(stdout);
+  wcsncat(badDst, append, 3); // Boom!
+  // CHECK: ERROR: AddressSanitizer: stack-buffer-overflow on address [[ADDR:0x[0-9a-f]+]] at pc {{0x[0-9a-f]+}} bp {{0x[0-9a-f]+}} sp {{0x[0-9a-f]+}}
+  // CHECK: WRITE of size {{[0-9]+}} at [[ADDR:0x[0-9a-f]+]] thread T0
+  // CHECK: #0 [[ADDR:0x[0-9a-f]+]] in wcsncat{{.*}}sanitizer_common_interceptors.inc:{{[0-9]+}}
+  printf("Should have failed with ASAN error.\n");
+}
\ No newline at end of file
diff --git a/compiler-rt/test/asan/TestCases/wcsncpy.cpp b/compiler-rt/test/asan/TestCases/wcsncpy.cpp
new file mode 100644
index 0000000000000..6177b72990a0a
--- /dev/null
+++ b/compiler-rt/test/asan/TestCases/wcsncpy.cpp
@@ -0,0 +1,25 @@
+// RUN: %clangxx_asan -O0 %s -o %t && not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK
+// RUN: %clangxx_asan -O1 %s -o %t && not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK
+// RUN: %clangxx_asan -O2 %s -o %t && not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK
+// RUN: %clangxx_asan -O3 %s -o %t && not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK
+
+#include <stdio.h>
+#include <wchar.h>
+
+int main() {
+  wchar_t *src = L"X means dog";
+  wchar_t goodDst[12];
+  wcsncpy(goodDst, src, 12);
+
+  wchar_t badDst[7];
+  wcsncpy(badDst, src, 7); // This should still work.
+  printf("Good so far.\n");
+  // CHECK: Good so far.
+  fflush(stdout);
+
+  wcsncpy(badDst, src, 15); // Boom!
+  // CHECK:ERROR: AddressSanitizer: stack-buffer-overflow on address [[ADDR:0x[0-9a-f]+]] at pc {{0x[0-9a-f]+}} bp {{0x[0-9a-f]+}} sp {{0x[0-9a-f]+}}
+  // CHECK: WRITE of size {{[0-9]+}} at [[ADDR:0x[0-9a-f]+]] thread T0
+  // CHECK: #0 [[ADDR:0x[0-9a-f]+]] in wcsncpy{{.*}}asan_interceptors.cpp:{{[0-9]+}}
+  printf("Should have failed with ASAN error.\n");
+}
\ No newline at end of file

- Implement wchar interceptors; register Windows thunk.
- wcsncpy: compute write size in bytes (size * sizeof(wchar_t)) to avoid missed overflows when sizeof(wchar_t) != 1.
- Harden tests (fflush, resilient FileCheck).

Follow-up to PR llvm#90909: builds on that work and addresses review feedback.
Refs: llvm#90909
@Cao-Wuhui
Contributor Author

Hi! First-time contributor here. Could a maintainer please approve CI workflows (Build and Test Linux/Windows) to run?
This PR only touches compiler-rt/asan and tests; no changes under .github/workflows. Thanks!

 #define SANITIZER_INTERCEPT_MCHECK_MPROBE SI_LINUX_NOT_ANDROID
 #define SANITIZER_INTERCEPT_WCSLEN 1
-#define SANITIZER_INTERCEPT_WCSCAT SI_POSIX
+#define SANITIZER_INTERCEPT_WCSCAT 1
Contributor

This change does more than "Enable wcscat/wcsncat on Windows" - it is enabling it on all platforms (possibly including some that don't have wcscat).

Contributor Author

Thanks for the catch! I have restricted the enablement to avoid enabling wcscat/wcsncat on non-POSIX/non-Windows platforms. Let me know if you prefer an alternative gate.
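One way such a narrower gate could be written is sketched below. The exact expression that landed is not shown in this thread, so this is an assumption. The `DEMO_*` macros are hypothetical, simplified stand-ins defined locally so the snippet compiles on its own; they are not the real `SI_POSIX`/`SI_WINDOWS` definitions from sanitizer_platform_interceptors.h.

```cpp
// Hypothetical, simplified stand-ins for the platform macros. The real
// header derives its SI_* values from the sanitizer platform detection.
#if defined(_WIN32)
#  define DEMO_SI_WINDOWS 1
#  define DEMO_SI_POSIX 0
#elif defined(__unix__) || defined(__APPLE__)
#  define DEMO_SI_WINDOWS 0
#  define DEMO_SI_POSIX 1
#else
#  define DEMO_SI_WINDOWS 0
#  define DEMO_SI_POSIX 0
#endif

// Enable the interceptor only where the platform is known to provide
// wcscat/wcsncat, instead of unconditionally defining the gate to 1.
#define DEMO_INTERCEPT_WCSCAT (DEMO_SI_POSIX || DEMO_SI_WINDOWS)

// Small accessor so the gate can be inspected from ordinary code.
constexpr bool InterceptsWcscat() { return DEMO_INTERCEPT_WCSCAT != 0; }
```

With a gate of this shape, a platform that is neither POSIX nor Windows keeps the interceptor disabled, which is the concern raised in the review comment.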


github-actions bot commented Sep 24, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.


static inline uptr MaybeRealWcsnlen(const wchar_t *s, uptr maxlen) {
#if SANITIZER_INTERCEPT_WCSNLEN
  if (REAL(wcsnlen)) {
Contributor

Please follow https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements
(even though existing code, such as MaybeRealStrnlen() above, does not follow this convention)

Contributor Author

Don’t use braces on simple single-statement bodies of if/else/loop statements
https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements

Done: removed braces in MaybeRealWcsnlen() and made MaybeRealStrnlen() consistent. Submitted as a fixup; will autosquash before landing. Thanks @thurstond.
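For readers unfamiliar with the helper under discussion, the "use the real function when available, otherwise fall back" pattern can be sketched as follows, written in the brace-free style requested above. This is a standalone illustration under stated assumptions: `g_real_wcsnlen` is a hypothetical stand-in for `REAL(wcsnlen)`, and `InternalWcsnlen` is a hypothetical stand-in for `internal_wcsnlen`.

```cpp
#include <cassert>
#include <cstddef>

using WcsnlenFn = std::size_t (*)(const wchar_t *, std::size_t);

// Hypothetical stand-in for REAL(wcsnlen): stays null until the
// underlying libc function has been resolved.
static WcsnlenFn g_real_wcsnlen = nullptr;

// Hypothetical stand-in for internal_wcsnlen: a plain bounded scan.
static std::size_t InternalWcsnlen(const wchar_t *s, std::size_t maxlen) {
  assert(s != nullptr);
  std::size_t n = 0;
  while (n < maxlen && s[n] != L'\0')
    ++n;
  return n;
}

// Prefer the resolved function, else fall back to the internal scan.
// Single-statement bodies carry no braces, per the LLVM coding standard.
static std::size_t MaybeRealWcsnlenSketch(const wchar_t *s,
                                          std::size_t maxlen) {
  if (g_real_wcsnlen)
    return g_real_wcsnlen(s, maxlen);
  return InternalWcsnlen(s, maxlen);
}
```

Either path returns the same bounded length, so callers do not need to know whether the real function has been resolved yet.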

  if (!TryAsanInitFromRtl())
    return REAL(wcscpy)(to, from);
  if (flags()->replace_str) {
    uptr from_size = (internal_wcslen(from) + 1) * sizeof(wchar_t);
Contributor

from_size is also used as the to_size. Could it be called size instead?

Contributor Author

@Cao-Wuhui Cao-Wuhui Sep 25, 2025

Renamed from_size to size in the wcscpy interceptor, since it’s used for both the source and destination checks. Submitted as a fixup; will autosquash before landing. Thanks @thurstond.
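The reason a single size serves both checks is that wcscpy reads and writes the same number of bytes. A minimal sketch of that idea, together with a generic half-open range overlap test, is shown below. `WcscpyBytes` and `RangesOverlapSketch` are hypothetical names for illustration; they are not the runtime's `CHECK_RANGES_OVERLAP` macro.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cwchar>

// Byte count shared by the source read and the destination write of a
// wcscpy: every character plus the terminator, scaled to bytes.
static std::size_t WcscpyBytes(const wchar_t *from) {
  assert(from != nullptr);
  return (std::wcslen(from) + 1) * sizeof(wchar_t);
}

// Two half-open byte ranges [a, a+a_len) and [b, b+b_len) overlap
// unless one of them ends at or before the start of the other.
static bool RangesOverlapSketch(const void *a, std::size_t a_len,
                                const void *b, std::size_t b_len) {
  const auto pa = reinterpret_cast<std::uintptr_t>(a);
  const auto pb = reinterpret_cast<std::uintptr_t>(b);
  return !(pa + a_len <= pb || pb + b_len <= pa);
}
```

Copying a string onto a later position inside its own buffer is the typical case such a check is meant to flag, while two separate arrays never overlap.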

@thurstond
Contributor

> Hi! First-time contributor here. Could a maintainer please approve CI workflows (Build and Test Linux/Windows) to run? This PR only touches compiler-rt/asan and tests; no changes under .github/workflows. Thanks!

Done. Please fix the CI findings:

…on Windows

Signed-off-by: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
…on Windows

Signed-off-by: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
…on Windows

Signed-off-by: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
@Cao-Wuhui
Contributor Author

Cao-Wuhui commented Sep 26, 2025

> > Hi! First-time contributor here. Could a maintainer please approve CI workflows (Build and Test Linux/Windows) to run? This PR only touches compiler-rt/asan and tests; no changes under .github/workflows. Thanks!
>
> Done. Please fix the CI findings:

Thanks for approving CI. I’ve addressed the reported issues:

  • Windows build: fixed static thunk macro usage by switching wcscat/wcsncat/wcscpy/wcsncpy to INTERCEPT_LIBRARY_FUNCTION_ASAN, which expands to the required 2‑arg form.
  • Style: applied LLVM style (single-statement if without braces).

@thurstond Could you please re-run CI for this PR? If anything else shows up, I’ll follow up quickly.

…on Windows

Signed-off-by: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
@thurstond
Contributor

> @thurstond Could you please re-run CI for this PR? If anything else shows up, I’ll follow up quickly.

I've re-run it. Some minor formatting nits: https://github.com/llvm/llvm-project/actions/runs/18042370940/job/51521406858?pr=160493

…on Windows

Code style.

Signed-off-by: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
@Cao-Wuhui
Contributor Author

> > @thurstond Could you please re-run CI for this PR? If anything else shows up, I’ll follow up quickly.
>
> I've re-run it. Some minor formatting nits: https://github.com/llvm/llvm-project/actions/runs/18042370940/job/51521406858?pr=160493

Thanks for re-running CI! I checked the formatting nits locally with:
python3 ./clang/tools/clang-format/git-clang-format --binary ./build/bin/clang-format --diff 5031c163ff82 HEAD
It reported the pointer-spacing and preprocessor-indentation issues; I’ve applied the fixes and verified the diff is now clean. If anything else pops up, I’ll address it quickly.

Contributor

@thurstond thurstond left a comment

Thanks for the patch!

@thurstond
Contributor

Please use a public email address per https://llvm.org/docs/DeveloperPolicy.html#email-addresses:

> The LLVM project uses email to communicate to contributors outside of the GitHub platform about their past contributions. Primarily, our buildbot infrastructure uses emails to contact contributors about build and test failures.
>
> Therefore, the LLVM community requires contributors to have a public email address associated with their GitHub commits, so please ensure that “Keep my email addresses private” is disabled in your account settings. There are many free email forwarding services available if you wish to keep your identity private.

@Cao-Wuhui
Contributor Author

> Please use a public email address per https://llvm.org/docs/DeveloperPolicy.html#email-addresses:
>
> > The LLVM project uses email to communicate to contributors outside of the GitHub platform about their past contributions. Primarily, our buildbot infrastructure uses emails to contact contributors about build and test failures.
> >
> > Therefore, the LLVM community requires contributors to have a public email address associated with their GitHub commits, so please ensure that “Keep my email addresses private” is disabled in your account settings. There are many free email forwarding services available if you wish to keep your identity private.

Thanks for the reminder. I’ve updated my GitHub settings to use a public email, i.e., caoyixuan2019@email.szu.edu.cn. Please let me know if anything else is needed.

@thurstond thurstond merged commit 6ca835b into llvm:main Sep 30, 2025
9 checks passed

@Cao-Wuhui Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

@DKLoehr
Contributor

DKLoehr commented Oct 1, 2025

We're seeing the newly-added tests fail when building for mac. Chromium issue: https://g-issues.chromium.org/issues/448631142

Relevant output:

Failed Tests (4):
   AddressSanitizer-x86_64-darwin :: TestCases/wcscat.cpp
   AddressSanitizer-x86_64-darwin :: TestCases/wcscpy.cpp
   AddressSanitizer-x86_64-darwin :: TestCases/wcsncat.cpp
   AddressSanitizer-x86_64-darwin :: TestCases/wcsncpy.cpp

Snippet of one of the failures

/Volumes/Work/s/w/ir/cache/builder/src/third_party/llvm/compiler-rt/test/asan/TestCases/wcsncpy.cpp:23:12: error: CHECK: expected string not found in input
  // CHECK: #0 [[ADDR:0x[0-9a-f]+]] in wcsncpy{{.*}}asan_interceptors.cpp:{{[0-9]+}}
            ^
 <stdin>:4:45: note: scanning from here
 WRITE of size 60 at 0x7ff7bc7f6e2c thread T0
                                             ^
 <stdin>:10:7: note: possible intended match here
  #0 0x000103708a6f in main wcsncpy.cpp:9
  
...

Input was:
 <<<<<<
             1: Good so far. 
             2: ================================================================= 
             3: ==63539==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ff7bc7f6e2c at pc 0x00010396a241 bp 0x7ff7bc7f6d90 sp 0x7ff7bc7f6540 
             4: WRITE of size 60 at 0x7ff7bc7f6e2c thread T0 
 check:23'0                                                 X error: no match found
             5:  #0 0x00010396a240 in wcsncpy+0x4e0 (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x50240) 
 check:23'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             6:  #1 0x000103708bd8 in main wcsncpy.cpp:20 

It seems like it might be missing the asan_interceptors suffix on line 5.
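The mismatch can be reproduced outside FileCheck with an ordinary regular expression. The sketch below is an approximation under stated assumptions: the two patterns are hand translations of the strict check (function name followed by `asan_interceptors.cpp:<line>`) and of a relaxed check that only requires the function name, and `MatchesStrictFrame`/`MatchesRelaxedFrame` are hypothetical helper names. FileCheck's own matching has additional rules (variables, multi-line scanning) that are not modelled here.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Strict form: frame #0 must name wcsncpy AND carry the source file
// suffix "asan_interceptors.cpp:<line>" later on the same line.
static bool MatchesStrictFrame(const std::string &line) {
  static const std::regex re(
      R"(#0 0x[0-9a-f]+ in wcsncpy.*asan_interceptors\.cpp:[0-9]+)");
  return std::regex_search(line, re);
}

// Relaxed form: frame #0 only has to name wcsncpy, so a frame that is
// attributed to a runtime library instead of a source file still matches.
static bool MatchesRelaxedFrame(const std::string &line) {
  static const std::regex re(R"(#0 0x[0-9a-f]+ in wcsncpy)");
  return std::regex_search(line, re);
}
```

Applied to line 5 of the output above (`#0 0x00010396a240 in wcsncpy+0x4e0 (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x50240)`), the strict form fails because no source file name is present, while the relaxed form matches.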

RiverDave pushed a commit that referenced this pull request Oct 1, 2025
…ows (#160493)

Signed-off-by: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
@Cao-Wuhui
Contributor Author

Cao-Wuhui commented Oct 2, 2025

> We're seeing the newly-added tests fail when building for mac. Chromium issue: https://g-issues.chromium.org/issues/448631142

Thanks for the report! The macOS failures are tracked in Chromium issue 448631142. We also observed Android flakes where the ASan “ERROR:” header is not on stderr (only ==...==ABORTING shows up); see Android Buildbot 186/12821.

Both issues are addressed in the follow-up test-only PR #161624:

  • Android: route reports to stderr via %env_asan_opts=log_to_stderr=1, and print/flush the pre-crash marker to stderr to avoid stdout/stderr reordering.
  • Darwin: relax the stack-frame check to only require the function name (wcscpy/wcsncpy/wcscat/wcsncat) to tolerate libclang_rt.asan_* frames.
  • Common: reuse FileCheck var [[ADDR]] and make wide string literals const wchar_t*.

Happy to iterate if anything else pops up.
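Two of the test-side points above (const-correct wide literals, and a pre-crash marker written to stderr and flushed) can be sketched without triggering any overflow. The helper names `ElementsNeeded`, `Fits`, and `PrintMarker` are hypothetical; unlike the real tests, this snippet deliberately never performs the out-of-bounds copy and only computes whether a destination would be large enough.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cwchar>

// Number of wchar_t elements a destination needs in order to hold `s`
// including its terminator.
static std::size_t ElementsNeeded(const wchar_t *s) {
  assert(s != nullptr);
  return std::wcslen(s) + 1;
}

// True when a destination with `capacity` elements can hold `s`.
static bool Fits(const wchar_t *s, std::size_t capacity) {
  return ElementsNeeded(s) <= capacity;
}

// Write the pre-crash marker to stderr and flush it, so it travels on
// the same stream as the sanitizer report and cannot be reordered
// relative to it by stdout buffering.
static void PrintMarker() {
  std::fprintf(stderr, "Good so far.\n");
  std::fflush(stderr);
}
```

Declaring the literal as `const wchar_t *src = L"X means dog";` avoids the writable-strings warning; the string needs 12 elements, so it fits `goodDst[12]` from the tests but not `badDst[7]`.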

thurstond pushed a commit that referenced this pull request Oct 2, 2025
…161624)

### Summary
Stabilize ASan wchar tests across Darwin and Android. NFC: test-only.
Follow-up to PR #160493 (adds wchar interceptors/tests).

### Motivation
- Darwin: The top frame often resolves to `libclang_rt.asan_*` rather
than a source file, so strict checks that include file/line can fail.
See Chromium issue
[448631142](https://g-issues.chromium.org/issues/448631142).
- Android: The “ERROR:” header can go to logcat instead of stderr, so
FileCheck may not see it; stdout/stderr reordering also makes pre-crash
markers racy. See Android Buildbot
[186/12821](https://lab.llvm.org/buildbot/#/builders/186/builds/12821).

### Changes
- Android:
  - Force reports to stderr via `%env_asan_opts=log_to_stderr=1`, avoiding
    the “ERROR:” header going to logcat.
  - Print the pre-crash “Good so far.” to stderr and `fflush(stderr)` to
    avoid stdout/stderr reordering.
- Darwin:
  - Relax the stack-frame check to only require the function name
    (`wcscpy/wcsncpy/wcscat/wcsncat`) to tolerate `libclang_rt.asan_*`
    frames.
- Common:
  - Reuse FileCheck var `[[ADDR]]` instead of redefining.
  - Make wide string literals `const wchar_t*` to silence
    `-Wwritable-strings`.

### Risk
- NFC: test-only; no change to runtime behavior.

### References
- Follow-up to PR #160493.
- Chromium: [448631142](https://g-issues.chromium.org/issues/448631142)
(Darwin failures).
- Android Buildbot:
[186/12821](https://lab.llvm.org/buildbot/#/builders/186/builds/12821).

Signed-off-by: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
thurstond added a commit that referenced this pull request Oct 5, 2025
… on Windows" (#162021)

Reverts #160493 due to buildbot failures e.g.,
#160493 (comment)

The fix-forward (#161624) still
had failures on Darwin, and was reverted in
#162001 i.e., this pull request
completes the revert to green for this patch stack.
Cao-Wuhui added a commit to Cao-Wuhui/llvm-project that referenced this pull request Oct 5, 2025
thurstond pushed a commit that referenced this pull request Oct 6, 2025
… wchar tests on Darwin/Android (#162028)

### Summary
Reland: wcscpy/wcsncpy interceptors and stabilize wchar tests on
Darwin/Android. Functional reland (runtime + tests).

### Context
Reland of #160493 and #161624; previously reverted by #162021 and
#162001 to restore green.

### Motivation
- Restore wchar interceptors (wcscpy/wcsncpy), broaden ASan coverage,
and improve Windows parity with narrow-string checks.
- Make tests robust across Darwin/Android to keep bots green.

### Runtime (wcscpy/wcsncpy)
- Add overlap checks; mark read/write ranges in bytes.
- Use MaybeRealWcsnlen when available to bound reads.
- Register Windows static runtime thunk where applicable.

### Tests (wcscpy/wcsncpy/wcscat/wcsncat)
- Android: keep `%env_asan_opts=log_to_stderr=1` so the ASan header is
on stderr.
- Darwin: tolerate reordering by putting all four key lines in one DAG
group:

```cpp
// CHECK-DAG: Good so far.
// CHECK-DAG: ERROR: AddressSanitizer: stack-buffer-overflow on address [[ADDR:...]] at pc {{...}} bp {{...}} sp {{...}}
// CHECK-DAG: WRITE of size {{[0-9]+}} at [[ADDR]] thread T0
// CHECK-DAG: #0 {{0x[0-9a-f]+}} in <func>
```
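The effect of grouping the lines as DAG checks can be illustrated with a deliberately simplified model. This is an assumption-laden sketch, not FileCheck's algorithm: it treats patterns as plain substrings, ignores variables such as `[[ADDR]]`, and does not enforce FileCheck's rule that DAG matches must not overlap. `DagGroupMatches` and `OrderedMatches` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified DAG model: the group passes when every pattern occurs
// somewhere in the output, regardless of order.
static bool DagGroupMatches(const std::string &output,
                            const std::vector<std::string> &patterns) {
  for (const std::string &p : patterns)
    if (output.find(p) == std::string::npos)
      return false;
  return true;
}

// Ordered model, for contrast: each pattern must be found after the end
// of the previous match, as with a sequence of plain CHECK lines.
static bool OrderedMatches(const std::string &output,
                           const std::vector<std::string> &patterns) {
  std::size_t pos = 0;
  for (const std::string &p : patterns) {
    const std::size_t found = output.find(p, pos);
    if (found == std::string::npos)
      return false;
    pos = found + p.size();
  }
  return true;
}
```

If the "Good so far." marker is emitted after the sanitizer report instead of before it, the ordered model rejects the output while the DAG model still accepts it, which is the kind of reordering the grouped checks are meant to tolerate.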

### Risk
- Functional reland (runtime + tests), intended to restore functionality
and maintain stability across platforms.

---------

Signed-off-by: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>