Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISCV][sanitizer] Fix sanitizer support for different virtual memory layout #66743

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

hau-hsu
Copy link
Contributor

@hau-hsu hau-hsu commented Sep 19, 2023

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 19, 2023

@llvm/pr-subscribers-backend-risc-v
@llvm/pr-subscribers-compiler-rt-sanitizer

@llvm/pr-subscribers-llvm-transforms

Changes

This PR combines the following reviews from Phabricator:

Other related (and merged) reviews are:


Full diff: https://github.com/llvm/llvm-project/pull/66743.diff

4 Files Affected:

  • (modified) compiler-rt/lib/asan/asan_mapping.h (+5-2)
  • (modified) compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp (+9-8)
  • (modified) compiler-rt/lib/sanitizer_common/sanitizer_platform.h (+4-2)
  • (modified) llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp (+1-1)
diff --git a/compiler-rt/lib/asan/asan_mapping.h b/compiler-rt/lib/asan/asan_mapping.h
index c5f95c07a21056c..91fe60db6329ac3 100644
--- a/compiler-rt/lib/asan/asan_mapping.h
+++ b/compiler-rt/lib/asan/asan_mapping.h
@@ -72,7 +72,10 @@
 // || `[0x2000000000, 0x23ffffffff]` || LowShadow  ||
 // || `[0x0000000000, 0x1fffffffff]` || LowMem     ||
 //
-// Default Linux/RISCV64 Sv39 mapping:
+// Default Linux/RISCV64 Sv39 mapping with SHADOW_OFFSET == 0xd55550000;
+// (the exact location of SHADOW_OFFSET may vary depending the dynamic probing
+//  by FindDynamicShadowStart).
+//
 // || `[0x1555550000, 0x3fffffffff]` || HighMem    ||
 // || `[0x0fffffa000, 0x1555555fff]` || HighShadow ||
 // || `[0x0effffa000, 0x0fffff9fff]` || ShadowGap  ||
@@ -186,7 +189,7 @@
 #  elif SANITIZER_FREEBSD && defined(__aarch64__)
 #    define ASAN_SHADOW_OFFSET_CONST 0x0000800000000000
 #  elif SANITIZER_RISCV64
-#    define ASAN_SHADOW_OFFSET_CONST 0x0000000d55550000
+#    define ASAN_SHADOW_OFFSET_DYNAMIC
 #  elif defined(__aarch64__)
 #    define ASAN_SHADOW_OFFSET_CONST 0x0000001000000000
 #  elif defined(__powerpc64__)
diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp
index d2b3b63f3a7a3bd..a394d784d9b061b 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp
@@ -1109,7 +1109,8 @@ uptr GetMaxVirtualAddress() {
 #if SANITIZER_NETBSD && defined(__x86_64__)
   return 0x7f7ffffff000ULL;  // (0x00007f8000000000 - PAGE_SIZE)
 #elif SANITIZER_WORDSIZE == 64
-# if defined(__powerpc64__) || defined(__aarch64__) || defined(__loongarch__)
+#    if defined(__powerpc64__) || defined(__aarch64__) || \
+        defined(__loongarch__) || SANITIZER_RISCV64
   // On PowerPC64 we have two different address space layouts: 44- and 46-bit.
   // We somehow need to figure out which one we are using now and choose
   // one of 0x00000fffffffffffUL and 0x00003fffffffffffUL.
@@ -1118,18 +1119,18 @@ uptr GetMaxVirtualAddress() {
   // This should (does) work for both PowerPC64 Endian modes.
   // Similarly, aarch64 has multiple address space layouts: 39, 42 and 47-bit.
   // loongarch64 also has multiple address space layouts: default is 47-bit.
+  // RISC-V also has multiple address space layouts: 32, 39, 48 and 57,
+  // default is 47-bit for RISCV64.
   return (1ULL << (MostSignificantSetBitIndex(GET_CURRENT_FRAME()) + 1)) - 1;
-#elif SANITIZER_RISCV64
-  return (1ULL << 38) - 1;
-# elif SANITIZER_MIPS64
+#    elif SANITIZER_MIPS64
   return (1ULL << 40) - 1;  // 0x000000ffffffffffUL;
-# elif defined(__s390x__)
+#    elif defined(__s390x__)
   return (1ULL << 53) - 1;  // 0x001fffffffffffffUL;
-#elif defined(__sparc__)
+#    elif defined(__sparc__)
   return ~(uptr)0;
-# else
+#    else
   return (1ULL << 47) - 1;  // 0x00007fffffffffffUL;
-# endif
+#    endif
 #else  // SANITIZER_WORDSIZE == 32
 # if defined(__s390__)
   return (1ULL << 31) - 1;  // 0x7fffffff;
diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_platform.h b/compiler-rt/lib/sanitizer_common/sanitizer_platform.h
index c1ca5c9ca44783b..1c864731f5e5c80 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_platform.h
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_platform.h
@@ -284,7 +284,7 @@
 // For such platforms build this code with -DSANITIZER_CAN_USE_ALLOCATOR64=0 or
 // change the definition of SANITIZER_CAN_USE_ALLOCATOR64 here.
 #ifndef SANITIZER_CAN_USE_ALLOCATOR64
-#  if SANITIZER_RISCV64 || SANITIZER_IOS || SANITIZER_DRIVERKIT
+#  if SANITIZER_IOS || SANITIZER_DRIVERKIT
 #    define SANITIZER_CAN_USE_ALLOCATOR64 0
 #  elif defined(__mips64) || defined(__hexagon__)
 #    define SANITIZER_CAN_USE_ALLOCATOR64 0
@@ -303,7 +303,9 @@
 #    define SANITIZER_MMAP_RANGE_SIZE FIRST_32_SECOND_64(1ULL << 32, 1ULL << 40)
 #  endif
 #elif SANITIZER_RISCV64
-#  define SANITIZER_MMAP_RANGE_SIZE FIRST_32_SECOND_64(1ULL << 32, 1ULL << 38)
+// FIXME: Set to 47 to ensure it works with both Sv39 and Sv48; however,
+// this will not work correctly on Sv57.
+#  define SANITIZER_MMAP_RANGE_SIZE FIRST_32_SECOND_64(1ULL << 32, 1ULL << 47)
 #elif defined(__aarch64__)
 #  if SANITIZER_APPLE
 #    if SANITIZER_OSX || SANITIZER_IOSSIM
diff --git a/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp b/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
index bde5fba20f3b7a6..7ef4b086618f9d8 100644
--- a/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
@@ -107,7 +107,7 @@ static const uint64_t kMIPS32_ShadowOffset32 = 0x0aaa0000;
 static const uint64_t kMIPS64_ShadowOffset64 = 1ULL << 37;
 static const uint64_t kAArch64_ShadowOffset64 = 1ULL << 36;
 static const uint64_t kLoongArch64_ShadowOffset64 = 1ULL << 46;
-static const uint64_t kRISCV64_ShadowOffset64 = 0xd55550000;
+static const uint64_t kRISCV64_ShadowOffset64 = kDynamicShadowSentinel;
 static const uint64_t kFreeBSD_ShadowOffset32 = 1ULL << 30;
 static const uint64_t kFreeBSD_ShadowOffset64 = 1ULL << 46;
 static const uint64_t kFreeBSDAArch64_ShadowOffset64 = 1ULL << 47;

@vitalybuka
Copy link
Collaborator

LGTM from compiler-rt perspective
@luismarques I am ready to accept if RISCV folks do not object

@luismarques
Copy link
Contributor

I'll review this later today

@luismarques
Copy link
Contributor

I'll review this later today

Actually, if you're not in a big hurry to merge this please give me a couple of days to run some tests (they take a while).

@hau-hsu
Copy link
Contributor Author

hau-hsu commented Sep 25, 2023

@luismarques no hurry. Many thanks!

@luismarques
Copy link
Contributor

Can you reproduce the following? The test compiler-rt/test/asan/TestCases/strncat-overlap.cpp passes without this PR but fails with this PR. If not, what does your test environment look like?

@hau-hsu
Copy link
Contributor Author

hau-hsu commented Oct 4, 2023

I didn't see that failure previously. I can rerun the test again to try to reproduce it.
Our test environment is QEMU system mode.

hau-hsu referenced this pull request Oct 19, 2023
Implements for sv39 and sv48 VMA layout.

Userspace only has access to the bottom half of vma range. The top half
is used by kernel. There is no dedicated vsyscall or heap segment.
PIE program is allocated to start at TASK_SIZE/3*2. Maximum ASLR is
ARCH_MMAP_RND_BITS_MAX+PAGE_SHIFT=24+12=36 Loader, vdso and other
libraries are allocated below stack from the top.

Also change RestoreAddr to use 4 bits to accommodate MappingRiscv64_48

Reviewed by: MaskRay, dvyukov, asb, StephenFan, luismarques, jrtc27,
hiraditya, vitalybuka

Differential Revision: https://reviews.llvm.org/D145214

D145214 was reverted because one file was missing in the latest commit.
Luckily the file was there in the previous commit, probably the author
missed uploading that file with latest commit.

Co-authored-by: Alex Fan <alex.fan.q@gmail.com>
@@ -186,7 +189,7 @@
# elif SANITIZER_FREEBSD && defined(__aarch64__)
# define ASAN_SHADOW_OFFSET_CONST 0x0000800000000000
# elif SANITIZER_RISCV64
# define ASAN_SHADOW_OFFSET_CONST 0x0000000d55550000
# define ASAN_SHADOW_OFFSET_DYNAMIC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to have a value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want it to be like windows that uses dynamic offset instead of a fixing value:

#  elif SANITIZER_WINDOWS64
#    define ASAN_SHADOW_OFFSET_DYNAMIC

@github-actions
Copy link

github-actions bot commented Oct 20, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@hau-hsu hau-hsu force-pushed the fix-asan-support branch 2 times, most recently from 78b2582 to 81ba032 Compare October 20, 2023 03:40
@hau-hsu
Copy link
Contributor Author

hau-hsu commented Oct 21, 2023

Can you reproduce the following? The test compiler-rt/test/asan/TestCases/strncat-overlap.cpp passes without this PR but fails with this PR. If not, what does your test environment look like?

Hi @luismarques what's the error message you got for strncat-overlap.cpp?

I tried to rebase the patches and run in our QEMU.
I didn't see the strncat-overlap.cpp failure, but I got strncpy-overflow.cpp failure instead.
It failed because the expected output is like

#0 0x2aacb6b128 in strncpy /home/hauh/work/SCT-2830-asan-upstream/llvm-project/compiler-rt/lib/asan/asan_interceptors.cpp:600:5

but after patching the output becomes

#0 0x2acde863c6 in $d.27 /home/root/llvm-project/compiler-rt/lib/asan/asan_interceptors.cpp:600:5

The function name strncpy becomes a mapping symbol ($d.27). It's weird.
I want to know if the strncat-overlap.cpp failure is because of the similar issue.
Thanks

@luismarques
Copy link
Contributor

Hi @luismarques what's the error message you got for strncat-overlap.cpp?

I can check that next week, if that's OK.

@hauhsu
Copy link

hauhsu commented Oct 26, 2023

Of course, thanks for your help :)

@luismarques
Copy link
Contributor

Of course, thanks for your help :)

Sorry for the delay. Here's the output I had with the original patches. It shows two things:

  1. The raw output of running strncat-overlap.cpp.tmp (without the FileCheck)
  2. The output of the complete check asan

https://gist.github.com/luismarques/ea23dc29ead9889382e0fc90c7f1ae97

Let me try the updated PR as well.

Copy link
Collaborator

@vitalybuka vitalybuka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please rebase if this is still relevant?

@llvmbot llvmbot added the compiler-rt:asan Address sanitizer label Mar 8, 2024
@hau-hsu
Copy link
Contributor Author

hau-hsu commented Mar 8, 2024

@vitalybuka I rebased it. But I haven't tested it yet.

@hau-hsu
Copy link
Contributor Author

hau-hsu commented Apr 22, 2024

Hi @luismarques
Sorry I was working on other projects. Now I am back for the asan PR.
I rebased the commits upon tag llvmorg-18.1.3 (c13b748) and rerun the tests on QEMU (system mode).
The failed cases with and without the PR now becomes 9 and 11.

Do we need to do more investigation for this PR?

Thanks

Detailed test results:

With this PR:

Failed Tests (9):
  AddressSanitizer-riscv64-linux :: TestCases/Linux/stack-trace-dlclose.cpp
  AddressSanitizer-riscv64-linux :: TestCases/Posix/strndup_oob_test.cpp
  AddressSanitizer-riscv64-linux :: TestCases/calloc-overflow.cpp
  AddressSanitizer-riscv64-linux :: TestCases/debug_stacks.cpp
  AddressSanitizer-riscv64-linux :: TestCases/log-path_test.cpp
  AddressSanitizer-riscv64-linux :: TestCases/malloc-size-too-big.cpp
  AddressSanitizer-riscv64-linux :: TestCases/strncpy-overflow.cpp
  AddressSanitizer-riscv64-linux :: TestCases/use-after-free-right.cpp
  AddressSanitizer-riscv64-linux :: TestCases/use-after-free.cpp


Testing Time: 1274.49s

Total Discovered Tests: 721
  Unsupported      : 199 (27.60%)
  Passed           : 512 (71.01%)
  Expectedly Failed:   1 (0.14%)
  Failed           :   9 (1.25%)

Without this PR:

Failed Tests (11):
  AddressSanitizer-riscv64-linux :: TestCases/Linux/stack-trace-dlclose.cpp
  AddressSanitizer-riscv64-linux :: TestCases/Posix/strndup_oob_test.cpp
  AddressSanitizer-riscv64-linux :: TestCases/Posix/strndup_oob_test2.cpp   (new)
  AddressSanitizer-riscv64-linux :: TestCases/calloc-overflow.cpp
  AddressSanitizer-riscv64-linux :: TestCases/debug_stacks.cpp
  AddressSanitizer-riscv64-linux :: TestCases/log-path_test.cpp
  AddressSanitizer-riscv64-linux :: TestCases/malloc-size-too-big.cpp
  AddressSanitizer-riscv64-linux :: TestCases/sanity_check_pure_c.c (new)
  AddressSanitizer-riscv64-linux :: TestCases/strncpy-overflow.cpp
  AddressSanitizer-riscv64-linux :: TestCases/use-after-free-right.cpp
  AddressSanitizer-riscv64-linux :: TestCases/use-after-free.cpp


Testing Time: 1644.78s

Total Discovered Tests: 721
  Unsupported      : 199 (27.60%)
  Passed           : 510 (70.74%)
  Expectedly Failed:   1 (0.14%)
  Failed           :  11 (1.53%)

D92403 has use AP32 by default for RISCV64 and then D126825 has move
this logic into sanitizer_common/sanitizer_platform.h which will effect
both asan and lsan, however asan are using AP64 and lsan using AP32
before D126825, and AP64 should work well for all 64-bit platform (ASan use that
before, so that's kind of evidence), not sure why we choose AP32 for lsan
before.

We have 2 option here: 1) only limited lsan use AP32 for RISCV64, 2) or also
relax that for lsan to using AP64.
…width of virtual-memory system.

The original asan port was support Sv39 only, however Sv48 support has
landed to linux kernel[1] for a while, so we are trying to make it support
both Sv39 and Sv48, the key point is we need to use a dynamic offset for the
shadow region, this bring extra runtime overhead, but compare to other
overhead in ASan, this should be acceptable.

[1] https://www.phoronix.com/news/Linux-5.17-RISC-V-sv48
@hau-hsu hau-hsu force-pushed the fix-asan-support branch 2 times, most recently from 2dd3155 to c04382f Compare May 28, 2024 09:14
This patch enables sanitizers for sv57 virtual memory mode.
Alloctor checks whether SANITIZER_MMAP_RANGE_SIZE matches possible mmap
regions:
sanitizer_allocator_primary32.h:292 "((res)) <((kNumPossibleRegions))"

Since SANITIZER_MMAP_RANGE_SIZE only controls "possible" mmap regions,
setting it to (1 << 57) also works for sv39 and sv48.
@hau-hsu
Copy link
Contributor Author

hau-hsu commented Jun 4, 2024

Hi @vitalybuka @luismarques @PiJoules I know this is an old PR.
I have rebased the commits and tested in our QEMU system mode for sv39, sv48, sv57 virtual memory layouts.
(More info about the 3 memory layout: https://www.kernel.org/doc/html/v6.4/riscv/vm-layout.html)
Please help to review again.

Test result:
The test baseline is sv39 mode, without any commits in the PR.
There are only 2 new failures for sv57:

TestCases/Linux/allocator_oom_test.cpp
TestCases/Linux/quarantine_size_mb.cpp

After quick triage I think they are because the required memory is too large in sv57 mode, the original test
conditions won't work. I'll have another PR if necessary later.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants