Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sanitizer_symbolizer] Use correct format strings for uptr #89815

Conversation

DimitryAndric
Copy link
Collaborator

When compiling the common sanitizer libraries, there are many warnings about format specifiers, similar to:

compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup.cpp:31:32: warning: format specifies type 'void *' but the argument has type 'uptr' (aka 'unsigned long') [-Wformat]
   31 |   buffer->AppendF(kFormatData, DI->start);
      |                   ~~~~~~~~~~~  ^~~~~~~~~
compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup_constants.h:33:46: note: format string is defined here
   33 | constexpr const char *kFormatData = "{{{data:%p}}}";
      |                                              ^~
      |                                              %lu
compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup.cpp:46:43: warning: format specifies type 'void *' but the argument has type 'uptr' (aka 'unsigned long') [-Wformat]
   46 |   buffer->AppendF(kFormatFrame, frame_no, address);
      |                   ~~~~~~~~~~~~            ^~~~~~~
compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup_constants.h:36:48: note: format string is defined here
   36 | constexpr const char *kFormatFrame = "{{{bt:%u:%p}}}";
      |                                                ^~
      |                                                %lu
...

This is because uptr is dependent on the platform, and can be either unsigned long long, unsigned long, or unsigned int.

To fix the warnings, define the correct printf modifier for uptr and sptr as SANITIZER_PTR_MOD, and use it in the various format strings in sanitizer_symbolizer_markup_constants.h.

@llvmbot
Copy link
Collaborator

llvmbot commented Apr 23, 2024

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Dimitry Andric (DimitryAndric)

Changes

When compiling the common sanitizer libraries, there are many warnings about format specifiers, similar to:

compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup.cpp:31:32: warning: format specifies type 'void *' but the argument has type 'uptr' (aka 'unsigned long') [-Wformat]
   31 |   buffer->AppendF(kFormatData, DI->start);
      |                   ~~~~~~~~~~~  ^~~~~~~~~
compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup_constants.h:33:46: note: format string is defined here
   33 | constexpr const char *kFormatData = "{{{data:%p}}}";
      |                                              ^~
      |                                              %lu
compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup.cpp:46:43: warning: format specifies type 'void *' but the argument has type 'uptr' (aka 'unsigned long') [-Wformat]
   46 |   buffer->AppendF(kFormatFrame, frame_no, address);
      |                   ~~~~~~~~~~~~            ^~~~~~~
compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup_constants.h:36:48: note: format string is defined here
   36 | constexpr const char *kFormatFrame = "{{{bt:%u:%p}}}";
      |                                                ^~
      |                                                %lu
...

This is because uptr is dependent on the platform, and can be either unsigned long long, unsigned long, or unsigned int.

To fix the warnings, define the correct printf modifier for uptr and sptr as SANITIZER_PTR_MOD, and use it in the various format strings in sanitizer_symbolizer_markup_constants.h.


Full diff: https://github.com/llvm/llvm-project/pull/89815.diff

2 Files Affected:

  • (modified) compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h (+3)
  • (modified) compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup_constants.h (+8-5)
diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h b/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h
index 294e330c4d5611..a977ff8ff7ea7e 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h
@@ -142,13 +142,16 @@ namespace __sanitizer {
 // 64-bit Windows uses LLP64 data model.
 typedef unsigned long long uptr;
 typedef signed long long sptr;
+#    define SANITIZER_PTR_MOD "ll"
 #else
 #  if (SANITIZER_WORDSIZE == 64) || SANITIZER_APPLE || SANITIZER_WINDOWS
 typedef unsigned long uptr;
 typedef signed long sptr;
+#    define SANITIZER_PTR_MOD "l"
 #  else
 typedef unsigned int uptr;
 typedef signed int sptr;
+#    define SANITIZER_PTR_MOD ""
 #  endif
 #endif  // defined(_WIN64)
 #if defined(__x86_64__)
diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup_constants.h b/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup_constants.h
index 83643504e1289e..5a465a34718324 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup_constants.h
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup_constants.h
@@ -26,20 +26,23 @@ constexpr const char *kFormatDemangle = "{{{symbol:%s}}}";
 constexpr uptr kFormatDemangleMax = 1024;  // Arbitrary.
 
 // Function name or equivalent from PC location.
-constexpr const char *kFormatFunction = "{{{pc:%p}}}";
+constexpr const char *kFormatFunction = "{{{pc:0x%" SANITIZER_PTR_MOD "x}}}";
 constexpr uptr kFormatFunctionMax = 64;  // More than big enough for 64-bit hex.
 
 // Global variable name or equivalent from data memory address.
-constexpr const char *kFormatData = "{{{data:%p}}}";
+constexpr const char *kFormatData = "{{{data:0x%" SANITIZER_PTR_MOD "x}}}";
 
 // One frame in a backtrace (printed on a line by itself).
-constexpr const char *kFormatFrame = "{{{bt:%u:%p}}}";
+constexpr const char *kFormatFrame = "{{{bt:%d:0x%" SANITIZER_PTR_MOD "x}}}";
 
 // Module contextual element.
-constexpr const char *kFormatModule = "{{{module:%d:%s:elf:%s}}}";
+constexpr const char *kFormatModule =
+    "{{{module:%" SANITIZER_PTR_MOD "d:%s:elf:%s}}}";
 
 // mmap for a module segment.
-constexpr const char *kFormatMmap = "{{{mmap:%p:0x%x:load:%d:%s:0x%x}}}";
+constexpr const char *kFormatMmap =
+    "{{{mmap:0x%" SANITIZER_PTR_MOD "x:0x%" SANITIZER_PTR_MOD
+    "x:load:%" SANITIZER_PTR_MOD "d:%s:0x%" SANITIZER_PTR_MOD "x}}}";
 
 // Dump trigger element.
 #define FORMAT_DUMPFILE "{{{dumpfile:%s:%s}}}"

Copy link

github-actions bot commented Apr 23, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@@ -26,20 +26,23 @@ constexpr const char *kFormatDemangle = "{{{symbol:%s}}}";
constexpr uptr kFormatDemangleMax = 1024; // Arbitrary.

// Function name or equivalent from PC location.
constexpr const char *kFormatFunction = "{{{pc:%p}}}";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if uptr used for pointer, it should be %p but printf arg should be caster to (void*)
if it 's uptr for size, then %zx (for hex), or %zu should be used

@DimitryAndric
Copy link
Collaborator Author

Casting seems uglier to me, though. On the other hand, it's difficult to make printf-like functions type-safe in any way...

@DimitryAndric DimitryAndric force-pushed the users/DimitryAndric/sanitizer-fix-symbolizer-markup-constants-1 branch from 8c45d7d to 409b407 Compare April 24, 2024 16:12
@@ -96,7 +97,7 @@ static void RenderModule(InternalScopedString *buffer,
for (uptr i = 0; i < module.uuid_size(); i++)
buildIdBuffer.AppendF("%02x", module.uuid()[i]);

buffer->AppendF(kFormatModule, moduleId, module.full_name(),
buffer->AppendF(kFormatModule, static_cast<int>(moduleId), module.full_name(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here, I'd rather update
"{{{module:%zu:%s:elf:%s}}}";

accessBuffer.data(), range.beg - module.base_address());
buffer->AppendF(
kFormatMmap, reinterpret_cast<void *>(range.beg),
static_cast<unsigned int>(range.end - range.beg),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's unlucky sanitizer convension, but uptr is equivalend to size_t.

static_cast<unsigned int>(range.end - range.beg), is uptr so size format %z should be used

So can you try to "{{{mmap:%p:0x%zx:load:%zu:%s:0x%zx}}}";

@DimitryAndric DimitryAndric force-pushed the users/DimitryAndric/sanitizer-fix-symbolizer-markup-constants-1 branch from 409b407 to ebb6f7e Compare April 24, 2024 22:00
static_cast<int>(moduleId), accessBuffer.data(),
static_cast<unsigned int>(range.beg - module.base_address()));
buffer->AppendF(kFormatMmap, reinterpret_cast<void *>(range.beg),
static_cast<size_t>(range.end - range.beg),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to cast size_t

I am saying that compiler-rt (sanitizers) use uptr to format as size_t.
It's all over the code, it should be OK to pass uptr into %zx.

When compiling the common sanitizer libraries, there are many warnings
about format specifiers, similar to:

    compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup.cpp:31:32: warning: format specifies type 'void *' but the argument has type 'uptr' (aka 'unsigned long') [-Wformat]
       31 |   buffer->AppendF(kFormatData, DI->start);
          |                   ~~~~~~~~~~~  ^~~~~~~~~
    compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup_constants.h:33:46: note: format string is defined here
       33 | constexpr const char *kFormatData = "{{{data:%p}}}";
          |                                              ^~
          |                                              %lu
    compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup.cpp:46:43: warning: format specifies type 'void *' but the argument has type 'uptr' (aka 'unsigned long') [-Wformat]
       46 |   buffer->AppendF(kFormatFrame, frame_no, address);
          |                   ~~~~~~~~~~~~            ^~~~~~~
    compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup_constants.h:36:48: note: format string is defined here
       36 | constexpr const char *kFormatFrame = "{{{bt:%u:%p}}}";
          |                                                ^~
          |                                                %lu
    ...

This is because `uptr` is dependent on the platform, and can be either
`unsigned long long`, `unsigned long`, or `unsigned int`.

To fix the warnings, cast the arguments to the expected type of the
format strings.
@DimitryAndric DimitryAndric force-pushed the users/DimitryAndric/sanitizer-fix-symbolizer-markup-constants-1 branch from ebb6f7e to f718eb8 Compare April 24, 2024 22:13
@vitalybuka
Copy link
Collaborator

Thank you!

@DimitryAndric DimitryAndric merged commit 0f329e0 into main Apr 25, 2024
4 checks passed
@DimitryAndric DimitryAndric deleted the users/DimitryAndric/sanitizer-fix-symbolizer-markup-constants-1 branch April 25, 2024 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants