Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NFC sanitizer_symbolizer] Move Fuchsia specific code. #73192

Conversation

avillega
Copy link
Contributor

Moves sanitizer symbolizer code that is specific for
fuchsia into its own _fuchsia.cpp file.
This is preparation to enable symbolizer markup in
linux.

Created using spr 1.3.4
@llvmbot
Copy link
Collaborator

llvmbot commented Nov 23, 2023

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Andres Villegas (avillega)

Changes

Moves sanitizer symbolizer code that is specific for
fuchsia into its own _fuchsia.cpp file.
This is preparation to enable symbolizer markup in
linux.


Full diff: https://github.com/llvm/llvm-project/pull/73192.diff

4 Files Affected:

  • (modified) compiler-rt/lib/sanitizer_common/CMakeLists.txt (+2)
  • (modified) compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup.cpp (+23-87)
  • (added) compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup.h (+45)
  • (added) compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup_fuchsia.cpp (+83)
diff --git a/compiler-rt/lib/sanitizer_common/CMakeLists.txt b/compiler-rt/lib/sanitizer_common/CMakeLists.txt
index 61e832a30eb3767..fb7584c298a1c94 100644
--- a/compiler-rt/lib/sanitizer_common/CMakeLists.txt
+++ b/compiler-rt/lib/sanitizer_common/CMakeLists.txt
@@ -88,6 +88,7 @@ set(SANITIZER_SYMBOLIZER_SOURCES
   sanitizer_symbolizer_libcdep.cpp
   sanitizer_symbolizer_mac.cpp
   sanitizer_symbolizer_markup.cpp
+  sanitizer_symbolizer_markup_fuchsia.cpp
   sanitizer_symbolizer_posix_libcdep.cpp
   sanitizer_symbolizer_report.cpp
   sanitizer_symbolizer_report_fuchsia.cpp
@@ -195,6 +196,7 @@ set(SANITIZER_IMPL_HEADERS
   sanitizer_symbolizer_internal.h
   sanitizer_symbolizer_libbacktrace.h
   sanitizer_symbolizer_mac.h
+  sanitizer_symbolizer_markup.h
   sanitizer_syscall_generic.inc
   sanitizer_syscall_linux_aarch64.inc
   sanitizer_syscall_linux_arm.inc
diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup.cpp
index c7332af7d9efd5a..c364e1e300225b9 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup.cpp
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup.cpp
@@ -8,105 +8,41 @@
 //
 // This file is shared between various sanitizers' runtime libraries.
 //
-// Implementation of offline markup symbolizer.
-//===----------------------------------------------------------------------===//
-
-#include "sanitizer_platform.h"
-
-#if SANITIZER_SYMBOLIZER_MARKUP
-
-#  include "sanitizer_common.h"
-#  include "sanitizer_stacktrace_printer.h"
-#  include "sanitizer_symbolizer.h"
-#  include "sanitizer_symbolizer_markup_constants.h"
-
-namespace __sanitizer {
-
 // This generic support for offline symbolizing is based on the
 // Fuchsia port.  We don't do any actual symbolization per se.
 // Instead, we emit text containing raw addresses and raw linkage
 // symbol names, embedded in Fuchsia's symbolization markup format.
-// Fuchsia's logging infrastructure emits enough information about
-// process memory layout that a post-processing filter can do the
-// symbolization and pretty-print the markup.  See the spec at:
-// https://fuchsia.googlesource.com/zircon/+/master/docs/symbolizer_markup.md
+// See the spec at:
+// https://llvm.org/docs/SymbolizerMarkupFormat.html
+//===----------------------------------------------------------------------===//
 
-// This is used by UBSan for type names, and by ASan for global variable names.
-// It's expected to return a static buffer that will be reused on each call.
-const char *Symbolizer::Demangle(const char *name) {
-  static char buffer[kFormatDemangleMax];
-  internal_snprintf(buffer, sizeof(buffer), kFormatDemangle, name);
-  return buffer;
-}
+#include "sanitizer_symbolizer_markup.h"
 
-// This is used mostly for suppression matching.  Making it work
-// would enable "interceptor_via_lib" suppressions.  It's also used
-// once in UBSan to say "in module ..." in a message that also
-// includes an address in the module, so post-processing can already
-// pretty-print that so as to indicate the module.
-bool Symbolizer::GetModuleNameAndOffsetForPC(uptr pc, const char **module_name,
-                                             uptr *module_address) {
-  return false;
-}
+#include "sanitizer_common.h"
+#include "sanitizer_stacktrace_printer.h"
+#include "sanitizer_symbolizer.h"
+#include "sanitizer_symbolizer_markup_constants.h"
 
-// This is mainly used by hwasan for online symbolization. This isn't needed
-// since hwasan can always just dump stack frames for offline symbolization.
-bool Symbolizer::SymbolizeFrame(uptr addr, FrameInfo *info) { return false; }
-
-// This is used in some places for suppression checking, which we
-// don't really support for Fuchsia.  It's also used in UBSan to
-// identify a PC location to a function name, so we always fill in
-// the function member with a string containing markup around the PC
-// value.
-// TODO(mcgrathr): Under SANITIZER_GO, it's currently used by TSan
-// to render stack frames, but that should be changed to use
-// RenderStackFrame.
-SymbolizedStack *Symbolizer::SymbolizePC(uptr addr) {
-  SymbolizedStack *s = SymbolizedStack::New(addr);
-  char buffer[kFormatFunctionMax];
-  internal_snprintf(buffer, sizeof(buffer), kFormatFunction, addr);
-  s->info.function = internal_strdup(buffer);
-  return s;
-}
+namespace __sanitizer {
 
-// Always claim we succeeded, so that RenderDataInfo will be called.
-bool Symbolizer::SymbolizeData(uptr addr, DataInfo *info) {
-  info->Clear();
-  info->start = addr;
-  return true;
+void MarkupStackTracePrinter::RenderData(InternalScopedString *buffer,
+                                         const char *format, const DataInfo *DI,
+                                         const char *strip_path_prefix) {
+  buffer->AppendF(kFormatData, DI->start);
 }
 
-class MarkupStackTracePrinter : public StackTracePrinter {
-  // We ignore the format argument to __sanitizer_symbolize_global.
-  void RenderData(InternalScopedString *buffer, const char *format,
-                  const DataInfo *DI, const char *strip_path_prefix) override {
-    buffer->AppendF(kFormatData, DI->start);
-  }
-
-  bool RenderNeedsSymbolization(const char *format) override { return false; }
-
-  // We don't support the stack_trace_format flag at all.
-  void RenderFrame(InternalScopedString *buffer, const char *format,
-                   int frame_no, uptr address, const AddressInfo *info,
-                   bool vs_style, const char *strip_path_prefix) override {
-    CHECK(!RenderNeedsSymbolization(format));
-    buffer->AppendF(kFormatFrame, frame_no, address);
-  }
-
- protected:
-  ~MarkupStackTracePrinter();
-};
-
-StackTracePrinter *StackTracePrinter::NewStackTracePrinter() {
-  return new (GetGlobalLowLevelAllocator()) MarkupStackTracePrinter();
+bool MarkupStackTracePrinter::RenderNeedsSymbolization(const char *format) {
+  return false;
 }
 
-Symbolizer *Symbolizer::PlatformInit() {
-  return new (symbolizer_allocator_) Symbolizer({});
+// We don't support the stack_trace_format flag at all.
+void MarkupStackTracePrinter::RenderFrame(InternalScopedString *buffer,
+                                          const char *format, int frame_no,
+                                          uptr address, const AddressInfo *info,
+                                          bool vs_style,
+                                          const char *strip_path_prefix) {
+  CHECK(!RenderNeedsSymbolization(format));
+  buffer->AppendF(kFormatFrame, frame_no, address);
 }
 
-void Symbolizer::LateInitialize() { Symbolizer::GetOrInit(); }
-
 }  // namespace __sanitizer
-
-#endif  // SANITIZER_SYMBOLIZER_MARKUP
diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup.h b/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup.h
new file mode 100644
index 000000000000000..7cebe520e9bebb4
--- /dev/null
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup.h
@@ -0,0 +1,45 @@
+//===-- sanitizer_symbolizer_markup.h -----------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+//  This file is shared between various sanitizers' runtime libraries.
+//
+//  Header for the offline markup symbolizer.
+//===----------------------------------------------------------------------===//
+#ifndef SANITIZER_SYMBOLIZER_MARKUP_H
+#define SANITIZER_SYMBOLIZER_MARKUP_H
+
+#include "sanitizer_common.h"
+#include "sanitizer_stacktrace_printer.h"
+#include "sanitizer_symbolizer.h"
+
+namespace __sanitizer {
+
+class MarkupStackTracePrinter : public StackTracePrinter {
+ public:
+  // We don't support the stack_trace_format flag at all.
+  void RenderFrame(InternalScopedString *buffer, const char *format,
+                   int frame_no, uptr address, const AddressInfo *info,
+                   bool vs_style, const char *strip_path_prefix = "") override;
+
+  bool RenderNeedsSymbolization(const char *format) override;
+
+  // We ignore the format argument to __sanitizer_symbolize_global.
+  void RenderData(InternalScopedString *buffer, const char *format,
+                  const DataInfo *DI,
+                  const char *strip_path_prefix = "") override;
+
+ private:
+  void RenderContext(InternalScopedString *buffer);
+
+ protected:
+  ~MarkupStackTracePrinter() {}
+};
+
+}  // namespace __sanitizer
+
+#endif  // SANITIZER_SYMBOLIZER_MARKUP_H
diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup_fuchsia.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup_fuchsia.cpp
new file mode 100644
index 000000000000000..f6c49aa59d7b559
--- /dev/null
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup_fuchsia.cpp
@@ -0,0 +1,83 @@
+//===-- sanitizer_symbolizer_markup_fuchsia.cpp ---------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file is shared between various sanitizers' runtime libraries.
+//
+// Fuchsia specific implementation of offline markup symbolizer.
+//===----------------------------------------------------------------------===//
+#include "sanitizer_platform.h"
+
+#if SANITIZER_SYMBOLIZER_MARKUP
+
+#  include "sanitizer_common.h"
+#  include "sanitizer_stacktrace_printer.h"
+#  include "sanitizer_symbolizer.h"
+#  include "sanitizer_symbolizer_markup.h"
+#  include "sanitizer_symbolizer_markup_constants.h"
+
+namespace __sanitizer {
+
+// This is used by UBSan for type names, and by ASan for global variable names.
+// It's expected to return a static buffer that will be reused on each call.
+const char *Symbolizer::Demangle(const char *name) {
+  static char buffer[kFormatDemangleMax];
+  internal_snprintf(buffer, sizeof(buffer), kFormatDemangle, name);
+  return buffer;
+}
+
+// This is used mostly for suppression matching.  Making it work
+// would enable "interceptor_via_lib" suppressions.  It's also used
+// once in UBSan to say "in module ..." in a message that also
+// includes an address in the module, so post-processing can already
+// pretty-print that so as to indicate the module.
+bool Symbolizer::GetModuleNameAndOffsetForPC(uptr pc, const char **module_name,
+                                             uptr *module_address) {
+  return false;
+}
+
+// This is mainly used by hwasan for online symbolization. This isn't needed
+// since hwasan can always just dump stack frames for offline symbolization.
+bool Symbolizer::SymbolizeFrame(uptr addr, FrameInfo *info) { return false; }
+
+// This is used in some places for suppression checking, which we
+// don't really support for Fuchsia.  It's also used in UBSan to
+// identify a PC location to a function name, so we always fill in
+// the function member with a string containing markup around the PC
+// value.
+// TODO(mcgrathr): Under SANITIZER_GO, it's currently used by TSan
+// to render stack frames, but that should be changed to use
+// RenderStackFrame.
+SymbolizedStack *Symbolizer::SymbolizePC(uptr addr) {
+  SymbolizedStack *s = SymbolizedStack::New(addr);
+  char buffer[kFormatFunctionMax];
+  internal_snprintf(buffer, sizeof(buffer), kFormatFunction, addr);
+  s->info.function = internal_strdup(buffer);
+  return s;
+}
+
+// Always claim we succeeded, so that RenderDataInfo will be called.
+bool Symbolizer::SymbolizeData(uptr addr, DataInfo *info) {
+  info->Clear();
+  info->start = addr;
+  return true;
+}
+
+// Fuchsia only uses MarkupStackTracePrinter
+StackTracePrinter *StackTracePrinter::NewStackTracePrinter() {
+  return new (GetGlobalLowLevelAllocator()) MarkupStackTracePrinter();
+}
+
+Symbolizer *Symbolizer::PlatformInit() {
+  return new (symbolizer_allocator_) Symbolizer({});
+}
+
+void Symbolizer::LateInitialize() { Symbolizer::GetOrInit(); }
+
+}  // namespace __sanitizer
+
+#endif  // SANITIZER_SYMBOLIZER_MARKUP

@avillega avillega merged commit 93a2be2 into main Nov 28, 2023
5 checks passed
@avillega avillega deleted the users/avillega/nfc-sanitizer_symbolizer-move-fuchsia-specific-code branch November 28, 2023 00:32
aheejin added a commit to aheejin/emscripten that referenced this pull request Apr 7, 2024
Many lines that call `InternalScopedString::AppendF`, including the
below,
https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup.cpp#L31
are format-checked by
https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/compiler-rt/lib/sanitizer_common/sanitizer_common.h#L650
which checks if the formats of arguments conform to that of functions
like `printf`. But this errors out because the format strings currently
do not match their arguments.

llvm/llvm-project@9c8f888#diff-5bf0494d1b61b99e55aefb25e59e41b678f20a392c2c6d77e5fbdc97c2ca4c3f
started the effort to check `__attribute__((format))` in the codebase,
but apparently there were too many violating instances, it made the code
build with `-Wno-format` (presumably as an temporary solution):
https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/compiler-rt/lib/sanitizer_common/CMakeLists.txt#L223-L224
So this adds `-Wno-format` to our cflags as well.

The reason it was fine without it until now is, we didn't build the
whole file thanks to this line:
https://github.com/emscripten-core/emscripten/blob/a9b347bfcabda59a5edff60ee18b8a0ab70aa9dc/system/lib/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup.cpp#L15

But recent refactoring efforts including
llvm/llvm-project#73032,
llvm/llvm-project#73192, and
llvm/llvm-project#73193 moved many parts out of
that file to be built by a specific platform (e.g. fushia) and made the
file build unconditionally. I tried to add back `#if
SANITIZER_SYMBOLIZER_MARKUP` to that file but then this line
(https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/compiler-rt/lib/sanitizer_common/sanitizer_stacktrace_printer.cpp#L67)
breaks with an undefined referenc error. So just build the library with
`-Wno-format`, which is also what the upstream is effectively doing,
seems a simpler solution.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants