-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[lldb] improve the heuristics for checking if a terminal supports Unicode #171491
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[lldb] improve the heuristics for checking if a terminal supports Unicode #171491
Conversation
|
@llvm/pr-subscribers-lldb Author: Charles Zablit (charles-zablit) ChangesThis patch improves the way lldb checks if the terminal it's opened in (if any) supports Unicode or not. On POSIX systems, we check if On Windows, we always return This is a relanding of #168603. Full diff: https://github.com/llvm/llvm-project/pull/171491.diff 5 Files Affected:
diff --git a/lldb/include/lldb/Host/Terminal.h b/lldb/include/lldb/Host/Terminal.h
index da0d05e8bd265..3d66515c18812 100644
--- a/lldb/include/lldb/Host/Terminal.h
+++ b/lldb/include/lldb/Host/Terminal.h
@@ -68,6 +68,18 @@ class Terminal {
llvm::Error SetHardwareFlowControl(bool enabled);
+ /// Returns whether or not the current terminal supports Unicode rendering.
+ ///
+ /// The value is cached after the first computation.
+ ///
+ /// On POSIX systems, we check if the LANG environment variable contains the
+ /// substring "UTF-8", case insensitive.
+ ///
+ /// On Windows, we always return true since we use the `WriteConsoleW` API
+ /// internally. Note that the default Windows codepage (437) does not support
+ /// all Unicode characters. This function does not check the codepage.
+ static bool SupportsUnicode();
+
protected:
struct Data;
diff --git a/lldb/include/lldb/Host/common/DiagnosticsRendering.h b/lldb/include/lldb/Host/common/DiagnosticsRendering.h
index dd33d671c24a5..3eea0647da37e 100644
--- a/lldb/include/lldb/Host/common/DiagnosticsRendering.h
+++ b/lldb/include/lldb/Host/common/DiagnosticsRendering.h
@@ -59,10 +59,27 @@ struct DiagnosticDetail {
StructuredData::ObjectSP Serialize(llvm::ArrayRef<DiagnosticDetail> details);
+/// Renders an array of DiagnosticDetail instances.
+///
+/// \param[in] stream
+/// The stream to render the diagnostics to.
+/// \param offset_in_command
+/// An optional offset to the column position of the diagnostic in the
+/// source.
+/// \param show_inline
+/// Whether to show the diagnostics inline.
+/// \param details
+/// The array of DiagnosticsDetail to render.
+/// \param force_ascii
+/// Whether to force ascii rendering. If false, Unicode characters will be
+/// used if the output file supports them.
+///
+/// \see lldb_private::Terminal::SupportsUnicode
void RenderDiagnosticDetails(Stream &stream,
std::optional<uint16_t> offset_in_command,
bool show_inline,
- llvm::ArrayRef<DiagnosticDetail> details);
+ llvm::ArrayRef<DiagnosticDetail> details,
+ bool force_ascii = false);
class DiagnosticError
: public llvm::ErrorInfo<DiagnosticError, CloneableECError> {
diff --git a/lldb/source/Host/common/DiagnosticsRendering.cpp b/lldb/source/Host/common/DiagnosticsRendering.cpp
index f2cd3968967fb..2c9d33a6c325c 100644
--- a/lldb/source/Host/common/DiagnosticsRendering.cpp
+++ b/lldb/source/Host/common/DiagnosticsRendering.cpp
@@ -7,6 +7,8 @@
//===----------------------------------------------------------------------===//
#include "lldb/Host/common/DiagnosticsRendering.h"
+#include "lldb/Host/Terminal.h"
+
#include <cstdint>
using namespace lldb_private;
@@ -85,7 +87,8 @@ static llvm::raw_ostream &PrintSeverity(Stream &stream,
void RenderDiagnosticDetails(Stream &stream,
std::optional<uint16_t> offset_in_command,
bool show_inline,
- llvm::ArrayRef<DiagnosticDetail> details) {
+ llvm::ArrayRef<DiagnosticDetail> details,
+ bool force_ascii) {
if (details.empty())
return;
@@ -97,12 +100,8 @@ void RenderDiagnosticDetails(Stream &stream,
return;
}
- // Since there is no other way to find this out, use the color
- // attribute as a proxy for whether the terminal supports Unicode
- // characters. In the future it might make sense to move this into
- // Host so it can be customized for a specific platform.
llvm::StringRef cursor, underline, vbar, joint, hbar, spacer;
- if (stream.AsRawOstream().colors_enabled()) {
+ if (Terminal::SupportsUnicode() && !force_ascii) {
cursor = "˄";
underline = "˜";
vbar = "│";
diff --git a/lldb/source/Host/common/Terminal.cpp b/lldb/source/Host/common/Terminal.cpp
index 436dfd8130d9b..dd1dc75133f45 100644
--- a/lldb/source/Host/common/Terminal.cpp
+++ b/lldb/source/Host/common/Terminal.cpp
@@ -400,6 +400,21 @@ llvm::Error Terminal::SetHardwareFlowControl(bool enabled) {
#endif // LLDB_ENABLE_TERMIOS
}
+bool Terminal::SupportsUnicode() {
+ static std::optional<bool> result;
+ if (result)
+ return result.value();
+#ifdef _WIN32
+ return true;
+#else
+ const char *lang_var = std::getenv("LANG");
+ if (!lang_var)
+ return false;
+ result = llvm::StringRef(lang_var).lower().find("utf-8") != std::string::npos;
+#endif
+ return result.value();
+}
+
TerminalState::TerminalState(Terminal term, bool save_process_group)
: m_tty(term) {
Save(term, save_process_group);
diff --git a/lldb/unittests/Host/common/DiagnosticsRenderingTest.cpp b/lldb/unittests/Host/common/DiagnosticsRenderingTest.cpp
index 851b478def32e..78fc52ae11ad1 100644
--- a/lldb/unittests/Host/common/DiagnosticsRenderingTest.cpp
+++ b/lldb/unittests/Host/common/DiagnosticsRenderingTest.cpp
@@ -10,7 +10,7 @@ class ErrorDisplayTest : public ::testing::Test {};
std::string Render(std::vector<DiagnosticDetail> details) {
StreamString stream;
- RenderDiagnosticDetails(stream, 0, true, details);
+ RenderDiagnosticDetails(stream, 0, true, details, true);
return stream.GetData();
}
} // namespace
|
JDevlieghere
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhat unfortunate to have to have an argument just for the test, but every alternative I can think of is objectively worse.
| std::string Render(std::vector<DiagnosticDetail> details) { | ||
| StreamString stream; | ||
| RenderDiagnosticDetails(stream, 0, true, details); | ||
| RenderDiagnosticDetails(stream, 0, true, details, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| RenderDiagnosticDetails(stream, 0, true, details, true); | |
| RenderDiagnosticDetails(stream, 0, true, details, /*force_ascii=*/true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks!
lldb/source/Host/common/Terminal.cpp
Outdated
| } | ||
|
|
||
| bool Terminal::SupportsUnicode() { | ||
| static std::optional<bool> result; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LLDB coding style would be g_result
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks!
The only other reasonable thing I could think of was to have an if statement in the test and expecting one output or the other depending on whether unicode is supported or not. |
|
It seems a number of lldb-shell tests failed on Windows hosts after this patch |
|
I opened a fix here: #171685 |
This patch fixes issues introduced by #171491 when running tests in CI. The shell tests expect certain characters when matching diagnostics. With #171491, those characters can either be Unicode specific characters or their ASCII equivalent. The tests were always expecting the ASCII version. This patch fixes this by using a regex to match one or the other.
…#171685) This patch fixes issues introduced by llvm/llvm-project#171491 when running tests in CI. The shell tests expect certain characters when matching diagnostics. With llvm/llvm-project#171491, those characters can either be Unicode specific characters or their ASCII equivalent. The tests were always expecting the ASCII version. This patch fixes this by using a regex to match one or the other.
…orts Unicode (llvm#171491)" This reverts commit 8445909.
This patch improves the way lldb checks if the terminal it's opened in (if any) supports Unicode or not.
On POSIX systems, we check if
LANGcontainsUTF-8.On Windows, we always return
truesince we use theWriteToConsoleWapi.This is a relanding of #168603.
The tests failed because the bots support Unicode but the tests expect ASCII. To avoid different outputs depending on the environment the tests are running in, this patch always force ASCII in the tests.