Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for paste filtering and bracketed paste mode #9034

Merged
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
3 changes: 3 additions & 0 deletions .github/actions/spelling/expect/expect.txt
Expand Up @@ -149,6 +149,7 @@ bgidx
Bgk
BGR
BGRA
bh
BHID
biblioscape
bigobj
Expand All @@ -162,6 +163,7 @@ BITOPERATION
bitsavers
bitset
BKCOLOR
bke
skyline75489 marked this conversation as resolved.
Show resolved Hide resolved
BKGND
Bksp
blog
Expand Down Expand Up @@ -1281,6 +1283,7 @@ Loewen
LOGFONT
LOGFONTW
logissue
lol
lowercased
loword
lparam
Expand Down
60 changes: 3 additions & 57 deletions src/cascadia/TerminalControl/TermControl.cpp
Expand Up @@ -11,6 +11,7 @@
#include <WinUser.h>
#include <LibraryResources.h>
#include "../../types/inc/GlyphWidth.hpp"
#include "../../types/inc/Utils.hpp"

#include "TermControl.g.cpp"
#include "TermControlAutomationPeer.h"
Expand Down Expand Up @@ -2033,65 +2034,10 @@ namespace winrt::Microsoft::Terminal::TerminalControl::implementation

// Method Description:
// - Pre-process text pasted (presumably from the clipboard)
// before sending it over the terminal's connection, converting
// Windows-space \r\n line-endings to \r line-endings
// - Also converts \n line-endings to \r line-endings
// before sending it over the terminal's connection.
void TermControl::_SendPastedTextToConnection(const std::wstring& wstr)
{
// Some notes on this implementation:
//
// - std::regex can do this in a single line, but is somewhat
// overkill for a simple search/replace operation (and its
// performance guarantees aren't exactly stellar)
// - The STL doesn't have a simple string search/replace method.
// This fact is lamentable.
// - We search for \n, and when we find it we copy the string up to
// the \n (but not including it). Then, we check the if the
// previous character is \r, if its not, then we had a lone \n
// and so we append our own \r

std::wstring stripped;
stripped.reserve(wstr.length());

std::wstring::size_type pos = 0;
std::wstring::size_type begin = 0;

while ((pos = wstr.find(L"\n", pos)) != std::wstring::npos)
{
// copy up to but not including the \n
stripped.append(wstr.cbegin() + begin, wstr.cbegin() + pos);
if (!(pos > 0 && (wstr.at(pos - 1) == L'\r')))
{
// there was no \r before the \n we did not copy,
// so append our own \r (this effectively replaces the \n
// with a \r)
stripped.push_back(L'\r');
}
++pos;
begin = pos;
}

// If we entered the while loop even once, begin would be non-zero
// (because we set begin = pos right after incrementing pos)
// So, if begin is still zero at this point it means we never found a newline
// and we can just write the original string
if (begin == 0)
{
_connection.WriteInput(wstr);
}
else
{
// copy over the part after the last \n
stripped.append(wstr.cbegin() + begin, wstr.cend());

// we may have removed some characters, so we may not need as much space
// as we reserved earlier
stripped.shrink_to_fit();
_connection.WriteInput(stripped);
}

_terminal->ClearSelection();
_terminal->TrySnapOnInput();
_terminal->WritePastedText(wstr);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very satisfying 🎉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad! 😁

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to keep the other terminal operations in this method to make WritePastedText more lightweight and decoupling from non-paste-related things. WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure! I was wondering if the other two operations are something the Core should handle too.

@zadjii-msft if we want Core to be reusable, should it be given more responsibilities for figuring out when to clear the selection and snap on input? I feel like it should -- so maybe we should move the two lines under WritePastedText into WritePastedText. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I agree, that should probably move down below too. That makes sense. Doesn't make sense to block this PR over it though, IMO. We could probably just get away with

Suggested change
_terminal->WritePastedText(wstr);
// TODO:projects/5 The calls to ClearSelection and TrySnapOnInput below could probably get moved inside Terminal::WritePastedText.
_terminal->WritePastedText(wstr);

and have me come back to it 😛

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh the bot automerged it. Whoops. I'm sure this whole area's about to be torn up anyways

}

// Method Description:
Expand Down
21 changes: 21 additions & 0 deletions src/cascadia/TerminalCore/Terminal.cpp
Expand Up @@ -397,6 +397,27 @@ void Terminal::Write(std::wstring_view stringView)
_stateMachine->ProcessString(stringView);
}

void Terminal::WritePastedText(std::wstring_view stringView)
{
auto option = ::Microsoft::Console::Utils::FilterOption::CarriageReturnNewline |
::Microsoft::Console::Utils::FilterOption::ControlCodes;

std::wstring filtered = ::Microsoft::Console::Utils::FilterStringForPaste(stringView, option);
if (IsXtermBracketedPasteModeEnabled())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here's an idea. Maybe to handle Jantari's case (since they want to paste literal control sequences), we can strip control codes only when bracketed paste mode is enabled.

@j4james, do you think that's a reasonable compromise? It solves the case where a clipboard payload can break out of bracketed paste and doesn't make unguarded paste any worse.

At the same time, I still can't see the value in pasting raw control codes here today in Common Era 2021, so...

The only time I ever did it was when I was actively trying to blow up bracketed paste in xterm/rxvt.

Otherwise, that's what ^V is for.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, my vote is let's strip them always.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to just strip them always, too. Also I don't like the idea of coupling "control codes filtering" and "bracketed paste mode". This is what @j4james told me specifically not to do in #7508 (comment)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, I'm in agreement with you guys. I think it's preferable that they're always filtered.

{
filtered.insert(0, L"\x1b[200~");
filtered.append(L"\x1b[201~");
}

if (_pfnWriteInput)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we don't need to do any of this work if there is no _pfnWriteInput ! 😁

{
_pfnWriteInput(filtered);
}

ClearSelection();
TrySnapOnInput();
}

// Method Description:
// - Attempts to snap to the bottom of the buffer, if SnapOnInput is true. Does
// nothing if SnapOnInput is set to false, or we're already at the bottom of
Expand Down
3 changes: 3 additions & 0 deletions src/cascadia/TerminalCore/Terminal.hpp
Expand Up @@ -66,6 +66,9 @@ class Microsoft::Terminal::Core::Terminal final :
// Write goes through the parser
void Write(std::wstring_view stringView);

// WritePastedText goes directly to the connection
void WritePastedText(std::wstring_view stringView);

[[nodiscard]] std::shared_lock<std::shared_mutex> LockForReading();
[[nodiscard]] std::unique_lock<std::shared_mutex> LockForWriting();

Expand Down
13 changes: 13 additions & 0 deletions src/types/inc/utils.hpp
Expand Up @@ -52,6 +52,19 @@ namespace Microsoft::Console::Utils
bool StringToUint(const std::wstring_view wstr, unsigned int& value);
std::vector<std::wstring_view> SplitString(const std::wstring_view wstr, const wchar_t delimiter) noexcept;

enum FilterOption
{
None = 0,
// Convert CR+LF and LF-only line endings to CR-only.
CarriageReturnNewline = 1u << 0,
// For security reasons, remove most control characters.
ControlCodes = 1u << 1,
};

DEFINE_ENUM_FLAG_OPERATORS(FilterOption)

std::wstring FilterStringForPaste(const std::wstring_view wstr, const FilterOption option);

constexpr uint16_t EndianSwap(uint16_t value)
{
return (value & 0xFF00) >> 8 |
Expand Down
75 changes: 75 additions & 0 deletions src/types/ut_types/UtilsTests.cpp
Expand Up @@ -23,6 +23,7 @@ class UtilsTests
TEST_METHOD(TestSwapColorPalette);
TEST_METHOD(TestGuidToString);
TEST_METHOD(TestSplitString);
TEST_METHOD(TestFilterStringForPaste);
TEST_METHOD(TestStringToUint);
TEST_METHOD(TestColorFromXTermColor);

Expand Down Expand Up @@ -131,6 +132,80 @@ void UtilsTests::TestSplitString()
VERIFY_ARE_EQUAL(L"789", result.at(2));
}

void UtilsTests::TestFilterStringForPaste()
{
// Test carriage return
const std::wstring noNewLine = L"Hello World";
VERIFY_ARE_EQUAL(L"Hello World", FilterStringForPaste(noNewLine, FilterOption::CarriageReturnNewline));

const std::wstring singleCR = L"Hello World\r";
VERIFY_ARE_EQUAL(L"Hello World\r", FilterStringForPaste(singleCR, FilterOption::CarriageReturnNewline));

const std::wstring singleLF = L"Hello World\n";
VERIFY_ARE_EQUAL(L"Hello World\r", FilterStringForPaste(singleLF, FilterOption::CarriageReturnNewline));

const std::wstring singleCRLF = L"Hello World\r\n";
VERIFY_ARE_EQUAL(L"Hello World\r", FilterStringForPaste(singleCRLF, FilterOption::CarriageReturnNewline));

const std::wstring multiCR = L"Hello\rWorld\r";
VERIFY_ARE_EQUAL(L"Hello\rWorld\r", FilterStringForPaste(multiCR, FilterOption::CarriageReturnNewline));

const std::wstring multiLF = L"Hello\nWorld\n";
VERIFY_ARE_EQUAL(L"Hello\rWorld\r", FilterStringForPaste(multiLF, FilterOption::CarriageReturnNewline));

const std::wstring multiCRLF = L"Hello\r\nWorld\r\n";
VERIFY_ARE_EQUAL(L"Hello\rWorld\r", FilterStringForPaste(multiCRLF, FilterOption::CarriageReturnNewline));

const std::wstring multiCR_NoNewLine = L"Hello\rWorld\r123";
VERIFY_ARE_EQUAL(L"Hello\rWorld\r123", FilterStringForPaste(multiCR_NoNewLine, FilterOption::CarriageReturnNewline));

const std::wstring multiLF_NoNewLine = L"Hello\nWorld\n123";
VERIFY_ARE_EQUAL(L"Hello\rWorld\r123", FilterStringForPaste(multiLF_NoNewLine, FilterOption::CarriageReturnNewline));

const std::wstring multiCRLF_NoNewLine = L"Hello\r\nWorld\r\n123";
VERIFY_ARE_EQUAL(L"Hello\rWorld\r123", FilterStringForPaste(multiCRLF_NoNewLine, FilterOption::CarriageReturnNewline));

// Test control code filtering
const std::wstring noNewLineWithControlCodes = L"Hello\x01\x02\x03 123";
VERIFY_ARE_EQUAL(L"Hello 123", FilterStringForPaste(noNewLineWithControlCodes, FilterOption::ControlCodes));

const std::wstring singleCRWithControlCodes = L"Hello World\r\x01\x02\x03 123";
VERIFY_ARE_EQUAL(L"Hello World\r 123", FilterStringForPaste(singleCRWithControlCodes, FilterOption::ControlCodes));

const std::wstring singleLFWithControlCodes = L"Hello World\n\x01\x02\x03 123";
VERIFY_ARE_EQUAL(L"Hello World\n 123", FilterStringForPaste(singleLFWithControlCodes, FilterOption::ControlCodes));

const std::wstring singleCRLFWithControlCodes = L"Hello World\r\n\x01\x02\x03 123";
VERIFY_ARE_EQUAL(L"Hello World\r\n 123", FilterStringForPaste(singleCRLFWithControlCodes, FilterOption::ControlCodes));

VERIFY_ARE_EQUAL(L"Hello World\r 123", FilterStringForPaste(singleCRWithControlCodes, FilterOption::CarriageReturnNewline | FilterOption::ControlCodes));
VERIFY_ARE_EQUAL(L"Hello World\r 123", FilterStringForPaste(singleLFWithControlCodes, FilterOption::CarriageReturnNewline | FilterOption::ControlCodes));
VERIFY_ARE_EQUAL(L"Hello World\r 123", FilterStringForPaste(singleCRLFWithControlCodes, FilterOption::CarriageReturnNewline | FilterOption::ControlCodes));

const std::wstring multiCRWithControlCodes = L"Hello\r\x01\x02\x03World\r\x01\x02\x03 123";
VERIFY_ARE_EQUAL(L"Hello\rWorld\r 123", FilterStringForPaste(multiCRWithControlCodes, FilterOption::ControlCodes));

const std::wstring multiLFWithControlCodes = L"Hello\n\x01\x02\x03World\n\x01\x02\x03 123";
VERIFY_ARE_EQUAL(L"Hello\nWorld\n 123", FilterStringForPaste(multiLFWithControlCodes, FilterOption::ControlCodes));

const std::wstring multiCRLFWithControlCodes = L"Hello\r\nWorld\r\n\x01\x02\x03 123";
VERIFY_ARE_EQUAL(L"Hello\r\nWorld\r\n 123", FilterStringForPaste(multiCRLFWithControlCodes, FilterOption::ControlCodes));

VERIFY_ARE_EQUAL(L"Hello\rWorld\r 123", FilterStringForPaste(multiCRWithControlCodes, FilterOption::CarriageReturnNewline | FilterOption::ControlCodes));
VERIFY_ARE_EQUAL(L"Hello\rWorld\r 123", FilterStringForPaste(multiLFWithControlCodes, FilterOption::CarriageReturnNewline | FilterOption::ControlCodes));
VERIFY_ARE_EQUAL(L"Hello\rWorld\r 123", FilterStringForPaste(multiCRLFWithControlCodes, FilterOption::CarriageReturnNewline | FilterOption::ControlCodes));

const std::wstring multiLineWithLotsOfControlCodes = L"e\bc\bh\bo\b \b'.\b!\b:\b\b \bke\bS\b \bi3\bl \bld\bK\bo\b -1\b+\b9 +\b2\b-1'\b >\b \b/\bt\bm\bp\b/\bl\bo\bl\b\r\nsleep 1\r\nmd5sum /tmp/lol";

VERIFY_ARE_EQUAL(L"echo '.!: keS i3l ldKo -1+9 +2-1' > /tmp/lol\rsleep 1\rmd5sum /tmp/lol",
FilterStringForPaste(multiLineWithLotsOfControlCodes, FilterOption::CarriageReturnNewline | FilterOption::ControlCodes));

// Test Unicode content
const std::wstring unicodeString = L"你好\r\n\x01世界\x02\r\n123";
VERIFY_ARE_EQUAL(L"你好\r世界\r123",
FilterStringForPaste(unicodeString, FilterOption::CarriageReturnNewline | FilterOption::ControlCodes));
}

void UtilsTests::TestStringToUint()
{
bool success = false;
Expand Down
84 changes: 84 additions & 0 deletions src/types/utils.cpp
Expand Up @@ -428,6 +428,90 @@ catch (...)
return {};
}

// Routine Description:
// - Pre-process text pasted (presumably from the clipboard) with provided option.
// Arguments:
// - wstr - String to process.
// - option - option to use.
// Return Value:
// - The result string.
std::wstring Utils::FilterStringForPaste(const std::wstring_view wstr, const FilterOption option)
{
std::wstring filtered;
filtered.reserve(wstr.length());

const auto isControlCode = [](wchar_t c) {
if (c >= L'\x20' && c <= L'\x7f')
{
// Printable ASCII characters + DEL.
return false;
}
skyline75489 marked this conversation as resolved.
Show resolved Hide resolved

if (c > L'\x7f')
{
// Not a control code.
return false;
}
skyline75489 marked this conversation as resolved.
Show resolved Hide resolved

// All ASCII control codes will be removed except HT(0x09), LF(0x0a),
// CR(0x0d) and DEL(0x7F).
return c >= L'\x00' && c <= L'\x08' ||
c >= L'\x0b' && c <= L'\x0c' ||
c >= L'\x0e' && c <= L'\x1f';
};

std::wstring::size_type pos = 0;
std::wstring::size_type begin = 0;

while (pos < wstr.size())
skyline75489 marked this conversation as resolved.
Show resolved Hide resolved
{
const wchar_t c = wstr.at(pos);

if (WI_IsFlagSet(option, FilterOption::CarriageReturnNewline) && c == L'\n')
{
// copy up to but not including the \n
filtered.append(wstr.cbegin() + begin, wstr.cbegin() + pos);
if (!(pos > 0 && (wstr.at(pos - 1) == L'\r')))
{
// there was no \r before the \n we did not copy,
// so append our own \r (this effectively replaces the \n
// with a \r)
filtered.push_back(L'\r');
}
++pos;
begin = pos;
}
else if (WI_IsFlagSet(option, FilterOption::ControlCodes) && isControlCode(c))
{
// copy up to but not including the control code
filtered.append(wstr.cbegin() + begin, wstr.cbegin() + pos);
++pos;
begin = pos;
}
else
{
++pos;
}
}

// If we entered the while loop even once, begin would be non-zero
// (because we set begin = pos right after incrementing pos)
// So, if begin is still zero at this point it means we never found a newline
// and we can just write the original string
if (begin == 0)
{
return std::wstring(wstr);
skyline75489 marked this conversation as resolved.
Show resolved Hide resolved
}
else
{
filtered.append(wstr.cbegin() + begin, wstr.cend());
// we may have removed some characters, so we may not need as much space
// as we reserved earlier
filtered.shrink_to_fit();
return filtered;
}
}

// Routine Description:
// - Shorthand check if a handle value is null or invalid.
// Arguments:
Expand Down