Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce VTInt to represent VT parameters #12207

Merged
2 commits merged into from
Apr 20, 2022
Merged

Introduce VTInt to represent VT parameters #12207

2 commits merged into from
Apr 20, 2022

Conversation

lhecker
Copy link
Member

@lhecker lhecker commented Jan 20, 2022

This commit replaces our use of size_t to represent VT parameters with
int32_t. While unsigned integers have the inherent benefit of being less
ambiguous and enjoying two's complement, our buffer coordinates use signed
integers. Since a number of VT functions need to convert their parameters
to coordinates, this commit makes the conversion easier.
The benefit of this change becomes even more apparent if one considers
that a number of places performed unsafe conversions
of their size_t parameters to int or short already.

Files that had to be modified were converted to use til
wrappers instead of COORD or SMALL_RECT wherever possible.

References

This commit contains about 20% of the work for #4015.

PR Checklist

  • I work here
  • Tests added/passed

Validation Steps Performed

I'm mostly relying on our unit tests here. Both OpenConsole and WT appear to work fine.

@lhecker lhecker force-pushed the dev/lhecker/issue-4015-vt branch 7 times, most recently from 64238ab to f5a50ef Compare January 21, 2022 17:53
@lhecker lhecker marked this pull request as ready for review January 21, 2022 19:32
@lhecker lhecker added Area-CodeHealth Issues related to code cleanliness, linting, rules, warnings, errors, static analysis, etc. Issue-Task It's a feature request, but it doesn't really need a major design. Product-Conhost For issues in the Console codebase Product-Terminal The new Windows Terminal. labels Jan 21, 2022
const short absoluteY = viewOrigin.Y + y;
COORD newPos{ absoluteX, absoluteY };
const til::point viewOrigin{ viewport.Origin() };
auto newPos = til::unwrap_coord(viewOrigin + pos);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit contains quite a number of explicit wrapping and unwrapping of COORD/SMALL_RECT into til::point/til::rect, or explicit narrowings into SHORTs, because only a part of the project was converted to the til helpers. In a future PR these changes will be undone, since barely anything will use COORD/SMALL_RECT then.

@@ -12,6 +12,58 @@

namespace til // Terminal Implementation Library. Also: "Today I Learned"
{
struct inclusive_rect
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This struct is the new replacement for SMALL_RECT. It uses the same underlying til::CoordType as til::rect, but represents a rectangle with inclusive coordinates.
The benefit is a simplified conversion between inclusive_rect and rect without overflow checking.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was asked whether this struct is still part of the PR title (introducing VTInt). I'd argue: Yes.
The goal of VTInt is to turn size_t -> int narrowings into primitive int32_t -> int32_t copies/casts. This requires us to replace any use of shorts in this code though, which includes any use of the SMALL_RECT struct. This is where inclusive_rect enters the scene.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This concerns me a little, because I know there are parts of conhost that definitely use SMALL_RECT with exclusive coordinates. It's tricky enough as it is not knowing whether a SMALL_RECT is inclusive or exclusive, but it'll be a whole lot worse if we're storing exclusive coordinates in a type named inclusive_rect.

Or are you dealing with this on a case by case basis, and only using til::rect for the exclusive cases? Sorry if this is dumb question - I've only really skimmed the PR.

Copy link
Member Author

@lhecker lhecker Mar 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've painstakingly replaced all inclusive SMALL_RECTs in our entire code base with til::rect as part of my til::rect-refactorization. This is one of the reasons I removed implicit constructors and conversion operators, as it helped me tremendously when detecting these misuses of SMALL_RECT and prevent any (well... most?) new bugs within my refactorization.

Basically...

Or are you dealing with this on a case by case basis, and only using til::rect for the exclusive cases?

-> Yes! Just like in this PR, I'll use this inclusive_rect only for actually inclusive coordinates in my follow-up PR as well. (I mean... I might've made mistakes, but I tried my absolute best to catch them all.)

In case I made mistakes I think the new code makes the situation only slightly worse, because SMALL_RECT is already publicly documented to be used for only inclusive coordinates.

// that is generally how SMALL_RECTs are handled in console code and via the APIs.
constexpr inclusive_rect to_inclusive_rect() const
{
return { left, top, right - 1, bottom - 1 };
Copy link
Member Author

@lhecker lhecker Jan 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conversion between rect and inclusive_rect performs no overflow checks.
I personally consider that an acceptable risk, since we simultaneously upgrade from 16 to 32 bit integers for all intermediate coordinate representations like this one, despite only accepting 16 bit coordinates from clients.
What about you all?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that sounds fine, but you should perhaps document that assumption in the comment.

@lhecker lhecker requested a review from miniksa January 25, 2022 15:16
src/terminal/adapter/DispatchTypes.hpp Outdated Show resolved Hide resolved
Comment on lines 132 to 133
// For numeric parameters, both 0 and omitted values will default to 1.
return has_value() && _value != 0 ? _value : 1;
return _value > 0 ? _value : 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there a reason for this change? I think both should compile to the same thing (unless maybe that's affected by the new type?), but the point of the original code was to make it clearer under what conditions the default would be applied.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I made this change is that I actually found it difficult/confusing to understand when what value is returned initially. I distinctly remember that I was debugging something and it took me a second to realize that has_value() && _value != 0 basically means _value > 0. Afterwards I found it easier to understand what breakpoints I need to set and how VT integer parameters are processed.
Truth be told it only took me a few seconds to understand, but still I find this new code a lot more obvious personally.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course I don't mind reverting this. 🙂

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to let a third party decide, but I specifically avoided using _value > 0 because I was trying to make the intent clearer (not successfully it seems). If we're going to keep the > 0 change, though, I think it at least needs an updated comment, because it makes no sense now without also explaining that we're using negative numbers to represent omitted values.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we need the comment at a minimum. I don't have a preference which way the code falls, but it needs to be clear negative numbers are intentional.

@@ -12,6 +12,58 @@

namespace til // Terminal Implementation Library. Also: "Today I Learned"
{
struct inclusive_rect
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This concerns me a little, because I know there are parts of conhost that definitely use SMALL_RECT with exclusive coordinates. It's tricky enough as it is not knowing whether a SMALL_RECT is inclusive or exclusive, but it'll be a whole lot worse if we're storing exclusive coordinates in a type named inclusive_rect.

Or are you dealing with this on a case by case basis, and only using til::rect for the exclusive cases? Sorry if this is dumb question - I've only really skimmed the PR.

@lhecker
Copy link
Member Author

lhecker commented Apr 19, 2022

I've managed to merge all 220 commits to main since March, but it's practically impossible to do a proper job at actually adapting the PR to the new code.
I'll thus rebase this PR so that I can modify the entire diff as a whole and make more pin-point changes.

@lhecker
Copy link
Member Author

lhecker commented Apr 19, 2022

Before:

38 files changed, 718 insertions(+), 663 deletions(-)

After:

36 files changed, 573 insertions(+), 490 deletions(-)

That's at least something!

@github-actions
Copy link

github-actions bot commented Apr 19, 2022

@check-spelling-bot Report

Unrecognized words, please review:

  • ommitted
Previously acknowledged words that are now absent azurewebsites Consolescreen css cxcy DCompile debolden deconstructed devicefamily GETKEYSTATE guardxfg LLVM LPCHARSETINFO MAPVIRTUALKEY MSDL ned newcursor NOWAIT pgorepro pgort PGU redistributable Timeline timelines unintense UWA UWAs VKKEYSCAN WResult xfg
To accept these unrecognized words as correct (and remove the previously acknowledged and now absent words), run the following commands

... in a clone of the git@github.com:microsoft/terminal.git repository
on the dev/lhecker/issue-4015-vt branch:

update_files() {
perl -e '
my @expect_files=qw('".github/actions/spelling/expect/alphabet.txt
.github/actions/spelling/expect/expect.txt
.github/actions/spelling/expect/web.txt"');
@ARGV=@expect_files;
my @stale=qw('"$patch_remove"');
my $re=join "|", @stale;
my $suffix=".".time();
my $previous="";
sub maybe_unlink { unlink($_[0]) if $_[0]; }
while (<>) {
if ($ARGV ne $old_argv) { maybe_unlink($previous); $previous="$ARGV$suffix"; rename($ARGV, $previous); open(ARGV_OUT, ">$ARGV"); select(ARGV_OUT); $old_argv = $ARGV; }
next if /^(?:$re)(?:(?:\r|\n)*$| .*)/; print;
}; maybe_unlink($previous);'
perl -e '
my $new_expect_file=".github/actions/spelling/expect/0da5bd77269f6060a38e6e075ce97a9b0e3e6991.txt";
use File::Path qw(make_path);
use File::Basename qw(dirname);
make_path (dirname($new_expect_file));
open FILE, q{<}, $new_expect_file; chomp(my @words = <FILE>); close FILE;
my @add=qw('"$patch_add"');
my %items; @items{@words} = @words x (1); @items{@add} = @add x (1);
@words = sort {lc($a)."-".$a cmp lc($b)."-".$b} keys %items;
open FILE, q{>}, $new_expect_file; for my $word (@words) { print FILE "$word\n" if $word =~ /\w/; };
close FILE;
system("git", "add", $new_expect_file);
'
}

comment_json=$(mktemp)
curl -L -s -S \
  --header "Content-Type: application/json" \
  "https://api.github.com/repos/microsoft/terminal/issues/comments/1103255802" > "$comment_json"
comment_body=$(mktemp)
jq -r .body < "$comment_json" > $comment_body
rm $comment_json

patch_remove=$(perl -ne 'next unless s{^</summary>(.*)</details>$}{$1}; print' < "$comment_body")
  

patch_add=$(perl -e '$/=undef;
$_=<>;
s{<details>.*}{}s;
s{^#.*}{};
s{\n##.*}{};
s{(?:^|\n)\s*\*}{}g;
s{\s+}{ }g;
print' < "$comment_body")
  
update_files
rm $comment_body
git add -u
✏️ Contributor please read this

By default the command suggestion will generate a file named based on your commit. That's generally ok as long as you add the file to your commit. Someone can reorganize it later.

⚠️ The command is written for posix shells. You can copy the contents of each perl command excluding the outer ' marks and dropping any '"/"' quotation mark pairs into a file and then run perl file.pl from the root of the repository to run the code. Alternatively, you can manually insert the items...

If the listed items are:

  • ... misspelled, then please correct them instead of using the command.
  • ... names, please add them to .github/actions/spelling/allow/names.txt.
  • ... APIs, you can add them to a file in .github/actions/spelling/allow/.
  • ... just things you're using, please add them to an appropriate file in .github/actions/spelling/expect/.
  • ... tokens you only need in one place and shouldn't generally be used, you can add an item in an appropriate file in .github/actions/spelling/patterns/.

See the README.md in each directory for more information.

🔬 You can test your commits without appending to a PR by creating a new branch with that extra change and pushing it to your fork. The check-spelling action will run in response to your push -- it doesn't require an open pull request. By using such a branch, you can limit the number of typos your peers see you make. 😉

🗜️ If you see a bunch of garbage

If it relates to a ...

well-formed pattern

See if there's a pattern that would match it.

If not, try writing one and adding it to a patterns/{file}.txt.

Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

Note that patterns can't match multiline strings.

binary-ish string

Please add a file path to the excludes.txt file instead of just accepting the garbage.

File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

Comment on lines +121 to +122
// A negative value indicates that the parameter was omitted.
return _value < 0 ? defaultValue : _value;
Copy link
Member Author

@lhecker lhecker Apr 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out that MSVC is really picky about how you write conditions. ô_O
Good: _value < 0 ? defaultValue : _value
Bad: _value >= 0 ? _value : defaultValue

In gcc/clang it produces the same assembly. Well, honestly, this change is so insanely trivial it doesn't really matter, but I was curious and here we are. As mentioned elsewhere in this PR I personally prefer this new code as it's immediately obvious what this line does (arithmetically) without having to check other functions. But... eh.

Comment on lines +441 to +443
const auto vtCoords = _winToVTCoord(position);
const auto encodedX = _encodeDefaultCoordinate(vtCoords.X);
const auto encodedY = _encodeDefaultCoordinate(vtCoords.Y);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 3 functions return int instead of short now, so we need to change the code below to use narrow casts.

@lhecker lhecker marked this pull request as ready for review April 19, 2022 23:34
Comment on lines +147 to +166
union
{
CoordType left = 0;
CoordType Left;
};
union
{
CoordType top = 0;
CoordType Top;
};
union
{
CoordType right = 0;
CoordType Right;
};
union
{
CoordType bottom = 0;
CoordType Bottom;
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is brilliant

Comment on lines +819 to +824
// NOTE: This will convert from INCLUSIVE on the way in because
// that is generally how SMALL_RECTs are handled in console code and via the APIs.
explicit constexpr rect(const inclusive_rect other) noexcept :
rect{ other.Left, other.Top, other.Right + 1, other.Bottom + 1 }
{
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yesssssssssssssssssssssssssss

@lhecker lhecker added the AutoMerge Marked for automatic merge by the bot when requirements are met label Apr 20, 2022
@ghost
Copy link

ghost commented Apr 20, 2022

Hello @lhecker!

Because this pull request has the AutoMerge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

@ghost ghost merged commit 7af134c into main Apr 20, 2022
@ghost ghost deleted the dev/lhecker/issue-4015-vt branch April 20, 2022 11:21
return { rect.X, rect.Y };
}

constexpr COORD unwrap_coord(const point rect)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I almost would have preferred wrap_COORD and unwrap_COORD so we knew they were talking about the type. Theoretically, they could mean any type of coord right now. Eh.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll only have about a few dozen calls to these wrap/unwrap functions throughout the entire project in the future in mostly just two files (propsheet / console API dispatcher).
So it should be easy to rename them in the future. I prefer the lowercase name personally for no specific reason apart from how it looks.


constexpr point wrap_coord(const COORD rect) noexcept
{
return { rect.X, rect.Y };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😢 you said we would migrate to lowercase in the future, but then your new code used uppercase!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm doing a batch rename of s/(\.|->)(X|Y|Left|Top|Right|Bottom)/$1\L$2/g throughout the entire project at the end, after I've migrated everything (\L toggles lowercasing in GNU sed for the rest of the expression).
In this particular case however I'm using uppercase X/Y because that's a COORD.


constexpr bool operator==(const inclusive_rect& rhs) const noexcept
{
return __builtin_memcmp(this, &rhs, sizeof(rhs)) == 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this build on ARM64?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep!

Comment on lines +125 to +140
template<typename T, typename = std::enable_if_t<std::is_enum_v<T>>>
constexpr operator T() const noexcept
{
// For most selective parameters, omitted values will default to 0.
return static_cast<T>(value_or(0));
}

constexpr operator size_t() const noexcept
constexpr operator VTInt() const noexcept
{
// For numeric parameters, both 0 and omitted values will default to 1.
return has_value() && _value != 0 ? _value : 1;
// The parameter is omitted if _value is less than 0.
return _value <= 0 ? 1 : _value;
}

private:
std::make_signed<size_t>::type _value;
VTInt _value;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the part that scares me -- I suspect, however, that @j4james authored tests making sure that conversions to enum vs integer work properly :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd agree with reverting these changes.
But generally, it used to convert intptr_t to enums whose underlying type is size_t, and the new code converts int to int, so we should be fine, if not "finer" than before.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I added any direct tests of the class itself, but it should at least have some coverage from its usage in code paths that do get tested.

Copy link
Member

@DHowett DHowett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Post-merge approval!

@ghost
Copy link

ghost commented May 24, 2022

🎉Windows Terminal Preview v1.14.143 has been released which incorporates this pull request.:tada:

Handy links:

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-CodeHealth Issues related to code cleanliness, linting, rules, warnings, errors, static analysis, etc. AutoMerge Marked for automatic merge by the bot when requirements are met Issue-Task It's a feature request, but it doesn't really need a major design. Product-Conhost For issues in the Console codebase Product-Terminal The new Windows Terminal.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants