Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Windows] Ignore case optionally in AXPlatformNodeTextRangeProviderWin::FindText #39922

Merged
merged 9 commits into from Apr 11, 2023

Conversation

yaakovschectman
Copy link
Contributor

When ignore_case is true, convert needle and haystack strings to lowercase before performing search,

flutter/flutter#117013

Pre-launch Checklist

  • I read the Contributor Guide and followed the process outlined there for submitting PRs.
  • I read the Tree Hygiene wiki page, which explains my responsibilities.
  • I read and followed the Flutter Style Guide and the C++, Objective-C, Java style guides.
  • I listed at least one issue that this PR fixes in the description above.
  • I added new tests to check the change I am making or feature I am adding, or Hixie said the PR is test-exempt. See testing the engine for instructions on writing and running engine tests.
  • I updated/added relevant documentation (doc comments with ///).
  • I signed the CLA.
  • All existing and new tests are passing.

If you need help, consider asking for advice on the #hackers-new channel on Discord.

Comment on lines 436 to 438
static bool StringSearch(const std::u16string& search_string,
const std::u16string& find_in,
static bool StringSearch(std::u16string& search_string,
std::u16string& find_in,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please let me know if you can think of a better way to allow us to optionally copy the needle and haystack strings and then continue with the rest of the method. I attempted using string_view, but the problem with it is that if we create new copies of the strings within the if statement to convert to lowercase and assign to the views, the underlying memory is freed when the if statement completes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than us trying to implement a case-insensitive compare, I wonder if we shouldn't just delegate to underlying Win32/macOS APIs. The idea of "case" is really finnicky -- e.g. Chinese, Japanese, Korean don't even have a concept of case, and even in French there have historically been differences between France and Canada's handling of accented characters (though IIRC France now matches Canada's handling), Turkish dotted vs non-dotted 'i' etc.

Does Chromium implement case comparison itself?

If we are going to do it ourselves, we should pass the string by value rather than mutate an incoming string.

Copy link
Member

@cbracken cbracken Feb 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Windows, I see lstrcmpiW which calls out to CompareStringEx, and looks like it does what we'd want.

Copy link
Member

@cbracken cbracken Feb 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The notes on lstrcmpiW suggest some of the horrors intrinsic in implementing this sort of thing on our own, e.g. should ignore case be interpreted as ignore kana type in Japan? According to that function's implementation, apparently the answer is yes. e.g. か (hiragana) and カ (katakana) are each the phonetic "ka" sound, just expressed using two different phonetic alphabets.

In the case where the user is requesting a case-insensitive compare for some reason, we probably want to be consistent with what the rest of the system is doing, so I'd be tempted to use a system API for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These functions compare strings, and StringSearch needs to search for a substring. Chromium appears to have an entire i18n library which provides a StringSearch method (see https://github.com/chromium/chromium/blob/main/base/i18n/string_search.h#L44 ), and furthermore on a 3rd party icu library.

Copy link
Member

@cbracken cbracken Feb 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at their implementation, it looks like they're building on top of ICU and doing things like accent handling etc. As I say, I'd be tempted to avoid trying to recreate ICU/Win32 APIs here - between just languages I'm familiar with (French and Japanese) there are already difficulties, but I imagine a huge ton more in languages that I have no idea about.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw their search implementation does look very standalone and we could probably simply adopt it as an option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are you looking at as being standalone? It appears that StringSearch creates and calls to FixedPatternStringSearch, which makes use of at least UStringSearch and UCollator from ICU

Copy link
Member

@cbracken cbracken Feb 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, I was looking at string_search.h,cc which depend only on ICU (which we already depend on in Flutter) and a few FML-like utilities (which should be easy to replace).

@yaakovschectman yaakovschectman marked this pull request as ready for review February 27, 2023 21:39
@yaakovschectman yaakovschectman force-pushed the TextUnitFormat branch 2 times, most recently from 2a7732a to 67269ba Compare February 28, 2023 17:28
EXPECT_UIA_FIND_TEXT(range, L"more text", false, owner);
// EXPECT_UIA_FIND_TEXT(range, L"MoRe TeXt", true, owner);
EXPECT_UIA_FIND_TEXT(range, L"MoRe TeXt", true, owner);
EXPECT_UIA_FIND_TEXT(range, L"more", false, owner);
Copy link
Member

@cbracken cbracken Feb 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we adopt Chromium's ICU-based StringSearch implementation, we should add further tests like checking that eleve is found in the string élève and (if they also do kana-type insensitive comparison, which I don't see any reference to, but I'm not an ICU expert) that カタカナ is found in the string かたかな.

These are nice-to-have for the case where you're searching for a particular piece of text in a document. For example, here's me searching the wiki page for katakana with the string カタカナ ("katakana" written using katakana phonetics) but matching both that string and かたかな ("katakana" written using hiragana phonetics) in the same page:
image

@yaakovschectman
Copy link
Contributor Author

I expect the current test to fail, I just want to check before messing with the BUILD files more.

@flutter-dashboard
Copy link

Golden file changes have been found for this pull request. Click here to view and triage (e.g. because this is an intentional change).

If you are still iterating on this change and are not ready to resolve the images on the Flutter Gold dashboard, consider marking this PR as a draft pull request above. You will still be able to view image results on the dashboard, commenting will be silenced, and the check will not try to resolve itself until marked ready for review.

Changes reported for pull request #39922 at sha ec9de30

@flutter-dashboard
Copy link

Golden file changes are available for triage from new commit, Click here to view.

Changes reported for pull request #39922 at sha db91100

@yaakovschectman
Copy link
Contributor Author

@cbracken Now that we are using fuller ICU data on desktop, can you give this PR a look?

Copy link
Member

@cbracken cbracken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great - this was a painful one! Thanks for fixing!

@yaakovschectman yaakovschectman merged commit 8d57e6f into flutter:main Apr 11, 2023
36 of 37 checks passed
@yaakovschectman yaakovschectman deleted the TextUnitFormat branch April 11, 2023 18:55
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Apr 11, 2023
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Apr 11, 2023
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Apr 11, 2023
zhongwuzw pushed a commit to zhongwuzw/engine that referenced this pull request Apr 14, 2023
…in::FindText` (flutter#39922)

When `ignore_case` is `true`, convert needle and haystack strings to
lowercase before performing search,

flutter/flutter#117013

## Pre-launch Checklist

- [x] I read the [Contributor Guide] and followed the process outlined
there for submitting PRs.
- [x] I read the [Tree Hygiene] wiki page, which explains my
responsibilities.
- [x] I read and followed the [Flutter Style Guide] and the [C++,
Objective-C, Java style guides].
- [x] I listed at least one issue that this PR fixes in the description
above.
- [x] I added new tests to check the change I am making or feature I am
adding, or Hixie said the PR is test-exempt. See [testing the engine]
for instructions on writing and running engine tests.
- [x] I updated/added relevant documentation (doc comments with `///`).
- [ ] I signed the [CLA].
- [x] All existing and new tests are passing.

If you need help, consider asking for advice on the #hackers-new channel
on [Discord].

<!-- Links -->
[Contributor Guide]:
https://github.com/flutter/flutter/wiki/Tree-hygiene#overview
[Tree Hygiene]: https://github.com/flutter/flutter/wiki/Tree-hygiene
[Flutter Style Guide]:
https://github.com/flutter/flutter/wiki/Style-guide-for-Flutter-repo
[C++, Objective-C, Java style guides]:
https://github.com/flutter/engine/blob/main/CONTRIBUTING.md#style
[testing the engine]:
https://github.com/flutter/flutter/wiki/Testing-the-engine
[CLA]: https://cla.developers.google.com/
[flutter/tests]: https://github.com/flutter/tests
[breaking change policy]:
https://github.com/flutter/flutter/wiki/Tree-hygiene#handling-breaking-changes
[Discord]: https://github.com/flutter/flutter/wiki/Chat
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
2 participants