New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Windows] Ignore case optionally in AXPlatformNodeTextRangeProviderWin::FindText
#39922
Conversation
static bool StringSearch(const std::u16string& search_string, | ||
const std::u16string& find_in, | ||
static bool StringSearch(std::u16string& search_string, | ||
std::u16string& find_in, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please let me know if you can think of a better way to allow us to optionally copy the needle and haystack strings and then continue with the rest of the method. I attempted using string_view
, but the problem with it is that if we create new copies of the strings within the if statement to convert to lowercase and assign to the views, the underlying memory is freed when the if statement completes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than us trying to implement a case-insensitive compare, I wonder if we shouldn't just delegate to underlying Win32/macOS APIs. The idea of "case" is really finnicky -- e.g. Chinese, Japanese, Korean don't even have a concept of case, and even in French there have historically been differences between France and Canada's handling of accented characters (though IIRC France now matches Canada's handling), Turkish dotted vs non-dotted 'i' etc.
Does Chromium implement case comparison itself?
If we are going to do it ourselves, we should pass the string by value rather than mutate an incoming string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On Windows, I see lstrcmpiW which calls out to CompareStringEx, and looks like it does what we'd want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The notes on lstrcmpiW
suggest some of the horrors intrinsic in implementing this sort of thing on our own, e.g. should ignore case be interpreted as ignore kana type in Japan? According to that function's implementation, apparently the answer is yes. e.g. か (hiragana) and カ (katakana) are each the phonetic "ka" sound, just expressed using two different phonetic alphabets.
In the case where the user is requesting a case-insensitive compare for some reason, we probably want to be consistent with what the rest of the system is doing, so I'd be tempted to use a system API for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These functions compare strings, and StringSearch
needs to search for a substring. Chromium appears to have an entire i18n
library which provides a StringSearch
method (see https://github.com/chromium/chromium/blob/main/base/i18n/string_search.h#L44 ), and furthermore on a 3rd party icu
library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at their implementation, it looks like they're building on top of ICU and doing things like accent handling etc. As I say, I'd be tempted to avoid trying to recreate ICU/Win32 APIs here - between just languages I'm familiar with (French and Japanese) there are already difficulties, but I imagine a huge ton more in languages that I have no idea about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fwiw their search implementation does look very standalone and we could probably simply adopt it as an option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are you looking at as being standalone? It appears that StringSearch
creates and calls to FixedPatternStringSearch
, which makes use of at least UStringSearch
and UCollator
from ICU
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, I was looking at string_search.h,cc which depend only on ICU (which we already depend on in Flutter) and a few FML-like utilities (which should be easy to replace).
2a7732a
to
67269ba
Compare
EXPECT_UIA_FIND_TEXT(range, L"more text", false, owner); | ||
// EXPECT_UIA_FIND_TEXT(range, L"MoRe TeXt", true, owner); | ||
EXPECT_UIA_FIND_TEXT(range, L"MoRe TeXt", true, owner); | ||
EXPECT_UIA_FIND_TEXT(range, L"more", false, owner); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we adopt Chromium's ICU-based StringSearch implementation, we should add further tests like checking that eleve
is found in the string élève
and (if they also do kana-type insensitive comparison, which I don't see any reference to, but I'm not an ICU expert) that カタカナ
is found in the string かたかな
.
These are nice-to-have for the case where you're searching for a particular piece of text in a document. For example, here's me searching the wiki page for katakana with the string カタカナ ("katakana" written using katakana phonetics) but matching both that string and かたかな ("katakana" written using hiragana phonetics) in the same page:
dc61700
to
fa5f1dc
Compare
I expect the current test to fail, I just want to check before messing with the BUILD files more. |
Golden file changes have been found for this pull request. Click here to view and triage (e.g. because this is an intentional change). If you are still iterating on this change and are not ready to resolve the images on the Flutter Gold dashboard, consider marking this PR as a draft pull request above. You will still be able to view image results on the dashboard, commenting will be silenced, and the check will not try to resolve itself until marked ready for review. |
Golden file changes are available for triage from new commit, Click here to view. |
@cbracken Now that we are using fuller ICU data on desktop, can you give this PR a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great - this was a painful one! Thanks for fixing!
…eProviderWin::FindText` (flutter/engine#39922)
…eProviderWin::FindText` (flutter/engine#39922)
…eProviderWin::FindText` (flutter/engine#39922)
…in::FindText` (flutter#39922) When `ignore_case` is `true`, convert needle and haystack strings to lowercase before performing search, flutter/flutter#117013 ## Pre-launch Checklist - [x] I read the [Contributor Guide] and followed the process outlined there for submitting PRs. - [x] I read the [Tree Hygiene] wiki page, which explains my responsibilities. - [x] I read and followed the [Flutter Style Guide] and the [C++, Objective-C, Java style guides]. - [x] I listed at least one issue that this PR fixes in the description above. - [x] I added new tests to check the change I am making or feature I am adding, or Hixie said the PR is test-exempt. See [testing the engine] for instructions on writing and running engine tests. - [x] I updated/added relevant documentation (doc comments with `///`). - [ ] I signed the [CLA]. - [x] All existing and new tests are passing. If you need help, consider asking for advice on the #hackers-new channel on [Discord]. <!-- Links --> [Contributor Guide]: https://github.com/flutter/flutter/wiki/Tree-hygiene#overview [Tree Hygiene]: https://github.com/flutter/flutter/wiki/Tree-hygiene [Flutter Style Guide]: https://github.com/flutter/flutter/wiki/Style-guide-for-Flutter-repo [C++, Objective-C, Java style guides]: https://github.com/flutter/engine/blob/main/CONTRIBUTING.md#style [testing the engine]: https://github.com/flutter/flutter/wiki/Testing-the-engine [CLA]: https://cla.developers.google.com/ [flutter/tests]: https://github.com/flutter/tests [breaking change policy]: https://github.com/flutter/flutter/wiki/Tree-hygiene#handling-breaking-changes [Discord]: https://github.com/flutter/flutter/wiki/Chat
When
ignore_case
istrue
, convert needle and haystack strings to lowercase before performing search,flutter/flutter#117013
Pre-launch Checklist
///
).If you need help, consider asking for advice on the #hackers-new channel on Discord.