Feature request: extend Match class to access group's start/end index #45486

Closed
yan-zaitsev opened this issue Mar 26, 2021 · 8 comments
Labels
area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. library-core status-blocked Blocked from making progress by another (referenced) issue type-enhancement A request for a change that isn't a bug

Comments

@yan-zaitsev

dart --version: Dart SDK version: 2.10.5 (stable) (Tue Jan 19 13:05:37 2021 +0100) on "macos_x64"

My feature request is to extend the
Match class: https://github.com/dart-lang/sdk/blob/master/sdk/lib/core/pattern.dart
so that it is possible to get the start/end positions of matched groups.

My case:

I want to find a substring in some text and render it in bold in the UI, so the user can see where the entered search text occurs in the results.
I am doing it with a simple RegExp:

final stringRanges = RegExp(RegExp.escape(searchInput), caseSensitive: false, unicode: true)
        .allMatches(text)
        .toList();

It returns a List<Match>, which I can render in the UI.
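As a minimal sketch of that rendering step, the matches can be turned into alternating plain and highlighted segments using only `Match.start` and `Match.end`. The `Segment` class and `splitByMatches` function are hypothetical names introduced for illustration; they are not part of the Dart SDK or of the original request.

```dart
/// Hypothetical value type: a piece of text and whether it should be bold.
class Segment {
  final String text;
  final bool highlighted;
  const Segment(this.text, this.highlighted);
}

/// Splits [text] into segments, marking the spans covered by [matches].
/// Assumes [matches] are ordered and non-overlapping, as produced by
/// `RegExp.allMatches`.
List<Segment> splitByMatches(String text, Iterable<Match> matches) {
  final segments = <Segment>[];
  var cursor = 0;
  for (final m in matches) {
    if (m.start > cursor) {
      segments.add(Segment(text.substring(cursor, m.start), false));
    }
    // Skip empty matches (for example, when the search input is empty).
    if (m.end > m.start) {
      segments.add(Segment(text.substring(m.start, m.end), true));
    }
    cursor = m.end;
  }
  if (cursor < text.length) {
    segments.add(Segment(text.substring(cursor), false));
  }
  return segments;
}

void main() {
  const text = 'Hello hello';
  final matches = RegExp(RegExp.escape('hel'), caseSensitive: false, unicode: true)
      .allMatches(text);
  for (final s in splitByMatches(text, matches)) {
    print('${s.highlighted ? "bold" : "plain"}: "${s.text}"');
  }
}
```

This works because a whole match exposes its position; the limitation discussed in this issue only concerns positions of individual capture groups.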

Problem

When I want to use more complex Regex with groups:

final stringRanges = RegExp("begin(${RegExp.escape(searchInput)})end", caseSensitive: false, unicode: true)
        .allMatches(text)
        .toList();

At this point, stringRanges contains more complex Match objects with groups. I want to process every group in a match together with its start/end positions.

Solution

The Match class has a method String? group(int group); that returns the string value of a matched group.
I would be happy to have an API like Match? groupMatch(int group); to get more information about a single group.
For me, a simpler API would be enough:

class StringRange {
  int start;
  int end;
  StringRange(this.start, this.end);
}

abstract class Match {
  ...
  StringRange? groupRange(int group);
  ...
}
@devoncarew devoncarew added area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. library-core labels Mar 28, 2021
@lrhn
Member

lrhn commented Jun 29, 2021

The reason that functionality is not available is that it's not possible in JavaScript.
The JavaScript match object only provides access to captures as strings, not their position in the original string.

@yan-zaitsev
Author

yan-zaitsev commented Jun 29, 2021

@lrhn I don't know JS very well, but, maybe, something from https://stackoverflow.com/questions/1985594/how-to-find-indices-of-groups-in-javascript-regular-expressions-match could be useful?

If it is not possible, please, close the task

@iarkh
Contributor

iarkh commented Jul 1, 2021

> @lrhn I don't know JS very well, but, maybe, something from https://stackoverflow.com/questions/1985594/how-to-find-indices-of-groups-in-javascript-regular-expressions-match could be useful?
>
> If it is not possible, please, close the task

@yan-zaitsev, seems like this is a question for @lrhn, isn't it?

@lrhn
Member

lrhn commented Jul 1, 2021

I have previously thought about whether we could reasonably make capture group indices available in JS, and have not found a good way.
The approach in the Stack Overflow link works for linear RegExps, but captures can also be inside repetitions. You could potentially unroll some RegExp repetitions so that "everything up to the capture group" can be captured as a separate capture group, but I can guarantee that I can create RegExps where that's not possible. It would also be a rewrite that we have to do at run-time in JS-compiled code. We'd be better off creating our own JS-compiled RegExp implementation if that's what we wanted (we don't want to do that; the one in browsers is pretty efficient).

So, if you really, really want to know the indices of RegExp capture groups, you'll have to massage your RegExp yourself to capture "everything before the capture" as well. It's not something we can automate in an efficient and consistent way.
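For the linear pattern from the original request, that manual rewrite might look like the following sketch. It assumes the prefix before the group of interest can itself be wrapped in a capture group; `StringRange` follows the shape proposed in this issue, and `innerRanges` is a hypothetical helper name. As noted above, this does not generalize to captures inside repetitions.

```dart
/// Hypothetical range type, following the proposal in this issue.
class StringRange {
  final int start;
  final int end;
  const StringRange(this.start, this.end);
}

/// Yields the ranges of [needle] wherever it appears between "begin" and "end".
Iterable<StringRange> innerRanges(String text, String needle) sync* {
  // Group 1 captures everything in the match before the group of interest;
  // group 2 is the group of interest itself.
  final re = RegExp('(begin)(${RegExp.escape(needle)})end',
      caseSensitive: false, unicode: true);
  for (final m in re.allMatches(text)) {
    // Start of group 2 = start of the whole match + length of the prefix.
    final start = m.start + m.group(1)!.length;
    yield StringRange(start, start + m.group(2)!.length);
  }
}

void main() {
  const text = 'xx beginFOOend yy beginfooend';
  for (final r in innerRanges(text, 'foo')) {
    print('${r.start}-${r.end}: ${text.substring(r.start, r.end)}');
  }
  // Prints:
  // 8-11: FOO
  // 23-26: foo
}
```

The offsets are UTF-16 code unit indices, the same units that `Match.start` and `String.substring` use.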

So, closing as not planned (until JS changes to make the information available).
(It is annoying because the underlying engine in the VM, and in Chrome and Firefox, does have the information. It remembers the positions of the captures and can even create the capture strings lazily from those indices when you first ask for them.)

@lrhn lrhn closed this as completed Jul 1, 2021
@lrhn lrhn added the closed-not-planned Closed as we don't intend to take action on the reported issue label Jul 1, 2021
@mraleph
Member

mraleph commented Jul 1, 2021

@lrhn RegExp Match Indices are at Stage 4 in TC-39 (see https://github.com/tc39/proposal-regexp-match-indices) and are shipping in M91 and FF Nightly and Safari preview. So maybe you could reconsider closing this issue.

@lrhn lrhn reopened this Jul 1, 2021
@lrhn lrhn added status-blocked Blocked from making progress by another (referenced) issue and removed closed-not-planned Closed as we don't intend to take action on the reported issue labels Jul 1, 2021
@lrhn
Member

lrhn commented Jul 1, 2021

That's great news. We'll still have to decide when we think the browsers we support are sufficiently updated to make the switch
(and then it'll probably be a breaking change to the RegExpMatch class. There are other null-safety related changes I'd like to make if we are breaking it anyway).

@lrhn
Member

lrhn commented Sep 7, 2021

Proof of concept: https://dart-review.googlesource.com/c/sdk/+/212582

(Definitely needs more design work, and then we need to decide whether we think it's well-enough supported on the web).

@lrhn lrhn added the type-enhancement A request for a change that isn't a bug label Sep 14, 2021
@mraleph
Member

mraleph commented Mar 28, 2025

Duplicate of #42307

@mraleph mraleph marked this as a duplicate of #42307 Mar 28, 2025
@mraleph mraleph closed this as completed Mar 28, 2025