Add a switch to Select-String that returns the matching parts only, analogous to grep -o #7712

Closed
mklement0 opened this issue Sep 5, 2018 · 19 comments
Labels
Committee-Reviewed PS-Committee has reviewed this and made a decision First-Time-Issue Easy issues first time contributors can work on to learn about this project Hacktoberfest Potential candidate to participate in Hacktoberfest Issue-Enhancement the issue is more of a feature request than a bug Resolution-No Activity Issue has had no activity for 6 months or more Up-for-Grabs Up-for-grabs issues are not high priorities, and may be opportunities for external contributors WG-Cmdlets-Utility cmdlets in the Microsoft.PowerShell.Utility module

Comments

@mklement0
Contributor

mklement0 commented Sep 5, 2018

Related: #7713 and #7867

Sometimes all you want Select-String to do is to output only the matching parts of the input lines as strings, similar to what grep -o does on Unix-like platforms; e.g.:

# Extract only the parts that match the regex, each on its own line.
PSonUnix> "line1`nline2`nline3" | grep -o '[0-9]'
1
2
3

The equivalent Select-String solution is currently cumbersome:

PS> "line1", "line2", "line3" | Select-String '[0-9]' | ForEach-Object { $_.Matches[0].Value }
1
2
3
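The workaround above returns only the first match per line. As a rough sketch (the input strings are made up for illustration), the existing -AllMatches switch can be combined with member enumeration to collect every match on every line:

```powershell
# Sketch of the current workaround extended to all matches per line.
# The sample input is hypothetical; -AllMatches and .Matches already exist.
$found = 'a1b2', 'line3' |
    Select-String -Pattern '[0-9]' -AllMatches |
    ForEach-Object { $_.Matches.Value }   # member enumeration over each line's Match objects

$found   # 1, 2, 3
```

This still constructs a MatchInfo object per line, which is the overhead the proposed switch would let the cmdlet skip.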

If we introduced a switch named -OnlyMatching (see decision below; originally proposed here as -MatchingPartOnly), the command could be simplified to:

PS> "line1", "line2", "line3" | Select-String '[0-9]' -OnlyMatching

As an alias, -om could be considered (just -o could break existing code that used it for -OutVariable).

This would also speed up processing, because constructing [Microsoft.PowerShell.Commands.MatchInfo] instances can be bypassed.

Environment data

Written as of:

PowerShell Core 6.1.0-preview.4
@iSazonov iSazonov added Issue-Enhancement the issue is more of a feature request than a bug WG-Cmdlets-Utility cmdlets in the Microsoft.PowerShell.Utility module labels Sep 6, 2018
@iSazonov
Collaborator

iSazonov commented Sep 6, 2018

@SteveL-MSFT Can you approve?

@iSazonov
Collaborator

iSazonov commented Sep 6, 2018

I think this should work for -SimpleMatch too.

@SteveL-MSFT
Member

@PowerShell/powershell-committee reviewed this; it seems fine to add, but the name should match the grep description of the parameter, so call it -OnlyMatching

@SteveL-MSFT SteveL-MSFT added Committee-Reviewed PS-Committee has reviewed this and made a decision Up-for-Grabs Up-for-grabs issues are not high priorities, and may be opportunities for external contributors and removed Review - Committee The PR/Issue needs a review from the PowerShell Committee labels Nov 7, 2018
@iSazonov
Collaborator

iSazonov commented Nov 8, 2018

The OnlyMatching name doesn't correlate with "output" 😕

@SteveL-MSFT SteveL-MSFT added the First-Time-Issue Easy issues first time contributors can work on to learn about this project label Jan 1, 2019
@TylerLeonhardt TylerLeonhardt added the Hacktoberfest Potential candidate to participate in Hacktoberfest label Feb 19, 2019
@HumanEquivalentUnit
Contributor

Sometimes all you want Select-String to do is to output only the matching parts of the input lines as strings

Seems like a lot of the time, that's all I want -match to do as well; and this would get roughly as close as my imaginary -keep operator (the inverse of -replace):

PS C:\> 'a word and here' | Select-String -AllMatches -MatchingPartOnly -Pattern '\w{4}'
word
here
PS C:\> 'a word and here' | sls -a -m '\w{4}'
word
here
PS C:\> 'a word and here' -keep '\w{4}'
word
here

😃 Yay!

@mklement0
Contributor Author

@HumanEquivalentUnit, I quite like the idea of a -keep (or perhaps -extract) operator. Can I suggest you create a feature request for it?

@HumanEquivalentUnit
Contributor

@mklement0 I did (#7958), but at the time it looked like a duplicate of your linked issue, which is basically the same request.

@mklement0
Contributor Author

Oops! Completely forgot about #7867's -matchall proposal, which is indeed the same in essence - thanks.

@mklement0
Contributor Author

@SteveL-MSFT There's one remaining design question to answer:

grep -o returns multiple matches on each line:

$ echo foo | grep -o o
o
o

If we follow this logic, -OnlyMatching would always imply -AllMatches.

However, it may make more sense to have -OnlyMatching only report the first match by default, with the option to report all if -AllMatches is also specified.
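To make the two candidate behaviors concrete, here is a minimal sketch that uses the .NET regex API directly. It only illustrates the difference in output and is not how the proposed switch would be implemented:

```powershell
$line = 'foo'

# First match only: the suggested default for -OnlyMatching.
$first = [regex]::Match($line, 'o').Value

# Every match: what grep -o does, and what adding -AllMatches would request.
$all = [regex]::Matches($line, 'o') | ForEach-Object { $_.Value }

$first   # o
$all     # o, o
```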

@iSazonov
Collaborator

This is related to the question about -SimpleMatch + -AllMatches.

@mklement0
Contributor Author

Yes: here we don't strictly have a backward-compatibility problem, because -OnlyMatching will be a new feature.

Technically we are therefore free to support combining -SimpleMatch and -AllMatches with -OnlyMatching to output all matching literal substrings (even multiple ones on a single line).

However, given the committee decision not to fix #11102 in order to support combining -SimpleMatch and -AllMatches (in the default, non--OnlyMatching case), this would lead to an awkward asymmetry.
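Until that is settled, the effect of outputting all matching literal substrings can be approximated with the switches that exist today by escaping the literal and using regex matching. A sketch, with a made-up literal and input line:

```powershell
# Hypothetical literal and input, chosen so that the '.' would matter as a regex.
$literal = 'a.b'
$pattern = [regex]::Escape($literal)   # yields 'a\.b', so '.' is matched literally

$hits = 'a.b axb a.b' |
    Select-String -Pattern $pattern -AllMatches |
    ForEach-Object { $_.Matches.Value }

$hits   # a.b, a.b  ('axb' is not matched)
```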

@vexx32
Collaborator

vexx32 commented May 28, 2020

/cc @SteveL-MSFT

@iSazonov
Collaborator

I believe we could have symmetry, and the PowerShell Committee could discuss both cases:

  • -SimpleMatch + -AllMatches
  • -OnlyMatching + -AllMatches

@vazome

vazome commented Aug 8, 2020

If anyone is curious about a current way of doing this, you can use:

$value = 'aa' #or even [System.IO.File]::ReadLines($filename)
$options = [Text.RegularExpressions.RegexOptions]::IgnoreCase -bor [Text.RegularExpressions.RegexOptions]::CultureInvariant
$regex = 'REGEX'
$properselectstring = [regex]::Matches($value,$regex,$options)
$properselectstring.value

@Kissaki

Kissaki commented Oct 16, 2021

This issue is labeled Up-for-Grabs but PR #10696 seems to tackle it? If so, wouldn't it be adequate to remove the up for grabs label?

Contributor

This issue has not had any activity in 6 months, if this is a bug please try to reproduce on the latest version of PowerShell and reopen a new issue and reference this issue if this is still a blocker for you.


@microsoft-github-policy-service microsoft-github-policy-service bot added the Resolution-No Activity Issue has had no activity for 6 months or more label Nov 16, 2023
Contributor

This issue has been marked as "No Activity" as there has been no activity for 6 months. It has been closed for housekeeping purposes.

8 participants