Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating regex tertiary methods #19622

Merged
merged 19 commits into from Apr 29, 2022
Merged

Conversation

slnguyen
Copy link
Contributor

This PR addresses #17226 and does the following:

  • Replaces string.search and bytes.search (returns a regexMatch) with 'string.find' and bytes.find, which returns a 'byteIndex' that represents the offset where a regular expression and string/byte match
  • Replaces 'string.matches' and bytes.matches (returns a regexMatch) with string.startsWith and bytes.startsWith, which returns true if a string/byte starts with a given regular expression
  • Deprecates string.matches and bytes.matches
  • Renames the pattern argument to sep in string.split and bytes.split
  • Updates current tests and adds deprecation tests to reflect the above changes
  • Adds documentation to reflect the above changes

---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
…d adding string.find and bytes.find which return byteIndex

---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
… argument to sep in split function

---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
…ex_tertiary

---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
…and bytes.startsWith

---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
Copy link
Member

@lydia-duncan lydia-duncan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like there's a mention of string.search on line 64 in the file that should be replaced with string.find.

Overall, I mostly was pointing out additional references and things to adjust. The changes themselves are otherwise pretty good, though there's one or two other things I had some suggestions on.

modules/standard/Regex.chpl Outdated Show resolved Hide resolved
modules/standard/Regex.chpl Outdated Show resolved Hide resolved
modules/standard/Regex.chpl Outdated Show resolved Hide resolved
modules/standard/Regex.chpl Outdated Show resolved Hide resolved
modules/standard/Regex.chpl Show resolved Hide resolved
proc string.search(pattern: regex(string), ref captures ...?k):regexMatch
{
return pattern.search(this, (...captures));
}


pragma "last resort"
deprecated "the 'needle' argument is deprecated, use 'pattern' instead"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one also doesn't mention that there's a new name for the function.

should be strings or types that strings can cast to.
:returns: an :record:`regexMatch` object representing the offset in the
receiving string where a match occurred
:returns: true if string starts with :arg pattern:, false otherwise
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:arg pattern: is only useful at the start of a line, where it generates some formatting for it. I think what you meant is ":arg:pattern" but I don't know that that will actually generate a link so it's probably best to surround pattern with two backtics.

---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
@@ -1215,7 +1216,7 @@ proc string.search(needle: regex(string), ref captures ...?k):regexMatch
/* Search the receiving string for a regular expression already compiled
by calling :proc:`regex.search`. Search for matches at any offset.

.. warning:: the search function with a receiving string is deprecated
.. warning:: the search function with a receiving string is deprecated, use regex search instead
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a little off, should it be on the following line?

Copy link
Member

@lydia-duncan lydia-duncan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the minor formatting thing, looks good! Thanks!

---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
---
Signed-off-by: slnguyen <26969019+slnguyen@users.noreply.github.com>
@slnguyen slnguyen merged commit 7fdba55 into chapel-lang:main Apr 29, 2022
vasslitvinov added a commit to vasslitvinov/chapel that referenced this pull request Dec 1, 2022
…ng#19622

Signed-off-by: Vassily Litvinov <vasslitvinov@users.noreply.github.com>
vasslitvinov added a commit that referenced this pull request Dec 1, 2022
Remove some features deprecated ≥two releases ago

Remove the following deprecated features,
shown here with the date and PR of deprecation:

* formals `needles`, `patterns` in string|bytes.find() et al. [2/24 #19222]
* formal `dotnl` in regex.compile() [2/23 #17191]
* method list.`indexOf`() [1/20 #19059]
* class `BadRegexpError` [3/3 #19308]
* fields regexMatch.`offset`,`size` [3/07 #19365]
* string|bytes.`search`(),`split(pattern)` et al. [4/29 #19622]
* channel.`match`() and some channel methods with `ref error` [3/23 #17448]
* formal `i` in domain.contains() et al. [1/24 #19075]
* operators `|` `&` `^` on rectangular domains [1/30 #19103]
* methods domain.`isSuper`(),`isSubset`() [1/30 #19104]
* functions `isRectangularDom`(), `isRectangularDom`() et al. [Oct'2021 #18534]

Notes on removing channel.`match`() et al.

* I retained this method and avoided deprecation warnings upon calling it:
  `proc _channel.search(re:regex(?)):regexMatch`

  For that, I converted a deprecated public method that it uses
  to a non-deprecated private one.

* I moved `test/deprecated/regexp-rechan.chpl`
  back to `test/regex/ferguson/rechan.chpl`
  as it contains functionality not tested otherwise.

  For that, I converted calls to match() to calls to search()
  prepending "^" to the pattern they are matching for.
  This is what we expect users to do to adjust their codes.

  This changed the ending channel offset in some cases.
  I noted this in #20864.

r: @lydia-duncan - thanks!
ct-clmsn pushed a commit to ct-clmsn/chapel that referenced this pull request Feb 11, 2023
…ng#19622

Signed-off-by: Vassily Litvinov <vasslitvinov@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants