Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

strutils.rfind start parameter is unecessarily unusual #11430

Closed
c-blake opened this issue Jun 7, 2019 · 2 comments

Comments

@c-blake
Copy link
Contributor

commented Jun 7, 2019

The current Nim stdlib rfind uses start to refer to the high index of the slice searched backwards (the "start of search activity" not the "start of data being searched"), and does not support any low index search constraint - it always searches to index 0. In the forward (find) cases, both meanings (activity&data) of "start" coincide. So, there's no issue.

The reverse cases, well..It's unusual to say the least. C++ takes only a lower index bound/fence post. They call it pos: http://www.cplusplus.com/reference/string/string/rfind/ . Meanwhile Python takes both slice bounds and calls one start and the other end. (except for end, same names, same basic operations, different meaning..at odds with (simplified) tables like this #4218). C++/Python cover a lot of programmers.

Impact. I just did a search of every single package in nimble (or at least 833 of them!). None uses that third rfind parameter except https://github.com/Yardanico/nimpylib which incorrectly assumes start means the same thing in strutils as in Python (and he also passes a last parameter that doesn't exist - so clearly that's "never used code").

4 of those nimble packages even define their own proc called rfind which also take a "start-ish-word=start of data convention".

In short, I feel like there are strong "programming in the large" expectations about what extra parameters to rfind do, when they exist. Nim's rfind seems just different enough now to cause headaches porting code and changing it will probably bug no one.

The Nim strutils.rfinds are also limited in that you have to slice the string to do a bounded below search, but not bounded above.

Example

import strutils
let path = "/a/b/c/foo.tar.gz"
let slash = path.rfind('/')       # gives 6
echo path.rfind('.', start=slash) # gives -1 not 14
echo slash + path[slash..^1].rfind('.')  # 14 as expected

Possible Solutions

Since from my search of nimble packages no one uses the third argument (well, 1 incorrect usage), you can probably call it whatever you want or even..

  1. Delete the parameter. Make people slice the string if they want to bound the search. It's probably not quite as fast, but at least doesn't create any knee-jerk, false expectations. Not all Nim code is nimble, obviously. Not one use there that would be impacted though strongly suggests deletion would not cause much trouble.

  2. Be like every other string library I could find and add the other index bound/fencepost for good measure and better similarity with strutils.find which all take a start and last:

proc rfind*(s: string, sub: char, start=0, last=0): int =
  result = -1
  let last = if last==0: s.high else: last
  for i in countdown(last, start):
    if sub == s[i]: return i

or if you want to change the name "just in case":

proc rfind*(s: string, sub: char, lowIndex=0, highIndex=0): int =

The set[char] and string rfind would need similar easy fixes.

  1. Leave as-is but change the name of the parameter to be more suggestive/more different from forward find:
proc rfind*(s: string, sub: char, highIndex=-1): int =

or if you are strangely attatched to "activity speak":

proc rfind*(s: string, sub: char, loopBeginsAt=-1): int =
  1. Really hammer the reader over the head in the doc comment as to what is going on since it is very likely to not be what they expect at a quick glance.

Personally, I lean toward 2 to minimize effort porting code from other languages. 1) is only first for textual flow/being the least work.

@GULPF GULPF added the Stdlib label Jun 7, 2019

@Araq

This comment has been minimized.

Copy link
Member

commented Jun 8, 2019

Meh, I guess we should do what the other languages do.

@c-blake

This comment has been minimized.

Copy link
Contributor Author

commented Jun 12, 2019

Ok. I'll do a PR today.

Quick question - we could actually maintain positional parameter passing backward compatibility if we make the signature the reverse of the start/last order of strutils.find:

proc rfind*(s: string, sub: char, last = -1, start: Natural = 0): int

Then it's just renaming start->last and adding a new start. Like I said, though, use of the parameter at all is very rare. So, that aspect of backward compatibility is of limited utility. It also (to me) seems incoherent with being "like strutils.find" which would suggest:

proc rfind*(s: string, sub: char, start: Natural = 0, last = -1): int

It's debatable whether the reverse slice parameter order acts as a "usefully redundant visual cue" to indicate the "reverseness of it all" or if it seems more like a "booby trap" from moving back/forth between slice notation and parameter passing notation.

I'd go with (start, last), myself and put in sharply worded note in changelog.md for the 0.1% of users who might have been using it outside of nimble.

@Araq Araq closed this in bde899d Jun 15, 2019

Araq added a commit that referenced this issue Jun 15, 2019

[bugfix] fixes #11430
[bugfix] fixes #11430

narimiran added a commit to narimiran/Nim that referenced this issue Jun 16, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.