Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

startswith and endswith look suspicous #12917

Closed
eschnett opened this issue Sep 2, 2015 · 7 comments
Closed

startswith and endswith look suspicous #12917

eschnett opened this issue Sep 2, 2015 · 7 comments

Comments

@eschnett
Copy link
Contributor

eschnett commented Sep 2, 2015

The definitions for startswith and endswith for strings look suspicious (base/string/util.jl):

startswith(str::AbstractString, chars::Chars) = !isempty(str) && str[start(str)] in chars
endswith(str::AbstractString, chars::Chars) = !isempty(str) && str[end] in chars
  1. Why start(str)? Isn't this always 1?
  2. Why end? Shouldn't this be endof(str) instead?
@yuyichao
Copy link
Contributor

yuyichao commented Sep 2, 2015

end is lowered to endof

@nalimilan
Copy link
Member

...but indeed 1 and start(str) are equivalent for strings, and I think for all AbstractArrays. Not a big deal though.

@simonster
Copy link
Member

If we're going for maximum generality it should be first(str) and last(str). start is definitely wrong since a custom string type could have some other way of defining iteration. I also find the behavior of accepting a Set{Char} or AbstractVector{Char} a bit questionable, since startswith doesn't accept a Vector{T<:AbstractString} or Set{T<:AbstractString}.

@yuyichao
Copy link
Contributor

yuyichao commented Sep 2, 2015

@simonster I made that mistake some time ago too. first and last returns the element not the index. endsof is the right way to get the last index and IIRC we don't have a generic way to get the start index and for most places we just use 1.

@simonster
Copy link
Member

We don't need the index here, just the first and last characters.

@yuyichao
Copy link
Contributor

yuyichao commented Sep 2, 2015

Ahh. That's right

@stevengj
Copy link
Member

stevengj commented Sep 2, 2015

I agree that first and last are the right choice, although the current implementation should work for our current string types even when the last character has a multi-unit encoding.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants