rename String => AbstractString #8872

Merged: 1 commit, Nov 3, 2014


@StefanKarpinski (Member)

This seems like a pretty big improvement in the intention of the AbstractString type. It also clears the path for using the String name for the standard string type, i.e. a future replacement for both ASCIIString and UTF8String.

@johnmyleswhite (Member)

+1

@JeffBezanson (Member)

+1

@ivarne (Member) commented Nov 1, 2014

Seems like this would pave the way for an Integer to AbstractInteger rename, to keep things consistent across the language.

Just want to once again suggest that we could have String vs Str to keep a nice parallel to Integer vs Int. This would also not interfere with the behaviour of old code. If we would rather prefix abstract types with Abstract*, I would be fine with that though.

@StefanKarpinski (Member, Author)

I disagree: I think that numeric abstract types like Number, Real and Integer are already pretty clearly abstract so the Abstract prefix is redundant and annoying. I'm not sure what the exact rule is by which this should be decided, but it seems to me that there is some consistent psychological rule here.

@JeffBezanson (Member)

I agree with Stefan there. I think a prefixing convention only speaks to an absence of appropriate names. Number is self-evidently abstract, as would be Collection, Iterable, ProbabilityDistribution, Algorithm, etc.

Actually very few names in Base use the Abstract prefix right now. The only ones are AbstractArray (and derivatives) and AbstractRNG. There doesn't seem to be a good word for a general array-like thing, or for an abstract random number generator. And so many languages have a concrete type called string that reclaiming it as the name of an abstraction seems hopeless.

@jakebolewski (Member)

It's funny that this will be the most breaking change in 0.4 so far...

@StefanKarpinski (Member, Author)

I'm going to merge this as soon as Travis is green since this is clearly an improvement and with the const alias it isn't breaking. I've been enjoying using the name Str for a string type in my sk/bytes branch.
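Why a const alias keeps old code working can be sketched as follows. This is only an illustration in current Julia syntax, not the code from this PR; `AbstractLabel`, `Label`, `PlainLabel`, and `describe` are hypothetical names chosen so the example does not clash with anything in Base.

```julia
# Hypothetical types for illustration only; none of these names come from Base.
abstract type AbstractLabel end        # the new, explicit name for the abstraction

const Label = AbstractLabel            # the old name survives as a const alias

struct PlainLabel <: AbstractLabel
    text::String
end

# "Old" code written against the old name still dispatches as before,
# because Label and AbstractLabel are the very same type object.
describe(x::Label) = "label: " * x.text

println(describe(PlainLabel("hello")))
```

Because the alias is just a second binding to the same type, existing method signatures need no changes until the alias is eventually removed.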

@StefanKarpinski (Member, Author)

Thanks to @ivarne for the Str name idea.

@aviks (Member) commented Nov 3, 2014

What's going to be the upgrade path for code that wants to work with both 0.3 and 0.4? Should compat contain an alias from String to AbstractString for 0.3.x?

@StefanKarpinski (Member, Author)

For now, no action needs to be taken but yes, that's what we should do.
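The kind of shim being discussed could look roughly like the sketch below. It is written in current Julia syntax and guards on `isdefined` rather than on a version number; both choices are assumptions made for illustration, not what any compatibility package actually shipped. `shout` is a hypothetical function.

```julia
# Define the new name only where Base does not already provide it.
# On a current Julia this branch is skipped, since Base exports AbstractString.
@static if !isdefined(Base, :AbstractString)
    const AbstractString = String   # on such a release, String was the abstract type
end

# Package code can then use the new name unconditionally.
shout(s::AbstractString) = uppercase(s) * "!"

println(shout("hi"))
```

The point of the guard is that package code is written once against `AbstractString`, and only the shim needs to know which release it is running on.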

@ivarne (Member) commented Nov 3, 2014

😄 Happy to be useful (at least until I start disliking my own suggestion).

@pao @aviks We can't (yet) deprecate anything other than functions, so this won't cause compatibility problems until we decide to remove the alias for String in 0.5, in preparation for reusing the String name for something else in 0.6. At that point, supporting both 0.3 and 0.5 will be problematic.

Edit: Sorry for the wrong mention

@pao (Member) commented Nov 4, 2014

no problem @ivarne (this was worth reading in any case)

@jiahao mentioned this pull request Nov 8, 2014

@stevengj (Member)

I have to say that after a couple of months of this I'm really not enjoying typing AbstractString all over the place, nor am I enjoying explaining it to newbies. I would much prefer @ivarne's suggestion of keeping String abstract and using Str for a future unified UTF-8/ASCII type, analogous to Integer and Int. Strings are such a common and primitive type that they deserve short type names.

I don't remember ever being confused about whether String was abstract in previous Julia versions, so I'm not sure I understand Jeff's arguments about this being "hopeless".

Can we revisit this choice? (It reminds me of the array+scalar thing in #5810: reasonable in principle, endlessly annoying in practice.)

@nalimilan (Member)

Integer vs. Int is IMHO one of the most confusing details about types, so I'm not in favor of introducing another similar subtlety. I think this is different from the array+scalar issue, as here there's a clear possibility of mistakes, with people using String instead of Str and not getting the specialized code they expect.
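The Integer/Int distinction referred to here can be checked directly. The sketch below uses current Julia reflection functions; reading the "specialized code" concern as mainly a matter of element and field types is an interpretation, not something stated in the thread.

```julia
# Integer is the abstract supertype; Int is one concrete machine integer type.
println(isabstracttype(Integer))   # Integer cannot be instantiated directly
println(isconcretetype(Int))       # Int has a fixed in-memory layout
println(Int <: Integer)

# A container declared with the abstract name accepts any Integer subtype...
mixed = Integer[1, big(2), 0x03]
# ...while the concrete name pins a single element type.
packed = Int[1, 2, 3]

println(eltype(mixed))    # an abstract element type
println(eltype(packed))   # a concrete element type
```

Writing the abstract name where the concrete one was intended still runs, which is why this kind of slip can go unnoticed.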

@stevengj (Member)

@nalimilan, making several similar naming choices arguably reduces confusion. Unless you are proposing to rename Integer to AbstractInteger (ugh), you can just as easily argue that it is better to have more parallels to this rather than making Integer the exception, so that the distinction has a better chance of sinking in.

Furthermore, type declarations are most commonly used for function arguments. And for function arguments it is usually better to accidentally undertype (since there is no performance penalty) than to accidentally overtype.

The problem I have is that it is easy to remember to write f(s::String) = ... for any function that is supposed to work with arbitrary strings, and hard to remember (and harder to type) f(s::AbstractString). (There is the same problem with arrays: most code that operates on Array should really be for AbstractArray, but even experienced Julia programmers often forget to type the Abstract. Adding more names like this does not seem an improvement...it just makes Julia more annoying.)
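The difference between undertyping and overtyping a function argument can be sketched as follows. The example uses the names of current Julia, where `String` ended up being the concrete type and `AbstractString` the abstract one, rather than the names under debate in this thread; `count_vowels_any` and `count_vowels_strict` are hypothetical functions.

```julia
# Undertyped: accepts every string type.
count_vowels_any(s::AbstractString) = count(c -> c in "aeiou", s)

# Overtyped: accepts only the one concrete type named in the signature.
count_vowels_strict(s::String) = count(c -> c in "aeiou", s)

word  = "julia"
piece = SubString("julialang", 1, 5)   # a SubString{String}, not a String

println(count_vowels_any(word))
println(count_vowels_any(piece))       # works: SubString <: AbstractString
println(count_vowels_strict(word))

# The overtyped method rejects a perfectly valid string.
rejected = try
    count_vowels_strict(piece)
    false
catch err
    err isa MethodError
end
println(rejected)
```

The undertyped method costs nothing here because Julia still compiles a specialized version for each concrete argument type it is called with, while the overtyped one fails at the call site.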

But the basic argument is that shorter names for common things are better, as long as the abbreviations are clear, and the abbreviation Str for a string type is quite recognizable.

@nalimilan (Member)

Currently you cannot be consistent with both the Integer/Int and AbstractArray/Array pairs. I could make the same argument against your proposal: do you think we should rename Array to Arr? :-p

So the consistency argument can only hold if we choose one rule and stick to it... I think this discussion already started somewhere, though I don't remember on which issue/PR.

@stevengj (Member)

@nalimilan, the point is that the abstract vs. concrete confusion argument works both ways since we already have both conventions in Julia, so basically your only argument cancels out.

What remains after this cancellation is that:

  • shorter is better for common symbols (as long as it is recognizable: Arr fails this test, unfortunately).
  • types are most commonly specified in function arguments, where accidental overtyping is worse than accidental undertyping. Typing f(s::___String) is way more common than Array{___String}, as a quick grep of Julia base will verify. (An accidental string overtyping just came up again: listen(::UTF8String) throws error #9435)

9 participants