rename String => AbstractString #8872

Merged
merged 1 commit into from Nov 3, 2014

9 participants

@StefanKarpinski
Member

This seems like a pretty big improvement in the intention of the AbstractString type. It also clears the path for using the String name for the standard string type, i.e. a future replacement for both ASCIIString and UTF8String.

@johnmyleswhite
Member

+1

@JeffBezanson
Member

+1

@ivarne
Contributor
ivarne commented Nov 1, 2014

Seems like this would pave the way for an Integer to AbstractInteger rename, to keep things consistent across the language.

I just want to suggest once again that we could have String vs. Str, to keep a nice parallel to Integer vs. Int. This would also not interfere with the behaviour of old code. If we would rather prefix abstract types with Abstract*, I would be fine with that, though.
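For readers unfamiliar with the Integer vs. Int parallel, here is a minimal sketch of how the abstract and concrete names behave under dispatch. It assumes a current Julia 1.x release (where `isabstracttype` and `isconcretetype` are available); the function name `kind` is hypothetical and used only for illustration.

```julia
# `kind` is a hypothetical function, defined only to show dispatch.
kind(x::Integer) = "abstract Integer method"   # matches any integer type
kind(x::Int)     = "concrete Int method"       # matches only the machine-word Int

# Integer is an abstract supertype; Int is one concrete subtype of it.
@assert isabstracttype(Integer)
@assert isconcretetype(Int)
@assert Int <: Integer

println(kind(1))        # a literal 1 is an Int, so the more specific method wins
println(kind(big(1)))   # BigInt is an Integer but not an Int
println(kind(0x01))     # UInt8 is an Integer but not an Int
```

The String vs. Str proposal would follow the same shape: the longer name for the abstraction, the shorter one for the default concrete type.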

@StefanKarpinski
Member

I disagree: I think that numeric abstract types like Number, Real and Integer are already pretty clearly abstract so the Abstract prefix is redundant and annoying. I'm not sure what the exact rule is by which this should be decided, but it seems to me that there is some consistent psychological rule here.

@JeffBezanson
Member

I agree with Stefan there. I think a prefixing convention only speaks to an absence of appropriate names. Number is self-evidently abstract, as would be Collection, Iterable, ProbabilityDistribution, Algorithm, etc.

Actually very few names in Base use the Abstract prefix right now. The only ones are AbstractArray (and derivatives) and AbstractRNG. There doesn't seem to be a good word for a general array-like thing, or for an abstract random number generator. And so many languages have a concrete type called string that reclaiming it as the name of an abstraction seems hopeless.

@jakebolewski
Member

It's funny that this will be the most breaking change in 0.4 so far...

@StefanKarpinski
Member

I'm going to merge this as soon as Travis is green since this is clearly an improvement and with the const alias it isn't breaking. I've been enjoying using the name Str for a string type in my sk/bytes branch.

@StefanKarpinski
Member

Thanks to @ivarne for the Str name idea.

@aviks
Member
aviks commented Nov 3, 2014

What's going to be the upgrade path for code that wants to work with both 0.3 and 0.4? Should compat contain an alias from String to AbstractString for 0.3.x?

@StefanKarpinski
Member

For now, no action needs to be taken but yes, that's what we should do.
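A minimal sketch of what such an alias could look like in package code, assuming the package only needs the name AbstractString to exist on both versions. The `isdefined` guard, the use of `eval`, and the function `greet` are illustrative assumptions, not necessarily how a compatibility package would implement it.

```julia
# If Base does not provide AbstractString (as on 0.3), alias it to String,
# which is the abstract string type there. Where Base already defines
# AbstractString, this block is skipped and nothing is redefined.
# Wrapping the definition in `eval` keeps it from being processed at all
# unless the guard is true.
if !isdefined(Base, :AbstractString)
    eval(:(const AbstractString = String))
end

# Hypothetical package function: it can now be annotated with
# AbstractString regardless of which version is running.
greet(name::AbstractString) = "Hello, " * name

println(greet("Julia"))
```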

@ivarne
Contributor
ivarne commented Nov 3, 2014

😄 Happy to be useful (at least until I start disliking my own suggestion).

@pao @aviks We can't (yet) deprecate anything other than functions, so this won't cause compatibility problems until we decide to remove the alias for String in 0.5, in preparation for reusing the String name for something else in 0.6. At that point, code that needs to work on both 0.3 and 0.5 will become problematic.

Edit: Sorry for the wrong mention

@StefanKarpinski StefanKarpinski merged commit bdc960c into master Nov 3, 2014

1 check failed

continuous-integration/travis-ci The Travis CI build could not complete due to an error

@pao
Member
pao commented Nov 4, 2014

no problem @ivarne (this was worth reading in any case)

@jiahao jiahao referenced this pull request Nov 8, 2014
Closed

String not defined #8946

@stevengj
Member

I have to say that after a couple of months of this I'm really not enjoying typing AbstractString all over the place, nor am I enjoying explaining it to newbies. I would much prefer @ivarne's suggestion of keeping String abstract and using Str for a future unified UTF-8/ASCII type, analogous to Integer and Int. Strings are such a common and primitive type that they deserve short type names.

I don't remember ever being confused about whether String was abstract in previous Julia versions, so I'm not sure I understand Jeff's arguments about this being "hopeless".

Can we revisit this choice? (It reminds me of the array+scalar thing in #5810: reasonable in principle, endlessly annoying in practice.)

@nalimilan
Contributor

Integer vs. Int is IMHO one of the most confusing details about types, so I'm not in favor of introducing another similar subtlety. I think this is different from the array+scalar issue, as here there's a clear possibility of mistakes, with people using String instead of Str and not getting the specialized code they expect.

@stevengj
Member

@nalimilan, making several similar naming choices arguably reduces confusion. Unless you are proposing to rename Integer to AbstractInteger (ugh), you can just as easily argue that it is better to have more parallels to this rather than making Integer the exception, so that the distinction has a better chance of sinking in.

Furthermore, type declarations are most commonly used for function arguments. And for function arguments it is usually better to accidentally undertype (since there is no performance penalty) than to accidentally overtype.

The problem I have is that it is easy to remember to write f(s::String) = ... for any function that is supposed to work with arbitrary strings, and hard to remember (and harder to type) f(s::AbstractString). (There is the same problem with arrays: most code that operates on Array should really be for AbstractArray, but even experienced Julia programmers often forget to type the Abstract. Adding more names like this does not seem an improvement...it just makes Julia more annoying.)
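To make the overtyping concern concrete, here is a small sketch. It assumes current Julia, where String did eventually become the name of the concrete standard string type and AbstractString is the abstraction; the function names `shout_concrete` and `shout_any` are hypothetical.

```julia
# Overtyped: accepts only the concrete String type.
shout_concrete(s::String) = uppercase(s) * "!"

# Appropriately general: accepts any string type.
shout_any(s::AbstractString) = uppercase(s) * "!"

# A SubString is a string, but it is not a String.
sub = SubString("hello world", 1, 5)

println(shout_any(sub))   # works for any AbstractString

# The overtyped method has no match for a SubString argument.
overtyped_failed = try
    shout_concrete(sub)
    false
catch err
    err isa MethodError
end
println(overtyped_failed)
```

The general signature costs nothing at run time, because Julia still compiles a specialized method for each concrete argument type it is called with.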

But the basic argument is that shorter names for common things are better, as long as the abbreviations are clear, and the abbreviation Str for a string type is quite recognizable.

@nalimilan
Contributor

Currently you cannot be consistent with both the Integer/Int and AbstractArray/Array pairs. I could make the same argument against your proposal: do you think we should rename Array to Arr? :-p

So the consistency argument can only hold if we choose one rule and stick to it... I think this discussion already started somewhere, though I don't remember on which issue/PR.

@stevengj
Member

@nalimilan, the point is that the abstract vs. concrete confusion argument works both ways since we already have both conventions in Julia, so basically your only argument cancels out.

What remains after this cancellation is that:

  • shorter is better for common symbols (as long as it is recognizable: Arr fails this test, unfortunately).
  • types are most commonly specified in function arguments, where accidental overtyping is worse than accidental undertyping. Typing f(s::___String) is way more common than Array{___String}, as a quick grep of Julia base will verify. (An accidental string overtyping just came up again: #9435)

@jiahao jiahao deleted the sk/abstractstring branch Jan 18, 2015

@RatanRSur RatanRSur referenced this pull request in adambard/learnxinyminutes-docs Oct 10, 2015
Merged

change deprecated String type to AbstractString as per 0.4 spec #1447
