Minor Inconsistency String#bytes vs Bytes type #3551

Closed
lribeiro opened this issue Nov 15, 2016 · 11 comments

Comments

@lribeiro

lribeiro commented Nov 15, 2016

typeof("324".bytes) -> Array(UInt8)
typeof(Bytes) -> Slice(UInt8)

Shouldn't the String#bytes method return a Slice(UInt8) instead of an Array(UInt8)?

@asterite
Member

An array is usually more useful as it's mutable and it's a copy of the underlying string bytes. If you want the bytes as a slice for performance reasons you can use to_slice or to_unsafe.
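As an illustrative sketch (not part of the original comment), the difference between the two calls can be seen in a few lines of Crystal. The variable names `copy` and `view` are made up for the example, and the snippet only reads through the slice, since newer Crystal versions are assumed to treat the slice returned by `String#to_slice` as read-only.

```crystal
str = "324"

# String#bytes allocates a new Array(UInt8) holding a copy of the bytes.
copy = str.bytes

# String#to_slice returns Bytes (an alias for Slice(UInt8)) that points
# at the string's own memory, so no copy is made.
view = str.to_slice

puts typeof(copy)
puts typeof(view)

# Mutating the array is safe because it is an independent copy.
copy[0] = 0_u8
puts str # the string itself is unchanged

# Slice#to_a converts the view back into an array when one is needed.
puts view.to_a == str.bytes
```

The array is the convenient choice when the bytes need to be modified or passed to general collection APIs; the slice is the cheaper choice when the bytes are only read, for example when writing them to an IO.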

@asterite
Member

I mean, if somebody eventually introduces, say, a Downcase type, we then have String#downcase and it would be inconsistent because it returns a String? I don't think so.

Also the docs say: "Returns this string's bytes as an Array(UInt8)"

@Sija
Contributor

Sija commented Nov 16, 2016

@asterite There's also Slice#to_a to easily do the opposite :)

@asterite
Member

asterite commented Nov 16, 2016

I don't understand, we also have #chars and #codepoints. Should these return Slice instead of Array? Array is the preferred data type to return because it's the most flexible and useful, and it's the one that is commonly returned in API methods.

@lribeiro
Author

When working with Bytes, using String#bytes trips me up mentally because of the name similarity.

@asterite Ruby, for instance, does return an Array. But it has no Slice equivalent in its standard library (to the best of my knowledge; I didn't check).

Why is Bytes an alias for Slice(UInt8) and not Array(UInt8)?

I don't have a strong preference either way, I just think it would be better if they are consistent.
It's not a big deal :)

@asterite
Member

I was kind of against the Bytes alias... maybe this explains why :-)

Slice(UInt8) is the most common slice type and it's all over the place, so we thought Bytes would be a good alias for it. Array(UInt8) is not a very common type at all. In fact, maybe we could remove String#bytes for that matter...

@lribeiro
Author

I found Bytes by accident while reading the source code and started using it in my code. I thought it was kind of cool, and it does sound better than Slice(UInt8).

If it is renamed to String#to_a that would open up space for String#bytes returning Bytes :)

@kostya
Contributor

kostya commented Nov 18, 2016

That's why I vote for the name Binary :)

@lbguilherme
Contributor

A better name for this could be "aaa".utf8. The idea is that while all you have is a String, the fact that it is stored as a UTF-8 sequence of bytes is just an implementation detail that is not disclosed. But this is the method that exposes that binary representation.

@asterite
Member

I'm closing this. This is a minor inconsistency, and I'm sure there are many bigger inconsistencies throughout the language. It's really impossible for a language to be 100% consistent. Bytes (or Slice(UInt8)) is mostly used in low-level code to read/write from an IO. Array is the preferred container type, so it's OK for String#bytes to return Array(UInt8).

@j8r
Contributor

j8r commented Jun 9, 2019

To clarify the method, it could be String#bytes_array : Array(UInt8).
