String#bytes and Char#bytes should return a Slice(UInt8) #7872

Closed
watzon opened this issue Jun 8, 2019 · 8 comments

@watzon
Contributor

watzon commented Jun 8, 2019

Currently, the methods String#bytes and Char#bytes return an Array(UInt8). However, the alias Bytes is a Slice(UInt8). This creates issues semantically in the following situation:

def byte_frequency(data : Bytes) : Hash(UInt8, Int32)
  data.reduce({} of UInt8 => Int32) do |acc, byte|
    acc.has_key?(byte) ? (acc[byte] += 1) : (acc[byte] = 1)
    acc
  end
end

puts byte_frequency("Hello world".bytes)

One would expect this to work because semantics say that "Hello world".bytes will return something that can be passed into a method that accepts data : Bytes, but this will fail because an Array(UInt8) is not the same as Bytes.
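
For reference, a minimal sketch of the mismatch described above, assuming the standard library's String#to_slice (which returns Bytes) as the way to get a value the Bytes-typed method accepts. The counting body is rewritten with a default-valued Hash purely for brevity; that rewrite is an illustration, not part of the original report.

```crystal
# Same signature as the example above; only the body is condensed.
def byte_frequency(data : Bytes) : Hash(UInt8, Int32)
  counts = Hash(UInt8, Int32).new(0) # missing keys read as 0
  data.each { |byte| counts[byte] += 1 }
  counts
end

text = "Hello world"

puts text.bytes.class    # the Array(UInt8) that String#bytes returns
puts text.to_slice.class # a Slice(UInt8), i.e. Bytes

# byte_frequency(text.bytes) is rejected at compile time,
# because Array(UInt8) does not match the Bytes restriction.
freq = byte_frequency(text.to_slice)
puts freq['l'.ord.to_u8] # count of the byte for 'l'
```

Note that the slice returned by String#to_slice is read-only, which is fine here because the method only reads from it.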

@oprypin
Member

oprypin commented Jun 9, 2019

This post can be reduced to:

I expect the word "bytes" to always mean the exact same thing

and that doesn't convince me.

@oprypin
Member

oprypin commented Jun 9, 2019

Your type annotation there is wrong indeed, and needlessly narrows the usefulness. Should be Iterable(UInt8).
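
A minimal sketch of that suggestion, assuming the method only needs to walk its input once. With the restriction loosened to Iterable(UInt8), both the Array(UInt8) from String#bytes and the Bytes from String#to_slice should be accepted. The condensed body and the variable names are illustrative, not taken from the thread.

```crystal
# Accepts any Iterable(UInt8): Array(UInt8) and Slice(UInt8) both qualify.
def byte_frequency(data : Iterable(UInt8)) : Hash(UInt8, Int32)
  counts = Hash(UInt8, Int32).new(0) # missing keys read as 0
  data.each { |byte| counts[byte] += 1 }
  counts
end

from_array = byte_frequency("Hello world".bytes)    # Array(UInt8) input
from_slice = byte_frequency("Hello world".to_slice) # Bytes input

puts from_array == from_slice # both inputs produce the same counts
```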

@oprypin
Member

oprypin commented Jun 9, 2019

To further dismantle this logic,

One would expect this to work because semantics say that "Hello world".hash will return something that can be passed into a method that accepts data : Hash

(and no, semantics doesn't say either of those things)

@oprypin
Member

oprypin commented Jun 9, 2019

As a side point, I do think that Bytes, and Hash more so, are badly named.

@watzon
Contributor Author

watzon commented Jun 9, 2019

I just don't think it makes sense for there to be a set type of Bytes that's a Slice(UInt8) and then have a String#bytes that returns an Array(UInt8). Yes I know my example isn't perfect, but that doesn't make my point any less valid.

@oprypin
Member

oprypin commented Jun 9, 2019

The point has no validity in the first place because it pivots only on the name to establish some kind of inconsistency. Then you talk about only one of many equally (if not more) valid possibilities to resolve the alleged inconsistency.

I think Array(UInt8) is a more appropriate return type here than Slice(UInt8).

If I were to consider this an inconsistency, my first choice would be to remove the Bytes alias. Slice really should be reserved to I/O and bindings, as it defines many unsafe methods and fewer generally useful ones. But this alias somehow makes it seem like a generally acceptable type to work with.

And you just said,

I just don't think it makes sense for there to be a set type of Bytes that's a Slice(UInt8) and then have a String#bytes that returns an Array(UInt8).

-- sure, I probably even agree with that, but I take issue with the jump to a conclusion.
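
One concrete difference behind the Array versus Slice preference above can be sketched as follows. This is an illustration chosen here, not an argument made in the thread: a sub-slice is a view over the same memory, while a sub-array is a copy.

```crystal
# Slice: taking a sub-slice yields a view that shares memory with the original.
bytes = Bytes[1, 2, 3, 4]
view = bytes[1, 2]  # view over the elements at index 1 and 2
view[0] = 99_u8     # writes through to `bytes`
puts bytes[1]       # reflects the write made through `view`

# Array: taking a sub-array yields an independent copy.
arr = [1_u8, 2_u8, 3_u8, 4_u8]
copy = arr[1, 2]    # new array holding copies of two elements
copy[0] = 99_u8     # leaves `arr` untouched
puts arr[1]         # still the original value
```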

@wooster0
Contributor

wooster0 commented Jun 9, 2019

This has been discussed in the past as well: #3551.

watzon closed this as completed Jun 9, 2019

@straight-shoota
Member

Duplicate of #3551

straight-shoota marked this as a duplicate of #3551 Jun 9, 2019