You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In general large Bytes values seem to be fine, but literal values of Bytes have bad performance and result in excessive resource usage that tends to get the Unison process killed by the OS.
Here is a fairly minimal way to reproduce:
Start up Unison with a fresh codebase: ❯ unison -C $(mktemp -d)
Open a scratch.u file and save the following (you don't need to add it):
main : '{Exception, IO} Bytes
main = do
splitmix 42 do
Random.bytes 100000
.> run main
.> add.run largeBytes
This all happens fairly quickly. Now I open my scratch file, add the watch expression > Bytes.size largeBytes and save. My ucm process hangs for a while and then gets killed by my OS.
@dolio and I talked about this a bit, and he said that Bytes literals don't have a dedicated representation in the AST and that they end up getting inefficiently encoded as [Nat] where each Nat represents a single Byte. It sounds like there are even further issues with excessive stack frames for large Bytes literals.
The text was updated successfully, but these errors were encountered:
Yeah, the problem is that e.g. 0xs0123 is actually something like fromList [0x01, 0x23] which then turns into (pseudocode):
let x0u = 0x01u -- unboxed number
x0 = Nat x0u
x1u = 0x23u
x1 = Nat x1u
l = mkList x0 x1
in Bytes.fromList l
So, there's a massive blowup from that representation. Not only is every byte represented as 8 bytes, but code has to be emitted to build the representation, and put it in a sequence. So there's a huge per byte overhead on the code representation even.
The temporary workaround I suggested is to use a string literal of the hex digits, because the code emitted for strings is much more compact. It'd be nice if that were the actual representation of bytes literals, or if they were directly represented. I guess code could be written to detect the same fromBytes [...] with all literal numbers in the compiler, but that's additional work.
In general large
Bytes
values seem to be fine, but literal values ofBytes
have bad performance and result in excessive resource usage that tends to get the Unison process killed by the OS.Here is a fairly minimal way to reproduce:
Start up Unison with a fresh codebase:
❯ unison -C $(mktemp -d)
Open a
scratch.u
file and save the following (you don't need toadd
it):.> run main
.> add.run largeBytes
This all happens fairly quickly. Now I open my scratch file, add the watch expression
> Bytes.size largeBytes
and save. My ucm process hangs for a while and then gets killed by my OS.@dolio and I talked about this a bit, and he said that
Bytes
literals don't have a dedicated representation in the AST and that they end up getting inefficiently encoded as[Nat]
where eachNat
represents a singleByte
. It sounds like there are even further issues with excessive stack frames for largeBytes
literals.The text was updated successfully, but these errors were encountered: