We encode literals from Python as JSON. This dramatically inflates the encoding of `hl.tarray(hl.tstruct(...))` because the struct field names are repeated for every element. We should instead use `EncodedLiteral`. There are some subtleties:
1. We need to implement an encoder in Python. We currently have a decoder (`_from_encoding` and `_convert_from_encoding`) which implements `{"name":"StreamBufferSpec"}`. We should write an encoder for that.
2. Our IR is plaintext, so we'll need to base64-encode this binary encoding so that it can be included in plaintext.
3. `EncodedLiteral` is not currently parsable (it throws `UnsupportedOperationException` in `ir/Parser.scala`). As mentioned in (2), we'll need to implement parsing.
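To make the size argument concrete, here is a rough sketch comparing a JSON encoding of an array of structs with a packed binary encoding wrapped in base64. The row schema (`contig`, `start`, `end`) and the binary layout are assumptions made up for illustration; this is not Hail's actual wire format, only the general shape of the trade-off.

```python
import base64
import json
import struct

# Hypothetical rows standing in for a literal of type
# hl.tarray(hl.tstruct(contig=hl.tstr, start=hl.tint32, end=hl.tint32)).
rows = [{"contig": "chr1", "start": i, "end": i + 100} for i in range(1000)]

# JSON repeats every field name for every element of the array.
as_json = json.dumps(rows).encode("utf-8")


def pack_row(row):
    """Pack one row: length-prefixed UTF-8 string, then two little-endian int32s."""
    contig = row["contig"].encode("utf-8")
    return (
        struct.pack("<i", len(contig))
        + contig
        + struct.pack("<ii", row["start"], row["end"])
    )


# Illustrative binary layout (an assumption, not Hail's format):
# element count followed by the packed rows. No field names are stored.
as_binary = struct.pack("<i", len(rows)) + b"".join(pack_row(r) for r in rows)

# Base64 makes the bytes safe to embed in a plaintext IR,
# at a cost of roughly 4/3 of the binary size.
as_text = base64.b64encode(as_binary).decode("ascii")

print("JSON bytes:  ", len(as_json))
print("base64 chars:", len(as_text))
```

Even after the base64 overhead, the packed form avoids repeating the field names, which is where most of the JSON size goes for arrays of structs.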
For (3), just represent the `AbstractTypedCodecSpec` in terms of a virtual type and a buffer spec (assume the buffer spec is `StreamBufferSpec`). When parsing, we can construct an `EncodedLiteral` with the base64-decoded value and a codec spec we construct like this:
For (1), it should be simple enough to invert the logic from `_convert_from_encoding` to encode rather than decode. Write tests that round-trip in Python for lots of types.
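As a sketch of what "invert the decoder and round-trip test it" could look like, the following uses a toy type system and a toy binary layout. The type descriptors, `encode`, `decode`, and `round_trip` are all hypothetical names invented here; they do not reproduce Hail's `_convert_from_encoding` or its real encoding, and missing values are omitted for brevity.

```python
import io
import struct

# Hypothetical type descriptors standing in for Hail types:
#   "int32", "float64", "str",
#   ("array", element_type), ("struct", [(field_name, field_type), ...])


def encode(typ, value, out):
    """Write value to the binary stream out according to typ."""
    if typ == "int32":
        out.write(struct.pack("<i", value))
    elif typ == "float64":
        out.write(struct.pack("<d", value))
    elif typ == "str":
        data = value.encode("utf-8")
        out.write(struct.pack("<i", len(data)))
        out.write(data)
    elif typ[0] == "array":
        out.write(struct.pack("<i", len(value)))
        for elem in value:
            encode(typ[1], elem, out)
    elif typ[0] == "struct":
        # Field names come from the type, so they are never written.
        for name, field_type in typ[1]:
            encode(field_type, value[name], out)
    else:
        raise ValueError(f"unsupported type: {typ!r}")


def decode(typ, inp):
    """Read one value of type typ from the binary stream inp."""
    if typ == "int32":
        return struct.unpack("<i", inp.read(4))[0]
    if typ == "float64":
        return struct.unpack("<d", inp.read(8))[0]
    if typ == "str":
        (n,) = struct.unpack("<i", inp.read(4))
        return inp.read(n).decode("utf-8")
    if typ[0] == "array":
        (n,) = struct.unpack("<i", inp.read(4))
        return [decode(typ[1], inp) for _ in range(n)]
    if typ[0] == "struct":
        return {name: decode(field_type, inp) for name, field_type in typ[1]}
    raise ValueError(f"unsupported type: {typ!r}")


def round_trip(typ, value):
    """Encode value, then decode the resulting bytes with the same type."""
    buf = io.BytesIO()
    encode(typ, value, buf)
    buf.seek(0)
    return decode(typ, buf)


interval = ("struct", [("contig", "str"), ("start", "int32"), ("end", "int32")])
cases = [
    ("int32", -7),
    ("float64", 1.5),
    ("str", "chr1"),
    (("array", "int32"), [1, 2, 3]),
    (("array", interval), [{"contig": "chr1", "start": 1, "end": 100}]),
    (("struct", [("xs", ("array", "float64"))]), {"xs": [0.25, 2.0]}),
]
for typ, value in cases:
    assert round_trip(typ, value) == value
```

Because the encoder and decoder walk the type in the same order, each branch of one is the mirror image of the other, which is what makes a table of `(type, value)` round-trip cases a natural test.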
…13814)
CHANGELOG: Pipelines that are memory-bound by copious use of
`hl.literal`, such as `vds.filter_intervals`, require substantially less
memory.
Closes #13757
See also #13748
Version
0.2.142
Relevant log output
No response