I'm trying to create a tensorstore for time series data. I have the numerical data in a tensorstore and now I'm trying to define a new tensorstore for the DateTime index. I read that TS doesn't support the zarr datetime dtype (dtype('<M8[ns]')) yet, so I just defined the store as string dtype like this:
Then I'm just casting a Pandas datetime index to a byte string.
So now I get
index[0, :].read().result()
>> [b'']
That's fine, it's just empty.
But I just can't successfully write values to it. I'm trying e.g. index[0, 0].write(b'a').result() or variations, but I don't get any error and nothing is written; I just get the [b''] as above.
Also, using 'dtype': '|S2' just defines each row as a list of empty byte strings... is this intentional? I.e. should I define S1 for any length strings? Probably not... but how do I fill the values? And is there a silent error somewhere when writing fails?
NumPy itself does not support a proper variable-length string type. The "|S<N>" dtype is a fixed-length string data type, where <N> is the length in bytes. If you store a shorter string, it will be zero-padded up to that length.
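As a rough illustration of the fixed-length behavior described above, here is a minimal sketch using plain NumPy only (not TensorStore). The array name and the length 4 are arbitrary choices made for the example.

```python
import numpy as np

# Three fixed-length byte strings, each occupying exactly 4 bytes.
arr = np.zeros(3, dtype="|S4")

arr[0] = b"ab"      # shorter than 4 bytes: padded with zero bytes in memory
arr[1] = b"abcdef"  # longer than 4 bytes: silently truncated to b"abcd"

# The raw buffer shows the padding that the element view hides.
raw = arr.tobytes()
first_item_raw = raw[:4]

print(arr.itemsize)    # bytes per element
print(first_item_raw)  # raw bytes of element 0, including padding
print(arr[2])          # an untouched element reads back as an empty byte string
```

Reading an element back strips the trailing zero bytes, which is why an unwritten element shows up as `b''` rather than as four zero bytes.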
In TensorStore, the zarr |S<N> dtype is treated as an array of characters (bytes), where there is an extra dimension (after the dimensions given by shape) that corresponds to the string length. However, there appears to be a bug in the Python API handling of this character data type. We will need to fix that.
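To make the "extra dimension" idea more concrete, the sketch below reinterprets a NumPy `|S2` array as an array of single characters with one additional trailing dimension of length 2. This is only a NumPy analogy for the layout described above; it does not use or depict TensorStore's actual API, and the variable names are made up for the example.

```python
import numpy as np

# A 2x2 array of fixed-length, 2-byte strings.
strings = np.array([[b"ab", b"cd"], [b"e", b""]], dtype="|S2")

# Reinterpret the same memory as single characters and append a
# trailing dimension whose length equals the string length.
chars = strings.view("S1").reshape(strings.shape + (strings.itemsize,))

print(strings.shape)  # shape given by the array itself
print(chars.shape)    # same shape plus one extra dimension for the characters
print(chars[1, 0])    # the characters of b"e", padded with an empty byte
```

Under this view, each position along the last dimension holds one character, so an unwritten string of length N looks like N empty bytes.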
Separately, zarr does have a way of encoding variable-length strings, via the vlen-utf8 and vlen-bytes filters, but those still need to be implemented in TensorStore; they haven't previously been requested.