Proposal to add Timestamp predefined ext type: -1 #207
@frsyuki ping? |
I agree that it is a good idea to add a timestamp type.
I think 1-nanosecond precision is the best choice. As for time zones, I don't think it is a good idea to include time zone information in the timestamp. A timestamp represents an instant, and a time zone is additional information needed only to render that instant on a calendar. Applications should therefore be able to choose between storing only a timestamp or storing both a timestamp and a time zone, because not every application needs to render a timestamp on a calendar. About format, I expect 3 use cases:
To support these 3 use cases, I created a proposal at #209. |
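To illustrate the instant-versus-calendar distinction made above, here is a minimal Python sketch. The sample value, the fixed UTC+9 offset and the `render` helper are assumptions chosen for illustration; none of them come from #209.

```python
from datetime import datetime, timedelta, timezone

# An instant: seconds since 1970-01-01T00:00:00Z (arbitrary sample value).
INSTANT = 1_500_000_000

def render(instant: int, offset_hours: int) -> datetime:
    """Render an instant on a calendar using a fixed UTC offset (hypothetical helper)."""
    return datetime.fromtimestamp(instant, tz=timezone(timedelta(hours=offset_hours)))

utc_view = render(INSTANT, 0)
tokyo_view = render(INSTANT, 9)

print(utc_view.isoformat())    # 2017-07-14T02:40:00+00:00
print(tokyo_view.isoformat())  # 2017-07-14T11:40:00+09:00
print(utc_view == tokyo_view)  # True: same instant, different calendar rendering
```

An application that needs the calendar view can store the zone next to the timestamp (for example as a separate map entry), while one that only needs the instant stores the timestamp alone.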
I am so glad you decided to go against the mess of bitflags and variable lengths that seemed to be winning in #130. I was really not on board with that, and it's so unlike the rest of MessagePack.
I'm generally opposed to adding a bunch of new types to MessagePack. Each new type increases the complexity of decoding libraries and of the code that uses them, and is a new avenue for bugs and security issues. The currently supported types are good because they are used by the majority of users. Everybody needs maps, arrays, strings and ints. Most people need floats. Lots of people need binary blobs (as evidenced by the preponderance of base64 strings in JSON).
With datetimes, timestamps, big integers, and so on, we're starting to get into the territory of niche types for very specific use cases. This is especially problematic when these can be trivially put in a custom ext type by users, or even just an ordinary string, like an ISO 8601-encoded date. So, since the very good str/bin/ext split has passed, I'm glad MessagePack has been avoiding new types.
Handling of dates and time zones is insanely complicated, more complicated than all of MessagePack, so it was weird seeing the author of Joda-Time suggest moving all of this complexity into the protocol instead of handling it at a higher level from a more opaque type. This is really tangential to structured data serialization. If you want all of this data, just put it in an ISO 8601 string.
The timestamp proposal in #209, though, is simple and straightforward enough that I can get behind it. I like having explicit ext sizes to discriminate precision rather than letting it be variable, and the result is extremely efficient, since second-precision timestamps will be six bytes and (virtually) all nanosecond timestamps will be ten bytes.
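The six-byte and ten-byte figures can be checked with a short Python sketch. It assumes the layout of the timestamp extension as it appears in the published MessagePack spec (type -1, payload sizes 4, 8 and 12, with the 34-bit seconds in the low bits of the 64-bit form), which may differ in detail from the #209 draft. `pack_timestamp` is a hypothetical helper, not any library's API.

```python
import struct

TIMESTAMP_EXT_TYPE = -1  # predefined ext type discussed in this thread

def pack_timestamp(seconds: int, nanoseconds: int = 0) -> bytes:
    """Encode a timestamp using the smallest of the three ext sizes (sketch)."""
    if not 0 <= nanoseconds < 1_000_000_000:
        raise ValueError("nanoseconds out of range")
    if 0 <= seconds < (1 << 34):
        if nanoseconds == 0 and seconds < (1 << 32):
            # timestamp 32: fixext 4 header (0xd6), type byte, 32-bit unsigned seconds
            return struct.pack(">BbI", 0xD6, TIMESTAMP_EXT_TYPE, seconds)
        # timestamp 64: fixext 8 header (0xd7), type byte,
        # 30-bit nanoseconds in the high bits, 34-bit unsigned seconds in the low bits
        return struct.pack(">BbQ", 0xD7, TIMESTAMP_EXT_TYPE, (nanoseconds << 34) | seconds)
    # timestamp 96: ext 8 header (0xc7), length 12, type byte,
    # 32-bit unsigned nanoseconds, 64-bit signed seconds
    return struct.pack(">BBbIq", 0xC7, 12, TIMESTAMP_EXT_TYPE, nanoseconds, seconds)

print(len(pack_timestamp(1_500_000_000)))     # 6  (second precision)
print(len(pack_timestamp(1_500_000_000, 1)))  # 10 (nanosecond precision)
print(len(pack_timestamp(-1)))                # 15 (before 1970 needs the 12-byte payload)
```

The sizes include the ext header and type byte, which is why a 4-byte payload costs six bytes on the wire and an 8-byte payload costs ten.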
I'm somewhat torn on the issue of time zones, and I'd also like to err on the side of leaving them out, especially since this proposal is forward compatible: since it defines sizes 4, 8 and 12, we could always later define sizes 6, 10 and 14 which just add the time zone.
One possibility we should consider is to make the 34-bit seconds field signed instead of unsigned in timestamp64. With the current proposal there's no way to represent a time before 1970 in less than timestamp96 precision. Making it signed would allow timestamp64 to represent dates from roughly 1698 to 2242 instead of 1970 to 2514. It would be easy to implement it as signed if the seconds were placed above the nanoseconds instead of below, since shifting a signed value down extends the sign. (The more I think about it the more this seems like a bad idea, but it's something we should probably discuss at least.)
Anyway, I like the idea of standardizing a simple nanosecond timestamp, and I can see some immediate uses for it in my own projects. I'm on board, and if #209 is accepted you can count on MPack implementing it 👍 |
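The year ranges and the sign-extension remark can be worked through in Python. This is only a sketch of the alternative being floated: the signed layout below (34-bit signed seconds above 30-bit nanoseconds) is hypothetical and is not what #209 proposes, and `year_at`, `pack_signed_variant` and `unpack_signed_variant` are illustrative names.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def year_at(seconds: int) -> int:
    """Calendar year (UTC) of an offset in seconds from the epoch."""
    return (EPOCH + timedelta(seconds=seconds)).year

# Range of a 34-bit seconds field, unsigned versus signed.
unsigned_years = (year_at(0), year_at(2**34 - 1))
signed_years = (year_at(-(2**33)), year_at(2**33 - 1))
print(unsigned_years)  # (1970, 2514)
print(signed_years)    # (1697, 2242)

def pack_signed_variant(seconds: int, nanoseconds: int) -> bytes:
    """Hypothetical layout: signed 34-bit seconds in the high bits, nanoseconds below."""
    word = ((seconds & ((1 << 34) - 1)) << 30) | nanoseconds
    return struct.pack(">Q", word)

def unpack_signed_variant(data: bytes) -> tuple:
    (word,) = struct.unpack(">q", data)  # read the 8 bytes as a signed 64-bit integer
    # An arithmetic right shift carries the sign into the seconds value.
    return word >> 30, word & ((1 << 30) - 1)

print(unpack_signed_variant(pack_signed_variant(-1, 5)))  # (-1, 5)
```

The exact lower bound of the signed range lands in late 1697, slightly earlier than the rounded 1698 figure quoted in the comment.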
Any progress on this? I want to use the Timestamp type for embulk's JSON type (internally, it is MessagePack). Ref. embulk/embulk#417 |
As I noted in the other thread, representing date and time as a simple timestamp is next to useless, as it doesn't capture the semantic meaning of the data. However, it is entirely reasonable to question whether MsgPack wants to get into this. A newish format, Temporenc, has appeared (nothing to do with me, still alpha I think), which provides a compact representation of date/time. It's not perfect (it doesn't support min/max) but it would work well with MsgPack, i.e. MsgPack would simply decide on an extension code for Temporenc and make no further changes. |
Timestamp is now included in the spec. |
In my opinion, timestamps are likely the most common kind of data in many workloads, and many languages and programming environments provide a way to represent time (a class or something similar).
I propose adding Time as the first predefined ext format, to make serializing and deserializing time objects easier than ever.
Programming languages differ in the time precision they offer, but some use nanoseconds, so nanosecond precision looks sufficient to satisfy these requirements.
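As an example of the precision mismatch, Python's `datetime` stores microseconds, so a nanosecond-precision (seconds, nanoseconds) pair can hold any `datetime` value but loses its last three digits when converted the other way. A minimal sketch, with `to_datetime` and `from_datetime` as hypothetical helper names:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_datetime(seconds: int, nanoseconds: int) -> datetime:
    """Convert a (seconds, nanoseconds) pair to a UTC datetime, truncating to microseconds."""
    return EPOCH + timedelta(seconds=seconds, microseconds=nanoseconds // 1000)

def from_datetime(dt: datetime) -> tuple:
    """Convert an aware datetime to a (seconds, nanoseconds) pair without loss."""
    delta = dt - EPOCH
    # timedelta normalizes so that 0 <= seconds < 86400 and 0 <= microseconds < 10**6,
    # which also gives floor semantics for instants before 1970.
    return delta.days * 86400 + delta.seconds, delta.microseconds * 1000

dt = to_datetime(1_500_000_000, 123_456_789)
print(dt.microsecond)     # 123456 (the trailing 789 ns are dropped)
print(from_datetime(dt))  # (1500000000, 123456000)
```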