Proposal: Dates #977
Comments
Can we spend some more time on this and add separate data types? |
@AtnNn I'm in your boat on this one but let's wait until we have the discussion phase. |
Moving back to backlog. Nested objects are more important. |
Having built in Date types and tools would be a major +1 :) |
+1 |
Why would reimplementing date manipulation primitives be necessary? Shouldn't it suffice to have queries take native date types and have the driver convert to a wire format? |
@hcarvalhoalves suppose you wanted a query that says something like "Give me every event that happened on a Monday". In order for that to work on the server side, it would need a db-level function that determines the day of the date, otherwise you'd have to transfer all the data to the client first and do the processing on the client. |
@coffeemug Right, I get what you mean now. I just didn't get where the above methods fit in your example though. I expected date manipulation would look something like this:
No? Sorry if my questions are stupid, I'm still figuring out RethinkDB, I'm not sure what's possible or not. |
Yep, this is exactly right. Barring some syntactic debate, this is exactly how we'd do it, I think. |
I would like to propose a few modifications and additions to @mlucy's proposal. I'm going to call these objects times instead of dates. In the ISO 8601 representation, the time component should be optional as well as the time zone component. The P and T prefixes are optional for durations. One can write "1M" or "1:30" instead of "P1M" and "PT1:30". Durations can also be represented simply as a number of seconds. All operations assume that all days have exactly 86400 seconds. "23:60" is considered equal to "23:59". r.run() takes an additional time_zone argument that specifies the default time zone used in the query. The client drivers automatically set the time_zone to the local time zone of the client. The default time zone is only used where it is explicitly mentioned in this proposal. If time_zone is set to the empty string, time zone defaulting is disabled and all operations that would require using the default time zone fail.
When only one of time or other_time has a time zone, the other value is considered to be in the default time zone. If both time and other_time have no time zone, neither does the result.
There are no intervals. Times default to the default time zone if the time zone component is missing, unless all times have no time zones.
If the time is not present, the corresponding accessors return null. Ditto for the time zone.
A constructor.
Equivalent to time.time_sub("1970-01-01T00:00Z").totalSeconds()
The current time in the tz time zone. If tz is not specified, use the default time zone. If the default time zone is not specified, use "Z".
Format and parse times like strftime and strptime.
with_time_zone returns the same time in a different time zone. If the time has no time zone, it converts from the default time zone. With no arguments, it converts to the default time zone. set_time_zone simply changes the time zone component.
Instead of using an ISO 8601 string representation, times could be represented as a number of seconds since the epoch, and durations as a number of seconds. There are a few advantages:
There are also some disadvantages:
A better representation would be a new native type instead of a pseudo-type. This would have all these advantages and none of these disadvantages. |
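The difference between with_time_zone and set_time_zone, and the seconds-since-epoch equivalence, can be sketched with Python's standard `datetime` module. This is only an illustration of the semantics described in the comment above, not the proposed server implementation; the function names mirror the proposal, and the sample time is made up.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_seconds(t):
    """Seconds since 1970-01-01T00:00Z, i.e. (t - epoch).total_seconds()."""
    return (t - EPOCH).total_seconds()


def with_time_zone(t, tz):
    """Same instant, expressed in another zone (the clock reading changes)."""
    return t.astimezone(tz)


def set_time_zone(t, tz):
    """Same clock reading, different zone component (the instant changes)."""
    return t.replace(tzinfo=tz)


utc = timezone.utc
minus_seven = timezone(timedelta(hours=-7))  # fixed UTC offset, no zone database

t = datetime(2013, 8, 8, 16, 30, tzinfo=minus_seven)
converted = with_time_zone(t, utc)   # still the same moment
relabelled = set_time_zone(t, utc)   # a different moment, 7 hours earlier
```

Under an epoch-seconds representation, `converted` and `t` carry the same number and differ only in the zone attached for display, while `relabelled` is a different number.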
FYI, this issue looks like the only one in 1.8-required that's non-trivial and wide open (since we didn't discuss it much). Let's try to hammer this out. (P.S. my feedback on the specific proposal coming soon) |
This is ambiguous. In ISO 8601
What's the motivation for this? That sounds wacky.
Could we call this
Why is Sunday numbered from 0 when January is numbered from 1?
This name sounds slightly off to me. See below for discussion of how I think we should handle ISO8601 vs. epoch time.
This seems strange to have in the server. Shouldn't people be doing this in the clients once they've retrieved the date? Also, we'd be violating the standard unless we want the clients to send the whole locale to the server.
Could we call this
I think this is the biggest thing left to decide. Here's what I think we should do (although I need to think about it more):
|
Sorry to complicate matters, but I must. First, a few random thoughts:
I propose the following:
|
So, in this one particular case the So I wouldn't be incredibly opposed to doing it that way. The only large cost I can think of is that people could no longer use times as primary keys. What do you think, @AtnNn ? |
@coffeemug's proposal is actually almost exactly what I was in the process of writing. The one thing I wonder about is if we could even get away without have Also, the issue about ordering is pretty moot IMO: we should extend our ordering rules so they order custom types by their typenames rather than just considering them all to be objects. We can also allow for custom orderings of the different members of the type (although in this case the ordering we want happens to be the same as the one for objects). Not being able to have dates as keys is a pretty big deal breaker IMO. We should just allow objects as keys; there's really no good reason not to do that when we allow arrays as keys. |
I think that having people access non-existent attributes of objects as a metaphor for function calls might be a little confusing. Also, it would be confusing for Also, do we really allow arrays as primary keys? If we switch to supporting objects as primary keys, we'd have to go back and make the optional argument to |
Maybe you can borrow some ideas from the golang time package as well. It looks nice. |
Yeah accessing attributes for methods is not quite there. I really wish there was some way to make it work though. Arrays right now can't be pkeys but the only reason for that is space concerns and we can fix that fairly easily. Using times as keys in a secondary index is going to be really important to people though. I think making the optional argument to |
I wasn't saying we shouldn't make the optarg non-optional, I was saying that we should when/if we change the rules for primary keys. I just didn't want us to forget about it. |
In response to some comments above:
I don't like that for the same reasons @mlucy already mentioned
Agreed, this feature would be next to useless without indexing.
I thought about this some more and realized that there are 5 questions we need to answer, and while they're not all necessarily completely orthogonal, I think it's worth thinking about them independently:
After thinking a bit more, here's what I now think about each one of these:
|
Actually another open question (6) that's important is which types to implement. E.g.: Also FYI -- if you look at MySQL docs (http://dev.mysql.com/doc/refman/5.7/en/datetime.html) they offer a variety of types that handle relevant tradeoffs above differently (e.g. different representations on the server), and I think it's an overkill and is a poor design -- I bet having to choose between |
Implementing a native date type seems like the right solution to me. We've talked about supporting more types for a long time and, as @AtnNn mentioned above, now seems like as good a time as any to finally address this question. However we decide the date type question now will influence how we solve the integer question, etc. For any custom type, I'd say that there are at least three different representation choices that have to be made: on disk format, in memory format during processing, and the wire format as defined by the protobuf spec that driver developers (and not necessarily users) will see. These can all be distinct. A new datum type in the wire spec gives us the most freedom with the first two by hiding implementation details behind the third. The drivers then serve as another buffer between this format and the values manipulated by user programs, giving some flexibility there as well. The right way for drivers to handle this type is almost certainly to convert it to a native date type in the host language (or perhaps to a user-facing class with host language friendly behavior). Users who wish to have a value that can be serialized as JSON can then make that decision for themselves according to the rules of their host language. The default node behavior if I call This seems much better than leaking a representation like |
From a backend perspective I would strongly prefer this to be of the form Another thing to consider is that if dates are represented as native datatypes then we either need a json representation of them or we can't encode rows which contain dates as json and they'll be a lot less efficient. We could of course have a json representation in addition to the native one but since that's what most drivers would be sending and receiving anyways it then seems pretty superfluous to have the native representation. |
Could we have |
@neumino yeah I think this type of polymorphism is basically accepted by everyone at this point. |
One benefit is storage efficiency. Storing a number with a few bits in front of it denoting the type is significantly more efficient than storing an object with string keys and string type description. Processing would probably be more efficient too since checking whether the datum is of a given type would likely be more efficient. If it's actually easiest to store things in the server this way, I'm not opposed to doing it. I just think that our representation choice on the server shouldn't be constrained by clients or json because these issues are completely orthogonal. We can store things on the server in any way we want, and always send them to the clients via the |
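The storage argument can be made concrete with a rough size comparison. This is a sketch under stated assumptions: the one-byte tag, the 8-byte double, and the `type` / `epoch_seconds` key names are all hypothetical and are not RethinkDB's actual on-disk or wire encoding.

```python
import json
import struct

TIME_TAG = 1  # hypothetical type tag, not RethinkDB's actual encoding


def encode_tagged(epoch_seconds):
    """One tag byte followed by an 8-byte little-endian double."""
    return struct.pack("<Bd", TIME_TAG, epoch_seconds)


def encode_object(epoch_seconds):
    """A JSON object with string keys and a string type description."""
    doc = {"type": "time", "epoch_seconds": epoch_seconds}  # hypothetical keys
    return json.dumps(doc).encode("utf-8")


stamp = 1376004600.0
tagged = encode_tagged(stamp)
as_object = encode_object(stamp)
```

The tagged form is a fixed 9 bytes and its type can be checked by reading a single byte, whereas the object form is several times larger and needs a key lookup plus a string comparison to identify.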
@jdoliner My point about flexibility is that we can actually choose different representations for the date type for each of four or so different stages. The on disk format doesn't need to be the same as the in memory format used by the query processing layer or the wire format consumed by the drivers or the format consumed by the end user. Given this independence, we're free to pursue different goals for each. We're also free to change the on disk format later without impacting driver developers or end users, or change the in-memory format without changing the on disk format. Since we will inevitably get this wrong now, this flexibility will come in handy. Regardless of how we represent dates on the server I would prefer to hand dates to the driver using a separate datum type in the protobuf definition. This gives driver developers the most flexibility to easily convert the date type to whatever format makes most sense in the host language. If this means constructing dictionaries like |
I really don't see this as a hack. If the language doesn't contain native types to represent the values I think returning an object like |
I can look into this. It might be tricky though. It depends on how we convert the cJSON objects for consumption by v8 which I'm not immediately familiar with. |
Date support would be a great addition, really looking forward to it! However, the proposed syntax makes me think of MongoDB and their inconsistent api which I'm not a big fan of. Trailing underscores and abbreviations are hard to remember:
Perhaps you could go with Mixing of camelCase and snake_case is very confusing:
It would be great if you picked one and used it consistently. |
Hi @sandstrom the sentiments you have here are definitely shared by the development team and I think everything you have an issue with is not making it into the final product. It's tough to tell because this is such a monstrous thread but we decided pretty early on in the conversation not to have As far as I know every driver we support adheres strictly to either camel case or snake case. Whichever is more standard in the language itself (JS is camel case, python and ruby are snake case). However because we on the dev team all maintain and use different drivers we all wind up with our own preferences for how to talk about things which can be confusing in issues. |
That's comforting! I've been impressed by the attention to detail that I've seen so far, so I'm glad to hear that the api won't suddenly disintegrate when adding date support :) ⛵ |
The python driver depends on the package pytz now and importing the driver fails if the package is not installed
|
Should the python driver be using the pytz package? This proposal only supports timezones as UTC offsets, which shouldn't require the Olson database. |
I might have left an unused import in there but it shouldn't actually be using the pytz package in any way. Currently the polyglot tests do require the package but it's not required to use the driver. |
Yup, it was an unused import. I've removed it if you want to keep playing with date support. |
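On the point that plain UTC offsets should not need the Olson database: a minimal sketch using only the Python standard library (Python 3, where `datetime.timezone` provides fixed offsets). The `parse_utc_offset` helper and the accepted `Z` / `+HH:MM` / `-HH:MM` forms are assumptions for illustration, not the driver's actual code.

```python
import re
from datetime import datetime, timedelta, timezone

OFFSET_RE = re.compile(r"^([+-])(\d{2}):(\d{2})$")


def parse_utc_offset(text):
    """Turn 'Z' or '+HH:MM' / '-HH:MM' into a fixed-offset tzinfo."""
    if text == "Z":
        return timezone.utc
    match = OFFSET_RE.match(text)
    if match is None:
        raise ValueError("not a UTC offset: %r" % text)
    sign = -1 if match.group(1) == "-" else 1
    hours, minutes = int(match.group(2)), int(match.group(3))
    return timezone(sign * timedelta(hours=hours, minutes=minutes))


local = datetime(2013, 8, 8, 16, 30, tzinfo=parse_utc_offset("-07:00"))
as_utc = local.astimezone(timezone.utc)
```

Named zones such as "Europe/Paris" are rejected here on purpose, since resolving them is exactly the part that would require a zone database.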
Hum, I know that it's late but could And we could add a It would be nice if people could easily build a string out of a date object. Something like that.
// HH:mm:ss
// Now returns HH:mm:ss.sssssssssssssss
r.now().hours().coerceTo('string').add(':').add(r.now().minutes().coerceTo('string')).add(':').add(r.now().seconds().coerceTo('string'))
Using a regex is a workaround, but that sounds like a lot of work for a common operation. |
I think not being able to get the milliseconds out is pretty bad. I'd rather have seconds return fractions. |
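One way to reconcile the two positions is to keep seconds fractional and do the rounding only when formatting. A minimal sketch in plain Python, assuming integer hours and minutes and a fractional seconds value below 60; `format_clock` is a hypothetical helper, not a proposed ReQL term.

```python
def format_clock(hours, minutes, seconds):
    """Render HH:mm:ss.SSS from integer hours/minutes and fractional seconds."""
    # Round to whole milliseconds, but never spill over into the next minute.
    total_ms = min(int(round(seconds * 1000)), 59999)
    whole, millis = divmod(total_ms, 1000)
    return "%02d:%02d:%02d.%03d" % (hours, minutes, whole, millis)
```

With this split, callers who need milliseconds can still read them from the fractional value, and callers who only want a clock string get fixed-width output without a regex.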
It would be fine if we have |
@wmrowan, |
@neumino |
Err, I should have ack with insensitive case. Thanks @wmrowan |
@neumino I think implementing |
The review for docs has been completed and merged in |
This is finally in next. |
Commit hash? On Thu, Aug 8, 2013 at 4:30 PM, Michael Lucy notifications@github.com wrote:
|
Slow clap |
Ok, so @mglukhovsky just pointed out that slow clap is usually a sarcastic clap. Let's write this one off to cultural differences :) |
🎆 Can't wait to play with this! |
I think the comedy of errors that made up this issue's circumstances might justify the slow clap here by American standards. |
r.db('test').table('posts').filter(function(post) {
    return post("date").day().eq(r.monday)
})
That's pretty cool :) I suggest now that we should have a flag "lang" in the driver, so I can use |
@neumino that wouldn't be very clean. We should host a francophone version of ReQL on a different port and make sure that all user facing text (including server generated error messages) is written in idiomatic French. |
Before we can support French and other languages that are not ASCII we need to fix our string encoding (#1181) |
I propose that we introduce three new pseudo-types: dates (by which I mean date, time, and optional timezone), durations, and intervals, all cribbed from ISO 8601 (http://en.wikipedia.org/wiki/ISO_8601), which we represent as strings.
We introduce the following terms:
We should also update the drivers to automatically render native dates/durations/intervals as the appropriate strings.
So, for example, in Ruby you could get all rows inserted in the last day with:
Or, if Ruby didn't have easy date manipulation: