Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement TIME type in the backend #628

Merged
merged 28 commits into from
Dec 13, 2021
Merged

Implement TIME type in the backend #628

merged 28 commits into from
Dec 13, 2021

Conversation

eito-fis
Copy link
Contributor

@eito-fis eito-fis commented Sep 6, 2021

Fixes #426

Implements the TIME type in the backend, with options for precision and timezone.

Technical details
Some oddities due to how sqlalchemy handles TIME. In particular, it will compile using the timezone and precision attributes, but won't take them in at initialization time. As a result, we have use two additional custom types, TIME_WITH_TIMEZONE and TIME_WITHOUT_TIMEZONE, that simply extend the init method to take timezone and precision.

Additional notes:

  • We only use TIME_WITH_TIMEZONE and TIME_WITHOUT_TIMEZONE - just TIME is not supported. TIME has been removed from the type map to reflect this.
  • Precision acts strangely - the time object the data is reflected as doesn't really support precision. It has millisecond, which is equivalent to a precision of 1, but no more. As a result, currently settings precision as anything valid but greater than 1 is the same as setting a precision of 1. (I believe this is why sqlalchemy doesn't support precision.)
  • Inference to TIME_WITH_TIMEZONE is not supported as TIME_WITH_TIMEZONE is primarily intended to be used as support for legacy databases. See postgres documentation for details.

Checklist

  • My pull request has a descriptive title (not a vague title like Update index.md).
  • My pull request targets the master branch of the repository
  • My commit messages follow best practices.
  • My code follows the established code style of the repository.
  • I added tests for the changes I made (if applicable).
  • I added or updated documentation (if applicable).
  • I tried running the project locally and verified that there are no
    visible errors.

Developer Certificate of Origin

Developer Certificate of Origin
Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@eito-fis
Copy link
Contributor Author

eito-fis commented Sep 6, 2021

I'm curious how much we care about what type the API returns. Intuitively, I would guess that consistently returning TIME with the appropriate type options would be the nicest, but if it would be less work to just return TIME WITH TIMEZONE. In fact, if we're alright with doing away with type options entirely, this would be much simpler implemented as just TIME WITHOUT TIMEZONE and TIME WITH TIMEZONE.

@kgodey
Copy link
Contributor

kgodey commented Sep 6, 2021

@eito-fis I'm in favor of going with the simpler implementation. Please update the type mapping files in mathesar/database/ to include both time types in your PR. Thanks!

@eito-fis
Copy link
Contributor Author

@mathemancer Is there a reason we use get_available_types over get_supported_types in test_alter_column_casts_data_gen? (Or in general?) I'm running into issues since I've implemented TIME_WITH_TIME_ZONE and TIME_WITHOUT_TIME_ZONE as simple extensions of the original TIME type. This works fine for get_supported_types since I can insert the custom type into the type map, but breaks with get_available_types since it just returns the base TIME type.

Happy to change the implementation of the current TIME types if there is a better way to go about this.

@mathemancer
Copy link
Contributor

@mathemancer Is there a reason we use get_available_types over get_supported_types in test_alter_column_casts_data_gen? (Or in general?) I'm running into issues since I've implemented TIME_WITH_TIME_ZONE and TIME_WITHOUT_TIME_ZONE as simple extensions of the original TIME type. This works fine for get_supported_types since I can insert the custom type into the type map, but breaks with get_available_types since it just returns the base TIME type.

Happy to change the implementation of the current TIME types if there is a better way to go about this.

Using separate types for TIME WITH TIME ZONE and TIME WITHOUT TIME ZONE is the correct move, good work there. I suggest the ischema_names dict items for time with time zone, time without time zone, and time according to the actual DB types, (using the extensions you've already set up). I.e., you should end up with the sub-dictionary:

{
    "time with time zone": db.types.datetime.TIME_WITH_TIME_ZONE,
    "time without time zone": db.types.datetime.TIME_WITHOUT_TIME_ZONE,
    "time": db.types.datetime.TIME,
}

Reflection is going to be a pain with that setup, but it solves a number of other issues. (See db/types/__init__.py for where to update the dict.) For reflection to work better, it might make sense to use actual UserDefinedType implementations, with the TIME class as a mix-in. Not sure, though, without some tinkering.

As for why those functions are used differently, the intent of get_supported_alter_column_types is precisely to separate the names we pass around for things from the arbitrary keys in ischema_names. I'm fine with using that function more if it helps. It might not be necessary if you add the proper types into ischema_names, though.

@eito-fis
Copy link
Contributor Author

eito-fis commented Oct 3, 2021

This should be in an about functional place - the last question is about inference. In particular, both TIME WITH TIME ZONE and TIME WITHOUT TIME ZONE can parse strings with and without time zones, which means that whichever is tried first when casting a string is used. I see two possible resolutions here:

  • We only ever infer one of the two. In this case, I assume it would be best to do TIME WITH TIME ZONE to ensure we don't lose information. (Although, if the user can adjust afterwards, then maybe it doesn't matter too much.)
  • We update the SQL cast function for string -> TIME WITHOUT TIME ZONE to error if there is a time zone, and then infer to either type. This would mean that we couldn't go straight from string with timezone to TIME WITHOUT TIME ZONE, and instead would have to have an intermediate step converting to TIME WITH TIME ZONE, but maybe that's alright.

Any thoughts on which would be preferred?

@kgodey
Copy link
Contributor

kgodey commented Oct 4, 2021

@eito-fis I think it's fine to inferTIME WITH TIME ZONE for all times. I assume we'd convert times with existing timezones to the corresponding time in UTC before saving. Please make sure to add a test case for that.

@eito-fis
Copy link
Contributor Author

eito-fis commented Oct 7, 2021

@kgodey I could take a shot at properly converting with timezone to without timezone, but I think it produces some un-intuitive behavior. The postgres documentation has a bit on this, but I think the core issue is about what happens when timezone causes the day to roll over. I guess it seems a bit hard to handle without date information. (In fact, re-reading the documentation, I wonder if its worth doing time with timezone at all?)

@kgodey
Copy link
Contributor

kgodey commented Oct 7, 2021

@eito-fis After reading the Postgres docs, I think it's better to infer as TIME WITHOUT TIME ZONE instead. We do need to support TIME WITH TIME ZONE since legacy databases might have fields with it, but I think it makes sense not to support inferring to it.

Please make sure to document the reasoning for this in a comment somewhere so we have the context.

@kgodey kgodey added this to the [08] Initial Data Types milestone Oct 8, 2021
@kgodey kgodey added the pr-status: review A PR awaiting review label Nov 26, 2021
@kgodey
Copy link
Contributor

kgodey commented Nov 26, 2021

@silentninja I added you as a reviewer here since you're working on TIMESTAMP.

@silentninja
Copy link
Contributor

One question I have is whether the Mathesar renderer should be a general default, instead of just being attached to the record viewset as it is now. I can't come up with any cases where another endpoint has to serialize custom types so I think its alright as is, but would like to make sure that assumption is correct.

It would be better if custom renders are listed explicitly as it is meant for rendering data from the user database. If we make it a default It would unnecessarily end up interfering with Mathesar related data(like datafile model, display_options) if we do end up extending the renderer in the future

Copy link
Contributor

@silentninja silentninja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, creating individual time types seems to be much better and flexible.

@@ -36,6 +37,9 @@
NUMERIC = PostgresType.NUMERIC.value.upper()
REAL = PostgresType.REAL.value.upper()
SMALLINT = PostgresType.SMALLINT.value.upper()
DATE = PostgresType.DATE.value.upper()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Date constant has already been initialized at line 31. The types are arranged alphabetically, so this Date constant should be removed

@silentninja silentninja added pr-status: revision A PR awaiting follow-up work from its author after review and removed pr-status: review A PR awaiting review labels Dec 7, 2021
Copy link
Contributor

@dmos62 dmos62 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I don't know this code area as well as some of the others, so I'll refrain from giving explicit approval.

@kgodey kgodey added pr-status: review A PR awaiting review and removed pr-status: revision A PR awaiting follow-up work from its author after review labels Dec 8, 2021
@kgodey
Copy link
Contributor

kgodey commented Dec 8, 2021

I moved the status back to review. I'd like to get this merged ASAP, so @mathemancer, could you please re-review and make any last changes that need to be made to merge it?

WITHOUT_TIME_ZONE_DB_TYPE = base.PostgresType.TIME_WITHOUT_TIME_ZONE.value


class TIME_WITHOUT_TIME_ZONE(SA_TIME):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Converting the class to a string still returns Time rather than TIME WITHOUT TIME ZONE. The str representation method should be overridden to return the appropriate value

super().__init__(*args, timezone=False, precision=precision, **kwargs)


class TIME_WITH_TIME_ZONE(SA_TIME):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Converting the class to a string still returns Time rather than TIME WITH TIME ZONE. The str representation method should be overridden to return the appropriate value

@mathemancer
Copy link
Contributor

mathemancer commented Dec 10, 2021

One question I have is whether the Mathesar renderer should be a general default, instead of just being attached to the record viewset as it is now. I can't come up with any cases where another endpoint has to serialize custom types so I think its alright as is, but would like to make sure that assumption is correct.

I think as-is is fine.

@mathemancer mathemancer requested review from mathemancer and removed request for silentninja December 10, 2021 07:40
Copy link
Contributor

@mathemancer mathemancer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've made a couple of minor changes as requested by @silentninja , and everything else looks good to me.

@silentninja
Copy link
Contributor

Okay, I've made a couple of minor changes as requested by @silentninja , and everything else looks good to me.

The __name__ constant is a class property, so it should be self.__class__.__name__.

@silentninja
Copy link
Contributor

silentninja commented Dec 13, 2021

Thank @mathemancer and @eito-fis

@silentninja silentninja merged commit fb18531 into master Dec 13, 2021
@silentninja silentninja deleted the time_type branch December 13, 2021 08:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-status: review A PR awaiting review
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

Handle TIME data type in the backend
5 participants