-
-
Notifications
You must be signed in to change notification settings - Fork 206
feat: add enum support #274
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
dvarrazzo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello! Thank you very much, this looks good.
Main shortcomings to address:
- we need a dumper able to roundtrip. If we pass an enum as query param, we get back the same enum (at
islevel, in the enum case). - we should support non-string enums too.
I have pushed a commit with some failing tests to your branch.
psycopg/psycopg/types/enum.py
Outdated
| register it globally. | ||
| .. note:: | ||
| Only string enums are supported. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a severe limitation. Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I know, Postgres enum labels can be string only.
I don't know how to manage enums of another types. Maybe add optional parameter to use Python enum keys instead of values?
CREATE TYPE some_enum AS ENUM ('ONE', 'TWO');class SomeEnum(int, Enum):
ONE = 1
TWO = 2
...
register_enum(info, SomeEnum, use_keys_as_labels=True, context=conn)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I have been thinking all the time during my review that adaptation was based on the names. Only towards the end I figured out you were relying on the values.
There is also the case that keys and values might be different for str enums in python, oelr that the mapping python-postgres is not 1:1 perfect.
What I had in mind is that maybe we could accept a mapping enum -> label name to customize the mapping when the postgres label doesn't match the python enum name.
class SomeEnum(str, Enum):
FOO = "VAL1"
BAR = "VAL2"
# create type as enum ('FOO', 'VAL3')
register_enum(... labels={SomeEnum.foo: "VAL3"})He adapters could use, by default, a 1:1 mapping based on names. The labels param might override the mapping only wnrre needed.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Existing serializers (like json.dumps) uses enum values, not keys.
labels parameter looks not really clean, as for me. Enum labels in Postgres and in Python should match - otherwise it looks like architecture problem.
I'd insist on use_keys_as_labels parameter, it should be more clean and clear and easy to use. Just compare:
CREATE TYPE some_enum AS ENUM ('ONE', 'TWO');class SomeEnum(Enum):
ONE = 1
TWO = 2
...
register_enum(info, SomeEnum, labels={SomeEnum.ONE: "ONE", SomeEnum.TWO: "TWO"}, context=conn)
# vs
register_enum(info, SomeEnum, use_keys_as_labels=True, context=conn)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Enum labels in Postgres and in Python should match - otherwise it looks like architecture problem.
I totally agree with you, from the point of view of an application developer. However, psycopg is a driver and its main objective is to try to be complete in the features it supports. That's why, if this feature is about "enum adaptation", I feel it should cover every kind of enum and to make possible to create a flexible mapping to make Python and Postgres talk to each other the way the user desires.
That's why I think we should cover other types of enums too, especially int, because Python makes it possible to define them.
I see the inconsistency you say, regards using enum in json. That comes probably from the fact that MyEnum(str, Enum) is more a str than an Enum (str comes first in MyEnum.__mro__, so, if MyEnum doesn't have a dumper registered, the str dumper takes effect. For the same reason, MyIntEnum(int, Enum) gets dumped successfully (as an integer).
Dumping using keys works with every type of enum (str, int, pure enums): it seems a much more complete solution. ISTM that it's the str enum to be special, because it happens that its values are also valid postgres labels. So, if anything, maybe we could add a parameter use_values_as_labels=True, which could be applied only when the values are strings.
Note that, in the example you make above:
register_enum(info, SomeEnum, labels={SomeEnum.ONE: "ONE", SomeEnum.TWO: "TWO"}, context=conn)in this case, labels wouldn't be needed at all. I would use it only if there are cases in which the Python key doesn't match the Postgres label. So I'd see that this should work out of the box:
# CREATE TYPE some_enum AS ENUM ('ONE', 'TWO');
class SomeEnum(Enum):
ONE = 1
TWO = 2
register_enum(info, SomeEnum, context=conn)and only for keys that don't match you could override the mapping:
class MoreEnum(Enum):
ONE = 1
MORE = 99
register_enum(info, SomeEnum, labels={MoreEnum.MORE: "TWO"} context=conn)so it still seems that if the user has a straightforward mapping, then their register_enum() is simple; otherwise, the more messy it is, the more complex the mapping.
Maybe it's worth to think more about the scope of the feature; however, covering pure and int enums, seems a necessary thing to have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about little compromise?
What about to use some priority:
- custom labels first (
labelsparameter); - enum values (for string-based or mixed enums);
- enum keys.
If would work for most cases. Even if there will be some weird enum like:
class WeirdEnum(Enum):
ONE = "TWO"
TWO = "THREE"It will be possible to deal with it using custom labels (labels={item: item.name for item in WeirdEnum}).
Or even better variant:
class EnumLabels(Enum):
AUTO = 0 # try values first, then keys
KEYS = 1 # use keys as labels
VALUES = 2 # use values as labels
def register_enum(
info: EnumInfo,
python_type: Optional[Type[E]] = None,
labels: Union[EnumLabels, Dict[E, str]] = EnumLabels.AUTO,
context: Optional[AdaptContext] = None,
) -> NoneOr
class EnumLabels(Enum):
AUTO = 0 # try values first, then keys
KEYS = 1 # use keys as labels
VALUES = 2 # use values as labels
CUSTOM_ONLY = 3 # use custom labels only
def register_enum(
info: EnumInfo,
python_type: Optional[Type[E]] = None,
labels: EnumLabels = EnumLabels.AUTO,
custom_labels: Optional[Dict[E, str]] = None,
context: Optional[AdaptContext] = None,
) -> None|
Please also add an entry to |
|
I have made some interface and cleanup and some changes in the direction of using keys, in our enum branch. I still want to verify if everything makes sense this way. Note that in the previous branch there was an inconsistency: if you didn't provide an enum to adapt, one would have been created from the postgres enum, but it would have been a pure enum, not a string one. I want to make more tests especially regarding the default adaptation behaviour. Please let me know if what done here makes sense. |
* moved documentation to `docs/basic/adapt.rst`; * use encoding in `EnumLoader`; * oid-based dumpers; * rewritten tests; * minor fixes.
This removes the need for enums to be str-based. It also removes the asymmetry whereby automatically generated enums were *not* string based, but pure enums.
|
I have added a mapping feature too. I think, with this, enum mapping is feature-complete. The docs should describe well how the labels mapping works. A review is welcome, otherwise I think I'm happy with this feature. |
…EnumInfo The previous name were lifted from the composite adapters. However, that class would have had a more ambiguous "type" attribute, and "types" for the fields, hence the decision of using "python_" and "fields_" prefix to disambiguate. In the enum adapters context, enum is not ambiguous, so a more natural name seems preferred.
They all get created, tests can choose to use only some of them.
This is more consistent with the other register_*() functions and allow a more natural order of arguments if others must be added (e.g. mapping).
A DataError subclass is already raised on mismatch on dump. Be consistent.
This will allow mapping customization; it is also slightly faster because it doesn't require data encoding/decoding.
No need to exhaustively test encodings on ascii data or array properties tested elsewhere.
|
Looks great! Thank you very much! P. S. I guess we can close this pull request, since you polished everything? |
|
@ertaquo, no, let's not close this merge from here anyway. I have force-pushed to your branch. |
|
Merging the branch as it is. Further changes may be made in master or in followup branches. Thank you very much for the proposal and the first implementation, @ertaquo! |
Original PR #274 by ertaquo Original: psycopg/psycopg#274
Merged from original PR #274 Original: psycopg/psycopg#274
Original PR #274 by ertaquo Original: psycopg/psycopg#274
Merged from original PR #274 Original: psycopg/psycopg#274
Postgres enum support (casting from/to Python
enum.Enumtypes).Based on #273