Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Fully merge Pydantic Field with SQLAlchemy Column constructor #436

Closed
wants to merge 2 commits into from

Conversation

daniil-berg
Copy link
Contributor

@daniil-berg daniil-berg commented Sep 5, 2022

Abstract

The proposed changes modify the Field constructor to take all parameters defined on the SQLAlchemy Column directly, instead of just a few.

This makes the parameters sa_column, sa_column_args, and sa_column_kwargs completely obsolete!

There is no negative impact on current functionality. All test cases still pass without me modifying them. Test coverage is also not reduced.

Rationale

The definition of model fields should be fully in line with the goals of SQLModel, as stated in the documentation.

Intuitive to write

People coming from an SQLAlchemy background will immediately feel at home with the Field constructor, just as people from a Pydantic background already see all the familiar features of it. This honors the project's intent:

Designed to be easy to use and learn. Less time reading docs.

Until now, leveraging all Column features was a little cumbersome and unclear, as evidenced by numerous issues/questions (see below) for trying to figure out how to design all but the most simple database models. The sa_column-parameters did allow all that functionality, but sometimes led to unexpected behavior for many people.

Easy to use

Here are just a few examples of how these changes would make SQLModel easier to use.

Defining custom SQL types until now required initializing and passing a Column instance to sa_column. This led to almost all database-related Field arguments to be ignored. This is how you had to do it:

class MyModel(SQLModel, table=True):
    custom_field: int = Field(
        sa_column=Column(
            CustomType(impl=Integer()),
            index=True,
        )
    )

Whereas now, you can now write this:

class MyModel(SQLModel, table=True):
    custom_field: int = Field(
        type_=CustomType(impl=Integer()),
        index=True,
    )

Adding additional constraints, defining a different DB column name, etc. all required using sa_column_args/sa_column_kwargs like this:

class AnotherModel(SQLModel, table=True):
    foo: int = Field(
        default=42,
        sa_column_args=[
            CheckConstraint('>0')
        ],
        sa_column_kwargs={
            'name': 'bar',
            'server_onupdate': FetchedValue(),
        }
    )

Now you just need to do this:

class AnotherModel(SQLModel, table=True):
    foo: int = Field(
        CheckConstraint('>0'),
        default=42,
        name='bar',
        server_onupdate=FetchedValue(),
    )

Compatible

The package remains just as compatible with FastAPI, Pydantic and SQLAlchemy as before. If anything, allowing to pass all column parameters directly makes that compatibility even more obvious to people coming from SQLAlchemy to SQLModel.

Extensible/Short

see above

Implementation

Details (click me)

The changes cover basically only three things: The FieldInfo class, the Field function, and the get_column_from_field function. Here are the broad points:

class FieldInfo

  • SQLModel's custom FieldInfo class (inheriting from the Pydantic one) has been extended to incorporate all the column parameters that have been missing.
  • Corresponding __slots__ were defined mimicking the Pydantic approach.
  • Additionally, the attributes were all explicitly type-annotated for easier use "down the line".
  • The __init__ method has been streamlined, but a new helper method get_defined_column_kwargs was added which comes into play later to construct a corresponding Column.

def Field

  • Obviously, this is where the new function parameters have been added to be passed along to the above mentioned FieldInfo.
  • DeprecationWarnings have been added for the sa_column-parameters.
  • The default function parameter has been made keyword-only (like all the rest) and placed behind the newly added variable-length positional arguments *args. (This is technically the only backwards incompatible change.)

def get_column_from_field

  • Refactored and extensively commented.
  • Constructs the *args and **kwargs for the Column based on the provided field as before, but supporting all possible parameters now.
  • Still sets sensible defaults, such as deriving a column type (if it was not provided) based on the field type, and deciding "nullability" based on other field settings.

Affected issues (updated 2023-01-04)

A major anticipated benefit of the proposed changes is the reduction of questions about "how to do SQLAlchemy-thing XYZ".

Affected PRs (updated 2023-01-04)

It is obviously not my intention to trash other people's work. I respect the effort they put in, just think that those changes are no longer needed/sensible in conjunction with this PR.

Caveats & arguments against this

Details (click me)

Huge function signature

The Field function's signature will be even larger and perhaps overwhelming for newcomers.

My response would be that a large function signature is not an argument per se, and that all arguments remain optional, which allows the actual code to be as concise as possible.

Commitment to incorporate future additions

Changes/additions to the Column parameters in SQLAlchemy in the future would necessitate adjusting the SQLModel Field function accordingly. Although this is no different from how Pydantic changes already need to be propagated to the custom Field constructor, going the route suggested in this PR would require keeping track of both base packages (Pydantic and SQLAlchemy) in the future, to ensure maximum compatibility/familiarity.

I don't think this is such a bad thing, since SQLModel aims combine their functionality. If/when SQLAlchemy 2 drops, major changes (maybe even backwards incompatible) may be necessary anyway.

Possible future parameter conflicts

Parameter names for Pydantic's Field and SQLAlchemy's Column might conflict in the future.

This is hypothetical and it would not necessarily even cause problems, but if it does, a possible solution would be to distinguish parameters that are completely unrelated with prefixes like sa_param and py_param or something similar.

Backwards incompatible default parameter

The default parameter for the Field function is now keyword-only. Until now you can call it like this Field("a_default_value"); with the proposed changes, you will have to call it like this Field(default="a_default_value").

This is technically the only backwards incompatible change in this PR. However I have a few arguments for this:

  • The documentation so far used the explicit keyword-argument notation everywhere already. This is also why all test cases still pass. Nowhere was a default passed as a positional argument.
  • I see no reason for the "preferential treatment" of the default parameter in the Field function. This may somewhat make sense with the original Pydantic version of Field because there you would normally assign the default value to your model field directly. However, I don't find it intuitive/appropriate at all in the context of SQLModel since models are intended to represent database tables and fields represent database columns. The default value for a field/column is just one of many parameters and should not be treated differently. I do concede however that this makes is slightly less familiar to people that come from a Pydantic background, because they may be used to using a positional default.
  • This is the only way to reasonably allow using positional arguments as with SQLAlchemy's Column.
  • SQLModel is still in its infancy, so 100% backwards compatibility may not even be a requirement yet. This is ultimately for @tiangolo to decide, of course.

Future To-Dos

Details (click me)
  • Make it possible to provide the column type as the first positional argument to Field as it is with SQLAlchemy Column. This should be fairly straight-forward.
  • Expand the same logic of dropping the sa_-style parameters in favor of full integration to Relationship attributes. This may be more difficult; I haven't had the time to look into that.
  • Add documentation examples demonstrating that all familiar Column configurations work the same with SQLModel fields.

I hope you will find the time to consider this, @tiangolo. I am looking forward to discussions, counter-arguments, and proposed changes to this from other contributors and am willing to keep working on this.

@github-actions
Copy link

github-actions bot commented Sep 5, 2022

📝 Docs preview for commit c93b42e at: https://6315c6b9c76bbe33cb57fcf6--sqlmodel.netlify.app

@codecov
Copy link

codecov bot commented Sep 5, 2022

Codecov Report

Base: 98.49% // Head: 97.90% // Decreases project coverage by -0.59% ⚠️

Coverage data is based on head (bbdea72) compared to base (ea79c47).
Patch coverage: 100.00% of modified lines in pull request are covered.

❗ Current head bbdea72 differs from pull request most recent head 0c0c3aa. Consider uploading reports for the commit 0c0c3aa to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #436      +/-   ##
==========================================
- Coverage   98.49%   97.90%   -0.60%     
==========================================
  Files         185      188       +3     
  Lines        5856     6302     +446     
==========================================
+ Hits         5768     6170     +402     
- Misses         88      132      +44     
Impacted Files Coverage Δ
sqlmodel/main.py 87.81% <100.00%> (ø)
tests/test_sa_column.py 100.00% <100.00%> (ø)
tests/test_main.py 100.00% <0.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@github-actions
Copy link

github-actions bot commented Sep 5, 2022

📝 Docs preview for commit 6cb741b at: https://6315c7ab892a153d236388e6--sqlmodel.netlify.app

@github-actions
Copy link

github-actions bot commented Sep 5, 2022

📝 Docs preview for commit bbdea72 at: https://6315d1e5892a1545e9638201--sqlmodel.netlify.app

sqlmodel/main.py Show resolved Hide resolved
sqlmodel/main.py Show resolved Hide resolved
sqlmodel/main.py Show resolved Hide resolved
@nuno-andre
Copy link

nuno-andre commented Oct 30, 2022

It would be nice if ForeignKey.ondelete and ForeignKey.onupdate were also included, to avoid having to explicitly instantiate ForeignKey in such common use cases.

class AnotherModel(SQLModel, table=True):
    foo: int = Field(foreign_key='mymodel.custom_field', ondelete='CASCADE', onupdate='CASCADE')

@daniil-berg
Copy link
Contributor Author

@nuno-andre I appreciate your suggestion, but I don't think this is a good idea for two reasons:

1) PEP 20

The only reason I didn't remove the foreign_key parameter from the Field function altogether in my PR, is that I wanted to honor @tiangolo's original intent (presumably) of having a shortcut for the most common use case, namely simply referencing the column to point the key to, which I think is a valid approach. And because this keeps my proposed changes backwards compatible.

However, my actual intent with this PR is to mirror Pydantic's Field and SQLAlchemy's Column interfaces in this one function as closely as possible to make it instantly familiar to people from both backgrounds. I think this is accomplished here. You can define foreign keys with non-default attributes just like you would in SQLAlchemy:

Field(ForeignKey("table.key", ondelete="CASCADE", onupdate="CASCADE"), ...)

If we had the two parameters you proposed in addition, we would have yet another way to do the same thing. As I argued in my response to @amisadmin above:

There should be one-- and preferably only one --obvious way to do it.

I would argue that this obvious way is there now, and that it is not a big ask (or a lot of added verbosity) to pass a ForeignKey instance with those same keyword arguments, if you need them. And actually, with my proposed approach (ForeignKey as a positional argument like in SQLA), there is about the same amount of code as there would be with your suggestion and the foreign_key keyword-argument.

2) Inconsistency & partially meaningless parameters

The ondelete and onupdate parameters are only ever meaningful in conjunction with foreign keys. Yet most fields typically aren't foreign keys. In those cases these parameters are meaningless. Most (though not all) other Field parameters on the other hand are meaningful, even if used with their default values. Note that even foreign_key arguably is more meaningful since its default means that the field is not a foreign key. The default to ondelete means nothing at all, unless foreign_key is passed explicitly.

Conversely, a ForeignKey instance encapsulates all the foreign key logic in one object and thus abstracts it from Field. That way for Field the meaning is either "this will be a foreign key" or "this will not be a foreign key". A foreign key will be treated the same as any other of the countless parameters a field/column may have. Why should it be treated differently than the column type or a constraint for example? In my opinion, consistency is important, especially when the interface is so complex already.


In conclusion, although I recognize the benefit of having those dedicated parameters, I think the drawbacks outweigh this benefit significantly and therefore I don't think these parameters should be included. If you want to pursue this further, just like I suggested in the other discussion above, maybe you'll want to offer your own PR, when/if this is merged, and @tiangolo may agree with you and disagree with my reasoning.

I hope this makes sense. Thanks for the feedback though.

@zmievsa
Copy link

zmievsa commented Jan 3, 2023

This is amazing work. Are there any plans on reviewing/merging it, @tiangolo ? Or do we need to provide something extra to make sure that these changes are valid and ready to be merged?

@github-actions
Copy link

github-actions bot commented Jan 4, 2023

📝 Docs preview for commit cdd51ec at: https://63b58ab3facf617cfb46ea71--sqlmodel.netlify.app

allow passing all `Column` arguments directly to `Field`;
make `default` a keyword-only argument for `Field`
@github-actions
Copy link

github-actions bot commented Jan 4, 2023

📝 Docs preview for commit 0c0c3aa at: https://63b58bf2da28e97fcbe76f09--sqlmodel.netlify.app

@daniil-berg
Copy link
Contributor Author

For what it's worth, I just rebased this branch once more on top of the most recent main commits, so as to get rid of the merge conflicts.

I also gave the current state of PRs and issues a cursory review and can add the following to the lists I started above.

Affected issues (continued)

Affected PRs (continued)


Would love to get feedback on this soon. I am still willing to continue working on this.

@rotemvilsa
Copy link

@tiangolo Is there anything missing in order to have this PR merged? It looks like an amazing work has been put here and could give a lot of value to the community as this truly affect lots of PRs and open issues SQLModel currently have.

@idryzhov
Copy link

Great work and really waiting for this to be merged!

@nuno-andre
Copy link

nuno-andre commented Apr 6, 2023

@daniil-berg In my opinion, the goal of SQLModel is not to replicate SQLAlchemy with a Pydantic syntax, but to add persistence to Pydantic models. Under this point of view, the classes assigned to attributes should signify the attribute typology (as subclasses of FieldInfo) and not reference behaviors outside the data model (like in ForeignKey).

Regarding the inconsistency of including parameters not applicable in all cases, this is something that the Pydantic's Field already manages (gt, pattern, min_lenght...) and validates in each instance. IMO this is a wise design decision that allows to implement a very powerful way to easily define data models avoiding code bloat and reducing the cognitive load.

In any case, this PR seems to me an amazing work that certainly deserves to be merged.

@tiangolo
Copy link
Member

Thank you @daniil-berg for the thorough explanation/argument and all the work! 🤓 🍰

And thanks everyone for the discussion here. ☕

I actually prefer not to fully merge the signatures of both Field and Column. There are several things I'm not very fan of in the design of Column, for example receiving an arbitrary sequence of arguments including name, type, constrains, but that at the same time are not required. This has to be kept like that in SQLAlchemy for backwards compatibility, but I prefer not to adopt the same API in SQLModel.

So I have intentionally simplified and limited the things I support in SQLModel, to work for the majority of use cases (say 90%) in a very simple and intuitive way (at least I hope so) without all the more advanced/exotic features that SQLAlchemy supports.

And then for those advanced use cases, there's a way to escape and use all the features from SQLAlchemy, defining the full column, all the arguments, etc.

I see there are problems this PR would help with, here's my point of view on each of them:

Thanks again for the work and all the help with the rest of the project! 🍰

Copy link

github-actions bot commented Nov 9, 2023

Assuming the original need was handled, this will be automatically closed now. But feel free to add more comments or create new issues or PRs.

@github-actions github-actions bot closed this Nov 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants