Feat: Add virtual properties and rename table_properties to physical_properties#2561
izeigerman merged 11 commits into SQLMesh:main
sqlmesh/core/model/meta.py
can we remove this and do a migration to move all physical_properties in the database to virtual_properties?
but accept models in the file system from the model file to map to physical_properties?
Aren't these fields meant for Pydantic field parsing?
My goal is to be able to parse:
- table_properties ("deprecated")
- physical_properties (the new table_properties)
- virtual_properties (the new properties for virtual layer)
Should that happen with a custom Field that factors in both table_properties and physical_properties?
> can we remove this and do a migration to move all physical_properties in the database to virtual_properties?
Do you mean migrating all table_properties to physical_properties?
so i imagine pydantic will only have two fields, physical/virtual_properties. we migrate in the state all usage of table_properties -> physical properties.
at parse time, if we detect table_properties, we log a message and remap that to physical_properties
I don't think we need to migrate anything. Can't we just add a table_properties alias to the physical_properties_ field?
If not, we can also handle this in the root model validator. And just pop table_properties, emit a warning, and then set it as physical_properties
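For illustration, the validator-based option could look roughly like the sketch below. It is plain Python rather than the actual Pydantic validator in meta.py; the function name `remap_table_properties` and the rule that an explicit `physical_properties` takes precedence over the legacy key are assumptions, not something decided in this thread.

```python
import warnings
from typing import Any, Dict


def remap_table_properties(values: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `values` with the deprecated key moved to the new one."""
    values = dict(values)  # copy so the caller's dict is not mutated
    if "table_properties" in values:
        legacy = values.pop("table_properties")
        warnings.warn(
            "`table_properties` is deprecated, use `physical_properties` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # Assumption: if both keys are present, the explicit new key wins.
        values.setdefault("physical_properties", legacy)
    return values
```

Calling it with `{"table_properties": {...}}` returns a dict that only contains `physical_properties` and emits one deprecation warning; input that already uses `physical_properties` passes through unchanged.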
In any case, only physical_properties should stay, and no updates should be necessary to the EngineAdapter layer. EngineAdapter shouldn't care whether it's "physical" or "virtual". Different level of abstraction.
I also don't think a migration is needed, or maybe I'm missing something?
@izeigerman, in this commit I've updated the Pydantic parsing to accept both, with the related warning. Isn't that what you suggested?
Regarding EngineAdapter, I think we could indeed skip the physical / virtual naming and go straight to the naming tied to the actual materialization: table_properties / view_properties. WDYT?
In this commit I tried to set proper boundaries: the physical/virtual naming is kept up to the evaluator, which translates those into table/view (depending on the actual chosen materialization).
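As a rough picture of that boundary, the sketch below maps the model-level names onto adapter-level keyword names. Everything in it is hypothetical: the function `adapter_kwargs`, the `layer` and `materialized_as_view` parameters, and the returned keyword names are illustrative and not the code in this PR.

```python
from typing import Any, Dict


def adapter_kwargs(
    layer: str, properties: Dict[str, Any], materialized_as_view: bool = False
) -> Dict[str, Dict[str, Any]]:
    """Translate model-level properties into adapter-level keyword arguments."""
    if layer not in ("physical", "virtual"):
        raise ValueError(f"unknown layer: {layer!r}")
    # Assumption: the virtual layer is always a view, while the physical layer
    # is a view only when the chosen materialization says so.
    if layer == "virtual" or materialized_as_view:
        return {"view_properties": properties}
    return {"table_properties": properties}
```

With this split, the adapter only ever sees table or view properties and never needs to know which layer they came from.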
Hey @Kayrnt, after additional consideration, I've realized that you do need the migration after all. Since we're getting rid of the table_properties field altogether we do need to reflect this in the existing state.
The fact that you had to change the fixtures/migrations/snapshots.json file confirms this suspicion. If we don't migrate, the next time users run the plan command after migration, it will show the diff because the field name has changed.
You can test this by first initializing a project with duckdb using a previous SQLMesh version and then upgrading to your current branch and running the plan command.
Sorry about the confusion initially, I really hoped we could avoid it.
is the error due to a new version of pydantic?
No I think it's PR code related as it works if I just use
7506512 to 32bdcbd
tests/core/test_model.py
let's use the builder api for the tests,
exp.array("('test-label', 'label-value')")
Right, it's more compact 👍
can you use the sqlglot builder here?
Sure, then I'll update the existing code too to be consistent
tests/core/test_model.py
tests/core/test_model.py
table_properties was renamed to physical_properties and since this is a single line of JSON the whole thing is included in the diff.
I think this file shouldn't have changed. The fact that it did indicates that a migration script is necessary.
I think we could keep table_properties as the Pydantic model is backward compatible (unless it's used another way?).
I didn't look into migrations so far so I'm not sure how it works and what's needed.
seems almost ready, @izeigerman can you take a final look
georgesittas left a comment
LGTM as well, a few small suggestions.
docs/concepts/models/overview.md
- - A key-value of arbitrary table properties specific to the target engine applied on SQLMesh physical layer. For example:
+ - A key-value mapping of arbitrary table properties specific to the target engine applied on SQLMesh's physical layer. For example:
docs/concepts/models/overview.md
- - A key-value of arbitrary table properties specific to the target engine applied on SQLMesh virtual layer. For example:
+ - A key-value mapping of arbitrary table properties specific to the target engine applied on SQLMesh's virtual layer. For example:
- | `physical_properties` | Arbitrary table properties specific to the target engine applied on SQLMesh physical layer. Specified as key-value pairs (`key = value`)
- | `virtual_properties` | Arbitrary table properties specific to the target engine applied on SQLMesh virtual layer. Specified as key-value pairs (`key = value`) | dict | N |
+ | `physical_properties` | Arbitrary table properties specific to the target engine applied on SQLMesh's physical layer. Specified as key-value pairs (`key = value`)
+ | `virtual_properties` | Arbitrary table properties specific to the target engine applied on SQLMesh's virtual layer. Specified as key-value pairs (`key = value`) | dict | N |
sqlmesh/core/model/meta.py
Nit: I think this is expected to always exist right? Given that name is required.
- model_name = values.get("name")
+ model_name = values["name"]
Yes, it feels like it should always exist, as it's explicitly added in the provided validation fields. I'll add that change so that it fails eagerly if that behavior changes (thus preventing a silent switch to a Python model "None" log).
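The difference between the two lookups can be shown with a plain dict. This is only a sketch; the helper names `model_name_lenient` and `model_name_strict` are made up for the comparison and do not exist in meta.py.

```python
from typing import Any, Dict, Optional


def model_name_lenient(values: Dict[str, Any]) -> Optional[str]:
    # .get() hides a missing key and silently yields None.
    return values.get("name")


def model_name_strict(values: Dict[str, Any]) -> str:
    # Indexing raises KeyError as soon as "name" is missing.
    return values["name"]
```

With `values = {}`, the lenient version returns `None`, which would end up in the log message, while the strict version raises `KeyError` right away.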
sqlmesh/core/model/meta.py
- physical_properties = {}
- for expression in self.physical_properties_.expressions:
-     physical_properties[expression.this.name] = expression.expression
- return physical_properties
+ return {e.this.name: e.expression for e in self.physical_properties_.expressions}
Thanks, those suggestions appear more "Pythonic" 👍
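To check that the loop and the comprehension build the same mapping, here is a small self-contained sketch. `Key` and `Prop` are stand-in dataclasses that only mimic the `.this.name` and `.expression` attributes used in the suggestion; they are not sqlglot classes.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Key:
    name: str  # mimics the `.name` of the left-hand side of `key = value`


@dataclass
class Prop:
    this: Key        # the key node
    expression: Any  # the value node


def with_loop(expressions: List[Prop]) -> Dict[str, Any]:
    result: Dict[str, Any] = {}
    for expression in expressions:
        result[expression.this.name] = expression.expression
    return result


def with_comprehension(expressions: List[Prop]) -> Dict[str, Any]:
    return {e.this.name: e.expression for e in expressions}
```

Both functions keep insertion order and let a later duplicate key overwrite an earlier one, so the change is purely stylistic.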
sqlmesh/core/model/meta.py
- virtual_properties = {}
- for expression in self.virtual_properties_.expressions:
-     virtual_properties[expression.this.name] = expression.expression
- return virtual_properties
+ return {e.this.name: e.expression for e in self.virtual_properties_.expressions}
4416d09 to 8bae4b4
sqlmesh/core/engine_adapter/base.py
Model properties means something else throughout the code so let's not use it here. Additionally, the engine adapter is too low-level a construct to be aware of something as high-level as a "model".
I'm OK with keeping this as table_properties or just properties since that's what we create here. If you want to be more specific I'm also fine with something like table_or_view_properties.
Ok I see what you mean!
The goal of that change was to be consistent with the call parameters since we have both:
- https://github.com/Kayrnt/sqlmesh/blob/8bae4b47b7bd5a038f05bd3d51a126c4c747fcf8/sqlmesh/core/engine_adapter/bigquery.py#L532
- https://github.com/Kayrnt/sqlmesh/blob/8bae4b47b7bd5a038f05bd3d51a126c4c747fcf8/sqlmesh/core/engine_adapter/bigquery.py#L571
I would also go with table_or_view_properties (and table_or_view_properties_to_expressions) then, as I'd rather have future maintainers understand that the function expects one or the other (and that it's not a bug).
I'll make an update.
sqlmesh/core/model/meta.py
Unfortunately, the migration script is required to rename
@izeigerman Looking a bit at the migration system, what you suggest is to create a new migration (something like
I could do it in SQL but I wonder if I can safely assume that all state connections can support JSON modifications or else try a
@Kayrnt yes, the steps look accurate to me.
Yes, that's the safest approach which works uniformly across all supported engines. You shouldn't worry about millions of rows. It's going to be 10s of thousands at most.
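A minimal sketch of the engine-agnostic variant, assuming the migration reads each serialized snapshot, rewrites the JSON in Python, and writes it back. The function name `rename_table_properties` and the assumption that model fields live under a top-level `"node"` key are illustrative; reading and writing the state rows is left out on purpose.

```python
import json
from typing import Optional


def rename_table_properties(snapshot_json: str) -> Optional[str]:
    """Return the rewritten JSON, or None when the row needs no update."""
    snapshot = json.loads(snapshot_json)
    node = snapshot.get("node", {})  # assumed location of the model fields
    if "table_properties" not in node:
        return None
    # Move the value under the new key without touching anything else.
    node["physical_properties"] = node.pop("table_properties")
    return json.dumps(snapshot)
```

Returning `None` for untouched rows lets the caller skip the write, and running the function twice on the same row is harmless because the second pass finds nothing to rename.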
izeigerman left a comment
This looks great, thanks a lot for addressing changes!
closes #2014
- virtual_properties is introduced: it is a dictionary that propagates the provided properties to the engine for the virtual layer (the "target view")
- physical_properties replaces table_properties (we keep the value as an alias for backward compatibility)
- physical_properties instead of table_properties
To be figured out:
? So far, it's just on the physical layer.