Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-42935: Refactor tap module to use the Pydantic data model #49

Merged
merged 3 commits into from Apr 23, 2024

Conversation

JeremyMcCormick
Copy link
Collaborator

@JeremyMcCormick JeremyMcCormick commented Mar 25, 2024

The tap module is updated to use the Pydantic data model, instead of the raw YAML. The "load-tap" command in the command line interface was also updated to use the Pydantic schema object. The PYLD transformations were removed from the cli function, as they seemed to be unnecessary. A few changes were made to the Pydantic data model to support setting of reasonable defaults. A "votable_datatype" annotation was added to Colum. The Column datatype was changed to return the enum instead of a string, as this setting seemed to confuse mypy. Tests were changed to conform to the new tap module and the minor changes to the data model. Testing showed that the schema was loaded correctly into a live PostgreSQL database.

  • Diff the DDL output for the TAP_SCHEMA from main (production) with this branch.

Copy link

codecov bot commented Mar 25, 2024

Codecov Report

Attention: Patch coverage is 98.73418% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 91.58%. Comparing base (0d31a49) to head (17ca2bb).
Report is 36 commits behind head on main.

❗ Current head 17ca2bb differs from pull request most recent head cbad0f1. Consider uploading reports for the commit cbad0f1 to get more accurate results

Files Patch % Lines
python/felis/tap.py 98.36% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #49      +/-   ##
==========================================
+ Coverage   91.14%   91.58%   +0.43%     
==========================================
  Files          21       21              
  Lines        2225     2198      -27     
  Branches      468      458      -10     
==========================================
- Hits         2028     2013      -15     
+ Misses        131      125       -6     
+ Partials       66       60       -6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@JeremyMcCormick JeremyMcCormick force-pushed the tickets/DM-42935 branch 2 times, most recently from 51c0aab to db1de2d Compare March 29, 2024 17:37
Copy link
Contributor

@andy-slac andy-slac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, a couple of minor comments.

python/felis/tap.py Outdated Show resolved Hide resolved
python/felis/tap.py Show resolved Hide resolved
python/felis/tap.py Outdated Show resolved Hide resolved
@JeremyMcCormick
Copy link
Collaborator Author

@gpdf Following our discussion of validating the new tap module which uses Pydantic, I have attached below the existing SQL output from main and that from the updated module. They are exactly the same according to diff.

dp02_dc2_tap_main.sql.txt
dp02_dc2_tap_new.sql.txt

@JeremyMcCormick JeremyMcCormick marked this pull request as draft April 8, 2024 23:07
@JeremyMcCormick JeremyMcCormick marked this pull request as ready for review April 12, 2024 00:43
@JeremyMcCormick JeremyMcCormick changed the title Refactor tap module to use the Pydantic data model DM-42935: Refactor tap module to use the Pydantic data model Apr 12, 2024
@JeremyMcCormick
Copy link
Collaborator Author

@gpdf As per discussion today, I reverted addition of the tap:schema_index field on the Pydantic data model, which should have no effect on the TAP_SCHEMA output. The existing scripts within sdm_schemas handle this in a way which does not use the datamodel module.

@JeremyMcCormick JeremyMcCormick requested review from andy-slac and removed request for andy-slac April 19, 2024 17:34
@JeremyMcCormick
Copy link
Collaborator Author

JeremyMcCormick commented Apr 19, 2024

@andy-slac I invalidated your review, as there are some fix-ups needed here related to setting the TAP_SCHEMA index. I'll have @gpdf review this in its final form.

votable_utype: str | None = Field(None, alias="votable:utype")
"""The VOTable utype (usage-specific or unique type) of the object."""

@model_validator(mode="after")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes are a fix to the Pydantic model_validator as it was using the wrong function signature.

@JeremyMcCormick JeremyMcCormick force-pushed the tickets/DM-42935 branch 2 times, most recently from f96594d to 8e8d967 Compare April 22, 2024 22:17
The tap module is updated to use the Pydantic data model, instead of
the raw YAML. The "load-tap" command in the command line interface
was also updated to use the Pydantic schema object. The PYLD
transformations were removed from the cli function, as they seemed to
be unnecessary. A few changes were made to the Pydantic data model to
support setting of reasonable defaults. A "votable_datatype" annotation
was added to Colum. The Column datatype was changed to return the enum
instead of a string, as this setting seemed to confuse mypy. Tests were
changed to conform to the new tap module and the minor changes to the
data model. Testing showed that the schema was loaded correctly into
a live PostgreSQL database.
Copy link

@gpdf gpdf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general. Please make the one change about the handling of missing table_index values.

@@ -391,7 +393,7 @@ class Table(BaseObject):
primary_key: str | list[str] | None = Field(None, alias="primaryKey")
"""The primary key of the table."""

tap_table_index: int | None = Field(None, alias="tap:table_index")
tap_table_index: int = Field(0, alias="tap:table_index")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please retain the ability for the model to distinguish between "missing" and zero table_index values. We may wish to make the handling more sophisticated in the (near) future.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think column_index still allows None anyway, right?

table.table_index = int(table_obj.get("tap:table_index", 0))
table.utype = table_obj.votable_utype
table.description = table_obj.description
table.table_index = table_obj.tap_table_index
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line would have to be changed as well, perhaps to something like

table.table_index = 0 if table_obj.tap_table_index is None else table_obj.tap_table_index

Make the table index None by default instead of 0 and update the logic
in the tap module accordingly.
Copy link

@gpdf gpdf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@JeremyMcCormick JeremyMcCormick merged commit f31d1e4 into main Apr 23, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants