New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add TableSerializable and mysql_serializer #459
feat: Add TableSerializable and mysql_serializer #459
Conversation
…dels Signed-off-by: xuans <xuan_shen@outlook.com>
Signed-off-by: Sahithi Velma <sahithiv@wepay.com>
Signed-off-by: Sahithi Velma <sahithiv@wepay.com>
Signed-off-by: xuans <xuan_shen@outlook.com>
Signed-off-by: xuans <xuan_shen@outlook.com>
Signed-off-by: xuans <xuan_shen@outlook.com>
Signed-off-by: xuans <xuan_shen@outlook.com>
will take a look |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@crazy-2020 great work! I assume the the file_system_rds_csv_loader or publisher will be in a different pr?
also how's the test so far, are you able to run it locally to load the local mysql. Also i wonder whether we could force every model (or future model) to be subclass of both (GraphSerializable, TableSerializable)
Put a few comments as I just skim through, will take another look tomorrow.
@@ -9,3 +9,5 @@ | |||
SCHEMA_REVERSE_RELATION_TYPE = 'SCHEMA_OF' | |||
|
|||
DATABASE_SCHEMA_KEY_FORMAT = '{db}://{cluster}.{schema}' | |||
|
|||
SCHEMA_PATTERN_REGEX = '([a-z]+://[a-zA-Z0-9_-]+).[a-zA-Z0-9_.-]+' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe a comment on this regex? (e.g like what is the example value, what we are matching for)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do
databuilder/models/table_metadata.py
Outdated
node = self.get_node() | ||
yield node | ||
|
||
def _create_relation_iterator(self) -> Iterator[RDSModel]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be Iterator[GraphRelationship]
as type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, nice catch. I'll update it.
@@ -99,13 +138,13 @@ def __init__(self, | |||
:param source: The unique source of what is populating this description. | |||
:param text: the description text. Markdown supported. | |||
""" | |||
self._source = source | |||
self._text = text | |||
self.source = source |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason we remove the underscore? I think we try to use _var to indicate it is private.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rds models do not have a generic description model. There is only description model for each entity(e.g., TableDescription, SchemaDescription). The code change is going to create description table records from databuilder Table/Schema/Column models, namely, access some attributes outside of DescriptionMetadata class. Should I change it back to protected vars?(I think it may bring some syntax highlight warning). Thanks
record = RDSUser(rk=User.get_user_model_key(email=self.email)) | ||
for attr, value in record_attr_map.items(): | ||
if value or not self.do_not_update_empty_attribute: | ||
record.__setattr__(attr.key, value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you add a comment on this part? Are you trying to set the user record for optional attribute?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I'll add a comment. Yes, it is going to set the empty attribute based on the flag
@@ -249,7 +249,7 @@ def test_table_without_schema(self, mock_build: Any) -> None: | |||
self.assertEqual(result.cluster, 'your-project-here') | |||
self.assertEqual(result.schema, 'fdgdfgh') | |||
self.assertEqual(result.name, 'nested_recs') | |||
self.assertEqual(result.description._text, '') | |||
self.assertEqual(result.description.text, '') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment on the rename
@feng-tao Thanks for the comments. Yes, there will be a separate pr for the features(e.g., mysql_data_loader/publisher) mentioned in another RFC
The test was mainly based on BigQuery and csv samples in databulider. Yes, I am able to push BQ and csv sample data into mysql backend with a rough implementation of mysql_dataloader/publisher on my local. |
@crazy-2020 feel free to ping once you update the pr |
Signed-off-by: xuans <xuan_shen@outlook.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, but will merge it next week in case others have comment
Summary of Changes
The PR added TableSerializable and mysql_serializer, which is the implementation of RFC adding an abstract serializable to support mysql backend
Main changes:
Tests
Unit tests added to test the new TableSerializable, mysql_serializer and table record iterators of databuilder models
Documentation
What documentation did you add or modify and why? Add any relevant links then remove this line
CheckList
Make sure you have checked all steps below to ensure a timely review.
make test