[MAINTENANCE] Fluent Datasources - don't register "private" Datasource classes #7124
Conversation
```diff
-    print(f"✅ {name} - {schema_path.name} schema updated")
+    print(f"🔃 {name} - {schema_path.name} schema updated")
```
Changed the symbol here so it's easier to tell at-a-glance if schemas have changed
```diff
-    if issubclass(model, Datasource):
-        print(f"🙈 {name} - is a Datasource; skipping")
-        continue
```
Stop skipping `Datasource` schema generation.
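A minimal sketch of what a schema-sync step like this might look like once `Datasource` subclasses are no longer skipped: generate a JSON schema for each exported pydantic model and report the result. The model and function names here are illustrative, not Great Expectations' actual API.

```python
# Hypothetical sketch of a schema-sync step. Assumes pydantic v1-style
# `schema_json`; `Datasource`, `PandasDatasource`, and `sync_schema` are
# stand-ins for the real project's models and script.
import json

import pydantic


class Datasource(pydantic.BaseModel):
    type: str
    name: str


class PandasDatasource(Datasource):
    type: str = "pandas"


def sync_schema(model):
    """Return the model's JSON schema; a real script would write it to disk."""
    schema_text = model.schema_json(indent=4)
    print(f"✅ {model.__name__} schema updated")
    return schema_text


schema_text = sync_schema(PandasDatasource)
```

A real sync script would also compare the generated schema against the file on disk before printing "updated", which is what the symbol change above makes easier to read.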
```diff
@@ -80,7 +79,6 @@ class _PandasDatasource(Datasource):
     asset_types: ClassVar[List[Type[DataAsset]]] = list(_ASSET_MODELS.values())

     # instance attributes
-    type: str = pydantic.Field("_pandas")
```
This field serves no purpose because this class is not meant to be instantiated; it was only defined to prevent a crash during `Metadatasource.__new__` registration. Now that `Metadatasource` skips registration for `_Datasource` classes, we can remove it.
```diff
@@ -49,7 +48,6 @@ class _SparkDatasource(Datasource):
     asset_types: ClassVar[List[Type[DataAsset]]] = [CSVSparkAsset]

     # instance attributes
-    type: str = pydantic.Field("_spark")
```
@Kilo59 I may need some help next week figuring out the right way to create `SparkFilesystemDatasource`, `SparkS3Datasource`, and so on in terms of the `type` field, as well as the schema file naming and generation. Thank you.
I would think this should just be `'spark_file'` or `'spark_filesystem'`, and `'spark_s3'`.
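A minimal sketch of the naming suggested here: each concrete Spark datasource pins its `type` field to a snake_case identifier. The class and field shapes are illustrative stand-ins, not the project's actual definitions.

```python
# Hypothetical sketch of concrete Spark datasources with snake_case
# `type` values, per the naming suggestion above.
import pydantic


class Datasource(pydantic.BaseModel):
    type: str
    name: str


class SparkFilesystemDatasource(Datasource):
    type: str = pydantic.Field("spark_filesystem")


class SparkS3Datasource(Datasource):
    type: str = pydantic.Field("spark_s3")


ds = SparkFilesystemDatasource(name="my_fs")
```

Pinning `type` this way also gives each concrete class a natural schema file name (e.g. one JSON schema per `type` value).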
```diff
@@ -22,10 +22,16 @@
     "additionalProperties": {
         "$ref": "#/definitions/CSVSparkAsset"
     }
 },
```
@Kilo59 I may need some help next week figuring out the right way to create `SparkFilesystemDatasource`, `SparkS3Datasource`, and so on in terms of the `type` field, as well as the schema file naming and generation. Most likely, this particular file will need a different name (e.g., `SparkFilesystemDatasource.json`). Thank you.
LGTM (please also see the comments). Thank you!
Changes proposed in this pull request:

- Don't register "private" `Datasource` classes (e.g. `_PandasDatasource`)
- Update `invoke schema --sync` to be more transparent about when and why schemas are updated (or not updated)
- Generate `Datasource` schemas in addition to `DataAsset` schemas

Definition of Done