Support for server-specific data types. #63
Comments
Great question, thank you for raising this. These are some very off-the-top-of-my-head thoughts, and I welcome all disagreement and agreement. Going with my gut feeling, on types generally I see it potentially in three dimensions:
I like the config map style option: it keeps things simple for those who need simple and allows more refined data types to be added later. It would also allow a parser to easily output, for example, an AVRO schema later, or to start with an AVRO schema and write out the start of a data contract.
I like the config map style option.
My concern with the config map option is that it may introduce a lot of "engineer" talk into a document that ideally should be reviewed and used by a broad audience. I believe this could be handled either by doing the mapping internally or by modelling the specifics of the data type (e.g. the precision, min/max value, time aware, encoding, scale), as those are things that could apply to a variety of data types and have semantic significance: they make things explicit, potentially without making the document unapproachable. I'd need some worked examples to verify this; I'll try to carve out some time to work through them.
I don't disagree with the statement about a lot of "engineer" talk; however, I fear that argument may already be shot down, looking at the following support for JSON. At the same time, I think it is very important that we define these using a common method, as they are also useful for the data tests, as long as we can map those standards to the correct export types for each of the different export methods. As for the non-engineers, maybe we offer a --light style option that parses out the technical detail, leaving just a summary for non-technical implementers to review. Or the HTML viewer could even have a toggle to switch between both, which would allow multiple parties to review the same document. I do think there is great benefit in keeping all the detail in one place to save duplication, but it also means we need to be able to display what is appropriate to each group of users and, at the same time, ensure that when we export to AVRO, JSON, etc. we put the maximum detail possible into the schemas (or tests).
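To make the --light idea a bit more concrete, here is a minimal sketch, assuming the contract has already been loaded into plain Python dicts and that the technical detail lives under keys named config. The function name strip_config and the sample contract are hypothetical and only for illustration; this is not an existing CLI feature.

```python
def strip_config(node):
    """Return a copy of a contract structure with every `config` key removed."""
    if isinstance(node, dict):
        # Drop the technical `config` maps, recurse into everything else.
        return {key: strip_config(value) for key, value in node.items() if key != "config"}
    if isinstance(node, list):
        return [strip_config(item) for item in node]
    return node


# Hypothetical contract fragment, already parsed from YAML into dicts.
contract = {
    "models": {
        "orders": {
            "config": {"avroNamespace": "my.namespace"},
            "fields": {
                "my_field_1": {
                    "description": "Example timestamp field",
                    "type": "timestamp",
                    "config": {"avroType": "long", "snowflakeType": "timestamp_tz"},
                }
            },
        }
    }
}

light = strip_config(contract)
print(light)
```

The full document stays the single source of truth; the light view is derived from it on demand, so nothing has to be maintained twice.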
This will be a useful PR to add. For any other sorts of type I like |
We decided to go with Option: Add config map with server-specific fields (dbt-style). We decided to support a config map on model and field level. A config map may include any additional key-value pairs and support multiple server type bindings. Example:

    models:
      orders:
        config:
          avroNamespace: "my.namespace"
        fields:
          my_field_1:
            description: Example for AVRO with Timestamp (millisecond precision)
            type: timestamp
            example: 1970 00:00:00.000 UTC
            config:
              avroType: long
              avroLogicalType: timestamp-millis
              snowflakeType: timestamp_tz
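As a rough illustration of how tooling might consume such a config map, the sketch below resolves the Avro type for a field: a server-specific binding in config wins, otherwise a default mapping from the logical type is used. The function resolve_avro_type and the DEFAULT_AVRO_TYPES table are assumptions made for this example (and the table is deliberately incomplete); this is not how the Data Contract CLI is known to implement it.

```python
# Hypothetical, partial fallback mapping from logical types to Avro types.
DEFAULT_AVRO_TYPES = {
    "string": "string",
    "long": "long",
    "boolean": "boolean",
    "timestamp": "long",
}


def resolve_avro_type(field: dict) -> dict:
    """Build an Avro type description for a field, preferring config overrides."""
    config = field.get("config", {})
    # An explicit avroType in the config map takes precedence over the default.
    avro_type = config.get("avroType", DEFAULT_AVRO_TYPES.get(field["type"], "string"))
    result = {"type": avro_type}
    if "avroLogicalType" in config:
        result["logicalType"] = config["avroLogicalType"]
    return result


# The field from the example above, as a parsed dict.
with_config = {
    "type": "timestamp",
    "config": {
        "avroType": "long",
        "avroLogicalType": "timestamp-millis",
        "snowflakeType": "timestamp_tz",
    },
}
# A field without any config map falls back to the default mapping.
without_config = {"type": "boolean"}

resolved = resolve_avro_type(with_config)
fallback = resolve_avro_type(without_config)
print(resolved)
print(fallback)
```

Keys meant for other servers, such as snowflakeType, are simply ignored by this exporter, which is what lets one config map carry bindings for several server types at once.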
The specification has a closed enumeration of logical data types:
https://datacontract.com/#data-types
The enumeration makes it simple to support the creation of data contracts using tools like schema store. It is also very useful for checks and other logic in tests, import, and export. It also helps if the data provider and data consumer use different technologies: tools like the Data Contract CLI can then convert the logical data type to the appropriate export format.
In some cases, however, the enumeration is not enough, e.g., when a specific type is used (e.g. TIMESTAMP_LTZ in Snowflake, SMALLINT) and it is important or helpful to specify this information.
Option: Do nothing
Option: Enumeration to String
type attribute
Option: Additional attribute customType
custom
customType
Option: Additional attribute physicalType
Option: Server-specific fields
Snowflake:
Option: Add config map with server-specific fields (dbt-style)
Like above, but put all additional information that may be useful for tooling into a config, meta, ... structure.