Use structured data for semantic type #44

@eywalker

Currently, semantic types such as Path are supported through a metadata field on the respective column. This works well for data systems with metadata support, such as Arrow and Delta Table, but Polars does not support metadata, so the semantic type information can be lost when converting to a Polars dataframe.

A viable alternative is to capture semantic types with a structured data type. For example, a Path value could be represented as {_semantic_type: "path", path: "/path/value/as/string"}, which corresponds to a column type of {_semantic_type: str, path: str}. The presence of the special field _semantic_type, together with its value, would allow the semantic type to be inferred correctly and let the system handle the content of the column in a manner that respects the nature of the data (e.g. for Path, the content of the file or directory that the path points to would be hashed or transferred).
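To make the proposal concrete, here is a minimal sketch of the encoding in plain Python, with dicts standing in for struct values. The helper names (`encode_path`, `infer_semantic_type`, `decode_path`) are hypothetical and are not part of any existing API; only the `_semantic_type` field and the `path` field come from the description above.

```python
from __future__ import annotations

from pathlib import PurePosixPath
from typing import Any, Optional

# Special field name taken from the proposal above.
SEMANTIC_TYPE_FIELD = "_semantic_type"


def encode_path(path: str) -> dict[str, str]:
    """Wrap a path string in the proposed struct form (hypothetical helper)."""
    return {SEMANTIC_TYPE_FIELD: "path", "path": path}


def infer_semantic_type(value: Any) -> Optional[str]:
    """Return the semantic type tag if the value carries one, else None."""
    if isinstance(value, dict):
        tag = value.get(SEMANTIC_TYPE_FIELD)
        if isinstance(tag, str):
            return tag
    return None


def decode_path(value: dict[str, str]) -> PurePosixPath:
    """Recover a path object from a struct value tagged as "path"."""
    if infer_semantic_type(value) != "path":
        raise ValueError("value is not tagged with semantic type 'path'")
    return PurePosixPath(value["path"])


record = encode_path("/path/value/as/string")
plain = {"path": "/path/value/as/string"}  # no tag, so treated as ordinary data
```

Because the tag travels inside the value itself rather than in column metadata, it should survive conversion into any system that supports struct columns, which is the motivation for the proposal. The cost is one extra string field per value.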
