Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create protocol abstraction for Dimensions #16

Closed
Tracked by #7
QMalcolm opened this issue Apr 27, 2023 · 2 comments · Fixed by #41
Closed
Tracked by #7

Create protocol abstraction for Dimensions #16

QMalcolm opened this issue Apr 27, 2023 · 2 comments · Fixed by #41
Assignees
Labels

Comments

@QMalcolm
Copy link
Collaborator

We want a standard definition of what a valid Dimension implementation should have. This allows for a shared definition to be understood in MetricFlow and dbt-core of what anything implementing the Dimension protocol will make available without MetricFlow and dbt-core needing to import each other. This should use the new Protocol type which exists in python 3.8+

@callum-mcdata
Copy link
Contributor

callum-mcdata commented May 3, 2023

@QMalcolm I wanted to lay out a proposed implementation to get your feedback on whether I'm heading in the right direction. Dimensions contain 4 classes that we'll want to migrate over:

  • DimensionType (enum)
  • DimensionValidityParams
  • DimensionTypeParams
  • Dimension

For the final three, I think it is reasonable for us to create Protocols for each of them.

Where I'm a bit fuzzy is in regard to our implementation of enums and methods that we've used the @property decorator on. Let's start with methods.

@Property Methods

To the best of my knowledge, Protocols can't support methods as properties. Which means:

    @property
    def is_primary_time(self) -> bool:  # noqa: D
        if self.type == DimensionType.TIME and self.type_params is not None:
            return self.type_params.is_primary

        return False

Would become:

    def is_primary_time(self) -> bool:
        if self.type == DimensionType.TIME and self.type_params is not None:
            return self.type_params.is_primary
        return False

Then when we implement it in MetricFlow we would use the @Property decorator and the default implementation from the Protocol. That way we ensure that we don't need to change its use all throughout the codebase.

    @property
    def is_primary_time(self) -> bool:
        return super().is_primary_time()

Does that seem like a reasonable implementation to you?

Enums

Okay this one I'm a bit more confused about. Do we ... have to do anything here beyond moving the Enum into this repo and then referencing it in the Protocol? Is it as simple as that?

Protocols

Oh just to make sure I'm also thinking about the Protocols correctly, is this looking fine? Do we have a naming convention of wanting to keep it to the same name or would we want Protocol to be appended onto the end of it so the implementing libraries can use the original name 🤔

class DimensionTypeParamsProtocol(Protocol):
    is_primary: bool
    time_granularity: TimeGranularity
    validity_params: Optional[DimensionValidityParams]

@callum-mcdata
Copy link
Contributor

callum-mcdata commented May 3, 2023

While we wait to cut the code over from MetricFlow, I'm gonna throw a draft into this issue.

# Keeping this the same!
class DimensionType(ExtendedEnum):
    """Determines types of values expected of dimensions."""

    CATEGORICAL = "categorical"
    TIME = "time"

    def is_time_type(self) -> bool:
        """Checks if this type of dimension is a time type"""
        return self in [DimensionType.TIME]

class DimensionValidityParamsProtocol(Protocol):
    is_start: bool = False
    is_end: bool = False

class DimensionTypeParamsProtocol(Protocol):
    is_primary: bool
    time_granularity: TimeGranularity
    validity_params: Optional[DimensionValidityParams]

class DimensionProtocol(Protocol):
    name: str
    description: Optional[str]
    type: DimensionType
    type_params: Optional[DimensionTypeParams]
    is_partition: bool
    expr: Optional[str]
    is_primary_time: bool
    reference: DimensionReference
    time_dimension_reference: TimeDimensionReference
    validity_params: Optional[DimensionValidityParams]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants