Rename Identifier -> Entity #490

Merged
merged 23 commits into from
May 4, 2023

Conversation

WilliamDee
Contributor

resolves dbt-labs/dbt-semantic-interfaces#9

Description

In the new world of dbt-core x MetricFlow, Identifiers are becoming Entities. Additionally, some of the properties of the object are changing. The resulting object should have the following properties:

| Property Name | Type | Description |
| --- | --- | --- |
| name | str | Name of the entity |
| type | enum | Type of the entity |
| description | str | Description of the entity |
| role | str | Role of the entity |
| entities | List[str] | List of composite sub-entities |
| expr | str | Expression of the entity |

The above properties were pulled from dbt-labs/dbt-core#7456
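
As a rough illustration of the property table above, the target shape could be sketched with standard-library dataclasses. This is only a sketch under assumptions: the class in this PR derives from HashableBaseModel rather than a dataclass, only the UNIQUE enum member appears in the excerpts here (any others are omitted), and the example values are invented for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class EntityType(Enum):
    """Type of the entity; only UNIQUE is visible in this PR's excerpts."""

    UNIQUE = "unique"


@dataclass(frozen=True)
class Entity:
    """Illustrative stand-in for the Entity object described in the table."""

    name: str
    type: EntityType
    description: Optional[str] = None
    role: Optional[str] = None
    entities: List[str] = field(default_factory=list)  # composite sub-entities
    expr: Optional[str] = None


# Hypothetical example values, not taken from the PR.
example = Entity(name="user", type=EntityType.UNIQUE, expr="user_id")
```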

@cla-bot cla-bot bot added the cla:yes label May 2, 2023

github-actions bot commented May 2, 2023

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

@WilliamDee WilliamDee force-pushed the will/rename-identifiers branch 9 times, most recently from 7751fd3 to ea39e77 Compare May 3, 2023 18:37
@WilliamDee WilliamDee marked this pull request as ready for review May 3, 2023 18:40
@WilliamDee WilliamDee force-pushed the will/rename-identifiers branch 2 times, most recently from b885b6f to 442aae0 Compare May 3, 2023 18:59
@tlento tlento temporarily deployed to DW_INTEGRATION_TESTS May 3, 2023 22:01 — with GitHub Actions Inactive
Contributor

@tlento tlento left a comment

Working through it, need to make dinner. Renaming things is so terrible, sorry.

Comment on lines 40 to 49
class Entity(HashableBaseModel, ModelWithMetadataParsing):
    """Describes a entity"""

    name: str
    description: Optional[str]
    type: EntityType
    role: Optional[str]
    entities: List[CompositeSubEntity] = []
    expr: Optional[str] = None

Contributor

I think we should leave the metadata key here for now, as it's how we track lineage internally.

@QMalcolm this might be the time to rename metadata to something more descriptive, like lineage_metadata or lineage_info or something.

Contributor

Update: We've decided to keep it for now. We might not use it right away, and we might refine it slightly before we do. The metadata as a concept can be incredibly useful for user experience.

UNIQUE = "unique"


class CompositeSubEntity(HashableBaseModel):
Contributor

This makes everything so gnarly, but let's just move it over for now and decide what to do with it later.

@@ -294,16 +294,16 @@ def _convert_dimensions(
             select_columns=select_columns,
         )

-    def _create_identifier_instances(
+    def _create_entity_instances(
Contributor

I am up to here.

role: Optional[str]
entities: List[CompositeSubEntity] = []
expr: Optional[str] = None
metadata: Optional[Metadata] = None
Contributor Author

We had it as Optional[Metadata] without a default value. I added a default of None so that it doesn't have to be set to None explicitly when this property isn't passed in.
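
The effect of adding the default can be sketched with standard-library dataclasses. This is an illustration under assumptions, not the project's code: the real class is built on HashableBaseModel, which may treat an Optional field without a default differently, and the Metadata class and its file_path field below are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Metadata:
    """Hypothetical stand-in for the project's Metadata type."""

    file_path: str


@dataclass
class EntityWithoutDefault:
    name: str
    metadata: Optional[Metadata]  # no default: every caller must pass it


@dataclass
class EntityWithDefault:
    name: str
    metadata: Optional[Metadata] = None  # default: callers may omit it


# Omitting metadata works once a default is declared.
with_default = EntityWithDefault(name="user")

# Without a default, the dataclass constructor rejects the call.
try:
    EntityWithoutDefault(name="user")
    omitted_ok = True
except TypeError:
    omitted_ok = False
```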

@@ -24,3 +24,23 @@ more-itertools = "8.10.0"
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

[tool.black]
Contributor Author

@QMalcolm added this because there were some CI issues with different black configurations between MF and this repo.

Contributor

@QMalcolm QMalcolm left a comment

Review of first 10 commits. Will review the rest after lunch.

Contributor

@tlento tlento left a comment

Whew. Massive!

Test assertion change is odd, what's up there?

@@ -197,4 +197,4 @@ def test_local_linked_elements_for_metric(metric_semantics: MetricSemantics) ->
 def test_get_data_sources_for_entity(data_source_semantics: DataSourceSemantics) -> None:  # noqa: D
     entity_reference = EntityReference(element_name="user")
     linked_data_sources = data_source_semantics.get_data_sources_for_entity(entity_reference=entity_reference)
-    assert len(linked_data_sources) == 9
+    assert len(linked_data_sources) == 8
Contributor

Wait what? Why?

Contributor Author

So basically, when I removed the entity property on the Identifier class, I removed all the entity fields on the test models. One of them was this one: 200445c#diff-e939351cfbeb60d978e42432cbfb25ebba45ff60f4ca48457712dad0eefd0ce3. As a result, the bookings_source data source is no longer returned by get_data_sources_for_entity("user").
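
A toy model of why the count drops by one might look like the following. Everything here is an assumption made for illustration: the data structures, the element names ("guest", "users_source"), and the lookup rule (match on the linked entity if present, otherwise on the element's own name) are hypothetical and are not MetricFlow's implementation.

```python
from typing import Dict, List, Optional, Tuple

# Each data source lists (element name, linked entity or None) pairs.
IdentifierRow = Tuple[str, Optional[str]]

# Before: an element named "guest" carries an `entity` field pointing at "user".
before: Dict[str, List[IdentifierRow]] = {
    "bookings_source": [("guest", "user")],
    "users_source": [("user", None)],
}

# After: the `entity` field is removed, so only the element's own name remains.
after: Dict[str, List[IdentifierRow]] = {
    source: [(name, None) for name, _ in rows] for source, rows in before.items()
}


def get_data_sources_for_entity(
    model: Dict[str, List[IdentifierRow]], entity_name: str
) -> List[str]:
    """Return data sources with an element whose linked entity (or own name) matches."""
    return sorted(
        source
        for source, rows in model.items()
        if any((linked or name) == entity_name for name, linked in rows)
    )


sources_before = get_data_sources_for_entity(before, "user")
sources_after = get_data_sources_for_entity(after, "user")
```

In this toy model, bookings_source matches "user" only through the removed field, so it drops out of the result once the field is gone.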

Contributor

Ok cool, that makes sense, thanks!

metricflow/test/model/validations/test_entities.py (outdated, resolved)
Contributor

@QMalcolm QMalcolm left a comment

Finished my first pass! 😂

-raise RuntimeError(
-    f"Could not find identifier instance with name ({identifier_spec_in_right_node})"
-)
+raise RuntimeError(f"Could not find identifier instance with name ({entity_spec_in_right_node})")
Contributor

@QMalcolm QMalcolm May 4, 2023

f"Could not find identifier ... -> f"Could not find entity ...

Contributor

Looks like you caught this already, nice!

@tlento tlento temporarily deployed to DW_INTEGRATION_TESTS May 4, 2023 21:20 — with GitHub Actions Inactive
Contributor

@tlento tlento left a comment

LGTM, assuming all of the various snapshot tests pass across engines and @QMalcolm doesn't turn up any issues.

Contributor

@QMalcolm QMalcolm left a comment

I think we should dub this "The Great Renamen-ing". This looks good to me! (Contingent on the remaining references to identifiers / identifier being planned to go away in the renaming of DataSource to SemanticModel.)

@WilliamDee WilliamDee merged commit 97a7ed3 into main May 4, 2023
@WilliamDee WilliamDee deleted the will/rename-identifiers branch May 4, 2023 22:24
Successfully merging this pull request may close these issues.

Rename Identifiers to Entities and update object keys to new spec
3 participants