Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(dbt sync): Merge metadata and preserve Preset configs #238

Merged
merged 7 commits into from
Aug 31, 2023

Conversation

Vitor-Avila
Copy link
Contributor

By default, when triggering a dbt sync the CLI:

  1. Delete all existing metrics from the dataset.
  2. Create metrics based on the dbt definitions.

While this is the expected scenario for the integration (to preserve dbt as the source of truth), this approach can be impactful for organizations that are already extensively using Preset/Superset and therefore have custom metrics defined in their datasets that don't exist in dbt yet. For these cases, the sync could result in broken charts/dashboards, since metrics that only exist in Preset don't get re-created.

This PR introduces two modifications:

  • Changing the --preserve-columns flag to --preserve-metadata: The --preserve-columns was used to prevent a column sync during the update, to properly preserve column configs (Is temporal, Is dimension, etc). This flag is now re-named to --preserve-metadata and also:
    • Preserve Preset-only metrics in the dataset (they are no longer deleted).
    • For metrics that exist both in Preset and dbt, Preset metadata is maintained (label, metric syntax, etc).
  • Creating a new flag --merge-metadata: This flag can be used to:
    • Preserve Preset-only metrics in the dataset (they are no longer deleted).
    • For metrics that exist both in Preset and dbt, dbt metadata is applied (label, metric syntax, etc).

The validation if the metric already exists in Preset happens comparing the metric name from dbt with the metric_name from Preset.

Copy link
Member

@betodealmeida betodealmeida left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/preset_cli/cli/superset/sync/dbt/command.py Outdated Show resolved Hide resolved
src/preset_cli/cli/superset/sync/dbt/command.py Outdated Show resolved Hide resolved
Comment on lines +44 to +48
for key in ("changed_on", "created_on", "type_generic"):
if key in metadata:
del metadata[key]

return metadata
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for key in ("changed_on", "created_on", "type_generic"):
if key in metadata:
del metadata[key]
return metadata
return {
k: v for k, v in metadata.items()
if k not in {"changed_on", "created_on", "type_generic"}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(You might also want to make this function more generic by passing the keys that should be filtered out as an argument.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a great idea! I think for now the scope might be limited, but I'll check if this is used in other parts of the code (not necessarily related with the dbt sync) and I can improve this in a future PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsure why, but the suggested change breaks our current tests. It might require updating the tests as well. I'll try to cover this too while simplifying sync_datasets.

src/preset_cli/cli/superset/sync/dbt/datasets.py Outdated Show resolved Hide resolved
tests/cli/superset/sync/dbt/command_test.py Outdated Show resolved Hide resolved
Vitor-Avila and others added 6 commits August 29, 2023 23:26
Co-authored-by: Beto Dealmeida <roberto@dealmeida.net>
Co-authored-by: Beto Dealmeida <roberto@dealmeida.net>
Co-authored-by: Beto Dealmeida <roberto@dealmeida.net>
@Vitor-Avila Vitor-Avila merged commit 96f087c into main Aug 31, 2023
4 checks passed
@Vitor-Avila Vitor-Avila deleted the dbt_preserve_metrics branch August 31, 2023 22:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants