[MAINTENANCE] General cleanup/refactor of DataAssistantResult
#5198
@@ -38,6 +39,8 @@ | |||
) | |||
from great_expectations.types import ColorPalettes, Colors, SerializableDictDot | |||
|
|||
ColumnDataFrame = namedtuple("ColumnDataFrame", ["column", "df"]) |
Just to provide some more clarity, let's associate our column-df tuple with names.
really really like this adjustment
Love it!
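A minimal sketch of what the named fields buy, assuming a hypothetical `FakeFrame` stand-in that only mimics a DataFrame's `columns` attribute (so the example runs without pandas); the column names are illustrative and not taken from the PR:

```python
from collections import namedtuple

ColumnDataFrame = namedtuple("ColumnDataFrame", ["column", "df"])


class FakeFrame:
    """Hypothetical stand-in for a pandas DataFrame; exposes only `columns`."""

    def __init__(self, columns):
        self.columns = list(columns)


# Illustrative data only; the real tuples pair a column name with its metrics DataFrame.
column_dfs = [
    ColumnDataFrame(
        column="passenger_count",
        df=FakeFrame(["batch", "min_value", "max_value"]),
    ),
]

# Positional access still works, but the field names document each slot.
same_columns = column_dfs[0][1].columns == column_dfs[0].df.columns
first_column_name = column_dfs[0].column
```

Because a namedtuple is still a tuple, existing call sites that unpack `(column, df)` positionally keep working while new code can use `.column` and `.df`.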
theme: Dict[str, Any] = DataAssistantResult._get_theme(theme=theme) | ||
theme = DataAssistantResult._get_theme(theme=theme) |
Already declared above so the type annotation is not necessary.
@@ -440,15 +443,15 @@ def get_expect_domain_values_to_be_between_chart( | |||
column | |||
for column in df.columns | |||
if column | |||
not in [ | |||
not in { |
Very minor, but a membership lookup in a set is faster than in a list.
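For illustration, a small sketch of the two lookups. The excluded names below are hypothetical, not the ones in the diff. A list membership test scans elements one by one, while a set uses a hash lookup that is constant time on average; the filtered result is the same either way.

```python
# Hypothetical column names, used only to show the two forms side by side.
excluded_list = ["batch", "min_value", "max_value"]
excluded_set = {"batch", "min_value", "max_value"}

columns = ["batch", "passenger_count", "min_value", "fare_amount"]

# `not in` on a list is a linear scan; on a set it is a hash lookup.
kept_via_list = [column for column in columns if column not in excluded_list]
kept_via_set = [column for column in columns if column not in excluded_set]
```

With only a handful of names the speed difference is negligible, which matches the "very minor" framing; the set literal mainly signals that order and duplicates do not matter.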
column_based_expectation_configurations_by_type: Dict[ | ||
str, List[ExpectationConfiguration] | ||
] = self._filter_expectation_configurations_by_column_type( | ||
expectation_configurations, include_column_names, exclude_column_names | ||
) |
Helper method to filter relevant column-based expectation configurations from the list - the result is a dictionary with keys representing expectation name and values representing the list of relevant configs.
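A rough sketch of what such a grouping helper could look like. This is not the method from the PR: `FakeExpectationConfiguration`, the function name, and the include/exclude semantics are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FakeExpectationConfiguration:
    """Hypothetical stand-in for ExpectationConfiguration."""

    expectation_type: str
    kwargs: dict = field(default_factory=dict)


def group_column_configurations_by_type(
    configurations: List[FakeExpectationConfiguration],
    include_column_names: Optional[List[str]] = None,
    exclude_column_names: Optional[List[str]] = None,
) -> Dict[str, List[FakeExpectationConfiguration]]:
    """Keep column-based configurations and group them by expectation name."""
    grouped: Dict[str, List[FakeExpectationConfiguration]] = defaultdict(list)
    for configuration in configurations:
        column = configuration.kwargs.get("column")
        if column is None:
            continue  # assumed: configurations without a column are not column-based
        if include_column_names is not None and column not in include_column_names:
            continue
        if exclude_column_names is not None and column in exclude_column_names:
            continue
        grouped[configuration.expectation_type].append(configuration)
    return dict(grouped)


configs = [
    FakeExpectationConfiguration("expect_table_row_count_to_be_between"),
    FakeExpectationConfiguration("expect_column_min_to_be_between", {"column": "a"}),
    FakeExpectationConfiguration("expect_column_min_to_be_between", {"column": "b"}),
    FakeExpectationConfiguration("expect_column_max_to_be_between", {"column": "a"}),
]
by_type = group_column_configurations_by_type(configs, exclude_column_names=["b"])
```

Here `by_type` has one key per expectation name, and each value lists only the configurations whose column passed the filters.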
for ( | ||
column_based_expectation_configurations | ||
) in column_based_expectation_configurations_by_type.values(): | ||
display_charts_for_expectation: List[ | ||
alt.VConcatChart | ||
] = self._create_display_chart_for_column_domain_expectation( | ||
expectation_configurations=column_based_expectation_configurations, | ||
attributed_metrics=attributed_metrics_by_column_domain, | ||
plot_mode=plot_mode, | ||
sequential=sequential, | ||
) | ||
display_charts.extend(display_charts_for_expectation) | ||
|
||
for expectation_configuration in column_based_expectation_configurations: | ||
return_chart: alt.Chart = ( | ||
self._create_return_chart_for_column_domain_expectation( | ||
expectation_configuration=expectation_configuration, | ||
attributed_metrics=attributed_metrics_by_column_domain, | ||
plot_mode=plot_mode, | ||
sequential=sequential, | ||
) | ||
) | ||
return_charts.append(return_chart) |
For each column-based expectation, create a layer chart (there is only one for the VolumeDataAssistant, but there will be more for the OnboardingDataAssistant). I've also moved the _create_return_chart calls into the same loop to save on work.
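The single-loop shape described here can be sketched as follows. The function and the two factory callables are hypothetical placeholders for the chart-building methods, and plain strings stand in for Altair charts.

```python
from typing import Callable, Dict, List, Tuple


def build_charts(
    configurations_by_type: Dict[str, List[str]],
    make_display_charts: Callable[[List[str]], List[str]],
    make_return_chart: Callable[[str], str],
) -> Tuple[List[str], List[str]]:
    """Build display and return charts in one pass over the grouped configurations."""
    display_charts: List[str] = []
    return_charts: List[str] = []
    for configurations in configurations_by_type.values():
        # One batch of display charts per expectation type.
        display_charts.extend(make_display_charts(configurations))
        # One return chart per individual configuration, in the same iteration.
        for configuration in configurations:
            return_charts.append(make_return_chart(configuration))
    return display_charts, return_charts


display, returned = build_charts(
    {"min_between": ["col_a", "col_b"], "max_between": ["col_a"]},
    make_display_charts=lambda configs: [f"display:{len(configs)}"],
    make_return_chart=lambda config: f"return:{config}",
)
```

Iterating the grouped dictionary once avoids walking the same configurations a second time just to build the return charts.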
if plot_mode is PlotMode.PRESCRIPTIVE: | ||
if metric_name in implemented_metrics: | ||
if metric_name in implemented_metrics: | ||
if plot_mode is PlotMode.PRESCRIPTIVE: | ||
plot_impl = self.get_expect_domain_values_to_be_between_chart | ||
elif plot_mode is PlotMode.DESCRIPTIVE: | ||
if metric_name in implemented_metrics: | ||
elif plot_mode is PlotMode.DESCRIPTIVE: |
Flip around the conditionals to save on some code (no need to check if metric_name in implemented_metrics both times).
|
||
if metric_name == "column_distinct_values_count": | ||
if plot_mode is PlotMode.PRESCRIPTIVE: |
I believe this is functionally equivalent but please call me out if not!
# OLD
if plot_mode is PlotMode.PRESCRIPTIVE:
    if metric_name == "column_distinct_values_count":
        plot_impl = (
            self.get_interactive_detail_expect_column_values_to_be_between_chart
        )
else:
    if metric_name == "column_distinct_values_count":
        plot_impl = self.get_interactive_detail_multi_chart
# NEW
if metric_name == "column_distinct_values_count":
    if plot_mode is PlotMode.PRESCRIPTIVE:
        plot_impl = (
            self.get_interactive_detail_expect_column_values_to_be_between_chart
        )
    else:
        plot_impl = self.get_interactive_detail_multi_chart
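One way to check the equivalence claim is to enumerate every combination of inputs. The sketch below uses a local stand-in `PlotMode` enum, string labels in place of the bound methods, and assumes `plot_impl` starts out as `None` when no branch matches; none of that is taken from the actual class.

```python
from enum import Enum
from itertools import product


class PlotMode(Enum):
    """Local stand-in for the library's PlotMode enum."""

    DESCRIPTIVE = "descriptive"
    PRESCRIPTIVE = "prescriptive"


def old_dispatch(plot_mode, metric_name):
    plot_impl = None  # assumption: unset when no branch matches
    if plot_mode is PlotMode.PRESCRIPTIVE:
        if metric_name == "column_distinct_values_count":
            plot_impl = "expect_column_values_to_be_between_chart"
    else:
        if metric_name == "column_distinct_values_count":
            plot_impl = "multi_chart"
    return plot_impl


def new_dispatch(plot_mode, metric_name):
    plot_impl = None  # same assumption as above
    if metric_name == "column_distinct_values_count":
        if plot_mode is PlotMode.PRESCRIPTIVE:
            plot_impl = "expect_column_values_to_be_between_chart"
        else:
            plot_impl = "multi_chart"
    return plot_impl


# "some_other_metric" is a hypothetical name covering the non-matching case.
metric_names = ["column_distinct_values_count", "some_other_metric"]
mismatches = [
    (mode, name)
    for mode, name in product(PlotMode, metric_names)
    if old_dispatch(mode, name) != new_dispatch(mode, name)
]
```

An empty `mismatches` list means the two forms agree on all four input combinations, under the assumptions stated above.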
Same note with the similar refactor above!
This looks good, but as we continue to add more metrics/expectations we will possibly need to rethink whether the metric names can be pulled from EXPECTATION_METRIC_MAP or whether this gets pushed down into child classes.
LGTM
…_expectations into maintenance/refactor-data-assistant-result
love it. Thank you @cdkini
LGTM!
@@ -564,7 +567,7 @@ def get_interactive_detail_multi_chart( | |||
batch_name: str = "batch" | |||
batch_identifiers: List[str] = [ | |||
column | |||
for column in column_dfs[0][1].columns | |||
for column in column_dfs[0].df.columns |
❤️
|
||
domain = domains_by_column_name[domain_kwargs["column"]] | ||
domain = domains_by_column_name[column_name] |
🤦🏻
Changes proposed in this pull request:
OnboardingDataAssistant plotting
Definition of Done