Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Add YAML config option to disable progress bars #3794

Merged
merged 43 commits into from Jan 4, 2022
Merged
Show file tree
Hide file tree
Changes from 27 commits
Commits
Show all changes
43 commits
Select commit Hold shift + click to select a range
cacf5cf
feat: Add YAML config option to disable progress bars
cdkini Dec 3, 2021
0f8f952
push before adding test descriptions
Dec 8, 2021
0635ed6
updated tests
Dec 8, 2021
6e1aed0
Update test_data_context.py
Dec 8, 2021
a303d8a
Update test_data_context.py
Dec 8, 2021
70a1bbb
Merge branch 'develop' into feature/ANT-29/datacontext-is-singleton
donaldheppner Dec 8, 2021
2356561
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Dec 8, 2021
df1fbc4
Merge branch 'feature/ANT-29/datacontext-is-singleton' of github.com:…
cdkini Dec 8, 2021
cdb5544
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Dec 9, 2021
c064add
feat: Add granular fields
cdkini Dec 9, 2021
c98f6d8
chore: Remove util method
cdkini Dec 9, 2021
632d7b6
chore: Revert Validator
cdkini Dec 9, 2021
af110ad
feat: Add conditional logic to validator
cdkini Dec 9, 2021
4e909c9
feat: Add logic to get progress bar config in UCP
cdkini Dec 9, 2021
e5bd91a
feat: Add util func
cdkini Dec 9, 2021
2bd69ce
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Dec 14, 2021
015a483
feat: use DataContext progress_bars attr
cdkini Dec 14, 2021
9b7e19b
chore: remove unnecessary imports
cdkini Dec 14, 2021
a4a964d
feat: move functionality to ProgressBarsConfig
cdkini Dec 14, 2021
127a7fb
fix: ensure proper name of key in Validator
cdkini Dec 14, 2021
b1bfc90
refactor: make methods public to work with method
cdkini Dec 14, 2021
1ed9d26
feat: use is_enabled instead of disabled
cdkini Dec 14, 2021
e584418
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Dec 16, 2021
80903e7
fix: fix typo of new class
cdkini Dec 16, 2021
8ef6b94
fix: remove method on new config class
cdkini Dec 16, 2021
470a5f4
test: add new field to tests
cdkini Dec 16, 2021
ef7a1a5
feat: update conditional per Tal's comments
cdkini Dec 16, 2021
ad55c7d
chore: make misc changes after discussion with Tal
cdkini Dec 16, 2021
6cae763
feat: add post_dump hook
cdkini Dec 16, 2021
7c4961c
chore: type hint
cdkini Dec 16, 2021
fa7fccd
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Dec 16, 2021
ab27979
docs: add comments to post_dump hook
cdkini Dec 16, 2021
58a6b6c
Merge branch 'develop' into feature/devrel-766/option-to-disable-prog…
cdkini Dec 17, 2021
a9006c4
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Dec 28, 2021
9698c4f
Merge branch 'feature/devrel-766/option-to-disable-progress-bars' of …
cdkini Dec 28, 2021
fbedc25
Merge branch 'develop' into feature/devrel-766/option-to-disable-prog…
cdkini Dec 29, 2021
d041120
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Jan 4, 2022
ea20dcb
test: write tests for user configurable profiler
cdkini Jan 4, 2022
e2b82cd
test: write tests for validator
cdkini Jan 4, 2022
902ca99
docs: add comment to test
cdkini Jan 4, 2022
38b6abb
test: misc cleanup in prep for review
cdkini Jan 4, 2022
d668ac0
chore: remove progress bars from test output
cdkini Jan 4, 2022
a4352f3
test: fix breaking test
cdkini Jan 4, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
5 changes: 5 additions & 0 deletions great_expectations/data_context/data_context.py
Expand Up @@ -78,6 +78,7 @@
DataContextConfigDefaults,
DatasourceConfig,
GeCloudConfig,
ProgressBarsConfig,
anonymizedUsageStatisticsSchema,
dataContextConfigSchema,
datasourceConfigSchema,
Expand Down Expand Up @@ -818,6 +819,10 @@ def anonymous_usage_statistics(self):
def concurrency(self) -> ConcurrencyConfig:
return self.project_config_with_variables_substituted.concurrency

@property
def progress_bars(self) -> ProgressBarsConfig:
return self.project_config_with_variables_substituted.progress_bars

@property
def notebooks(self):
return self.project_config_with_variables_substituted.notebooks
Expand Down
25 changes: 25 additions & 0 deletions great_expectations/data_context/types/base.py
Expand Up @@ -1082,6 +1082,24 @@ def make_notebooks_config(self, data, **kwargs):
return NotebooksConfig(**data)


class ProgressBarsConfig(DictDot):
def __init__(
self,
globally: bool = True,
profilers: bool = True,
metric_calculations: bool = True,
):
self.globally = globally
self.profilers = profilers
self.metric_calculations = metric_calculations


class ProgressBarsConfigSchema(Schema):
globally = fields.Boolean(default=True)
profilers = fields.Boolean(default=True)
metric_calculations = fields.Boolean(default=True)
Comment on lines +1085 to +1100
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this automatically create this field when a new Great Expectations project is initialized? I want to make sure that we make this option available through the Data Context, but also that we don't add a mandatory field to the Data Context, so we don't run into an issue where every new Data Context created won't work with older version of Great Expectations, unless the user specifically wants this feature.

Now that I think about it, I think that the same issue might have happened a couple of times with concurrency, and I wonder if we should treat that in the same way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when creating a new DataContext through v3 init, I don't see this field nor concurrency. Let's discuss this live to make sure I'm capturing everything



class ConcurrencyConfig(DictDot):
"""WARNING: This class is experimental."""

Expand Down Expand Up @@ -1173,6 +1191,7 @@ class DataContextConfigSchema(Schema):
)
config_variables_file_path = fields.Str(allow_none=True)
anonymous_usage_statistics = fields.Nested(AnonymizedUsageStatisticsConfigSchema)
progress_bars = fields.Nested(ProgressBarsConfigSchema)
concurrency = fields.Nested(ConcurrencyConfigSchema)

# noinspection PyMethodMayBeStatic
Expand Down Expand Up @@ -1783,6 +1802,7 @@ def __init__(
store_backend_defaults: Optional[BaseStoreBackendDefaults] = None,
commented_map: Optional[CommentedMap] = None,
concurrency: Optional[Union[ConcurrencyConfig, Dict]] = None,
progress_bars: Optional[ProgressBarsConfig] = None,
):
# Set defaults
if config_version is None:
Expand Down Expand Up @@ -1835,6 +1855,10 @@ def __init__(
concurrency = ConcurrencyConfig(**concurrency)
self.concurrency: ConcurrencyConfig = concurrency

if progress_bars is None:
progress_bars = ProgressBarsConfig()
self.progress_bars = progress_bars

super().__init__(commented_map=commented_map)

# TODO: <Alex>ALEX (we still need the next two properties)</Alex>
Expand Down Expand Up @@ -2435,3 +2459,4 @@ class CheckpointValidationConfigSchema(Schema):
notebookConfigSchema = NotebookConfigSchema()
checkpointConfigSchema = CheckpointConfigSchema()
concurrencyConfigSchema = ConcurrencyConfigSchema()
progressBarsConfigSchema = ProgressBarsConfigSchema()
24 changes: 22 additions & 2 deletions great_expectations/profile/user_configurable_profiler.py
Expand Up @@ -109,7 +109,9 @@ def __init__(
self.profile_dataset = profile_dataset
assert isinstance(self.profile_dataset, (Batch, Dataset, Validator))

context: Optional["DataContext"] = None
if isinstance(self.profile_dataset, Batch):
context = self.profile_dataset.data_context
self.profile_dataset = Validator(
execution_engine=self.profile_dataset.data.execution_engine,
batches=[self.profile_dataset],
Expand All @@ -118,12 +120,22 @@ def __init__(
MetricConfiguration("table.columns", {})
)
elif isinstance(self.profile_dataset, Validator):
context = self.profile_dataset.data_context
self.all_table_columns = self.profile_dataset.get_metric(
MetricConfiguration("table.columns", {})
)
else:
self.all_table_columns = self.profile_dataset.get_table_columns()

# Check to see if the user has disabled progress bars
self._enable_progress_bars = True
if context:
progress_bars = context.progress_bars
if "globally" in progress_bars:
self._enable_progress_bars = progress_bars["globally"]
if "profilers" in progress_bars:
self._enable_progress_bars = progress_bars["profilers"]

self.semantic_types_dict = semantic_types_dict
assert isinstance(self.semantic_types_dict, (dict, type(None)))

Expand Down Expand Up @@ -294,7 +306,10 @@ def _build_expectation_suite_from_semantic_types_dict(self):
)

with tqdm(
desc="Profiling Columns", total=len(self.column_info), delay=5
desc="Profiling Columns",
total=len(self.column_info),
delay=5,
disable=not self._enable_progress_bars,
) as pbar:
for column_name, column_info in self.column_info.items():
pbar.set_postfix_str(f"Column={column_name}")
Expand Down Expand Up @@ -339,7 +354,12 @@ def _profile_and_build_expectation_suite(self):

self._build_expectations_table(profile_dataset=self.profile_dataset)

with tqdm(desc="Profiling", total=len(self.column_info), delay=5) as pbar:
with tqdm(
desc="Profiling",
total=len(self.column_info),
delay=5,
disable=not self._enable_progress_bars,
) as pbar:
for column_name, column_info in self.column_info.items():
pbar.set_postfix_str(f"Column={column_name}")
data_type = column_info.get("type")
Expand Down
15 changes: 14 additions & 1 deletion great_expectations/validator/validator.py
Expand Up @@ -212,6 +212,10 @@ def __dir__(self):

return list(combined_dir)

@property
def data_context(self) -> Optional["DataContext"]:
return self._data_context

@property
def expose_dataframe_methods(self) -> bool:
return self._expose_dataframe_methods
Expand Down Expand Up @@ -689,12 +693,21 @@ def resolve_validation_graph(
validation_graph=graph, metrics=metrics
)

# Check to see if the user has disabled progress bars
disable = len(graph.edges) < 3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did you arrive at this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the previous condition as set by Alex! I didn't want to remove old behavior so in the case we've enabled the progress bars, that original conditional still applies

if self._data_context:
progress_bars = self._data_context.progress_bars
if "globally" in progress_bars:
disable = not progress_bars["globally"]
if "metric_calculations" in progress_bars:
disable = not progress_bars["metric_calculations"]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if self._data_context:
progress_bars = self._data_context.progress_bars
if "globally" in progress_bars:
disable = not progress_bars["globally"]
if "metric_calculations" in progress_bars:
disable = not progress_bars["metric_calculations"]
if self._data_context:
progress_bars = self._data_context.progress_bars
if "globally" in progress_bars:
disable = not progress_bars["globally"]
if "metric_calculations" in progress_bars:
disable = not progress_bars["metric_calculations"]
Suggested change
if self._data_context:
progress_bars = self._data_context.progress_bars
if "globally" in progress_bars:
disable = not progress_bars["globally"]
if "metric_calculations" in progress_bars:
disable = not progress_bars["metric_calculations"]
if self._data_context:
progress_bars = self._data_context.progress_bars
if "globally" in progress_bars:
disable = not progress_bars["globally"]
if "metric_calculations" in progress_bars:
disable = not progress_bars["metric_calculations"]
if len(graph.edges) < 3:
disable = True


if pbar is None:
# noinspection PyProtectedMember,SpellCheckingInspection
pbar = tqdm(
total=len(ready_metrics) + len(needed_metrics),
desc="Calculating Metrics",
disable=len(graph.edges) < 3,
disable=disable,
)
pbar.update(0)

Expand Down
10 changes: 10 additions & 0 deletions tests/data_context/test_data_context_config_ui.py
Expand Up @@ -75,6 +75,11 @@ def _construct_data_context_config(
"enabled": True,
},
"concurrency": {"enabled": False},
"progress_bars": {
"globally": True,
"profilers": True,
"metric_calculations": True,
},
}

return _construct_data_context_config
Expand Down Expand Up @@ -1342,6 +1347,11 @@ def test_DataContextConfig_with_InMemoryStoreBackendDefaults(
},
"validations_store_name": "validations_store",
"concurrency": {"enabled": False},
"progress_bars": {
"globally": True,
"profilers": True,
"metric_calculations": True,
},
}

data_context_config_schema = DataContextConfigSchema()
Expand Down
1 change: 1 addition & 0 deletions tests/data_context/test_data_context_v013.py
Expand Up @@ -191,6 +191,7 @@ def test_get_config(empty_data_context):
"anonymous_usage_statistics",
"notebooks",
"concurrency",
"progress_bars",
}


Expand Down