-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEATURE] Implement the "column.standard_deviation" metric for sqlite database #5338
Conversation
…4/GREAT-999/alexsherstinsky/rule_based_profiler/data_assistant/data_assistant_result/move_getting_expectation_suite_to_expectation_suite_class-2022_06_15-170
…/alexsherstinsky/rule_based_profiler/remove_json_serialize_directive_and_add_raw_parameter_builder_computation_results_to_json_serialized_results-2022_06_15-170
…/alexsherstinsky/rule_based_profiler/remove_json_serialize_directive_and_add_raw_parameter_builder_computation_results_to_json_serialized_results-2022_06_15-170
…/alexsherstinsky/rule_based_profiler/remove_json_serialize_directive_and_add_raw_parameter_builder_computation_results_to_json_serialized_results-2022_06_15-170
…ror-changelog' into maintenance/GREAT-467/GREAT-464/GREAT-999/alexsherstinsky/rule_based_profiler/remove_json_serialize_directive_and_add_raw_parameter_builder_computation_results_to_json_serialized_results-2022_06_15-170
…ky/rule_based_profiler/remove_json_serialize_directive_and_add_raw_parameter_builder_computation_results_to_json_serialized_results-2022_06_15-170' into pre_pr-prototype/maintenance/GREAT-467/GREAT-464/GREAT-1000/alexsherstinsky/rule_based_profiler/data_assistant/onboarding_data_assistant/performance_improvements-2022_06_15-171
…exsherstinsky/rule_based_profiler/enable_numeric_metric_range_multibatch_parameter_builder_to_use_evaluation_dependencies-2022_06_16-171
…EAT-464/GREAT-1000/alexsherstinsky/rule_based_profiler/data_assistant/onboarding_data_assistant/performance_improvements-2022_06_15-171
…rule_based_profiler/enable_numeric_metric_range_multibatch_parameter_builder_to_use_evaluation_dependencies-2022_06_16-171' into pre_pr-prototype/maintenance/GREAT-467/GREAT-464/GREAT-1000/alexsherstinsky/rule_based_profiler/data_assistant/onboarding_data_assistant/performance_improvements-2022_06_15-171
…/maintenance/GREAT-467/GREAT-464/GREAT-1000/alexsherstinsky/rule_based_profiler/data_assistant/onboarding_data_assistant/performance_improvements-2022_06_15-171
…lder throughout the codebase.
…lder throughout the codebase.
…EAT-464/GREAT-1000/alexsherstinsky/rule_based_profiler/data_assistant/onboarding_data_assistant/performance_improvements-2022_06_15-171
…EAT-464/GREAT-1000/alexsherstinsky/rule_based_profiler/data_assistant/onboarding_data_assistant/performance_improvements-2022_06_15-171
👇 Click on the image for a new way to code review
Legend |
✅ Deploy Preview for niobium-lead-7998 ready!
To edit notification comments on pull requests, go to your Netlify site settings. |
@@ -25,7 +25,7 @@ def _pandas(cls, column, **kwargs): | |||
def _sqlalchemy(cls, column, **kwargs): | |||
"""SqlAlchemy Mean Implementation""" | |||
# column * 1.0 needed for correct calculation of avg in MSSQL | |||
return sa.func.avg(column * 1.0) | |||
return sa.func.avg(1.0 * column) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious: does this make a difference in the results of the calculation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Shinnnyshinshin No -- only sense of aesthetics on my part. Thanks!
…_standard_deviation_metric_for_sqlite_database-2022_06_17-176
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really appreciate you digging into this @alexsherstinsky Thank you very very much. LGTM
elif _dialect.name.lower() == "sqlite": | ||
mean = _metrics["column.mean"] | ||
nonnull_row_count = _metrics["column_values.null.unexpected_count"] | ||
standard_deviation = sa.func.sqrt( | ||
sa.func.sum((1.0 * column - mean) * (1.0 * column - mean)) | ||
/ ((1.0 * nonnull_row_count) - 1.0) | ||
) | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
@@ -1496,7 +1496,6 @@ def candidate_test_is_on_temporary_notimplemented_list_cfe(context, expectation_ | |||
"expect_column_values_to_be_dateutil_parseable", | |||
"expect_column_values_to_be_json_parseable", | |||
"expect_column_values_to_match_json_schema", | |||
"expect_column_stdev_to_be_between", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoohoo. Just explicitly stating that this enables the expect_column_stdev_to_be_between
Expectation tests for sql dialects (include sqlite
). So additional unittests are not needed for this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great and everything is passing for expect_column_stdev_to_be_between
in the build_gallery.py script! Thanks!
…ture/GREAT-953/migration-part2-refactor-usage-stats-opt * feature/GREAT-953/migration-part1-move-to-abc: Update base_data_context.py [MAINTENANCE] Update release schedule JSON (#5349) [DOCS] DOC-337 automate updates to the version information displayed in the getting started tutorial. (#5348) Maintenance/great 761/great 1010/great 1011/alexsherstinsky/rule based profiler/data assistant/include only essential public methods in data assistant dispatcher class 2022 06 21 177 (#5351) [FEATURE] Implement the "column.standard_deviation" metric for sqlite database (#5338) [BUGFIX] Fix for failing Expectation test in `cloud_db_integration` pipeline (#5321)
Please annotate your PR title to describe what the PR does, then give a brief bulleted description of your PR below. PR titles should begin with [BUGFIX], [FEATURE], [DOCS], or [MAINTENANCE]. If a new feature introduces breaking changes for the Great Expectations API or configuration files, please also add [BREAKING]. You can read about the tags in our contributor checklist.
Changes proposed in this pull request:
After submitting your PR, CI checks will run and @cla-bot will check for your CLA signature.
For a PR with nontrivial changes, we review with both design-centric and code-centric lenses.
In a design review, we aim to ensure that the PR is consistent with our relationship to the open source community, with our software architecture and abstractions, and with our users' needs and expectations. That review often starts well before a PR, for example in github issues or slack, so please link to relevant conversations in notes below to help reviewers understand and approve your PR more quickly (e.g.
closes #123
).Previous Design Review notes:
Definition of Done
Please delete options that are not relevant.
Thank you for submitting!