Enable grouping for minibatch preprocessing #254
Closed
kaknikhil wants to merge 3 commits into apache:master
Refer to this link for build results (access rights to CI server needed):
njayaram2 (Contributor) reviewed on Apr 3, 2018 and left a comment:
LGTM, but for a couple of very minor comments.
    set_zero_std_to_one  (optional, default is False. If set to true
                         0.0 standard deviation values will be set to 1.0)
    create_temp_table    If set to true, create a persistent instead of a temp
                         table, else create a temp table for x_mean
Shouldn't this comment say create temp table when true, and a persistent table when set to false?
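The semantics the reviewer suggests can be sketched as follows. This is only an illustration, assuming the flag simply toggles the `TEMP` keyword in the generated statement; `x_mean_table_ddl` and its `select_sql` argument are hypothetical names, not part of the PR.

```python
def x_mean_table_ddl(x_mean_table, select_sql, create_temp_table=True):
    """Build a CREATE TABLE statement for the x_mean table.

    Hypothetical helper for illustration only; not the PR's code.
    """
    # True  -> temp table (dropped at the end of the session)
    # False -> persistent table that other modules can reuse
    temp_keyword = "TEMP " if create_temp_table else ""
    return "CREATE {0}TABLE {1} AS {2}".format(
        temp_keyword, x_mean_table, select_sql)
```

Under this reading, passing `create_temp_table = False`, as the call quoted in the next review comment does, yields a persistent table.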
    grouping_cols,
    x_mean_table,
    set_zero_std_to_one = True,
    create_temp_table = False)
Could you please correct the indentation here?
This commit enables grouping for the minibatch preprocessor module.
Other changes
1. Added install check test for special chars.
2. Improved error messages and created a reusable function for testing column dimension in install check.
3. Add a new optional flag to utils_ind_var_scales_grouping so as to create a persistent x_mean table that will be reused as the standardization table by the preprocessor module.
Co-authored-by: Jingyi Mei <jmei@pivotal.io>
This commit adds a new unittest file for the validate_args python file. The only two functions tested right now are input_tbl_valid and output_tbl_valid.
Before this commit, all the unit tests that wanted to assert that plpy.error was called had to assert that an Exception was raised. This was too generic and did not distinguish between an exception coming from the plpy mock class vs any other exception. With this commit, we now raise a custom plpy exception so that we don't need to assert for the equality of the error messages. Asserting for the exception is proof enough that plpy.error was called.
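The idea in the commit message above can be sketched as follows. This is a minimal illustration under assumptions: `PLPYException`, `MockPlpy`, and the toy `input_tbl_valid` below are hypothetical stand-ins and are not taken from the PR's test code.

```python
import unittest


class PLPYException(Exception):
    """Hypothetical exception type raised only by the plpy mock."""


class MockPlpy(object):
    """Minimal stand-in for the plpy module; only error() is modeled."""

    def error(self, message):
        raise PLPYException(message)


plpy = MockPlpy()


def input_tbl_valid(tbl, module_name):
    """Toy validator (not MADlib's implementation): reject empty names."""
    if tbl is None or tbl.strip() == "":
        plpy.error("{0} error: Invalid input table name!".format(module_name))


class InputTblValidTest(unittest.TestCase):
    def test_empty_name_calls_plpy_error(self):
        # Only an exception raised by the mock satisfies this assertion,
        # so the error message itself does not need to be compared.
        with self.assertRaises(PLPYException):
            input_tbl_valid("", "unittest_module")

    def test_other_exceptions_are_not_mistaken_for_plpy_error(self):
        # A non-string argument fails with AttributeError, which a generic
        # assertRaises(Exception) would have accepted by mistake.
        with self.assertRaises(AttributeError):
            input_tbl_valid(42, "unittest_module")
```

Saved as a module, the tests can be run with `python -m unittest`. The second test shows why asserting on the generic `Exception` was too loose: an unrelated failure would have passed it.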
Branch updated: a4d8b69 to 76a3b9b
njayaram2 approved these changes on Apr 5, 2018
This PR enables grouping for the minibatch preprocessor module.
Other changes
1. Added install check test for special chars.
2. Improved error messages and created a reusable function for testing column dimension in install check.
3. Added a new optional flag to utils_ind_var_scales_grouping so as to create a persistent x_mean table that will be reused as the standardization table by the preprocessor module.
4. Added unit tests for input_tbl_valid and output_tbl_valid in validate_args.py_in
Co-authored-by: Jingyi Mei jmei@pivotal.io
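Grouped standardization of the kind this PR relies on can be sketched in plain Python. This is only an in-memory illustration, not the module's SQL-based implementation: `group_scales` and `standardize` are hypothetical names, and the use of the population standard deviation is an assumption.

```python
from collections import defaultdict
import math


def group_scales(rows, set_zero_std_to_one=False):
    """Compute per-group means and standard deviations.

    rows: iterable of (group_key, [x1, x2, ...]) pairs.
    Returns {group_key: (means, stds)}. Hypothetical helper.
    """
    grouped = defaultdict(list)
    for key, x in rows:
        grouped[key].append(x)

    scales = {}
    for key, xs in grouped.items():
        n = float(len(xs))
        dims = range(len(xs[0]))
        means = [sum(x[j] for x in xs) / n for j in dims]
        # Assumption: population standard deviation (divide by n).
        stds = [math.sqrt(sum((x[j] - means[j]) ** 2 for x in xs) / n)
                for j in dims]
        if set_zero_std_to_one:
            # Avoid division by zero for columns constant within a group.
            stds = [1.0 if s == 0.0 else s for s in stds]
        scales[key] = (means, stds)
    return scales


def standardize(rows, scales):
    """Standardize each row with the scales of its own group."""
    return [(key, [(v - m) / s for v, m, s in zip(x, *scales[key])])
            for key, x in rows]
```

A short usage example: with `rows = [("a", [1.0, 5.0]), ("a", [3.0, 5.0])]`, calling `standardize(rows, group_scales(rows, set_zero_std_to_one=True))` leaves the constant second column at 0.0 instead of failing on a zero standard deviation.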