[FLINK-5785] Add an Imputer for preparing data #3625

Closed
wants to merge 12 commits into from

@p4nna p4nna commented Mar 27, 2017

Provides an Imputer for sparse DataSets of Vectors.
It fills in missing values with the mean, median, or most frequent value of each vector or of each dimension.

Includes two test classes that test the functions implemented in the new Imputer class: one for row-wise imputing over all vectors and one for vector-wise imputing.
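
As a rough illustration of the two imputation modes described above, here is a minimal plain-Python sketch. It is not the code from this PR and does not use the Flink API: the function names, the strategy names, and the use of `None` to mark a missing entry are all assumptions made for the example.

```python
from collections import Counter
from statistics import mean, median


def _most_frequent(values):
    # Most common observed value; ties go to the value seen first.
    return Counter(values).most_common(1)[0][0]


# Hypothetical strategy names, chosen for this sketch only.
STRATEGIES = {"mean": mean, "median": median, "most_frequent": _most_frequent}


def _fill(values, strategy):
    """Replace None entries with a statistic of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing to estimate from, leave unchanged
    fill = STRATEGIES[strategy](observed)
    return [fill if v is None else v for v in values]


def impute_by_vector(vectors, strategy="mean"):
    """Fill each vector using only that vector's own observed values."""
    return [_fill(vec, strategy) for vec in vectors]


def impute_by_dimension(vectors, strategy="mean"):
    """Fill each dimension using that dimension's values across all vectors.

    Assumes all vectors have the same length.
    """
    columns = [_fill(list(col), strategy) for col in zip(*vectors)]
    return [list(row) for row in zip(*columns)]


data = [[1.0, None, 3.0], [None, 4.0, 3.0], [5.0, 8.0, None]]
per_vector = impute_by_vector(data, "mean")
per_dimension = impute_by_dimension(data, "mean")
```

With this input, the per-vector mode fills the gap in the first vector with the mean of 1.0 and 3.0, while the per-dimension mode fills it with the mean of the second entries of the other vectors, so the two modes can give different results for the same gap.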
adds missing values in sparse DataSets of Vectors
Contributor

zentol commented Mar 27, 2017

Regarding the license: every (non-binary) file in the Flink repository must have the Apache license header at the very top of the file. Take a look at an existing Scala class and you'll see what I mean.

Second: it is not required to open a new PR when making changes; you can add commits to the branch of the PR. (Note that force-pushes should only be done if necessary.)

Third: the file count in this PR is dramatically higher than in the last one (4 vs 84). Is this intended or a mistake?

@p4nna p4nna closed this Mar 30, 2017
@p4nna p4nna deleted the ml-Imputer-edits branch March 30, 2017 07:58
3 participants