[GSOC]DatasetMapper & Imputer #694
Merged
47 commits
87c05a5 concept work for imputer (keon)
2e4b1a8 Merge branch 'master' of github.com:keonkim/mlpack into imputer (keon)
631e59e do not to use NaN by default, let the user specify (keon)
391006e Merge branch 'master' of github.com:keonkim/mlpack into imputer (keon)
6a1fb81 add template to datasetinfo and add imputer class (keon)
b0c5224 clean datasetinfo class and rename files (keon)
de35241 implement basic imputation strategies (keon)
2d38604 modify imputer_main and clean logs (keon)
bb045b8 add parameter verification for imputer_main (keon)
1295f4b add custom strategy to impute_main (keon)
5a517c2 add datatype change in IncrementPolicy (keon)
94b7a5c update types used in datasetinfo (keon)
ebed68f initialize imputer with parameters (keon)
db78f39 remove datatype in dataset_info (keon)
7c60b97 Merge branch 'master' of github.com:keonkim/mlpack into imputer (keon)
da4e409 add test for imputer (keon)
d8618ec restructure, add listwise deletion & imputer tests (keon)
3b8ffd0 fix transpose problem (keon)
90a5cd2 Merge pull request #7 from mlpack/master (keon)
32c8a73 merge (keon)
e09d9bc updates and fixes on imputation methods (keon)
87d8d46 update data::load to accept different mappertypes (keon)
de0b2db update data::load to accept different policies (keon)
bc187ca add imputer doc (keon)
a340f69 debug median imputation and listwise deletion (keon)
21d94c0 remove duplicate code in load function (keon)
a92afaa delete load overload (keon)
bace8b2 modify MapToNumerical to work with MissingPolicy (keon)
896a018 MissingPolicy uses NaN instead of numbers (keon)
1a908c2 fix reference issue in DatasetMapper (keon)
2edbc40 Move MapToNumerical(MapTokens) to Policy class (keon)
d881cb7 make policy and imputation api more consistent (keon)
a881831 numerical values can be set as missing values (keon)
63268a3 add comments and use more proper names (keon)
2eb6754 modify custom impute interface and rename variables (keon)
6d43aa3 add input-only overloads to imputation methods (keon)
fedc5e0 update median imputation to exclude missing values (keon)
787fd82 optimize imputation methods with output overloads (keon)
a0b7d59 expressive comments in imputation_test (keon)
9a6dce7 shorten imputation tests (keon)
c3aeba1 optimize preprocess imputer executable (keon)
028c217 fix bugs in imputation test (keon)
03e19a4 add more comments and delete impute_test.csv (keon)
ef4536b Merge pull request #8 from mlpack/master (keon)
6e2c1ff Merge branch 'master' of github.com:keonkim/mlpack into imputer (keon)
5eb9abd fix PARAM statements in imputer (keon)
d043235 delete Impute() overloads that produce output matrix (keon)
@@ -0,0 +1,30 @@
/**
 * @file missing_policy.hpp
 * @author Keon Kim
 *
 */
#ifndef MLPACK_CORE_DATA_MAP_POLICIES_DATATYPE_HPP
#define MLPACK_CORE_DATA_MAP_POLICIES_DATATYPE_HPP

#include <mlpack/core.hpp>

namespace mlpack {
namespace data {

/**
 * The Datatype enum specifies the types of data mlpack algorithms can use.
 * The vast majority of mlpack algorithms can only use numeric data (i.e.
 * float/double/etc.), but some algorithms can use categorical data, specified
 * via this Datatype enum and the DatasetMapper class.
 */
enum Datatype : bool /* bool is all the precision we need for two types */
{
  numeric = 0,
  categorical = 1
};

Review comment: This is very pedantic but there is an extra line here. :)
Reply: Updated

} // namespace data
} // namespace mlpack

#endif
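For readers skimming the diff, here is a minimal sketch of how a two-valued enum like this might be consumed. The enum is copied from the diff so the snippet compiles without mlpack. `CountCategorical` is a hypothetical helper invented for illustration; it is not part of mlpack or of this pull request.

```cpp
#include <cstddef>
#include <vector>

// Copied from the diff above so this sketch builds without mlpack.
enum Datatype : bool
{
  numeric = 0,
  categorical = 1
};

// Fixing the underlying type to bool keeps each tag as small as a bool.
static_assert(sizeof(Datatype) == sizeof(bool),
    "Datatype should be as small as its underlying type");

// Hypothetical helper (an assumption, not mlpack API): given one Datatype tag
// per dimension, count how many dimensions are marked categorical.
inline std::size_t CountCategorical(const std::vector<Datatype>& types)
{
  std::size_t count = 0;
  for (const Datatype type : types)
  {
    if (type == categorical)
      ++count;
  }
  return count;
}
```

Calling `CountCategorical({ numeric, categorical, numeric })` would report one categorical dimension. An algorithm that only handles numeric data could use a check like this to reject mixed datasets early.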
I think we can rename map_type_t to mapped_type. mapped_type is the same name as the typedef in std::map, so it should be more familiar to C++ programmers. (My example did not name it well, sorry about that.)
no problem, it is now updated :)
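To illustrate the naming suggestion, here is a rough sketch of a policy that exposes `mapped_type` the same way `std::map` does. `ExamplePolicy` and `ExampleMapper` are hypothetical names made up for this snippet; they are not the classes from this pull request, and the real DatasetMapper interface may differ.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical policy (an assumption, not mlpack code). It names its value
// type mapped_type, mirroring the member typedef of std::map.
class ExamplePolicy
{
 public:
  using mapped_type = std::size_t;
};

// Hypothetical mapper that picks up mapped_type from its policy, so callers
// can write ExampleMapper<P>::mapped_type just like std::map<K, V>::mapped_type.
template<typename PolicyType>
class ExampleMapper
{
 public:
  using mapped_type = typename PolicyType::mapped_type;

  // Return the id already assigned to `token`, or assign the next unused id.
  mapped_type Map(const std::string& token)
  {
    const auto it = forward.find(token);
    if (it != forward.end())
      return it->second;

    const mapped_type id = static_cast<mapped_type>(forward.size());
    forward.emplace(token, id);
    return id;
  }

 private:
  std::map<std::string, mapped_type> forward;
};
```

The benefit is mostly familiarity: anyone who has read `std::map<K, V>::mapped_type` can guess what `ExampleMapper<ExamplePolicy>::mapped_type` means without looking it up.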