What value should the class label be in regression? #114

Closed
BramVanroy opened this issue Jan 31, 2018 · 5 comments

Comments

@BramVanroy

In the README it says:

<label> <index1>:<value1> <index2>:<value2> ...
.
.
.

Each line contains an instance and is ended by a '\n' character. For
classification, <label> is an integer indicating the class label
(multi-class is supported). For regression, <label> is the target
value which can be any real number.

However, we found that on Linux the label doesn't need to be a number: a string also works. For instance, using UNK (for "unknown") works on Linux, but not on Windows.

To ensure a similar experience across operating systems, which default value is encouraged? Documentation says 'any integer', so can I just use 0?
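
To illustrate the portability concern, here is a minimal sketch that scans data lines for labels that do not parse as numbers, such as UNK. The function name `find_non_numeric_labels` and the sample lines are made up for illustration; the check relies on Python's `float()`, which is only an approximation of whatever parsing the library itself performs.

```python
def find_non_numeric_labels(lines):
    """Return (line_number, label) pairs whose label does not parse as a number."""
    bad = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        label = fields[0]
        try:
            float(label)  # accepts integers and real numbers alike
        except ValueError:
            bad.append((number, label))
    return bad


# Hypothetical sample data: the second line uses a string label.
sample = [
    "0 1:4.5 2:24.0",
    "UNK 1:0.2 3:1.0",
    "-1.5 2:3.0",
]
print(find_non_numeric_labels(sample))  # [(2, 'UNK')]
```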

@cjlin1
Owner

cjlin1 commented Jan 31, 2018 via email

@BramVanroy
Author

But the target value is unknown, right? It's the one you are trying to predict.

@cjlin1
Owner

cjlin1 commented Jan 31, 2018 via email

@BramVanroy
Author

So is it okay to just use something like the following, where the label is 0?

0 1:4.458333333333333 2:24.0 3:0.20833333333333334 4:8.333333333333334 5:29.166666666666668 6:87.5 8:1.0
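
A line like this could be generated along the following lines. This is only a sketch: `format_instance` is a hypothetical helper, not part of the library, and it assumes that a numeric placeholder such as 0 is acceptable when the target is unknown, which is exactly the point in question here.

```python
def format_instance(features, label=0):
    """Format one instance as '<label> <index>:<value> ...'.

    `features` maps a 1-based feature index to its value. Indices are
    written in ascending order. `label` defaults to 0 as an assumed
    placeholder for an unknown target.
    """
    pairs = " ".join(f"{index}:{value}" for index, value in sorted(features.items()))
    return f"{label} {pairs}"


# A subset of the features from the line above, given out of order.
features = {1: 4.458333333333333, 2: 24.0, 8: 1.0, 6: 87.5}
print(format_instance(features))  # 0 1:4.458333333333333 2:24.0 6:87.5 8:1.0
```

Features missing from the mapping, like index 7 in the line above, are simply left out of the output.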

@cjlin1
Owner

cjlin1 commented Jan 31, 2018 via email
