The Methods For Determining the Type of Data Need Revisited. #888

Closed
kvb2univpitt opened this issue Aug 15, 2018 · 6 comments

@kvb2univpitt
Collaborator

The isMixed() method of BoxDataSet decides whether a dataset is mixed by counting its continuous and discrete variables. If both counts are non-zero, the dataset is considered mixed. This is not quite right, because the max-discrete-category value can be set so low that no variable is treated as discrete. In that case the dataset is still mixed, but isMixed() returns false and isContinuous() returns true. For the same reason, isContinuous() and isDiscrete() are not correct either.
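
To make the failure mode concrete, here is a minimal Java sketch. It is not Tetrad's BoxDataSet code: the class DataTypeSketch, its methods, and the rule that a column is read as discrete when its distinct-value count is at most maxDiscreteCategories are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of count-based type detection; not the real BoxDataSet.
final class DataTypeSketch {

    // Number of distinct values observed in one column.
    static long distinctCount(double[] column) {
        return Arrays.stream(column).distinct().count();
    }

    // Assumed reader rule: a column is discrete only if its distinct-value
    // count does not exceed the max-discrete-category threshold.
    static boolean readAsDiscrete(double[] column, int maxDiscreteCategories) {
        return distinctCount(column) <= maxDiscreteCategories;
    }

    // Count how many columns the assumed reader rule classifies as discrete.
    static long discreteColumns(List<double[]> columns, int maxDiscreteCategories) {
        return columns.stream()
                .filter(c -> readAsDiscrete(c, maxDiscreteCategories))
                .count();
    }

    // Mixed means at least one discrete and at least one continuous column.
    static boolean isMixed(List<double[]> columns, int maxDiscreteCategories) {
        long discrete = discreteColumns(columns, maxDiscreteCategories);
        long continuous = columns.size() - discrete;
        return discrete > 0 && continuous > 0;
    }

    // Continuous means no column was classified as discrete.
    static boolean isContinuous(List<double[]> columns, int maxDiscreteCategories) {
        return !columns.isEmpty() && discreteColumns(columns, maxDiscreteCategories) == 0;
    }
}
```

Given one binary column and one measured column, a threshold of 3 reports the data as mixed, while a threshold of 1 reports the same data as continuous. The answer depends on the reader setting, not only on the data.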

@kvb2univpitt
Collaborator Author

@jdramsey Joe, what are your thoughts?

@yuanzhou
Contributor

Just want to mention that in algo chooser we are also using isContinuous(), isDiscrete(), or isMixed() to decide the applicable algorithms based on the data type of datasets.
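
A rough sketch of why this matters for algorithm selection follows. Everything in it is hypothetical: the class AlgoChooserSketch, the DataType enum, and the three placeholder algorithm names are not taken from the algo chooser code. It only illustrates filtering algorithms by the data type a dataset reports.

```java
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch of data-type-based filtering; not the real algo chooser.
final class AlgoChooserSketch {

    enum DataType { CONTINUOUS, DISCRETE, MIXED }

    // Placeholder registry: algorithm name -> data types it accepts.
    static final Map<String, EnumSet<DataType>> SUPPORTED = new LinkedHashMap<>();
    static {
        SUPPORTED.put("continuous-only-algo", EnumSet.of(DataType.CONTINUOUS));
        SUPPORTED.put("discrete-only-algo", EnumSet.of(DataType.DISCRETE));
        SUPPORTED.put("mixed-capable-algo", EnumSet.allOf(DataType.class));
    }

    // Return the algorithms offered for the type the dataset reports,
    // in registry order.
    static List<String> applicable(DataType reported) {
        return SUPPORTED.entrySet().stream()
                .filter(e -> e.getValue().contains(reported))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

In this sketch, a dataset that reports CONTINUOUS is offered the continuous-only placeholder, so any misreport from the type checks carries straight through to the list of choices.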

@jdramsey
Collaborator

jdramsey commented Apr 9, 2019

@kvb2univpitt I think if you set the max discrete value too low and it reads it in as continuous, it's your fault. :) The dataset you read in is in fact continuous.

Or are you suggesting that if a variable has only two values, it should be interpreted as discrete even if the values are all real values? That's a possibility.

Sorry guys for not checking all of these issues earlier!

I think this issue can be closed. :)

@cg09

cg09 commented Apr 9, 2019 via email

@jdramsey
Collaborator

jdramsey commented Apr 9, 2019 via email

@jdramsey
Collaborator

This all works now and has for a long time. Closing.
