Is DatasetInfo a necessary parameter for Hoeffding Tree? #2962

Closed
xlindo opened this issue May 31, 2021 · 5 comments

xlindo commented May 31, 2021

In hoeffding_tree.hpp, there is a default constructor that takes no parameters, and there is also a Train method that does not need a DatasetInfo.
But when I use these two to avoid passing a DatasetInfo, as follows,

std::shared_ptr<mlpack::tree::HoeffdingTree<>> mlpack_ht(
    new mlpack::tree::HoeffdingTree<>());
// ... Loading data
try
{
    mlpack_ht->Train(mlpack_texture_train_X, tmp_labels);
}
catch (std::exception& e)
{
    std::cerr << e.what() << std::endl << std::endl;
}

an exception is thrown.

requested type of dimension 0, but dataset only has 0 dimensions

So is DatasetInfo a necessary parameter for Hoeffding Tree? Or is it possible to train a Hoeffding tree without passing in DatasetInfo explicitly?

rcurtin (Member) commented Jun 1, 2021

Hey @xlindo, thanks for reporting this! I played with the example and found that this is indeed a bug in the handling of new datasets. I opened a PR, #2964, which should fix the issue. If you'd like to try your code on that branch and see if it works, that would be great!

In any case, to answer the original question, it should be possible to train a Hoeffding tree without a data::DatasetInfo. But, due to a bug in the code, that does not currently work. It will, though, once #2964 is merged. 😄
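
As an illustration only, here is one way an error of this shape could arise. This is a guess at the failure mode, not the mlpack code and not the change in #2964: `ToyDatasetInfo` and `ToyTree` are hypothetical stand-ins. A default-constructed tree holds a type map with zero dimensions, so looking up dimension 0 fails unless training first resizes the map to match the data.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a per-dimension type map (NOT data::DatasetInfo).
enum class DimType { Numeric, Categorical };

class ToyDatasetInfo
{
 public:
  // Every dimension is treated as numeric in this sketch.
  explicit ToyDatasetInfo(const std::size_t dimensions = 0) :
      types(dimensions, DimType::Numeric) { }

  // Throws if the requested dimension is outside the known range.
  DimType Type(const std::size_t dimension) const
  {
    if (dimension >= types.size())
    {
      std::ostringstream oss;
      oss << "requested type of dimension " << dimension
          << ", but dataset only has " << types.size() << " dimensions";
      throw std::invalid_argument(oss.str());
    }
    return types[dimension];
  }

  std::size_t Dimensionality() const { return types.size(); }

 private:
  std::vector<DimType> types;
};

// Hypothetical stand-in for a tree built with a default constructor.
class ToyTree
{
 public:
  ToyTree() : info(0) { }

  // Unguarded path: assumes the type map already matches the data.
  void TrainWithoutReset(const std::size_t dataDimensions)
  {
    for (std::size_t d = 0; d < dataDimensions; ++d)
      (void) info.Type(d);
  }

  // Guarded path: rebuild the type map when its size does not match the data.
  void Train(const std::size_t dataDimensions)
  {
    if (info.Dimensionality() != dataDimensions)
      info = ToyDatasetInfo(dataDimensions);
    for (std::size_t d = 0; d < dataDimensions; ++d)
      (void) info.Type(d);
  }

  std::size_t Dimensionality() const { return info.Dimensionality(); }

 private:
  ToyDatasetInfo info;
};
```

In this sketch, calling `TrainWithoutReset(3)` on a fresh `ToyTree` throws with the same wording as the message reported above, while `Train(3)` succeeds because the map is rebuilt first.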

xlindo (Author) commented Jun 2, 2021

Thanks a lot!
I will try it now.

xlindo (Author) commented Jun 2, 2021

For consistency with other classification algorithms, I think Train(...) could take an optional parameter num_classes=0.

rcurtin (Member) commented Jun 2, 2021

Nice point! Let me add that to the changes in #2964. 👍
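
A minimal sketch of how an optional class count defaulting to 0 might behave, assuming that 0 means "infer from the labels" and that labels are zero-based integers. `ResolveNumClasses` is a hypothetical helper written for this illustration; it is not an mlpack function and may not match what #2964 does.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: return the caller's class count if it is nonzero,
// otherwise infer it as (largest label + 1). Empty labels give 0.
std::size_t ResolveNumClasses(const std::vector<std::size_t>& labels,
                              const std::size_t numClasses = 0)
{
  if (numClasses != 0)
    return numClasses;
  if (labels.empty())
    return 0;
  return *std::max_element(labels.begin(), labels.end()) + 1;
}
```

An explicit count remains useful when the training labels do not contain every class, since inference from the data would then undercount.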

xlindo (Author) commented Jun 4, 2021

Many thanks.
This issue could be closed now.

@rcurtin rcurtin closed this as completed Jun 7, 2021