cross validation in random forest doesn't work #6278

Closed
mixmixmix opened this issue Mar 16, 2016 · 4 comments

@mixmixmix

Greetings.

Random forest training segfaults whenever the number of cross-validation folds is set to anything other than 0, i.e. anything but rtrees->setCVFolds(0). This has also been mentioned on answers.opencv.org.

Example code

auto rtrees = cv::ml::RTrees::create(); 
rtrees->setMaxDepth(30);
rtrees->setMinSampleCount(12);
rtrees->setRegressionAccuracy(0);
rtrees->setUseSurrogates(false);
rtrees->setMaxCategories(16);
rtrees->setPriors(cv::Mat());
rtrees->setCalculateVarImportance(false);
rtrees->setActiveVarCount(0);
rtrees->setTermCriteria({cv::TermCriteria::MAX_ITER, 100, 0});
// rtrees->setCVFolds(10); // uncommenting this line causes rtrees->train() to crash

rtrees->train(trainingData32F.colRange(1, trainingData32F.cols),
              cv::ml::ROW_SAMPLE, labels);

Please state the information for your system

  • OpenCV version: 3.1
  • Host OS: Linux (Ubuntu 15.10) and Mac OS X 10.11
  • compiled with clang++

In which part of the OpenCV library did you get the issue?

ml module

Expected behaviour

The program should complete training

Actual behaviour

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff1d5c438 in cv::ml::DTreesImpl::calcValue(int, std::vector<int, std::allocator<int> > const&) () from /opt/opencv-3.0.0/lib/libopencv_ml.so.3.1
(gdb) bt
#0  0x00007ffff1d5c438 in cv::ml::DTreesImpl::calcValue(int, std::vector<int, std::allocator<int> > const&) () from /opt/opencv-3.0.0/lib/libopencv_ml.so.3.1
#1  0x00007ffff1d5b533 in cv::ml::DTreesImpl::addNodeAndTrySplit(int, std::vector<int, std::allocator<int> > const&) () from /opt/opencv-3.0.0/lib/libopencv_ml.so.3.1
#2  0x00007ffff1d5ba2b in cv::ml::DTreesImpl::addTree(std::vector<int, std::allocator<int> > const&) () from /opt/opencv-3.0.0/lib/libopencv_ml.so.3.1
#3  0x00007ffff1d2b7e2 in cv::ml::DTreesImplForRTrees::train(cv::Ptr<cv::ml::TrainData> const&, int) [clone .constprop.131] () from /opt/opencv-3.0.0/lib/libopencv_ml.so.3.1
#4  0x00007ffff1d4a7d5 in cv::ml::StatModel::train(cv::_InputArray const&, int, cv::_InputArray const&) () from /opt/opencv-3.0.0/lib/libopencv_ml.so.3.1
Python Exception <class 'gdb.error'> There is no member named _M_dataplus.: 
Python Exception <class 'gdb.error'> There is no member named _M_dataplus.: 
#5  0x000000000040d217 in TrainRandomForest (trainingData=..., dir=, setName=, depth=30, samplecount=12) at /home/miko/rachael/source/train.cpp:40

mshabunin added the bug label on Mar 16, 2016

@ohnozzy
Contributor

ohnozzy commented Apr 10, 2016

In fact, the documentation of RTrees (http://docs.opencv.org/3.0-beta/modules/ml/doc/random_trees.html) states the following:

In random trees there is no need for any accuracy estimation procedures, such as cross-validation or bootstrap, or a separate test set to get an estimate of the training error.

So RTrees is not designed to work with cross-validation. In that case, shall we just add an assertion that enforces CVFolds to be 0?
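
To illustrate why the documentation says no separate cross-validation is needed: each tree in a random forest is trained on a bootstrap sample (drawn with replacement), so some training samples are left out of any given tree and can serve as its test set. The following is only a minimal standard-library sketch of that idea, not OpenCV code; the function name `outOfBagIndices` is made up for illustration.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper, not part of OpenCV.
// Draws one bootstrap sample of size n (sampling with replacement) and
// returns, in ascending order, the indices that were never drawn:
// the "out-of-bag" samples for that tree.
std::vector<std::size_t> outOfBagIndices(std::size_t n, std::mt19937& rng)
{
    if (n == 0)
        return {};

    std::vector<bool> inBag(n, false);
    std::uniform_int_distribution<std::size_t> pick(0, n - 1);
    for (std::size_t i = 0; i < n; ++i)
        inBag[pick(rng)] = true;   // sample i-th draw, mark it as used

    std::vector<std::size_t> oob;
    for (std::size_t i = 0; i < n; ++i)
        if (!inBag[i])
            oob.push_back(i);      // never drawn, so usable as test data
    return oob;
}
```

For large n, roughly a third of the samples end up out of bag for each tree (the expected fraction approaches 1/e), which is what makes an internal error estimate possible without extra folds.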

@mixmixmix
Author

That's right, @ohnozzy. It is even in the original paper, my bad.
It'd be best to have a message explaining that, because an assertion alone wouldn't give all the information.
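
A rough sketch of what such a descriptive check could look like, compared with a bare assertion. This is not the OpenCV implementation: `ForestParamsSketch` is a hypothetical stand-in class, and it uses a standard exception only to keep the example self-contained.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the forest's parameter object; not OpenCV code.
class ForestParamsSketch
{
public:
    // Rejects any non-zero fold count with a message that explains why,
    // instead of crashing later during training.
    void setCVFolds(int folds)
    {
        if (folds != 0)
            throw std::invalid_argument(
                "CVFolds=" + std::to_string(folds) +
                " is not supported for random trees: the error is estimated "
                "internally from out-of-bag samples, so cross-validation is "
                "not used. Leave CVFolds at 0.");
        cvFolds_ = folds;
    }

    int getCVFolds() const { return cvFolds_; }

private:
    int cvFolds_ = 0;  // the only accepted value in this sketch
};
```

With a check like this, calling `setCVFolds(10)` fails immediately at the setter with an explanation, rather than segfaulting inside `train()`.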

@simonhaenisch
Contributor

If it shouldn't be changeable by design, can't this line just be removed:
https://github.com/opencv/opencv/blob/master/modules/ml/src/rtrees.cpp#L371
No need to add the setter/getter, right?

CVFolds is still set to zero in the DTreesImpl here:
https://github.com/opencv/opencv/blob/master/modules/ml/src/rtrees.cpp#L78

@Yumin-Sun-00

But when I apply setCVFolds for DTrees, the method is also not implemented, why?
