Commit
[MRG+3] Add mean absolute error splitting criterion to DecisionTreeRegressor (scikit-learn#6667)

* feature: add initial node_value method
* testing code for node_impurity and node_value
  This code runs into 'Bus Error: 10' at node_value final assignment.
* fix: node_value now correctly calculating weighted median for sorted data. Still need to change the code to work with unsorted data.
* fix: node_value now correctly calculates median regardless of initial order
* fix: correct bug in calculating median when taking midpoint is necessary
* feature: add initial version of children_impurity
* feature: refactor median calculation into one function
* fix: fix use of DOUBLE_t vs double
* feature: move helper functions to _utils.pyx, fix mismatched pointer type
* fix: fix some bugs in children_impurity method
* push a debug version to try to solve segfault
* push latest changes, segfault probably happening bc of something in _utils.pyx
* fix: fix segfault in median calculation and remove excessive logging
* chore: revert some misc spacing changes I accidentally made
* chore: one last spacing fix in _splitter.pyx
* feature: don't calculate weighted median if no weights are passed in
* remove extraneous logging statement
* fix: fix children impurity calculation
* fix: fix bug with children impurity not being initially set to 0
* fix: hacky fix for a float accuracy error
* fix: incorrect type cast in median array generation for node_impurity
* slightly tweak node_impurity function
* fix: be more explicit with casts
* feature: revert cosmetic changes and free temporary arrays
* fix: only free weight array in median calculation if it was created
* style: remove extraneous newline / trigger CI build
* style: remove extraneous 0 from range
* feature: save sorts within a node to speed it up
* fix: move parts of dealloc to regression criterion
* chore: add comment to splitter to try to force recythonizing
* chore: add comment to _tree.pyx to try to force recythonizing
* chore: add empty comment to gradient boosting to force recythonizing
* fix: fix bug in weighted median
* try moving sorted values to a class variable
* feature: refactor criterion to sort once initially, then draw all samples from this sorted data
* style: remove extraneous parens from if condition
* implement median-heap method for calculating impurity
* style: remove extra line
* style: fix inadvertent cosmetic changes; I'll address some of these in a separate PR
* feature: change minmaxheap to internally use sorted arrays
* refactored MAE and push to share work
* fix errors wrt median insertion case
* spurious comment to force recythonization
* general code cleanup
* fix typo in _tree.pyx
* removed some extraneous comments
* [ci skip] remove earlier microchanges
* [ci skip] remove change to priorityheap
* [ci skip] fix indentation
* [ci skip] fix class-specific issues with heaps
* [ci skip] restore a newline
* [ci skip] remove microchange to refactor later
* reword a comment
* remove heapify methods from queue class
* doc: update docstrings for dt, rf, and et regressors
* doc: revert incorrect spacing to shorten diff
* convert get_median to return value directly
* [ci skip] remove accidental whitespace
* remove extraneous unpacking of values
* style: misc changes to identifiers
* add docstrings and more informative variable identifiers
* [ci skip] add trivial comments to recythonize
* remove trivial comments for recythonizing
* force recythonization for real this time
* remove trivial comments for recythonization
* rfc: harmonize arg. names and remove unnecessary checks
* convert allocations to safe_realloc
* fix bug in weighted case and add tests for MAE
* change all medians to DOUBLE_t
* add loginc allocate mediancalculators once, and reset otherwise
* misc style fixes
* modify cinit of regressioncriterion to take n_samples
* add MAE formula and force rebuild bc. travis was down
* add criterion parameter to gradient boosting and add forest tests
* add entries to what's new
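
Several entries above concern the weighted median (including the midpoint case) and the impurity derived from it. As a rough illustration only, the idea can be sketched in plain Python. This is not the Cython code from the commit: the function names `weighted_median` and `mae_impurity` are hypothetical, the sketch re-sorts on every call instead of using the median-heap approach mentioned above, and it assumes strictly positive sample weights.

```python
def weighted_median(values, weights=None):
    """Return the weighted median of `values` (hypothetical helper).

    Assumes every weight is strictly positive. With no weights given,
    each sample counts once, which reduces to the ordinary median.
    """
    if len(values) == 0:
        raise ValueError("values must be non-empty")
    if weights is None:
        weights = [1.0] * len(values)
    # Sort samples by value so cumulative weight can be scanned in order.
    pairs = sorted(zip(values, weights))
    half = sum(w for _, w in pairs) / 2.0
    cumulative = 0.0
    for i, (value, weight) in enumerate(pairs):
        cumulative += weight
        if cumulative > half:
            return value
        if cumulative == half:
            # Exactly half the weight lies at or below this value:
            # take the midpoint with the next value.
            return (value + pairs[i + 1][0]) / 2.0


def mae_impurity(values, weights=None):
    """Weighted mean absolute deviation from the weighted median."""
    if weights is None:
        weights = [1.0] * len(values)
    median = weighted_median(values, weights)
    total = sum(weights)
    return sum(w * abs(v - median) for v, w in zip(values, weights)) / total


if __name__ == "__main__":
    print(weighted_median([3, 1, 2]))        # odd count: middle value
    print(weighted_median([1, 2, 3, 4]))     # even count: midpoint
    print(mae_impurity([1, 2, 3, 4]))
```

Under this sketch a node predicts the weighted median of its targets, and a split would be scored by combining `mae_impurity` for the two children; how the commit weights and combines those terms is not shown here.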