Why do tree-based models still outperform deep learning on typical tabular data?, Grinsztajn+, Soda, Inria Saclay , arXiv'22 #574

AkihikoWatanabe · 2023-04-28T10:59:26Z

https://arxiv.org/pdf/2207.08815.pdf

AkihikoWatanabe · 2023-04-28T11:00:43Z

This paper confirms that tree-based models outperform neural models on tabular data and offers several explanations for why this happens.

To explain why tree-based models work better than NNs, the paper investigates whether each model's inductive bias suits tabular data. The observations are:

NNs are good at learning smooth targets, but they are poorly suited to irregular data such as tabular data.

Random Forest also learns patterns that are irregular along the x-axis, whereas the NN does not.
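As a toy illustration of the smooth-versus-irregular point (not the paper's experiment), the sketch below fits a square-wave target with a piecewise-constant estimator, which is the kind of function a tree produces, and with a Gaussian kernel smoother. The smoother is only a stand-in for a model biased toward smooth functions; it is not an NN. The target function, bin count, bandwidth, and all function names are assumptions made for illustration.

```python
import math
import random


def irregular_target(x):
    # Square wave on [0, 1): jumps between 1 and 0 every 0.05 (assumed toy target).
    return float(int(x * 20) % 2 == 0)


def fit_bin_means(xs, ys, n_bins):
    # Piecewise-constant fit: the mean of y inside each equal-width bin.
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    overall = sum(ys) / len(ys)  # fallback for an empty bin
    return [s / c if c else overall for s, c in zip(sums, counts)]


def predict_bins(means, x):
    return means[min(int(x * len(means)), len(means) - 1)]


def predict_kernel(xs, ys, x, bandwidth):
    # Gaussian kernel smoother: a weighted average of all training targets.
    weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def mse(preds, targets):
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


rng = random.Random(0)
train_x = [rng.random() for _ in range(500)]
train_y = [irregular_target(x) for x in train_x]
test_x = [rng.random() for _ in range(200)]
test_y = [irregular_target(x) for x in test_x]

bin_means = fit_bin_means(train_x, train_y, n_bins=40)
mse_piecewise = mse([predict_bins(bin_means, x) for x in test_x], test_y)
mse_smooth = mse(
    [predict_kernel(train_x, train_y, x, bandwidth=0.1) for x in test_x], test_y
)
print(f"piecewise-constant MSE: {mse_piecewise:.3f}")
print(f"kernel smoother MSE:    {mse_smooth:.3f}")
```

With a bandwidth wider than the jumps, the smoother averages the square wave away, while the piecewise-constant fit can follow each jump.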

Uninformative features hurt MLP-like NNs.

Tabular data generally contains many uninformative features, and when uninformative features were added to the MLP's input, the gap to tree-based methods widened.
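The manipulation described here, adding uninformative features to the input, could be sketched as follows. This is a minimal sketch assuming Gaussian noise columns; the helper `add_uninformative_features` and the toy dataset are hypothetical and are not taken from the paper.

```python
import random


def add_uninformative_features(rows, n_extra, rng):
    """Return a copy of rows with n_extra pure-noise columns appended.

    The noise is drawn independently of the labels, so the new columns
    carry no information about the target (assumed Gaussian here).
    """
    return [
        list(row) + [rng.gauss(0.0, 1.0) for _ in range(n_extra)]
        for row in rows
    ]


rng = random.Random(0)

# Toy dataset: two informative features; the label depends only on them.
rows = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(100)]
labels = [int(a + b > 0) for a, b in rows]

# Same labels, but 8 of the 10 input columns are now uninformative.
augmented = add_uninformative_features(rows, n_extra=8, rng=rng)
print(len(augmented), len(augmented[0]))
```

Training the same pair of models once on `rows` and once on `augmented`, then comparing the two performance gaps, would be one way to reproduce the comparison in spirit.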

The data are not invariant to rotation, so the learning procedure should not be either (I did not fully understand this part).

ResNet's performance did not change when a rotation was applied (its structure is rotation invariant).
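One possible reading of the rotation point, sketched under assumptions: in tabular data each column has its own meaning, so mixing columns with a random rotation destroys axis-aligned structure. A learner that depends only on geometry such as distances sees an equivalent problem, while an axis-aligned learner such as a tree does not. The toy below uses a single-split decision stump as a stand-in for a tree; the function names and the data are hypothetical, and the matrix is random orthogonal (it may include a reflection).

```python
import math
import random


def random_orthogonal(dim, rng):
    # Gram-Schmidt on Gaussian vectors; the returned rows are orthonormal.
    basis = []
    while len(basis) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for b in basis:
            dot = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - dot * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-8:
            basis.append([vi / norm for vi in v])
    return basis


def rotate(rows, q):
    # Apply x -> Q x to every row.
    return [[sum(qk * xk for qk, xk in zip(qj, x)) for qj in q] for x in rows]


def best_stump_accuracy(rows, labels):
    # Best training accuracy of any single axis-aligned threshold split.
    n = len(rows)
    best = 0.0
    for j in range(len(rows[0])):
        for t in [r[j] for r in rows]:
            correct = sum((r[j] > t) == y for r, y in zip(rows, labels))
            best = max(best, correct / n, 1 - correct / n)
    return best


def distance(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


rng = random.Random(0)
dim = 5
rows = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(300)]
labels = [r[0] > 0 for r in rows]  # the label depends on one original column

q = random_orthogonal(dim, rng)
rotated = rotate(rows, q)

acc_before = best_stump_accuracy(rows, labels)
acc_after = best_stump_accuracy(rotated, labels)
dist_before = distance(rows[0], rows[1])
dist_after = distance(rotated[0], rotated[1])
print(f"stump accuracy: {acc_before:.3f} -> {acc_after:.3f}")
print(f"pairwise distance: {dist_before:.6f} -> {dist_after:.6f}")
```

Pairwise distances are unchanged by the transform, but no single rotated column separates the labels cleanly any more, which is why a method that is not rotation invariant can exploit the original column structure.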

AkihikoWatanabe added Transformer TabularData Neural labels Apr 28, 2023

AkihikoWatanabe changed the title ~~Why do tree-based models still outperform deep learning on typical tabular data?~~ Why do tree-based models still outperform deep learning on typical tabular data?, Grinsztajn+, Soda, Inria Saclay , arXiv'23 Apr 28, 2023

AkihikoWatanabe changed the title ~~Why do tree-based models still outperform deep learning on typical tabular data?, Grinsztajn+, Soda, Inria Saclay , arXiv'23~~ Why do tree-based models still outperform deep learning on typical tabular data?, Grinsztajn+, Soda, Inria Saclay , arXiv'22 Apr 28, 2023

AkihikoWatanabe added the MachineLearning label Oct 22, 2023
