valeman/smalltabnets

 
 

Modern neural networks for small tabular datasets

An evaluation framework from the paper Modern Neural Networks for Small Tabular Datasets: The New Default for Field-Scale Digital Soil Mapping?

This repository provides a unified framework for evaluating modern deep neural networks on small tabular datasets. The benchmark covers 31 field- and farm-scale digital soil mapping datasets from LimeSoDa.

Overview

  • Datasets: Uses soil datasets from the LimeSoDa repository with proximal soil sensing and remote sensing features.

  • Models: Implements 15+ models with a unified interface:

    • Classical ML: Linear Regression, Ridge, Lasso, PLSR, Random Forest, XGBoost
    • MLP-based NNs: MLP, TabM, RealMLP
    • Retrieval-based NNs: TabR, ModernNCA
    • Attention-based NNs: AutoInt, FT-Transformer, ExcelFormer, T2G-Former, AMFormer
    • In-context learning foundation models: TabPFN
  • Configuration: Experiment settings are defined in YAML configuration files. Configurations for datasets with a feature-to-sample ratio < 1 are in the config/pss/ folder; configurations for high-dimensional datasets with a ratio > 1 (including MIR/NIR spectroscopy features) are in the config/spectroscopic/ folder.

  • Preprocessing: Built-in support for PCA, feature scaling, and numerical embeddings.
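
The folder split described under Configuration can be illustrated with a small sketch that computes the feature-to-sample ratio and picks the matching folder. The function name `config_folder_for` is hypothetical and not part of this repository. The text above does not say how a ratio of exactly 1 is handled, so the sketch raises an error in that case rather than guessing.

```python
def config_folder_for(n_samples: int, n_features: int) -> str:
    """Return the config folder matching a dataset's feature-to-sample ratio.

    Hypothetical helper for illustration only; not part of the repository.
    """
    if n_samples <= 0 or n_features <= 0:
        raise ValueError("n_samples and n_features must be positive")

    ratio = n_features / n_samples
    if ratio < 1:
        # Fewer features than samples
        return "config/pss/"
    if ratio > 1:
        # High-dimensional data, e.g. MIR/NIR spectroscopy features
        return "config/spectroscopic/"
    # A ratio of exactly 1 is not covered by the README
    raise ValueError("a ratio of exactly 1 is not covered by the README")
```

For example, a dataset with 100 samples and 10 features would map to config/pss/, while one with 50 samples and 1700 spectral bands would map to config/spectroscopic/.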

Setup

Requirements: Python 3.10+

pip install -r requirements.txt

Usage

Run experiments using YAML configuration files:

python benchmark.py --config config/pss/limesoda_mlp.yaml

Example configuration files are provided in the config/pss/ and config/spectroscopic/ folders.
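
To run every configuration in a folder rather than one file at a time, a small wrapper along the following lines may help. This is a sketch, not a script shipped with the repository: `build_commands` and `run_all` are hypothetical names, and it assumes only the `--config` flag shown above and that configuration files end in `.yaml`.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(config_dir: str) -> list[list[str]]:
    """Build one benchmark.py command per YAML file in config_dir.

    Hypothetical helper; relies only on the --config flag from the README.
    """
    configs = sorted(Path(config_dir).glob("*.yaml"))
    return [
        [sys.executable, "benchmark.py", "--config", str(path)]
        for path in configs
    ]


def run_all(config_dir: str) -> None:
    """Run each command in turn, stopping at the first failing run."""
    for cmd in build_commands(config_dir):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `run_all("config/pss")` from the repository root would then run the experiments one after another. Using `check=True` stops the batch on the first failure, so a broken configuration is noticed early.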

Results & Data

Complete experimental results, including optimized hyperparameters for all dataset-model combinations and model predictions, are available in results.tar.gz.
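
The internal layout of the archive is not documented here, so a reasonable first step is to list its contents before extracting anything. The sketch below uses Python's standard `tarfile` module; `list_archive` is a hypothetical helper name, and it assumes the archive has been downloaded to the path passed in.

```python
import tarfile


def list_archive(path: str, limit: int = 20) -> list[str]:
    """Return up to `limit` member names from a gzip-compressed tar archive."""
    with tarfile.open(path, "r:gz") as archive:
        return archive.getnames()[:limit]
```

Calling `list_archive("results.tar.gz")` shows the first member names, which should make clear how the results are organized.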

Citation

@misc{barkov2025modern,
  title         = {Modern Neural Networks for Small Tabular Datasets: The New Default for Field-Scale {Digital} {Soil} {Mapping}?},
  author        = {Viacheslav Barkov and Jonas Schmidinger and Robin Gebbers and Martin Atzmueller},
  year          = {2025},
  eprint        = {2508.09888},
  archiveprefix = {arXiv},
  primaryclass  = {cs.LG},
  url           = {https://arxiv.org/abs/2508.09888},
}
