
alaamaalouf/AutoCoreset


AutoCoreset: An Automatic Practical Coreset Construction Framework [ICML 2023]

1 CSAIL, MIT | 2 DataHeroes | 3 Rice University

Alaa Maalouf, Murad Tukan, Vladimir Braverman, and Daniela Rus

AutoCoreset design

A coreset is a small weighted subset of the data that approximates the loss function on the whole dataset. Coresets are widely used in machine learning, but current construction methods are problem-dependent and may be challenging for new researchers.
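To make "a small weighted subset that approximates the loss" concrete, here is a minimal sketch. It uses plain uniform sampling as a stand-in, which is not the AutoCoreset construction; the data, the query point, and the `cost` helper are assumptions chosen only for illustration.

```python
import random


def cost(points, weights, center):
    """Weighted sum of squared Euclidean distances from each point to `center`."""
    return sum(
        w * sum((x - c) ** 2 for x, c in zip(p, center))
        for p, w in zip(points, weights)
    )


rng = random.Random(0)
n, m = 10_000, 500  # full data size and subset size (illustrative values)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

# Uniform sampling baseline (NOT the AutoCoreset method):
# each kept point stands in for n/m original points.
subset = rng.sample(data, m)
subset_weights = [n / m] * m

query = (1.0, 1.0)  # an arbitrary candidate solution to evaluate
full_cost = cost(data, [1.0] * n, query)
approx_cost = cost(subset, subset_weights, query)
rel_err = abs(full_cost - approx_cost) / full_cost
print(f"relative error on this query: {rel_err:.3f}")
```

A good coreset keeps this relative error small for every query rather than for a single one, which is what problem-specific constructions are designed to guarantee.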

No worries, we got you. We propose AutoCoreset: an automatic practical framework for constructing coresets requiring only input data and the cost function (without any other user computation or calculation), making it user-friendly and applicable to various problems. See our open-source code which supports future research and simplifies coreset usage.

Usage

To use AutoCoreset on your data and desired ML model:

(1) Modify line 134 to be a list containing the path to your dataset.

(2) Modify line 138 to be the name of your ML model (currently supported: 'k_means', 'logistic_regression', 'linear_regression', 'svm').

(3) You can also swap in a different ML model, as long as you provide its "fit" and "score" functions.

(4) Run: python main.py
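As a rough illustration of step (3), a custom model exposing "fit" and "score" might look like the sketch below. The exact signatures AutoCoreset expects are not stated here, so the scikit-learn-style arguments (`X`, `y`, `sample_weight`) and the class name `WeightedMeanModel` are assumptions; check the repository code before relying on them.

```python
class WeightedMeanModel:
    """Hypothetical model: a single center (1-means) with optional point weights."""

    def fit(self, X, y=None, sample_weight=None):
        # Assumed interface: coreset weights arrive through `sample_weight`.
        weights = sample_weight if sample_weight is not None else [1.0] * len(X)
        total = sum(weights)
        dim = len(X[0])
        # The weighted mean minimizes the weighted sum of squared distances.
        self.center_ = [
            sum(w * p[j] for p, w in zip(X, weights)) / total for j in range(dim)
        ]
        return self

    def score(self, X, y=None, sample_weight=None):
        # Higher is better, so return the negative weighted cost.
        weights = sample_weight if sample_weight is not None else [1.0] * len(X)
        return -sum(
            w * sum((x - c) ** 2 for x, c in zip(p, self.center_))
            for p, w in zip(X, weights)
        )


model = WeightedMeanModel().fit([(0.0, 0.0), (2.0, 2.0)], sample_weight=[1.0, 3.0])
print(model.center_)  # [1.5, 1.5]
```

Accepting weights in both functions matters because a coreset is a weighted subset: the model has to be fitted and evaluated on weighted points.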

Citation

If you find this work helpful please cite us:

@article{maalouf2023autocoreset,
      title={AutoCoreset: An Automatic Practical Coreset Construction Framework},
      author={Maalouf, Alaa and Tukan, Murad and Braverman, Vladimir and Rus, Daniela},
      journal={arXiv preprint arXiv:2305.11980},
      year={2023}
}

About

AutoCoreset is an automatic, practical coreset construction framework for any function. The user only has to specify the desired cost function and datasets.
