ranja-sarkar/sensitivity


sensitivity

Selecting a dataset size for machine learning is a challenging open problem. Typically, model performance depends on the size of the training dataset, especially for nonlinear models. The caveat is that for some models and datasets no such relationship may exist.


A sensitivity analysis provides a basis for testing different model types and model configurations in addition to the ones chosen at the outset for a given problem. It helps in selecting a better-suited algorithm and in arriving at a rough estimate of the training data needed to build the predictive model, complementing statistical heuristics based on the number of classes (for a classification problem), the number of input features, or the number of model parameters.
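As an illustration of the idea, the sketch below trains the same model on progressively larger training sets and records the mean and spread of its test accuracy at each size. It is not the code from this repository: the synthetic two-blob data, the small k-nearest-neighbour classifier, and all function names (`make_dataset`, `knn_predict`, `sensitivity_analysis`) are assumptions chosen to keep the example dependency-free. In practice the model and the evaluation procedure would be whatever is being considered for the actual problem.

```python
import random
import statistics


def make_dataset(n, rng, noise=1.0):
    """Synthetic two-class data: two Gaussian blobs in 2D (illustrative assumption)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 1.0 if label else -1.0
        point = (rng.gauss(center, noise), rng.gauss(center, noise))
        data.append((point, label))
    return data


def knn_predict(train, x, k=5):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(
        train,
        key=lambda item: (item[0][0] - x[0]) ** 2 + (item[0][1] - x[1]) ** 2,
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def accuracy(train, test, k=5):
    """Fraction of test points classified correctly."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


def sensitivity_analysis(sizes, repeats=5, test_size=200, seed=0):
    """Return {train_size: (mean_accuracy, std_accuracy)} over repeated runs."""
    rng = random.Random(seed)
    results = {}
    for n in sizes:
        scores = []
        for _ in range(repeats):
            train = make_dataset(n, rng)
            test = make_dataset(test_size, rng)
            scores.append(accuracy(train, test))
        results[n] = (statistics.mean(scores), statistics.stdev(scores))
    return results


if __name__ == "__main__":
    for n, (mean, std) in sensitivity_analysis([10, 50, 100, 500]).items():
        print(f"n={n:4d}  accuracy={mean:.3f} +/- {std:.3f}")
```

Reading the results as a curve of accuracy against training size shows where additional data stops paying off, if it does. The spread across repeats indicates how much of any difference between sizes is just noise.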

A good read on the effectiveness of larger datasets in deep learning models: https://arxiv.org/abs/1707.02968

About

How much training data is required to build an ML model?
