Skip to content

Latest commit

 

History

History
27 lines (16 loc) · 1.15 KB

dataset.md

File metadata and controls

27 lines (16 loc) · 1.15 KB

Dataset

A dataset (or data set) is a collection of data that are used for machine-learning training job.

Machine learning typically works with three datasets:

  • Training dataset

    The actual dataset that we use to train the model. The model learns weights and parameters from this data.

  • Validation dataset

    The validation set is used to evaluate a given model during the training process. It helps machine learning engineers to fine-tune the HyperParameter at model development stage. The model doesn't learn from validation dataset; and validation dataset is optional.

  • Test dataset

    The Test dataset provides the gold standard used to evaluate the model. It is only used once a model is completely trained.

See Jason Brownlee’s article for more detail.

DJL provides a number of built-in basic and standard datasets. These datasets are used to train deep learning models.