Model Training ‐ Comparison ‐ [Epochs x Repeats]
Models | Logs | Graphs | Configs
One of the first questions that every guide provides its own answer to is "What's better: 100 epochs with 1 repeat, or 50 epochs with 2 repeats, or 25 epochs with 4 repeats...?" In general, does the ratio of epochs to repeats affect the result if the total number of steps remains the same?
Compared values: 100x1, 50x2, 20x5, 10x10.
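All four configurations yield the same total number of training steps. A minimal sketch, assuming the usual kohya-ss-style step formula (steps = images × repeats × epochs ÷ batch size) with a hypothetical dataset size and batch size:

```python
# Total training steps for each (epochs x repeats) configuration.
# Assumed formula: steps = num_images * repeats * epochs / batch_size.
# The dataset size and batch size below are hypothetical examples.

num_images = 20   # hypothetical number of training images
batch_size = 2    # hypothetical batch size

configs = [(100, 1), (50, 2), (20, 5), (10, 10)]  # (epochs, repeats)

for epochs, repeats in configs:
    steps = num_images * repeats * epochs // batch_size
    print(f"{epochs}x{repeats}: {steps} steps")  # same total in every case
```

Since epochs × repeats is held constant, only how the steps are grouped into epochs changes, which is exactly the variable being compared here.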
DLR(step)
The peak DLR varies slightly between runs, but no clear pattern in its growth is apparent.
Loss(step)
Since the number of epochs differs in each case while the total number of steps stays constant, we will use the loss(step) graph.
The graphs are almost identical; the small offsets in the lowest curves are due only to the time difference between training runs. The only notable observation is that with more epochs and fewer repeats, the curve tends to be more irregular: at 10x10 it is very smooth, while at 100x1 it is much more jagged.
While individual results may look slightly different, the quality is nearly identical across all of them.
There is one nuance influenced by this ratio, which we will discuss a bit later. For now, however, there is effectively no difference, and the quality of the results depends more on randomness. The only advantage of more epochs is that, if desired, you can choose from a larger number of saved models.