Is it more common for the loss vs epoch count curve of ML models to be concave or convex? Why?
It is more common for the loss vs epoch count curve of ML models to be convex: the loss falls steeply at first and then flattens out.

The loss function in a typical ML model measures how far the model's predictions are from the targets, so a lower loss means better predictions. The goal of the training process is to minimize this loss function, which is achieved by updating the model's parameters based on the gradients of the loss function with respect to those parameters.

During the early epochs of training, the model is usually far from optimal and the gradients are large, so each update reduces the loss substantially. As the model gets closer to a minimum, the gradients become smaller and each epoch yields a smaller improvement. The result is a decreasing curve whose slope is steep and negative at first and then rises toward zero. That is what makes the curve convex: the rate of improvement decreases over time.
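To make the argument concrete, here is a minimal sketch assuming the simplest possible setting: plain gradient descent on a one-parameter quadratic loss. The function names, the starting weight, and the learning rate are illustrative choices, not taken from any particular model. In this toy case the loss shrinks by a constant factor each epoch, so the curve is both decreasing and convex (every second difference is positive).

```python
def train_quadratic(w0=5.0, curvature=1.0, lr=0.1, epochs=30):
    """Gradient descent on the toy loss(w) = 0.5 * curvature * w**2.

    Returns the loss recorded at the start of each epoch.
    All default values are arbitrary illustrative choices.
    """
    w = w0
    losses = []
    for _ in range(epochs):
        losses.append(0.5 * curvature * w * w)
        grad = curvature * w      # d(loss)/dw
        w -= lr * grad            # gradient step; shrinks as w nears 0
    return losses


def second_differences(values):
    """Discrete analogue of the second derivative of a curve."""
    return [values[i + 1] - 2 * values[i] + values[i - 1]
            for i in range(1, len(values) - 1)]


losses = train_quadratic()
# Per-epoch improvement: positive, and getting smaller each epoch.
drops = [losses[i] - losses[i + 1] for i in range(len(losses) - 1)]

is_decreasing = all(d > 0 for d in drops)
is_convex = all(s > 0 for s in second_differences(losses))
print("decreasing:", is_decreasing, "convex:", is_convex)
```

Real training runs are noisier than this, so the second differences of a measured curve will not all be positive, but the overall shape is usually the same.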

However, not all loss vs epoch count curves will be perfectly convex. Depending on the architecture of the model, the size and quality of the training data, and the chosen hyperparameters, the curve may contain concave stretches (for example, a slow start or a plateau followed by a sudden drop), step-like drops when the learning rate is reduced, or noisy, irregular fluctuations. Therefore, it's important to monitor the loss curve during training and make adjustments as needed to ensure that the model is learning effectively.
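One rough way to monitor the shape of a recorded loss curve is to smooth it and measure what fraction of its second differences are non-negative. The sketch below assumes the per-epoch losses are available as a plain list. The helper names `moving_average` and `convex_fraction` and the window size are hypothetical, introduced only for illustration.

```python
def moving_average(values, window=3):
    """Smooth a sequence with a simple sliding mean to damp epoch-to-epoch noise."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def convex_fraction(losses, window=3):
    """Fraction of smoothed second differences that are >= 0.

    A value near 1.0 suggests a mostly convex curve and a value near 0.0
    a mostly concave one. Returns None if the curve is too short to judge.
    """
    smoothed = moving_average(losses, window)
    second = [smoothed[i + 1] - 2 * smoothed[i] + smoothed[i - 1]
              for i in range(1, len(smoothed) - 1)]
    if not second:
        return None
    return sum(1 for s in second if s >= 0) / len(second)


# Synthetic curves for illustration only (not real training logs).
fast_then_flat = [0.5 ** t for t in range(10)]            # steep drop, then flattening
slow_then_fast = [1.0 - 0.01 * t * t for t in range(10)]  # drop that accelerates

print(convex_fraction(fast_then_flat))
print(convex_fraction(slow_then_fast))
```

A low fraction is not a problem in itself. It is only a prompt to look at the curve and check whether the learning rate, the data, or the model needs adjusting.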