Sample hardness
During training, it is useful to identify the samples that a model finds hardest to learn, so that training can focus on them. These hard samples are also useful seeds when deciding which new samples to add to a training dataset.

One way to think about hardness is in terms of the entropy of the softmax probabilities of a model’s output.

“A classification model predicts which class a certain data point belongs to. The “raw” output of the model is often in the form of “logits”, or log-odds. Each logit corresponds to a score for a specific class. The higher the score, the more likely the model thinks the data point belongs to that class. However, these logits are not in a very interpretable form. So, they are usually transformed into probabilities using a function like softmax. The softmax function takes a vector of logits and squashes them into a range of [0, 1] such that the entire vector sums to 1.0. This way, each element in the softmax output can be interpreted as the probability of the data point belonging to a specific class.

Now, entropy is a concept borrowed from information theory. In this context, it’s used to quantify the “uncertainty” or “surprise” of a probability distribution. A uniform distribution, where all outcomes are equally likely, has the highest entropy because it is the most uncertain or surprising – you have no idea which outcome is going to occur. Conversely, a distribution where one outcome is certain to happen has an entropy of zero, because there is no surprise or uncertainty.

So, when you calculate the entropy of the softmax output, you’re calculating the uncertainty in the model’s predictions. If the entropy is low, it means the model is very confident in its predictions. If the entropy is high, it means the model is less certain about its predictions.”
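The relationship between logits, softmax, and entropy described above can be sketched in a few lines of plain Python. This is only an illustration of the concept, not FiftyOne's internal implementation; the function names `softmax` and `entropy` and the example logits are chosen here for demonstration.

```python
import math

def softmax(logits):
    """Map raw logits to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)

confident = softmax([10.0, 0.0, 0.0])  # one class clearly dominates
uncertain = softmax([1.0, 1.0, 1.0])   # all classes equally likely

print(entropy(confident))  # close to 0: the model is confident
print(entropy(uncertain))  # log(3): the maximum for three classes
```

A sample whose prediction looks like `uncertain` would be a natural candidate for a "hard" sample under this view.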



To compute hardness, add your model predictions, including their logits, to your FiftyOne Dataset and then run the compute_hardness() method:

import fiftyone as fo
import fiftyone.brain as fob

dataset = fo.load_dataset(...)

fob.compute_hardness(dataset, "predictions")

Input: A Dataset or DatasetView whose samples carry model predictions, including logits, in the label field named by the second argument ("predictions" in the call above). Ground truth annotations are not required for hardness.
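As a rough sketch of how the input might be prepared, the snippet below stores a `fo.Classification` with its logits on each sample. The helpers `classification_fields` and `add_predictions`, and the `predict_logits` callable standing in for your model, are hypothetical names introduced for illustration; adapt them to however your model produces logits.

```python
import math

def classification_fields(logits, classes):
    """Derive the predicted label and its softmax confidence from raw logits."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": classes[best], "confidence": probs[best]}

def add_predictions(dataset, predict_logits, classes, field="predictions"):
    """Store a prediction with logits on every sample (hypothetical helper).

    `predict_logits` is assumed to be a callable that takes an image path
    and returns a sequence of raw logits, one per entry in `classes`.
    """
    import numpy as np
    import fiftyone as fo  # imported lazily; only needed when writing to a dataset

    for sample in dataset:
        logits = predict_logits(sample.filepath)
        sample[field] = fo.Classification(
            logits=np.asarray(logits),  # logits are required for hardness
            **classification_fields(logits, classes),
        )
        sample.save()
```

With predictions stored this way, the field name passed as `field` is the one to hand to `compute_hardness()`.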

Output: A scalar-valued hardness field is populated on each sample that ranks the hardness of the sample. You can customize the name of this field via the hardness_field argument of compute_hardness().
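Once the hardness field is populated, a common next step is to look at the hardest samples first. The helper below is a minimal sketch, assuming the default field name "hardness"; `hardest_samples` is a hypothetical name, while `sort_by()` and `limit()` are standard FiftyOne view stages.

```python
def hardest_samples(dataset, k=10, hardness_field="hardness"):
    """Return a view of the k samples with the highest hardness scores."""
    # Sort in descending order so the hardest samples come first
    return dataset.sort_by(hardness_field, reverse=True).limit(k)

# Example usage (requires a dataset on which compute_hardness() has been run):
#   import fiftyone as fo
#   session = fo.launch_app(view=hardest_samples(dataset, k=25))
```

If you customized the field name via hardness_field, pass the same name here.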

What to expect: Hardness is computed in the context of a prediction model. The FiftyOne Brain hardness measure defines hard samples as those for which the prediction model is unsure about what label to assign. This measure incorporates prediction confidence and logits in a tuned model that has demonstrated empirical value in many model training exercises.

