Recently I was analyzing job execution and noticed quite a strange situation.

There are two problems with uniformity:
- Around 54 DB12, some results seem impossible to get, such as 52.8, 53.7, 54.4, 55.0 and so on.
- Around 38 DB12, on the other hand, some values appear with considerably higher probability than their neighbours.
So I suspected that the problem is rounding, and here is what I found.
- When we calculate DB12, we use `os.times()`, but this function gives values rounded to 2 decimal places. Example: 4.58, 4.59, 4.6 and so on.
- When DB12 benchmark results are saved by DIRAC, they are also rounded, to 1 decimal place (https://github.com/DIRACGrid/DIRAC/blob/c0afdba5ded29e72f1db144d71df3918313a28cb/src/DIRAC/WorkloadManagementSystem/scripts/dirac_wms_cpu_normalization.py#L66).
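To illustrate how the two rounding steps combine, here is a minimal sketch. It assumes the score is `250 / duration`, as in the simulation in this issue, and the helper name `db12` is made up for the example. It lists the 0.1-spaced scores that no duration at 0.01 s resolution can produce:

```python
def db12(duration: float) -> float:
    """Score after both rounding steps (assumed formula: 250 / duration)."""
    return round(250.0 / round(duration, 2), 1)

# CPU times at 0.01 s resolution, from 4.50 s to 4.80 s.
durations = [round(4.50 + i * 0.01, 2) for i in range(31)]
reachable = {db12(t) for t in durations}

# Every 0.1-spaced score between the smallest and largest reachable value.
lo, hi = min(reachable), max(reachable)
steps = int(round((hi - lo) * 10))
candidates = [round(lo + i * 0.1, 1) for i in range(steps + 1)]

# Scores that fall between two neighbouring durations and can never be produced.
missing = [v for v in candidates if v not in reachable]
print(missing)
```

The printed list should include 52.8, 53.7, 54.4 and 55.0, the values mentioned above.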
To simulate that I generated DB12 results which could be received from 50000 tests with test duration distributed uniformly:
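```python
import pandas as pd

df = pd.DataFrame({
    'duration': [3.00 + x * 0.0001 for x in range(50000)],
    # duration rounded to 2 decimals (os.times()), result rounded to 1 decimal (DIRAC)
    'result': [round((250.0 / round(3.00 + x * 0.0001, 2)) / 1.0, 1) for x in range(50000)],
})
```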
Then I drew a histogram counting the different result values (that is how CPUNormalizationFactor is calculated for every job).

(Interactive plot with results: DB12_freq.html)
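For readers who cannot open the interactive plot, the same counts can be reproduced with the standard library alone. This is only a sketch of the counting step under the same assumptions as the simulation above (durations uniform from 3.0 s to 8.0 s, score `250 / duration`); the selection of values to print is arbitrary:

```python
from collections import Counter

# Same simulated tests as above: 50000 durations spread uniformly from 3.0 s to 8.0 s.
results = [round(250.0 / round(3.00 + x * 0.0001, 2), 1) for x in range(50000)]
counts = Counter(results)

# A few values around 38 and around 54; Counter returns 0 for values never seen.
for value in (37.9, 38.0, 38.1, 38.2, 53.6, 53.7, 53.8):
    print(value, counts[value])
```

Around 38, neighbouring values should differ in count by roughly a factor of two, while a value such as 53.7 should not occur at all.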
Obviously there is a problem with the uniformity of the results. It affects not only high DB12 values but also lower ones; it is just less visible for low DB12 values.
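One possible way to see why both effects come from the same cause is to compare the 0.1 output resolution with how far the score moves when the duration changes by one 0.01 s tick. The sketch below assumes the score is `250 / duration`, as in the simulation, and uses the first-order approximation `250 / duration**2 * tick`; the function name `score_step` is hypothetical:

```python
def score_step(duration: float, tick: float = 0.01) -> float:
    """Approximate change in 250 / duration when duration moves by one tick."""
    return 250.0 / duration**2 * tick

for score in (54.0, 38.0):
    duration = 250.0 / score
    step = score_step(duration)
    print(f"DB12 ~ {score}: duration ~ {duration:.2f} s, step ~ {step:.3f}")
```

Under this approximation the step is larger than 0.1 for scores above 50, so some 0.1-spaced values are skipped. Below 50 the step is smaller than 0.1, so some values collect two neighbouring durations while others collect only one.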