Are GitHub actions reliable, and where to take the project #57
MatthewCaseres started this conversation in General
Yeah, looking at these I don't see much difference. It is odd to see multiple runs with ~30 ms timings on the new Julia array version on the PR (it ran multiple times because of my multiple commits) and then consistently get ~50 ms on the main branch.
6 runs were done with and without the changes from #56. There is not much difference between the two. Note that I pushed multiple different branches so that the runs would be done concurrently and save me time, but not within the same Action run. I do not know how hardware or VMs are allocated. Here are the results.
sequential
jbr [84.971, 82.249, 85.348, 86.260, 81.267, 86.589]
jba [50.032, 49.452, 50.147, 50.549, 49.981, 50.992]
pan [78.781, 79.302, 81.198, 79.092, 80.079, 79.076]
pap [46.044, 47.674, 47.193, 46.515, 47.299, 46.309]
pll [617.214, 613.601, 617.600, 610.330, 618.243, 616.722]
prn [46.969, 63.591, 68.401, 64.476, 46.893, 61.273]
prp [73.111, 73.598, 75.202, 74.140, 73.143, 73.147]
parallel
jbr [82.722, 85.440, 85.510, 83.315, 82.656, 85.853]
jba [49.147, 50.971, 50.436, 49.895, 49.516, 50.000]
pan [79.872, 79.722, 80.457, 91.071, 82.561, 79.846]
pap [46.144, 44.292, 46.445, 45.853, 48.278, 45.671]
pll [618.954, 616.108, 618.983, 611.658, 615.401, 612.407]
prn [67.239, 46.856, 59.004, 67.092, 47.791, 47.018]
prp [72.276, 72.622, 72.050, 73.082, 74.414, 72.796]
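To make the "not much difference" claim concrete, here is a small Python snippet (not part of the repository) that summarizes the sequential timings above; the parallel block can be summarized the same way:

```python
from statistics import mean, stdev

# Timings in ms, copied from the sequential runs above.
sequential = {
    "jbr": [84.971, 82.249, 85.348, 86.260, 81.267, 86.589],
    "jba": [50.032, 49.452, 50.147, 50.549, 49.981, 50.992],
    "pan": [78.781, 79.302, 81.198, 79.092, 80.079, 79.076],
    "pap": [46.044, 47.674, 47.193, 46.515, 47.299, 46.309],
    "pll": [617.214, 613.601, 617.600, 610.330, 618.243, 616.722],
    "prn": [46.969, 63.591, 68.401, 64.476, 46.893, 61.273],
    "prp": [73.111, 73.598, 75.202, 74.140, 73.143, 73.147],
}

# Print mean and sample standard deviation per benchmark.
for name, times in sequential.items():
    print(f"{name}: mean={mean(times):7.2f} ms  stdev={stdev(times):5.2f} ms")
```

The standard deviations make the point below visible at a glance: every benchmark is tight except `prn`, whose stdev is an order of magnitude larger than the others.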
Thoughts on experiments
This leads me to want to revert the change to sequential, as I do not see a difference between the sequential and parallel jobs. Notifying @alecloudenback. The variability in runtimes appears to be limited to the NumPy-based models, and these results don't seem worrisome.
Can better benchmarks solve the problem?
I am sure some life actuarial workloads take longer than 50 ms, so a 10 ms variance would no longer be a concern if we can find a heavier model.
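The arithmetic behind this is simple: a fixed amount of jitter shrinks as a fraction of the runtime when the workload grows. A quick illustration (the runtimes are hypothetical round numbers, not measurements):

```python
# A fixed ~10 ms jitter matters less as the model gets heavier.
for runtime_ms in (50, 500, 5000):
    jitter_pct = 10 / runtime_ms * 100
    print(f"{runtime_ms:5d} ms model: 10 ms jitter = {jitter_pct:.1f}% of runtime")
```

At 50 ms the jitter is 20% of the measurement; at 5 seconds it drops to 0.2% and would disappear into the noise floor.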
What is a good model
I have no clue about economic assumptions, but I understand mortality tables and lapses okay. How can the benchmarks become a collection of runnable models, adhering to a standard, that practicing actuaries would find valuable?
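One way such a standard could look is a minimal shared interface that every benchmark model implements, so implementations in different libraries stay comparable. This is purely a hypothetical sketch; the names `CashflowModel`, `run`, and `FlatDecrements` are my invention and do not exist in this repository or in lifelib:

```python
from typing import Protocol


class CashflowModel(Protocol):
    """Hypothetical contract every benchmark model would satisfy."""

    name: str

    def run(self, policies: int, timesteps: int) -> float:
        """Return total projected cashflow for a standard input set."""
        ...


class FlatDecrements:
    """Toy implementation: flat mortality and lapse, illustrative numbers only."""

    name = "flat-decrements"

    def run(self, policies: int, timesteps: int) -> float:
        mortality, lapse, premium = 0.01, 0.05, 100.0
        inforce, total = float(policies), 0.0
        for _ in range(timesteps):
            total += inforce * premium          # collect premiums on in-force block
            inforce *= (1 - mortality) * (1 - lapse)  # apply decrements
        return total
```

A harness could then time any object satisfying `CashflowModel` without caring whether it is backed by NumPy, Julia, or something else.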
Containers
Apptainer is popular for HPC testing and is compatible with Docker images. So these heavier benchmarks might not even have their source code contained in this repository: the repository could report results from various published containers and serve as an index into all the different implementations people might have. People could then run the containers themselves.
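For concreteness, a minimal Apptainer definition file might look like the following sketch. The base image, package list, and script path are illustrative assumptions, not anything published by this project:

```
Bootstrap: docker
From: python:3.11-slim

%files
    # Hypothetical benchmark script bundled into the image.
    benchmark.py /opt/benchmark.py

%post
    # Install whatever the benchmark model needs.
    pip install numpy

%runscript
    exec python /opt/benchmark.py "$@"
```

An image like this would be built with `apptainer build model.sif model.def` and run with `apptainer run model.sif`; Apptainer can also pull published Docker images directly with its `docker://` URI scheme, which is what makes an index of third-party containers practical.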
For example, I would probably start submitting pull requests to lifelib with my implementations and have them live there.