[CI] Adaptive mining timeout, depending on available CPU power #7387
Conversation
.github/workflows/build.yml
Outdated
@@ -131,7 +131,7 @@ jobs:
      - name: install monero dependencies
        run: sudo apt -y install build-essential cmake libboost-all-dev miniupnpc libunbound-dev graphviz doxygen libunwind8-dev pkg-config libssl-dev libzmq3-dev libsodium-dev libhidapi-dev libnorm-dev libusb-1.0-0-dev libpgm-dev libprotobuf-dev protobuf-compiler
      - name: install requests
-       run: pip install requests
+       run: pip install requests psutil
for RAM readouts
def pi_digits_numeric_yield(digits):
    """Generate x digits of Pi.
    https://stackoverflow.com/a/54967370
Using code from stackoverflow is a serious licensing issue. Also, as a practical matter, you can perhaps do something much simpler here, such as a calculation with the decimal or math libraries.
The proper copyrights actually belong to these gents:
This is implementing a spigot algorithm by A. Sale, D. Saada, S. Rabinowitz, mentioned on http://mail.python.org/pipermail/edu-sig/2012-December/010721.html
https://gist.github.com/deeplook/4947835
If this is not acceptable, I will look for alternatives, but really, Stackoverflow was just the messenger.
Of course algorithms to compute pi are plentiful in the public domain. The issue here is you've now attributed work to stackoverflow and taking that attribution back would be doubly bad. Plus, using an optimized C-based mathematical function, as found in decimal or math, seems maybe a better strategy.
I think that as long as the PR isn't merged, I neither give nor take anything. That's what a PR is for. Otherwise you create destructive mental pressure on contributors, making them refrain from contributing at all.
Besides, a link is not a copyright clause.
I'm not saying that you've broken the project by introducing a PR with an SO link, I'm saying it could be problematic to merge said code, given its history.
Then I'm very happy that I misunderstood you.
The Decimal solution works equally well.
The decimal solution isn't right. The same code runs for 2s on my machine and 2000s on the CI machine. I'll try to avoid Decimal.
Trying the Leibniz formula now -> OK!
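For illustration, a minimal sketch of the Leibniz series for pi mentioned above (this is not necessarily the exact code from the PR; the function name and term count are illustrative):

```python
def pi_leibniz(terms):
    """Approximate pi via the Leibniz series: pi/4 = 1 - 1/3 + 1/5 - ..."""
    total = 0.0
    sign = 1.0
    for k in range(terms):
        total += sign / (2 * k + 1)
        sign = -sign
    return 4.0 * total
```

The series converges slowly (error roughly 1/n after n terms), which makes the term count a convenient knob for sizing a CPU benchmark.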
tests/functional_tests/mining.py
Outdated
while height < target_height:
    seen_height = height
    for _ in range(timeout):
        time.sleep(1)
        delta = datetime.datetime.now() - start
This is not a correct methodology (depends on system clock and timezone), use time.monotonic.
Done.
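A minimal sketch of the suggested time.monotonic methodology (the sleep stands in for one iteration of the test's polling work):

```python
import time

start = time.monotonic()  # monotonic clock: immune to system clock and timezone changes
time.sleep(0.05)          # stand-in for the mining poll loop's per-iteration wait
elapsed = time.monotonic() - start
```

Unlike datetime.datetime.now(), time.monotonic() can never go backwards, so elapsed-time deltas stay correct even across NTP adjustments or DST transitions.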
tests/functional_tests/mining.py
Outdated
timeout_base_mine = 20
cores_init = multiprocessing.cpu_count()  # RX init uses all cores
cores_mine = res_status.threads_count  # Mining uses parametric number of cores
timeout_init, timeout_float_init = get_timeout_int_float(time_pi_all_cores, timeout_base_init, cores_init)
Why would anyone need the value as an int and a float? All the relevant python timing functions use floats.
I need int for printing it out to the user and for it to be a valid argument for the for loop a few lines below. I wanted to avoid multiple conversions to int along the way.
Float is obviously used for calculations, in order not to lose precision.
The loop issue can be trivially solved, just use math.ceil (i.e. range(math.ceil(timeout))), or switch to a while time.monotonic() < timeout_time format. As for printing, I don't know what you mean, floats can be printed and formatted arbitrarily, and the precise value seems more useful than a rounded one.
Ok. Sounds good.
Done.
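The deadline-based loop suggested above can be sketched as follows (wait_until and the 0.01 s poll interval are illustrative, not taken from the PR):

```python
import time

def wait_until(predicate, timeout):
    """Poll predicate until it returns True or the float timeout (in seconds) expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(0.01)  # poll interval; tune to the cost of the predicate
    return False
```

This shape keeps the timeout a float throughout, so no int conversions are needed anywhere.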
# Inputs are the number of digits to calculate for each CPU
inp = [num_digits] * cores
# Start measuring the time
start = datetime.datetime.now()
The methodology here ends up counting process startup time, which could lead to wildly inaccurate results. A simple workaround would be to have each task time itself and push results onto a Queue.
Yes, it does count the process startup time, but if it increases by a large margin, so will the equivalent functionality of the RX init just in the next 2 seconds after this calculation.
Well, first off, it's not accurate to assume process startup time will be relatively constant over a short timeframe, but second, even if one makes such an assumption, I fail to see how it resolves my concern.
Edit: If you want to also gauge process startup time in order to adjust the init timeout, that would be reasonable, but, in that case, you should do it separately (though I think assuming some fixed value, e.g. 1s, would be fine). Combining two signals into a single benchmark only works when you're using it as a baseline for the superposition, but, here, you're using it as a baseline for both the superposition and one of its constituent factors, which logically doesn't work.
You are right from the logical PoV. I would just like to cover the startup with a calculation long enough that the process init becomes negligible. I have a small problem with your approach: it makes a simple thing slightly more complex. I wanted to solve this by averaging the returned processing times of the threads. I only fear that the time calculation within the threads will not be accurate. Do you think the monotonic timer will help?
I think running a benchmark with a broken methodology makes everything more complex, as it's very difficult to reason about a calculation with many different noise sources (are they acceptable? did the code intentionally leave them in play, and if so, why? etc). Using a Queue maybe adds 5-10 LoC and much clarity. As for timing process startup, I proposed just not even bothering and assuming some fixed value, so that's not really a concern, I think (you could easily time it, too, but I don't have a strong opinion on the matter).
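A minimal sketch of the Queue-based methodology described above (timed_task and the square-summing workload are hypothetical stand-ins for the Pi calculation):

```python
import multiprocessing
import time

def timed_task(queue):
    """Each worker times only its own computation, excluding process startup."""
    start = time.monotonic()
    sum(i * i for i in range(100000))  # stand-in workload
    queue.put(time.monotonic() - start)

def benchmark(nproc):
    """Average per-worker compute time across nproc worker processes."""
    queue = multiprocessing.Queue()
    workers = [multiprocessing.Process(target=timed_task, args=(queue,))
               for _ in range(nproc)]
    for w in workers:
        w.start()
    results = [queue.get() for _ in workers]  # one timing per worker
    for w in workers:
        w.join()
    return sum(results) / len(results)
```

Because each worker clocks itself, the parent's process-spawning overhead never contaminates the measurement; it can be accounted for separately (or, as suggested, assumed as a fixed value).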
@@ -131,7 +131,7 @@ jobs:
      - name: install monero dependencies
        run: sudo apt -y install build-essential cmake libboost-all-dev miniupnpc libunbound-dev graphviz doxygen libunwind8-dev pkg-config libssl-dev libzmq3-dev libsodium-dev libhidapi-dev libnorm-dev libusb-1.0-0-dev libpgm-dev libprotobuf-dev protobuf-compiler
      - name: install requests
-       run: pip install requests
+       run: pip install requests psutil monotonic
Why do you need this library? time.monotonic is a python built-in.
I haven't looked at the runner in detail, but these files have a shebang line which claims python3 is to be expected, so it's problematic to be using python2. I also note that the files import print_function from __future__, which would imply python2 is in use. Overall, I would try to clean things up to use python3, as python2 is EOL and dying rather quickly (the only reason it's still available, and sometimes the default in certain distros, is because core system components have assumed it, and it takes a while to make all the requisite changes, though I think most distros are quite close and it'll be gone in a year).
I should probably contextualize my comments up to this point, as I think there's a danger they could be misconstrued: Normally system benchmarks would be written in C, so as to minimize overhead and maximize accuracy. However, creating an accurate measure of "system performance" is itself a fraught and difficult exercise, and one that I don't think proves necessary here. Instead, I imagine this PR as introducing a "smoke test" for something having gone very wrong with the mining code (say, a 5x or 10x regression). A more focused, narrowly bounded mining benchmark might be a good idea, but you can't reasonably achieve that result in a noisy neighbor environment. To that end, your bounds in this PR can be quite loose (1.5 or even 2x baseline), so you shouldn't need to worry about python version (if you're seeing > 50% delta based on python version, probably you should use a different test). I still think proper benchmark hygiene (using a sane timing methodology) should be observed, to the extent that it's not terribly difficult and in some ways makes the code both clearer and more useful (a poorly constructed benchmark begs the question, "Was this an oversight, or just intentionally lax in its methodology?" While code comments can answer that question, it's often simpler to avoid asking it in the first place).
Still to do for the new C++ timing app: Copyright and documentation. Done.
decoded = time_calc.decode('utf-8')
try:
    miliseconds = int(decoded)
except ValueError as ex:
The try/except seems unproductive: if conversion fails it'll throw a ValueError which includes the value and rationale for the failure. Plus, utility functions should generally avoid print, unless their primary purpose is producing output.
Removed.
by running numerical calculations
"""

from __future__ import print_function
Makes no sense in python3.
Until #7396 is merged, we're stuck with Python2
I see your point, but I think, perhaps, you've ended up making your life more difficult by splitting commits in this way: Though each commit and PR should achieve a single logical goal, there are cases where a clear dependency exists, and ignoring the dependency therefore introduces extra work and complexity, the very reasons to ordinarily avoid PRs with multiple aims. For example, this PR adds multiple python2-specific pieces of code, which will eventually have to be removed. On top of that, the files all still mark themselves as targeting python3 via the shebang line, which makes life very confusing for the reader. You could have, instead, either branched this PR from the "use python3" PR, or introduced a more limited change which just runs the mining tests with python3, as a single commit which forms part of this multi-commit PR. I don't think it's an issue worthy of further discussion, but certainly something to consider in future work.
def available_ram_gb():
    ram_bytes = psutil.virtual_memory().available
    kilo = 1024.0
    ram_gb = ram_bytes / kilo / kilo / kilo
Not super important, but this seems rather verbose and unnecessarily slow (three divisions), when you could just do:
return psutil.virtual_memory().available / (1 << 30)
I used your previous suggestion of kilo ** 3 for better readability.
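The reviewer's single-division idea can be isolated into a tiny helper (bytes_to_gb is an illustrative name, not from the PR):

```python
def bytes_to_gb(n_bytes):
    # 1 << 30 == 1024 ** 3, so one division replaces three chained ones
    return n_bytes / (1 << 30)
```

In the actual test this would be fed psutil.virtual_memory().available; the helper itself stays pure and trivially testable.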
while height < target_height:
    seen_height = height
-   for _ in range(timeout):
+   for _ in range(int(math.ceil(timeout))):
int is superfluous, ceil returns an int.
Only since Python3
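For reference, under Python 3 the int() wrapper is indeed redundant, since math.ceil already returns an int (in Python 2 it returned a float):

```python
import math

timeout = 12.4
iterations = math.ceil(timeout)  # Python 3: yields 13, already an int
```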
tests/functional_tests/mining.py
Outdated
timeout_init = calc_timeout(timeout_base_init, time_pi_all_cores, cores_init)
timeout_mine = calc_timeout(timeout_base_mine, time_pi_single_cpu, cores_mine)

msg = "Timeout for {} adjusted for the currently available CPU power, is {} s"
Similar thought here regarding using number formatting rather than rounding the injected value.
Done.
tests/functional_tests/mining.py
Outdated
@@ -71,6 +75,12 @@ def create(self):
 def mine(self, via_daemon):
     print("Test mining via " + ("daemon" if via_daemon else "wallet"))

+    time_pi_single_cpu = self.measure_cpu_power_get_time(1)
+    time_pi_all_cores = self.measure_cpu_power_get_time(multiprocessing.cpu_count())
Would suggest putting the core count in a variable, as you use it repeatedly in calculations (technically there is such a variable now, cores_init, used later in this function and not here; to my mind, it'd be clearer to have a single variable, e.g. nproc, used everywhere).
Done.
Regarding nproc, such a variable name would suggest the source and quantity of the variable, but not the intention of how it's meant to be used.
All I will say here is that if you find nproc insufficient to document intent, you can always alias it while still avoiding repeatedly querying the machine's core count.
The suggestion was correct, and I query the cpu count only once now (well twice, since the method itself is being called twice)
tests/functional_tests/mining.py
Outdated
-assert False, 'Failed to mine successor to block %d (initial block = %d)' % (seen_height, prev_height)
-timeout = 10
+assert False, 'Failed to mine successor to block %d (initial block = %d) after %d s. RX initialized = %r' % (seen_height, prev_height, round(seconds_passed), init_rx)
+if not init_rx:
You don't really need this boolean, can just use seen_height == prev_height.
I need it for readability. Same as with documentation and self-documenting code: the goal is not only to write the most minimal code, but code that can be understood by others.
It was hard enough to understand the flow of this test at first glance.
I don't find the variable terribly offensive or anything, but it seems you're perhaps papering over the real readability issue, which is, I think, that prev_height is a rather poor name, and should be something along the lines of initial_height. With that change, I think the conditional proves quite clear without an extra variable, but I won't belabor the point.
The initial_height is indeed a much better name for the variable, especially since it's a constant.
Still, the rx_inited var is used twice: once in the print function and once below the for loop. Using the comparison twice is error-prone IMHO.
BTW, I added an additional printout, which will be executed if one while loop's iteration is not enough.
It sounds to me like we're nearing the end of the review. I'm rebasing and reverting the turnaround time simplifications.
Force-pushed from 3785011 to 0fcb9d3 (Compare)
Please make sure to open a PR which removes the python2 compat code (monotonic module, __future__ imports, etc) once the python3 test runner PR gets merged.
Thanks for the review.
For the record, we're talking about #7396, which needs to be approved and merged first.
A successful ad-hoc run under macOS, requested by @selsta.
Force-pushed from 0fcb9d3 to b795a94 (Compare)
Just added the Python dependency list to the documentation.
As requested by @selsta, I ran the mining test multiple times (19) and each of the runs was successful. This should serve as the ultimate proof that the concept works fine. Within the logs of the tests, you're looking for the string:
Force-pushed from 7e1fb66 to 553ba67 (Compare)
Rebased and integrated latest test fixes.
Got it. Will resolve very soon.
Printing also available RAM. Add comprehensive description.
Force-pushed from 553ba67 to 45f01f5 (Compare)
Intro
A more serious approach to determining the appropriate timeout for the mining test in functional_tests_rpc. The timeouts are calculated separately for RX init and for the mining of subsequent blocks themselves. Available RAM is also printed, in case the RX init went wrong, since it needs above 3 GB of RAM. Added a comprehensive description.
More details:
As per the discussion with @iamamyth in #7371, the adjustment is done by testing the time of generation of N next digits of Pi - once for a single core (mining use case) and once for as many cores as are available (RX init). This test helps gauge the currently available CPU processing power, even in the event of ctest being run in parallel (with -j$(nproc)).
The obtained measurements serve as multipliers for the assumed timeout constants. For the case of RX init, the number of cores serves additionally as a divisor of the timeout constant, since the availability of more cores decreases the init time (ideally) linearly.
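The multiplier/divisor scaling described above can be sketched as follows (this is a hypothetical reconstruction; the actual calc_timeout in the PR, and the 1.0 s reference baseline, may differ):

```python
def calc_timeout(timeout_base, measured_pi_time, cores):
    """Hypothetical sketch: scale a base timeout by observed CPU speed.

    A slower machine takes longer to compute the Pi digits, so the
    measured time inflates the base timeout; more cores shrink the
    RX-init timeout roughly linearly.
    """
    reference_pi_time = 1.0  # assumed seconds on a reference machine
    return timeout_base * (measured_pi_time / reference_pi_time) / cores
```

For example, a 20 s base timeout on a machine measured at 2x the reference time, with 4 cores available for RX init, would yield a 10 s timeout.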
Example runs (see the bottom of the tests log): -1- -2- -3- -4-
Example log:
6: [TEST STARTED] mining
6: Resetting blockchain
6: Creating wallet
6: Test mining via daemon
6: Available RAM = 17.0 GB
6: Measuring the currently available CPU power...
6: Time taken to calculate 10000 Pi digits on 4 core(s) was 2.03 s.
6: Measuring the currently available CPU power...
6: Time taken to calculate 10000 Pi digits on 1 core(s) was 1.31 s.
6: Timeout for init, adjusted for the currently available CPU power, is 31 s
6: Timeout for mining, adjusted for the currently available CPU power, is 27 s
6: Time taken for RX init + mining 1st block = 12 s.
6: Time taken for total mining = 12 s.
6: Test mining via wallet
6: Available RAM = 16.8 GB
6: Measuring the currently available CPU power...
6: Time taken to calculate 10000 Pi digits on 4 core(s) was 2.13 s.
6: Measuring the currently available CPU power...
6: Time taken to calculate 10000 Pi digits on 1 core(s) was 1.3 s.
6: Timeout for init, adjusted for the currently available CPU power, is 33 s
6: Timeout for mining, adjusted for the currently available CPU power, is 27 s
6: Time taken for RX init + mining 1st block = 11 s.
6: Time taken for total mining = 23 s.
6: Test submitblock
6: Resetting blockchain
6: Test RandomX
6: Recreating the chain
6: Mining from genesis block again
6: Adding one to reorg
6: [TEST PASSED] mining
Work time: ~15 h