Plugin becomes CPU hog on larger nodes #86
metrics.log is at https://nopaste.net/mO6ckGOgZp
Tried the new version. While it got much better, it's still consuming ~65% CPU constantly for the go-lnmetrics process alone.
I'm reopening it because I think the problem now is the list of forwarding payments. Do you have a lot of these?
While testing yesterday over 24h: 40 forwards, 430 local fails, 2700 non-local fails. If you poll the whole listforwards, that is ~400k elements in the forwards array.
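To get a feel for that cost, here is a minimal sketch (not taken from go-lnmetrics) that times one full listforwards poll over Core Lightning's JSON-RPC unix socket and counts the returned elements; the socket path is an assumption and has to match your own lightning-dir.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net"
	"time"
)

func main() {
	// Assumed socket path; point this at your own lightning-dir.
	conn, err := net.Dial("unix", "/home/user/.lightning/bitcoin/lightning-rpc")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	req, _ := json.Marshal(map[string]interface{}{
		"jsonrpc": "2.0",
		"id":      1,
		"method":  "listforwards",
		"params":  map[string]interface{}{},
	})

	start := time.Now()
	if _, err := conn.Write(req); err != nil {
		log.Fatal(err)
	}

	// Decode the single JSON-RPC response; on a busy node the forwards
	// array alone can hold hundreds of thousands of entries.
	var resp struct {
		Result struct {
			Forwards []json.RawMessage `json:"forwards"`
		} `json:"result"`
	}
	if err := json.NewDecoder(conn).Decode(&resp); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%d forwards fetched and decoded in %s\n",
		len(resp.Result.Forwards), time.Since(start))
}
```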
Here we go! We found the bottleneck, so this is something that we need to fix on the Core Lightning side! I will look into it!
Could the plugin not just hook into any forwards happening while it's running? Of course it would only get those forwards that happen while it's running, but this seems reasonable.
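A rough sketch of that idea (assumptions: this is not how go-lnmetrics is actually structured, and the in-memory slice stands in for real persistence) would be a plugin that subscribes to Core Lightning's forward_event notification instead of polling listforwards:

```go
package main

import (
	"encoding/json"
	"os"
)

// request covers both RPC calls (with an id) and notifications (without).
type request struct {
	ID     *json.RawMessage `json:"id,omitempty"`
	Method string           `json:"method"`
	Params json.RawMessage  `json:"params"`
}

// respond writes a JSON-RPC result back to lightningd on stdout.
func respond(id *json.RawMessage, result interface{}) {
	msg, _ := json.Marshal(map[string]interface{}{
		"jsonrpc": "2.0", "id": id, "result": result,
	})
	os.Stdout.Write(append(msg, '\n', '\n'))
}

func main() {
	var forwards []json.RawMessage // in-memory only; lost on restart

	dec := json.NewDecoder(os.Stdin)
	for {
		var req request
		if err := dec.Decode(&req); err != nil {
			return
		}
		switch req.Method {
		case "getmanifest":
			// Subscribing to forward_event makes lightningd push each
			// forward to the plugin, so no listforwards polling is needed.
			respond(req.ID, map[string]interface{}{
				"options":       []interface{}{},
				"rpcmethods":    []interface{}{},
				"subscriptions": []string{"forward_event"},
				"dynamic":       true,
			})
		case "init":
			respond(req.ID, map[string]interface{}{})
		case "forward_event":
			// Notification (no id): just record the payload.
			forwards = append(forwards, req.Params)
		}
	}
}
```

The drawback is exactly the one raised below: the plugin only sees forwards delivered while it is running.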
And if you are not running the plugin? You will break the metrics. This introduces a lot of work to keep everything in sync, and I am not going to start looking into it! c-lightning needs to provide the forward payments since a timestamp; all the other solutions are only hacks, very bad hacks!
If you don't run the metrics plugin properly, you won't get proper metrics. Sounds reasonable to me. There is important-plugin for that, too. But well, your choice obviously. Another thing: if the plugin works on historical data anyway, and the forwards are timestamped, why would it constantly poll them? Why not simply not poll for 100x as long as the last poll took, so CPU consumption stays below 1%? It would still be undesirable if the node became unresponsive while the data is being fetched and processed, of course. No idea how/if this can be done in parallel. Of course, proper filtering and pagination would also solve this - and a few other problems.
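As a sketch of that back-off idea (pollForwards is a hypothetical stand-in for the real listforwards poll plus processing, and the one-minute floor is an added assumption), the loop could look like this:

```go
package main

import (
	"log"
	"time"
)

// pollForwards is a hypothetical placeholder for the expensive
// listforwards poll plus metric processing.
func pollForwards() error {
	time.Sleep(200 * time.Millisecond) // simulate a slow RPC round trip
	return nil
}

func main() {
	for {
		start := time.Now()
		if err := pollForwards(); err != nil {
			log.Println("poll failed:", err)
		}
		elapsed := time.Since(start)

		// Sleep 100x the time spent working, so the duty cycle stays
		// around 1/101, i.e. roughly 1% CPU regardless of node size.
		wait := 100 * elapsed
		if wait < time.Minute {
			wait = time.Minute // assumed floor so small nodes don't spin
		}
		time.Sleep(wait)
	}
}
```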
Filtering and pagination work by timestamps if you want to iterate over something.
How can you get the metrics for the last 30 days if the plugin was not started? What does your "run the metrics plugin properly" look like? If there is a bug in the software, you invalidate the whole metrics collection, and with this metrics architecture I cannot accept that, because this is a metrics collection service, not a simple plugin that iterates in a fancy way over the forwards.
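Purely as an illustration of the timestamp-based pagination discussed above: the sketch below shows what such a filtered listforwards call could look like from the plugin's side. The "start_time" and "limit" parameters and the callRPC helper are invented for the example; Core Lightning did not expose server-side filtering like this at the time, which is exactly the missing piece.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// forward mirrors only the field needed here; real listforwards entries
// carry many more.
type forward struct {
	ReceivedTime float64 `json:"received_time"`
}

// listForwardsSince sketches the call the plugin would like to make.
// callRPC is an assumed helper that sends one JSON-RPC request.
func listForwardsSince(callRPC func(method string, params interface{}) (json.RawMessage, error),
	since float64, limit int) ([]forward, error) {

	raw, err := callRPC("listforwards", map[string]interface{}{
		"start_time": since, // hypothetical: only forwards newer than this
		"limit":      limit, // hypothetical: page size
	})
	if err != nil {
		return nil, err
	}

	var resp struct {
		Forwards []forward `json:"forwards"`
	}
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}
	return resp.Forwards, nil
}

func main() {
	// Dummy transport returning a canned response, just to exercise the call.
	dummy := func(method string, params interface{}) (json.RawMessage, error) {
		return json.RawMessage(`{"forwards":[{"received_time":1650000000.1}]}`), nil
	}
	fwds, err := listForwardsSince(dummy, 1640000000, 1000)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(len(fwds), "forwards since cursor")
}
```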
I'm running a node with ~180 channels. Normally my clightning process uses about 1% CPU. If I start running the metrics collector, things change to ~150% (equally split between the plugin and lightningd). Also (obviously?) interacting with lightningd becomes horribly slow: listpeers etc. takes ~7s with the plugin running and 0.03s without.
Sadly that makes it unusable for me.