
Plugin becomes CPU hog on larger nodes #86

Open
svewa opened this issue Mar 10, 2022 · 9 comments · Fixed by #88
Labels
metric_one Referring to the metric_one of the ln-metrics-rfc · performance ⚡ Performance issue

Comments


svewa commented Mar 10, 2022

I'm running a node with ~180 channels. Normally my clightning process uses about 1% CPU. If I start running the metrics collector, that changes to ~150% (split roughly equally between the plugin and lightningd). Also (obviously?) interacting with lightningd becomes horribly slow: listpeers etc. takes ~7s with the plugin running and 0.03s without.

Sadly that makes it unusable for me.

@vincenzopalazzo added the metric_one and performance ⚡ labels Mar 10, 2022
@vincenzopalazzo vincenzopalazzo self-assigned this Mar 10, 2022

svewa commented Mar 10, 2022

metrics.log is at https://nopaste.net/mO6ckGOgZp


svewa commented Apr 3, 2022

Tried the new version. While it got much better, it's still consuming ~65% CPU constantly for the go-lnmetrics process alone.

vincenzopalazzo (Member) commented:

> Tried the new version. While it got much better, it's still consuming ~65% CPU constantly for the go-lnmetrics process alone.

I'm reopening this because I think the bottleneck is now the list of forwarded payments. Do you have a lot of those?


svewa commented Apr 4, 2022

While testing yesterday, over 24h: 40 forwards, 430 local fails, 2700 non-local fails; on busier days 4x as much.

If you poll the whole listforwards, there are ~400k elements in the forwards array.

vincenzopalazzo (Member) commented:

> if you poll the whole listforwards, it is ~400k elements in the forwards-array.

Here we go! We found the bottleneck, so this is something we need to fix on the Core Lightning side!

I will look into it!


svewa commented Apr 4, 2022

Could the plugin not just hook into any forwards happening while it's running? Of course it would only see the forwards that happen while it is running, but that seems reasonable.
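The suggestion above amounts to subscribing to Core Lightning's forward_event notification and tallying events as they arrive, instead of re-reading the whole listforwards array on every poll. A minimal sketch of the tallying half (the forwardEvent struct and tally function are illustrative names, not part of go-lnmetrics; only the status field of the real notification payload is assumed here):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// forwardEvent mirrors the one field of a forward_event notification
// payload that this tally needs.
type forwardEvent struct {
	Status string `json:"status"` // e.g. "settled", "failed", "local_failed"
}

// tally counts forwards by status as notifications arrive, so the
// plugin never has to walk the full ~400k-element forwards array.
func tally(counts map[string]int, raw []byte) error {
	var ev forwardEvent
	if err := json.Unmarshal(raw, &ev); err != nil {
		return err
	}
	counts[ev.Status]++
	return nil
}

func main() {
	counts := map[string]int{}
	for _, raw := range [][]byte{
		[]byte(`{"status":"settled"}`),
		[]byte(`{"status":"local_failed"}`),
		[]byte(`{"status":"settled"}`),
	} {
		if err := tally(counts, raw); err != nil {
			panic(err)
		}
	}
	fmt.Println(counts["settled"], counts["local_failed"]) // 2 1
}
```

As the next comment points out, the drawback is that events arriving while the plugin is down are simply lost.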


vincenzopalazzo commented Apr 4, 2022

> could the plugin not just hook into any forwards happening while it's running? of course, it would only get those forwards that happen while it's running, but this seems to be reasonable.

And if you are not running the plugin? You will break the metrics.

This introduces a lot of work to keep everything in sync, and I'm not going to start looking into it!

c-lightning needs to provide the forward payments since a given timestamp; all the other solutions are only hacks, very bad hacks!


svewa commented Apr 4, 2022

If you don't run the metrics plugin properly, you won't get proper metrics. Sounds reasonable to me. There is important-plugin for that, too. But well, your choice obviously.

Another thing: if the plugin works on historical data anyway, and the forwards are timestamped, why does it constantly poll them? Why not sleep 100x as long as the last poll took, so CPU consumption stays under 1%? It would still be undesirable if the node became unresponsive while the data is being fetched and processed, of course. No idea whether this can be done in parallel.

Of course, proper filtering and pagination would also solve this, and a few other problems.


vincenzopalazzo commented Apr 4, 2022

> Of course proper filtering and pagination would also solve this - and a few other problems.

Filtering and pagination work by timestamps if you want to iterate over something.

> if you don't run the metrics plugin properly you won't get proper metrics. Sounds reasonable to me. There is important-plugin for that, too. But well, your choice obviously.

How can you get the metrics for the last 30 days if the plugin was not started? What does your "run the metrics plugin properly" look like? If there is a bug in the software, you invalidate the whole metrics collection, and with this metrics architecture I cannot accept that, because this is a metrics collection service, not a simple plugin that iterates in a fancy way over listforwards.
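The timestamp-based filtering both sides are asking Core Lightning for would look roughly like this on the consumer side: keep the timestamp of the last forward already processed and take only newer entries. A client-side sketch (the forward struct and since function are illustrative; listforwards entries do carry a received_time timestamp):

```go
package main

import "fmt"

// forward holds the one listforwards field this sketch filters on.
type forward struct {
	ReceivedTime float64 // received_time of the forward, seconds since epoch
}

// since returns only the forwards received at or after ts, which is
// the filter the plugin would ideally get server-side instead of
// re-downloading all ~400k entries every poll.
func since(forwards []forward, ts float64) []forward {
	var out []forward
	for _, f := range forwards {
		if f.ReceivedTime >= ts {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	fwds := []forward{{100}, {200}, {300}}
	fmt.Println(len(since(fwds, 150))) // 2
}
```

Doing this client-side still pays the cost of transferring the whole array over the RPC socket, which is why the fix belongs on the Core Lightning side.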
