Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce mem and cpu usage of onchaind for long-lived channels #4250

Merged

Conversation

rustyrussell
Copy link
Contributor

Noticed my node's onchaind was chewing memory and CPU (I run under valgrind!), caught the IO from onchaind and created this replay test. It will break as soon as we change onchaind's wire format, but it's a good test for now.

onchaind/onchaind.c Outdated Show resolved Hide resolved
Copy link
Member

@cdecker cdecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from the elements question I think this is good to go 👍

ACK 40931f1

Comment on lines +155 to +157
/* FIXME: This doesn't work for elements? */
if (chainparams->is_elements)
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, I don't think it should be too much of an issue, given that we currently only have one main-asset input, which also makes the below if (!amount_asset_is_main(&amt)) always false, but better keep it in in case we eventually add multi-input and multi-asset channels.

Was there a concrete error that was triggered on elements, or is this more of a precaution?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we had a -1ULL satoshis value. Then added this test and this function caused an onchaind failure (presumably by giving the wrong result).

Copy link
Collaborator

@niftynei niftynei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 40931f1

Under some circumstances, valgrind complains.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In particular, this made valgrind OOM, and chewed much CPU.  I dumped the
input and output into a file to allow easy replay.

This will break as soon as we change onchaind's wire format, but it will
serve its purpose until then!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We assert() this in onchaind while grinding fees; better to free newtx.

Before this we hit 530MB, after a mere 2.5MB.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: onchaind uses much less memory on unilateral closes for old channels.
We try signatures to see which HTLC (we can have many) is the right one;
we can trivially match htlcs against commitment tx outputs, but the CTLV
can vary, and that's inside the htlc tx itself.

By sorting them, it's easy to skip comparing duplicates:

Time before: 2m32.547s
Time after: 1m6.984s

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The commitment tx uses the same feerate as the HTLC txs (which we have
to grind to find), but we can't use it directly since the fee could be
increased by the presence of dust HTLCs.  We can still use it to cap
the maximum though.

Time before: 1m6.984s
Time after:  0m15.794s

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: onchaind is much faster when unilaterally closing old channels.
@rustyrussell
Copy link
Contributor Author

Trivial rebase, plus changelog additions.

@cdecker
Copy link
Member

cdecker commented Dec 7, 2020

ACK ea4d6dc

@cdecker cdecker merged commit e13e3e9 into ElementsProject:master Dec 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants