sweep: create sweeper #1960
We can be smarter about handling these errors. If the error is non-recoverable, then publishing in the next block won't really help (also can potentially get us banned by peers if we keep sending them invalid transactions). In that case we should return the error and exit IMO to surface the bug. For double spends we can detect which input was already spent, remove it from the sweep, and retry. I think this is what is implicitly done already (since an input will be removed from the set after spend is detected), but I think it would help to make it more clear. We do something similar in the
It should be handled, but my suggestion is to reduce the scope of this PR to not include it. This is an existing problem.
Yes, I think this is actually my suggestion. Maybe we are actually talking about the same solution here, the behavior is just not obvious. See comment above.
I don't think retry logic is strictly needed, and it can be added in a follow-up PR.
@Roasbeef your comments have been processed.
Discussed simplification with @halseth. We could take out the reschedule logic and just try to sweep once every block: start the batch timer at the beginning of every block and, if it expires, try to sweep. The downside is that a transaction that is stuck because of low fees will keep blocking all subsequent sweep txes if the backend doesn't support RBF. (This is something that would be solved in a probabilistic way by the randomized exponential backoff.) It is also a regression from nursery, because nursery does skip over stuck txes, as they are organized per height. If we decide to go down this path, the follow-up PR with exponential backoff should probably go in before release. @Roasbeef your input please
Added one commit that handles errors in a strict way, not to camouflage potential bugs.
halseth left a comment
This is starting to look pretty good to me :)
Can be moved to immediately before the loop, I guess. Rationale being that we might encounter an error before ever using the result of this calculation :)
since this is only used for calculating the dust limit(?), can we instead store the dust limit directly here?
I wanted to have the dust limit calculation inside txgenerator to logically group the functionality together. generateInputPartitionings takes two fees and uses them as the basis to calculate the partitionings.
If this happens, it feels like the whole sweeper should exit.
What do you mean by exit? The only thing that is running is the sweeper main event loop, and that one is exited.
Do you have an estimate of how many blocks it takes, on average, from sending the input to the sweeper until this condition is met? (if it never successfully confirms)
I think ((1 << 10) - 1) / 2 😄 so about 500 blocks.
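The arithmetic behind this estimate can be reproduced with a short sketch. It assumes ten attempts (as the 1 << 10 suggests), a retry window that doubles from 1 to 512 blocks, and a mean wait of roughly half the window per attempt. expectedBlocks is a hypothetical helper.

```go
package main

import "fmt"

// expectedBlocks approximates the expected number of blocks until the last
// attempt, assuming attempt i (0-based) waits on average half of a window of
// 2^i blocks. The sum is (2^maxAttempts - 1) / 2.
func expectedBlocks(maxAttempts int) float64 {
	total := 0.0
	for i := 0; i < maxAttempts; i++ {
		window := float64(uint64(1) << uint(i))
		total += window / 2
	}
	return total
}

func main() {
	// (1 + 2 + 4 + ... + 512) / 2 = 1023 / 2
	fmt.Println(expectedBlocks(10)) // prints 511.5
}
```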
is this not critical at this point?
It is a pre-existing bug that I wanted to mark with a TODO.
I think this is still happy flow, therefore Info.
Why not just read the buckets directly? The migration logic would be more precise that way, with less risk of accidentally reaching into the wrong bucket if a future key/bucket in the codebase starts with the prefix.
Made a commit with some additional logs that should likely be cherry picked in: Here's the branch itself as I'm modifying commits as I test: https://github.com/Roasbeef/lnd/pull/new/sweeper-debug
With the patch above, I ran on a very old node with an ancient nursery and saw no transactions migrated:
Tested nursery tx migration again. Seems to work on my machine.
Modifying the static fees is not thread safe. In this commit the fees are made immutable.
We need to distinguish an lnd build for the purpose of integration testing from a regular dev build. This makes it possible to adapt parameters to let integration tests run faster (for example: sweeper batch window).
This commit is a preparation for the implementation of remote spend detection. Remote spends may happen before we broadcast our own sweep tx. This calls for accurate height hints.
This commit adds a function that takes a set of inputs and splits them in sensible sets to be used for generating transactions.
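To illustrate the idea of splitting inputs into sensible sets, here is a simplified sketch. It is not the PR's generateInputPartitionings: the input type, the per-input fee field, the maxInputs cap and the dustLimit parameter are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// input is a hypothetical sweep input.
type input struct {
	id    string
	value int64 // amount carried by the input
	fee   int64 // assumed cost of adding this input to a tx
}

// yield is what the input contributes after paying for itself.
func (i input) yield() int64 { return i.value - i.fee }

// partition drops inputs that cost more than they yield, sorts the rest by
// yield (highest first), and groups them into sets of at most maxInputs.
// Sets whose total yield is below dustLimit are not emitted.
func partition(inputs []input, maxInputs int, dustLimit int64) [][]input {
	if maxInputs < 1 {
		return nil
	}
	candidates := make([]input, 0, len(inputs))
	for _, in := range inputs {
		if in.yield() > 0 {
			candidates = append(candidates, in)
		}
	}
	sort.SliceStable(candidates, func(a, b int) bool {
		return candidates[a].yield() > candidates[b].yield()
	})

	var sets [][]input
	for len(candidates) > 0 {
		n := maxInputs
		if n > len(candidates) {
			n = len(candidates)
		}
		set := candidates[:n]
		candidates = candidates[n:]

		var total int64
		for _, in := range set {
			total += in.yield()
		}
		if total < dustLimit {
			// Inputs are sorted, so any later set would yield even less.
			break
		}
		sets = append(sets, set)
	}
	return sets
}

func main() {
	inputs := []input{
		{"a", 100, 10}, {"b", 60, 10}, {"c", 15, 10}, {"d", 5, 10},
	}
	for i, set := range partition(inputs, 2, 10) {
		fmt.Printf("set %d: %v\n", i, set)
	}
}
```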
This commit adds a store for the sweeper. The sweeper needs minimal persistent data to be able to recognize its own sweeps.
In this commit, the sweep package is extended from just tx generation to an active sweeper that collects sweep inputs and autonomously proceeds to publish the sweep tx after the batch window time interval has passed without new inputs being added.
Previously, nursery generated and published its own sweep txes. It stored the sweep tx in nursery_store to prevent a new tx with a new sweep address from being generated on restart. In this commit, sweep generation and publication is removed from nursery and delegated to the sweeper. Also the confirmation notification is received from the sweeper.
cfromknecht left a comment
Awesome work @joostjager, excited to finally have this in and help people sweep their stuck outputs! Very happy with the final iteration and the huge reduction in complexity over the nursery, LGTM 🎉
This PR moves the sweeper logic from utxonursery into a separate sweeper struct. A follow-up PR is #2000, which will let resolvers use the sweeper directly instead of going through nursery. The end goal is to remove nursery completely.
In this PR, there is no db migration to clean up now unused nursery store data. This allows us to keep the downgrade path open for a while.