Privacy / anonymization note
The concrete payment identifiers, invoice text, amounts, part amounts, and recipient details below have been anonymized/faked before posting publicly. The important property preserved is the state-machine relationship: pay was redirected to xpay, the RPC returned code=209 with remaining 0msat, then lower-level sendpay_success notifications with a preimage arrived and the successful parts summed to the intended amount.
Summary
On CLN v25.05.1, an application called pay for a BOLT12 invoice. The xpay plugin intercepted the call (pay -> xpay-as-pay). The RPC returned a failure:
JSONRPCError: code=209, message=Failed after 18 attempts...
... Then routing for remaining 0msat failed: amount must be non-zero
However, immediately afterwards CLN emitted successful sendpay_success notifications for the same payment_hash, including the payment_preimage. The recipient later confirmed receiving the exact amount for the same invoice and preimage.
This looks like an xpay MPP state-machine race/bug: xpay reaches a state where the remaining amount is 0msat, still calls askrene getroutes with amount_msat=0, askrene rejects it with amount must be non-zero, and xpay turns that internal zero-amount routing failure into a final user-facing RPC error even though the lower-level payment completed.
Version
Core Lightning / CLN v25.05.1
Relevant setup
- Application/plugin calls pay.
- xpay-handle-pay is enabled, so pay is redirected by xpay.
- Payment was a BOLT12 invoice.
- MPP/partial payment attempts were involved.
Payment identifiers / fake example values
payment_hash:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
payment_preimage:
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
completed amount:
12345000msat = 12345 sats
Observed log sequence, with anonymized/fake values
- The plugin fetched the BOLT12 invoice successfully.
- Application started payment via pay:
plugin-cln4go-plugin: paying the offer ...
plugin-cln-xpay: Got command pay
plugin-cln-xpay: Redirecting pay->xpay
- xpay attempted many payment parts. Several failed with routing/payment errors such as:
Error fee_insufficient
Error temporary_channel_failure
- The RPC caller received an error:
plugin-cln4go-plugin: WrapError received raw error:
JSONRPCError: code=209, message=Failed after 18 attempts...
... Then routing for remaining 0msat failed: amount must be non-zero
- Immediately afterwards, CLN emitted successful sendpay_success notifications for the same payment hash, for example:
{
  "payment_hash": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "payment_preimage": "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb",
  "status": "complete",
  "bolt12": "lni1qqexample..."
}
- The successful parts were:
1000000 msat
8000000 msat
3345000 msat
Sum:
12345000msat = 12345 sats
- The recipient later confirmed receiving exactly:
Amount: 12,345 sats
Invoice: same anonymized lni1...
Lightning preimage: bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
- CLN also logged:
UNUSUAL plugin-cln-xpay:
Destination accepted partial payment, failed a part (...), but accepted only 12345000msat of 12345000msat. Winning?!
Note that in this case the log says 12345000msat of 12345000msat, i.e. the amount accepted equals the intended amount.
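The arithmetic behind that observation can be checked directly. The following is a minimal Python sketch using only the anonymized part amounts listed above; the variable names are illustrative and not taken from CLN.

```python
# Anonymized amounts of the parts reported via sendpay_success, in msat.
successful_parts_msat = [1_000_000, 8_000_000, 3_345_000]
# Intended payment amount (anonymized), in msat.
intended_msat = 12_345_000

delivered_msat = sum(successful_parts_msat)
remaining_msat = intended_msat - delivered_msat

# If the parts cover the full amount, nothing is left to route.
print(delivered_msat, remaining_msat)
```

With these values the remaining amount is zero, which matches the "remaining 0msat" text in the RPC error.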
Suspected code path
pay redirection
In plugins/xpay/xpay.c, handle_rpc_command() logs:
plugin_log(cmd->plugin, LOG_DBG, "Got command %s", ...);
plugin_log(cmd->plugin, LOG_INFORM, "Redirecting pay->xpay");
and replaces the method with:
json_add_string(response, "method", "xpay-as-pay");
Final user-facing error construction
The code=209 error appears to be constructed in plugins/xpay/xpay.c:getroutes_done_err():
if (amount_msat_eq(payment->amount_being_routed, payment->amount))
    complaint = "Then routing failed";
else
    complaint = tal_fmt(tmpctx, "Then routing for remaining %s failed",
                        fmt_amount_msat(tmpctx, payment->amount_being_routed));
payment_give_up(aux_cmd, payment, PAY_UNSPECIFIED_ERROR,
                "Failed after %"PRIu64" attempts. %s%s: %s",
                payment->total_num_attempts,
                payment->prior_results,
                complaint,
                msg);
amount must be non-zero
That text appears to come from plugins/askrene/askrene.c:json_getroutes():
if (amount_msat_is_zero(*amount)) {
    return command_fail(cmd, JSONRPC2_INVALID_PARAMS,
                        "amount must be non-zero");
}
xpay already treats getroutes_for(0msat) as abnormal
In plugins/xpay/xpay.c:getroutes_for():
/* I would normally assert here, but we have reports of this happening... */
if (amount_msat_is_zero(deliver)) {
    payment_log(payment, LOG_BROKEN, "getroutes for 0msat!");
    send_backtrace("getroutes for 0msat!");
}
So it seems xpay knows routing 0msat is an abnormal state, but still continues and sends getroutes to askrene, which then rejects it and causes the final RPC failure.
Winning?! log
The Destination accepted partial payment... Winning?! log is in plugins/xpay/xpay.c:update_knowledge_from_error() and is triggered if a previous attempt succeeded and then another part fails:
if (any_attempts_succeeded(attempt->payment)) {
    payment_log(attempt->payment, LOG_UNUSUAL,
                "Destination accepted partial payment,"
                " failed a part (%s), but accepted only %s of %s."
                " Winning?!",
                description,
                fmt_amount_msat(tmpctx, total_delivered(attempt->payment)),
                fmt_amount_msat(tmpctx, attempt->payment->amount));
}
In this incident it logged 12345000msat of 12345000msat, which appears to be a completed amount, not a partial one.
Expected behavior
If the remaining amount becomes 0msat, xpay should not call askrene getroutes with amount_msat=0.
If sufficient parts have completed and a preimage is known, the high-level RPC should return success.
If there are still in-flight parts, xpay should wait/reconcile rather than returning a final failure caused only by zero-amount route computation.
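To make the three expectations above concrete, here is a small Python model of the decision I would expect. It is only a sketch of the desired behavior, not xpay's actual code: the function, its parameters, and the enum names are all hypothetical.

```python
from enum import Enum


class NextStep(Enum):
    RETURN_SUCCESS = "return_success"
    WAIT_OR_RECONCILE = "wait_or_reconcile"
    REQUEST_ROUTES = "request_routes"


def next_step(amount_msat: int, delivered_msat: int,
              inflight_msat: int, preimage_known: bool) -> NextStep:
    """Hypothetical model of the expected behavior (not xpay's real logic)."""
    # Enough parts completed and the preimage is known: report success.
    if preimage_known and delivered_msat >= amount_msat:
        return NextStep.RETURN_SUCCESS
    remaining_msat = amount_msat - delivered_msat - inflight_msat
    # Nothing left to route: never ask for a 0msat route,
    # wait for in-flight parts or reconcile state instead.
    if remaining_msat <= 0:
        return NextStep.WAIT_OR_RECONCILE
    # Only a strictly positive remainder is worth a getroutes call.
    return NextStep.REQUEST_ROUTES


print(next_step(12_345_000, 12_345_000, 0, True))
print(next_step(12_345_000, 9_000_000, 3_345_000, False))
print(next_step(12_345_000, 9_000_000, 0, False))
```

The point of the model is that a zero remainder never reaches route computation, so it can never surface as a user-facing failure.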
Actual behavior
The high-level pay/xpay RPC returned error code 209, while lower-level sendpay state completed and the recipient received the payment/preimage.
Impact
External applications using pay can mark a payout as failed even though the payment completed. This can lead to incorrect accounting, duplicate payout attempts, or manual reconciliation work.
In our case, the application dashboard did not record the Ocean payout as successful because the RPC returned an error, even though the recipient had the preimage and exact amount.
Current mitigation / question
As a temporary mitigation, I am testing/running:
xpay-slow-mode=true
My understanding is that this is intended to make xpay wait until all current payment parts have completed or failed before returning success/failure to the RPC caller. That seems relevant to this issue because the observed failure was that the high-level RPC returned an error while lower-level sendpay_success notifications for the same payment_hash arrived immediately afterwards.
Could you confirm whether xpay-slow-mode=true is the recommended workaround for this class of issue, or whether you would suggest a different temporary mitigation, such as disabling xpay-handle-pay and falling back to the legacy pay plugin?
I am also adding application-level reconciliation after any pay/xpay error using:
lightning-cli listpays payment_hash=<payment_hash>
lightning-cli listsendpays payment_hash=<payment_hash>
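The reconciliation check I have in mind looks roughly like the Python sketch below. It assumes the parsed JSON output of listsendpays has a "payments" array whose entries carry "status", "amount_msat" and, for completed parts, "payment_preimage"; whether every part sent by xpay reports amount_msat is an assumption I have not verified. The function names are my own, and summing across all entries for a payment_hash is a simplification.

```python
from typing import Optional


def to_msat(value) -> int:
    """Accept an integer msat value or a legacy string such as '1000msat'."""
    if isinstance(value, int):
        return value
    text = str(value)
    return int(text[:-4] if text.endswith("msat") else text)


def reconcile(listsendpays: dict, intended_msat: int) -> Optional[str]:
    """Return the preimage if completed parts cover intended_msat, else None."""
    delivered = 0
    preimage = None
    for part in listsendpays.get("payments", []):
        if part.get("status") != "complete":
            continue  # ignore failed and pending parts
        delivered += to_msat(part.get("amount_msat", 0))
        preimage = part.get("payment_preimage", preimage)
    if preimage is not None and delivered >= intended_msat:
        return preimage
    return None


# Usage with the anonymized values from this report.
sample = {"payments": [
    {"status": "failed", "amount_msat": 500_000},
    {"status": "complete", "amount_msat": 1_000_000, "payment_preimage": "bb" * 32},
    {"status": "complete", "amount_msat": 8_000_000, "payment_preimage": "bb" * 32},
    {"status": "complete", "amount_msat": 3_345_000, "payment_preimage": "bb" * 32},
]}
print(reconcile(sample, 12_345_000))
```

If this returns a preimage after a pay/xpay error, the application would record the payout as completed instead of retrying it.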
The main question is whether xpay-slow-mode=true should be enough to avoid returning a final RPC failure while in-flight MPP parts can still complete, or whether xpay still needs a code fix to avoid calling getroutes for 0msat and converting that into a user-facing payment failure.