Horizon timeout after transaction is added to ledger #305
Comments
A node that succeeds in submitting a transaction returns the following:
"txsub.buffered": {
"value": 0
// ...
"txsub.open": {
"value": 11
Nodes that are timing out return the following:
"txsub.buffered": {
"value": 0
// ...
"txsub.open": {
"value": 58 |
I've managed to bypass the issue by not using the quickstart docker image. I created an alternative setup similar to the above, but on AWS. At first I tried to use the quickstart docker image but got the same results: not a single successful transaction, all timed out. Afterwards I decided to abandon the docker image and set up my own PostgreSQL on RDS and four EC2 instances in four different regions globally, and the problem has disappeared completely. |
Moving to stellar/quickstart#41. |
Have the same issue, but not using quickstart docker image. |
It doesn't look like this is related to the docker image (I can imagine it's not related to docker at all). We currently have one Horizon that is timing out with 504s for all transactions that are submitted to it. Anything I can do to help debugging this? |
Hm, I just set |
@andrenarchy what's the average tx submission rate to all your Horizon instances? Does it happen when you submit a lot of transactions or even for smaller numbers? |
I think I'm able to reproduce it, going to debug it now. |
@bartekn cool, keep me posted! When Horizon was in this state it was enough to submit a single transaction to get the 504. The instance might have seen a large tx submission rate before though. |
After checking the First, let me explain how Now, why Core doesn't add a new transaction to So what's the workaround in case of such situation? Developers have two options:
Core team is discussing a new fee model in stellar/stellar-protocol#133 that's basically changing This issue will be closed after merging #541 which updates the documentation. |
Thanks for the explanation @bartekn! However, this does not explain the issue for me. We observe that the transaction is actually included pretty much immediately after submitting it to Horizon (usually the next ledger). This is also what @oryband described above. Thus I don't see how the window of 3 consecutive ledgers in @bartekn can you please reopen the issue? |
@andrenarchy can you confirm that the response body you're getting is sent by Horizon, i.e. it looks like this: https://github.com/stellar/go/blob/master/services/horizon/internal/render/problem/main.go#L64 Are you using a reverse proxy, or communicating with Horizon directly?
I will confirm with the Core team if this behaviour has been changed recently. |
We are using a reverse proxy but the timeout is not coming from the proxy. I can't reproduce it right now (Horizon only shows this behavior sometimes) but I assume the response looks like the one in the link you sent. Just to be clear: once Horizon is in this state it times out for all transactions sent to it, even when there are fewer than 50 tx per ledger. @bartekn how did you manage to reproduce it? I'm happy to help debugging. :) |
Yes, issue I had is not caused by high load. |
OK, it's possible that it's some obscure sync issue (like #603). Will try to debug it again. |
Hi. So what's up with this issue? Has it been fixed? If not, what is the proper workaround when Horizon returns a 504 error? How can we make sure that the transaction has been submitted and get its resulting hash? We've just stumbled upon this and don't know how to get a proper response from Horizon confirming that the transaction has been successfully submitted. And yes, there is absolutely zero load on our host. We only submit our own transactions from one account. Thanks. |
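For anyone landing here with the same question, one way to confirm inclusion after a 504 is to look the transaction up by its hash. What follows is a minimal sketch, not an official recipe. It assumes the hash was computed client-side before submitting (SDKs can generally do this), and it uses Horizon's `/transactions/{hash}` endpoint, which answers 404 while the transaction is unknown. The names `fetch_transaction` and `wait_for_inclusion`, and the attempt and delay defaults, are hypothetical.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_transaction(horizon_url, tx_hash):
    """Return the transaction record from Horizon, or None on HTTP 404."""
    url = f"{horizon_url.rstrip('/')}/transactions/{tx_hash}"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None  # not (yet) in any ledger known to this Horizon
        raise


def wait_for_inclusion(tx_hash, fetch, attempts=6, delay=5.0, sleep=time.sleep):
    """Poll fetch(tx_hash) until it returns a record or attempts run out.

    `fetch` is injected so the polling logic can be exercised without a network.
    """
    for attempt in range(attempts):
        record = fetch(tx_hash)
        if record is not None:
            return record
        if attempt < attempts - 1:
            sleep(delay)
    return None
```

Usage would look like `wait_for_inclusion(tx_hash, lambda h: fetch_transaction("https://horizon.stellar.org", h))`. A `None` result only means the transaction was not seen within the polling window, not that it failed.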
@gituser It worked for me. |
What does LOG_LEVEL have to do with timeouts? I've just checked I had already |
Related stellar-core issue: stellar/stellar-core#1811 |
I just had again this error on fresh
Transaction of course has been included in the ledger. |
I had to rollback to |
I'm consistently encountering this problem for each transaction I try posting on 0.14.2/10.0.0. Have to resort to using SDF servers. The transactions do actually materialize, but the response is 504. |
@slavkomae interestingly I have this issue only on newer |
The If you are constantly having this issue it means that your stellar-core has problems communicating with the network (i.e. transaction submission will not work if your node is not synced correctly) or the network is congested at the moment and you just need to try again. If you want to receive a final state of a transaction in a defined time please add a I am going to close this as it's not a Horizon issue - Horizon just forwards the transaction to stellar-core and waits for results. It has no control over how the transaction is broadcast in the network. |
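The "try again" advice can be sketched as a small retry loop. This is only an illustration under stated assumptions: the same signed envelope is resubmitted each time, which should be safe because a transaction's sequence number allows it to be applied at most once. `submit` is a hypothetical callable that POSTs the envelope to Horizon and returns `(http_status, body)`; the attempt count and delay are arbitrary.

```python
import time


def submit_with_retry(submit, envelope_xdr, attempts=5, delay=5.0, sleep=time.sleep):
    """Resubmit the same signed envelope while Horizon answers 504.

    `submit` is a hypothetical callable returning (http_status, body).
    Any status other than 504 is returned to the caller immediately.
    """
    status, body = None, None
    for attempt in range(attempts):
        status, body = submit(envelope_xdr)
        if status != 504:
            return status, body
        if attempt < attempts - 1:
            sleep(delay)
    # Still timing out after all attempts: hand back the last 504.
    return status, body
```

Do not rebuild the transaction with a new sequence number between attempts, since that could lead to the payment being applied twice.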
temporary solution that worked for me: stellar/quickstart#41 (comment) |
The issue has nothing to do with timeout. The transaction gets included in the ledger within 1 second, and the transaction time indicates that, so the question is why Horizon responds with 504 when in fact the transaction was included 1 second after posting? For me the issue occurs every time I try to send an outbound transaction through Horizon on a fully synced node. What could be the issue there? My guess is that after a reboot there is a gap between Horizon and stellar-core, so it might be trying to get some ledgers that are not available anymore because they were pruned? Although my instance shows that it's fully synced (right now I'm using latest versions
Of course, but Horizon is used to send this transaction to the Stellar network, so for many applications this is an issue right there. A timeout implies the transaction wasn't successful, but in fact it was. |
Indeed this works in LXC container as well. |
Please note this comment stellar/quickstart#41 (comment) |
@bartekn looks like the PR solves the issue, I have an environment where I can reproduce this and it doesn't appear anymore when testing with the above branch. See my comment on the PR. |
Horizon 0.15.4 contains a fix for this issue. Can others confirm it's working for you? |
The SatoshiPay docker images have been updated to 0.15.4 and things seem to run smoothly with them. We haven't seen this issue very often though so it's hard to tell – maybe there's someone who saw that problem more frequently with the SatoshiPay images and who wants to give the new version a shot? @bartekn thx for the fix, great job! :) |
I'll update my images and have a look. Thanks a lot! |
It's reproducible if not using the Stellar testnet passphrase and Horizon launches before Core is available to it. Submitting a transaction in this case will time out with HTTP 504. |
Yes, seems that Also, I've added this to the systemd units (just to be sure): /etc/systemd/system/stellar-core.service
/etc/systemd/system/horizon.service
|
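Besides ordering the systemd units, a start-up script could wait for Core to report that it is synced before launching Horizon. The sketch below is an assumption-heavy illustration: it assumes stellar-core's HTTP admin interface is reachable on `http://localhost:11626`, that `/info` returns JSON of the form `{"info": {"state": "Synced!"}}`, and the helper names are hypothetical. Check your own Core's `/info` output before relying on it.

```python
import json
import time
import urllib.request


def core_state(info):
    """Extract the state string from a parsed stellar-core /info response."""
    return info.get("info", {}).get("state", "")


def wait_until_synced(fetch_info, attempts=30, delay=2.0, sleep=time.sleep):
    """Return True once fetch_info() reports a state starting with 'Synced'."""
    for attempt in range(attempts):
        try:
            if core_state(fetch_info()).startswith("Synced"):
                return True
        except OSError:
            pass  # Core is not reachable yet (e.g. connection refused)
        if attempt < attempts - 1:
            sleep(delay)
    return False


def fetch_core_info(base_url="http://localhost:11626"):
    """Fetch /info from Core's admin port (the default URL is an assumption)."""
    with urllib.request.urlopen(f"{base_url}/info", timeout=5) as resp:
        return json.load(resp)
```

A wrapper would call `wait_until_synced(fetch_core_info)` and start Horizon only when it returns `True`.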
@bartekn We've experienced this issue after submitting a transaction with version 0.22.1 of the horizon many times.
Also, I don’t think this is just a TX prioritization issue caused by node/network volume. |
@TFiroozian It's not clear to me that what you're reporting is the same issue, since others reported success after the fix in 0.15. Could you open a new one and reference this one? Thanks. |
It seems this issue is still there. It happens from time to time: usually when there are lots of transactions being made in Stellar network. Happens on:
Not sure how to debug it. |
@gituser Note that it is normal and expected to receive 504 errors when the network is under heavy load, if your provided maximum fee is insufficient. Please confirm the hash of the failed transaction, the output of /fee_stats at the time, and the fee you used. If you still feel the behaviour is incorrect, please open a new issue and include all of that information and we can take a look. |
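As a rough illustration of comparing your fee with `/fee_stats`, here is a minimal sketch. It assumes the parsed response contains a `fee_charged` object with percentile keys such as `p90` holding stroop amounts as strings; the field layout has varied across Horizon versions, so check the actual response from your instance. The function names and the choice of the 90th percentile are hypothetical; the floor of 100 stroops is the network's minimum base fee.

```python
def suggest_fee(fee_stats, percentile="p90", floor=100):
    """Pick a per-operation fee in stroops from a parsed /fee_stats response.

    Assumes fee_stats["fee_charged"][percentile] exists and is a numeric
    string; falls back to `floor` when the field is missing.
    """
    charged = fee_stats.get("fee_charged", {})
    value = charged.get(percentile)
    if value is None:
        return floor
    return max(floor, int(value))


def fee_looks_sufficient(my_fee, fee_stats, percentile="p90"):
    """True if my_fee is at least the chosen percentile of recent fees."""
    return my_fee >= suggest_fee(fee_stats, percentile)
```

If `fee_looks_sufficient` returns `False` for the fee used in a transaction that hit a 504, an insufficient fee under load is a plausible explanation worth ruling out first.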
@ire-and-curses the thing is that the transaction is being included in the ledger 30 seconds after the request. Is there any way to get the optimal fee rate for current network conditions? I think I'm using the standard Stellar fee. The error is still weird to me; it should be more detailed. Also I couldn't find anywhere where the timeout is defined for |
@gituser The general best practice is as follows:
We're looking at improving the docs to make all of this clearer. I hope this helps. |
https://horizon.stellar.org/transactions/335b869b7c9eb491291869b8b06b32582190395945cccb91457ba04774f3ace0
2018-02-12T15:12:19.31094599Z. Looking at the transaction information, it was created on 2018-02-12T15:12:19Z.
2018-02-12T15:12:19Z (~5 seconds after submission): https://horizon.stellar.org/ledgers/16230126
2018-02-12T15:13:19.434419221Z (~1 minute after submission). horizon.Client.HTTP.Timeout is set to 2 minutes.
This is reproducible on 4 identical machines on DigitalOcean. The instances are in DO's zones FRA1, SFO1, SGP1, NYC1.
Each node is running Horizon + Core (watcher, non-validating) using stellar/quickstart docker image with this commit: stellar/quickstart@aad0807
A local application submits transactions to the local Horizon on each node.
Here is the Core and Horizon configuration in use:
It looks like the configuration is OK, no idea why I'm getting the timeouts.