libp2p node envelope out queue overflow cause unknown address as envelope target #2634

Open
solarw opened this issue Sep 23, 2021 · 0 comments

solarw commented Sep 23, 2021

libp2p_node out queue overflow issue:

The case:
During TAC demo checks we found that the internal queue ( ) fills up and drains very slowly, because the lookup for an unknown address (one that is not present in the network) has a 20-second timeout ( ).
Once the queue is full, one slot is freed every 20 seconds.
The default ACN ack timeout for the libp2p_node connection is 5 seconds ( ), which is less than the 20 seconds needed to wait for a free slot.
When a message goes from the libp2p connection to a libp2p node process whose buffer is full, it blocks on adding the message to the queue for about 20 seconds. After the 5-second timeout expires, the Python code decides to retry and reconnect to the node process.

In the TAC demo, two messages were generated every two seconds. One of them had a non-existent address as its target, so each such message caused a 20-second delay.
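
The timing mismatch can be checked with a little arithmetic. The sketch below is only an illustration, not code from the ACN or libp2p_node: the constants come from the numbers quoted in this issue, the function names are hypothetical, and it assumes that envelopes are processed one at a time and that envelopes to known addresses take negligible time.

```python
# Hypothetical back-of-the-envelope model; values are the ones quoted in the issue.
LOOKUP_TIMEOUT_S = 20.0       # lookup timeout for an unknown address
ACK_TIMEOUT_S = 5.0           # default ACN ack timeout on the Python side
BAD_ENVELOPE_PERIOD_S = 2.0   # one envelope to a non-existent address every 2 s


def utilization(service_time_s: float, arrival_period_s: float) -> float:
    """Work arriving per second of processing time; above 1.0 the backlog grows."""
    return service_time_s / arrival_period_s


def ack_expires_before_slot(slot_wait_s: float, ack_timeout_s: float) -> bool:
    """True if the sender gives up before a queue slot can become free."""
    return ack_timeout_s < slot_wait_s


if __name__ == "__main__":
    load = utilization(LOOKUP_TIMEOUT_S, BAD_ENVELOPE_PERIOD_S)
    print(f"utilization from bad envelopes alone: {load:.1f}")
    print("ack expires first:", ack_expires_before_slot(LOOKUP_TIMEOUT_S, ACK_TIMEOUT_S))
```

Under these assumptions the unknown-address envelopes alone demand ten times more processing time than is available, so the queue must eventually fill, and a sender waiting for a slot will hit the 5-second ack timeout first.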

The overall logic and design of the p2p node is good, but this particular case exposed weak parts of the implementation.

Ways to solve this particular issue:
Add queue state warning logs to libp2p_node to help find similar issues in the future.
Add specific benchmarks for similar cases.
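
As a rough illustration of the queue state warnings, here is a minimal sketch of a bounded queue that logs when it approaches capacity. It is written in Python for readability and does not reflect the actual libp2p_node code; `MonitoredQueue`, `warn_ratio` and the logger name are hypothetical.

```python
import logging
import queue

logger = logging.getLogger("out_queue")  # hypothetical logger name


class MonitoredQueue:
    """Bounded FIFO queue that logs a warning when it is close to full."""

    def __init__(self, maxsize: int, warn_ratio: float = 0.8) -> None:
        if maxsize <= 0:
            raise ValueError("maxsize must be positive")
        self._q: queue.Queue = queue.Queue(maxsize=maxsize)
        self._maxsize = maxsize
        self._warn_ratio = warn_ratio

    def fill_ratio(self) -> float:
        """Fraction of the queue currently occupied, between 0.0 and 1.0."""
        return self._q.qsize() / self._maxsize

    def put_nowait(self, item) -> bool:
        """Enqueue without blocking; return False and log if the queue is full."""
        try:
            self._q.put_nowait(item)
        except queue.Full:
            logger.error("out queue full (%d items), envelope rejected", self._maxsize)
            return False
        if self.fill_ratio() >= self._warn_ratio:
            logger.warning("out queue at %.0f%% capacity", 100 * self.fill_ratio())
        return True

    def get_nowait(self):
        """Dequeue the oldest item; raises queue.Empty if there is none."""
        return self._q.get_nowait()
```

A warning like this in the node log would have pointed at the full queue directly, instead of leaving only the ack timeouts on the Python side as a symptom.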

  • Adjust timeouts and time limits: a larger ACN timeout on the Python side and a lower address lookup timeout would fix this issue without any extra code or logic changes.
    The Python part will be blocked until an empty envelope slot appears.
  • Add an ACN "wait" response that is sent when the queue is full, to pause envelope sending from the Python side until a free slot appears.
  • Try to process messages from the queue in parallel if possible (depends on the libp2p code and on envelope ordering):
    • Put slow envelopes into a special queue and process them individually, because most of the time they are just waiting for a response from the remote node.
    • Introduce several sending channels, such as workers that consume the queue and perform tasks in parallel: 5 workers can send 5 envelopes at a time, and if some are blocked the others keep working. All workers can still be blocked at once, but this reduces the probability of a total block.
  • Possibly create per-target-address queues, so that a blocked lookup only holds up envelopes to that specific address.
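
The worker idea from the list above can be sketched with `asyncio`. This is an assumption-laden illustration rather than the node's implementation: `send` is a hypothetical stand-in that only sleeps to simulate an address lookup, and `run_pool` and `n_workers` are names invented here.

```python
import asyncio
from typing import List, Tuple


async def send(envelope: str, lookup_delay_s: float) -> str:
    """Hypothetical stand-in for delivery; the sleep simulates the address lookup."""
    await asyncio.sleep(lookup_delay_s)
    return envelope


async def worker(q: asyncio.Queue, done: List[str]) -> None:
    """Consume envelopes forever; a slow one blocks only this worker."""
    while True:
        envelope, delay = await q.get()
        try:
            done.append(await send(envelope, delay))
        finally:
            q.task_done()


async def run_pool(envelopes: List[Tuple[str, float]], n_workers: int = 5) -> List[str]:
    """Process all envelopes with n_workers concurrent workers; return completion order."""
    q: asyncio.Queue = asyncio.Queue()
    done: List[str] = []
    for item in envelopes:
        q.put_nowait(item)
    workers = [asyncio.create_task(worker(q, done)) for _ in range(n_workers)]
    await q.join()  # wait until every envelope has been processed
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return done


if __name__ == "__main__":
    batch = [("slow", 0.2), ("fast-1", 0.0), ("fast-2", 0.0)]
    print(asyncio.run(run_pool(batch, n_workers=2)))
```

With two workers, the slow envelope occupies one worker while the other delivers both fast envelopes, so the slow one completes last. Note that this changes delivery order, which is exactly the ordering concern raised in the list.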

Another point:
Find a way to notify the p2p connection that a specific address is not reachable, to avoid sending further messages to this address, and possibly notify the user about it.
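
One possible shape for this on the Python side is a small cache of addresses recently reported as unreachable, with an expiry so that an address is retried after a while. This is a sketch under assumptions: the notification mechanism itself does not exist yet, and `UnreachableAddresses`, `ttl_s` and the injected `clock` are hypothetical names.

```python
import time
from typing import Callable, Dict


class UnreachableAddresses:
    """Remembers addresses reported as unreachable for ttl_s seconds."""

    def __init__(self, ttl_s: float = 60.0, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl_s = ttl_s
        self._clock = clock
        self._expires_at: Dict[str, float] = {}

    def mark(self, address: str) -> None:
        """Record that a lookup for this address failed just now."""
        self._expires_at[address] = self._clock() + self._ttl_s

    def is_unreachable(self, address: str) -> bool:
        """True while the failure record is still fresh; expired records are dropped."""
        expires = self._expires_at.get(address)
        if expires is None:
            return False
        if self._clock() >= expires:
            del self._expires_at[address]
            return False
        return True
```

The connection could check `is_unreachable` before enqueueing an envelope and report an error to the user instead of sending, which would keep known-bad targets out of the node's queue.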
