.dataReceived() is called once per byte, probably inefficient #3
dash pointed at Line 66 in 7ae558f
The easy way is to just subclass the class returned from parsley.makeProtocol, but instead we'd need to build a protocol just like that but with .dataReceived() overridden. Or, we build yet another Protocol object (with the switch-on-state logic) to wrap the Parsley-provided one. The transport would be connected to our wrapper, which would either deliver data to the Parsley protocol (before the handshake completes) or to the application-layer one (Foolscap, in our case), depending upon the state of the Parsley object.
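The switch-on-state wrapper described above could be sketched roughly as follows. This is a hypothetical illustration, not txi2p's actual code: the names `handshake`, `feed()`, and `finished` are assumptions standing in for whatever interface the Parsley-driven handshake parser actually exposes.

```python
# Hypothetical sketch of the switch-on-state wrapper (illustrative names,
# not txi2p's or Parsley's real API).  The transport would be connected to
# this wrapper.  Before the handshake completes, bytes are fed to the
# Parsley-driven parser; afterwards, whole chunks go straight to the
# application-layer protocol's dataReceived(), avoiding per-byte delivery.

class SwitchingWrapper:
    def __init__(self, handshake, app_protocol):
        self.handshake = handshake        # Parsley-driven handshake parser
        self.app_protocol = app_protocol  # e.g. the Foolscap protocol
        self.handshake_done = False

    def dataReceived(self, data):
        if not self.handshake_done:
            # Assumed interface: feed() consumes handshake bytes and
            # returns any leftover bytes that arrived after the handshake.
            leftover = self.handshake.feed(data)
            if self.handshake.finished:
                self.handshake_done = True
                if leftover:
                    self.app_protocol.dataReceived(leftover)
        else:
            # Post-handshake: hand over entire buffers in one call.
            self.app_protocol.dataReceived(data)
```

The key property is the last branch: once the handshake is done, a 64kB kernel read reaches the application protocol as one `dataReceived()` call rather than 65536 of them.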
Fixed per @washort's suggestion. Thanks!
exarkun added a commit to exarkun/txi2p-original that referenced this issue on Aug 2, 2022: "Set up CI for tests and release automation"
While analyzing tahoe#2861, we identified a likely performance concern with the way txi2p delivers data on the server side of a connection.
This happens when you use a txi2p.sam.endpoints.SAMI2PStreamServerEndpoint, which is how you listen on an .i2p address. txi2p implements this by making an outbound connection to the local I2P daemon, writing a command that says "hey, I want to accept connections for (some .i2p address)", then waiting for a response. When some client connects, the daemon responds ("hey, someone connected, get ready to talk to them"), and then uses the same TCP connection for the subsequent tunneled data.
On the txi2p side, there is a parser/state machine (implemented with Parsley) that manages the initial command and response. Once the response is received, this state machine is moved into State_readData, which matches on arbitrary single bytes (the "anything:data" target) and delivers each one to receiver.dataReceived().
This is sound, but slow. The expected scenario is when e.g. a Tahoe client uploads several megabytes of binary data to an I2P-based server, delivered through Foolscap and into the I2P connection. On the receiving (server) side, large buffers can be delivered in a single system call, up to the size of the kernel buffers (typically 64kB). This could all be processeded in a single .dataReceived() invocation. When txi2p breaks this up into a lot of one-byte invocations instead, performance will suffer (in particular, CPU usage on the server will be higher than necessary). The worst case is probably a quadratic slowdown, if the next-higher protocol (e.g. Foolscap) does the lazy thing and appends the incoming bytes to a buffer until the expected number have been received.
To fix this, txi2p will need to swap out the Parsley parser for a direct connection to the target protocol's .dataReceived when it moves into State_readData. @washort suggested:
This might affect the inbound side of I2P client connections too (those created with SAMI2PStreamClientEndpoint); I'm not sure. In the Tahoe context, this would be a Tahoe client downloading a file from an I2P-hosted server, and the additional CPU load would occur on the client side.
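To see why the worst case is quadratic, here is a minimal hypothetical sketch (not actual txi2p or Foolscap code) of the "lazy" receiver described above, which appends each incoming chunk to a flat bytes buffer:

```python
# Hypothetical illustration, not txi2p/Foolscap code: a receiver that
# naively accumulates bytes until a complete message has arrived.
class NaiveReceiver:
    def __init__(self, expected_size):
        self.expected_size = expected_size
        self.buffer = b""
        self.messages = []

    def dataReceived(self, data):
        # bytes concatenation copies the whole existing buffer, so N
        # one-byte calls cost O(N^2) total copying, while a single
        # N-byte call costs O(N).
        self.buffer += data
        if len(self.buffer) >= self.expected_size:
            self.messages.append(self.buffer[:self.expected_size])
            self.buffer = self.buffer[self.expected_size:]
```

Delivering one 64kB kernel read as 65536 single-byte dataReceived() calls forces this receiver to re-copy its growing buffer on every call; delivering the same read as one chunk costs a single copy.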