Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When tor process exits unexpectedly tor.protocol.get_info hangs #389

Closed
hellais opened this issue Aug 29, 2023 · 5 comments · Fixed by #390
Closed

When tor process exits unexpectedly tor.protocol.get_info hangs #389

hellais opened this issue Aug 29, 2023 · 5 comments · Fixed by #390

Comments

@hellais
Copy link
Contributor

hellais commented Aug 29, 2023

Here is a minimal reproduction for this issue:

from twisted.internet.task import react
from twisted.internet.defer import ensureDeferred

import txtorcon

EXIT_RELAY_FP = [
'130CFCF38BA3327E3001A1DB2A4B5ACBDAE248D9', # this triggers the getinfo fail

'127F6358F68FFB7E437DBA51D6D4DAC47B9F78A7',
]

async def main(reactor):
        try:
            tor = await txtorcon.launch(
                reactor,
                kill_on_stderr=False,
                progress_updates=lambda x, y, z: print(f"{x}%: {y} - {z}"),
            )
        except Exception as exc:
            print(f"FAILED to start tor {exc}")
            return

        state = await tor.create_state()
        for exit_fp in EXIT_RELAY_FP:
            print(f"doing {exit_fp}")
            try:
                print("calling GETINFO")
                info = await tor.protocol.get_info("md/id/" + exit_fp)
                print(f"got {info}")
            except Exception as exc:
                print(f"FAILED to get info for {exit_fp} {exc}")

@react
def _main(reactor):
    return ensureDeferred(main(reactor))

You can see that the last log lines are:

doing 130CFCF38BA3327E3001A1DB2A4B5ACBDAE248D9
calling GETINFO
FAILED to get info for 130CFCF38BA3327E3001A1DB2A4B5ACBDAE248D9 Tor unexpectedly disconnected while running: GETINFO md/id/130CFCF38BA3327E3001A1DB2A4B5ACBDAE248D9
doing 127F6358F68FFB7E437DBA51D6D4DAC47B9F78A7
calling GETINFO
Unhandled Error
Traceback (most recent call last):
Failure: builtins.RuntimeError: Tor exited with error-code 0

Where we are waiting on the tor.protocol.get_info("md/id/" + exit_fp) call.

I would expect to either get an errback, because the process is dead or have some other way to tell that I should not be calling it.

It would also be nice to be able to somehow know that tor has exited this way, as it's currently not possible to listen for the "tor exited" event.

This probably is also a tor bug, as it should not happen that the tor process exits when issuing this specific get_info command.

@meejah
Copy link
Owner

meejah commented Aug 29, 2023

Okay, so it does errback() on the getinfo (properly) --- but still "hangs". However, if you change the test-case to "raise" right after the "FAILED" print() then it'll exit.

Although it's not quite the same, there is TorControlProtocol.when_disconnected() which will effectively tell you if tor exited (although it's a bit more general, because there may be other ways to lose the connection to Tor besides it exiting).

That said, adding a when_exited() or similar to TorProcessProtocol probably makes sense.

@hellais
Copy link
Contributor Author

hellais commented Aug 29, 2023

Okay, so it does errback() on the getinfo (properly) --- but still "hangs".

It calls the errback on the first failure, that is this log line

doing 130CFCF38BA3327E3001A1DB2A4B5ACBDAE248D9
calling GETINFO
FAILED to get info for 130CFCF38BA3327E3001A1DB2A4B5ACBDAE248D9 Tor unexpectedly disconnected while running: GETINFO md/id/130CFCF38BA3327E3001A1DB2A4B5ACBDAE248D9

But then when the second get_info call is made, it hangs without any errback, see these log lines:

doing 127F6358F68FFB7E437DBA51D6D4DAC47B9F78A7
calling GETINFO
Unhandled Error
Traceback (most recent call last):
Failure: builtins.RuntimeError: Tor exited with error-code 0

However, if you change the test-case to "raise" right after the "FAILED" print() then it'll exit.

I am not sure this is the correct thing to do in this case though, since the first failure could potentially be a non-final failure (ex. I passed an invalid descriptor) in which case I just would like to skip that specific relay.

@meejah meejah added the bug label Aug 29, 2023
hellais added a commit to hellais/txexitmap that referenced this issue Aug 29, 2023
@meejah
Copy link
Owner

meejah commented Aug 29, 2023

Yes, definitely at least one issue here, thanks for the detailed bug report!

@meejah
Copy link
Owner

meejah commented Aug 29, 2023

By default, Twisted's LineOnlyReceiver limits to 16384 bytes maximum for "a line" and this particular microdescriptor exceeds that.

So, we either need to "not limit" line-length (not really possible with Twisted right now, but we could set MAX_INT or something I suppose) and / or find out what the maximum length of a microdescriptor can be .....

@meejah
Copy link
Owner

meejah commented Aug 29, 2023

To test if this is the only thing holding your case back, you could set tor.protocol.MAX_LENGTH = 16384000 before issuing any of the GETINFOs

This was referenced Aug 29, 2023
kristapsk added a commit to JoinMarket-Org/joinmarket-clientserver that referenced this issue Feb 5, 2024
ca33eca Bump txtorcon from 23.0.0 to 23.11.0 (Kristaps Kaupe)

Pull request description:

  In #1567 we updated to 23.0.0, not 23.5.0, because that was the last version supporting Python 3.7. Now that Python 3.7 support is dropped with #1639, can upgrade to latest.

  Changes in [23.5.0](https://github.com/meejah/txtorcon/releases/tag/v23.5.0):

  * twisted.web.client.Agent instances now use the same HTTPS policy by default as twisted.web.client.Agent. It is possible to override this policy with the tls_context_factory= argument, the equivalent to Agent's contextFactory= (Thanks to Itamar Turner-Trauring)
  * Added support + testing for Python 3.11.
  * No more ipaddress dependency

  Changes in [23.11.0](https://github.com/meejah/txtorcon/releases/tag/v23.11.0):

   * Fix test-failures on Python 3.12
   * Particular GETINFO hanging (meejah/txtorcon#389) (ultra-long lines over 16KiB caused problems in the protocol)
   * Use built-in `mock` only (https://github.com/jelly)
   * Remove `incremental` (https://github.com/gdrosos)

  Basically, not much changes, but fixed bug, dropped dependencies and better support for Python 3.11 and 3.12 is good enough reason to merge for me.

Top commit has no ACKs.

Tree-SHA512: 0379dfd7303f3d2f5567fddda75f4d06af1fb56c0499245745842ba82b5623d49324e4c36e7e759b50becb22874a32282c115f6c5a2c5c18979acc4b096f7d18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants