hostwatch assumes ASCII-only locale #377

Closed
BenWiederhake opened this issue Oct 30, 2019 · 4 comments

@BenWiederhake
Contributor

This sounds like the same bug as #110 and #93, but I'm using version 0.78.5, and those issues claim it has been resolved in 0.78.1 and "0.78.1.dev6<+ng1d64879" (which I guess is older than 0.78.5).

Here's a part of the log:

firewall manager: setting up /etc/hosts.
 s: mux wrote: 30/30
 s: Waiting: 2 r=[4, 7] w=[] x=[] (fullness=22/0)
HH: <    127.0.1.1
HH:  > hosts
HH: <    127.0.0.1 ['localhost']
HH: <    127.0.1.1 ['myremote']
HH: <    192.168.0.200 ['otherpc']
HH:  > netstat
HH: Traceback (most recent call last):
--->   File "sshuttle.server", line 144, in start_hostwatch
--->   File "sshuttle.hostwatch", line 294, in hw_main
--->   File "sshuttle.hostwatch", line 131, in _check_netstat
---> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 552: ordinal not in range(128)
 s:   Ready: 2 r=[7] w=[] x=[]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "assembler.py", line 39, in <module>
  File "sshuttle.server", line 403, in main
  File "sshuttle.ssnet", line 598, in runonce
  File "sshuttle.server", line 334, in hostwatch_ready
sshuttle.helpers.Fatal: hostwatch process died

And here's the first 570 (>552) bytes of output of ssh myremote netstat:

Aktive Internetverbindungen (ohne Server)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 192.168.0.2:22001       xxxxxxxxxxxxxxxxxxxxxxx VERBUNDEN  
tcp        0    236 192.168.0.2:1509        xxxxxxxxxxxxxxxxxxxxxxx VERBUNDEN  
tcp        0      0 192.168.0.2:50796       xxxxxxxxxxxxxxxxxxxxxxx VERBUNDEN  
tcp        0      0 192.168.0.2:1509        xxxxxxxxxxxxxxxxxxxxxxx VERBUNDEN  
tcp        0      0 192.168.0.2:1509        xxxxxxxxxxxxxxxxxxxxxxx VERBUNDEN  
Aktive Sockets in der UNIX-Domäne (ohne Server) 

Note the non-ASCII character ä in Domäne.

Maybe change the invocation to something like LC_ALL=C netstat? Wouldn't that fix the problem?
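The failure in the traceback can be reproduced in isolation. The sketch below assumes the remote netstat output is UTF-8 encoded (the byte 0xc3 in the error message is consistent with that, but the issue does not state it), and it uses only the one line quoted above rather than the full output, so the reported position differs from the 552 in the log.

```python
# Assumption: the remote side emits UTF-8. In UTF-8, "ä" is the two bytes
# 0xC3 0xA4, and 0xC3 is the byte named in the traceback.
line = "Aktive Sockets in der UNIX-Domäne (ohne Server)"
raw = line.encode("utf-8")

error = None
try:
    # Decoding with the ASCII codec fails at the first byte >= 0x80.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc
    print(exc)
```

The exception points at the first byte of the encoded ä, which is why a single translated header line is enough to kill the hostwatch process.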

@brianmay
Member

brianmay commented Nov 7, 2019

Do you want to test LC_ALL=C netstat?

My suspicion is that it may not work. I think netstat may just copy the bytes in the hostname as is.

We could assume that the output is always UTF-8, but is this really the case? There could be systems out there that use other encodings. Does this need to be configurable?

@BenWiederhake
Contributor Author

Well, here's a line of output from netstat -n:

Aktive Sockets in der UNIX-Domäne (ohne Server)

Note the non-ASCII ä in there.
And here's the same line from LC_ALL=C netstat -n:

Active UNIX domain sockets (w/o servers)

Note the ASCII-only-ness.
My best guess is that netstat's strings for the C locale are ASCII-only; if you want to confirm that, it should be easy to check netstat's .pot file.

Therefore my suggestion is: Stick to ASCII-only, but call netstat with LC_ALL=C set. Maybe do so for all other subprocesses, too.

@brianmay
Member

brianmay commented Nov 8, 2019

Yes, it sounds like setting LC_ALL=C is probably the way to go. In fact we probably should have been doing it anyway, as we really want the output of netstat to remain predictable and not change according to the locale.

Oh, interesting, looks like we do the right thing for other external commands, e.g.

def _list_routes(argv, extract_route):
    # FIXME: IPv4 only
    env = {
        'PATH': os.environ['PATH'],
        'LC_ALL': "C",
    }
    p = ssubprocess.Popen(argv, stdout=ssubprocess.PIPE, env=env)
    ...

However the call in hostwatch.py was missed. Patches welcome. This should be trivial to fix based on the code above and adding the required env=env parameter to the Popen call as above.
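A minimal sketch of that pattern, not the actual hostwatch.py patch: the helper name run_with_c_locale is hypothetical, the standard subprocess module stands in for sshuttle's ssubprocess, and the Python interpreter stands in for netstat so the example runs on machines without it.

```python
import os
import subprocess
import sys


def run_with_c_locale(argv):
    """Run argv with LC_ALL=C and return its stdout as raw bytes.

    Hypothetical helper for illustration; it mirrors the env dict used
    in _list_routes above.
    """
    env = {
        # Fall back to the platform default if PATH is unset (an addition
        # for robustness; the quoted code reads os.environ['PATH'] directly).
        'PATH': os.environ.get('PATH', os.defpath),
        'LC_ALL': 'C',
    }
    p = subprocess.Popen(argv, stdout=subprocess.PIPE, env=env)
    out, _ = p.communicate()
    return out


# Stand-in for netstat: a child process that reports the locale it was given.
child = [sys.executable, '-c', "import os; print(os.environ['LC_ALL'])"]
result = run_with_c_locale(child).decode('ascii').strip()
print(result)
```

Passing an explicit env replaces the child's whole environment, so the user's LANG and LC_* settings never reach the command and its messages stay in the untranslated C locale.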

brianmay pushed a commit that referenced this issue Nov 9, 2019
* Make hostwatch locale-independent

See #377: hostwatch used to call netstat and parse the result,
without setting the locale.
The problem is converting the binary output to a unicode string,
as the locale may be utf-8, latin-1, or literally anything.
Setting the locale to C avoids this issue, as netstat's source
strings do not use non-ASCII characters.

* Break line, check all other invocations
@brianmay
Member

brianmay commented Nov 9, 2019

Merged fix, closing.

@brianmay brianmay closed this as completed Nov 9, 2019