-
-
Notifications
You must be signed in to change notification settings - Fork 6.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support socks5h proxy using unix sockets #8668
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For tests, I think test case 1435 can serve as good inspiration as it already uses --unix-socket
lib/url.c
Outdated
@@ -2519,6 +2519,20 @@ static CURLcode parse_proxy(struct Curl_easy *data, | |||
} | |||
|
|||
/* now, clone the proxy host name */ | |||
#ifdef USE_UNIX_SOCKETS | |||
if(proxytype == CURLPROXY_SOCKS5_HOSTNAME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason to limit this to SOCKS5_HOSTNAME
and not allow SOCKS4a too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it may not work as dns resolution is done locally in SOCKS4a using the same function. But after quick test, looks like it should work. I will update for all socks proxies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to support all socks proxies and not depend on the scheme prefix in proxy string. Please confirm if implementation/syntax is ok. I will then add a test.
lib/url.c
Outdated
@@ -2519,6 +2519,20 @@ static CURLcode parse_proxy(struct Curl_easy *data, | |||
} | |||
|
|||
/* now, clone the proxy host name */ | |||
#ifdef USE_UNIX_SOCKETS | |||
if(proxytype == CURLPROXY_SOCKS5_HOSTNAME | |||
&& strncmp("unix/", proxy + 10, 5) == 0) { /* len("socks5h://") = 10 */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The host name might start with that scheme, the type could also be set with a separate option and then the proxy name starts with the host name. This cannot assume the first 10 bytes are known.
I also find it a little unfortunate to reserve the unix
host name like this, as there might actually be a proxy in use that is named like that. Maybe we should instead make the --unix-socket
option accept a socks...
prefix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The host name might start with that scheme, the type could also be set with a separate option and then the proxy name starts with the host name. This cannot assume the first 10 bytes are known.
I will update to handle that case.
there might actually be a proxy in use that is named like that.
In that case, unix/path
cannot be a valid URI for a socks proxy as it can only have host:port
syntax. Since I am checking for unix/
, I think it should not affect anyone having a server named unix
with a socks proxy running in it.
Maybe we should instead make the --unix-socket option accept a socks... prefix
I wish to have the ability to set the proxy via environment variable like ALL_PROXY=socks5h://unix/run/tor/socks
, so that programs that internally use curl does not have to change the command line.
How about a syntax like this:
ALL_PROXY=socks5h:///unix/run/tor/socks
i.e. an extra slash, making the host part of the URL empty. This is similar to
how a file: URL would refer to the same file.
|
Unfortunately, the URL parser accepts one, two or three slashes where there should be two so three slashes will be accepted as if they were two:
Another option is to use a short but otherwise illegal host name. For example just two dots (
|
On Sat, Apr 02, 2022 at 02:55:00PM -0700, Daniel Stenberg wrote:
Unfortunately, the URL parser accepts one, two or three slashes where there
should be two so three slashes will be accepted as if they were two:
That sounds wrong—it goes against RFC 1738 and RFC 8089, anyway. Is this a
WHATWG thing?
|
Yes. 😢 The lax nature of WHATWG's URL spec makes such URLs occasionally get used in the wild, so this is our adjustment to accept such URLs better.The WHATWG spec actually allows any amount of slashes (even zero), and both backslashes and slashes (mixed) but curl only allows regular forward slashes and only one, two or three of them. |
I looked at the test setup. I think the following changes are required along with writing a testcase.
Q: Where should the temporary socket be created and how/when should it be cleaned up? With last update, If the hostname is Please correct if I am missing anything. |
3a6e94b
to
9c68b8e
Compare
@bagder Please take a look again.
I don't understand the torture test failure but it does not fail consistently. |
The torture test fail of 1467 means that there is a memory/resource leak somewhere in the exit path. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no added documentation?
Reviewed again. Could not find the leak.. Can you please share instructions to reproduce failure locally? I am on a ubuntu desktop. Added documentation for the command line flags. |
The clue is torture tests. It's a special way to run the tests, which verifies the exit paths heavily.
This then makes all allocations in the test fail, one by one and it verifies that it never leaks memory in any case. The 1467 failure you got looked like this:
It shows that escape.c:144 allocated a line that was never freed. It was when fallible function number 93 was made to fail. If you wanted reproduce that exact fail number, you can do this:
I usually first try with -t to run all loops torture to see that it reproduces, it might not be the exact same number in a local build. Also, use -n to switch off valgrind since running with valgrind makes every loop take several times longer, which can make this test really slow. Using valgrind in the last step when you re-run a single number is very useful because valgrind can help you pinpoint the entire leak call stack much better. |
if you still have problems rerunning the test, I can have a go at it soon. |
Thank you. Was able to reproduce and fix it. Waiting for CI run. |
Fixed torture test and added documentation. Please review. curl/check failure does not show any logs or errors except for message "Build Failed". Please help understand the error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks very good and I'm ready to merge. Just one thing left that I can think of:
You added docs for the command line tool but not for the libcurl option that actually passes the string to libcurl. CURLOPT_PROXY
. It needs to be similarly extended so that users can learn how to specify a socks proxy over unix domain socket with it.
Usage: curl -x "socks5h://localhost/run/tor/socks" "https://example.com" Updated runtests.pl to run a socksd server listening on unix socket Added tests test1467 test1468 Added documentation for proxy command line option and socks proxy options
Thank you. Updated CURLOPT_PROXY documentation. |
Thanks! |
Usage:
Ref: https://curl.se/mail/archive-2021-03/0012.html
This is a proof of concept to make sure it is working. Please take over from here or guide next steps - tests, documentation etc.,