http: support environment-defined proxy #8381
Comments
While I do agree that this is a very common thing to want, I think it is important to point out that so are also many other things that |
I don't think this is going to work here. Imagine a CLI tool spawning a child process of |
I think this is a fallacy (don't ask me which) because the statement can be reversed to: The user cannot reasonably provide a The person that writes the request: http.request(url, {
respectEnv: true
}) decides if this request respects the environment variable (new mode) or not (legacy). I am pushing this because HTTP_PROXY environment variables and errors related to them are horrible to debug once you have an application running and just upgraded a node version. In any case I would say that a default: |
Not saying this shouldn't be done, but there is a bit of a security risk inherent in this. Using an environment variable, it would be possible for a rogue module to set the environment variable discreetly, causing traffic to be redirected through the proxy without the developer's or user's awareness. |
@jasnell Pretty sure something similar could be achieved by monkey-patching |
Yep, as I said, I didn't say it shouldn't be done ;-) If we do it tho, we need to make sure the risks are well documented. |
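To make the monkey-patching point concrete, here is a minimal sketch. The helper name `interceptRequests` and the stand-in module object are hypothetical and exist only for illustration; nothing here is an API of Node core. It shows how any code loaded into the process could wrap a `request` function on a shared module object and observe (or alter) its arguments.

```javascript
'use strict';

// Hypothetical helper: wrap `mod.request` so every call is reported to
// `onRequest` before being forwarded unchanged to the original function.
// Returns a function that undoes the patch.
function interceptRequests(mod, onRequest) {
  const original = mod.request;
  mod.request = function patchedRequest(...args) {
    onRequest(args[0]); // a rogue module could rewrite args here instead
    return original.apply(this, args);
  };
  return function restore() {
    mod.request = original;
  };
}

// Stand-in for a module object such as the one returned by require('http').
const fakeHttp = { request: (target) => 'sent:' + target };

const observed = [];
const restore = interceptRequests(fakeHttp, (target) => observed.push(target));
const result = fakeHttp.request('http://registry.example/');
restore();
```

Applied to the real `http` module object, the same wrapper would presumably see every request made by code that looks up `http.request` at call time, so an environment variable is not the only way traffic could be redirected without the user noticing.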
-1 as I think this is something that is better handled in userland |
This would depend on proxy support in the agent, something which #1490 looks to have been for, but was closed for some reason. Also to quote @sindresorhus from sindresorhus/got#79:
|
A Rogue module can monkey-patch all the methods in the
I don't think the structure of a network which is beyond my control qualifies as a userland problem. My OS deals with any proxy I may or may not need to connect through so that each application doesn't have to deal with this. Does this example analogy not hold for nodejs core? If not, do we still want to continue burdening every module using |
Related: #1490 To summarize: "Proxy support?" "Not saying no but..." |
I would agree that ultimately this is something that should be handled directly within node. Node should read my environment variables to see that I am starting the node executable with my curl, npm, git, etc, all respect these environment variables by default. This is the purpose of the environment. The user has already specified these within their environment, so the fact that they aren't being respected is very confusing. It shouldn't be left up to a non-node-core library developer to decide whether to support my proxy environment, either by properly configuring the node HTTP module with my environment variables or via a configuration passed into their library, because ultimately I may not be directly using those modules, such as when using a framework that includes a library that includes a library that interfaces with the HTTP module. Thus this "userland" solution essentially creates a recursive issue through all dependencies, which is much harder to solve in all cases. Since the node HTTP library is at the core of this, it should solve this issue by respecting the environment variables used at run time and overriding other attempts where libraries / modules may try to set these settings themselves, unless some other environment variable is passed to allow for this. |
@matthewwiesen The counterargument to your argument is that curl, npm and git are all end user programs, whereas node is a platform. A better comparison is python: the builtin httplib doesn't respect http_proxy, that is left to user libraries. Python, unlike node, has a strong "batteries included" philosophy, so that is saying something. Also: big slippery slope. Yes, curl respects http_proxy... but it also honors all_proxy, no_proxy with patterns and wildcards, and happily parses your .netrc with every connection. I don't think that has a place in core. Core is for mechanism, not policy. |
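The "no_proxy with patterns and wildcards" point hints at how much policy hides behind a simple-looking feature. As a rough illustration only, here is a sketch of one possible matching rule. The function name `bypassesProxy` is hypothetical, and the rules (every entry matches the host itself and its subdomains, `*` matches everything) are a simplifying assumption; real tools differ in the details, for example around ports and IP ranges.

```javascript
'use strict';

// Hypothetical helper, not part of Node core: decide whether `hostname`
// should bypass the proxy given a no_proxy-style, comma-separated list.
function bypassesProxy(hostname, noProxy) {
  if (!noProxy) return false;
  const host = hostname.toLowerCase();
  return noProxy
    .split(',')
    .map((entry) => entry.trim().toLowerCase())
    .filter(Boolean) // ignore empty entries such as a trailing comma
    .some((entry) => {
      if (entry === '*') return true;
      // Normalize ".example.com" and "*.example.com" to "example.com".
      const suffix = entry.replace(/^\*?\./, '');
      // Match the domain itself or any subdomain of it.
      return host === suffix || host.endsWith('.' + suffix);
    });
}
```

Even this small sketch has to pick answers to questions the thread leaves open (case sensitivity, whether a bare domain covers subdomains), which is the kind of policy decision being debated.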
Looking at Python, it seems that the documentation does mention respecting http_proxy in regard to the core https://docs.python.org/3.4/library/urllib.request.html?highlight=http_proxy
Although, this page does mention to "users" that is recommended for a higher-level http client interface such as the http://docs.python-requests.org/en/master/user/advanced/#proxies Looking at Go, it seems to be pretty similar; it seems to be supported within their Net HTTP Transport module: Looking at Ruby, this seems to be supported as well: Unless I'm mistaken on these sources. |
I picked httplib because it's the python counterpart to node's http module. urllib is an 'open any kind of url' toolkit - useful, but without a node equivalent. requests is actually a good example of what I mean. It's the python sister to request. request supports proxies, tunnels, auth, etc; it's the go-to package for anyone making http requests. I don't see a compelling reason to duplicate that functionality in core when a best-of-breed solution exists and is in widespread use. |
node's focus is implementing HTTP mechanism well, not policies, to enable diversity and iterative improvement of user-facing HTTP client libraries like https://www.npmjs.com/package/fetch and https://www.npmjs.com/package/request. It happens that node core Since node's APIs cannot be changed/improved without causing great difficulties, because they can't be properly versioned, unlike userland modules, we should be very, very cautious in adding functionality. The "widely useful" test is not sufficient; it needs to be "doing it in userland causes real problems". node comes with npm included. It's a great way to select the API, and the version of that API, that has the features you need. |
So just to be clear, the proposed solution here is to:
This ultimately leads toward educating a significant sector of the Node community about HTTP proxy environments, when this is something that is specific to how HTTP requests should be handled within a specific user environment. Why should we go to every node module author to tell them that we can't use their framework/module/library because, at the end of the day, somewhere something is wrapping the node HTTP request and they specifically didn't support the HTTP proxy environment, in all likelihood because they were not aware of this use case, since probably over 95% of the node community doesn't have an environment like this? They likely aren't aware of running their code within a proxy environment and thus did not allow for or implement pass-through support so that an end user can configure their code to ultimately configure Node HTTP to work properly in an environment where access to the external internet must go through an HTTP proxy. @bnoordhuis To address some of your comments from here:
I don't see how allowing HTTP proxy will suddenly open the flood gate to supporting all of those items you mention.
Regarding the SSL Certificates: I will say that I think internal CA certificates should also be supported, since ultimately CA certificates themselves are already supported within Node's HTTPS module here via the TLS module here, which indicates that when the The same issue as the proxy environment occurs with custom internal Root CA certificates, where there is no easy way to inject the internal CA certificate chains at run time via an environment variable so that the TLS module will augment its default CAs and append my internal CA certs to the chain it already supports. What is required today is that all downstream modules play nice and allow for easily hooking into and passing these configuration options through the full chain, so that my HTTPS requests to sites secured with an internal CA issued certificate will validate properly. Or there is always the insecure Finally, in regard to security with respect to the "rogue module" scenario, wouldn't implementing this feature in core prevent this? If node HTTP was by itself setting the proxy settings via the user supplied environment variable and essentially rejecting configuration options passed to it when an environment variable is available, then there would be no way for a "rogue module" sitting between the end user's code and the Node HTTP module to manipulate this. I agree that supporting these changes does seem to break with the current behavior of how things work today, but:
Personally, I would think that if someone were starting their application/script with their HTTP proxy environment set, then it is not unreasonable to assume that this is what they want, since they configured it. If they did not want to proxy their requests, they could always do what they do now: not set the HTTP proxy, and configure this directly within their various libraries / modules or via the node HTTP lib. |
Support for custom CAs has landed in #9139. It's not released yet, but should be available in the next minor 7.x release, and it'll possibly be backported some point after that. |
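For comparison, here is a rough userland sketch of the "pass the option through the full chain" approach described above. The helper names `caListWithExtra` and `agentTrusting` are hypothetical. The sketch relies on `tls.rootCertificates`, which exists only in Node releases newer than the 7.x line discussed here, and on the documented behavior that supplying a `ca` option replaces the default trust list rather than extending it.

```javascript
'use strict';
const tls = require('node:tls');
const https = require('node:https');

// Hypothetical helper: the bundled root CAs plus one extra PEM certificate.
// The bundled roots are included explicitly because a `ca` option replaces
// the default list instead of adding to it.
function caListWithExtra(extraCaPem) {
  return [...tls.rootCertificates, extraCaPem];
}

// Hypothetical helper: an agent that every HTTPS request would have to be
// given explicitly, e.g. https.get(url, { agent }, callback).
function agentTrusting(extraCaPem) {
  return new https.Agent({ ca: caListWithExtra(extraCaPem) });
}
```

The catch raised in the thread applies here too: every library between the application and `https` has to expose a way to pass that agent or `ca` value down, which is what an environment-level setting avoids.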
To be clear, your straw man procedure:
is just that, a strawman. The actual procedure is:
Also, I suggest that 95% of users are probably using What exact problem did you have that led you to think Node should do this in core? Did you not know that node encourages the use of npm modules, wrote code using Or did you find a third party module that was coding directly to the |
That's simply how such things play out when a piece of software is popular. It's only a matter of time before someone requests them because "feature $x doesn't work for me because you don't support feature $y." To answer your specific questions: SOCKS because people use it (request supports it although I don't know if it does DNS-over-SOCKS), NTLM and Kerberos because many proxies require authenticated connections. |
@silverwind great to hear this is being included. Looking at #9139 this seems like good news on this front. Glad to see there is agreement there. @sam-github The example that I am providing is this: I would like to use a Node Web Application Framework, and this application is dependent upon lots of modules. Those modules are dependent upon modules, which are dependent upon modules, and ultimately we arrive at https://www.npmjs.com/package/got, which is where we see the issue.
Now you say that So what are my options here, go to the If Great we solved this for You see how, to essentially support HTTP proxy, we are forcing this to be done on every module? Or the solution is just to not use that module? I don't see how that is a solution at all. I certainly don't see a straw man here, since I'm not misrepresenting or exaggerating the options. If anything, the opposing argument I see here is that if we implement HTTP proxy then it opens the flood gate, which is certainly an assumption and definitely not true. |
Thanks @matthewwiesen, that helps a lot to give context. And I'm still not in agreement, and somewhat baffled.
Why is this one single feature (that is missing from Particularly when the very existence of I'd say, the bare-bones nature of node's HTTP enables fetch/request/got to all have their own opinions on what features are bloat, and what are essential. If
|
This brings me to my original request. If a user has explicitly defined the HTTP proxy environment variable in the environment in which the node process is started, when would it ever be the desired behavior that this is not respected? This is the main reason why I believe this should be embedded in core: so that it is enforced as per the user's explicit request. Setting the HTTP proxy environment variable within the shell should ensure that ALL HTTP requests made through node pass through this user's proxy, since undesirable effects will occur if this is not done. I believe this fits the similar use case of why it is important to allow the user to define their own custom CA certificates within an environment variable, and I agree with the point you made here. Ultimately this is necessary in core to ensure that node HTTP behaves in a way that is conducive to the corporate/enterprise environment in a consistent, enforceable fashion, so that the HTTP proxy is honored regardless of downstream module authors. They may not support this because they are not familiar with proxy environments, which leads to their modules not behaving correctly within the corporate environment where a proxy is essentially standard. Requests they expect to function normally will be blocked because they are not routed through the HTTP proxy defined by the user's environment variables. From the user's perspective, ALL HTTP requests need to honor the proxy environment. When the HTTP proxy environment variable is set but not enforced, requests will not function unless every downstream module that interfaces with Node HTTP configures the proxy itself, which raises the question: why would we not want this in core?
This is why I propose that HTTP proxy support be included in core, so that this is the default effect when a user supplies the HTTP proxy environment. All other HTTP customization/configuration is not relevant to core. If the user did not want to enforce the HTTP proxy in this proposed way, it would be reasonable for the user to leave their HTTP proxy environment variable unset. Edit: |
@nodejs/ctc needs more input. I'd be interested in knowing from the authors of request (@mikeal, @simov ), fetch (@andris9), and got (@sindresorhus). I'm half convinced there should be a canonical github.com/nodejs/http-proxy or something of the like that can be used by down stream HTTP client APIs, still not so convinced that it should be in node core, would like to hear more from implementors. |
Just to be sure, did you have node-fetch in mind instead of my fetch, which is some pretty old code and not used much? I don't know a lot about HTTP proxies but I have added proxy support to Nodemailer for SMTP connections (docs here). Basically I added a new method |
So... does |
Tries to read 'proxySettings' string from config.json. If it was provided, sets it as the --all-proxy parameter when running aria2c and as the https_proxy environment variable when running satellite (node). Refer to these links: nodejs/node#8381 https://github.com/request/request#controlling-proxy-behaviour-using-environment-variables
closes #1893 For some fun discussion on why this is required, see this issue: nodejs/node#8381
* fix: Add server-side proxy support via fetch-with-proxy closes #1893 For some fun discussion on why this is required, see this issue: nodejs/node#8381 * lint
Because nodejs does not support |
FWIW, I've changed my mind on this. I feel it would be convenient to have proxy support in core (esp. because of fetch()) but I haven't thought too deeply about the details. Feature creep is still a real problem, too. Having said all that, I suspect this is waiting for nodejs/undici#1650 before anything else can happen. |
To enable HTTP connectivity behind corporate firewalls, a number of tools and programming languages support HTTP/HTTPS proxies defined through environment variables like
Note that there seems to be no consensus on the case of these variables, and all-lowercase variable names are also very common. My limited research suggests that at least the following languages automatically obtain and use a proxy from the environment:
The request module also supports these variables, but I feel they should be respected by core http and https for best compatibility.
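As a rough sketch of the lookup being requested, the following picks a proxy for a target URL from the environment. The function name `proxyFromEnv` is hypothetical and is not an API of Node core or of the request module. The variable names `http_proxy`/`HTTP_PROXY` and `https_proxy`/`HTTPS_PROXY` follow the ones mentioned elsewhere in this thread, and preferring the lowercase spelling when both are set is an assumption, since tools disagree on this.

```javascript
'use strict';

// Hypothetical helper: return the proxy URL to use for `targetUrl`, or null
// when no matching environment variable is set.
function proxyFromEnv(targetUrl, env = process.env) {
  const { protocol } = new URL(targetUrl); // e.g. "http:" or "https:"
  const name = protocol === 'https:' ? 'https_proxy' : 'http_proxy';
  // Assumption: lowercase wins over uppercase when both are present.
  const value = env[name] || env[name.toUpperCase()];
  // Normalizing through URL also rejects malformed values by throwing.
  return value ? new URL(value).href : null;
}

// Example with an explicit environment object instead of process.env.
const chosen = proxyFromEnv('https://nodejs.org/', {
  HTTPS_PROXY: 'http://proxy.corp.example:3128',
});
```

This only selects a proxy; actually routing the request through it (for example, CONNECT tunnelling for HTTPS targets) would still need agent-level support.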