make imgadm to use http(s)_proxy environment variables #142

Closed
wants to merge 2 commits into
from

Projects

None yet

3 participants

@MerlinDMC

This request changes imgadm to use curl for downloading manifests from dataset servers.

Curl respects http(s)_proxy environment variables during download so imgadm can also be used in behind proxies.

This fixes issue #120

MerlinDMC added some commits Nov 12, 2012
@MerlinDMC MerlinDMC pull cache updates with curl
This will add http_proxy support without much hassle.
Also curl was already used to pull the image files.
90c7211
@MerlinDMC MerlinDMC removed unused code 460cf62
@trentm
Member
trentm commented Jan 30, 2013

@MerlinDMC: I hope to have a long-worked-on re-write of imgadm landing in the next couple of weeks. I'll make sure that it supports http(s)_proxy.

@trentm
Member
trentm commented Jan 30, 2013

note to self: Tracking this in IMGAPI-88 ticket.

@AlainODea
Member

@trentm does imgadm v2 support http(s)_proxy environment variables? It doesn't seem to work on joyent_20130222T000747Z.

@AlainODea
Member

@trentm Node.js doesn't appear to directly support HTTP proxies like Squid, but it is fairly easy to work around that.

var http = require('http');
var host = "proxyhost.example.com";

var options = {
  host: host,
  port: 3128,
  path: "http://images.joyent.com/images",
  headers: {
    Host: 'images.joyent.com'
  }
};

var req = http.request(options, function(res) {
  console.dir('STATUS: ' + res.statusCode);
  console.log('HEADERS: ' + JSON.stringify(res.headers));
});

req.end();
@AlainODea
Member

Is it practical for images.joyent.com to have caching information like ETag headers? That would really help keep bandwidth costs down for significant SmartOS deployments. A cache or cluster of caches on the front would cache the content minimizing the hit on your servers and minimizing the bandwidth required to gets images to all hosts.

@AlainODea
Member

On joyent_20130307T214308Z if I configure proxies:

export ftp_proxy="http://lvhubproxy01:3128/"
export http_proxy="http://lvhubproxy01:3128/"
export https_proxy="http://lvhubproxy01:3128/"

And then attempt to check for available images:

imgadm avail

I get the following error:

imgadm: error (MultiError): multiple (2) API errors
    error (ClientError): https://images.joyent.com: Error: connect ECONNREFUSED
    error (ClientError): https://datasets.joyent.com/datasets: Error: connect ECONNREFUSED

No error or deny is shown on the Squid proxy log.

Is it required that the SmartOS GZ has unrestricted direct Internet access?

@trentm
Member
trentm commented Mar 15, 2013

@MerlinDMC Your question from IRC yesterday was passed on to me. I'd love for you to put together a http(s)_proxy patch for the new imgadm if you are willing. I'm sorry that I haven't gotten to this yet.

@AlainODea
Member

@trentm I am going to try tjfontaine's idea from #node.js:

[16:06] AlainODea_: right, such a patch would change any use of http(s).get with something like https://github.com/mikeal/request where you can var r = request.defaults({'proxy': process.env.http_proxy})

It may require adding a dependency on mikeal's request module. Would that be okay?

@AlainODea
Member

@trentm okay. I'm deep over my head here. It requires changes either to Node.js' core http and https modules or to node-restify in http_client.rawRequest(). Neither of these is trivial. Changing the core Node.js modules to support proxies is likely the correct solution.

An alternative easier solution would be to have node-restify depend on mikeal/request and alter http_client.rawRequest() to use @mikeal's support for proxies (this is likely not to work, but is my first impression):

var proxy = pts.protocol === 'https:' ? process.env.https_proxy: process.env.http_proxy;
var r = request.defaults({'proxy': proxy});
r(opts, function onResponse(res) {
                clearTimeout(timer);
                dtrace._rstfy_probes['client-response'].fire(function () {
                        return ([ id, res.statusCode, res.headers ]);
                });
                log.trace({client_res: res}, 'Response received');

                res.log = log;

                var err;
                if (res.statusCode >= 400)
                        err = errors.codeToHttpError(res.statusCode);

                req.removeAllListeners('error');
                req.removeAllListeners('socket');
                req.emit('result', (err || null), res);
        });

Looking at http_client.rawRequest() it needs the req object to have log, on() and once(). I'm not clear on what functions and properties the prototype of mikeal/request has. mikeal/request inherits prototypes from Stream which is currently qualified as unstable so its API could change. on() and once() are part of Event which Stream uses. It does emit("error") so chances are good that on("error") will work, but it might have different failure modes. I'm not sure once("socket") will work.

I'm well out of my depth here. I want to get this done because it is valuable to me and I'd like to contribute where I can, but I think coding and testing this is beyond me at this point.

mikeal/request follows redirects as well, so in a way it is closer to cURL in capabilities.

Hopefully my investigation of the possible fixes is useful here.

@AlainODea
Member

@trentm imgadm on joyent_20130125T031721Z used https_proxy when running import, but had a defect that caused it to use IP addresses instead of hostname's when doing the CONNECT through the proxy. That would be a major impediment to web filtering if that defect comes up in the fix for proxies in imgadm v2.

@trentm
Member
trentm commented Jan 12, 2015

This should be fixed in smartos-live#master now. I'm sorry this sat idle for so long.

@trentm trentm closed this Jan 12, 2015
@AlainODea
Member

Thanks @trentm ! This is great news.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment