Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: new HTTP client package #23707

Open
bradfitz opened this issue Feb 5, 2018 · 33 comments

Comments

@bradfitz
Copy link
Member

commented Feb 5, 2018

Tracking bug for higher-level more usable HTTP client functionality.

  • take contexts
  • more timeout control per-request, rather than per-transport/client
  • ...Options pattern?
  • decode into JSON/etc easily?
  • treat non-2xx as error (by default?).
  • slurp entire response from server (by default?)
  • control over max response size (some sane default limit when slurping is enabled)

I thought we had a bug for this, but maybe not.

@pciet

This comment has been minimized.

Copy link
Contributor

commented Feb 5, 2018

This is a pattern in a test client I wrote:

// or Post, PostForm
r, err := client.Get(addr)
if err != nil {
    panic(err)
}
// do some parsing here then discard the rest
_, err = io.Copy(ioutil.Discard, r.Body)
if err != nil {
    panic(err)
}
err = r.Body.Close()
if err != nil {
    panic(err)
}
if r.StatusCode != http.StatusOK {
    panic(r)
}

Are you looking at adding new identifiers?

type BodyParser func(io.Reader) error

// http.ErrBadStatus for non-2xx, discards unread response body, and closes the connection.
func (c *Client) ClosingGet(url string, resp BodyParser) error

// repeated for the other methods
var t MyStruct
err := c.ClosingGet(addr, http.JSONBodyParser(&t))
@bradfitz

This comment has been minimized.

Copy link
Member Author

commented Feb 5, 2018

Are you looking at adding new identifiers?

There would be new API for this, yes. I don't have a concrete proposal yet.

@awreece

This comment has been minimized.

Copy link
Contributor

commented Mar 26, 2018

We wrote a wrapper for the http client internally that uses the ...Options pattern and achieves a lot of these goals:

https://github.com/AlphaFlow/gohttp

GET with parameters:

cli := http.NewClient()

var ret map[string]interface{}
err := cli.Get(context.Background(),
    "http://example.com",
    http.WithParam("debug", "1),
    http.WithJSONResponse(&ret))

POST with body:

ctx, cancel := context.WithTimeout(context.Background(), 1*time.Second)
defer cancel()

body = map[string]string{
   "hello": "world",
}
err := cli.Post(ctx, "http://example.com",
    http.WithJSONBody(body),
    http.WithHeader("Authorization", "Bearer my-token"))
if bse, ok := err.(*http.BadStatusError); ok && bse.Code == 404 {
    fmt.Println(bse)
}

Mocking the HTTP request:

	return http.NewMockClient(func(ctx context.Context, req *http.Request) error {
		if e != nil {
			return e
		}
		if len(body) == 0 {
			return nil
		}
		if req.Output != nil {
			_, err := req.Output.Write(body)
			return err
		} else if req.JSONOutput != nil {
			return json.Unmarshal(body, req.JSONOutput)
		} else {
			panic("no place to write response")
		}
	})

This has greatly simplified our error handling/json marshaling.

Quirks

  • Mocking the client using this interface is a bit weird -- the mock needs to know an awful lot about how the code works. This could be simplified a bit, maybe.
  • We inherit standard go http client "follow redirect" idiom, when it would be nice for that to be an option.
@bradfitz bradfitz modified the milestones: Go1.11, inpl, Unplanned May 23, 2018
@bradfitz bradfitz changed the title net/http: add high-level HTTP client operations net/http: new HTTP client package Dec 3, 2018
@bradfitz

This comment has been minimized.

Copy link
Member Author

commented Dec 18, 2018

I've started working on this.

See:

I'll move a lot of that to golang.org/x/exp/http* at some point here.

@danp

This comment has been minimized.

Copy link
Contributor

commented Dec 18, 2018

Any problems with or changes to tracing to consider? Is https://golang.org/pkg/net/http/httptrace/ still the desired interface for tracing if there's a chance for change?

@bradfitz

This comment has been minimized.

Copy link
Member Author

commented Dec 18, 2018

@danp, thanks. I'll include a section about that. It's mostly okay as-is AFAIK and should ideally continue to work but if you have issues with it, do share.

@creker

This comment has been minimized.

Copy link

commented Dec 18, 2018

Maybe that would be a good opportunity to fix one thing that really bugs me. Right now if request fails due to context being cancelled, you get url.Error that wraps context.Cancelled error. It's very annoying when that error is deep inside the call stack wrapped multiple times with something like github.com/pkg/errors. I don't know what is the convention here but it looks to me that context.Cancelled should be returned directly in every call that accepts context without any wrappers.

And in general, it's weird that http package uses url.Error and why the latter is even in the url package. The package description says "Package url parses URLs and implements query escaping" and everything apart from url.Error does relate to URL parsing. Maybe http client should provide it's own error type, even if it's similar to already existing url.Error.

@joonas-fi

This comment has been minimized.

Copy link

commented Dec 19, 2018

Just a while ago I made a small facade w/ sane defaults whose varargs design may be of interest: https://github.com/function61/gokit/tree/master/ezhttp

edit: my idea seems to be quite similar to what @awreece already posted

Example:

res := MyResponseType{};
_, err := ezhttp.Get(
    ctx,
    "https://example.com/api/getstuff",
    ezhttp.AuthBearer("myToken"), // this and below are varargs "config mutators"
    ezhttp.RespondsJson(&res),
    ezhttp.TolerateNon2xxResponse)

More examples in test file

Mutators my current ezhttp implements:

  • Header(key, val)
  • Cookie(cookie)
  • AuthBasic()
  • AuthBearer() // this is common enough use case to justify presence in HTTP package IMO
  • SendJson(ref interface{})
  • SendBody(body io.Reader, contentType string) // for raw data
  • RespondsJson(ref interface{}, allowUnknownFields bool)
  • TolerateNon2xxResponse // by default errors if response not 2xx
@lpar

This comment has been minimized.

Copy link

commented Dec 19, 2018

I feel like Context as a mechanism for cancelling, and Context as a way to pass data along with a request, probably ought to be different things, not the same object/parameter. I'm not sure whether that's what is meant by noting that Request.WithContext is slow?

@bradfitz

This comment has been minimized.

Copy link
Member Author

commented Dec 20, 2018

@lpar, redesigning context is out of scope for this bug. In practice, you end up always needing both in big systems (for tracing), and passing two context-like things around everywhere is even grosser than passing one. So one it is. That doesn't affect the performance. WithContext is slow because of the defensive garbage it makes. If you could run a request with a context to begin with (Go 1 didn't accept one), then we wouldn't need WithContext.

@fgrosse

This comment has been minimized.

Copy link

commented Dec 20, 2018

While I see some examples for using options when sending the request, I often find that in my programs I want to configure those once at startup directly on the client.

Thats why we wrote https://github.com/classmarkets/cmhttp.

Example:

client := cmhttp.Decorate(
    http.DefaultClient,
    cmhttp.Scoped("https://api.example.com"),
    cmhttp.Typed("application/json"), // or cmhttp.JSON()
)

req, err := http.NewRequest("GET", "/v1/places/de.berlin", nil)
if err != nil {
    panic(err)
}

resp, err := client.Do(req)

Like the examples of @awreece and @joonas-fi this uses varargs to let the user pick the options they want but instead of doing this on a per request basis this decorates the client. Just thought I might share that here for more inspiration for your redesign :)

@choonkeat

This comment has been minimized.

Copy link
Contributor

commented Dec 20, 2018

Are we open to consider respecting proxy env by default?

@bradfitz

This comment has been minimized.

Copy link
Member Author

commented Dec 20, 2018

Are we open to consider respecting proxy env by default?

Seems reasonable.

@bcmills

This comment has been minimized.

Copy link
Member

commented Dec 20, 2018

The magic-method interfaces (Flusher, Hijacker, Pusher, etc.) are pretty unpleasant: they make the documentation a lot harder to navigate and understand, and make semantically-correct wrappers nearly impossible (you have to write O(2ⁿ) wrapper types for n extension interfaces).

(See also #21904 and #16474.)

Ideally we should not only fold in the existing magic-method interfaces, but also have a plan to avoid the need for them going forward. (Perhaps use structs of functions instead?)

@djui

This comment has been minimized.

Copy link

commented Dec 20, 2018

I share many points enumerated in the Problems document and would like to drill a bit more into the point Client vs. Transport distinction:

The current separation between Request and Client regularly leads to modifying per-request scope properties (e.g. query parameters, headers, body, ...) (using a builder pattern with late/lazy error handling would be nice) but also modifying client scope properties (e.g. base url for single host API endpoints, user-agent, timeouts, outgoing request and incoming response logging, request id or tracing span tracking, API token and refresh logic, retry logic, ...). All of these client modifications have to be done in a nested way, similar to a server's middleware chaining, using the Transport while being careful to not mix Transport struct and RoundTripper interface.

Is the idea of the Handler to solve these client modifications or maybe even combining request and client modifications?

As a real-world example (Experience Reports?), we use the following pattern for client/Transport middleware:

func NewClient(logger *log.Logger, opts ...TransportOption) *http.Client {
	c := DefaultClient()

	c.Transport = NewTransport(opts...)
	c.Transport = NewLoggingRoundTripper(c.Transport, logger)
	c.Transport = NewSpanIDRoundTripper(c.Transport)

	return c
}

// DefaultClient returns a new http.Client with similar default values to
// http.Client, but with a non-shared Transport, idle connections disabled, and
// keepalives disabled.
func DefaultClient() *http.Client {
	return &http.Client{
		Transport: DefaultTransport(),
	}
}

And the following pattern for Request building:

func NewClient(httpClient *http.Client, opts ...RequestOption) *Client {
	return &Client{
		HTTPClient:  httpClient,
		RequestOpts: opts,
	}
}

func (c *Client) WithOptions(opts ...RequestOption) *Client {
	c.RequestOpts = append(c.RequestOpts, opts...)
	return c
}

func (c *Client) WithContext(ctx context.Context) *Client {
	c.RequestOpts = append(c.RequestOpts, WithContext(ctx))
	return c
}

func (c *Client) WithBaseURL(u string) *Client {
	c.RequestOpts = append(c.RequestOpts, WithBaseURL(u))
	return c
}

func (c *Client) WithHeader(header, value string) *Client {
	c.RequestOpts = append(c.RequestOpts, WithHeader(header, value))
	return c
}

func (c *Client) Do(req *Request, opts ...RequestOption) (*http.Response, error) {
	if c.err != nil { // abort if already in error state
		return nil, c.err
	}

	oo := c.RequestOpts
	oo = append(oo, req.Opts...)
	oo = append(oo, opts...)

	for _, o := range oo {
		if req.err != nil { // abort if already in error state
			return nil, req.err
		}

		o(req)
	}

	if req.err != nil { // abort if already in error state
		return nil, req.err
	}

	return c.HTTPClient.Do(req.Request)
}
@EricFortin

This comment has been minimized.

Copy link

commented Dec 21, 2018

We also use a lot the customisation of Transport much like @djui. We also use this to implement caching and client instrumentation for example. I do agree that better control over single request is needed, especially for timeout but I think having a way to configure client(I am not necessarily attached to RoundTripper though) should be kept in the final design.

@bradfitz

This comment has been minimized.

Copy link
Member Author

commented Dec 21, 2018

@bcmills, I agree, but this is a bug about the HTTP client, not server. I've pushed back on optional interfaces for some time now in all packages, though, and don't plan on using them here.

@szuecs

This comment has been minimized.

Copy link

commented Dec 21, 2018

Dear @bradfitz my wishlist:

  1. current internal net and net/http should expose an API to be able to build your own connection pool and connection handling and put into a new client/server as configuration similar to the current transport for example. Use case: replacing the requests queue with a stack for example
  2. exported error types, instead of return fmt.Errorf(..)
  3. document the net error Temporary() and Timeout(), such that you can be sure how to find out a connection error on connection establishment from one that happened after you wrote HTTP data to the socket. This is quite important for retries in proxy servers.
  4. a working h2c implementation, that can be easily enabled, because often you internally don't need the complexity of a rotation TLS infrastructure which is a beast in itself. I tried to use internal/directory to hack around this, but it's quite complex to do if even possible.

I know some of them might not be httpclient related, but if you can fix or enable some of them while changing httpclient would be really nice.

@carlmjohnson

This comment has been minimized.

Copy link
Contributor

commented Dec 21, 2018

A feature of Python's requests and some Ruby http clients I have seen is to set up authentication once and specify a base URL to combine with paths (e.g. for different host endpoints for stage vs. prod). I don't think these should be direct features of this library, but it should be designed so that it's easy to wrap the client with something modifies the requests before sending them. With that, a lot of API client libraries out there could be even thinner wrappers than they are now, and just provide a set of JSON structs and a function that takes authentication keys and returns a working httpclient.

@shuLhan

This comment has been minimized.

Copy link
Contributor

commented Dec 22, 2018

What I don't understand from http client is why each request must have full URL with domain name?

Why not,

cl, err := http.NewClient("http://example.com")

rawsock, err := cl.Connect(addr)
res, err := cl.Get("/api", qparams)
res, err = cl.Post("/api/post?k=v", nil, body)

// And so on...

?

@djui

This comment has been minimized.

Copy link

commented Dec 22, 2018

@shuLhan Your example seems to hint at a single-host client, which I would consider a specialization, to which optimizations can be done, especially regarding connection pooling. The provided HTTP should be generic and stateless between requests, imho.

@szuecs

This comment has been minimized.

Copy link

commented Dec 22, 2018

@djui true, but it could also be working on the return value of Connect() and hint to some general optimization that could be done in protocol aware connections, for example Keep-Alive in case of http/1.1. Not sure if it makes sense or I miss something else here.

@anacrolix

This comment has been minimized.

Copy link
Contributor

commented Dec 23, 2018

Difficulty with magic methods and lack of h2c are big thorns with the existing implementation. Cleaner interfaces and support for server handler middleware would also be good.

@ghost

This comment has been minimized.

Copy link

commented Jan 8, 2019

Please provide a way to write to the request body. One example of where this is useful is writing a multipart request body using large files to stored on disk. It's possible to get a writer on the request body using io.Pipe and a goroutine, but it would be nice of the client handled these details or if the client plumbed a writer more directly to the network.

Here's a proposed API for the bradfitz/exp-httpclient:

// BodyFunc specifies the request body using a function.
// The HTTP client calls the function to write the request
// body to an io.Writer. The function may be called 0, 1
// or multiple times.
func (r *Request) BodyFunc(fn func(io.Writer) error) *Request {
	panic("TODO")
	return r
}

@abdullah2993

This comment has been minimized.

Copy link

commented Jan 8, 2019

better multipart support out of the box

@shuLhan

This comment has been minimized.

Copy link
Contributor

commented Jan 8, 2019

@shuLhan Your example seems to hint at a single-host client, which I would consider a specialization, to which optimizations can be done, especially regarding connection pooling.

Please forgive my ignorance, but is not that what a client should do? Unless we are writing a browser, I can't think of any use case where one client can or should handle connection to several hosts (except redirection).

From my point of view, which I am not familiar with the underlying HTTP client code, there are two layer of abstraction that the current HTTP client trying to handle: client layer and connection pooling/management layer; and IMO they should be separated. Case in example: in database, there is an API to create single client connection and there is an API to create connection pooling. Which one user want, its depends on the use case.

The provided HTTP should be generic and stateless between requests, imho.

generic. There are eight HTTP methods since RFC7231 (without PATCH) and I don't think it will be changed in the next ten years. The only thinks that dynamic in client is request query parameter, headers, and body; but, as bradfitz already mentioned, the Request object contains both fields required by client and server which cost allocated struct size when working only on client.

stateless between requests. I am not sure what does that means, since HTTP is build on top of TCP and current HTTP client is synchronous.

@djui

This comment has been minimized.

Copy link

commented Jan 8, 2019

@shuLhan I guess I was thinking of other use cases such as scrapers or simpler programs/scripts.

@carlmjohnson

This comment has been minimized.

Copy link
Contributor

commented Jan 8, 2019

Not sure if the httpclient is the right way to address this, but if you just spawn a bunch of goroutines to do a bunch of HTTP requests, you can easily eat through your file descriptor limit and crash your machine. Would a concurrency limit be appropriate in a the new httpclient? If not, how could the client make it easy to add a semaphore wrapper?

@creker

This comment has been minimized.

Copy link

commented Jan 8, 2019

@carlmjohnson isn't existing http client already gives you all the necessary dials? You can limit keep-alive and concurrent connections to prevent file descriptor exhaustion.

@carlmjohnson

This comment has been minimized.

Copy link
Contributor

commented Jan 8, 2019

You can set DefaultTransport.MaxConnsPerHost, but AFAIK, there's no way to set a limit if you're scraping multiple hosts. I could be wrong.

@szuecs

This comment has been minimized.

Copy link

commented Jan 10, 2019

@shuLhan Your example seems to hint at a single-host client, which I would consider a specialization, to which optimizations can be done, especially regarding connection pooling.

Please forgive my ignorance, but is not that what a client should do? Unless we are writing a browser, I can't think of any use case where one client can or should handle connection to several hosts (except redirection).

@shuLhan
Think about a reverse proxy case that handles multiple backends.
Of course you don't want to have a client for each backend and if it would make sense, then you would need to fix #23427 or we would have to have for each different http client a goroutine to make DNS requests happen after a while. I am also not sure about the memory footprint of a http client, which could increase of running a reverse proxy in Go.

@szuecs

This comment has been minimized.

Copy link

commented Feb 20, 2019

I just had another idea for the http client usage.
What about a usage like this:

response, err := http.Config({}).Client({}).Request({}).Do({})
cfg := http.Config({Version: http.Version11, Transport: {}, ConnectionPool: {},...})
// or use a default and override
cfg2 := http.DefaultConfig()
cfg2.Transport = ...
cfg2.Client({})

http.Client({ResponseTimeout: 3*time.Second, ...})
cli := http.DefaultClient()
cli.ResponseTimeout = 5*time.Second

Do(
     OnData: func() {},
      ...handlers...
) Response, error 

Maybe handlers to help pipelining large response bodies, that would be efficient in memory handling. Think of a list of Pods and you need only a fraction of data, in your data collection to process in your program. We could easily enable users to use json Encode/Decode, and not read the full Body and use Marshal/Unmarshal.

Error handling would need to be done with returning nil and check nil on itself and do some noop if it's nil:

// http.Config({}).Client({}).Request({}).Do({})
func Config(cfg *Cfg) {} *Cfg {
    if cfg == nil {
        return &DefaultConfig
   }
}

func (cfg *Cfg) Client(cli *Client) *Client {
   if cfg == nil { // this would be done for all other following func receivers to do "error checking"
        return nil
   }
   ... // default cli..
}

...
@szuecs

This comment has been minimized.

Copy link

commented Mar 19, 2019

One more thing, I realized today:
I would like to have client connection pool data accessible for observability. For example:
How many idle connections do we have for a host and in general?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
You can’t perform that action at this time.