Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: Replace Context with goroutine-local storage #21355

Closed
neild opened this issue Aug 8, 2017 · 25 comments
Closed

proposal: Replace Context with goroutine-local storage #21355

neild opened this issue Aug 8, 2017 · 25 comments

Comments

@neild
Copy link
Contributor

@neild neild commented Aug 8, 2017

This is a bit pie-in-the sky, but I think worth considering even if only to reject it.

One of the original motivations for the Context type was to provide a way to plumb distributed trace IDs (as used by e.g., Dapper: https://research.google.com/pubs/pub36356.html) though the call stack.

The Context type suffices to connect inbound and outbound RPCs by passing a trace ID from the inbound request handler to the outbound call, but it leaves significant gaps. In particular, distibuted profiling tools may need to associate trace IDs with specific goroutines in order to, for example, associate CPU time or memory allocations with specific inbound requests. This requires some form of goroutine tagging; e.g., http://golang.org/cl/43751.

Another aspect of Contexts is that for them to be useful, they must be pervasive. The entire purpose of a Context is to plumb values through layers of the call stack that don't care about them--e.g., to pass a cancellation signal or trace id through unrelated layers of code down to the place where a file is written or an RPC call made. This requires that almost every package become Context-aware. This is not, I think, a significant cost, but it is a cost. It is not difficult to find users complaining about the need to add Context parameters to their functions.

Given that goroutine tagging (a form of goroutine-local storage) may be necessary to support some profilers, should we consider replacing Context entirely with goroutine-local storage? Cancellation becomes a goroutine-local 'done' channel. Trace IDs are a goroutine-local value. Profile tags (if distinct from trace IDs) are just another value. The io.Reader/io.Writer interfaces can trivially support cancelation without modification. Existing packages can be retrofitted to support cancellation/value propagation without API changes.

A significant downside is that this would replace a fairly simple interface built from existing components with a language feature. I'm not convinced that this is an improvement. I think it's worth consideration, however.

@neild neild added Go2 Proposal labels Aug 8, 2017
@gopherbot gopherbot added this to the Proposal milestone Aug 8, 2017
@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Aug 8, 2017

We (well, @matloob) have put a fair amount of effort into making profiling tags work well with contexts.

In particular, the current approach provides mechanisms for running a goroutine with a new set of labels (this mechanism is unfortunately not very well documented at present). This proposal needs to explain how the go statement handles goroutine-local storage.

@neild
Copy link
Contributor Author

@neild neild commented Aug 9, 2017

A bit more concretely:

  • Each goroutine has, effectively, a local map[interface{}]interface{}.
  • New goroutines inherit the contents of the parent goroutine's local storage.
  • There is a mechanism for setting the local storage within a lexical scope. (e.g., local.Set(k, v, func() { /* k = v in this scope */ }.)

To run a goroutine with a specific set of labels rather than inheriting the current set, you would temporarily change the current goroutine's labels.

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Aug 9, 2017

For clarity: does a new goroutine inherit the map value, or does it inherit a copy of the map?

@neild
Copy link
Contributor Author

@neild neild commented Aug 9, 2017

A new goroutine would inherit a copy of the parent's map, not the actual map. (That wouldn't be very local.)

In simple cases, such as a function which starts some worker goroutines and waits for them to complete before returning, this should propagate cancellation or trace ids correctly with no effort on the user's part. More complex cases, such as long-running worker goroutines which service requests with their own cancellation, are of course still tricky.

@uluyol
Copy link
Contributor

@uluyol uluyol commented Aug 9, 2017

How would this work with worker pools? Consider systems which needs to do work in C or syscalls (e.g. write data to disk, RocksDB, sqlite). Right now you can bound your thread count by using a persistent worker pool. With context.Context, you can just send the context over the channel to keep tracing/logging info.

With GLS, it seems that you'd either need to have some way of copying this over to an existing goroutine, or you'd have to spawn a new goroutine per request and maintain some sort of counter to limit the number overall.

@neild
Copy link
Contributor Author

@neild neild commented Aug 9, 2017

Worker pools do need to have some way of associating work with a trace id/etc. With GLS, you would still need some Context-like object which can be passed to workers.

// Start a worker goroutine.
go func() {
  for req := range requestc {
    // Set the trace id, cancellation, etc. for this goroutine while processing this request.
    local.SetMap(req.ctx, func() {
      // Request processing goes here.
    })
  }
}()

This is more code for worker pools (although hopefully not much) in return for less in code which does not share a goroutine across multiple, independent operations.

Note that tracing systems which associate CPU time or allocations to specific requests still require that pool goroutines be temporarily associated with a particular trace tag. (c.f., http://golang.org/cl/43751) Arguably, using GLS makes it clearer to the authors of packages that use worker pools that this association may be necessary.

@carlmjohnson
Copy link
Contributor

@carlmjohnson carlmjohnson commented Aug 14, 2017

Merovius has a blog post arguing for GLS: https://blog.merovius.de/2017/08/14/why-context-value-matters-and-how-to-improve-it.html

It seems persuasive to me, but I also know that a lot of people have negative feelings about TLS and would not like Go to adopt a TLS-like mechanism.

I think if we follow Merovius's approach, we should make package-level variables GLS by default and use a keyword to make them non-sync shared mutable, which ISTM would be a rare thing to want.

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Aug 14, 2017

I don't understand the middleware argument in Merovius's blog post: why can't I use a function closure to pass my values through middleware?

Personally I think that dynamic scoping is a recipe for confusion. It's better to represent it explicitly rather than implicitly. Same for goroutine-local storage. Being explicit makes things very clear.

@Merovius
Copy link

@Merovius Merovius commented Aug 22, 2017

@ianlancetaylor I'm not sure what you mean with your question about the function closure. If I would, I'd be happy to try to explain what I'd see as a problem :) Say, for example, I want request-specific debug logging, e.g. using a X-Log-Verbosity: Debug header or the like. With context, an easy way to do that is (kind of handwaving an API here, try and focus on the essential ideas)

func SetVerbosity(h http.Handler) http.Handler {
    return http.HandlerFunc(res http.ResponseWriter, req *http.Request) {
        ctx := req.Context()
        ctx = context.WithValue(ctx, ctxVerbosity, parseVerbosity(req.Header.Get("X-Log-Verbosity")))
        req = req.WithContext(ctx)
        h(res, req)
    })
}

func Logf(ctx context.Context, v int, format string, vals ...interface{}) {
    if rv, _ := context.Value(ctxVerbosity).(int); rv >= v {
       log.Printf(format, vals...)
    }     
}

func main() {
    h := http.Handler(http.HandlerFunc(serve))
    h = SetVerbosity(h)
    http.ListenAndServe(":0", h)
}

func serve(res http.ResponseWriter, req *http.Request) {
    Logf(req.Context(), Debug, "Doing things!")
}

Now, how would this look without context? (You could argue, that the header-parsing doesn't have to happen in a middleware and could instead happen directly from a passed *http.Request in the handler, to which I'd respond that *http.Request.Header is just a string-based, http-specific version of context)

I don't understand how closures would help - to me, that would at best replace the question "how do I pass an int through an API" with "how do I pass a func() int through an API". I'm very much ready to be convinced otherwise, though.


And FTR, I am also not convinced there actually is much of a problem with context.Context - I tend to agree with Sameer here that most criticism of context seems to be opinion not really experience (that includes my article). I also tend to agree after some convincing, that the explicit nature of context is advantageous. I just wanted to separate out the different criticisms of context, taken at face value. And as much as possible distill its usefulness (though I probably did a bad job with that). Because I think depending on what you decide is bad about it, there might be alternative solutions.

@bcmills
Copy link
Member

@bcmills bcmills commented Aug 23, 2017

@uluyol

How would this work with worker pools? […] Right now you can bound your thread count by using a persistent worker pool.

With GLS, […] you'd have to spawn a new goroutine per request and maintain some sort of counter to limit the number overall.

In my experience, a semaphore tends produce a much cleaner API than a goroutine pool anyway: it does not require an explicit Close or Stop call to shut down background goroutines, and does not pollute goroutine dumps with idle workers in the event of a crash.

If the structure of context-passing discourages the "goroutine pool" pattern, I'd call that a feature, not a bug.

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Aug 23, 2017

@Merovius You're right, of course: if you want to pull a value from the HTTP request, and let that value influence the behavior of other functions invoked by the request, then you must pass that value to those functions in some way. You can pass the value in a Context, or you can pass down a function value (in your example a log function that invokes Logf with the right verbosity), but either way something must be passed explicitly.

I agree that goroutine-local variables avoid the need for explicit argument passing, but I also think they add considerable conceptual complexity.

@bcmills
Copy link
Member

@bcmills bcmills commented Aug 25, 2017

[F]or [Contexts] to be useful, they must be pervasive.

A somewhat less invasive way to reduce the syntactic overhead of Context would be to make it a built-in type (like error).

That would at least cut the stutter by a third:

-func Foo(ctx context.Context) error {
+func Foo(ctx context) error {
@bcmills
Copy link
Member

@bcmills bcmills commented Aug 25, 2017

An interesting side-effect of the current (explicit parameter) approach is that it is remarkably easy to leak a Context for a very long time by closing over it in (or passing it as an argument to) a go statement. Because the Context can contain arbitrary keys and values, the closure also references all of those keys and values, even if it only needs a small subset of them.

I think that the existence of context.Context as a map-like entity today may be encouraging us to think in terms of map-like solutions, when we really think about Context keys as something more like parametricity: the set of Context keys that a given function requires is the set of keys that it uses, plus the set of keys required by the things that it calls, minus the set of keys that it itself sets. (If you call a function that wants the "trace ID" variable, then you want the "trace ID" variable too.)

Ideally, those sets should be inferred at compile time. (Inferred, not declared: if there is one lesson we can learn from checked exceptions, it is that enumerating the set of things that you merely propagate is incredibly tedious.) Higher order functions and interfaces make that problem much more difficult, especially in the presence of type-assertions, but if we could crack the parametricity problem for context keys, then we could probably apply the same solution for error spaces to reap ~twice the benefit.

@as
Copy link
Contributor

@as as commented Dec 20, 2017

For this proposal, I don't see how it is possible for goroutine local storage to replace context. Perhaps it is at best an attempt to approximate process namespaces. But then those values need to be copied or shared and locked.

It's interesting how we are ok with explicit error propagation but simultaneously don't mind the though of a persistent per-goroutine request-scoped storage area. An observation follows: a go error is the way a callee informs the caller there was a problem, a context instead tells the callee there is a problem (i.e., timeout). In either case, a cancellation or error-propagation is a theme. With errors, error information can be dynamically ascertained because error is an interface. This is similar to context.Value, except that the information wants to flow in the reverse direction. In any case, It is not the process or goroutine that wants to communicate, but the request itself.

The one difference that bothers me is that context.Value is used not to provide contextual communication to the callee about why the request should be terminated (unlike error does for the caller) but mostly to pass extra arguments to the function (which is strange because we already have function parameters to solve the problem of giving functions parameters).

@neild
Copy link
Contributor Author

@neild neild commented Dec 20, 2017

Contexts pass values through intermediate layers of the call stack. Cancellation is one such value (in the form of the Done channel), but is not the only one and is not particularly different from any other value plumbed in this fashion.

The problem contexts aim to solve is not that of passing parameters to a function, but of passing parameters through a function that does not care about them.

@as
Copy link
Contributor

@as as commented Dec 20, 2017

The idea of to and through are neither mutually exclusive nor well-defined. Somewhere down the chain a callee will care about it, otherwise there wouldn't be a need to have it to begin with. The idea that control flow should have a state is not new, processes have environment blocks and threads have thread local storage on many systems (objects also have methods and member values). These systems also abstract away (poorly) the complexity of making access to those resources atomic. Go is not the type of language that will do this, since the communication and synchronization primitive is the channel: not a lock, context, parameter or object pointer. I don't see how goroutine local storage solves the problem of request scope, it rather seems like a step in the other direction: implicit state, untyped values, and a concurrency based on worker identity rather than communicating pipelines.

@Merovius
Copy link

@Merovius Merovius commented Dec 20, 2017

The idea of to and through are neither mutually exclusive nor well-defined.

I tried to give a detailed explanation of the problem in this comment. You might not be happy with the choice of words of "to" vs. "through", but please try to understand the difference they describe and work from there.

Go is not the type of language that will do this, since the communication and synchronization primitive is the channel: not a lock, context, parameter or object pointer.

That isn't really true, in practice, though. The usage of locks for synchronization is pretty extensive. I don't see why, once we would be able to place pointers in GLS, we couldn't just slap a mutex on whatever request-scoped mutable data we need, to make it concurrency-safe.

I don't see how goroutine local storage solves the problem of request scope

Basically, by a) storing the request-scoped data in GLS and b) inheriting the stored values (not the storage location itself, though) through go-statements. I gave a more extensive and detailed description of the idea in this post.

(note: I am not trying to argue in favor or against GLS here, just to keep the discussion problem-oriented and constructive)

@carlmjohnson
Copy link
Contributor

@carlmjohnson carlmjohnson commented Dec 27, 2017

The thing that strikes me is not so much that GLS is a good idea as that package-level variables without an automatic synchronization mechanism are a bad idea. With a PLV, you either need to initialize it at start up time and then never change it again, or you need to protect individual variables in an ad hoc manner with locks or a rule like only goroutine X is allowed to use the PLV. Moving to GLS for PLV would keep the initialize-then-don't-change pattern working and allow for a news pattern of initializing then having COW GLS for things like timeouts and request variables.

If removing PLVs as they exist now is too disruptive, perhaps there could be a new keyword instead of var to indicate that a given PLV should be treated as goroutine local. A keyword that many languages use is let, but I don't think it conveys the meaning very clearly. Maybe local (x, y int) for "goroutine local package level variables x and y are ints"?

@cflewis
Copy link
Member

@cflewis cflewis commented Dec 27, 2017

@as :

a go error is the way a callee informs the caller there was a problem, a context instead tells the callee there is a problem (i.e., timeout).

I can see what you're poking at, but I don't think this is the way devs think. When I see an error, I see Go telling me "OK, you've got an error, and I'm making them explicit because I want you to decide what you're going to do with it. You could log an error, you could crash out, you could make it mean false for some boolean check, knock yourself out."

The problem is that context is "You've got a context, and I'm going to force you to pass it down, because you have no idea what things below you exist and whether they are reliant on something in the context you got." Once contexts are introduced, there's no longer any agency for the developer to do anything but continue plumbing. That's why devs feel like they're working against the Go ideal of reducing code ceremony. However, hiding them altogether is in tension with the Go ideal of being explicit vs implicit, and that's why this problem continues to be difficult to solve.

@carlmjohnson : You'll have to forgive me as I'm a bit of a PL dummy, and I am trying to draw the line between your suggestion and what that does for context.Context. Are you suggesting that the context package be written with these locals and then a function like context.Current() would do The Right Thing for the calling goroutine?

@carlmjohnson
Copy link
Contributor

@carlmjohnson carlmjohnson commented Dec 29, 2017

Let's say you want to keep some metrics on a request. You might do:

package metrics

local (
    Start time.Time
    RequestID int
    Canceled chan struct{}
)

func MetricMiddleware(h http.Handler) http.Handler {
    return http.HandlerFunc(func (w http.ResponseWriter, r *http.Request) {
        // Set Start, etc.
        h.ServeHTTP(w, r)
    })
}

Since each request is run in its own goroutine, they each have a consistent view of the value of the locals. Any handler that wants to know what its request ID is can just look up metrics.RequestID, which is guaranteed to be concurrent safe and strongly typed. Handlers that don't know or care about metrics don't have to pass anything along to give to other handlers.

In this scenario, the context package would be pretty much obsolete, but there might be a new package like "timeouts" or something to replace that one aspect of context.

@pciet
Copy link
Contributor

@pciet pciet commented Jan 8, 2018

So far package context hasn’t been necessary for me to write a web app with client and database. net/http already has a CloseNotifier to trigger a cancellation and my application code handles that without also bringing along a generic map. Perhaps I’ll need to worry about cancelling database/sql requests later, but why isn’t that just a cancellation channel? All I’m seeing in a peek at the source is this pattern:

select {
default:
case <-ctx.Done():
    // the work done here varies
    return nil, ctx.Err()
}

To me context seems like application code creeping into the standard library; I understand that a lot of effort goes into making the uses of context work, but maybe that would be better served with specific solutions on a per-package basis in the standard library (since we’re talking about a Go 2 redesign). Perhaps there should be a standard library “package distributed” that provides standard library functionality for distributed Go applications that need the generic uses of context.

Anyway, having a map and map copy for every goroutine seems like a burden that may significantly lower the performance limit on Go concurrent programming. The lightweight nature of goroutines is an important feature of Go.

@Merovius
Copy link

@Merovius Merovius commented Jan 8, 2018

@pciet context.Context is not a map and it is not copied when passed. It is an immutable association list. Passing it is just as much work as passing any other interface value (i.e. copying two words). If you use GLS (not that I think it's necessarily a good idea, but that's what this issue is about), you'd get it even cheaper.

Secondly, the reason context.Context is in the stdlib is, that the stdlib needed it (AFAIK). And the advantage of having a generic, not application-dependent type is, that you can plug it into your API and automatically support all kinds of applications (think http middleware).

Feel free to read this post for an intro on what purpose it serves, (shameful plug) this post for why the alternatives don't work well and watch this and this for how it's implemented (i.e. not a map).

@pciet
Copy link
Contributor

@pciet pciet commented Jan 8, 2018

My only actual issue with context is seeing duplicated methods in database/sql.

When I first read that blog post awhile back there were two things that stood out to me that colored my perception of context up to now:

At Google, we require that Go programmers pass a Context parameter as the first argument to every function on the call path between incoming and outgoing requests.

I’ve seen this kind of pattern before and I don’t like it. My post above stems from wanting to keep this kind of requirement away from my Go applications.

Value(key interface{}) interface{}

Here’s a proposal I opened about the interface{} pattern with my complaints and idealistic requirements for Go 2: #23077

These patterns seem under-designed to me, but you’ve pointed out that context probably does have a place in the standard library. I should have said “context use in database/sql seems like application code creeping into the standard library”.

My comments about maps are in response to this:

A new goroutine would inherit a copy of the parent's map, not the actual map. (That wouldn't be very local.)

@bcmills
Copy link
Member

@bcmills bcmills commented Jan 8, 2018

Anyway, having a map and map copy for every goroutine seems like a burden that may significantly lower the performance limit on Go concurrent programming.

Copying an instance of Go's built-in map type can indeed be fairly expensive if the map is large.

However, the sort of map used by Context is fundamentally different, because the Context API does not provide any way to remove or modify existing entries. That makes it compatible with a functional-style immutable map, which would not need to be copied for every new goroutine (or new derived Context).

Unfortunately, a functional map would come with some downsides too: either slower reads or the need for some sort of compaction step (possibly a lazy one), and perhaps some subtlety around pinning stale entries during garbage collection (again addressable by compaction).

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 7, 2018

The explicit mechanisms we use now were chosen intentionally and seem to be working.

The proposal of using a map seems potentially expensive--the map contents have to be copied for each new goroutine creation, and there are many goroutines. Nothing prevents this map from being large. (The implementation of goroutine labels used for profiling (runtime/pprof.Do) was carefully implemented to avoid this potential problem.)

Being explicit in the code tends to be clearer than implicit. We aren't going to make this change. Closing.

@golang golang locked and limited conversation to collaborators Apr 12, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
10 participants
You can’t perform that action at this time.