Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discuss: should we provide guidance on function arguments being functions #92

Closed
jeffbean opened this issue Jun 1, 2020 · 9 comments
Closed

Comments

@jeffbean
Copy link

jeffbean commented Jun 1, 2020

As it's not specifically called out I am looking to discuss adding our opinion (if any) around function parameters that are not variables or constants.

My motivation for failing this issue and want others' feedback was while doing some simple channel read loop, it occurred to me that we can do a blocking read as a function parameter. My opinion is the difference between <-event and event in this use case has a massive impact on behavior.

handleEvent(<-event)
update := <-event
handleEvent(update)
handleEvent(getEvent())

My opinion is that the first case should be avoided. As for the others, I'm not sure I have a strong opinion.

I understand this is specific to a channel in this example, and guidance ideally would not be specific to channels, but I also acknowledge that my point of <- being the issue is specific to channels.

@mway
Copy link
Member

mway commented Jun 2, 2020

I agree with your instincts about which of these examples are good/bad.

To rephrase (and correct me if I'm not properly capturing the intent), we'd essentially be recommending that folks prefer named or intermediate variables over temporary, unnamed values, because - ostensibly - the language, type, and/or API semantics for the types being passed aren't obvious, and explicitly using variables makes things easier to understand and harder to misinterpret.

If we extrapolate from this, we could change it to mean "any potentially blocking call to compute or retrieve a value should not be invoked such as it could be conflated with another blocking call at the same callsite". This could be a combination of channels, functions, and defer:

func probablyAnExpensiveFunc() string {
  // assume some likely runtime cost
  return "foo"
}

func maybeAnExpensiveFunc(s string) string {
  // assume some potential runtime cost
  return s + "bar"
}

// Is the channel send blocking, or is evaluating the function call blocking?
ch <- probablyAnExpensiveFunc()

// What is blocking here?
ch <- maybeAnExpensiveFunc(probablyAnExpensiveFunc())

// What is blocking here?
doSomething(<-ch, probablyAnExpensiveFunc(), maybeAnExpensiveFunc())

// Will users of defer be surprised?
defer doSomething(<-ch, probablyAnExpensiveFunc(), maybeAnExpensiveFunc())

// How about here?
defer func() {
  doSomething(<-ch, probablyAnExpensiveFunc(), maybeAnExpensiveFunc())
}

So on one hand, maybe we have opinions about where the language is unclear and could be augmented with basic guidance. On the other hand, this is not unique to Go: one could argue this is a "generic consideration when writing software", like many runtime concerns or using good variable names. Where do we draw the line? Is any potentially-obfuscated blocking bad? Is it even possible to remove all ambiguity, and if not, is it worth addressing any of it?

// Is this measurably better?
var (
  chanString  = <-ch
  probString  = probablyAnExpensiveFunc()
  maybeString = maybeAnExpensiveFunc()
)

doSomething(chanString, probString, maybeString)

// What about this?
var (
  probString  = probablyAnExpensiveFunc()
  maybeString = maybeAnExpensiveFunc(probString)
)

ch <- maybeString

As a devil's advocate argument, while I'd generally agree that the above are "better" from a clarity perspective, how much less error-prone are they in reality? In either case, is the net result different? Executing code takes the amount of time it takes, so are we "preventing" anything other than incorrect assumptions?

Right now, I'm split between "this is a code review problem" and "we should draft very generic guidance". We do have plenty of guidance already in the vein of "don't shoot yourself in the foot", but also want to be careful not to become unnecessarily prescriptive. The only guidance I think we could give would be something like "avoid ambiguous or 'shadowed' callsite blocking", which could use parameter and/or operand evaluation (function parameters, channel sends, and probably defer) to illustrate the problem, much like you originally showed.

Interested in other folks' thoughts. @abhinav, @prashantv?

@prashantv
Copy link
Contributor

prashantv commented Jun 2, 2020

I had a bit of an offline discussion with @jeffbean initially, where we first considered:

Specific guidance to discourage use of channel operations as function arguments, e.g., handleEvent(<-eventCh)

  • <- can be easy to miss, especially when it's part of a larger function call with many other arguments. In the single argument case, I'd actually prefer handleEvent(<-eventCh) than an unnecessary intermediate variable. However, I'd prefer to separate it it out if it's part of a larger call like handleEvent(logger, e.getCaller(), getTime(), <-eventCh) where it's easier to miss.
  • It seems inconsistent to have guidance only for channels, and not for other expressions that may block (e.g., handleEvent(getEvent()) where getEvent has the same channel operation.

To avoid the inconsistency, we could make it more general: avoid any complex expressions as function arguments.

However, I think that advice has too broad of an impact, and isn't necessarily useful. -- e.g., net.Dial("tcp", ln.Addr().String()) is an example where the 2nd argument has multiple function calls, but I don't think we want to force the user to make it more verbose.

What I think we want to avoid is complex expressions where the blocking argument is easy to miss, or there's multiple blocking operations like handle(<-c1, <-c2, <-c3) since that can:

  • make it harder to understand ordering (relies on understanding order of argument evaluation)
  • when debugging via /debug/pprof/goroutines, not provide visibility into which channel is blocking (since they all are on the same line).

@mway
Copy link
Member

mway commented Jun 2, 2020

Agree. I think the original phrasing of

avoid ambiguous or 'shadowed' callsite blocking

or something like

avoid using potentially blocking sub-expressions

is probably both specific enough to a class of pitfalls and generic enough to not be rigid or over-prescriptive. (Something similar to that phrasing is also flexible enough to apply to both functions and channels since it just describes blocking rather than its cause.)

This doesn't address whether we should add guidance for this, though. Agreed that it's probably not valuable to make a special callout for channels, but if we feel that "nested blocking" is a prevalent enough issue, then what we're discussing makes sense. I haven't seen this pop up as an issue in many (if any) code reviews/bugfixes, for what it's worth, but that's only one datapoint.

@jeffbean
Copy link
Author

jeffbean commented Jun 2, 2020

I had not considered the debugging aspect or argument ordering. This personally will encourage me to avoid this pattern.

@mway
Copy link
Member

mway commented Jun 2, 2020

After some internal discussion, the general thinking is that both (1) channel operations and (2) function call semantics are unambiguous enough that it should be reasonably clear, in all cases, that such expressions have the potential to block; therefore, we'll opt to lean on code reviewers' judgment rather than adding specific guidance here.

@mway mway closed this as completed Jun 2, 2020
@jeffbean
Copy link
Author

jeffbean commented Jun 2, 2020

thanks for the input all!

@peterbourgon
Copy link

peterbourgon commented Dec 30, 2021

Wound up here from a private discussion, so apologies for the thread necromancy 🧟

I like this guidance, and tend to express it slightly differently: each line of source code should contain only a single operation, where operation is basically shorthand for anything that can block, as identified above: function calls, chan sends or recvs, etc. It's a simple rule that's easy to follow and tends to make code easier to audit. For example,

// Avoid
val, err := f(<-c, someFunc(), fmt.Sprintf("prefix/%s", ns))

// Prefer
var (
    foo = <-c
    bar = someFunc()
    msg = fmt.Sprintf("prefix/%s", ns)
)
val, err := f(foo, bar, msg)

@prashantv
Copy link
Contributor

I'm generally for keeping each line relatively simple, but in trivial cases, such as net.Dial("tcp", ln.Addr().String()), is there much of a benefit to the multi-line approach (which adds a new variable to the scope):

lnAddrStr := ln.Addr().String()
conn, err := net.Dial("tcp", lnAddrStr)`

I prefer the one-line expression, since the ln.Addr().String() expression is simple to follow, but can imagine cases where the function calls are much more complex and make it difficult to follow.

@peterbourgon
Copy link

peterbourgon commented Dec 30, 2021

Sure! It's guidance, not an inviolable rule, and there are plenty of cases where it's probably preferable to inline function calls. Yours is a reasonable example of one of those cases, and — absent more context — the original form is probably just fine. (We can reasonably assume Addr and String will not block.) But I can definitely think of situations where an exploded version would be beneficial! And I'd probably do it like

var (
    netw = "tcp"
    addr = ln.Addr().String()
)
conn, err := net.Dial(netw, addr)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

4 participants