Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spec: add clear(x) builtin, to clear map, zero content of slice #56351

Closed
rsc opened this issue Oct 20, 2022 · 110 comments
Closed

spec: add clear(x) builtin, to clear map, zero content of slice #56351

rsc opened this issue Oct 20, 2022 · 110 comments

Comments

@rsc
Copy link
Contributor

rsc commented Oct 20, 2022

There is no way to clear a map in Go. You can write

for k := range m {
    delete(m, k)
}

but that only works if m does not contain any key values that contain NaNs.

Based on the discussion in #55002, we suggest adding delete(m) to the language to clear the map, always (even if it contains NaNs).

This is somewhat a reopening of #45328, which we closed because the loop was good enough, but we missed the earlier discussion in that issue of NaNs.

Adding delete(m) would then let us also add a similar mechanism in reflect.
We wouldn't need to add maps.Clear(m) since delete(m) is just as good.

@rsc
Copy link
Contributor Author

rsc commented Oct 20, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@atdiar
Copy link

atdiar commented Oct 20, 2022

Personally, I'm ok with it but can't determine if it is good enough.

Having a way to disallow NaN containing values in maps would be best. Then map clearing via iteration just works and the change in this proposal is unnecessary (if not for perf reasons).

Otherwise, comparing two maps may still have edge cases. (the crux of the issue is that comparison is actually underspecified it seems).
Notably relevant for Set types that would be implemented by using maps.

With this proposal, map clearing gets solved but not inadvertent comparison of potential NaN composites.
If deemed a serious issue, I'm a bit concerned that it could actually hide a programming mistake instead.

Other than that, this proposal should be semantically sound at least.

@robpike
Copy link
Contributor

robpike commented Oct 20, 2022

I tend to agree that this is really only about NaNs, and is a peculiar response to their presence. At least, if you want to justify a new predefined function with a new capability, making NaNs the selling point is off-key (see what I did there?).

@earthboundkid
Copy link
Contributor

I worry that this could lead to programmer error when clearing map[string]map[string]T or map[string]any.

@timothy-king
Copy link
Contributor

timothy-king commented Oct 20, 2022

With delete(m), will it be clear whether this is releasing memory of the underlying map?

  • If it does release the memory, it is equivalent to if m != nil { m = make(m[K]V) }. (But maybe I am not thinking through aliasing enough?) [edit: I was not thinking through aliasing enough.]
  • If it does not release the memory, will users assume that it does and misuse it? My guess is yes. So if this goes forward with this semantics, maybe document that in the spec?

@bcmills
Copy link
Contributor

bcmills commented Oct 20, 2022

@atdiar

If deemed a serious issue, I'm a bit concerned that it could actually hide a programming mistake instead.

I think this proposal strictly improves the situation for irreflexive keys: if you use it it does the right thing for clearing the map, and if you don't use it you're no worse off than if it never existed.

That said, this still wouldn't help use-cases that need to add and remove individual keys: delete(m) would only give an all-or-nothing approach, where in order to delete an irreflexive key you must burn the rest of the map to the ground along with it.


The programming mistake I'm more worried about is typos. If both delete(m, k) and delete(m) are defined, then I could inadvertently write delete(m) when I meant delete(m, k), and they're visually similar enough that the bug could be difficult to spot. 😅

@randall77
Copy link
Contributor

With delete(m), will it be clear whether this is releasing memory of the underlying map?

I would assume this does not release the memory, just as the delete-with-a-for-loop doesn't.

If we ever implement #20135 the map will shrink eventually if it is never refilled.

@Merovius
Copy link
Contributor

@carlmjohnson To be clear, what's the programming error? Both of these would work fine ISTM. For the first I can see that perhaps a programmer might've thought it deletes the inner maps (though that seems a bit far-fetched to me). But the second?

@timothy-king

If it does release the memory, it is equivalent to if m != nil { m = make(m[K]V) }. (But maybe I am not thinking through aliasing enough?)

You are not. func Clear[K , V](m map[K, V]) { m = make(m[K]V) } is a NOP, but func Clear[K, V](m map[K, V]) { delete(m) } is not.

If it does not release the memory, will users assume that it does and misuse it? My guess is yes.

What's the misuse? Semantics don't change. I am opposed to specify allocation behavior. That's an optimization that's up to implementers.

@earthboundkid
Copy link
Contributor

earthboundkid commented Oct 20, 2022

@carlmjohnson To be clear, what's the programming error?

What @bcmills said, a typo of delete(m, k) to delete(m) that is easy to overlook. You mean delete(m, "key") but write delete(m["key"]) by mistake, so m["key"] still exists but is blank.

@randall77
Copy link
Contributor

a typo of delete(m, k) to delete(m) that is easy to overlook.

I'm not too worried about this. For one, any reasonable test would catch this mistake. Second, you're not actually using k, which the compiler might just catch for you (e.g. if the previous line was k := ... and no other use of k appears later). Third, this is not a problem unique to this case. Forgetting an argument can happen in all manner of ...-arg'ed functions: fmt.Printf("%d"), append(s), path.Join(p), and so on.

@dheerajdlalwani

This comment was marked as resolved.

@ianlancetaylor

This comment was marked as resolved.

@ulikunitz
Copy link
Contributor

While I understand the motivation to reuse a predeclared function name here, I'm concerned because delete(m) doesn't actually delete the map but removes all entries. I understand that clear(m) would create backward compatibility issues, but a maps.Clear(m), reflect.ClearMap(m) or a runtime.ClearMap(m) would express the intention explicitly.

@Merovius
Copy link
Contributor

I understand that clear(m) would create backward compatibility issues

It wouldn't. All the builtin functions are predeclared identifiers. Adding new predeclared identifiers is backwards compatible - cf. comparable.

That's not saying we should (I have no strong feelings), but this, at least, is not a reason not to do it.

@blackgreen100
Copy link

blackgreen100 commented Oct 21, 2022

@bcmills

I think this proposal strictly improves the situation for irreflexive keys

at the cost of having an overloaded built-in function that behaves differently based on the number of arguments.

It's less user-friendly and more cognitively demanding to think about language features in terms of exceptions and/or gotchas, instead of just relying on different functions doing different things.

Yes, the language already has overloaded built-ins e.g. make but I feel the use case in this proposal is subtle enough to warrant a new built-in with a different name, e.g. clear(m) or reset(m).

Moreover, this would improve clarity if the new built-in went a step further and do additional things that delete(m, k) might or might not do, like shrinking the map (not sure if this is in scope for this proposal, though)

@DmitriyMV
Copy link
Contributor

DmitriyMV commented Oct 21, 2022

If #20135 is implemented what would be the actual difference between

delete(m)

and

m = nil // this is not actually needed
m = make(map[type1]type2)

?

@thepudds
Copy link
Contributor

thepudds commented Oct 21, 2022

If this proposal is adopted as delete(m), and if #20135 is implemented without any new API, then it would depend on the heuristic(s) adopted in #20135, but I would guess delete(m) would clear the elements but leave the capacity.

There is some related conversation in a different issue #54454, including this comment from Keith in #54454 (comment)

We probably don't want to shrink if people do the map clear idiom + reinsert a new set of stuff. Perhaps we only start shrinking after 2N operations, so that N delete + N add involves no shrinking. There's definitely a tradeoff here between responsiveness when you use fewer entries and extra work that needs to be done to regrow.

Perhaps #20135 and this proposal could be solved together (resize? remake?) or at least briefly considered together. (FWIW, I think some solution to #20135 probably has broader impact than this issue, though that probably is a better discussion for elsewhere).

@randall77
Copy link
Contributor

If #20135 is implemented what would be the actual difference between

delete(m)
and

m = nil // this is not actually needed
m = make(map[type1]type2)
?

The difference comes down to aliasing. What happens if other variables have a reference to the same map? See #56351 (comment)

@natefinch
Copy link
Contributor

I really don't like that delete(m) does not, in fact, delete m. Please don't take this shortcut. It makes the code lie, and that is a grievous offense for something that would be built into the language.

Personally, I think that it's a bug that you can have values that aren't equatable as map keys, and changing the language to better handle problems caused by that bug is not a good idea. I understand we can't change the fact that NaN is a valid map key, but we don't have to perpetuate the error by adding features to make it more usable.

My first test for most languages features is "is this just trying to avoid writing a loop or if statement?" and if it is, I put it loooow on the list. If we ignore the NaN edge case, that's all this is.

I'm not sure I've ever cleared a map in production code (though I do understand the uses). I am pretty sure I've never used floats as keys in a map. Adding a language feature for a tiny edge case based on a bug seems like an easy "no".

That being said, if people reeeealllly want this, maybe make it delete(m, _) instead? At least then it's kinda clear you're not deleting m itself, and you're not having one method that changes behavior based on how many arguments you give it.

@Merovius
Copy link
Contributor

My first test for most languages features is "is this just trying to avoid writing a loop or if statement?" and if it is, I put it loooow on the list. If we ignore the NaN edge case, that's all this is.

FWIW one of the reasons this is being asked for now is that I wished to have a method in reflect which efficiently (!) clears a map. One of the main reasons for not adding that method to reflect is that reflect should not allow doing something the language can't do either.

So, for me, this isn't really about the syntactical overhead and effort of writing a loop. It's about being able to clear a map from reflect at all, without having to pay the allocation to iterate it first and then pay the allocations for all of its storage when growing it again.

I agree that it would be better if we could've done something about irreflexive keys before Go 1. But to me, the reason for this isn't really to be able to clear them out. That's such an edge-case, it doesn't even factor into the calculation for me. As I said here, I'd even want something like this if delete(m) didn't remove irreflexive keys. Though, TBF, maps.Clear couldn't use the fallback I could do with reflect.

@ianlancetaylor
Copy link
Contributor

I want to note that the proposed delete(m) is precisely analogous to the existing os/signal.Notify. Notify(ch, sig) notifies for the signal sig, while Notify(ch) notifies for all signals. Similarly, delete(m, k) deletes k from m, while delete(m) deletes all k values from m.

That analogy suggests that perhaps we should permit delete(m, k1, k2) to delete both k1 and k2 from the map.

@natefinch
Copy link
Contributor

I'd vastly prefer clear(m) to delete(m) if we need it for optimization's sake. Then it's not a misleading name, it matches the other Clear methods, there's no confusion about how many arguments it takes, and it's much more clear what it does. Heck, we could even allow it to also take in a slice and zero out the memory of its backing array. There's no easy way to do that right now, either.

@atdiar
Copy link

atdiar commented Oct 21, 2022

@ianlancetaylor

I think I can understand why people would like to dissociate map clearing and delete.

In one case, everything is deleted regardless of the key.

In the other, NaN containing keys would not be deleted.

Is this an issue?

@bcmills
Copy link
Contributor

bcmills commented Oct 21, 2022

It seems to me that:

  1. We would not care about this form of delete if it were possible to delete irreflexive keys otherwise.
  2. Deleting irreflexive keys from a map is a very niche use-case, because putting irreflexive keys in a map in the first place is a very niche use-case.

Given those, I wonder if it would suffice to define delete as the inverse of m[k] = v. That is, instead of having delete(m, k) remove all keys equal to k, we could have it delete all map entries that could have been created by assigning to m[k].

That would change the behavior of a tiny number of existing programs that both add irreflexive keys to maps and use delete with those keys. My hypothesis is that essentially every such program is already broken to begin with, and redefining delete in this way would strictly reduce their degree of brokenness.

@jimmyfrasche
Copy link
Member

@ianlancetaylor if you could delete multiple keys delete(m, keys...) would be useful but it would also be dangerous if len(keys) == 0 cleared the map.

@timothy-king
Copy link
Contributor

If it does not release the memory, will users assume that it does and misuse it? My guess is yes.

What's the misuse? Semantics don't change. I am opposed to specify allocation behavior. That's an optimization that's up to implementers.

@Merovius Say they have a long lived struct and don't need a map field anymore so they want to free the underlying memory. The concern is a user writes delete(x.m) instead of x.m = nil here and is consuming more memory than they intended. My point was that the documentation for delete just needs to be written in a way that that users should not expect to get the memory back. If this can be done without specifying the allocation behavior, I agree that that would be preferable. I am just not sure how.

johanbrandhorst pushed a commit to Pryz/go that referenced this issue Feb 22, 2023
Fixes golang#56351

Change-Id: Ia87bf594553b7d0464b591106840f849571c5f39
Reviewed-on: https://go-review.googlesource.com/c/go/+/467755
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Bypass: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/481935 mentions this issue: runtime: mark map bucket slots as empty during map clear

gopherbot pushed a commit that referenced this issue Apr 8, 2023
So iterators that are in progress can know entries have been deleted and
terminate the iterator properly.

Update #55002
Update #56351
Fixes #59411

Change-Id: I924f16a00fe4ed6564f730a677348a6011d3fb67
Reviewed-on: https://go-review.googlesource.com/c/go/+/481935
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/498755 mentions this issue: doc/go1.21: document clear builtin and init order changes

@dmitshur dmitshur modified the milestones: Backlog, Go1.21 May 27, 2023
gopherbot pushed a commit that referenced this issue May 27, 2023
Also move all the language changes to the same part of the release notes.

For #56351
For #57411

Change-Id: Id1c51b5eb8f7d85e61a2ae44ee7d73bb13036631
Reviewed-on: https://go-review.googlesource.com/c/go/+/498755
TryBot-Bypass: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
@ghost
Copy link

ghost commented Jun 21, 2023

I dont see this mentioned yet, so let me do so now:

If the argument type is a type parameter, all types in its type set must be maps or slices, and clear performs the operation corresponding to the actual type argument.

https://tip.golang.org/ref/spec#Clear

this does not make sense, at least to me. clear can only operate on values, not types. what is the point of clear accepting a type parameter?

@Merovius
Copy link
Contributor

@4cq2 "Argument type" in this context means "the type of the argument" not "the type given as argument" (that is called a "type argument" in the spec. Isn't language fun?). I guess that could be made explicit, to remove that ambiguity?

@griesemer
Copy link
Contributor

Yeah, happy to use a more lucky formulation that is not too clunky. But the meaning here is, in italics:

If the argument type (the type of the argument provided to clear) is a type parameter (is of type parameter type), all types in its type set (in the type set of the constraint corresponding to the type parameter) must be maps or slices, and clear performs the operation corresponding to the actual type argument (corresponding to the type of the actual type argument with which the type parameter was instantiated).

@Merovius
Copy link
Contributor

@griesemer I think I would definitely replace "argument type" with "type of the argument" or introduce a name for it ("clear(x), where x has type T, does… if T is a type parameter…").

@griesemer
Copy link
Contributor

griesemer commented Jun 21, 2023

Reopening so we can reconsider spec wording.

@griesemer griesemer reopened this Jun 21, 2023
@griesemer griesemer self-assigned this Jun 21, 2023
@jimmyfrasche
Copy link
Member

it takes very little code to demonstrate the case and that could be used to introduce a good name to refer to:

func genericUseOfClear[T TypeSet](v T) {
  clear(v)
}

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/510935 mentions this issue: spec: clarify prose in rule for clear built-in

gopherbot pushed a commit that referenced this issue Jul 18, 2023
Per feedback on #56351.

For #56351.

Change-Id: I63dd1713a1efe4d7180d932dbd8e1510cbb32e90
Reviewed-on: https://go-review.googlesource.com/c/go/+/510935
TryBot-Bypass: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
@griesemer
Copy link
Contributor

Spec prose was adjusted. Closing.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests