Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: spec: cannot assign to a field of a map element directly: m["foo"].f = x #3117

Open
griesemer opened this issue Feb 23, 2012 · 36 comments
Labels
Go2 LanguageChange Proposal-Hold
Milestone

Comments

@griesemer
Copy link
Contributor

@griesemer griesemer commented Feb 23, 2012

The non-addressability requirement of map index expressions makes sense by itself since
we don't want a surviving pointer to a map element.

But surely it should be ok to have an invisible temporary pointer p to the respective
field which is just used during the assignment (*p = x). It seems that this doesn't
introduce any more issues than what we have already when assigning to a larger map
element that cannot be assigned to atomically.

E.g., the following program should be legal:

package main

import "fmt"

var m = map[string]struct{x, y int} {
    "foo": {2, 3},
}

func main() {
    m["foo"].x = 4 // cannot assign to m["foo"].x
    fmt.Println(m)
}
@gopherbot
Copy link

@gopherbot gopherbot commented Dec 11, 2012

Comment 1 by Robert.nr1:

Hi, what would be the best way to work around this until it gets fixed?
Is this the way to go? (pun intended.. ;)
var tmp = m["foo"]
tmp.x = 4 
m["foo"] = tmp

@griesemer
Copy link
Contributor Author

@griesemer griesemer commented Dec 11, 2012

Comment 2:

Yes, this is the "work-around". If m doesn't contain an entry with key "foo", tmp will
be the zero value for the map value type and this code still works (by setting the
respective entry for the first time).
Note that if direct assignment as in
m["foo"].x = 4
were permitted, one of the issues (in the language t.b.d.) would be what to do if
m["foo"] doesn't exist (non-trivial).

@gopherbot
Copy link

@gopherbot gopherbot commented Dec 11, 2012

Comment 3 by Robert.nr1:

Okay, thanks for the quick reply!
Yes, I guess that requires some thinking, it's not obvious what should happen. 
Maybe it could be defined like this, if 'm["foo"]' does not exist when assigning
'm["foo"].x':
* Create the new "map member" 'm["foo"]'
* Then try to assign "x" to it
* It will fail, but that does not really mater that much since we would still get a
reasonable error message out of it?
Just a thought, thanks again for the quick reply.

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Dec 11, 2012

Comment 4:

The two cases really are different.  Given a map holding a struct
    m[0] = s
is a write.
    m[0].f = 1
is a read-modify-write.  That is, we can implement
    m[0] = s
by passing 0 and s to the map insert routine.  That doesn't work for
    m[0].f
Instead we have to fetch a pointer, as you suggest, and assign through the pointer.
Right now maps are unsafe when GOMAXPROCS > 1 because multiple threads writing to the
map simultaneously can corrupt the data structure.  This is an unfortunate violation of
Go's general memory safeness.  This suggests that we will want to implement a
possibly-optional safe mode, in which the map is locked during access.  That will work
straightforwardly with the current semantics.  But it won't work with your proposed
addition.
That said, there is another possible implementation.  We could pass in the key, an
offset into the value type, the size of the value we are passing, and the value we are
passing.
A more serious issue: if we permit field access, can we justify prohibiting method
access?
    m[0].M()
But method access really does require addressability.
And let's not ignore
    m[0]++
    m[0][:]

@extemporalgenome
Copy link
Contributor

@extemporalgenome extemporalgenome commented Dec 11, 2012

Comment 5:

I don't think this buys _too_ much, since even with this there'd be no way to "merge"
values which happen to be structs/maps. For example:
type S struct { f, g, h int }
m := map[int]S{0: S{1,2,3}}
// I want to update f and g, but not h
m[0] #= S{f:5, g:4} // some sort of squirrely 'merge' syntax
I'm certainly not proposing such a thing (besides which, it doesn't fit with any other
part of the language) -- the fact does remain that:
m[0].f = 5
m[0].g = 4
is shorter, but likely much less efficient even if Ian's offset-set functionality were
added, compared to:
s := m[0]
s.f = 5
s.g = 4
m[0] = s
The performance gap would certainly increase with the number of fields you want to
update.
All that said, if this feature is particularly desirable, then I think the simpler way
to handle the semantics is to keep the addressability requirement, and consider some
cases to be syntactic sugar:
m[0][:] already works for []int and *[...]int. Because values can be moved around in the
map implementation, m[0][:] should never work for [...]int.
`m[0].f = 1` already works if the value is a pointer type. Why is that a big deal? If
you want less memory fragmentation and better cache lining, use "2nd order allocators".
The hashmap implementation already uses pointer indirection anyway.
Alternatively, `m[0].f = 1` for 'non addressable' values could be _sugar_ for `tmp :=
m[0]; tmp.f = 1; m[0] = tmp`. Ian's optimization would be transparent, and
implementation dependent. Same for m[0]++
m[0].M() already works for value receivers and pointer values. For the same reason array
values shouldn't be sliceable, pointer receivers on non-pointer values should not be
allowed.
&m[0].f should be prohibited in all cases where it's currently prohibited.

@rsc
Copy link
Contributor

@rsc rsc commented Mar 12, 2013

Comment 6:

[The time for maybe has passed.]

@rsc
Copy link
Contributor

@rsc rsc commented Nov 27, 2013

Comment 7:

Labels changed: added go1.3maybe.

@rsc
Copy link
Contributor

@rsc rsc commented Dec 4, 2013

Comment 8:

Labels changed: added release-none, removed go1.3maybe.

@rsc
Copy link
Contributor

@rsc rsc commented Dec 4, 2013

Comment 9:

Labels changed: added repo-main.

@dkinzer
Copy link

@dkinzer dkinzer commented Feb 8, 2015

I just had to use this work around in order to be able to assign some values to two struct fields simultaneously from a function that returns multiple values. Seems kind of an ugly work around. But I'm glad it works.

@yaoshengzhe
Copy link

@yaoshengzhe yaoshengzhe commented Mar 12, 2015

I feel upset that Go still keep pointers, perhaps to better integrate with C ? However, issue like this could be solved elegantly if Go could take Java's approach, reference types (not exact the reference in C++, more like Java's) for all non-primitive types (including struct). Nothing offense, but as a Go developer, I was bitten several times by interface, struct and pointer... Well, you can blame me as I am an inexperience Go developer...

@minux
Copy link
Member

@minux minux commented Mar 12, 2015

@rsc
Copy link
Contributor

@rsc rsc commented Jan 4, 2016

(Locked due to recent spam.)

@vkuzmin-uber
Copy link
Contributor

@vkuzmin-uber vkuzmin-uber commented Oct 3, 2018

Yes, I completely understand it. Just mentioned that I have seen somewhere that CGo allowed it unintentionally. I meant that I need to check if current map API (runtime.map*) can support this feature w/o significant changes.

@vkuzmin-uber
Copy link
Contributor

@vkuzmin-uber vkuzmin-uber commented Oct 11, 2018

I want to refresh and even explain motivation of this proposal.

As first comment says Golang doesn’t allow the following code:

func main() {
    type T struct{x, y int}
    m := make(map[string]T)
    m["foo"].x = 4 // cannot assign to m["foo"].x
    fmt.Println(m["foo"].x)
}

“cannot assign to struct field m["foo"].x in map”

As result, compiler forces you to use copy of whole object:

func main() {
    type T struct{x, y int}
    m := make(map[string]T)
    t := m["foo"]
    t.x = 4
    m["foo"] = t
    fmt.Println(m["foo"].x)
}

This solution has overhead: 2 lookups, 2 copies instead of 1 lookup and in-place change. Note that you can’t take reference to m[k] because map values are not addressable #11865

Lets consider another workaround - use *T that is allowed by Go 1.11:

func main() {
    type T struct{x, y int}
    m := make(map[string]*T)
    m["foo"] = &T{1, 5} // Initialization is required, otherwise it contains “nil” pointer
    m["foo"].x = 4
    fmt.Println(m["foo"].x)
}

You see, that you need to initialize m[k] always before you can read it or change. Inconvenience is more visible in the following example:

func main() {
    type T struct{x, y int}
    m := make(map[string]*T)
    m["foo"] = &T{}
    fmt.Println(m["foo"].x)
}

Here we initialize m[“foo”] with reference of default T{} value just to be able read it then.

Look at the code with map of structures:

func main() {
    type T struct{x, y int}
    m := make(map[string]T)
    fmt.Println(m["foo"].x)
}

It works and returns valid result 0.

We see here feature of map of structures - "auto-initialization" of elements of map (in fact, mapaccess returns "default zero object" but this is idea of what we get). But we still want to be able to update fields of elements if necessary in efficient way, w/o copy of whole object.

As result, I observe 2 Motivations to allow it:

  1. Elegant syntax. When we don’t need to do explicit lookup, copy, update and assign then.
  2. Efficiency. Even if we implement 1st as implicit actions, it may generate expensive code that is not obvious for author of code. Especially for cases like:
    m[“foo”].x = 4
    m[“foo”].x += 1
    m[“foo”].y = 3
    and etc.

This is why I look for implementation of this proposal where 1 and 2 can be achieved together.

*Note, comment above mentioned that "map of structures" assumes more efficient memory access but this is too general assumption to be "motivation" at this moment.

@gopherbot
Copy link

@gopherbot gopherbot commented Oct 13, 2018

Change https://golang.org/cl/141647 mentions this issue: cmd/compile: assign to a field of map element

@vkuzmin-uber
Copy link
Contributor

@vkuzmin-uber vkuzmin-uber commented Oct 13, 2018

Current map's internal API makes this change trivial. I have implemented a Proof of Concept and added simple test. Test passed.

@randall77
Copy link
Contributor

@randall77 randall77 commented Oct 13, 2018

I'm in favor of this proposal.

The thing to worry about is that although this change is trivial to do today (thanks, Vladimir!) it might not be trivial with maps implemented some other way. For instance, in the current implementation map assignment returns the value by reference so it can be updated:

m := map[K]V{}
v := mapassign(m, k)  // type *V
v.f = 7

If we still did it the old way, passing the updated value in, it would be harder. We'd have to pass the offset and type of the portion of the value we'd like to update.

m := map[K]V{}
mapassign(m, k, offsetof(V.f), typeof(V.f), &7)

Further, if we allow op= changes to fields, the op would also need to be passed in.

Not the end of the world, but it's one more thing that constrains the map implementation.

@mdempsky
Copy link
Member

@mdempsky mdempsky commented Oct 14, 2018

I'm fine with this proposal, but don't feel strongly about it.

Also, assuming it's to be accepted, I second @ianlancetaylor's suggestion that array indexing should be allowed too. Moreover, I'd suggest that an arbitrary number of struct field and array indexing operations (including mixing them) are allowed. (The latter doesn't appear to have been explicitly suggested yet in the thread.)

@vkuzmin-uber
Copy link
Contributor

@vkuzmin-uber vkuzmin-uber commented Oct 17, 2018

Ian> We should also consider array indexes for maps whose value type is an array type. `m["key"][1] =

Matt> Also, assuming it's to be accepted, I second @ianlancetaylor's suggestion that array indexing should be allowed too.

I tried PoC it and think this deserves a separate Issue/Proposal where we can better consider pros/cons.

The reason - I see that map of fixed array doesn't behave like a map of structs. It seems:

m := map[int][10]int{}
m[k][5] = 1

does some lazy allocation of fixed array and simple PoC change hits panic on assigning to "nil". Sure I can dig into it to understand, just thought it's better to separate and make a progress with map of structures first. Let me know if you are ok with separate proposal, and I will create a new one with link to this one.

Matt>Moreover, I'd suggest that an arbitrary number of struct field and array indexing operations (including mixing them) are allowed. (The latter doesn't appear to have been explicitly suggested yet in the thread.)

I updated PoC with tests that covers valid and invalid cases. I think this is what you meant (except map of fixed arrays cases):

Valid:
m[k].x.y = v
m[k].a[i] = v
m[k].t.a[i].x.y = v

where a is a fixed array, i index of array

Invalid cases:
m[k][i] = v
m[k].t.another_map[key][i] = v

where i is an index of fixed array.

@vkuzmin-uber
Copy link
Contributor

@vkuzmin-uber vkuzmin-uber commented Oct 17, 2018

I'm in favor of this proposal.

The thing to worry about is that although this change is trivial to do today (thanks, Vladimir!) it might not be trivial with maps implemented some other way. For instance, in the current implementation map assignment returns the value by reference so it can be updated:
Not the end of the world, but it's one more thing that constrains the map implementation.

Hi Keith,

I understand this concern and kept it in mind as one of the cons. But frankly, this limitation of language is so unusual so I think it was caused by initial implementation of map. So, first we need to understand if there is a value of the feature and I think yes:

  • User expects this behavior (I provided 2 User Motivations above: Elegant syntax + Efficiency)
  • If we allow it, next step can be "test/provide more efficient memory access", because theoretically memory access should be more efficient for "map of structures" vs "map of references to structure". I will look at Benchmarks to see if it shows something.

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Oct 17, 2018

@vkuzmin-uber The language limitation is not caused by the initial implementation of map. It's caused by the fact that a map element is not addressable. A map element can not be addressable because it leads to confusing situations, like what should happen if you take the address of an element and then delete the key from the map, and then add it back again.

The language specifies that you can't assign to a value that is not addressable. That makes sense for a language like Go: if you can't take the address of a value, how can you change it? Of course that would make maps useless, so there is an exception: you can assign to a value if it is addressable, or if it is a map index (https://golang.org/ref/spec#Assignments).

This issue is about changing the language to add an additional exception to the assignment rules.

Thanks for being willing to work on it, but I want to stress that this is a language change proposal and that proposal has not yet been accepted. That status is not going to change just because we have an implementation. This language change is going to be accepted or rejected as part of an overall Go 2 process.

In particular, from a language perspective, I do not think that we can separate field assignment to map elements from indexing of map elements. You seem to suggest that they should be handled separately, but I don't think that is OK. When I say that I'm not referring to the implementation details, but rather to how the language is changed. We are going to make a language change decision first, then an implementation, not the other way around.

Thanks for your interest in this.

@mdempsky
Copy link
Member

@mdempsky mdempsky commented Oct 17, 2018

It occurs to me that allowing array indexing involves an ambiguity over whether assignments happen before or after array bounds checking. For example:

m := make(map[string][10]int)
i := 100

func() {
    defer func() { recover() }()
    m["k"][i] = 42
}()

println(len(m))

Does this print 0 (we bounds check i < 10 before adding an element to the map) or 1 (bounds check after)? Or is it up to the implementation to decide?

@randall77
Copy link
Contributor

@randall77 randall77 commented Oct 17, 2018

@mdempsky We definitely want the bounds checks to happen before the slot in the map is allocated. It's very similar to

   m[k] = f()

If f panics. We never want to expose an allocated-but-not-yet-assigned map value.

@mdempsky
Copy link
Member

@mdempsky mdempsky commented Oct 18, 2018

@randall77 Yeah, I'm inclined to agree the bounds check should happen before the map element is allocated. It's not obvious to me though that the alternative (inserting a zero element into the map) is unreasonable. (I think exposing uninitialized memory is obviously unreasonable; not sure if that's what you had in mind by "not-yet-assigned".)

I think a better example for comparison is the first example from #23735, which emphasizes the question of whether (*p)[i] checks for nil or array index out of bounds first (and notably cmd/compile and gccgo differed). Except there we dodged addressing the question by pointing out the spec doesn't guarantee panic error text anyway.

@vkuzmin-uber
Copy link
Contributor

@vkuzmin-uber vkuzmin-uber commented Oct 18, 2018

@mdempsky , @randall77

Just to clarify, do we expect the same behavior of a similar case, when array is wrapped into struct:

m := make(map[string]struct{a [10]int})
i := 100

func() {
    defer func() { recover() }()
    m["k"].a[i] = 42
}()

println(len(m))

@randall77
Copy link
Contributor

@randall77 randall77 commented Oct 18, 2018

@vkuzmin-uber : yes. Your example should print 0.

@rsc
Copy link
Contributor

@rsc rsc commented Mar 6, 2019

There's a lot of subtle corner cases here. To move forward with this, we need a real design doc.

@gopherbot gopherbot removed the NeedsDecision label Aug 16, 2019
@pyang-1981
Copy link

@pyang-1981 pyang-1981 commented Aug 29, 2019

Anyway, I think we should be fine to use m[key].field++ and like. It is programmer's responsibility to protect against data structure corruption, or refer to invalid addresses by using locks. Right now Go is forbiding a valid use that can be solved by the synchronization structure it already offers.

@gopherbot gopherbot added the NeedsDecision label Sep 3, 2019
@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 17, 2020

Putting this on hold until and unless someone decides to write a complete design doc for the change.

@ianlancetaylor ianlancetaylor added Proposal-Hold and removed NeedsDecision labels Mar 17, 2020
@ianlancetaylor ianlancetaylor removed this from the Unplanned milestone Mar 17, 2020
@ianlancetaylor ianlancetaylor added this to the Proposal milestone Mar 17, 2020
@seebs
Copy link
Contributor

@seebs seebs commented Oct 6, 2020

Wow, this is surprisingly hard to phrase cleanly.

I think the goal is: Given any valid expression that includes a map index and would otherwise be addressable, we want to be able to use it in assignments, increments, and so on, as if it were addressable. One of the challenges is that a simplistic description of this might imply that this should apply to expressions which wouldn't be addressable for other reasons not involving the map index. (Like, if you have a map of function-returning-int, we don't want m[k]()++ to be allowed, even though m[k]() is a reasonable expression.)

Thought experiment: What about pointer receiver methods?

type Counter {
    X int
}
var seebsno *Counter
func (c *Counter) Inc() { c.X++ }
func (c *Counter) Stash() { seebsno = c } 
var m map[int]Counter
m[0].X++ // allowed by this proposal
m[0].Inc() // sure
m[0].Stash() // oops

If you don't allow pointer-receiver methods, it's Annoying. If you do, you have opened the floodgates and made the value act as though it's addressable.

Possible resolution: If you do a thing to a map-index expression that would require it to be addressable, the thing happens on an addressable copy, and then the contents of that copy are assigned back into the map. If you stashed a copy of the address and then later tried to use it, this is considered Own Fault.

But... That feels like a sneaky gotcha.

So I think the right answer may be "still don't allow pointer-receiver methods". In which case... it's tricky to express the thing that is being allowed. Assignment, increment, etc, say that the thing must be addressable or a map index expression. Go doesn't really have a codified distinction between "lvalue" and "addressable", which is what I think is at issue.

Hmm. Given m[k].sel, where m[k] is a map index expression, and .sel is any combination of selectors or index expressions, the result can be used in assignment or increment operations, and works as though m[k] were extracted to a temporary variable and then stored back.

m[k].sel += 3
// equivalent to
tmp := m[k]
tmp.sel += 3
m[k] = tmp

Except that the map index expression doesn't get evaluated twice.

@mdempsky
Copy link
Member

@mdempsky mdempsky commented Oct 6, 2020

Thought experiment: What about pointer receiver methods?

No, we don't want to allow users to create pointers to the variables within a map. If we allowed that, then we might as well just make map indexing addressable. In your example, neither m[0].Inc() nor m[0].Stash() should be allowed.

The Go spec already states "Each left-hand side operand must be addressable, a map index expression, or (for = assignments only) the blank identifier." We probably just need to extend "a map index expression" here similarly to how we already define addressability: "The operand must be addressable, that is, either a variable, pointer indirection, or slice indexing operation; or a field selector of an addressable struct operand; or an array indexing operation of an addressable array."

We would need to define somewhere what happens when assigning to just a sub-variable within a map somewhere. Probably the easiest way would be to just define that when assigning to a key not already in the map, a corresponding value is first zero-initialized just before the assignment.

There's also nothing special about ++. m[0].X++ is defines as meaning the same thing as m[0].X += 1, which in turn means the same as m[0].X = m[0].X + 1. If we allow assigning to m[0].X, then m[0].X += 1 and m[0].X++ are implicitly allowed too (since m[0].X as a value expression is already valid).

@seebs
Copy link
Contributor

@seebs seebs commented Oct 6, 2020

I think the behavior with uninitialized values automatically falls out of the way a map yields a zero value when you access a nonexistent key.

I agree that pointer receivers are a bad idea, but they don't necessarily strictly imply making the content of the map addressable; they might imply making addressable copies of the content.

I'm not actually sure what I expect to "really" happen under the hood from something like m[0].X++; I would not be surprised if it resulted in exactly what currently happens when we write the code explicitly, which is that the entire contents of m[0] are copied, X is modified in the copy, and then the entire copy is written back. On the other hand, if it could arrange to only have to copy (and write back) that individual value, that would be a potentially significant performance win in some cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Go2 LanguageChange Proposal-Hold
Projects
None yet
Development

No branches or pull requests