Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: spec: cannot assign to a field of a map element directly: m["foo"].f = x #3117

Open
griesemer opened this issue Feb 23, 2012 · 31 comments

Comments

Projects
None yet
@griesemer
Copy link
Contributor

commented Feb 23, 2012

The non-addressability requirement of map index expressions makes sense by itself since
we don't want a surviving pointer to a map element.

But surely it should be ok to have an invisible temporary pointer p to the respective
field which is just used during the assignment (*p = x). It seems that this doesn't
introduce any more issues than what we have already when assigning to a larger map
element that cannot be assigned to atomically.

E.g., the following program should be legal:

package main

import "fmt"

var m = map[string]struct{x, y int} {
    "foo": {2, 3},
}

func main() {
    m["foo"].x = 4 // cannot assign to m["foo"].x
    fmt.Println(m)
}
@gopherbot

This comment has been minimized.

Copy link

commented Dec 11, 2012

Comment 1 by Robert.nr1:

Hi, what would be the best way to work around this until it gets fixed?
Is this the way to go? (pun intended.. ;)
var tmp = m["foo"]
tmp.x = 4 
m["foo"] = tmp
@griesemer

This comment has been minimized.

Copy link
Contributor Author

commented Dec 11, 2012

Comment 2:

Yes, this is the "work-around". If m doesn't contain an entry with key "foo", tmp will
be the zero value for the map value type and this code still works (by setting the
respective entry for the first time).
Note that if direct assignment as in
m["foo"].x = 4
were permitted, one of the issues (in the language t.b.d.) would be what to do if
m["foo"] doesn't exist (non-trivial).
@gopherbot

This comment has been minimized.

Copy link

commented Dec 11, 2012

Comment 3 by Robert.nr1:

Okay, thanks for the quick reply!
Yes, I guess that requires some thinking, it's not obvious what should happen. 
Maybe it could be defined like this, if 'm["foo"]' does not exist when assigning
'm["foo"].x':
* Create the new "map member" 'm["foo"]'
* Then try to assign "x" to it
* It will fail, but that does not really mater that much since we would still get a
reasonable error message out of it?
Just a thought, thanks again for the quick reply.
@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Dec 11, 2012

Comment 4:

The two cases really are different.  Given a map holding a struct
    m[0] = s
is a write.
    m[0].f = 1
is a read-modify-write.  That is, we can implement
    m[0] = s
by passing 0 and s to the map insert routine.  That doesn't work for
    m[0].f
Instead we have to fetch a pointer, as you suggest, and assign through the pointer.
Right now maps are unsafe when GOMAXPROCS > 1 because multiple threads writing to the
map simultaneously can corrupt the data structure.  This is an unfortunate violation of
Go's general memory safeness.  This suggests that we will want to implement a
possibly-optional safe mode, in which the map is locked during access.  That will work
straightforwardly with the current semantics.  But it won't work with your proposed
addition.
That said, there is another possible implementation.  We could pass in the key, an
offset into the value type, the size of the value we are passing, and the value we are
passing.
A more serious issue: if we permit field access, can we justify prohibiting method
access?
    m[0].M()
But method access really does require addressability.
And let's not ignore
    m[0]++
    m[0][:]
@extemporalgenome

This comment has been minimized.

Copy link

commented Dec 11, 2012

Comment 5:

I don't think this buys _too_ much, since even with this there'd be no way to "merge"
values which happen to be structs/maps. For example:
type S struct { f, g, h int }
m := map[int]S{0: S{1,2,3}}
// I want to update f and g, but not h
m[0] #= S{f:5, g:4} // some sort of squirrely 'merge' syntax
I'm certainly not proposing such a thing (besides which, it doesn't fit with any other
part of the language) -- the fact does remain that:
m[0].f = 5
m[0].g = 4
is shorter, but likely much less efficient even if Ian's offset-set functionality were
added, compared to:
s := m[0]
s.f = 5
s.g = 4
m[0] = s
The performance gap would certainly increase with the number of fields you want to
update.
All that said, if this feature is particularly desirable, then I think the simpler way
to handle the semantics is to keep the addressability requirement, and consider some
cases to be syntactic sugar:
m[0][:] already works for []int and *[...]int. Because values can be moved around in the
map implementation, m[0][:] should never work for [...]int.
`m[0].f = 1` already works if the value is a pointer type. Why is that a big deal? If
you want less memory fragmentation and better cache lining, use "2nd order allocators".
The hashmap implementation already uses pointer indirection anyway.
Alternatively, `m[0].f = 1` for 'non addressable' values could be _sugar_ for `tmp :=
m[0]; tmp.f = 1; m[0] = tmp`. Ian's optimization would be transparent, and
implementation dependent. Same for m[0]++
m[0].M() already works for value receivers and pointer values. For the same reason array
values shouldn't be sliceable, pointer receivers on non-pointer values should not be
allowed.
&m[0].f should be prohibited in all cases where it's currently prohibited.
@rsc

This comment has been minimized.

Copy link
Contributor

commented Mar 12, 2013

Comment 6:

[The time for maybe has passed.]
@rsc

This comment has been minimized.

Copy link
Contributor

commented Nov 27, 2013

Comment 7:

Labels changed: added go1.3maybe.

@rsc

This comment has been minimized.

Copy link
Contributor

commented Dec 4, 2013

Comment 8:

Labels changed: added release-none, removed go1.3maybe.

@rsc

This comment has been minimized.

Copy link
Contributor

commented Dec 4, 2013

Comment 9:

Labels changed: added repo-main.

@dkinzer

This comment has been minimized.

Copy link

commented Feb 8, 2015

I just had to use this work around in order to be able to assign some values to two struct fields simultaneously from a function that returns multiple values. Seems kind of an ugly work around. But I'm glad it works.

@yaoshengzhe

This comment has been minimized.

Copy link

commented Mar 12, 2015

I feel upset that Go still keep pointers, perhaps to better integrate with C ? However, issue like this could be solved elegantly if Go could take Java's approach, reference types (not exact the reference in C++, more like Java's) for all non-primitive types (including struct). Nothing offense, but as a Go developer, I was bitten several times by interface, struct and pointer... Well, you can blame me as I am an inexperience Go developer...

@minux

This comment has been minimized.

Copy link
Member

commented Mar 12, 2015

@rsc

This comment has been minimized.

Copy link
Contributor

commented Jan 4, 2016

(Locked due to recent spam.)

@rsc rsc changed the title spec: cannot assign to a field of a map element directly: m["foo"].f = x proposal: spec: cannot assign to a field of a map element directly: m["foo"].f = x Jun 20, 2017

@rsc rsc added the Go2 label Jun 20, 2017

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Jun 5, 2018

We should also consider array indexes for maps whose value type is an array type. m["key"][1] = 2.

@golang golang unlocked this conversation Oct 3, 2018

@vkuzmin-uber

This comment has been minimized.

Copy link
Contributor

commented Oct 3, 2018

I am willing to work on it if there are no objections. First I want to collect requirements, use-case, concerns. Then see what are the limitations of the current implementation (I have heard that in past Cgo implementation allowed it because implementation just didn't restrict it).

I will update this thread with details I get soon.

@bradfitz

This comment has been minimized.

Copy link
Member

commented Oct 3, 2018

@vkuzmin-uber, this isn't a compiler or runtime implementation bug. This is a spec bug, changing the language specification to permit such code.

@vkuzmin-uber

This comment has been minimized.

Copy link
Contributor

commented Oct 3, 2018

Yes, I completely understand it. Just mentioned that I have seen somewhere that CGo allowed it unintentionally. I meant that I need to check if current map API (runtime.map*) can support this feature w/o significant changes.

@vkuzmin-uber

This comment has been minimized.

Copy link
Contributor

commented Oct 11, 2018

I want to refresh and even explain motivation of this proposal.

As first comment says Golang doesn’t allow the following code:

func main() {
    type T struct{x, y int}
    m := make(map[string]T)
    m["foo"].x = 4 // cannot assign to m["foo"].x
    fmt.Println(m["foo"].x)
}

“cannot assign to struct field m["foo"].x in map”

As result, compiler forces you to use copy of whole object:

func main() {
    type T struct{x, y int}
    m := make(map[string]T)
    t := m["foo"]
    t.x = 4
    m["foo"] = t
    fmt.Println(m["foo"].x)
}

This solution has overhead: 2 lookups, 2 copies instead of 1 lookup and in-place change. Note that you can’t take reference to m[k] because map values are not addressable #11865

Lets consider another workaround - use *T that is allowed by Go 1.11:

func main() {
    type T struct{x, y int}
    m := make(map[string]*T)
    m["foo"] = &T{1, 5} // Initialization is required, otherwise it contains “nil” pointer
    m["foo"].x = 4
    fmt.Println(m["foo"].x)
}

You see, that you need to initialize m[k] always before you can read it or change. Inconvenience is more visible in the following example:

func main() {
    type T struct{x, y int}
    m := make(map[string]*T)
    m["foo"] = &T{}
    fmt.Println(m["foo"].x)
}

Here we initialize m[“foo”] with reference of default T{} value just to be able read it then.

Look at the code with map of structures:

func main() {
    type T struct{x, y int}
    m := make(map[string]T)
    fmt.Println(m["foo"].x)
}

It works and returns valid result 0.

We see here feature of map of structures - "auto-initialization" of elements of map (in fact, mapaccess returns "default zero object" but this is idea of what we get). But we still want to be able to update fields of elements if necessary in efficient way, w/o copy of whole object.

As result, I observe 2 Motivations to allow it:

  1. Elegant syntax. When we don’t need to do explicit lookup, copy, update and assign then.
  2. Efficiency. Even if we implement 1st as implicit actions, it may generate expensive code that is not obvious for author of code. Especially for cases like:
    m[“foo”].x = 4
    m[“foo”].x += 1
    m[“foo”].y = 3
    and etc.

This is why I look for implementation of this proposal where 1 and 2 can be achieved together.

*Note, comment above mentioned that "map of structures" assumes more efficient memory access but this is too general assumption to be "motivation" at this moment.

@gopherbot

This comment has been minimized.

Copy link

commented Oct 13, 2018

Change https://golang.org/cl/141647 mentions this issue: cmd/compile: assign to a field of map element

@vkuzmin-uber

This comment has been minimized.

Copy link
Contributor

commented Oct 13, 2018

Current map's internal API makes this change trivial. I have implemented a Proof of Concept and added simple test. Test passed.

@randall77

This comment has been minimized.

Copy link
Contributor

commented Oct 13, 2018

I'm in favor of this proposal.

The thing to worry about is that although this change is trivial to do today (thanks, Vladimir!) it might not be trivial with maps implemented some other way. For instance, in the current implementation map assignment returns the value by reference so it can be updated:

m := map[K]V{}
v := mapassign(m, k)  // type *V
v.f = 7

If we still did it the old way, passing the updated value in, it would be harder. We'd have to pass the offset and type of the portion of the value we'd like to update.

m := map[K]V{}
mapassign(m, k, offsetof(V.f), typeof(V.f), &7)

Further, if we allow op= changes to fields, the op would also need to be passed in.

Not the end of the world, but it's one more thing that constrains the map implementation.

@mdempsky

This comment has been minimized.

Copy link
Member

commented Oct 14, 2018

I'm fine with this proposal, but don't feel strongly about it.

Also, assuming it's to be accepted, I second @ianlancetaylor's suggestion that array indexing should be allowed too. Moreover, I'd suggest that an arbitrary number of struct field and array indexing operations (including mixing them) are allowed. (The latter doesn't appear to have been explicitly suggested yet in the thread.)

@vkuzmin-uber

This comment has been minimized.

Copy link
Contributor

commented Oct 17, 2018

Ian> We should also consider array indexes for maps whose value type is an array type. `m["key"][1] =

Matt> Also, assuming it's to be accepted, I second @ianlancetaylor's suggestion that array indexing should be allowed too.

I tried PoC it and think this deserves a separate Issue/Proposal where we can better consider pros/cons.

The reason - I see that map of fixed array doesn't behave like a map of structs. It seems:

m := map[int][10]int{}
m[k][5] = 1

does some lazy allocation of fixed array and simple PoC change hits panic on assigning to "nil". Sure I can dig into it to understand, just thought it's better to separate and make a progress with map of structures first. Let me know if you are ok with separate proposal, and I will create a new one with link to this one.

Matt>Moreover, I'd suggest that an arbitrary number of struct field and array indexing operations (including mixing them) are allowed. (The latter doesn't appear to have been explicitly suggested yet in the thread.)

I updated PoC with tests that covers valid and invalid cases. I think this is what you meant (except map of fixed arrays cases):

Valid:
m[k].x.y = v
m[k].a[i] = v
m[k].t.a[i].x.y = v

where a is a fixed array, i index of array

Invalid cases:
m[k][i] = v
m[k].t.another_map[key][i] = v

where i is an index of fixed array.

@vkuzmin-uber

This comment has been minimized.

Copy link
Contributor

commented Oct 17, 2018

I'm in favor of this proposal.

The thing to worry about is that although this change is trivial to do today (thanks, Vladimir!) it might not be trivial with maps implemented some other way. For instance, in the current implementation map assignment returns the value by reference so it can be updated:
Not the end of the world, but it's one more thing that constrains the map implementation.

Hi Keith,

I understand this concern and kept it in mind as one of the cons. But frankly, this limitation of language is so unusual so I think it was caused by initial implementation of map. So, first we need to understand if there is a value of the feature and I think yes:

  • User expects this behavior (I provided 2 User Motivations above: Elegant syntax + Efficiency)
  • If we allow it, next step can be "test/provide more efficient memory access", because theoretically memory access should be more efficient for "map of structures" vs "map of references to structure". I will look at Benchmarks to see if it shows something.
@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Oct 17, 2018

@vkuzmin-uber The language limitation is not caused by the initial implementation of map. It's caused by the fact that a map element is not addressable. A map element can not be addressable because it leads to confusing situations, like what should happen if you take the address of an element and then delete the key from the map, and then add it back again.

The language specifies that you can't assign to a value that is not addressable. That makes sense for a language like Go: if you can't take the address of a value, how can you change it? Of course that would make maps useless, so there is an exception: you can assign to a value if it is addressable, or if it is a map index (https://golang.org/ref/spec#Assignments).

This issue is about changing the language to add an additional exception to the assignment rules.

Thanks for being willing to work on it, but I want to stress that this is a language change proposal and that proposal has not yet been accepted. That status is not going to change just because we have an implementation. This language change is going to be accepted or rejected as part of an overall Go 2 process.

In particular, from a language perspective, I do not think that we can separate field assignment to map elements from indexing of map elements. You seem to suggest that they should be handled separately, but I don't think that is OK. When I say that I'm not referring to the implementation details, but rather to how the language is changed. We are going to make a language change decision first, then an implementation, not the other way around.

Thanks for your interest in this.

@mdempsky

This comment has been minimized.

Copy link
Member

commented Oct 17, 2018

It occurs to me that allowing array indexing involves an ambiguity over whether assignments happen before or after array bounds checking. For example:

m := make(map[string][10]int)
i := 100

func() {
    defer func() { recover() }()
    m["k"][i] = 42
}()

println(len(m))

Does this print 0 (we bounds check i < 10 before adding an element to the map) or 1 (bounds check after)? Or is it up to the implementation to decide?

@randall77

This comment has been minimized.

Copy link
Contributor

commented Oct 17, 2018

@mdempsky We definitely want the bounds checks to happen before the slot in the map is allocated. It's very similar to

   m[k] = f()

If f panics. We never want to expose an allocated-but-not-yet-assigned map value.

@mdempsky

This comment has been minimized.

Copy link
Member

commented Oct 18, 2018

@randall77 Yeah, I'm inclined to agree the bounds check should happen before the map element is allocated. It's not obvious to me though that the alternative (inserting a zero element into the map) is unreasonable. (I think exposing uninitialized memory is obviously unreasonable; not sure if that's what you had in mind by "not-yet-assigned".)

I think a better example for comparison is the first example from #23735, which emphasizes the question of whether (*p)[i] checks for nil or array index out of bounds first (and notably cmd/compile and gccgo differed). Except there we dodged addressing the question by pointing out the spec doesn't guarantee panic error text anyway.

@vkuzmin-uber

This comment has been minimized.

Copy link
Contributor

commented Oct 18, 2018

@mdempsky , @randall77

Just to clarify, do we expect the same behavior of a similar case, when array is wrapped into struct:

m := make(map[string]struct{a [10]int})
i := 100

func() {
    defer func() { recover() }()
    m["k"].a[i] = 42
}()

println(len(m))
@randall77

This comment has been minimized.

Copy link
Contributor

commented Oct 18, 2018

@vkuzmin-uber : yes. Your example should print 0.

@rsc

This comment has been minimized.

Copy link
Contributor

commented Mar 6, 2019

There's a lot of subtle corner cases here. To move forward with this, we need a real design doc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.