@mdempsky points out in #41191 (comment) that the late addition of string and []byte to the embedding proposal has some unfortunate side effects that we might want to make sure we are happy with.
There are three details of string and []byte embedding that are at least surprising. This issue is one of them. See #43217 for the other two.
The original design had only embedding in globals and only embed.FS. When I circulated the draft privately to a few people who had written embedding tooling, one person asked about local variables, and it seemed easy to add, so I did. (This was fine because the embed.FS was immutable, so it was really just a question of variable scope, not semantics.)
When we had the initial public discussions before making a formal proposal, many people asked for string and []byte. Those too seemed easy to add, so I did. But the two different additions interact poorly, because they created the intersection "function-local, mutable embedded data".
It seems like there are three options for how that would work:
(a) the []byte data is shared by all invocations of the function.
(b) the []byte data is freshly allocated (copied) on each call to the function.
(c) the []byte data is magically unwritable, causing a panic if written.
Right now the behavior is (a), but I did not do that intentionally. It is difficult to explain to users, because it differs from the way every other local variable behaves. It seems like a clear bug.
The clearest alternative is (b): when the declaration statement in a function is executed, the variable is initialized with a fresh copy of the data instead of the actual data. That is less surprising than (a) but potentially very expensive. It is at least not a bug like (a), but it is certainly a performance surprise that would be good to avoid.
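To make the difference between (a) and (b) concrete, here is a minimal sketch. It does not use //go:embed at all: the package-level slice `embedded` and the helpers `sharedData` and `copiedData` are hypothetical stand-ins, chosen so the example runs without any files on disk. It only models the aliasing behavior described above, not how the compiler actually implements embedding.

```go
package main

import "fmt"

// embedded is a hypothetical stand-in for the bytes that //go:embed would
// provide. A plain package-level slice is used so the sketch needs no files.
var embedded = []byte("hello")

// sharedData models option (a): every call returns a slice that aliases
// the same backing array.
func sharedData() []byte {
	return embedded
}

// copiedData models option (b): every call allocates a new slice and
// copies the data into it, which costs time and memory proportional to
// the size of the data.
func copiedData() []byte {
	return append([]byte(nil), embedded...)
}

func main() {
	b := copiedData()
	b[0] = 'X'                        // modifies only this call's private copy
	fmt.Println(string(copiedData())) // prints "hello"

	a := sharedData()
	a[0] = 'J'                        // writes through to the single shared copy
	fmt.Println(string(sharedData())) // prints "Jello"
}
```

Under (a), a write made during one call is visible to every later call, which is not how any other local variable behaves. Under (b), each call is isolated, but only because it pays for a full copy every time.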
I listed (c) for completeness (mmap the data read-only) but it's not really on the table: we've considered the idea of that kind of data in the past in other, more compelling contexts and decided against it. This case is not nearly special enough to warrant breaking the data model by introducing "unwritable slices" into the core of the language.
That leads me to “(d) none of the above,” which I think is the right answer, not just for Go 1.16 but generally.
The string and []byte functionality seems important ergonomically. There are plenty of times when you just want to embed a single file as data, and having to go through the embed.FS machinery to get just a single string or []byte is a lot more work.
On the other hand, the function-local variable functionality seems much less important ergonomically. It's always easy to move the two lines (//go:embed and var declaration) up above the function.
So if these two can't coexist, the choice seems clear: string and []byte support is doing real work, while function-local variables are not. The underlying data is essentially a global anyway. Writing all embedded data as globals makes it very clear for []byte variables that there's only one instance of the data (and that there's no implicit copying either).
So that's what I suggest: remove support for embedding in local variables. It's easy to put the embedded variables just above the function that needs them, and then it's very clear that they are globals, there are no aliasing surprises, and so on.
What do people think? (Thumbs up / thumbs down is fine.)
Thank you!