-
Notifications
You must be signed in to change notification settings - Fork 17.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
weak: new package providing weak pointers #67552
Comments
glad this is being resurrected *scnr* |
How does this interact with the unique package (#62483)? Isn't it possible for a user to implement unique (or at least solve the same problems that unique solves) using weak pointers? If we add weak, will we regret having both weak and unique? |
@cespare I don't think we'll regret having both. The main reason to have A secondary reason is that it's not actually possible to safely implement the concurrent map underlying Weak pointers are still useful for lots of other cases where globalness is less useful and maybe even not what you want at all. For example, implementing weak-keyed maps, which are generally used for attaching data to memory you don't own the lifecycle of. There's also plenty more reasonable ad-hoc use-cases that don't neatly fall into something std would support out of the box. |
What does |
@Merovius Yep, that's right. Thanks! Updated the proposal. |
Although a (For example, a Rust analog would be Is the zero value of |
From my perspective, I believe that's what the semantics would have to imply. In the implementation, yes, I'd probably just have it return the zero value of the weak pointer. EDIT: Oh, it occurs to me that passing nil for a specific type could have a different value than the zero value. (Some per-type sentinel value.) I don't think we should do that. I think the zero value of a |
go/src/internal/weak/pointer.go Line 51 in 3776465
I see "weak pointer" under the "internal" package. Is there any progress on this proposal currently? |
@LukeXeon This proposal is awaiting review. If accepted, the |
Unlike server programs that allocate and release memory for each request and persist it in the DB, a weak pointer is essential for client programs that hold multiple containers or partially release items. I feel that Go lacks support for such programs. (As an aside, #64777 is another example of a feature needed for client programs) |
In our discussion yesterday, @bradfitz asked if weak pointers made it possible to maintain a map of expensive derived information about something without preventing the GC of that something. Here is a worked example of how weak pointer enables that:
I am assuming runtime.AddCleanup from #67535, which is guaranteed to work with any pointer f. |
@rsc can this please find its way into the official documentation or alternatively the Tour of Go? Such things really need to be easily accessible, and not buried only in an issue comment. |
I think it would make sense for this to be an example in the weak package, if the proposal is accepted. Putting it in the Go tour seems TMI. |
Yeah, it could otherwise become a "Tor-Tour of Go" (sorry, couldn't resist) |
Following up on my comment from last week, here is a potential weak cache abstraction that avoids holding the lock during expensiveComputation and also wraps up the idioms nicely so you don't have to retype them. I am not proposing to add it to the package (others can argue for that if they want) but it's here as an example, and we can include it as an example in the package too.
|
Since the cache doesn't do weak -> strong pointer, and with current GC this should also work, no? type Cache[K any, V any] struct {
f func(*K) V
m atomic.Map[uintptr, func() V]
}
func NewCache[K comparable, V any](f func(*K)V) *Cache[K, V] {
return &Cache[K, V]{f: f}
}
func (c *Cache[K, V]) Get(k *K) V {
kw := uintptr(unsafe.Pointer((k))
vf, ok := c.m.Load(kw)
if ok {
return vf()
}
vf = sync.OnceValue(func() V { return c.f(k) })
vf, loaded := c.m.LoadOrStore(kw)
if !loaded {
// Stored kw→vf to c.m; add the cleanup.
runtime.AddCleanup(k, c.cleanup, kw)
}
return vf()
}
func (c *Cache[K, V]) cleanup(kw uintptr) {
c.m.Delete(kw)
}
var cached = NewCache(expensiveComputation) |
@DmitriyMV There is a race in that code. It could theoretically happen that the GC identifies k as garbage, puts it back on a free list for future allocation, queues the cleanup, and then the program allocates a new k, gets the recycled address, and reuses the stale value because the cleanup has not gotten a chance to run yet. (It's also not guaranteed to work in an implementation that moves objects, but objects aren't moving today except on the stack, and k escapes to the heap so it won't be a stack object in the current implementation.) |
This proposal has been added to the active column of the proposals project |
Have all remaining concerns about this proposal been addressed? The details are in the top comment under the "Proposal" heading. |
I am trying to figure out what the Is that the While searching for that I also came across another related proposal I didn't see referenced here yet: #43615. Russ's example also uses Hopefully these cross-refs are useful to someone, but also it would seem we cannot use that example in the |
@ChrisHines You can't use the example as written, but you can use |
I once came to the conclusion that ephemorons (
https://dl.acm.org/doi/pdf/10.1145/263698.263733) were better than weak
pointers and starting with .NET many other post Java languages have come to
the same conclusion and implemented them (see
https://en.wikipedia.org/wiki/Ephemeron for a list). Ephemerons solve some
thorny key/value problems where reachability of key should be independent
of the value structure thus allowing value to reference key without keeping
key alive.
The GC trace implementation is 3 pass instead of the current 2 pass and
complexity is O(ND) where N is the number of unreachable ephemerons and D
is the depth of the structures needing to the traced from them in pass 3. I
suspect if we do weak pointers there will be a request for ephemerons once
users run into the problems.
I have not thought through the concurrency issues beyond noticing that Go
would be the first to deal with such issues. Implementing iterators that
can reference keys implicitly are another problem I haven't thought through.
In any case if Go doesn't implement ephemorons this issue should track the
discussion..
…On Thu, Aug 15, 2024 at 1:20 AM Axel Wagner ***@***.***> wrote:
@ChrisHines <https://github.com/ChrisHines> You can't use the example as
written, but you can use sync.Map with type-assertions instead.
—
Reply to this email directly, view it on GitHub
<#67552 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAHNNH4F64GQSXLL2HH3FFDZRQ3AZAVCNFSM6AAAAABICGOSGKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEOJQGY2DSNJWGU>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
Hi Rick! I'm glad you brought up ephemerons -- they resolve a real issue when working with weak pairs (like in maps) and this proposal certainly should mention them somewhere. This issue will almost certainly come up in the future. TL;DR: My gut feeling is we won't regret the weak pointers as described in this proposal. And now, the long version... In general, my main gripe with ephemerons is the value side. The question on my mind is "which value?" Having only one value for each possible key runs into composability problems like .NET 4.0 introduced the EDIT: A previous version of this comment contained an example that wasn't totally coherent. I've replaced it below with something much clearer. Like, I think in the spirit of Russ' cache example, given both
... in order to avoid any leak issues in case the value-generating closure or the cached value itself containing a strong reference to the key. (Which seems unlikely, but is of course still possible.) Since I haven't fully worked out the details, but here's what I think
This is not quite a perfect mapping. What's lost is the equality semantics that this proposal has for weak pointers. In fact, there's a general problem here with dependent handles and maps: how does one define the key type? A simple So... my gut feeling is that we won't regret weak pointers, and we may just end up with both. I think this state of things makes sense to me, but I'm happy to be proven wrong. The weak pointers as described in this proposal are fairly simple and straightforward (both in implementation and usage), and have clean semantics that fall out nicely from the API. On a final note, the implementation complexity of ephemerons/
Of course, the downside here is the extra check on the mark hot path. As I alluded to in (2) above, I suspect we can hide this cost, but not without some engineering work. (As an additional tangential note, I think Java never got ephemerons, but I'm not sure why.) |
I definitely got some things wrong in my previous comment earlier -- I've corrected them now. I've also worked out an example of how ephemerons might be exposed in Go, but there are some difficulties with composability and defining comparability on an ephemeron, such that it actually can be used in all the same ways. I think on a theoretical level you can write your code in most (maybe all?) use-cases to use an ephemeron instead, but it's more complex and may require you to reimplement a lot more than you bargained for (like, a new special map implementation). So, I think there's still some utility to keeping weak pointers around, and I also don't think we should ignore the complexity (and possible performance difficulties) introduced by an ephemeron implementation. I'm also happy to hear other opinions. |
Section 10 of this paper (https://dl.acm.org/doi/pdf/10.1145/3226225) is an
interesting read. The work is based on the same Sapphire work that the Go
GC is based on so things match up pretty well with the Go GC. It argues why
deletion barriers and atomic phase changes work. Their CLEAR phase is
similar to Go's sweep phase and thus atomic for similar reasons. I am a bit
concerned about making a weak ref Get fast (inlined) and atomic.
Earlier in the paper there are some warnings about how common weak refs
were in some of the DeCapo benchmarks. This made me concerned about sweep
storms caused by weak reference niling. In addition nobody seems to have a
good answer to clearing stale weak entries in large caches. Perhaps we will
end up yet again talking about building some sort of runtime notification
scheme.
WRT Ephermerons - I have no implementation but the big O and complexity of
adding another GC phase feels like the cost / benefit will be too high and
we have no user demand or requests.
None of this is intended to be negative about the proposal as is, I just
wanted to get issues and literature on the record.
…On Sat, Aug 17, 2024 at 3:33 AM Michael Knyszek ***@***.***> wrote:
I definitely got some things wrong in my previous comment earlier -- I've
corrected them now. I've also worked out an example of how ephemerons might
be exposed in Go, but there are some difficulties with composability and
defining comparability on an ephemeron, such that it actually can be used
in all the same ways. I think on a practical level you can write your code
in most (maybe all?) use-cases to use an ephemeron instead, but it's more
complex and may require you to reimplement a lot more than you bargained
for (like, a new special map implementation).
So, I think there's still *some* utility to keeping weak pointers around,
and I also don't think we should ignore the complexity (and possible
performance difficulties) introduced by an ephemeron implementation. I'm
also happy to hear other opinions.
—
Reply to this email directly, view it on GitHub
<#67552 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAHNNH646LWVKABYUOIFHIDZR34EJAVCNFSM6AAAAABICGOSGKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEOJUG42DMNJRGY>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
The use of weak pointers and AddCleanup in the example above points out the
big footgun of both SetFinalizer and AddCleanup. Both result in a user
defined function being run asynchronously in undefined order by the GC.
Perhaps AddNotification would be a better API. Instead of passing
AddCleanup a func to call, AddNotification would be passed a channel.
Instead of calling the function the GC would clear the referent and then
put the weak pointer on the channel. The channel would connect to a user
defined goroutine and do whatever is needed at a time and manner completely
under user control. In the example above the user routine would read from
the channel and delete the cache entry. Reducing non-determinism is usually
a win.
…On Fri, Aug 23, 2024 at 12:03 PM Rick Hudson ***@***.***> wrote:
Section 10 of this paper (https://dl.acm.org/doi/pdf/10.1145/3226225) is
an interesting read. The work is based on the same Sapphire work that the
Go GC is based on so things match up pretty well with the Go GC. It argues
why deletion barriers and atomic phase changes work. Their CLEAR phase is
similar to Go's sweep phase and thus atomic for similar reasons. I am a bit
concerned about making a weak ref Get fast (inlined) and atomic.
Earlier in the paper there are some warnings about how common weak refs
were in some of the DeCapo benchmarks. This made me concerned about sweep
storms caused by weak reference niling. In addition nobody seems to have a
good answer to clearing stale weak entries in large caches. Perhaps we will
end up yet again talking about building some sort of runtime notification
scheme.
WRT Ephermerons - I have no implementation but the big O and complexity of
adding another GC phase feels like the cost / benefit will be too high and
we have no user demand or requests.
None of this is intended to be negative about the proposal as is, I just
wanted to get issues and literature on the record.
On Sat, Aug 17, 2024 at 3:33 AM Michael Knyszek ***@***.***>
wrote:
> I definitely got some things wrong in my previous comment earlier -- I've
> corrected them now. I've also worked out an example of how ephemerons might
> be exposed in Go, but there are some difficulties with composability and
> defining comparability on an ephemeron, such that it actually can be used
> in all the same ways. I think on a practical level you can write your code
> in most (maybe all?) use-cases to use an ephemeron instead, but it's more
> complex and may require you to reimplement a lot more than you bargained
> for (like, a new special map implementation).
>
> So, I think there's still *some* utility to keeping weak pointers
> around, and I also don't think we should ignore the complexity (and
> possible performance difficulties) introduced by an ephemeron
> implementation. I'm also happy to hear other opinions.
>
> —
> Reply to this email directly, view it on GitHub
> <#67552 (comment)>, or
> unsubscribe
> <https://github.com/notifications/unsubscribe-auth/AAHNNH646LWVKABYUOIFHIDZR34EJAVCNFSM6AAAAABICGOSGKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEOJUG42DMNJRGY>
> .
> You are receiving this because you commented.Message ID:
> ***@***.***>
>
|
We'd considered using a channel for I get your point about the hazards of an API that says "I'll run this function asynchronously in the future" in general, but executing a function asynchronously is always going to be something you can write yourself. And things like FWIW, I think this part of the discussion would be better suited to the |
Have all remaining concerns about this proposal been addressed? The details are in the top comment under the "Proposal" heading. |
It would be great analyzing top production use cases for weak pointers in other programming languages before deciding on how to implement them in Go. Probably, it would be better providing higher-level solutions for these practical use cases in standard library instead of exposing lower-level weak pointers. |
@mknyszek could you please add to your proposal 2-3 examples of practical problems could not be solved now without weak pointers? |
Here's one example use: #67552 (comment). |
Based on the discussion above, this proposal seems like a likely accept. The details are in the top comment under the "Proposal" heading. |
I'm unaware of other practical cases for weak pointers. Probably, it is better providing generic thread-safe cache implementation inside The API of this cache may look like the following (feel free making it generic - I tried to do this, but the result wasn't good): // Cache is goroutine-safe cache with automatic cleanup on memory shortage.
//
// TODO: define what does "memory shortage" mean.
// Ideally the cache shouldn't be cleared on every GC cycle
// if there is enough free memory and the memory usage is stable over time.
//
// It is OK to use zero cache.
type Cache struct {
// Retention is optional retention for the cached items.
// If it is set to positive value, then items without
// access for the given duration are automatically deleted.
//
// Do not change this field after the first use of the cache.
Retention time.Duration
// Free is optional callback, which is called when the cached item is deleted.
//
// Do not change this field after the first use of the cache.
Free func(k, v any)
}
// Get returns value for the given key k stored in c.
//
// Nil is returned if the cache doesn't contain value for the given key.
func (c *Cache) Get(k any) any
// Set stores the given value v under the given key k in c.
func (c *Cache) Set(k, v any) According to my practice the |
The proposal names three general use-cases:
First off, let me say that weak pointers are very niche. I think it's easy to write them off if you've never run into a case where they'd really come in handy. Also, apologies for not walking through concrete examples in the original proposals. Because weak pointers (and their higher-level abstractions) are so niche, it's often a distraction to try to talk about specific examples in detail. But I'll admit that the broader internet is... frighteningly bad at providing concrete examples. So, let me give it a shot with something I've worked on lately.
The thing is, stacks are usually best represented as slices of PCs. But slices cannot be used with As an example where binding lifetimes together is useful, we can take the previous example a step further. Every stack consists of a bunch of PCs which have some additional information associated with them. We want to keep that data around as long as that PC is still in use. One thing we can do is canonicalize each PC, turning it into a pointer:
And then use this canonical pointer to look up the rest of the stack information. (And we get the added benefit of deduplicating it along the way.)
(EDIT: Entries in the map would be cleared by calling Why store it separately? Because hashing a Is this the best design for this? I don't know -- there's certainly a space of designs here and maybe one of them is far and away the best. But I think it illustrates that with weak pointers, we can explore the space of designs to achieve what we want. Hopefully I didn't totally butcher this example and it makes sense to others as well. Coming back to the general applicability of weak pointers, I think it's fair to say that weak pointers are used with maps most of the time. Seebs said as much in the OP to #43615. But, I think that providing the building blocks will capture far more use-cases for a lower maintenance and complexity burden for the Go project. Today we have proposals for:
These are three powerful, orthogonal, and also surprisingly simple building blocks that can be used cover most of the "I need a weak something" design space. And I do want to stress that these things are fairly simple in their own right, including (As an additional footnote, I don't foresee the semantics described in this proposal as being a problem for possible future GC implementations either. If anyone has insights on that though, it would be good to hear them now.) To be clear, I do I think it is important to ask why not just try and provide higher-level abstractions as necessary. But I also want to point out that that's been the unstated status quo for years, and we haven't gotten very far with that. I think part of the problem may be that few of the higher-level abstractions really cover enough use-cases to be worth the time to design a really good API. We have Anyway, that's my take. |
I'm not sure I understand this. The map will always refer to the |
That's correct. Sorry about that; I elided the fact that one would also need to use EDIT: Thanks. I updated my comment above to include that detail, and also that once we have |
What is the relationship between maphash.Comparable and weak? |
A couple clarifying questions.
Is it possible for two pointers to compare equal if they were created from distinct objects? E.g. the first was reclaimed and its memory subsequently reused.
Although the language is unambiguous, can it be strengthened to be "if and only if the original pointer was reclaimed"? That is, are we guaranteed that a weak pointer's |
For an object to be reclaimed, there can't be any pointers to it anymore. So, no.
"If and only if" is not true under the proposed API. It also returns |
I think the scenario in question is one where you have a weak pointer to |
Ah, that makes sense. The doc comment says
I would read that as "yes, in that scenario, the weak pointers are equal", which does seem strange. Especially if it would mean that (by a strict reading) you would now have two weak pointers comparing equal, with one returning So, really "I don't know" and I'll step back from this question, sorry for the noise :) |
No, that is not possible. Note that this can't be true for things like using weak pointers in maps to be useful.
Perhaps I didn't explain this well in the original proposal, but weak pointers should be considered to be updated atomically, globally. It's not possible to have any skew, where one simultaneously returns nil and another does not.
This is kind of quarantining of memory not necessary, and it would be leaky. The implementation part of the proposal describes how this would work. In short, weak pointers point are themselves objects which are mapped 1:1 with some offset in some other object. The address of each "weak pointer object" at determines its identity. |
Great! Could this also be strengthened in the documentation, such as "Two Pointer values compare equal if and only if the pointers..."? It's not clear to intuit this property in the context of pointers to reclaimed objects. Edit: removed the second portion, I believe I realized my mistake. |
No change in consensus, so accepted. 🎉 The details are in the top comment under the "Proposal" heading. |
What is a weak pointer?
Weak pointers (or weak references, as they are referred to in other languages) allow developers to reference memory without preventing the garbage collector from reclaiming it. To prevent visible dangling references, weak pointers become nil when the referenced memory is collected. Weak pointers can be converted to regular ("strong") pointers which do prevent the garbage collector from reclaiming memory, and allow typical use and access to that memory.
Weak pointers are typically a bit harder to work with than regular pointers because they can become nil at any time. Virtually every weak-to-strong conversion must be guarded by a nil check. Often, weak pointers will become nil at unexpected times.
Nonetheless, weak pointers exist in many languages because they are useful. The main use-cases for weak pointers are related to efficient memory management and reclamation. For example, efficiently managing memory for canonicalization maps, or for memory whose lifetime is bound to the lifetime of another object (akin to JavaScript's WeakMap). Another good use case for weak pointers is for hinting to the GC that it's OK to drop some resources because they're cheap to reconstruct later, especially if those resources have a large memory footprint.
Proposal
I propose adding a new
weak
package to the standard library with the following API.Design discussion
Representation
The weak pointer we propose would be represented via an indirection object under the hood. This means that each pointer that has any weak pointers associated with it also has an associated 8 byte allocation that contains a single pointer. This pointer is the actual weak reference. This representation is very CPU efficient, since the GC can atomically set a single pointer to make all weak references to an object nil.
This 8 byte allocation is rarely a problem in practice since weak pointers are most often useful for memory efficiency anyway. For example, canonicalization maps are deduplicating memory already, so the extra 8 bytes per canonical copy is relatively cheap. However, it may be worth calling out this detail in the documentation as the C# language does.
This representation has other benefits. One benefit is that the API's equality semantics fall out very naturally. Weak pointers created from the same pointer remain equal even after that pointer no longer exists. This makes it possible to build maps keyed by weak pointers. Another benefit is its simplicity: very little of the GC has to change to support weak pointers.
Avoiding object resurrection
A subtle but important property of the weak pointer implementation is that it must not enable object resurrection. This would create many complications, including the fact that weakly-referenced objects would take 2 GC cycles to reclaim; and they could not be involved in cycles, or they would never be reclaimed (much like finalizers).
A simple way to implement weak references that maintains this property is to guard weak-to-strong conversions with ensuring the object is swept. A swept object cannot be resurrected, because it is always definitely reachable or unreachable, nothing in between. This means the weak-to-strong conversion path may need to sweep the object, but this happens at most once per GC cycle, and the Go runtime's sweeper is quite fast, so the cost shouldn't be noticeable in practice.
Why this specific interaction with finalizers?
The interaction between weak pointers and finalizers that resurrect objects is subtle. There are two choices: the weak pointer can go nil before the finalizer runs, or it can track the object's resurrection, and only go nil once the object is well and truly dead. In C# jargon, this is the difference between "short" weak reference and a "long" weak reference.
At first glance, the latter seems clearer since the weak pointer really only goes nil once nothing can reference the object. However, this semantics has a big problem, and it's subtle.
Today, the decision to resurrect an object in a finalizer is internal to an API’s implementation. If we allowed weak-to-strong conversions after finalization, then clients of an API could resurrect an object without the cooperation of its implementation, at which point the object's internals may be in an inconsistent state. This creates the potential for a race that is not immediately apparent to either the API authors or the client. Worse still, these races will be very rare and unlikely to be caught by testing, even with the race detector enabled. If the weak pointer goes nil before the finalizer runs, then such races are impossible.
It is telling that other garbage collected languages either make "short" weak references the default and do not recommend their "long" form (C#), or only offer the "short" form to begin with (Java).
Risks
Breaking the GC abstraction
The
unique.Handle
proposal explicitly rejected weak pointers as an alternative on the grounds that they are difficult to use. Fundamentally, this is because they break the GC abstraction by revealing the timing with which memory is actually reclaimed, and this is mainly a problem because that timing is unpredictable. The GC abstraction of "infinite memory" (only allocation, no apparent free) is important in general for composability.While it is true that weak pointers are more difficult to use than regular pointers, we believe that we have picked an API (building on the experiences of other languages) that minimizes surprises, maintains composability, and has a simple implementation. This was discovered during the implementation of
unique.Handle
and it was compelling enough that it led to this proposal.Although weak pointers will be misused in some cases, providing documentation, examples, and advice in the GC guide will go a long way to mitigating that.
The text was updated successfully, but these errors were encountered: