
Smaller Refcounts #23

Merged: 1 commit into rust-lang:master on Apr 29, 2014
Conversation

@cgaebel commented Mar 29, 2014

No description provided.

@alexcrichton (Member)

This was discussed at great length on the original PR as well as in a meeting.

This RFC is quite terse and has very little statistical information to base its claims on. It would likely be a more amenable RFC with concrete real-world statistics, more thought-out names, and more analysis of why the original decisions should be overturned. The rationale of "but there's one extra word" is not precise enough to make this actionable.

@cgaebel (Author) commented Mar 29, 2014

Thanks for the links! I didn't know this was previously discussed.

I'll get more data.

@huonw (Member) commented Mar 29, 2014

With an allocator that uses size classes, how often does the weak count actually promote an allocation to the next class?

(Also, rustc probably won't be using just Rc. The AST etc. are likely to be better suited to an arena.)
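
For a back-of-the-envelope answer in today's Rust (the WithWeak/NoWeak structs and the 16-byte class spacing are illustrative assumptions, not std's RcBox layout or any particular allocator's real classes):

use std::cell::Cell;
use std::mem::size_of;

// Hypothetical boxes: one with both counts, one with only a strong count.
struct WithWeak<T> { strong: Cell<usize>, weak: Cell<usize>, data: T }
struct NoWeak<T> { strong: Cell<usize>, data: T }

// Assume small size classes spaced 16 bytes apart (jemalloc-like).
fn size_class(bytes: usize) -> usize {
    (bytes + 15) & !15
}

fn main() {
    // Pointer-sized payload on x86_64: 16 vs 24 bytes, i.e. here the weak
    // count does push the allocation from the 16-byte class to the 32-byte class.
    println!("{} -> {}", size_of::<NoWeak<usize>>(), size_class(size_of::<NoWeak<usize>>()));
    println!("{} -> {}", size_of::<WithWeak<usize>>(), size_class(size_of::<WithWeak<usize>>()));
    // 16-byte payload: 24 vs 32 bytes, both land in the 32-byte class,
    // so the extra word is free under this (assumed) class spacing.
    println!("{} -> {}", size_of::<NoWeak<[u64; 2]>>(), size_class(size_of::<NoWeak<[u64; 2]>>()));
    println!("{} -> {}", size_of::<WithWeak<[u64; 2]>>(), size_class(size_of::<WithWeak<[u64; 2]>>()));
}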

@eddyb (Member) commented Mar 30, 2014

I actually switched the focus of my P experiment to an owning AST model instead of a sharing one, after I ended up with an i1 ref-count and realized the ast_map can do without sharing.

There are a few good reasons against sharing AST nodes - and if there were currently a way to sneak duplicates past the last folding stage (ast_map itself), you could use safe code to bypass borrowck.

@huonw (Member) commented Apr 13, 2014

Something like this came up on /r/rust; maybe we could have:

pub struct Rc<T, Strong=uint, Weak=uint> {
    data: *mut RcBox<T, Strong, Weak>
}
struct RcBox<T, Strong, Weak> {
    data: T,
    strong: Cell<Strong>,
    weak: Cell<Weak>
}

impl<T, Strong: Num, Weak: Num> Rc<T, Strong, Weak> {
    fn downgrade(&self) -> Weak<T, Strong, Weak> { ... }
}

// no assumptions on Weak here
impl<T, Strong: Copy + Num, Weak> Clone for Rc<T, Strong, Weak> {
    fn clone(&self) -> Rc<T, Strong, Weak> {
        ...
    }
}

Notably this would allow instantiating something like Rc<T, uint, ()> to avoid weak counts statically (the downgrade method would be designed to not work with (), by having a trait bound it doesn't satisfy), without needing to duplicate the functionality.

This would presumably also require CheckedAdd, so that Rc<T, u8, u8> worked correctly, i.e. didn't wrap for something with 256 references. One might be concerned about this from a performance perspective for the common case of Rc<T> (i.e. both uint) but theoretically we could have something like

fn clone(&self) -> Rc<T, Strong, Weak> {
    if Strong::max_value() < uint::max_value() {
        // use checked_add
    } else {
        // range is large enough, just add directly
    }
}

(which trades a little bit of no-opt performance for opt performance.)

(NB. this is mostly a chain of thought, so I haven't worked through the details.)
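
Here is a rough sketch of how the Rc<T, Strong, ()> idea above could hang together, written in today's Rust; the Count trait, the WeakRef type, and the method names are hypothetical, not std's actual API:

use std::cell::Cell;

// Hypothetical counter trait; implemented for real integer types only.
trait Count: Copy {
    fn incr(self) -> Self;
}
impl Count for usize {
    fn incr(self) -> Self { self.checked_add(1).expect("refcount overflow") }
}
// Deliberately no `impl Count for ()`.

struct RcBox<T, S, W> {
    strong: Cell<S>,
    weak: Cell<W>,
    data: T,
}

pub struct Rc<T, S = usize, W = usize> {
    ptr: *mut RcBox<T, S, W>,
}

pub struct WeakRef<T, S, W> {
    ptr: *mut RcBox<T, S, W>,
}

// Cloning only needs the strong slot to be a real counter, so it works
// even for Rc<T, usize, ()>.
impl<T, S: Count, W> Rc<T, S, W> {
    pub fn clone_ref(&self) -> Rc<T, S, W> {
        unsafe {
            let s = &(*self.ptr).strong;
            s.set(s.get().incr());
        }
        Rc { ptr: self.ptr }
    }
}

// downgrade exists only when the weak slot is also a real counter, so
// Rc<T, usize, ()> simply has no downgrade method to misuse.
impl<T, S: Count, W: Count> Rc<T, S, W> {
    pub fn downgrade(&self) -> WeakRef<T, S, W> {
        unsafe {
            let w = &(*self.ptr).weak;
            w.set(w.get().incr());
        }
        WeakRef { ptr: self.ptr }
    }
}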

@esummers

I'm not sure if this is doable in Rust, but Objective-C does smart pointers (via Automatic Reference Counting) with the reference count stored in the most significant 16 bits of the pointer on 64-bit platforms. That allows them to use a single word to represent a smart pointer, since weak references are stored in the object itself (weak references are a bit different in Obj-C: they are tracked by pointer and zeroed when the original goes away). Apparently it also helps with atomic updates to the reference count, since everything is in a single word.

@huonw (Member) commented Apr 13, 2014

Doesn't that mean they're storing the reference count at the non-shared end of pointers? i.e. copying/cloning one reference isn't reflected in the counts of other references.

@dobkeratops

Maybe it's embedded in their object class info pointer or something.

@thestinger

I think it would be better to consider exposing a subset of the type without weak pointer support before adding another failure case.

@esummers

My bad, I obviously was not thinking clearly. They store the ref count in the isa pointer, not the object pointer. That technique would only work for some sort of smart pointer to structs that have virtual inheritance. https://www.mikeash.com/pyblog/friday-qa-2013-09-27-arm64-and-you.html
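
For illustration, a very rough sketch of the bit-packing trick in Rust; the widths, shifts, and saturating behaviour below are made up for the example and are not Apple's actual arm64 layout (the linked article has the real details):

// One 64-bit word holds both the class pointer bits and a small inline
// retain count; all masks and shifts here are illustrative only.
const PTR_MASK: u64 = 0x0000_ffff_ffff_fff8; // bits holding the pointer
const RC_SHIFT: u32 = 48;                    // inline count in the top bits
const RC_ONE: u64 = 1 << RC_SHIFT;
const RC_MASK: u64 = 0xffff << RC_SHIFT;     // 16-bit inline count

fn class_ptr(word: u64) -> u64 {
    word & PTR_MASK
}

fn retain(word: u64) -> u64 {
    // A real implementation spills to a side table when the inline count
    // is exhausted; this sketch just saturates instead.
    if word & RC_MASK == RC_MASK { word } else { word + RC_ONE }
}

fn release(word: u64) -> u64 {
    debug_assert!(word & RC_MASK != 0);
    word - RC_ONE
}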

@dobkeratops

To throw another idea in there: could a compact refcount switch to an extended area when it reaches a certain value? Probably too much complexity. I think anyone actually selecting a compact refcount would be doing it for a good reason - e.g. I know I'm not going to have over 4 billion objects referring to a texture, because the vast majority of the 8 GB of memory is actually storing textures, not objects.
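
A minimal sketch of what that 'extended area' could look like: an inline u32 count that, once it would overflow, parks further increments in a side table keyed by the object's address. Everything here (the names, the HashMap side table) is hypothetical:

use std::cell::Cell;
use std::collections::HashMap;

struct SpillCount {
    inline: Cell<u32>,
}

// `key` would be the object's address in a real design.
fn incref(count: &SpillCount, side: &mut HashMap<usize, u64>, key: usize) {
    match count.inline.get().checked_add(1) {
        Some(n) => count.inline.set(n),
        None => {
            // The inline count stays pinned at u32::MAX; the overflow
            // portion of the count lives in the side table.
            *side.entry(key).or_insert(0) += 1;
        }
    }
}

fn total(count: &SpillCount, side: &HashMap<usize, u64>, key: usize) -> u64 {
    count.inline.get() as u64 + side.get(&key).copied().unwrap_or(0)
}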

@esummers

Why are two words used anyway? Can't the weak ref and strong ref be 32-bit on 64-bit platforms and 16-bit on 32-bit platforms?

EDIT: OK, answer is safety.

@thestinger

You can certainly make more than 65536 references on 32-bit, and more than 4294967296 on 64-bit. In addition to making the type less scalable, it would break memory safety without adding overflow checks.

@huonw (Member) commented Apr 13, 2014

You can easily have more than 65536 references to an object on a 32-bit platform:

let x = Rc::new(1);
let v = range(0, 1_000_000).map(|_| x.clone()).collect::<Vec<Rc<int>>>();

(You may regard this as unlikely, but it's still entirely possible. uint is the smallest type guaranteed to be large enough to store the maximum number of references.)

@thestinger

Creating more than 2^32 references at 8 bytes a pop only requires 32GiB of memory.

@cgaebel (Author) commented Apr 13, 2014

How expensive would it be to just have the refcount overflow check, where the "handling" code is statically predicted as unlikely? Wouldn't this just be a well-predicted conditional jump on the overflow flag? Isn't that crazy cheap?

Also, this is a cost that's only paid when creating and destroying a bunch of references to something, which shouldn't be especially common. As far as I know, the "best practice" when dealing with a refcounted value is to borrow it and use the borrowed pointer as much as possible.
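
For reference, a minimal sketch of the kind of check being discussed, with the cold path calling abort rather than unwinding (the cost thestinger raises below); the function name is made up:

use std::cell::Cell;
use std::process;

// Increment a strong count; the overflow arm is a single, normally
// never-taken branch, and aborting keeps unwinding machinery out of it.
fn incr_strong(count: &Cell<u32>) {
    match count.get().checked_add(1) {
        Some(n) => count.set(n),
        None => process::abort(),
    }
}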

@dobkeratops

I can see that the default useful to most people is uint refcounts. There are situations where you control the number of objects, though - e.g. when you explicitly manage textures and object counts to fit within memory and frame-rate budgets. There are plenty of situations where you might be using a 64-bit address space but 32 or even 16 bits' worth of 'count' handles any 'management'. In games the majority of memory is textures, then vertex arrays, and the CPU doesn't traverse these at fine grain; it just tells the GPU what to do with large batches. But back on the Xbox 360 and PS3 we were kept very busy shaving bytes off control structures to prevent cache misses that crippled the CPU, fiddling with alignment to keep things on cache-line boundaries, and reworking things to avoid branches, which also crippled its pipeline. Worst of all worlds - even extra checks wouldn't have been acceptable; you'd have needed the option to compile them out.

@esummers

I think if you have a specialty case, you would just make your own RC for your crate. It probably doesn't make sense to have something in std unless it is safe.

@thestinger

@cgaebel: Adding new sources of unwinding is never cheap. It breaks many optimization passes all the way up the stack. If it called abort then sure, it would likely only result in wasted instruction cache space. However, that's not what Rust does when it encounters failures like this.

@dobkeratops

I guess if the language gets HKT in the future, algorithms will be able to abstract over custom pointer types :)

@dobkeratops

Would you consider the same thing for vectors and slices?

struct Vec<T, IndexType = uint> {
    len: IndexType,
    cap: IndexType,
    ptr: *mut T
}
impl<T, IndexType> Index for Vec<T, IndexType> {
    fn index(&self, i: IndexType) -> &T { ... }
}

That would end a lot of the pain I was having with casting indices, in the right way.

I gather the rust compiler itself has u32 node ids. It's this middle ground of machines with 4, 8, 16 GB where 64-bit addressing is overkill but 32 bits is insufficient, and segmenting things into multiple 32-bit spaces per resource works well.

u32 indexing would be my most common case.

I know the Servo people also perceive problems with pointer overhead: they want to express a node hierarchy, and I suspect their use case might suit this sort of thing - an array of nodes and 32-bit indexing, or 32-bit offsets within an arena with a max size of 4 GB for the DOM when running on phones.

With objects of 16-byte alignment (which you want for SIMD vec4 types), a 32-bit index is sufficient to cover 64 GB, and it's more likely your memory is divided between different classes of resource anyway.

I've also heard talk of a 'smallvector' elsewhere. Might parameterizing the index (and allocator) mean that Vec can do that job?
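
Going back to the Vec sketch above, a compact restatement in today's Rust just to show the size win; SmallIdxVec and its layout are hypothetical, not an existing std or crate type:

use std::marker::PhantomData;

// 16 bytes on x86_64 instead of Vec<T>'s 24: one pointer plus two u32s.
struct SmallIdxVec<T> {
    ptr: *mut T,
    len: u32,
    cap: u32,
    _marker: PhantomData<T>,
}

impl<T> SmallIdxVec<T> {
    // Indexing casts the u32 to usize in exactly one place, instead of
    // sprinkling casts through user code.
    fn get(&self, i: u32) -> &T {
        assert!(i < self.len);
        unsafe { &*self.ptr.add(i as usize) }
    }
}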

@thestinger

I've also heard talk of a 'smallvector' elsewhere. Might parameterizing the index (and allocator) mean that Vec can do that job?

The small vector optimization is the opposite of what you're proposing.

@dobkeratops

Is there a link describing the 'small vector' then? I'd also heard slices 'might fill a niche a bit like small vectors', but I think slices can be slices into large vectors...
Do I just need to roll my own to be content?

@thestinger

Slices and smaller index fields are both unrelated to the small vector optimization.

The libc++ implementation of std::string is still 24 bytes on x86_64 (pointer, length, capacity) but is capable of storing up to 23 byte strings directly in the object itself without performing dynamic allocation. One byte is used to distinguish between large and small strings and record the small string length. In this case, reducing the size of the length and capacity fields would even be counter-productive.

A more general small vector: http://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallvector-h
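
To make the distinction concrete, a rough sketch of the small-vector idea in today's Rust (an enum with Option slots rather than the union/MaybeUninit tricks real implementations use); the names are illustrative, not LLVM's SmallVector or any particular crate:

// Up to N elements live inline in the object itself; the heap is only
// touched once the inline buffer overflows. A real implementation would
// use MaybeUninit instead of Option slots to avoid the per-slot overhead.
enum SmallVec<T, const N: usize> {
    Inline { buf: [Option<T>; N], len: usize },
    Heap(Vec<T>),
}

impl<T, const N: usize> SmallVec<T, N> {
    fn new() -> Self {
        SmallVec::Inline { buf: std::array::from_fn(|_| None), len: 0 }
    }

    fn push(&mut self, value: T) {
        // If the inline buffer is full, spill its contents to the heap first.
        if matches!(self, SmallVec::Inline { len, .. } if *len == N) {
            let old = std::mem::replace(self, SmallVec::Heap(Vec::new()));
            if let SmallVec::Inline { buf, .. } = old {
                *self = SmallVec::Heap(buf.into_iter().flatten().collect());
            }
        }
        match self {
            SmallVec::Inline { buf, len } => {
                buf[*len] = Some(value);
                *len += 1;
            }
            SmallVec::Heap(v) => v.push(value),
        }
    }
}

SmallVec::<u8, 23>::new(), for instance, stays allocation-free for the first 23 pushes, much like the 23-byte inline buffer described above for libc++'s std::string.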

@brson merged commit a1d6e9e into rust-lang:master on Apr 29, 2014
@brson (Contributor) commented Apr 29, 2014

Merged as RFC 13, but not accepted. This discussion took place previously, when the original decision was made to merge Rc with weak refcounting. Although it's a difficult tradeoff, the extra word was seen as worth it to reduce the number of refcounted types.

@dobkeratops

Was the idea of parameterizing the refcount type itself also rejected? I know there are details around introducing a fail case if you use a smaller refcount type, but this would save the community from implementing their own variations to get the desired behaviour (something that will happen many times, independently). For those of us who target machines with 4-16 GB of RAM, uint counts and indices everywhere are wasteful, and 32-bit builds are insufficient.

withoutboats pushed a commit to withoutboats/rfcs that referenced this pull request Jan 15, 2017
@Centril added the A-allocation (Proposals relating to allocation) label Nov 23, 2018
wycats pushed a commit to wycats/rust-rfcs that referenced this pull request Mar 5, 2019