Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add consuming cheap cast #142

Merged
merged 5 commits into from Jan 21, 2018
Merged

Conversation

@faern
Copy link
Contributor

faern commented Jan 14, 2018

There are a lot of places where we increment and decrement the retain count when we would not have to. For example, a very common use-case with PropertyLists is to just directly cast them to their correct sub type and then throw the PropertyList away (because there is nothing you can do with the PropertyList instance directly.)

The reason we do this, I guess, is mostly because all Drop implementations force a retain count decrement. So we have to keep incrementing it to keep it correct. But if we add a flag to types that can be used to disable the decrement, then we can also get rid of the increment when we virtually just want to cast a pointer between two types.

This PR implements this by adding a boolean flag to all CF structs that is used by the Drop impl to determine if it should CFRelease the instance or not. Then this flag is modified in appropriate places where we cast to other types and consume ourselves.

I added some benchmarks locally. I did not want to commit them since benchmarking only works on nighty, and I did not feel like adding a feature flag for it etc. Anyhow, the following benchmarks:

#[bench]
fn bench_before(b: &mut Bencher) {
   let string = CFString::from_static_string("Bar");
    b.iter(|| unsafe {
        string.clone()
            .to_CFPropertyList()
            .downcast::<_, CFString>()
            .unwrap()
            .disable_release();
    })
}

#[bench]
fn bench_after(b: &mut Bencher) {
    let string = CFString::from_static_string("Bar");
    b.iter(|| unsafe {
        string.clone()
            .into_CFPropertyList()
            .downcast_into::<_, CFString>()
            .unwrap()
            .disable_release();
    })
}

Given the following results:

test propertylist::test::bench_after         ... bench:          16 ns/iter (+/- 3)
test propertylist::test::bench_before        ... bench:          73 ns/iter (+/- 14)

The final call to disable_release() in the benchmarks are to prevent the type coming out of the unwrap() from calling CFRelease on drop, and thus avoid measuring that. I want to try to measure only {to,into}_CFPropertyList, downcast and downcast_into. The clone is needed because the into version requires it and it would then not be fair to not have it on both.

I would say that speeding it up by a factor of ~4 can be well worth it? (Well, that 4 is of course no good measurement of anything, microbenchmarks etc. etc.. The benchmark mostly shows that a lot can be gained from avoiding some retain counting.)


This change is Reviewable

@faern
Copy link
Contributor Author

faern commented Jan 14, 2018

Other than the cheap into casts this PR also adds the normal downcast to CFType making it easy to downcast it directly to a subtype. As you can see this makes scenarios such as is the CFArray tests much easier to write (and without need for unsafe)

@jrmuizel
Copy link
Collaborator

jrmuizel commented Jan 15, 2018

I haven't looked at this in detail so don't understand it yet but I'm not really a fan of growing the size of the CFTypes

@jrmuizel
Copy link
Collaborator

jrmuizel commented Jan 15, 2018

Can't we just transmute between Ref types to avoid Drop?

@jrmuizel
Copy link
Collaborator

jrmuizel commented Jan 15, 2018

Or rather transmute the wrapper types?

@jrmuizel
Copy link
Collaborator

jrmuizel commented Jan 15, 2018

Or perhaps add a method on the wrapper type that returns the Ref type but doesn't drop()? Most wrapper types could implement this using transmute to avoid the drop.

@faern
Copy link
Contributor Author

faern commented Jan 15, 2018

@jrmuizel I guess it would be possible to use mem::forget(self) in the consuming methods that are now utilizing disable_release instead. I just felt this was a bit more general and safe though, since a mem::forget(self) would rely on the fact that the drop only did the release. If any type introduced any other drop logic it would also not run that. But I guess we might be able to assume avoiding drop only does the release after all, since I now moved the drop impl to the macro, so all CF types automatically have the same code there. Except the special ones who does not yet use declare_TCFType.

What are your arguments against growing the structs by one bool?

@faern
Copy link
Contributor Author

faern commented Jan 15, 2018

I have now pushed a commit that removes disable_release and instead uses mem::forget(self) to avoid the drop impl and thus the CFRelease call. This leaves all structs at their original size (one pointer). If people agree this is a preferred way I can squash this last fix into the main commit doing this change.

@jrmuizel
Copy link
Collaborator

jrmuizel commented Jan 15, 2018

What are your arguments against growing the structs by one bool?
None of my arguments are amazing, but here they are:

  • it causes us to lose what's mostly a zero-cost abstraction
  • it has a small increase of the overhead of drop()
  • it means that arrays of CFTypes use up twice as much space
  • it means that we don't have any chance of transmuting the void* array that CFArrayGetValues produces into the wrapper types.
@faern
Copy link
Contributor Author

faern commented Jan 15, 2018

Yes, keeping them pointer sized is probably very desirable. Especially since it was so easy to achieve the same results without adding the bool.

@faern faern force-pushed the faern:add-consuming-cheap-cast branch 3 times, most recently from 80ab652 to 1433fec Jan 15, 2018
@jdm
jdm approved these changes Jan 17, 2018
Copy link
Member

jdm left a comment

I like these changes.

sum += number.to_i64().unwrap()
}

assert!(sum == 15);

for elem in arr.iter() {
let number: CFNumber = TCFType::wrap_under_get_rule(mem::transmute(elem));
let number: CFNumber = TCFType::wrap_under_get_rule(mem::transmute(elem.as_concrete_TypeRef()));

This comment has been minimized.

@jdm

jdm Jan 17, 2018

Member

Why are these changes necessary?

This comment has been minimized.

@faern

faern Jan 17, 2018

Author Contributor

Ehm.. They are not, any longer. they were before when I actually changed the structs to contain a bool, then the transmute did not work because the types were of different size. I'll remove it! It's removed in the commit after anyway, but anyhow.

@faern faern force-pushed the faern:add-consuming-cheap-cast branch from 1433fec to cde8e00 Jan 17, 2018
@jdm
Copy link
Member

jdm commented Jan 17, 2018

@bors-servo
Copy link
Contributor

bors-servo commented Jan 17, 2018

📌 Commit cde8e00 has been approved by jdm

@faern
Copy link
Contributor Author

faern commented Jan 18, 2018

Optimally this should be rebased when #135 is merged and updated, since that PR add's a type that does not make use of the macro here.

@faern faern force-pushed the faern:add-consuming-cheap-cast branch from cde8e00 to 01e02dc Jan 18, 2018
@faern faern force-pushed the faern:add-consuming-cheap-cast branch from 01e02dc to 237e838 Jan 20, 2018
@faern faern force-pushed the faern:add-consuming-cheap-cast branch from 237e838 to 3ec0d57 Jan 20, 2018
@jdm
Copy link
Member

jdm commented Jan 20, 2018

@bors-servo
Copy link
Contributor

bors-servo commented Jan 20, 2018

📌 Commit 3ec0d57 has been approved by jdm

@jdm
Copy link
Member

jdm commented Jan 20, 2018

@faern
Copy link
Contributor Author

faern commented Jan 21, 2018

PR #84 also adds a type without making use of the declare_TCFType! macro. So if that one is merged and I'm not away until this gets auto merged as well then I'll rebase this and fix that.

@bors-servo
Copy link
Contributor

bors-servo commented Jan 21, 2018

Testing commit 3ec0d57 with merge 8834563...

bors-servo added a commit that referenced this pull request Jan 21, 2018
Add consuming cheap cast

There are a lot of places where we increment and decrement the retain count when we would not have to. For example, a very common use-case with PropertyLists is to just directly cast them to their correct sub type and then throw the PropertyList away (because there is nothing you can do with the PropertyList instance directly.)

The reason we do this, I guess, is mostly because all `Drop` implementations force a retain count decrement. So we have to keep incrementing it to keep it correct. But if we add a flag to types that can be used to disable the decrement, then we can also get rid of the increment when we virtually just want to cast a pointer between two types.

This PR implements this by adding a boolean flag to all CF structs that is used by the `Drop` impl to determine if it should `CFRelease` the instance or not. Then this flag is modified in appropriate places where we cast to other types and consume ourselves.

I added some benchmarks locally. I did not want to commit them since benchmarking only works on nighty, and I did not feel like adding a feature flag for it etc. Anyhow, the following benchmarks:
```rust
#[bench]
fn bench_before(b: &mut Bencher) {
   let string = CFString::from_static_string("Bar");
    b.iter(|| unsafe {
        string.clone()
            .to_CFPropertyList()
            .downcast::<_, CFString>()
            .unwrap()
            .disable_release();
    })
}

#[bench]
fn bench_after(b: &mut Bencher) {
    let string = CFString::from_static_string("Bar");
    b.iter(|| unsafe {
        string.clone()
            .into_CFPropertyList()
            .downcast_into::<_, CFString>()
            .unwrap()
            .disable_release();
    })
}
```
Given the following results:
```
test propertylist::test::bench_after         ... bench:          16 ns/iter (+/- 3)
test propertylist::test::bench_before        ... bench:          73 ns/iter (+/- 14)
```
The final call to `disable_release()` in the benchmarks are to prevent the type coming out of the `unwrap()` from calling `CFRelease` on drop, and thus avoid measuring that. I want to try to measure only `{to,into}_CFPropertyList`, `downcast` and `downcast_into`. The `clone` is needed because the `into` version requires it and it would then not be fair to not have it on both.

I would say that speeding it up by a factor of ~4 can be well worth it? (Well, that 4 is of course no good measurement of anything, microbenchmarks etc. etc.. The benchmark mostly shows that a lot can be gained from avoiding some retain counting.)

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/core-foundation-rs/142)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Jan 21, 2018

☀️ Test successful - status-travis
Approved by: jdm
Pushing 8834563 to master...

@bors-servo bors-servo merged commit 3ec0d57 into servo:master Jan 21, 2018
2 checks passed
2 checks passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
homu Test successful
Details
@faern faern deleted the faern:add-consuming-cheap-cast branch Jan 21, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants
You can’t perform that action at this time.