Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite all the bindings in a better way #85

Closed
wants to merge 2 commits into from

Conversation

@nox
Copy link
Member

nox commented Sep 30, 2016

All the FFI functions have fancy types now and use references instead of raw pointers for arguments.

Now, a CFRef<T> value represents a refcounted value to be released on Drop, CFShared<T> represents a shared value that can be retained to produce a new CFRef<T> (this will be a useful distinction for mutable types, where a bare T shouldn't be retainable to keep unique ownership).

To simplify things and avoid either having wrappers of wrappers of all methods behind traits, I put the safe and thin methods directly on the opaque FFI types.

Let's begin the infinite bikeshedding!


This change is Reviewable

All the FFI functions have fancy types now and use references instead
of raw pointers for arguments.

Now, a CFRef<T> value represents a refcounted value to be released on Drop,
CFShared<T> represents a shared value that can be retained to produce a new
CFRef<T> (this will be a useful distinction for mutable types, where a
bare T shouldn't be retainable to keep unique ownership).

To simplify things and avoid either having wrappers of wrappers of all
methods behind traits, I put the safe and thin methods directly on the
opaque FFI types.

Let's begin the infinite bikeshedding!
@nox
Copy link
Member Author

nox commented Sep 30, 2016

@sfackler
Copy link
Contributor

sfackler commented Sep 30, 2016

As I believe you are aware, I am very opposed to putting this much logic in a -sys package. It breaks with the convention of literally every -sys crate I am aware of. That convention is there because it allows -sys crates to be "dumb" reflections of the API they bind to, and allow people to make whatever abstractions they want to on top of that. In particular, since core-foundation-sys declares in its Cargo.toml that it links to CoreFoundation.framework, it cannot coexist in a dependency hierarchy with any other crate that also provides bindings to that library. People have basically no choice but to use this crate.

The fact that it simplifies things for the implementation of core-foundation is not really an excuse IMO. I tend to care far more about the convenience of downstream users of the libraries I work on than my own.

@nox
Copy link
Member Author

nox commented Sep 30, 2016

That convention is there because it allows -sys crates to be "dumb" reflections of the API they bind to, and allow people to make whatever abstractions they want to on top of that

And that's why the -sys crate still include the FFI prototypes publicly.

In particular, since core-foundation-sys declares in its Cargo.toml that it links to CoreFoundation.framework

If we are on the topic of conventions, are you aware that -sys crates are not supposed to do this? I forgot to remove it in this patch. Removing it now.

The fact that it simplifies things for the implementation of core-foundation is not really an excuse IMO. I tend to care far more about the convenience of downstream users of the libraries I work on than my own.

Yes, I agree with you. But the convenience of downstream users with my new bindings still greatly improve, because they don't need to have to use traits, and they still see only one type, and they don't have an additional level of indirection, and they don't need to retain every single value retrieved with "the get rule" like in the current bindings.

@sfackler
Copy link
Contributor

sfackler commented Sep 30, 2016

If we are on the topic of conventions, are you aware that -sys crates are not supposed to do this? I forgot to remove it in this patch. Removing it now.

Why isn't it supposed to? links exists specifically for -sys crates.

Yes, I agree with you. But the convenience of downstream users with my new bindings still greatly improve, because they don't need to have to use traits, and they still see only one type, and they don't have an additional level of indirection, and they don't need to retain every single value retrieved with "the get rule" like in the current bindings.

I don't see why any of that is incompatible with leaving the bindings unchanged and overhauling the core-foundation crate?

@nox
Copy link
Member Author

nox commented Sep 30, 2016

Why isn't it supposed to? links exists specifically for -sys crates.

Oh right. I can't read. What's the problem though since you have the FFI functions? Just don't use the impls. I don't understand your point about not being able to coexist with other things or whatever.

I don't see why any of that is incompatible with leaving the bindings unchanged and overhauling the core-foundation crate?

Needing at least 3 or 4 types per binding, compared to 1 per binding and CFRef and CFShared. And core-graphics-rs is a downstream user of that crate, so you end up with an explosion in the number of wrappers the more things you bind.

@nox
Copy link
Member Author

nox commented Sep 30, 2016

Is it even a problem if 2 crates link to the same framework through #[link(...)], btw?

@sfackler
Copy link
Contributor

sfackler commented Sep 30, 2016

It is I believe undefined behavior to bind the same extern fn with multiple different signatures.

In addition, though it doesn't apply here since Core Foundation is a system library, you don't want to end up with two copies of a library with e.g. two crates dynamically linking to different versions of the same library, or pulling in 2 statically linked versions, etc.

@sfackler
Copy link
Contributor

sfackler commented Sep 30, 2016

Needing at least 3 or 4 types per binding, compared to 1 per binding and CFRef and CFShared. And core-graphics-rs is a downstream user of that crate, so you end up with an explosion in the number of wrappers the more things you bind.

Again, I really don't understand why the definitions need to be in the -sys crate to pull off a CFRef setup.

@eddyb
Copy link

eddyb commented Sep 30, 2016

@sfackler FWIW, rustc handles multiple imports (with different signatures) of the same symbol just fine.
It has been needed for different versions of the libc crate coexisting, for a long while.

@sfackler
Copy link
Contributor

sfackler commented Sep 30, 2016

How can other downstream users who don't necessarily want to use all of the other stuff in the -sys crate use this as an FFI crate if the FFI definitions depend on that stuff e.g.

    pub fn CFAllocatorGetDefault() -> Option<&'static CFShared<CFAllocator>>;
@nox
Copy link
Member Author

nox commented Sep 30, 2016

Just store a pointer to CFShared<CFAllocator>? What's the problem? They can also just store a pointer to the opaque CFAllocator by using the method on CFShared<T>. What's wrong with that?

Note that this is all mostly a theoretical problem, given there are only 2 crates that depends on core-foundation-sys outside of what is maintained by Servo, and fortunately none of them are 1.0.0.

@sfackler
Copy link
Contributor

sfackler commented Sep 30, 2016

I am the owner of one of those 2 crates and I am telling you that the changes to the -sys side is a significant step backwards.

@nox
Copy link
Member Author

nox commented Sep 30, 2016

I'm quite sad to focus on a theoretical problem that seems to come down to "but my -sys convention!" without ever considering what goodness bring the new refcounting encoding and safer FFI functions.

@nox
Copy link
Member Author

nox commented Sep 30, 2016

You didn't explain how it is a significant step backwards except that it goes against a convention that we can very much evolve, without even trying to use the code. This feels like a gut reaction more than anything else.

@sfackler
Copy link
Contributor

sfackler commented Sep 30, 2016

I will write up my thoughts when I have a chance (might not have a chance until tomorrow, sorry).

@nox
Copy link
Member Author

nox commented Sep 30, 2016

@nox
Copy link
Member Author

nox commented Sep 30, 2016

@sfackler I am going to make a PR on core-graphics-rs using this so we can see how it goes for a downstream crate, and then do two other ones, one here and one there, with your preferred way of keeping the split, so we can compare with the same features in both directions.

This type provides DerefMut access to the underlying Core Foundation type.
Will be used for CFMutableString and friends.
@nox nox force-pushed the nox:fix-all-the-things branch from 12e4c5c to f077c50 Oct 12, 2016
@nox
Copy link
Member Author

nox commented Oct 20, 2016

@sfackler Ping?

@sfackler
Copy link
Contributor

sfackler commented Oct 21, 2016

Ah yes, sorry.

So, here is a basic breakdown of my thoughts:

  1. FFI bindings are hard to get right. @alexcrichton did a bunch of work to modernize openssl-sys and found numerous places where the function declarations there were incorrect. This ranged from signedness issues (e.g. u8 instead of c_char), to use of raw types instead of C definitions that would fail on non-x86 architectures, to even extra parameters! Some of these issues had been around for 4-5 years, and were probably not noticed because they weren't being used heavily, or it happened to work out on the ABIs that they were used on.
  2. It's important to structure things such that it's both easy for people to be confident that you've gotten the bindings right, and to add new ones correctly - you almost never bind 100% of a C API on the initial implementation in my experience. Using types that are 'obviously' equivalent to those in the C bindings makes this easier. You see a function returning a *mut Foo in Rust and know that maps to a function returning a Foo * in C.
  3. As I mentioned above, there should ideally be a single -sys crate in the ecosystem for a given C API. For non-system libraries this is basically a requirement due to the prohibition against doubly-linking. If there can only be one crate, it shouldn't force downstream code to behave in any ways other than what's already imposed by the C API itself. If you're writing a crate that wraps a C API, you're going to be working off of the documentation and references of the C API itself. If the -sys crate isn't exposing that API, it's going to get confusing. If the -sys crate exposes an API that's incompatible with or even puts up roadblocks in the way you're trying to structure things, that's a bummer.
  4. CoreFoundation is a very well behaved C API. It's built around reference counting to a large extent, and the ownership semantics of consumed and returned values is straightforward. From the other libraries I've worked on, that niceness is somewhat rare. C libraries in general expose dozens of strange and unique ownership and safety semantics that can't be expressed via adjusted FFI definitions in this way. For example:
    1. OpenSSL's X509_verify_cert_error_string function returns a C string with a static lifetime if ERR_load_crypto_strings has already been called, ERR_free_strings hasn't been called, and the number passed to it maps to a "valid" error code. If the error code is invalid, a pointer to a static mutable buffer filled in with the stringified version of that code is returned, which is obviously not thread safe.
    2. OpenSSL's BN_mod_mul function takes 4 bignums: r, a, b, and mn and places the result of (a * b) % m in r. r can actually be equal to one of a or b (but not m!), in which case the operation is performed in-place.
    3. SChannel's CryptAcquireCertificatePrivateKey function pulls the private key out of a certificate. It will normally return an owned copy of the key, but you can pass the CRYPT_ACQUIRE_CACHE_FLAG flag to request that a referenced to a cached copy is returned. This is apparently not guaranteed, and you have to check a boolean output to see if you actually need to free the key. In addition, the returned key is one of two types, and you need to check another output parameter to figure out which one, since they have different functions to free them!
  5. The point of all of this is that there are very commonly nontrival pre- and post-conditions that must be met when interacting with C APIs. If FFI bindings try to present a "safer" interface than what exists at the raw level, the amount of unsafety a caller still has to worry about will vary wildly between different libraries, or even different functions in the same library. I'm worried that users may overestimate the amount of safety that the interface is providing and neglect to carefully read and understand the caveats of the C API.
  6. The interface that's being presented seems reasonable, though I haven't taken a super close look. Being able to represent borrowed and owned refcounted values is definitely nice. However, none of this needs to propagate to the FFI layer itself. I am more than willing to write a bit more code in the safe wrapper layer to keep the -sys layer identical to what the C looks like.
@sfackler
Copy link
Contributor

sfackler commented Oct 21, 2016

Here's an example of some stuff in OpenSSL that I think is somewhat similar to this, and how it's pretty painless do do without needing to modify the FFI definitions directly: sfackler/rust-openssl@0f4a22f. Nothing more than some .as_ptr() and ::from_ptr() calls.

@nox
Copy link
Member Author

nox commented Oct 21, 2016

You see a function returning a *mut Foo in Rust and know that maps to a function returning a Foo * in C.

I wouldn't trust a tool doing this, everything must be checked by a human because C and Rust don't match. And if you check it, you can do it yourself and use better types. I can understand that you dislike custom types in FFI, but what is wrong with &T instead of const *T?

CoreFoundation is a very well behaved C API.

Wrong, I had to check the source code of CF many times to even be sure what were the exact semantics with regard to retained arguments.

I'm worried that users may overestimate the amount of safety that the interface is providing and neglect to carefully read and understand the caveats of the C API.

You can't abuse the amount of safety, that's the point: For example, you can't retain a bare &T but you can retain a &CFShared<T>, and if a FFI function takes a &T, you know it won't retain it. That's the only way to have a safe way to do mutable values.

However, none of this needs to propagate to the FFI layer itself. I am more than willing to write a bit more code in the safe wrapper layer to keep the -sys layer identical to what the C looks like.

Do you mean you want to experiment with my branch and do the separation again on top of it? That would be cool, I'm willing to see if that ends up with better code.

@sfackler
Copy link
Contributor

sfackler commented Nov 6, 2016

I wouldn't trust a tool doing this, everything must be checked by a human because C and Rust don't match.

These tools exist for a reason. libc had several things mis-bound until it added CI-time checks against the raw C headers, and the same was true when we added those checks to rust-openssl.

@nox
Copy link
Member Author

nox commented Nov 6, 2016

But CF headers don't say anything against get/create rules etc. Some things I had to check directly in the source code of CF.

@sfackler
Copy link
Contributor

sfackler commented Nov 6, 2016

I know, just like I have to source dive into openssl on top of having automated checks of FFI bindings.

@nox
Copy link
Member Author

nox commented Nov 6, 2016

Fair enough, I'll think about how to preserve the split. Meanwhile, did you have the time to look at the higher-level stuff I wrote?

@bors-servo
Copy link
Contributor

bors-servo commented Nov 7, 2016

The latest upstream changes (presumably #86) made this pull request unmergeable. Please resolve the merge conflicts.

@ghost
Copy link

ghost commented Jan 2, 2017

@nox I took a look at the higher level stuff, and I think it is a lot nicer than the current API in many ways. Have you thought about how we can get this working with the core-foundation crate split from the core-foundation-sys crate?

@nox
Copy link
Member Author

nox commented Feb 24, 2017

@sfackler did the neat foreign_types crate that may be very useful to finally clean this up and land it in a way that satisfies everyone. Should I go back to it @jdm?

@jrmuizel
Copy link
Collaborator

jrmuizel commented Mar 11, 2017

@sfackler did the neat foreign_types crate that may be very useful to finally clean this up and land it in a way that satisfies everyone. Should I go back to it @jdm?

Yes please.

@pcwalton
Copy link
Contributor

pcwalton commented Mar 14, 2017

For what it's worth, I concur with @sfackler that -sys crates should be as low-level and raw as possible.

@nox
Copy link
Member Author

nox commented Mar 14, 2017

FWIW I'm willing to rewrite it this way only because the community asks so, but I still believe I'm right that this is a bad idea, given it leads to way more code. Time will tell who was right.

@jrmuizel
Copy link
Collaborator

jrmuizel commented May 27, 2017

@nox do you mind giving an overview of what your approach here is going to be?

@nox
Copy link
Member Author

nox commented May 28, 2017

@nox do you mind giving an overview of what your approach here is going to be?

Basically following what @sfackler said.

@nox
Copy link
Member Author

nox commented Jul 4, 2017

I firmly believe that blocking this PR set back the whole core-* ecosystem by far because no one else is willing to clean up all our bindings. I should have just went ahead with my original proposal.

I still don't agree with @sfackler that the bindings should be as dumb as possible, because dumb code is a bad idea.

@nox
Copy link
Member Author

nox commented Jul 4, 2017

No need to keep this PR open though.

@nox nox closed this Jul 4, 2017
jdm pushed a commit that referenced this pull request Feb 1, 2018
Font smooth

<!-- Reviewable:start -->
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/core-graphics-rs/85)
<!-- Reviewable:end -->
jdm pushed a commit that referenced this pull request Feb 1, 2018
Make CTFontDescriptor::font_path() return filesystem path, not URL

Prior to this commit, the return string was in the format
`file:///foo/bar.otf`. We now omit the leading `file://` portion. This
also properly retains characters that are not URL safe.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/core-text-rs/85)
<!-- Reviewable:end -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

6 participants
You can’t perform that action at this time.