Whoops, this page doesn’t exist :-(
++
![ferris](https://www.rust-lang.org/static/images/ferris-error.png)
diff --git a/.nojekyll b/.nojekyll new file mode 100644 index 000000000..f17311098 --- /dev/null +++ b/.nojekyll @@ -0,0 +1 @@ +This file makes sure that Github Pages doesn't process mdBook's output. diff --git a/404.html b/404.html new file mode 100644 index 000000000..f58f0d52f --- /dev/null +++ b/404.html @@ -0,0 +1,165 @@ + + +
+ + +Direct FFI of async functions is absolutely in scope for CXX (on C++20 and up) +but is not implemented yet in the current release. We are aiming for an +implementation that is as easy as:
+
+
+For now the recommended approach is to handle the return codepath over a oneshot
+channel (such as futures::channel::oneshot
) represented in an opaque Rust
+type on the FFI.
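The shape of that workaround can be sketched in plain Rust. This is only an illustration under stated assumptions: `DoThingContext`, `shim_do_thing`, `on_done` and `do_thing` are hypothetical names, `std::sync::mpsc` stands in for a oneshot channel, and the "foreign" function is written in Rust here so that the sketch is self-contained. In a real bridge it would be an extern "C++" function.

```rust
use std::sync::mpsc;
use std::thread;

// Opaque Rust state handed across the boundary. It owns the sending half of a
// channel that is used exactly once (a stand-in for a oneshot sender).
pub struct DoThingContext(mpsc::Sender<u32>);

// Stand-in for the foreign function (hypothetical): it finishes its work on
// another thread and reports completion by calling `done` with the context
// it was given.
fn shim_do_thing(arg: u32, done: fn(Box<DoThingContext>, u32), ctx: Box<DoThingContext>) {
    thread::spawn(move || done(ctx, arg * 2));
}

// Rust callback: consumes the boxed context and completes the channel.
fn on_done(ctx: Box<DoThingContext>, ret: u32) {
    // If the receiver was dropped there is nobody left to notify.
    let _ = ctx.0.send(ret);
}

// Rust-facing wrapper: start the operation, then wait for its result.
// Returns None if the callback was never invoked.
pub fn do_thing(arg: u32) -> Option<u32> {
    let (tx, rx) = mpsc::channel();
    shim_do_thing(arg, on_done, Box::new(DoThingContext(tx)));
    rx.recv().ok()
}
```

With an async oneshot channel the final `rx.recv()` would become an `.await` on the receiver, which is what lets the wrapper be an `async fn` on the Rust side.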
+
+
+ The top-level cxx::bridge attribute macro takes an optional namespace
argument
+to control the C++ namespace into which to emit extern Rust items and the
+namespace in which to expect to find the extern C++ items.
+Additionally, a #[namespace = "..."]
attribute may be used inside the bridge
+module on any extern block or individual item. An item will inherit the
+namespace specified on its surrounding extern block if any, otherwise the
+namespace specified with the top level cxx::bridge attribute if any, otherwise
+the global namespace.
+The above would result in functions ::second_priority::f
,
+::first_priority::g
, ::third_priority::h
.
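The inheritance rule reads as a simple fallback chain. The sketch below models it in plain Rust; it is not how CXX implements the lookup, and `resolve_namespace` and `qualified_name` are hypothetical helpers. It assumes `f` takes its namespace from its extern block, `g` from an item-level attribute, and `h` only from the bridge-level argument.

```rust
/// Pick the C++ namespace for an item: its own attribute wins, then the
/// enclosing extern block's, then the bridge-level one, else the global
/// namespace (modeled as the empty string).
pub fn resolve_namespace(
    item: Option<&str>,
    block: Option<&str>,
    bridge: Option<&str>,
) -> String {
    item.or(block).or(bridge).unwrap_or("").to_string()
}

/// Fully qualified C++ name, for example "::first_priority::g".
pub fn qualified_name(namespace: &str, name: &str) -> String {
    if namespace.is_empty() {
        format!("::{}", name)
    } else {
        format!("::{}::{}", namespace, name)
    }
}
```

For instance, `resolve_namespace(None, Some("second_priority"), Some("third_priority"))` falls through the missing item-level attribute and picks the block-level namespace.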
Sometimes you want the Rust name of a function or type to differ from its C++ +name. Importantly, this enables binding multiple overloads of the same C++ +function name using distinct Rust names.
+
+The #[rust_name = "..."]
attribute replaces the name that Rust should use for
+this function, and an analogous #[cxx_name = "..."]
attribute replaces the
+name that C++ should use.
Either of the two attributes may be used on extern "Rust" as well as extern +"C++" functions, according to which one you find clearer in context.
+The same attribute works for renaming functions, opaque types, shared +structs and enums, and enum variants.
+ +
+Box<T> does not support T being an opaque C++ type. You should use +UniquePtr<T> or SharedPtr<T> instead for +transferring ownership of opaque C++ types on the language boundary.
+If T is an opaque Rust type, the Rust type is required to be Sized i.e. size +known at compile time. In the future we may introduce support for dynamically +sized opaque Rust types.
+This program uses a Box to pass ownership of some opaque piece of Rust state +over to C++ and then back to a Rust callback, which is a useful pattern for +implementing async functions over FFI.
+
+
+
+
+ The Rust binding of std::string is called CxxString
. See the link for
+documentation of the Rust API.
Rust code can never obtain a CxxString by value. C++'s string requires a move +constructor and may hold internal pointers, which is not compatible with Rust's +move behavior. Instead in Rust code we will only ever look at a CxxString +through a reference or smart pointer, as in &CxxString or Pin<&mut CxxString> +or UniquePtr<CxxString>.
+In order to construct a CxxString on the stack from Rust, you must use the
+let_cxx_string!
macro which will pin the string properly. The code below
+uses this in one place, and the link covers the syntax.
This example uses C++17's std::variant to build a toy JSON type. JSON can hold +various types including strings, and JSON's object type is a map with string +keys. The example demonstrates Rust indexing into one of those maps.
+
+
+
+
+ The Rust binding of std::vector<T> is called CxxVector<T>
. See the
+link for documentation of the Rust API.
Rust code can never obtain a CxxVector by value. Instead in Rust code we will +only ever look at a vector behind a reference or smart pointer, as in +&CxxVector<T> or UniquePtr<CxxVector<T>>.
+CxxVector<T> does not support T being an opaque Rust type. You should use a +Vec<T> (C++ rust::Vec<T>) instead for collections of opaque Rust types on +the language boundary.
+This program involves Rust code converting a CxxVector<CxxString>
(i.e.
+std::vector<std::string>
) into a Rust Vec<String>
.
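The core of such a conversion is deciding what to do with bytes that are not valid UTF-8, since a std::string can hold arbitrary bytes while a Rust String cannot. A minimal sketch, assuming each C++ string is modeled as a `Vec<u8>` rather than a real CxxString, and using a hypothetical helper name:

```rust
/// Convert a sequence of C++-style byte strings into owned Rust Strings,
/// replacing any invalid UTF-8 sequence with U+FFFD.
pub fn to_rust_strings(items: &[Vec<u8>]) -> Vec<String> {
    items
        .iter()
        .map(|bytes| String::from_utf8_lossy(bytes).into_owned())
        .collect()
}
```

A lossy conversion never fails. If invalid input should be reported instead, `String::from_utf8` returns a `Result` that can be propagated.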
+
+
+
+Function pointers with a Result return type are not implemented yet.
+Passing a function pointer from C++ to Rust is not implemented yet; only passing one from Rust to an extern "C++" function is implemented.
Function pointers are commonly useful for implementing async functions over +FFI. See the example code on that page.
+ +Redirecting to... ../bindings.html.
+ + diff --git a/binding/rawptr.html b/binding/rawptr.html new file mode 100644 index 000000000..979ec06a2 --- /dev/null +++ b/binding/rawptr.html @@ -0,0 +1,259 @@ + + + + + +Generally you should use references (&mut T
, &T
) or std::unique_ptr<T>
+where possible over raw pointers, but raw pointers are available too as an
+unsafe fallback option.
Extern functions and function pointers taking a raw pointer as an argument must
+be declared unsafe fn
i.e. unsafe to call. The same does not apply to
+functions which only return a raw pointer, though presumably doing anything
+useful with the returned pointer is going to involve unsafe code elsewhere
+anyway.
This example illustrates making a Rust call to a canonical C-style main
+signature involving char *argv[]
.
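The Rust side of such a call has to build a null-terminated array of C string pointers and keep the strings alive for the duration of the call. The sketch below shows one way to do that with the standard library only. `Argv` and `c_style_main` are hypothetical, and `c_style_main` is written in Rust as a stand-in for the C++ function that a bridge would declare as an `unsafe fn`.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;
use std::ptr;

/// Owns the argument strings together with the null-terminated pointer array
/// that borrows from them, so the pointers stay valid while this value lives.
pub struct Argv {
    _owned: Vec<CString>,
    ptrs: Vec<*mut c_char>,
}

impl Argv {
    pub fn new(args: &[&str]) -> Argv {
        // A C string cannot contain an interior NUL, so truncate there.
        let owned: Vec<CString> = args
            .iter()
            .map(|arg| {
                let bytes = arg.as_bytes();
                let end = bytes.iter().position(|&b| b == 0).unwrap_or(bytes.len());
                CString::new(&bytes[..end]).expect("interior NUL was removed")
            })
            .collect();
        let mut ptrs: Vec<*mut c_char> =
            owned.iter().map(|s| s.as_ptr() as *mut c_char).collect();
        ptrs.push(ptr::null_mut()); // argv[argc] is a null pointer
        Argv { _owned: owned, ptrs }
    }

    pub fn argc(&self) -> i32 {
        (self.ptrs.len() - 1) as i32
    }

    pub fn as_mut_ptr(&mut self) -> *mut *mut c_char {
        self.ptrs.as_mut_ptr()
    }
}

/// Stand-in for a C-style `main(argc, argv)`: returns the total byte length
/// of all arguments. Unsafe to call because it dereferences raw pointers.
pub unsafe fn c_style_main(argc: i32, argv: *mut *mut c_char) -> usize {
    let mut total = 0;
    for i in 0..argc as usize {
        total += unsafe { CStr::from_ptr(*argv.add(i)) }.to_bytes().len();
    }
    total
}
```

The callee must not hold on to the pointers after it returns, because the `CString` values are freed when the `Argv` is dropped.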
+
+
+
+ Result<T> is allowed as the return type of an extern function in either +direction. Its behavior is to translate to/from C++ exceptions. If your codebase +does not use C++ exceptions, or prefers to represent fallibility using something +like outcome<T>, leaf::result<T>, StatusOr<T>, etc then you'll need to +handle the translation of those to Rust Result<T> using your own shims for +now. Better support for this is planned.
+If an exception is thrown from an extern "C++"
function that is not declared
+by the CXX bridge to return Result, the program calls C++'s std::terminate
.
+The behavior is equivalent to the same exception being thrown through a
+noexcept
C++ function.
If a panic occurs in any extern "Rust"
function, regardless of whether it is
+declared by the CXX bridge to return Result, a message is logged and the program
+calls Rust's std::process::abort
.
An extern "Rust"
function returning a Result turns into a throw
in C++ if
+the Rust side produces an error.
Note that the return type written inside of cxx::bridge must be written without +a second type parameter. Only the Ok type is specified for the purpose of the +FFI. The Rust implementation (outside of the bridge module) may pick any error +type as long as it has a std::fmt::Display impl.
+
+The exception that gets thrown by CXX on the C++ side is always of type
+rust::Error
and has the following C++ public API. The what()
member function
+gives the error message according to the Rust error's std::fmt::Display impl.
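To make the Display requirement concrete, here is a small Rust-only sketch. `ParseError`, `parse` and `what_message` are hypothetical; `what_message` merely models the text that the C++ side would see from `what()`, and does not involve the bridge itself.

```rust
use std::fmt;

/// An error type chosen by the Rust implementation. Any type works as long
/// as it implements Display.
#[derive(Debug)]
pub struct ParseError {
    line: u32,
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "parse error on line {}", self.line)
    }
}

/// A fallible function of the kind that could be exposed as returning Result.
pub fn parse(input: &str) -> Result<u32, ParseError> {
    input.trim().parse().map_err(|_| ParseError { line: 1 })
}

/// Models the message carried by the thrown exception: the Display rendering
/// of the Rust error, or None if the call succeeded.
pub fn what_message<T, E: fmt::Display>(result: &Result<T, E>) -> Option<String> {
    result.as_ref().err().map(|e| e.to_string())
}
```

Because only the Display output crosses the boundary, any structured detail that C++ callers need has to be encoded into that message.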
+An extern "C++"
function returning a Result turns into a catch
in C++ that
+converts the exception into an Err for Rust.
Note that the return type written inside of cxx::bridge must be written without
+a second type parameter. Only the Ok type is specified for the purpose of the
+FFI. The resulting error type created by CXX when an extern "C++"
function
+throws will always be of type cxx::Exception
.
+The specific set of caught exceptions and the conversion to error message are
+both customizable. The way you do this is by defining a template function
+rust::behavior::trycatch
with a suitable signature inside any one of the
+headers include!
'd by your cxx::bridge.
The template signature is required to be:
+
+The default trycatch
used by CXX if you have not provided your own is the
+following. You must follow the same pattern: invoke func
with no arguments,
+catch whatever exception(s) you want, and invoke fail
with the error message
+you'd like for the Rust error to have.
+
+ The Rust binding of std::shared_ptr<T> is called SharedPtr<T>
. See
+the link for documentation of the Rust API.
SharedPtr<T> does not support T being an opaque Rust type. You should use a +Box<T> (C++ rust::Box<T>) instead for transferring ownership of +opaque Rust types on the language boundary.
+
+
+
+
+&[T] is written rust::Slice<const T> in C++.
+&mut [T] is written rust::Slice<T> in C++.
+T must not be an opaque Rust type or opaque C++ type. Support for opaque Rust +types in slices is coming.
+Allowed as function argument or return value. Not supported in shared structs.
+Only rust::Slice<const T> is copy-assignable, not rust::Slice<T>. (Both are +move-assignable.) You'll need to write std::move occasionally as a reminder that +accidentally exposing overlapping &mut [T] to Rust is UB.
+This example is a C++ program that constructs a slice containing JSON data (by +reading from stdin, but it could be from anywhere), then calls into Rust to +pretty-print that JSON data into a std::string via the serde_json and +serde_transcode crates.
+
+
+Testing the example:
+
+
+
+Be aware that rust::Str behaves like &str i.e. it is a borrow! C++ +needs to be mindful of the lifetimes at play.
+Just to reiterate: &str is rust::Str. Do not try to write &str as const rust::Str &
. A language-level C++ reference is not able to capture the fat
+pointer nature of &str.
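The "fat pointer" point can be checked from the Rust side: a `&str` is a data pointer plus a length, whereas a reference to a sized value is a single pointer. A minimal sketch with hypothetical helper names, relying on the layout used by current Rust targets:

```rust
use std::mem::size_of;

/// Number of machine words occupied by a `&str` (pointer and length).
pub fn str_ref_words() -> usize {
    size_of::<&str>() / size_of::<usize>()
}

/// Number of machine words occupied by a reference to a sized value.
pub fn thin_ref_words() -> usize {
    size_of::<&u8>() / size_of::<usize>()
}

/// The two components that make up a `&str`.
pub fn parts(s: &str) -> (*const u8, usize) {
    (s.as_ptr(), s.len())
}
```

A C++ reference is one pointer wide, so it has room for only one of those two components. That is why the borrow is represented by the rust::Str value itself.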
Allowed as function argument or return value. Not supported in shared structs
+yet. &mut str
is not supported yet, but is also extremely obscure so this is
+fine.
+
+
+
+
+None. Strings may be used as function arguments and function return values, by +value or by reference, as well as fields of shared structs.
+
+
+
+
+ The Rust binding of std::unique_ptr<T> is called UniquePtr<T>
. See
+the link for documentation of the Rust API.
Only std::unique_ptr<T, std::default_delete<T>>
is currently supported. Custom
+deleters may be supported in the future.
UniquePtr<T> does not support T being an opaque Rust type. You should use a +Box<T> (C++ rust::Box<T>) instead for transferring ownership of +opaque Rust types on the language boundary.
+UniquePtr is commonly useful for returning opaque C++ objects to Rust. This use +case was featured in the blobstore tutorial.
+
+
+
+
+
+Vec<T> does not support T being an opaque C++ type. You should use +CxxVector<T> (C++ std::vector<T>) instead for collections of opaque C++ +types on the language boundary.
+
+
+
+
+ In addition to all the primitive types (i32 <=> int32_t), the following +common types may be used in the fields of shared structs and the arguments and +returns of extern functions.
| name in Rust | name in C++ | restrictions |
|---|---|---|
| String | rust::String | |
| &str | rust::Str | |
| &[T] | rust::Slice<const T> | cannot hold opaque C++ type |
| &mut [T] | rust::Slice<T> | cannot hold opaque C++ type |
| CxxString | std::string | cannot be passed by value |
| Box<T> | rust::Box<T> | cannot hold opaque C++ type |
| UniquePtr<T> | std::unique_ptr<T> | cannot hold opaque Rust type |
| SharedPtr<T> | std::shared_ptr<T> | cannot hold opaque Rust type |
| [T; N] | std::array<T, N> | cannot hold opaque C++ type |
| Vec<T> | rust::Vec<T> | cannot hold opaque C++ type |
| CxxVector<T> | std::vector<T> | cannot be passed by value, cannot hold opaque Rust type |
| *mut T, *const T | T*, const T* | fn with a raw pointer argument must be declared unsafe to call |
| fn(T, U) -> V | rust::Fn<V(T, U)> | only passing from Rust to C++ is implemented so far |
| Result<T> | throw/catch | allowed as return type only |

The C++ API of the rust
namespace is defined by the include/cxx.h file in
+the CXX GitHub repo. You will need to include this header in your C++ code when
+working with those types. When using Cargo and the cxx-build crate, the header
+is made available to you at #include "rust/cxx.h"
.
The rust
namespace additionally provides lowercase type aliases of all the
+types mentioned in the table, for use in codebases preferring that style. For
+example rust::String
, rust::Vec
may alternatively be written rust::string
,
+rust::vec
etc.
The following types are intended to be supported "soon" but are just not +implemented yet. I don't expect any of these to be hard to make work but it's a +matter of designing a nice API for each in its non-native language.
| name in Rust | name in C++ |
|---|---|
| BTreeMap<K, V> | tbd |
| HashMap<K, V> | tbd |
| Arc<T> | tbd |
| Option<T> | tbd |
| tbd | std::map<K, V> |
| tbd | std::unordered_map<K, V> |

Starlark-based build systems with the ability to compile a code generator and
+invoke it as a genrule
will run CXX's C++ code generator via its cxxbridge
+command line interface.
The tool is packaged as the cxxbridge-cmd
crate on crates.io or can be built
+from the gen/cmd/ directory of the CXX GitHub repo.
+The CXX repo maintains working Bazel BUILD
and Buck2 BUCK
targets for
+the complete blobstore tutorial (chapter 3) for your reference, tested in CI.
+These aren't meant to be directly what you use in your codebase, but serve as an
+illustration of one possible working pattern.
+
+
+ As one aspect of delivering a good Rust–C++ interop experience, CXX turns
+Cargo into a quite usable build system for C++ projects published as a
+collection of crates.io packages, including a consistent and frictionless
+experience #include
-ing C++ headers across dependencies.
CXX's integration with Cargo is handled through the cxx-build crate.
+
+The canonical build script is as follows. The indicated line returns a
+cc::Build
instance (from the usual widely used cc
crate) on which you can
+set up any additional source files and compiler flags as normal.
+The rerun-if-changed
lines are optional but make it so that Cargo does not
+spend time recompiling your C++ code when only non-C++ code has changed since
+the previous Cargo build. By default without any rerun-if-changed
, Cargo will
+re-execute the build script after any file changed in the project.
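The rerun-if-changed lines are ordinary text that the build script prints to stdout in the form `cargo:rerun-if-changed=PATH`. The sketch below only generates those directives and leaves out the cxx-build call itself. The helper names and the file paths in the usage note are hypothetical.

```rust
/// Build the `cargo:rerun-if-changed=PATH` directives for a list of files.
pub fn rerun_if_changed(paths: &[&str]) -> Vec<String> {
    paths
        .iter()
        .map(|p| format!("cargo:rerun-if-changed={}", p))
        .collect()
}

/// What a build script would do with them: print one directive per line.
pub fn emit(paths: &[&str]) {
    for line in rerun_if_changed(paths) {
        println!("{}", line);
    }
}
```

A build script might call `emit(&["src/main.rs", "src/demo.cc", "include/demo.h"])`, listing the bridge source file and every C++ source and header that feeds the build.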
If stuck, try comparing what you have against the demo/ directory of the CXX +GitHub repo, which maintains a working Cargo-based setup for the blobstore +tutorial (chapter 3).
+With cxx-build, by default your include paths always start with the crate name.
+This applies to both #include
within your C++ code, and include!
in the
+extern "C++"
section of your Rust cxx::bridge.
Your crate name is determined by the name
entry in Cargo.toml.
For example if your crate is named yourcratename
and contains a C++ header
+file path/to/header.h
relative to Cargo.toml, that file will be includable as:
+A crate can choose a prefix for its headers that is different from the crate
+name by modifying CFG.include_prefix
from build.rs:
+Subsequently the header located at path/to/header.h
would now be includable
+as:
+The empty string ""
is a valid include prefix and will make it possible to
+have #include "path/to/header.h"
. However, if your crate is a library, be
+considerate of possible name collisions that may occur in downstream crates. If
+using an empty include prefix, you'll want to make sure your headers' local path
+within the crate is sufficiently namespaced or unique.
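The prefix rule amounts to joining the include prefix and the header's path relative to Cargo.toml. A small model of it, with a hypothetical helper that is not part of cxx-build:

```rust
/// Path under which a header becomes includable, given the crate's include
/// prefix (the crate name by default) and the header's path relative to
/// Cargo.toml. An empty prefix leaves the relative path unchanged.
pub fn include_path(prefix: &str, relative: &str) -> String {
    if prefix.is_empty() {
        relative.to_string()
    } else {
        format!("{}/{}", prefix.trim_end_matches('/'), relative)
    }
}
```

With the default prefix, `include_path("yourcratename", "path/to/header.h")` gives `yourcratename/path/to/header.h`. With an empty prefix the result is just `path/to/header.h`, which is where the risk of collisions with downstream crates comes from.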
If your #[cxx::bridge]
module contains an extern "Rust"
block i.e. types or
+functions exposed from Rust to C++, or any shared data structures, the
+CXX-generated C++ header declaring those things is available using a .rs.h
+extension on the Rust source file's name.
+For giggles, it's also available using just a plain .rs
extension as if you
+were including the Rust file directly. Use whichever you find more palatable.
+You get to include headers from your dependencies, both handwritten ones
+contained as .h
files in their Cargo package, as well as CXX-generated ones.
It works the same as an include of a local header: use the crate name (or their +include_prefix if their crate changed it) followed by the relative path of the +header within the crate.
++
Note that cross-crate imports are only made available between direct +dependencies. You must directly depend on the other crate in order to #include +its headers; a transitive dependency is not sufficient.
+Additionally, headers from a direct dependency are only importable if the
+dependency's Cargo.toml manifest contains a links
key. If not, its headers
+will not be importable from outside of the same crate. See the links
+manifest key in the Cargo reference.
The following CFG settings are only relevant to you if you are writing a library
+that needs to support downstream crates #include
-ing its C++ public headers.
CFG.exported_header_dirs
(vector of absolute paths) defines a set
+of additional directories from which the current crate, directly dependent
+crates, and further crates to which this crate's headers are exported (more
+below) will be able to #include
headers.
Adding a directory to exported_header_dirs
is similar to adding it to the
+current build via the cc
crate's Build::include
, but also makes the
+directory available to downstream crates that want to #include
one of the
+headers from your crate. If the dir were added only using Build::include
, the
+downstream crate including your header would need to manually add the same
+directory to their own build as well.
When using exported_header_dirs
, your crate must also set a links
key for
+itself in Cargo.toml. See the links
manifest key. The reason is
+that Cargo imposes no ordering on the execution of build scripts without a
+links
key, which means the downstream crate's build script might otherwise
+execute before yours decides what to put into exported_header_dirs
.
One of your crate's headers wants to include a system library, such as #include "Python.h"
.
+Your crate wants to rearrange the headers that it exports vs how they're laid +out locally inside the crate's source directory.
+Suppose the crate as published contains a file at ./include/myheader.h
but
+wants it available to downstream crates as #include "foo/v1/public.h"
.
+CFG.exported_header_prefixes is a vector of strings, each of which refers to the include_prefix of one of your direct dependencies, or a prefix thereof. They
+describe which of your dependencies participate in your crate's C++ public API,
+as opposed to private use by your crate's implementation.
As a general rule, if one of your headers #include
s something from one of your
+dependencies, you need to put that dependency's include_prefix
into
+CFG.exported_header_prefixes
(or their links
key into
+CFG.exported_header_links
; see below). On the other hand if only your C++
+implementation files and not your headers are importing from the dependency,
+you do not export that dependency.
The significance of exported headers is that if downstream code (crate 𝒜)
+contains an #include
of a header from your crate (ℬ) and your header
+contains an #include
of something from your dependency (𝒞), the exported
+dependency 𝒞 becomes available during the downstream crate 𝒜's build.
+Otherwise the downstream crate 𝒜 doesn't know about 𝒞 and wouldn't be
+able to find what header your header is referring to, and would fail to build.
When using exported_header_prefixes
, your crate must also set a links
key
+for itself in Cargo.toml.
Suppose you have a crate with 5 direct dependencies whose include_prefix values are:
Your header involves types from the first four so we re-export those as part of +your public API, while crate4 is only used internally by your cc file not your +header, so we do not export:
+
+For more fine-grained control, there is CFG.exported_header_links, a vector of strings each of which refers to the links attribute (the links manifest key) of one of your crate's direct dependencies.
This achieves an equivalent result to CFG.exported_header_prefixes
by
+re-exporting a C++ dependency as part of your crate's public API, except with
+finer control for cases when multiple crates might be sharing the same
+include_prefix
and you'd like to export some but not others. Links attributes
+are guaranteed to be unique identifiers by Cargo.
When using exported_header_links
, your crate must also set a links
key for
+itself in Cargo.toml.
+
+ There is not an officially endorsed CMake setup for CXX, but a few developers +have shared one that they got working. You can try one of these as a starting +point. If you feel that you have arrived at a CMake setup that is superior to +what is available in these links, feel free to make a PR adding it to this list.
+https://github.com/XiangpengHao/cxx-cmake-example
+https://github.com/david-cattermole/cxx-demo-example
+https://github.com/trondhe/rusty_cmake
+https://github.com/geekbrother/cxx-corrosion-cmake
+https://github.com/paandahl/cpp-with-rust
+Redirecting to... ../building.html.
+ + diff --git a/build/other.html b/build/other.html new file mode 100644 index 000000000..efcd7e5c9 --- /dev/null +++ b/build/other.html @@ -0,0 +1,239 @@ + + + + + +You will need to achieve at least these three things:
+Not all build systems are created equal. If you're hoping to use a build system from the '90s, especially if you're hoping to overlay the limitations of 2 or more build systems (like automake+cargo) and expect to solve them simultaneously, then be mindful that your expectations are set accordingly and seek sympathy from those who have imposed the same approach on themselves.
+CXX's Rust code generation automatically happens when the #[cxx::bridge]
+procedural macro is expanded during the normal Rust compilation process, so no
+special build steps are required there.
But the C++ side of the bindings needs to be generated. Your options are:
+Use the cxxbridge
command, which is a standalone command line interface to
+the CXX C++ code generator. Wire up your build system to compile and invoke
+this tool.
+It's packaged as the cxxbridge-cmd
crate on crates.io or can be built from
+the gen/cmd/ directory of the CXX GitHub repo.
Or, build your own code generator frontend on top of the cxx-gen crate. This +is currently unofficial and unsupported.
+However you like. We can provide no guidance.
+When linking a binary which contains mixed Rust and C++ code, you will have to
+choose between using the Rust toolchain (rustc
) or the C++ toolchain which you
+may already have extensively tuned.
Rust does not generate simple standalone .o
files, so you can't just throw the
+Rust-generated code into your existing C++ toolchain linker. Instead you need to
+choose one of these options:
Use rustc
as the final linker. Pass any non-Rust libraries using -L <directory>
and -l<library>
rustc arguments, and/or #[link]
directives in
+your Rust code. If you need to link against C/C++ .o
files you can use
+-Clink-arg=file.o
.
Use your C++ linker. In this case, you first need to use rustc
and/or
+cargo
to generate a single Rust staticlib
target and pass that into your
+foreign linker invocation.
+If you need to link multiple Rust subsystems, you will need to generate a single staticlib, perhaps using lots of extern crate statements to include multiple Rust rlibs. Multiple Rust staticlib files are likely to conflict.
+Passing Rust rlibs directly into your non-Rust linker is not supported (but apparently sometimes works).
See the Rust reference's Linkage page for some general information +here.
+The following open rust-lang issues might hold more recent guidance or +inspiration: rust-lang/rust#73632, rust-lang/rust#73295.
+ +CXX is designed to be convenient to integrate into a variety of build systems.
+If you are working in a project that does not already have a preferred build +system for its C++ code or which will be relying heavily on open source +libraries from the Rust package registry, you're likely to have the easiest +experience with Cargo which is the build system commonly used by open source +Rust projects. Refer to the Cargo chapter about CXX's +Cargo support.
+Among build systems designed for first class multi-language support, Bazel is a +solid choice. Refer to the Bazel chapter.
+If your codebase is already invested in CMake, refer to the +CMake chapter.
+If you have some other build system that you'd like to try to make work with +CXX, see this page for notes.
+ +This page is a brief overview of the major concepts of CXX, enough so that you +recognize the shape of things as you read the tutorial and following chapters.
+In CXX, the language of the FFI boundary involves 3 kinds of items:
+Shared structs — data structures whose fields are made visible to +both languages. The definition written within cxx::bridge in Rust is usually +the single source of truth, though there are ways to do sharing based on a +bindgen-generated definition with C++ as source of truth.
+Opaque types — their fields are secret from the other language.
+These cannot be passed across the FFI by value but only behind an indirection,
+such as a reference &
, a Rust Box
, or a C++ unique_ptr
. Can be a type
+alias for an arbitrarily complicated generic language-specific type depending
+on your use case.
Functions — implemented in either language, callable from the other +language.
+#[cxx::bridge]
+mod ffi {
+ // Any shared structs, whose fields will be visible to both languages.
+ struct BlobMetadata {
+ size: usize,
+ tags: Vec<String>,
+ }
+
+ extern "Rust" {
+ // Zero or more opaque types which both languages can pass around
+ // but only Rust can see the fields.
+ type MultiBuf;
+
+ // Functions implemented in Rust.
+ fn next_chunk(buf: &mut MultiBuf) -> &[u8];
+ }
+
+ unsafe extern "C++" {
+ // One or more headers with the matching C++ declarations for the
+ // enclosing extern "C++" block. Our code generators don't read it
+ // but it gets #include'd and used in static assertions to ensure
+ // our picture of the FFI boundary is accurate.
+ include!("demo/include/blobstore.h");
+
+ // Zero or more opaque types which both languages can pass around
+ // but only C++ can see the fields.
+ type BlobstoreClient;
+
+ // Functions implemented in C++.
+ fn new_blobstore_client() -> UniquePtr<BlobstoreClient>;
+ fn put(&self, parts: &mut MultiBuf) -> u64;
+ fn tag(&self, blobid: u64, tag: &str);
+ fn metadata(&self, blobid: u64) -> BlobMetadata;
+ }
+}
+
+Within the extern "Rust"
part of the CXX bridge we list the types and
+functions for which Rust is the source of truth. These all implicitly refer to
+the super
module, the parent module of the CXX bridge. You can think of the
+two items listed in the example above as being like use super::MultiBuf
and
+use super::next_chunk
except re-exported to C++. The parent module will either
+contain the definitions directly for simple things, or contain the relevant
+use
statements to bring them into scope from elsewhere.
Within the extern "C++"
part, we list types and functions for which C++ is the
+source of truth, as well as the header(s) that declare those APIs. In the future
+it's possible that this section could be generated bindgen-style from the
+headers but for now we need the signatures written out; static assertions verify
+that they are accurate.
Be aware that the design of this library is intentionally restrictive and +opinionated! It isn't a goal to be flexible enough to handle an arbitrary +signature in either language. Instead this project is about carving out a highly +expressive set of functionality about which we can make powerful safety +guarantees today and extend over time. You may find that it takes some practice +to use CXX bridge effectively as it won't work in all the ways that you may be +used to.
+When it comes to interacting with an idiomatic Rust API or idiomatic C++ API +from the other language, the generally applicable approaches outside of the CXX +crate are:
+Build a C-compatible wrapper around the code (expressed using extern "C"
+signatures, primitives, C-compatible structs, raw pointers). Translate that
+manually to equivalent extern "C"
declarations in the other language and
+keep them in sync. Preferably, build a safe/idiomatic wrapper around the
+translated extern "C"
signatures for callers to use.
Build a C wrapper around the C++ code and use bindgen to translate that
+programmatically to extern "C"
Rust signatures. Preferably, build a
+safe/idiomatic Rust wrapper on top.
Build a C-compatible Rust wrapper around the Rust code and use cbindgen
+to translate that programmatically to an extern "C"
C++ header. Preferably,
+build an idiomatic C++ wrapper.
If the code you are binding is already "effectively C", the above has you +covered. You should use bindgen or cbindgen, or manually translated C +signatures if there aren't too many and they seldom change.
+Bindgen has some basic support for C++. It can reason about classes, member +functions, and the layout of templated types. However, everything it does +related to C++ is best-effort only. Bindgen starts from a point of wanting to +generate declarations for everything, so any C++ detail that it hasn't +implemented will cause a crash if you are lucky (bindgen#388) or more likely +silently emit an incompatible signature (bindgen#380, bindgen#607, +bindgen#652, bindgen#778, bindgen#1194) which will do arbitrary +memory-unsafe things at runtime whenever called.
+Thus using bindgen correctly requires not just juggling all your pointers +correctly at the language boundary, but also understanding ABI details and their +workarounds and reliably applying them. For example, the programmer will +discover that their program sometimes segfaults if they call a function that +returns std::unique_ptr<T> through bindgen. Why? Because unique_ptr, despite +being "just a pointer", has a different ABI than a pointer or a C struct +containing a pointer (bindgen#778) and is not directly expressible in Rust. +Bindgen emitted something that looks reasonable and you will have a hell of a +time in gdb working out what went wrong. Eventually people learn to avoid +anything involving a non-trivial copy constructor, destructor, or inheritance, +and instead stick to raw pointers and primitives and trivial structs only +— in other words C.
+The CXX project attempts a different approach to C++ FFI.
+Imagine Rust and C and C++ as three vertices of a scalene triangle, with length +of the edges being related to similarity of the languages when it comes to +library design.
+The most similar pair (the shortest edge) is Rust–C++. These languages +have largely compatible concepts of things like ownership, vectors, strings, +fallibility, etc that translate clearly from signatures in either language to +signatures in the other language.
+When we make a binding for an idiomatic C++ API using bindgen, and we fall down
+to raw pointers and primitives and trivial structs as described above, what we
+are really doing is coding the two longest edges of the triangle: getting from
+C++ down to C, and C back up to Rust. The Rust–C edge always involves a
+great deal of unsafe
code, and the C++–C edge similarly requires care
+just for basic memory safety. Something as basic as "how do I pass ownership of
+a string to the other language?" becomes a strap-yourself-in moment,
+particularly for someone not already an expert in one or both sides.
You should think of the cxx
crate as being the midpoint of the Rust–C++
+edge. Rather than coding the two long edges, you will code half the short edge
+in Rust and half the short edge in C++, in both cases with the library playing
+to the strengths of the Rust type system and the C++ type system to help
+assure correctness.
If you've already been through the tutorial in the previous chapter, take a +moment to appreciate that the C++ side really looks like we are just writing +C++ and the Rust side really looks like we are just writing Rust. Anything you +could do wrong in Rust, and almost anything you could reasonably do wrong in +C++, will be caught by the compiler. This highlights that we are on the "short +edge of the triangle".
+But it all still boils down to the same things: it's still FFI from one piece of +native code to another, nothing is getting serialized or allocated or +runtime-checked in between.
+The role of CXX is to capture the language boundary with more fidelity than what
+extern "C"
is able to represent. You can think of CXX as being a replacement
+for extern "C"
in a sense.
From this perspective, CXX is a lower level tool than the bindgens. Just as
+bindgen and cbindgen are built on top of extern "C"
, it makes sense to think
+about higher level tools built on top of CXX. Such a tool might consume a C++
+header and/or Rust module (and/or IDL like Thrift) and emit the corresponding
+safe cxx::bridge language boundary, leveraging CXX's static analysis and
+underlying implementation of that boundary. We are beginning to see this space
+explored by the autocxx tool, though nothing yet ready for broad use in the
+way that CXX on its own is.
But note in other ways CXX is higher level than the bindgens, with rich support +for common standard library types. CXX's types serve as an intuitive vocabulary +for designing a good boundary between components in different languages.
+ +
+The extern "C++"
section of a CXX bridge declares C++ types and signatures to
+be made available to Rust, and gives the paths of the header(s) which contain
+the corresponding C++ declarations.
A bridge module may contain zero or more extern "C++" blocks.
+Types defined in C++ that are made available to Rust, but only behind an +indirection.
+
+For example in the Tutorial we saw BlobstoreClient
+implemented as an opaque C++ type. The blobstore client was created in C++ and
+returned to Rust by way of a UniquePtr.
Mutability: Unlike extern Rust types and shared types, an extern C++ type is
+not permitted to be passed by plain mutable reference &mut MyType
across the
+FFI bridge. For mutation support, the bridge is required to use Pin<&mut MyType>
. This is to safeguard against things like mem::swap-ing the contents of
+two mutable references, given that Rust doesn't have information about the size
+of the underlying object and couldn't invoke an appropriate C++ move constructor
+anyway.
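The hazard that `Pin<&mut MyType>` guards against can be sketched in plain Rust using only the standard library. In this sketch `Opaque` is a hypothetical stand-in for an extern C++ type, not something CXX generates; it is marked `!Unpin` via `PhantomPinned`, so safe code cannot turn the pinned reference back into a plain `&mut`.

```rust
use std::marker::PhantomPinned;
use std::mem;
use std::pin::Pin;

// Hypothetical stand-in for an opaque C++ type. Not generated by CXX.
pub struct Opaque {
    pub id: u32,
    _pinned: PhantomPinned, // makes the type !Unpin
}

impl Opaque {
    pub fn new(id: u32) -> Self {
        Opaque { id, _pinned: PhantomPinned }
    }

    // Mutation through a pinned reference: the object is updated in place
    // and is never relocated.
    pub fn set_id(self: Pin<&mut Self>, id: u32) {
        // SAFETY: we only write a field; the value is not moved out.
        unsafe { self.get_unchecked_mut().id = id };
    }
}

// With plain `&mut`, safe code may swap the bytes of two objects, which
// relocates both of them without running any C++ move constructor.
pub fn swap_plain(a: &mut Opaque, b: &mut Opaque) {
    mem::swap(a, b);
}

// With `Pin<&mut Opaque>` there is no safe route to `mem::swap`, because
// `Pin::get_mut` requires `Opaque: Unpin`.
pub fn mutate_pinned(a: Pin<&mut Opaque>) {
    a.set_id(42);
}
```

Calling `swap_plain` on two values exchanges their contents bytewise, which is exactly what must not happen to a C++ object of unknown size, whereas `mutate_pinned` can only change the object where it already lives.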
Thread safety: Be aware that CXX does not assume anything about the thread
+safety of your extern C++ types. In other words the MyType
etc bindings which
+CXX produces for you in Rust do not come with Send
and Sync
impls. If you
+are sure that your C++ type satisfies the requirements of Send
and/or Sync
+and need to leverage that fact from Rust, you must provide your own unsafe
+marker trait impls.
+Take care in doing this because thread safety in C++ can be extremely tricky to
+assess if you are coming from a Rust background. For example the
+BlobstoreClient
type in the tutorial is not thread safe despite doing only
+completely innocuous things in its implementation. Concurrent calls to the tag
+member function trigger a data race on the blobs
map.
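As a rough sketch of what providing your own marker trait impl looks like, the following uses only the standard library. Here `MyType` is a hypothetical stand-in for a binding to an extern C++ type: the raw-pointer marker field makes it neither `Send` nor `Sync` by default, mirroring the absence of those impls on CXX-generated bindings. The `unsafe impl` is an unchecked claim by the programmer; the compiler does not verify it.

```rust
use std::marker::PhantomData;
use std::thread;

// Hypothetical stand-in for the Rust binding of an extern C++ type.
// `PhantomData<*mut ()>` opts the type out of the automatic Send and Sync
// impls, the same situation as a binding produced by CXX.
pub struct MyType {
    value: u64,
    _not_thread_safe: PhantomData<*mut ()>,
}

impl MyType {
    pub fn new(value: u64) -> Self {
        MyType { value, _not_thread_safe: PhantomData }
    }

    pub fn value(&self) -> u64 {
        self.value
    }
}

// SAFETY: the programmer asserts that moving a `MyType` to another thread
// is sound. For a real C++ type this must be justified by inspecting the
// C++ implementation; nothing here checks it.
unsafe impl Send for MyType {}

// Compiles only because of the `Send` impl above: `thread::spawn` requires
// everything captured by the closure to be `Send`.
pub fn read_on_other_thread(obj: MyType) -> u64 {
    thread::spawn(move || obj.value()).join().unwrap()
}
```

Note that only `Send` is claimed here. Claiming `Sync` as well would additionally assert that shared references may be used from several threads at once, which is the property the tutorial's `BlobstoreClient` lacks.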
This largely follows the same principles as extern
+"Rust" functions and methods. In particular, any signature
+with a self
parameter is interpreted as a C++ non-static member function and
+exposed to Rust as a method.
The programmer does not need to promise that the signatures they have typed +in are accurate; that would be unreasonable. CXX performs static assertions that +the signatures exactly correspond with what is declared in C++. Rather, the +programmer is only on the hook for things that C++'s static information is not +precise enough to capture, i.e. things that would only be represented at most by +comments in the C++ code unintelligible to a static assertion: namely whether +the C++ function is safe or unsafe to be called from Rust.
+Safety: the extern "C++" block is responsible for deciding whether to expose
+each signature inside as safe-to-call or unsafe-to-call. If an extern block
+contains at least one safe-to-call signature, it must be written as an unsafe extern
block, which serves as an item level unsafe block to indicate that an
+unchecked safety claim is being made about the contents of the block.
+C++ types holding borrowed data may be described naturally in Rust by an extern +type with a generic lifetime parameter. For example in the case of the following +pair of types:
+
+we'd want to expose this to Rust as:
+
+Extern C++ types support a syntax for declaring that a Rust binding of the +correct C++ type already exists outside of the current bridge module. This +avoids generating a fresh new binding which Rust's type system would consider +non-interchangeable with the first.
+
+In this case rather than producing a unique new Rust type ffi::MyType
for the
+Rust binding of C++'s ::path::to::MyType
, CXX will reuse the already existing
+binding at crate::existing::MyType
in expressing the signature of f
and any
+other uses of MyType
within the bridge module.
CXX safely validates that crate::existing::MyType
is in fact a binding for the
+right C++ type ::path::to::MyType
by generating a static assertion based on
+crate::existing::MyType
's implementation of ExternType
, which is a trait
+automatically implemented by CXX for bindings that it generates but can also be
+manually implemented as described below.
ExternType
serves the following two related use cases.
In the following snippet, two #[cxx::bridge] invocations in different files
+(possibly different crates) both contain function signatures involving the same
+C++ type example::Demo
. If both were written just containing type Demo;
,
+then both macro expansions would produce their own separate Rust type called
+Demo
and thus the compiler wouldn't allow us to take the Demo
returned by
+file1::ffi::create_demo
and pass it as the Demo
argument accepted by
+file2::ffi::take_ref_demo
. Instead, one of the two Demo
s has been defined as
+an extern type alias of the other, making them the same type in Rust.
+
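The type-identity point can be sketched in plain Rust without CXX. The modules below are hypothetical stand-ins for what two bridge modules might expand to; the function bodies are invented for illustration. If `file2` declared its own `struct Demo`, the call in the usage note would be rejected as a type mismatch; with the alias, both paths name one type.

```rust
// Stand-in for the first bridge module, which owns the definition of Demo.
pub mod file1 {
    pub struct Demo {
        pub n: i32,
    }

    pub fn create_demo() -> Demo {
        Demo { n: 1 }
    }
}

// Stand-in for the second bridge module. Rather than declaring a fresh
// `struct Demo`, it aliases the existing one, so `file1::Demo` and
// `file2::Demo` are the same type to the compiler.
pub mod file2 {
    pub type Demo = super::file1::Demo;

    pub fn take_ref_demo(demo: &Demo) -> i32 {
        demo.n
    }
}
```

With this arrangement `file2::take_ref_demo(&file1::create_demo())` type-checks, which is the interchangeability that an extern type alias provides across bridge modules.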
+Handwritten ExternType
impls make it possible to plug in a data structure
+emitted by bindgen as the definition of a C++ type emitted by CXX.
By writing the unsafe ExternType
impl, the programmer asserts that the C++
+namespace and type name given in the type id refers to a C++ type that is
+equivalent to the Rust type that is the Self
type of the impl.
+The ExternType::Id
associated type encodes a type-level representation of the
+type's C++ namespace and type name. It will always be defined using the
+type_id!
macro exposed in the cxx crate.
The ExternType::Kind
associated type will always be either
+cxx::kind::Opaque
or cxx::kind::Trivial
identifying whether a C++ type
+is soundly relocatable by Rust's move semantics. A C++ type is only okay to hold
+and pass around by value in Rust if its move constructor is trivial and it has
+no destructor. In CXX, these are called Trivial extern C++ types, while types
+with nontrivial move behavior or a destructor must be considered Opaque and
+handled by Rust only behind an indirection, such as a reference or UniquePtr.
If you believe your C++ type reflected by the ExternType impl is indeed fine to +hold by value and move in Rust, you can specify:
+
+which will enable you to pass it into C++ functions by value, return it by
+value, and include it in struct
s that you have declared to cxx::bridge
. Your
+claim about the triviality of the C++ type will be checked by a static_assert
+in the generated C++ side of the binding.
This is a somewhat niche feature, but important when you need it.
+CXX's support for C++'s std::unique_ptr and std::vector is built on a set of +internal trait impls connecting the Rust API of UniquePtr and CxxVector to +underlying template instantiations performed by the C++ compiler.
+When reusing a binding type across multiple bridge modules as described in the +previous section, you may find that your code needs some trait impls which CXX +hasn't decided to generate.
+
+You can request a specific template instantiation at a particular location in
+the Rust crate hierarchy by writing impl UniquePtr<A> {}
inside of the bridge
+module which defines A
but does not otherwise contain any use of
+UniquePtr<A>
.
+
+
+The extern "Rust"
section of a CXX bridge declares Rust types and signatures
+to be made available to C++.
The CXX code generator uses your extern "Rust" section(s) to produce a C++
+header file containing the corresponding C++ declarations. The generated header
+has the same path as the Rust source file containing the bridge, except with a
+.rs.h
file extension.
A bridge module may contain zero or more extern "Rust" blocks.
+Types defined in Rust that are made available to C++, but only behind an +indirection.
+
+For example in the Tutorial we saw MultiBuf
used in this
+way. Rust code created the MultiBuf
, passed a &mut MultiBuf
to C++, and C++
+later passed a &mut MultiBuf
back across the bridge to Rust.
Another example is the one on the Box<T> page, which
+exposes the Rust standard library's std::fs::File
to C++ as an opaque type in
+a similar way but with Box as the indirection rather than &mut.
The types named as opaque types (MyType
etc) refer to types in the super
+module, the parent module of the CXX bridge. You can think of an opaque type T
+as being like a re-export use super::T
made available to C++ via the generated
+header.
Opaque types are currently required to be Sized
and Unpin
. In
+particular, a trait object dyn MyTrait
or slice [T]
may not be used for an
+opaque Rust type. These restrictions may be lifted in the future.
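One way to see which types meet the `Sized` and `Unpin` requirement is a compile-time bound check in plain Rust. The helper `usable_as_opaque` below is hypothetical and is not part of CXX; it only accepts type arguments that satisfy both bounds. The field layout given for `MultiBuf` is likewise an assumption made for the sketch.

```rust
use std::marker::PhantomPinned;

// Hypothetical helper, not part of CXX: instantiating it with a type that
// is unsized or !Unpin is a compile error.
pub fn usable_as_opaque<T: Sized + Unpin>() -> bool {
    true
}

// An ordinary struct is both Sized and Unpin.
pub struct MultiBuf {
    pub chunks: Vec<Vec<u8>>,
    pub pos: usize,
}

// A type that opts out of Unpin, shown only for contrast.
pub struct Immovable {
    _pinned: PhantomPinned,
}

pub fn check() -> bool {
    // Each of the following would fail to compile if uncommented:
    // usable_as_opaque::<Immovable>();           // not Unpin
    // usable_as_opaque::<[u8]>();                // slice, not Sized
    // usable_as_opaque::<dyn std::fmt::Debug>(); // trait object, not Sized
    usable_as_opaque::<MultiBuf>()
}
```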
For now, types used as extern Rust types are required to be defined by the same +crate that contains the bridge using them. This restriction may be lifted in the +future.
+The bridge's parent module will contain the appropriate imports or definitions +for these types.
+
+Rust functions made callable to C++.
+Just like for opaque types, these functions refer implicitly to something in
+scope in the super
module, whether defined there or imported by some use
+statement.
+Extern Rust function signatures may consist of types defined in the bridge, +primitives, and any of these additional bindings.
+Any signature with a self
parameter is interpreted as a Rust method and
+exposed to C++ as a non-static member function.
+The self
parameter may be a shared reference &self
, an exclusive reference
+&mut self
, or a pinned reference self: Pin<&mut Self>
. A by-value self
is
+not currently supported.
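On the Rust side, the three supported receiver forms correspond to ordinary method definitions. The sketch below is plain Rust with a hypothetical `Counter` type; it shows only the method signatures that a bridge declaration would refer to, not the bridge itself.

```rust
use std::pin::Pin;

// Hypothetical type used only to illustrate the receiver forms.
pub struct Counter {
    count: u32,
}

impl Counter {
    pub fn new() -> Self {
        Counter { count: 0 }
    }

    // Shared reference receiver.
    pub fn get(&self) -> u32 {
        self.count
    }

    // Exclusive reference receiver.
    pub fn increment(&mut self) {
        self.count += 1;
    }

    // Pinned reference receiver. `Counter` is Unpin, so `get_mut` is safe.
    pub fn reset(self: Pin<&mut Self>) {
        self.get_mut().count = 0;
    }

    // A by-value receiver such as `fn into_count(self) -> u32` is valid
    // Rust, but it is the form described above as not currently supported.
}
```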
If the surrounding extern "Rust"
block contains exactly one extern type, that
+type is implicitly the receiver for a &self
or &mut self
method. If the
+surrounding block contains more than one extern type, a receiver type must be
+provided explicitly for the self parameter, or you can consider splitting into
+multiple extern blocks.
+An extern Rust function signature is allowed to contain explicit lifetimes but +in this case the function must be declared unsafe-to-call. This is pretty +meaningless given we're talking about calls from C++, but at least it draws some +extra attention from the caller that they may be responsible for upholding some +atypical lifetime relationship.
+
+Bounds on a lifetime (like <'a, 'b: 'a>
) are not currently supported. Nor are
+type parameters or where-clauses.
This library provides a safe mechanism for calling C++ code from Rust and Rust +code from C++. It carves out a regime of commonality where Rust and C++ are +semantically very similar and guides the programmer to express their language +boundary effectively within this regime. CXX fills in the low level stuff so +that you get a safe binding, preventing the pitfalls of doing a foreign function +interface over unsafe C-style signatures.
+From a high level description of the language boundary, CXX uses static analysis +of the types and function signatures to protect both Rust's and C++'s +invariants. Then it uses a pair of code generators to implement the boundary +efficiently on both sides together with any necessary static assertions for +later in the build process to verify correctness.
+The resulting FFI bridge operates at zero or negligible overhead, i.e. no +copying, no serialization, no memory allocation, no runtime checks needed.
+The FFI signatures are able to use native data structures from whichever side +they please. In addition, CXX provides builtin bindings for key standard library +types like strings, vectors, Box, unique_ptr, etc to expose an idiomatic API on +those types to the other language.
+In this example we are writing a Rust application that calls a C++ client of a
+large-file blobstore service. The blobstore supports a put
operation for a
+discontiguous buffer upload. For example we might be uploading snapshots of a
+circular buffer which would tend to consist of 2 pieces, or fragments of a file
+spread across memory for some other reason (like a rope data structure).
+Now we simply provide Rust definitions of all the things in the extern "Rust"
+block and C++ definitions of all the things in the extern "C++"
block, and get
+to call back and forth safely.
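As a minimal sketch of what a Rust-side definition for the discontiguous buffer might look like, the following uses only the standard library. The field layout of `MultiBuf` and the `next_chunk` function are assumptions made for illustration; the Tutorial chapter has the actual definitions.

```rust
// An iterator-like view over contiguous chunks of a discontiguous buffer.
// The fields are an assumption for this sketch.
pub struct MultiBuf {
    pub chunks: Vec<Vec<u8>>,
    pub pos: usize,
}

// Returns the next chunk, or an empty slice once all chunks have been
// handed out. The C++ side of a `put` operation could call this repeatedly
// until it receives an empty slice.
pub fn next_chunk(buf: &mut MultiBuf) -> &[u8] {
    let next = buf.chunks.get(buf.pos);
    buf.pos += 1;
    next.map_or(&[], Vec::as_slice)
}
```

Everything here is ordinary safe Rust working on ordinary Rust types; nothing in the definition is specific to FFI.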
The Tutorial chapter walks through a fleshed out version of
+this blobstore example in full detail, including all of the Rust code and all of
+the C++ code. The code is also provided in runnable form in the demo directory
+of https://github.com/dtolnay/cxx. To try it out, run cargo run
from that
+directory.
The key takeaway, which is enabled by the CXX library, is that the Rust code in +main.rs is 100% ordinary safe Rust code working idiomatically with Rust types +while the C++ code in blobstore.cc is 100% ordinary C++ code working +idiomatically with C++ types. The Rust code feels like Rust and the C++ code +feels like C++, not like C-style "FFI glue".
+Chapter outline: See the hamburger menu in the top left if you are on a +small screen and it didn't open with a sidebar by default.
+ +