-
-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PoC: add Rust submodule as libbasic_rust #19598
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the meson integration doesn't look to bad.
one thing that makes me wonder: does it make sense to build mixed C+rust projects still with gcc at all? i.e. during the fedora default compiler discussion people suggested gcc and llvm still generate slightly incompatible interfaces. but otoh it sounds weird using llvm for the rust stuff but then still use C for the gcc part. what#s the story there?
@mbiebl on a scale from 1 to 10 how much would debian hate if we actually would adopt this?
src/basic/string-util.rs
Outdated
pub extern "C" fn string_extract_line(s: *const c_char, i: usize, ret: *mut *mut c_char) -> i32 { | ||
if ret.is_null() { | ||
panic!("string_extract_line(): 'ret' cannot be NULL"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i ifgure the 4ch indenting is rust-native styling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, not too much clue about rust here, but there must be a way to encode this in the function call signature, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i ifgure the 4ch indenting is rust-native styling?
https://github.com/rust-dev-tools/fmt-rfcs/blob/master/guide/guide.md#indentation-and-line-width
hmm, not too much clue about rust here, but there must be a way to encode this in the function call signature, no?
Quite possibly, but I don't know. First time ever I wrote Rust :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rust person here; can I offer some help? 😀
i ifgure the 4ch indenting is rust-native styling?
Yes
hmm, not too much clue about rust here, but there must be a way to encode this in the function call signature, no?
Not really. An actual Rust-y version of this function would look somewhat like this:
fn string_extract_line(s: &str, line_index: usize) -> Option<&str> {
s.lines().nth(line_index)
}
(Note that this always returns a slice of the original string and never does a copy.)
But this function is instead written to use C types (and C ABI) and be callable from external C. So it's rather pointless to try to enforce anything in its Rust-side signature, as C is still going to see its own definition. That, and also raw pointers (as opposed to references) can be null, dangling and unaligned.
On the other hand, since this function is already unsafe to call (it dereferences raw pointers! — and by the way, it should be marked unsafe
for this reason), and since the (old) C version of this function check doesn't appear to check the pointer for being null, it's OK to just say this is similarly a precondition here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rust person here; can I offer some help?
Yes, please! Pushed a new version, with some improvements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Rust code should be tested thoroughly as well.
Sure.
I'm talking about various analyzers (like Coverity and LGTM for C), fuzzers and so on.
Well, I again admit I don't have any experience with using those tools with Rust, but Rust should not be much different from C in the regard. You pass the appropriate flags to the compiler (and perhaps Meson handles this for you similarly to how Cargo does), and then you invoke whatever analyzer tool. As discovered below, you'd likely need libstd built with the same flags, but that should be it.
But this is not my area of expertise (I do use Cargo for all my Rust projects...); better ask Meson developers for what they recommend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So there are 2 ways that cargo runs commands
- The subcommand is built into cargo. In this case it is intended that anything you can do with cargo you can also do by calling rustc or other binaries like rustdoc directly.
- Cargo also provides an extension mechanism so arbitrary commands can masquerade as cargo subcommands. Basically, if some executable with a name like
cargo-mytool
orcargo-mytool.exe
on windows exists, then runningcargo mytool
will run your tool and pass command line parameters in.
It should be possible to run those programs simply by calling them directly (e.g. cargo-fuzz --my --args
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A cursory search of cargo-fuzz
suggests that there's no requirement for cargo
when running the tool (although building it does require cargo).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the second case the command very likely uses cargo internally. Depending on the tool, the cargo subcommand either has a hard dependency on cargo or just uses cargo to build everything with the necessary flags and for example overrides the rustc used by cargo or gets the necessary information about the target crate after building all dependencies and then invokes the main tool. cargo clippy
for example uses RUSTC_WORKSPACE_WRAPPER
to run clippy-driver
instead of rustc for all crates inside the current workspace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I more or less figured out how to extract the part of "cargo fuzz" dealing with the compiler options like -Cpasses=sancov
, -Cllvm-args=-sanitizer-coverage-level=4
and so on I need. It kind of even works in the sense that I was able to run the systemd fuzz targets built with both clang and rustc. So I agree that it's possible to kind of bring cargo plugins to the systemd build system. The problem is that borrowing code like that isn't maintainable. Someone would have to keep track of what changed where and update systemd
accordingly.
i wonder if one approach to the rustification would be to start with the stuff in src/basic/ and convert it one by one, always making sure our tests for this still pass, and keeping the C version in the codebase, verifying they match the rest version in behaviour by providing careful tests. THen one day, when we are confident about it we could just remove the C code and then make rust a requirement. This would allow us to graudually move things over, and more forward looking distros could build the rust versions of stuff while more traditional distros (where compat with legacy archs like sparc or alpha is important) could hold out a bit longer with the C versions of stuff. The price for such an approach would of course be that we'd have to maintain two versions of much of the stuff in src/basic for a while, but maybe that's a good thing, while we are still learning (and by "we" I mean in particular myself, since at this point my superficial understanding of Rust is very superficial) |
https://buildd.debian.org/status/package.php?p=rustc |
I know @smcv was involved there, maybe he can share his experience. |
But how did that play out? Is DEbian still stuck with the old C version of librsvg for all arcs? or did they finally make the jump? does debian ship the rust version on modern archs and the C version on legacy archs? |
Would it make more sense to start with a leaf/optional component, like say homed, which can be disabled if necessary to start getting familiar with rust? |
src/basic/string-util.rs
Outdated
}; | ||
let mut found: bool = false; | ||
|
||
for (j, item) in s.lines().enumerate() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's a "line" in rust btw? i.e. what separators are accepted? \n? \r? \0? combinations thereof?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's quite some difference to our current logic... i'd prefer if we didn#t introduce that silently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the tests are passing :-) Apart from the \r\n
, what else is different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW using \n
unconditionally is simple enough:
for (j, item) in s.split('\n').enumerate() {
I think older versions of the C-based library still linger on in the archive as binary packages. Afaik, there is no source package of the old librsvg anymore. So librsvg can't be rebuilt on those architectures. I assume for librsvg one could introduce a separate source package which a different name for those architectures but it seems nobody has done it, as this is quite a maintenance overhead. For systemd this wouldn't be feasible. |
|
It's using rustc though, not llvm? Or are they the same thing? The rustc debian package pulls in gcc for example, not llvm |
rustc is LLVM-based. Currently GCC doesn't have a rust frontend, although one is currently being built and there is also the mrustc project not related to either LLVM or GCC. |
I guess rustc uses llvm via libllvm? Not sure why it has the gcc dependency tbh. |
Yeah, that's pretty much what I was thinking too, it would make sense. The double implementation wouldn't probably be too bad - in the end when we add helpers to src/basic/ they are always pretty small and self-contained, so writing them twice shouldn't take too much time. And tree-wide refactors are often due to style or C-specific stuff, so wouldn't apply anyway. The main issue I see is the one you already clocked - there are many architectures where rustc is not available. This is a known problem, and unfortunately I do not have an answer there. In Debian, the affected architectures are not "release" archs - so by the letter of the law, one is allowed to break things there. But it is not very nice :-) If we do the double-implementation thing, then I can cook up the meson-ry required to make it work, so that if there's no rustc things work gracefully. |
FWIW, a Rust frontend for GCC is currently work-in-progress with the developers being commercially supported: So it might not be need to use the original Rust compiler in the near future. |
Ah I see, thank you - it shows that I'm a total Rust n00b :-) |
Yes, we're still stuck with these old librsvg versions on the architectures which don't have Rust support yet. m68k is gaining Rust support in the foreseeable future since LLVM has now an M68k backend as well and I have already added support to m68k in Rust in a forked repository. However, I think in the long-term, the better approach will be to use the upcoming Rust frontend in GCC. This way, there also won't be any need for additional compilers or tools to be able to build the codebase. |
There is on snapshot.debian.org.
It can. But the API has changed in the meantime a bit so some packages using librsvg currently FTBFS.
librsvg will be buildable on the other architectures in the foreseeable future when GCC has merged the upcoming Rust frontend. From my discussion with GCC developers I know that they are looking forward to the frontend being part of the official sources. I already tested it myself and while it doesn't support macros yet which are required for stuff like |
That's really promising! I had heard a frontend was under development, but didn't know it was that far ahead, as haven't been following closely. |
so maybe we can do as i proposed and the time where we drop the C versions would be the moment where gcc rust is merged? |
I would probably wait for that particular GCC version to be released first, but yes, that's sounds like a reasonable approach. |
I assume you are suggesting here, that we build all of systemd with clang/llvm? |
Yeah, that's what I was wondering. i.e. to me it sounds like the best option to go full-llvm or full-gcc, but mix and match not sure.
No idea, this was pretty much the question i was asking. But i guess if gcc-rust is a thing soon, then maybe the question doesn't matter that much, as then one can easily do a pure-gcc build |
Do we have any idea how this might affect binary sizes? |
b499177
to
a52e4ed
Compare
Proof-of-concept to test integrating Rust submodules. This adds a (very naive) rewrite of string_extract_line from string-util.h in Rust, and compiles it in so that the rest of our C code, including the unit tests, call into it instead. Given we are only using the standard library, there is no need for a (complicated) integration with Cargo, and the only additional requirement is the Rust compiler. Meson abstracts it quite nicely.
Property-based testing (basically fuzzing of unittests with different inputs) could be a tool to check for differential behaviour of the Rust and C implementations. This should decrease the maintenance overhead of having two implementations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at this, I can't shake the feeling that systemd is probably not the best project for exploration of integration methods. It'd be better to develop a working solution somewhere else first and then apply it cleanly here. Our CI is a monster with a thousand heads already, and we would make it doubly hard by mixing Rust into it.
@@ -1071,6 +1071,7 @@ int string_truncate_lines(const char *s, size_t n_lines, char **ret) { | |||
return truncation_applied; | |||
} | |||
|
|||
#ifdef BUILD_LEGACY_HELPERS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't use ifdef
, but change the define to be 0 or 1, and use #if
here.
- { COMPILER: "gcc", COMPILER_VERSION: "11", LINKER: "bfd", CRYPTOLIB: "gcrypt" } | ||
- { COMPILER: "gcc", COMPILER_VERSION: "11", LINKER: "bfd", CRYPTOLIB: "gcrypt", RUST: "yes" } | ||
- { COMPILER: "gcc", COMPILER_VERSION: "12", LINKER: "gold", CRYPTOLIB: "openssl" } | ||
- { COMPILER: "clang", COMPILER_VERSION: "13", LINKER: "mold", CRYPTOLIB: "gcrypt" } | ||
- { COMPILER: "clang", COMPILER_VERSION: "14", LINKER: "lld", CRYPTOLIB: "openssl" } | ||
- { COMPILER: "clang", COMPILER_VERSION: "14", LINKER: "lld", CRYPTOLIB: "openssl", RUST: "yes" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'd generally want to try rust with the latest compilers, not oldest. More likely to work ;)
} | ||
} | ||
|
||
*ret = CString::new("").expect("CString::new failed").into_raw(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'd want to use Result
and ?
everywhere, and then just have a wrapper at the Rust↔C interface which converts it into a negative-errno.
I agree with this stance. Luckily, the Rust frontend for GCC is already progressing nicely and I think projects will be able to integrate Rust code in the foreseeable future much easier. |
I have a pretty good idea about how to do the integration, and it relies on rust-lang/rust#106560 landing. Once that's happened, I'll update the Meson PR. I wouldn't consider this anywhere ready until well after there are stable versions of both rustc and meson available with everything buttoned up and ready. |
rust-lang/rust#106560 landed yesterday and is available on the latest nightly behind a feature flag. |
Nice, thank you! I'll resume work on the meson side as soon as I can |
1 similar comment
Nice, thank you! I'll resume work on the meson side as soon as I can |
@bluca what's the status of this? I notice the meson PR has been open for a while, so I'm curious if Rust integration is still planned for systemd. |
Haven't had time yet, sadly |
Yeah, we should get on it since the kernel is having more mature rust support (would be cool to rewrite memory safety critical parts of systemd into rust) |
Sorry if any of these have already been answered and I forgot to pull them back off my TODO pile while reading through the thread, but I noticed a few spots where I felt I may be able to add some useful information:
Rustc's POSIX platform, which assumes BSD/UNIX-style "libc is the point of ABI stability for syscalls", is also used on Linux, where it treats delegating to to the system C compiler for the final link as the point of API stability for identifying what linker flags are needed for libc.
The abi_stable crate may be relevant. It's sort of a Rust-to-C-to-Rust FFI framework intended to provide a stable way to encode Rust types in C FFI.
I'd suggest starting by seeing what cargo-bloat (sort of a clone of Bloaty McBloatface) says about something that got much bigger in Rust. In my experience, it tends to be a mixture of the following:
...and possibly pathological expressions of LTO not being used, though I haven't investigated how likely or severe that would be. Though it's much older than min-sized-rust, Why is a Rust executable large? is also good, since it does detailed comparisons, including test builds, between C, C++, and Rust, to explore how much each change helps and why.
In case it's relevant, a According to this comment use of it requires that calling C code be compiled with There's currently an RFC to close the soundness hole relating to
Bear in mind that that Rust makes no affordances for C++-style de facto ABI stabilization, so even a day-to-day bump in the nightly compiler might turn a dynamic link against libstd into undefined behaviour, corruption, and crashes.
There's ongoing discussion about a stable ABI, but it's not intended to be the internal ABI used for libstd but, rather, an easier-to-design-and-bikeshed external ABI for things like plugins. Rust's liberal use of generics like However, if the concern is symbol collisions between different copies of the same libraries, I'd focus on the binary sizes, not the effects on whether it will build successfully. Rust's whole "fearless upgrades" mantra stems in part from an attitude that symbol collisions should only be possible when dependency resolution fails to prevent two different versions of the same C library from getting pulled in. As such, it's designed to default to not exporting symbols and to use name-mangling unless asked not to, with a hash getting incorporated into the symbol to allow non-conflicting use of different versions of the same dependency in the same artifact if that's what it takes to get a dependency tree to resolve. (eg. the
If it helps, here's the documentation page that covers all the crate types ( |
When using cargo this will likely be fixed in the near future by having cargo passing
I don't think this is the case at all. While the ABI itself is not stable, if you use the same libstd when running as when compiling, you shouldn't ever have any issues. And because the filename of libstd is unique for every rustc version and every mangled symbol mangled differently depending on the rustc version string, so even the official builds and distro builds of the same rust release will use different filenames and mangled symbol names and thus don't risk accidentally mixing different abi's. The only case I can imagine where mixing would be possible is if a distro does a patch release of rust without changing the version reported by rustc using |
That does sound sensible, but, especially for a piece of important infrastructure, I'd want to see if anyone from the Rust team can point out something we're missing. |
I'm part of the compiler contributors team. This means that I have merge permission, but don't have decision making power like full compiler team members do. (See https://forge.rust-lang.org/compiler/membership.html for more details) I have a fair amount of knowledge about the current state of the compilation/linking model of rust. I can tell you that it should currently work without getting UB and will at least for a long time as rustc itself depends on it. I don't have the decision making power to block any changes that would break this, but I don't think anyone wants to break it and I did personally oppose breaking it. Be aware however that dynamic linking in rust has a couple of restrictions and rough edges:
If on the other hand you choose to use the staticlib crate type, be aware of rust-lang/rust#104707. You will almost certainly want to use a version script to avoid exporting the entire rust standard library from the dynamic library into which you are linking the staticlib. Not doing so increases the size of the dynamic library (as |
Proof-of-concept to test integrating Rust submodules.
This adds a (very naive) rewrite of string_extract_line from string-util.h
in Rust, and compiles it in so that the rest of our C code, including the
unit tests, call into it instead.
Given we are only using the standard library, there is no need for a
(complicated) integration with Cargo, and the only additional
requirement is the Rust compiler. Meson abstracts it quite nicely.
Note that this is the first time I've written Rust - so it's most likely horrible and completely wrong. Learning was the main idea of the exercise :-)
Opening a PR as I think there were questions in the past about how easily it would be to add Rust modules in the code base.