Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider separating rav1e into subcrates #3347

Closed
FreezyLemon opened this issue Feb 3, 2024 · 11 comments
Closed

Consider separating rav1e into subcrates #3347

FreezyLemon opened this issue Feb 3, 2024 · 11 comments

Comments

@FreezyLemon
Copy link
Contributor

This is a common pattern for larger Rust projects (e.g. rust-analyzer and gitoxide). matklad also published a blog post on the topic (and a series on larger projects in general)

I think it could make sense to use this kind of structure for rav1e, too. At least some things can probably be moved into subcrates without much hassle:

  • x86/ARM assembly
  • src/asm/* (asm bindings and alternative rust implementations)
  • binaries

Other modules that don't have many tangled dependencies could possibly be refactored to allow this (transforms? partitioning?).

In the short term, this can simplify some things (build.rs, the dependencies in Cargo.toml). In the longer term, it can allow better organization of these subcrates (clearer boundaries between rav1e components) and better build times. A better organization will also help projects that try to reuse parts of rav1e, like av-scenechange. That project currently has to pull in the entire rav1e crate to reuse a small part of its API.

Let me know what you think. I'd be interested in getting this started, if this is something you can get behind. I already have a very experimental branch that shows what it could generally look like.

@lu-zero
Copy link
Collaborator

lu-zero commented Feb 3, 2024

I think it would make build and rebuild faster so it would be worth looking into it even if just for having the local crates, if we manage to spin out more reusable components is even better.

That said could you please check what is the best practice to publish all the crates? IIRC cargo publish --workspace hadn't landed yet so there isn't an immediate way to publish all :/

Splitting does not simplify anything, it spreads the complexity across more crates, so we pay some additional burden for some nicer experience rebuilding.

NOTE: In order to not break everybody expectation we should keep cargo install rav1e working.

@tdaede
Copy link
Collaborator

tdaede commented Feb 3, 2024

Well one of the reasons it makes builds for some projects faster is that by default LTO across crates is disabled. Because we enable it explicitly for our release builds, the benefit might be smaller than expected.

@decathorpe
Copy link

Splitting does not simplify anything, it spreads the complexity across more crates, so we pay some additional burden for some nicer experience rebuilding.

To add another perspective here - from a Linux distribution point of view, it is currently quite nice that rav1e is pretty much self-contained. I'm the maintainer of the package in Fedora Linux, and it's always pretty easy to update rav1e to new versions for this reason. When splitting things into multiple sub-components, that makes things harder ... gitoxide is a pretty bad example, it requires updating dozens and dozens of packages for all its subcomponents regularly, which is a huge PITA.

@shssoichiro
Copy link
Collaborator

I would really not like linux packagers to pull in individual components of rav1e, although I'm aware that some distributions prefer going against Rust's static-by-default build methods, but I also very strongly favor rav1e being one binary for end users. The reason I see this being potentially beneficial is primarily around crate development, as mentioned there are crates such as av-scenechange which currently have to pull in the entirety of rav1e. I will say as well, however, from working on av-scenechange--Ideally that crate would host all of the scenechange code, and rav1e would pull in that crate, instead of the other way around, but there is a lot of code in rav1e that is used in multiple places within the codebase, so it's actually not a simple matter to split rav1e into isolated components that have a one-way dependency tree.

@decathorpe
Copy link

I would really not like linux packagers to pull in individual components of rav1e, although I'm aware that some distributions prefer going against Rust's static-by-default build methods, but I also very strongly favor rav1e being one binary for end users.

I'm not sure what you mean here? Dynamic linking for Rust crates is basically impossible in most contexts. We distribute binaries that are built like usual (with the equivalent of cargo build --release) ... so, statically linked and just one binary. The only difference is that some distros (primarily debian and Fedora) package crates separately and do not use vendored dependencies. But that does not affect static / dynamic linking, those crate packages contain only source code, no compiled artifacts.

@shssoichiro
Copy link
Collaborator

Maybe I should have omitted that first sentence because I suppose it wasn't really relevant to the point I was trying to make. I don't want end users (essentially, anyone who is not a crate developer) to have to be concerned with rav1e's internal package structure, and to continue to have just one binary. If there is any splitting within rav1e, it should leave the binary intact as the primary interface for users.

@lu-zero
Copy link
Collaborator

lu-zero commented Feb 5, 2024

I would try to use a local subcrate for the asm code, it isn't really impacted by lto and isn't more work due the lack of cargo publish --workspace). Ideally it would simplify the build.rs for the hash checking and maybe speed up a little rebuilding overall.

@decathorpe
Copy link

Can you clarify what you mean by "local subcrate"? Support for actual "nested" crates in cargo is not implemented yet: rust-lang/rfcs#3452

@lu-zero
Copy link
Collaborator

lu-zero commented Feb 6, 2024

My bad, somehow I assumed it got in.

@FreezyLemon
Copy link
Contributor Author

Well one of the reasons it makes builds for some projects faster is that by default LTO across crates is disabled. Because we enable it explicitly for our release builds, the benefit might be smaller than expected.

That's fair. I think there could still be some speed improvements from improved build parallelization before LTO, but that might require more work specifically for better build speeds.

That said could you please check what is the best practice to publish all the crates? IIRC cargo publish --workspace hadn't landed yet so there isn't an immediate way to publish all :/

I wasn't aware of this perspective. It seems like there is no official support for this yet, but there are some external tools like cargo-release (which supports --workspace) or cargo-workspace. This is just what I found when searching though, I have no experience with these.

Splitting does not simplify anything, it spreads the complexity across more crates, so we pay some additional burden for some nicer experience rebuilding.

Subcrates shouldn't be created just for the sake of more subcrates. There's no obligation to create more subcrates if there are disadvantages to decoupling things. I think there are a few things that are easy to move into a subcrate though.

If there is any splitting within rav1e, it should leave the binary intact as the primary interface for users.

I agree. Maybe I should've been clearer in my initial post. I didn't mean to suggest that rav1e (as in: the encoder rav1e) should be packaged as multiple components. I would assume that most of these subcrates would be version = 0.0.0 and publish = false and just exist for organization of code like crates for asm builds. A release build would still create a statically-linked rav1e executable that can be used and distributed like usual.

@FreezyLemon
Copy link
Contributor Author

On revisiting this, I suggested this under a false assumption. I thought there could be subcrates in this repository which rav1e depends on, which would not need to be published to crates.io separately (I think this would only be possible with the "nested subcrate" feature talked about above).

That changes my stance a bit - I'm not sure having to publish rav1e in X parts without an explicit reason is a good idea given the added workload around publishing these crates. I guess it would make more sense to only split functionality into a separate crate as needed (e.g. av-scenechange, if possible).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants