Bevy needs a better asset-hosting solution for example assets #13875
Comments
The main thing I think we need to move away from is having assets in the repo. I don't feel strongly about whether that means Git LFS on another host, or just separate hosting plus a script or tool to update/fetch differences, with assets/ in .gitignore or something.
A small first step would be to move the assets to another Git repo and make them available through a submodule pointing at that repo. We could then nuke history when needed without breaking every PR. We could also have an "example" data source for assets that would know how to fetch them and cache them locally.
The Rust Project has had nothing but extremely negative things to say about git submodules, so I'm a bit nervous. I don't mind the idea in principle though!
I've worked and do work with submodules quite a bit across a few projects. They are awkward, but they are a solution to a problem, as well as creating problems. :)
Having an "http" asset source that knew how to fetch and cache assets from static hosting could be useful far beyond just the examples.
There's a risk it will be a more painful experience for anyone trying to contribute a new example: they will have to guess the new URL the asset will be available at, and the example won't work until the asset is available there.
Allowing multiple sources for a single asset may be enough, I guess? Let Bevy fetch from them in order and synchronize among sources. Developers can easily work on local storage and easily distribute via the network.
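The "fetch in order" idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: `ByteSource`, `MemorySource`, and `read_first` are hypothetical names made up for this sketch, not Bevy's asset source API, and an in-memory map stands in for both local storage and a remote host.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an asset source (NOT Bevy's actual API).
pub trait ByteSource {
    fn name(&self) -> &str;
    /// Returns the asset bytes, or `None` if this source does not have it.
    fn read(&self, path: &str) -> Option<Vec<u8>>;
}

/// In-memory source, used here so the sketch stays self-contained.
pub struct MemorySource {
    name: String,
    files: HashMap<String, Vec<u8>>,
}

impl MemorySource {
    pub fn new(name: &str) -> Self {
        Self { name: name.to_string(), files: HashMap::new() }
    }

    pub fn with_file(mut self, path: &str, bytes: &[u8]) -> Self {
        self.files.insert(path.to_string(), bytes.to_vec());
        self
    }
}

impl ByteSource for MemorySource {
    fn name(&self) -> &str {
        &self.name
    }

    fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.files.get(path).cloned()
    }
}

/// Tries each source in order and returns the first hit,
/// together with the name of the source that served it.
pub fn read_first<'a>(
    sources: &'a [Box<dyn ByteSource>],
    path: &str,
) -> Option<(&'a str, Vec<u8>)> {
    sources
        .iter()
        .find_map(|source| source.read(path).map(|bytes| (source.name(), bytes)))
}
```

With a "local" source listed before a "remote" one, an asset present in both is served locally, and a contributor's new asset works before it has been uploaded anywhere.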
I think the last "No assets in repo" approach is best.
Right, I think that's a reasonable feature :) What this actually reminds me of is font fallback: maybe we can reuse the mechanisms. I think that the best way forward is:
In a previous attempt I modified https://github.com/bevyengine/bevy/blob/main/crates/bevy_asset/src/io/wasm.rs to use reqwest. The only downside is that reqwest is fairly large for non-Wasm builds.
I like the simplicity of "assets go in the assets folder". I agree that large assets ought not go directly in the repository. But I think it'd be a shame if all the examples referred to remote assets that are unalterable for the user and potentially leave them thinking they need to set up an HTTP server before they can use assets. Also, which assets are we talking about remoting? Shaders and other text-based assets seem like a good thing to keep in the repository because they're small and you want that version history.
Assets == "binary blobs" here. My preferred remote solution is "search the assets folder, falling back to a download, which then populates the asset folder". Basically
We could do it at the "Asset Source" level to configure a global fallback for every request. Ex: override the default asset source to use a
Or more likely a
Caching assets locally probably also means there needs to be a story around cache invalidation. If an asset is updated on the HTTP server, the local copies need to be replaced. Depending on the total size of assets, cache evictions might also have to be considered.
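To make the "search the assets folder, fall back to a download, then populate the folder" idea and the invalidation concern concrete, here is a minimal sketch using only the standard library. Everything in it is an assumption for illustration: `load_with_fallback`, the `.version` sidecar file, and the idea that an expected version tag (for example a content hash listed in a small manifest kept in the repo) is known up front are not from the thread. The `fetch` closure stands in for a real HTTP download.

```rust
use std::fs;
use std::io;
use std::path::Path;

#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// Served from the local assets folder without touching the network.
    CacheHit,
    /// Fetched via `fetch` and written into the assets folder.
    Downloaded,
}

/// Hypothetical helper: local-first lookup with a download fallback.
///
/// Assumption: downloaded files get a sidecar `<name>.version` tag. A file with
/// no tag is treated as hand-placed by the user and is never overwritten.
pub fn load_with_fallback<F>(
    asset_dir: &Path,
    rel_path: &str,
    expected_version: &str,
    fetch: F,
) -> io::Result<(Vec<u8>, Outcome)>
where
    F: FnOnce() -> io::Result<Vec<u8>>,
{
    let asset_path = asset_dir.join(rel_path);
    let tag_path = asset_dir.join(format!("{}.version", rel_path));

    // A stale tag means the remote copy changed: invalidate and re-download.
    let up_to_date = match fs::read_to_string(&tag_path) {
        Ok(tag) => tag == expected_version,
        Err(_) => true, // no tag: assume a user-owned local file
    };
    if asset_path.is_file() && up_to_date {
        return Ok((fs::read(&asset_path)?, Outcome::CacheHit));
    }

    // Miss or stale: download, then populate the assets folder for next time.
    let bytes = fetch()?;
    if let Some(parent) = asset_path.parent() {
        fs::create_dir_all(parent)?;
    }
    fs::write(&asset_path, &bytes)?;
    fs::write(&tag_path, expected_version)?;
    Ok((bytes, Outcome::Downloaded))
}
```

This sketch deliberately ignores eviction: it only replaces stale files and never deletes anything to reclaim disk space.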
What problem does this solve or what need does it fill?
As seen in #13671, adding or replacing new assets for Bevy's examples is shockingly controversial and high risk.
In the absence of other constraints, assets for examples should be:
Without technical constraints, I would be very happy to have several hundred GB of test assets, updated on a whim whenever we have a new feature or find a better asset to replace something subpar.
However, because these assets are committed directly in Git, any assets (or changes to assets!) that we add are necessarily cloned with the repository. This wastes bandwidth and disk space for users, but also starts running into concerns with GitHub's hosting, which has a soft limit on repo size of 5 GB.
As an additional wrinkle, the way these assets are stored in the assets folder means that users commonly get confused when testing the examples locally, as the whole repo needs to be cloned to do this easily, rather than just copy-pasting code. See #13645.
What options have you considered?
Use Git LFS via GitHub
These are large files! We're using git! We should use Git LFS, right?
Eh, maybe. There are two broad problems here:
To use this freely, we'd probably want 10 data packs, putting us at 500 GiB of storage and 500 GiB/month of bandwidth.
At the current $5/month per data pack, that's $50/month. Not absurd, but not fun.
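The arithmetic behind those figures can be spelled out in a few lines. The per-pack size of 50 GiB (storage and monthly bandwidth) is inferred from "10 data packs" giving 500 GiB; the function and constant names are made up for this sketch, and actual pricing may differ.

```rust
/// Price quoted in the issue: $5 per data pack per month.
pub const USD_PER_PACK_PER_MONTH: u32 = 5;
/// Inferred from the issue's numbers: 500 GiB / 10 packs = 50 GiB per pack.
pub const GIB_PER_PACK: u32 = 50;

#[derive(Debug, PartialEq)]
pub struct LfsPlan {
    pub storage_gib: u32,
    pub bandwidth_gib_per_month: u32,
    pub usd_per_month: u32,
}

/// Totals for a given number of data packs.
pub fn lfs_plan(packs: u32) -> LfsPlan {
    LfsPlan {
        storage_gib: packs * GIB_PER_PACK,
        bandwidth_gib_per_month: packs * GIB_PER_PACK,
        usd_per_month: packs * USD_PER_PACK_PER_MONTH,
    }
}

/// Smallest number of packs covering `required_gib` (rounds up).
pub fn packs_needed(required_gib: u32) -> u32 {
    (required_gib + GIB_PER_PACK - 1) / GIB_PER_PACK
}
```

Because bandwidth is billed in the same pack units as storage, whichever of the two is larger determines the pack count, which is why the hard-to-estimate bandwidth matters so much here.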
The amount of bandwidth required is very hard to estimate though: Bevy has a ton of users and contributors, and each of them will want to pull down the repo and examples.
If GitHub decides to offer us in-kind support for this, or some form of deal, I think this is worth considering. Otherwise, probably not.
Use Git LFS on a different host
With a bit more devops work, we should be able to configure Git LFS to use an alternate backend. Preliminary research suggests that this should be meaningfully cheaper.
It also opens us up to accepting in-kind donations from alternate hosting companies (hi, get in touch?).
There are open questions here, though, about the level of engineering required and about the contributor / user experience.
No assets in the repo
Alternatively, we could avoid hosting assets in the repository at all, and instead simply download them on demand from Bevy-controlled servers. In the long term, this would likely share infrastructure with the Bevy Marketplace (or whatever we call our Unity Asset Store equivalent).
Appealingly, this means we have full control over cost / backing, and don't need to fuss with Git LFS at all. Users trying out examples have things "just work" on copy-paste (although there's some weirdness with "please wait, downloading assets"), and we don't waste nearly as much bandwidth copying over all of the assets for users who only want to try out a handful of examples.
This involves the most infrastructure work (though probably mostly stuff we want anyway), and we will need to plan carefully to ensure this doesn't cripple our ability to automatically test examples in CI.
Additional context
This issue was prompted by @superdump raising these concerns on Discord.