Bevy needs a better asset-hosting solution for example assets #13875

Open
alice-i-cecile opened this issue Jun 16, 2024 · 15 comments
Labels
A-Build-System Related to build systems or continuous integration C-Examples An addition or correction to our examples C-Feature A new feature, making something new possible D-Complex Quite challenging from either a design or technical perspective. Ask for help! D-Domain-Agnostic Can be tackled by anyone with generic programming or Rust skills S-Needs-Design This issue requires design work to think about how it would best be accomplished X-Controversial There is active debate or serious implications around merging this PR

Comments

@alice-i-cecile
Member

What problem does this solve or what need does it fill?

As seen in #13671, adding or replacing new assets for Bevy's examples is shockingly controversial and high risk.

In the absence of other constraints, assets for examples should be:

  • representative of real world use
  • attractive, to showcase Bevy in a good light
  • high quality, to avoid confusing testers and users about Bevy problems vs broken assets
  • varied, to help

Without technical constraints, I would be very happy to have several hundred GB of test assets, updated on a whim whenever we have a new feature or find a better asset to replace something subpar.

However, because these assets are committed directly in Git, any assets (or changes to assets!) that we add are necessarily cloned with the repository. This wastes bandwidth and disk space for users, and it also starts running into concerns with GitHub's hosting, which has a soft limit on repo size of 5 GB.

As an additional wrinkle, the way these assets are stored in the assets folder means that users commonly get confused when testing the examples locally, as the whole repo needs to be cloned to do this easily, rather than just copy-pasting code. See #13645.

What options have you considered?

Use Git LFS via GitHub

These are large files! We're using git! We should use Git LFS, right?

Eh, maybe. There are two broad problems here:

  1. Git LFS has a reputation for instability and frustration. I haven't encountered these in my projects, but they're pervasive enough to worry me.
  2. Git LFS hosting is quite expensive on GitHub. By default there's 1 free GB of storage and bandwidth, which is hilariously, unusably low for Bevy.

To use this freely, we'd probably want 10 data packs, putting us at 500 GiB of storage and 500 GiB/month of bandwidth.
At the current $5/month per data pack, that's $50/month. Not absurd, but not fun.

The amount of bandwidth required is very hard to estimate though: Bevy has a ton of users and contributors, and each of them will want to pull down the repo and examples.
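Since bandwidth is the hard number to pin down, here is a rough sketch for seeing how the bill scales with it. The $5/month price comes from the figures above; the 50 GiB per pack is inferred from "10 data packs" giving 500 GiB, and the sketch ignores the free tier. The function names are made up for illustration.

```rust
/// Price per data pack, as quoted above.
const PACK_PRICE_USD: u64 = 5;
/// Assumption: inferred from "10 data packs" covering 500 GiB.
const PACK_SIZE_GIB: u64 = 50;

/// Number of data packs needed to cover the larger of storage and
/// monthly bandwidth (each pack provides both).
fn packs_needed(storage_gib: u64, bandwidth_gib_per_month: u64) -> u64 {
    let required = storage_gib.max(bandwidth_gib_per_month);
    // Ceiling division: a partially used pack is still a whole pack.
    (required + PACK_SIZE_GIB - 1) / PACK_SIZE_GIB
}

/// Monthly cost in USD for the given storage and bandwidth needs.
fn monthly_cost_usd(storage_gib: u64, bandwidth_gib_per_month: u64) -> u64 {
    packs_needed(storage_gib, bandwidth_gib_per_month) * PACK_PRICE_USD
}
```

With 500 GiB of each this reproduces the $50/month above. The cost is driven by whichever of the two numbers is larger, so a bandwidth spike dominates even if storage stays small.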

If GitHub decides to offer us in-kind support for this, or some form of deal, I think this is worth considering. Otherwise, probably not.

Use Git LFS on a different host

With a bit more devops work, we should be able to configure Git LFS to use an alternate backend. Preliminary research suggests that this should be meaningfully cheaper.

It also opens us up to accepting in-kind donations from alternate hosting companies (hi, get in touch?).

There are open questions here about the level of engineering required and the contributor / user experience, though.

No assets in the repo

Alternatively, we could avoid hosting assets in the repository at all, and instead simply download them on demand from Bevy-controlled servers. In the long term, this would likely share infrastructure with the Bevy Marketplace (or whatever we call our Unity Asset Store equivalent).

Appealingly, this means we have full control over cost / backing, and don't need to fuss with Git LFS at all. Users trying out examples have things "just work" on copy-paste (although there's some weirdness with "please wait, downloading assets"), and we don't waste nearly as much bandwidth copying over all of the assets for users that only want to try out a handful of examples.

This involves the most infrastructure work (though it's probably mostly stuff we want anyways), and we will need to plan carefully to ensure this doesn't cripple our ability to automatically test examples in CI.

Additional context

This issue was prompted by @superdump raising these concerns on Discord.

@alice-i-cecile alice-i-cecile added C-Feature A new feature, making something new possible A-Build-System Related to build systems or continuous integration C-Examples An addition or correction to our examples D-Complex Quite challenging from either a design or technical perspective. Ask for help! S-Needs-Design This issue requires design work to think about how it would best be accomplished X-Controversial There is active debate or serious implications around merging this PR D-Domain-Agnostic Can be tackled by anyone with generic programming or Rust skills labels Jun 16, 2024
@superdump
Contributor

The main thing I think we need to move away from is having assets in the repo. I don't feel strongly about git lfs on another host, or something like just separate hosting and a script or tool to update/fetch differences and having assets/ in .gitignore or something.
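A minimal sketch of the "fetch differences" part of such a tool, assuming a manifest that maps each asset path to a content hash. The manifest format, the hash strings and all names here are hypothetical; nothing like this exists in the repo today.

```rust
use std::collections::BTreeMap;

/// Maps an asset path (relative to `assets/`) to a content hash.
/// Hypothetical format, for illustration only.
type Manifest = BTreeMap<String, String>;

#[derive(Debug, PartialEq)]
struct SyncPlan {
    /// Missing locally, or present with a different hash.
    download: Vec<String>,
    /// Present locally but no longer listed remotely.
    delete: Vec<String>,
}

/// Compares the local manifest against the remote one and returns the
/// work needed to bring the local `assets/` folder up to date.
fn plan_sync(local: &Manifest, remote: &Manifest) -> SyncPlan {
    let download = remote
        .iter()
        .filter(|(path, hash)| local.get(*path) != Some(*hash))
        .map(|(path, _)| path.clone())
        .collect();
    let delete = local
        .keys()
        .filter(|path| !remote.contains_key(*path))
        .cloned()
        .collect();
    SyncPlan { download, delete }
}
```

Only changed or new files end up in `download`, so updating one asset costs one transfer rather than a full re-clone.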

@mockersf
Member

mockersf commented Jun 16, 2024

a small first step would be to move assets to another git repo, and make it available through a submodule

on that other repo, we could nuke history without breaking every PR when needed

we could also have an "example" data source for assets that would know how to fetch them and cache them locally

@alice-i-cecile
Member Author

The Rust Project has had nothing but extremely negative things to say about git submodules, so I'm a bit nervous. I don't mind the idea in principle though!

@superdump
Contributor

I've worked and do work with submodules quite a bit across a few projects. They are awkward, but they are a solution to a problem, as well as creating problems. :)

@fintelia
Contributor

Having an "http" asset source that knew how to fetch and cache assets from static hosting could be useful far beyond just the examples.

@mockersf
Member

there's a risk it will be a more painful experience for anyone trying to contribute a new example: they will have to guess the new URL the asset will be available at, and the example won't work until the asset is available there

@hxYuki
Contributor

hxYuki commented Jun 17, 2024

Allowing multiple sources for one single asset may be enough I guess? Let bevy fetch in order and synchronize among sources. Developers can easily work on local storage and easily distribute via network.
Or an asset source that handles resources in this way should do it.
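To make the "fetch in order" part concrete, here is a rough sketch with an ordered list of sources where the first hit wins. The `Source` trait and `MemorySource` are simplified stand-ins invented for this example, not Bevy's real asset source types, and the synchronization between sources is left out.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for an asset source.
trait Source {
    fn name(&self) -> &str;
    fn read(&self, path: &str) -> Option<Vec<u8>>;
}

/// In-memory source standing in for "local disk" or "network".
struct MemorySource {
    name: String,
    files: HashMap<String, Vec<u8>>,
}

impl MemorySource {
    fn new(name: &str, files: &[(&str, &str)]) -> Self {
        Self {
            name: name.to_string(),
            files: files
                .iter()
                .map(|(path, content)| (path.to_string(), content.as_bytes().to_vec()))
                .collect(),
        }
    }
}

impl Source for MemorySource {
    fn name(&self) -> &str {
        &self.name
    }
    fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.files.get(path).cloned()
    }
}

/// Tries each source in order and reports which one served the asset.
fn read_first<'a>(sources: &'a [Box<dyn Source>], path: &str) -> Option<(&'a str, Vec<u8>)> {
    sources
        .iter()
        .find_map(|source| source.read(path).map(|bytes| (source.name(), bytes)))
}
```

Putting the local source first means a contributor can drop a new asset into their local folder and have it picked up before it exists anywhere on the network.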

@Olle-Lukowski
Contributor

I think the last "No assets in repo" approach is best.
Additionally, I think that the sooner we get started on something like an asset store, the better. I think having a central place to find assets, plugins, etc. would be really nice, especially for people coming from engines like Unity.
I also agree with @hxYuki, we should allow for different asset sources, and nice integration within bevy to fetch them (and potential nested dependencies) automatically.

@alice-i-cecile
Member Author

Allowing multiple sources for one single asset may be enough I guess? Let bevy fetch in order and synchronize among sources. Developers can easily work on local storage and easily distribute via network. Or an asset source that handles resources in this way should do it.

Right, I think that's a reasonable feature :) What this actually reminds me of is font fallback: maybe we can reuse the mechanisms.

I think that the best way forward is:

  1. Implement a general-purpose asset fallback solution.
  2. Implement a blessed way to fetch assets from the web.
  3. Set up our own hosting for the existing assets with an endpoint.
  4. Move all of the assets out of tree and swap to a "try locally, then download" strategy for all of the examples.

@valaphee
Contributor

In a previous attempt I modified https://github.com/bevyengine/bevy/blob/main/crates/bevy_asset/src/io/wasm.rs to use reqwest. The only downside is that reqwest is fairly large for non-wasm builds.

@shanecelis
Contributor

I like the simplicity of assets go in the "assets" folder. I agree that large assets ought not go directly in the repository. But I think it'd be a shame if all the examples referred to remote assets that are unalterable for the user and potentially leave them thinking they need to set up an HTTP server before they can use assets.

Also which assets are we talking about remoting? Shaders and other text-based assets seem like a good thing to keep in the repository because they're small and you want that version history.

@alice-i-cecile
Member Author

Assets == "binary blobs" here.

My preferred remote solution is "search the assets folder, falling back to a download, which then populates the asset folder". Basically asset_server.load_asset("fox.glb").with_remote_fallback("https://bevyengine.org/marketplace/examples");

@cart
Member

cart commented Jun 21, 2024

We could do it at the "Asset Source" level to configure a global fallback for every request. Ex: override the default asset source to use a Fallback<FileAssetReader, HttpAssetReader> source.
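A sketch of what such a generic wrapper could look like. The `Reader` trait here is a hypothetical, synchronous simplification (Bevy's real `AssetReader` is async and has more methods), and `Fixed` stands in for both the file reader and the HTTP reader.

```rust
/// Hypothetical, simplified reader trait: bytes for a path, or not found.
trait Reader {
    fn read(&self, path: &str) -> Option<Vec<u8>>;
}

/// Tries `primary` first and only consults `secondary` on a miss.
struct Fallback<A, B> {
    primary: A,
    secondary: B,
}

impl<A: Reader, B: Reader> Reader for Fallback<A, B> {
    fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.primary
            .read(path)
            .or_else(|| self.secondary.read(path))
    }
}

/// Stand-in reader backed by a fixed list of (path, content) pairs.
struct Fixed(Vec<(&'static str, &'static str)>);

impl Reader for Fixed {
    fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.0
            .iter()
            .find(|(candidate, _)| *candidate == path)
            .map(|(_, content)| content.as_bytes().to_vec())
    }
}
```

Because `Fallback` is itself a `Reader`, it nests: `Fallback<A, Fallback<B, C>>` gives a three-level chain without any extra code, and example code keeps calling one source without knowing a fallback exists.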

@cart
Member

cart commented Jun 21, 2024

Or more likely a CachedFallback<FileAssetReader, FileAssetWriter, HttpAssetReader> source or something.
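Roughly, the cached variant reads locally, falls back to the remote on a miss, and writes the fetched bytes back so the next read is local. The sketch below assumes simplified synchronous `Reader` / `Writer` traits (hypothetical, not Bevy's real async traits), uses a plain directory for the local side, and fakes the remote with a hit counter.

```rust
use std::cell::Cell;
use std::fs;
use std::path::PathBuf;

// Hypothetical, simplified traits for illustration.
trait Reader {
    fn read(&self, path: &str) -> Option<Vec<u8>>;
}
trait Writer {
    fn write(&self, path: &str, bytes: &[u8]) -> std::io::Result<()>;
}

/// Reads from and writes to a directory on disk (the local asset folder).
struct Dir(PathBuf);

impl Reader for Dir {
    fn read(&self, path: &str) -> Option<Vec<u8>> {
        fs::read(self.0.join(path)).ok()
    }
}

impl Writer for Dir {
    fn write(&self, path: &str, bytes: &[u8]) -> std::io::Result<()> {
        let target = self.0.join(path);
        if let Some(parent) = target.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::write(target, bytes)
    }
}

/// Stand-in for an HTTP reader; counts how often it is hit.
struct FakeRemote {
    body: Vec<u8>,
    hits: Cell<usize>,
}

impl Reader for FakeRemote {
    fn read(&self, _path: &str) -> Option<Vec<u8>> {
        self.hits.set(self.hits.get() + 1);
        Some(self.body.clone())
    }
}

/// Local first; on a miss, fetch remotely and store the result locally.
struct CachedFallback<R, W, Remote> {
    local: R,
    cache: W,
    remote: Remote,
}

impl<R: Reader, W: Writer, Remote: Reader> Reader for CachedFallback<R, W, Remote> {
    fn read(&self, path: &str) -> Option<Vec<u8>> {
        if let Some(bytes) = self.local.read(path) {
            return Some(bytes);
        }
        let bytes = self.remote.read(path)?;
        // A failed cache write is not fatal: the caller still gets the bytes.
        let _ = self.cache.write(path, &bytes);
        Some(bytes)
    }
}
```

When `local` and `cache` point at the same folder, the second read of an asset never touches the network. A real version would also need to sanitize paths before joining them onto the cache directory.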

@fintelia
Contributor

Caching assets locally probably also means there needs to be a story around cache invalidation. If an asset is updated on the HTTP server, the local copies need to be replaced. Depending on the total size of assets, cache evictions might also have to be considered.
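One possible shape for both concerns, as a sketch only: keep a small index next to the cache with a version tag per asset (for invalidation) plus size and last-use information (for eviction). The `version` tag is an assumption; it could be a content hash or an HTTP ETag. Least-recently-used eviction is just one policy picked for the example.

```rust
use std::collections::HashMap;

/// Metadata kept next to each cached asset (hypothetical layout).
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    version: String,
    size_bytes: u64,
    /// Logical clock; larger means more recently used.
    last_used: u64,
}

#[derive(Default)]
struct CacheIndex {
    entries: HashMap<String, Entry>,
}

impl CacheIndex {
    /// Invalidation: a cached copy is stale if it is missing or its version
    /// differs from what the server currently advertises.
    fn is_stale(&self, path: &str, remote_version: &str) -> bool {
        self.entries
            .get(path)
            .map_or(true, |entry| entry.version.as_str() != remote_version)
    }

    /// Eviction: drop least recently used entries until the total size fits
    /// the budget. Returns the evicted paths, oldest first.
    fn evict_to_budget(&mut self, budget_bytes: u64) -> Vec<String> {
        let mut total: u64 = self.entries.values().map(|e| e.size_bytes).sum();
        let mut by_age: Vec<(u64, String)> = self
            .entries
            .iter()
            .map(|(path, entry)| (entry.last_used, path.clone()))
            .collect();
        by_age.sort();

        let mut evicted = Vec::new();
        for (_, path) in by_age {
            if total <= budget_bytes {
                break;
            }
            if let Some(entry) = self.entries.remove(&path) {
                total -= entry.size_bytes;
                evicted.push(path);
            }
        }
        evicted
    }
}
```

Checking `is_stale` still needs a cheap way to learn the remote version without downloading the asset, which is its own design question.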


10 participants