CORS headers for model files #468

Closed

josephrocca opened this issue Nov 14, 2021 · 19 comments

@josephrocca

Is your feature request related to a problem? Please describe.
I'm porting some popular models to work in web browsers (using tfjs/onnx/tflite), and as part of doing this I'd like to host my models somewhere. If the model is small enough, I can just host it in the GitHub repo itself, like I've done for a version of AnimeGANv2 that runs in the browser.

However, this doesn't really work if the model is more than a dozen or so megabytes, due to the way git/GitHub is designed. I've got a working JS port of OpenAI's CLIP model that I'd like to share a web demo for, but I need to host the model somewhere. Hugging Face seems like a perfect service for this, but I'm running into a problem: Hugging Face doesn't serve model files with the Access-Control-Allow-Origin: * header, so the files can't be downloaded by JavaScript running on another website (such as josephrocca.github.io).

So, for example, if you run this on a website other than huggingface.co:

let onnxFile = await fetch("https://huggingface.co/rocca/openai-clip-js/resolve/main/clip-image-vit-32-float32.onnx").then(r => r.blob());

You'll get a CORS error like this:

[screenshot: CORS error in the browser console]

Describe the solution you'd like
If the Access-Control-Allow-Origin: * header could be added to the serving code for model files, that would allow Hugging Face to be used for hosting web-based models (onnx, tfjs, tflite).
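For illustration, here's a minimal sketch of what that header looks like in a Node/Express serving layer (a hypothetical setup for clarity, not Hugging Face's actual serving code):

const express = require("express");
const app = express();

// Hypothetical local directory of model files.
app.use("/models", (req, res, next) => {
  // Allow browser JavaScript on any origin to read these responses.
  res.set("Access-Control-Allow-Origin", "*");
  next();
}, express.static("models"));

app.listen(8080);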

Describe alternatives you've considered
I could host the model elsewhere, but I like Hugging Face and want to host all my models there :)

@julien-c
Member

I think this is a reasonable request! cc @SBrandeis @Pierrci, what do you think?

@Pierrci closed this as completed Nov 16, 2021
@julien-c
Member

julien-c commented Nov 16, 2021

Shipped by @Pierrci 🙏

@josephrocca let us know if this works for you!!!

@Pierrci
Member

Pierrci commented Nov 16, 2021

@julien-c was so excited that he announced it before it was actually deployed; it should be good now :)

@josephrocca
Author

@Pierrci Haha, moving fast! :D On my end there are no CORS headers at the moment. It might just be a delay in CloudFront's deployment process, or maybe something related to caching?

https://huggingface.co/rocca/openai-clip-js/resolve/main/clip-image-vit-32-float32.onnx

content-length: 276
content-type: text/html; charset=utf-8
date: Tue, 16 Nov 2021 16:56:30 GMT
location: https://cdn-lfs.huggingface.co/rocca/openai-clip-js/bd2ccde5ba3e10a05d8276ac106f1d171c80d1be540a32f8febf602818f53201
permissions-policy: interest-cohort=()
server: nginx/1.18.0 (Ubuntu)
vary: Accept
x-linked-etag: "bd2ccde5ba3e10a05d8276ac106f1d171c80d1be540a32f8febf602818f53201"
x-linked-size: 351468764
x-powered-by: huggingface-moon

Which redirects to:

https://cdn-lfs.huggingface.co/rocca/openai-clip-js/bd2ccde5ba3e10a05d8276ac106f1d171c80d1be540a32f8febf602818f53201

accept-ranges: bytes
content-length: 351468764
content-type: binary/octet-stream
date: Tue, 16 Nov 2021 16:56:51 GMT
etag: "d29c6ba73af05433b498ff1617daadab"
last-modified: Sun, 14 Nov 2021 17:17:50 GMT
server: AmazonS3
via: 1.1 e458de70cfe2237c659d4e5f2ae84565.cloudfront.net (CloudFront)
x-amz-cf-id: UtFq7FQSWYO3sE68yUXjGmSUv-TT7_YCzrXWNKitBLNfxgCpLPHBzA==
x-amz-cf-pop: SIN52-C3
x-amz-version-id: Lo8kIwUXEoXfi6FZKy1ljW4BBaoi9Xnl
x-cache: Miss from cloudfront

(I'm pretty sure the CORS headers would only need to be on the latter response; I'm just including the former response for debugging/completeness.)

And testing this in my browser console still gives a CORS error:

await fetch("https://huggingface.co/rocca/openai-clip-js/resolve/main/clip-image-vit-32-float32.onnx").then(r => r.blob());

@Pierrci
Member

Pierrci commented Nov 16, 2021

@josephrocca Indeed, my bad; I forgot a little subtlety when deploying. Can you try again? 😄

@josephrocca
Author

josephrocca commented Nov 16, 2021

@Pierrci Ah, so I'm now seeing CORS headers in the response for this:

https://huggingface.co/rocca/openai-clip-js/resolve/main/clip-text-vit-32-float32.onnx

But since that redirects to this:

https://cdn-lfs.huggingface.co/rocca/openai-clip-js/66cd0b4a38c3d4115d8d49b81a0e589e2c33b19434dfff48b1f694b458117a40

It's actually that latter URL that needs to serve the CORS headers, since that's the response whose body needs to be read. I don't think CORS headers on the former URL are needed, since only the location header needs to be read for a 302 (redirect) response, and CORS headers aren't needed for that.
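As a quick way to verify once it's deployed (fetch follows the 302 automatically, so the final response's headers are the ones that matter; a minimal sketch):

// This throws a CORS TypeError while the header is still missing; once the
// final (post-redirect) response serves Access-Control-Allow-Origin, it works.
const res = await fetch("https://huggingface.co/rocca/openai-clip-js/resolve/main/clip-text-vit-32-float32.onnx");
console.log(res.url);                 // the final cdn-lfs.huggingface.co URL
console.log((await res.blob()).size); // body is now readable cross-origin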

I haven't used CloudFront in a long time, but I'm guessing there's some config to enable CORS headers (I switched from S3 to Backblaze B2 due to Amazon's absurd bandwidth fees lol, so all I know is that it's possible on B2 with basically just a checkbox).

Here's what I'm seeing from the CDN:

$ curl -s -I https://cdn-lfs.huggingface.co/rocca/openai-clip-js/66cd0b4a38c3d4115d8d49b81a0e589e2c33b19434dfff48b1f694b458117a40

HTTP/2 200 
content-type: binary/octet-stream
content-length: 254069579
date: Tue, 16 Nov 2021 17:25:04 GMT
last-modified: Sun, 14 Nov 2021 21:53:21 GMT
etag: "b750642f122a188f35dcf3c8dd1e58ae"
x-amz-version-id: _aPX3D3HCBnNq1IAep49ijMFOCHSadPs
accept-ranges: bytes
server: AmazonS3
x-cache: Miss from cloudfront
via: 1.1 f06aaad108598501fc8aab5df5423ad9.cloudfront.net (CloudFront)
x-amz-cf-pop: SIN52-C3
x-amz-cf-id: gd1wSk4IzzIljOo7-APWCOeOKyZiWbYZz1kBXIt-JJgeI1zePZ1coA==

@Pierrci
Member

Pierrci commented Nov 17, 2021

Hi @josephrocca, we updated the configuration in CloudFront, so this time I think it should be good for real :)

(if you're testing with curl make sure to specify an Origin header, otherwise access-control-allow-origin isn't set in the response by CloudFront)
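For example, something along these lines (the Origin value is arbitrary for testing):

$ curl -s -I -H "Origin: https://josephrocca.github.io" https://cdn-lfs.huggingface.co/rocca/openai-clip-js/66cd0b4a38c3d4115d8d49b81a0e589e2c33b19434dfff48b1f694b458117a40

With the Origin header present, access-control-allow-origin should then appear among the response headers.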

@josephrocca
Author

Awesome!

A very bare-bones proof-of-concept demo of OpenAI's CLIP running in the browser, served by Hugging Face 🤗:

Thank you @Pierrci, @julien-c, and team 🎉🎉🙏 There are some exciting APIs currently being prototyped by the Chrome team which could bring a lot of maturity to browsers as platforms for deploying ML solutions. Hugging Face is going to be a great resource for the web ML ecosystem as it grows!

@julien-c
Member

BTW @josephrocca ... this is not super well documented yet, but you can use hf.co/spaces as an alternative to GitHub Pages (to host static webpages like your demo) 🤯

We experimented with client-side ML in the past; it's super cool and seems more and more stable 🔥

@josephrocca
Author

josephrocca commented Nov 17, 2021

@julien-c Very cool! Once I get the full CLIP demo going (similar to the original Colab OpenAI published) I'll host it there. Hugging Face: "GitHub for Software 2.0" (how can I invest? 😄)

One thing I just noticed (in the process of creating a WebAssembly build of Hugging Face Tokenizers) is that there are two different URLs to access non-model files: a /resolve/ URL and a /raw/ URL for the same file (see the sketch below).

Only the /resolve/ URL serves the CORS headers. That's fine because I know about it now and can just make sure I use the /resolve/ URLs in the code (the Tokenizers Rust code actually already does that), but it may trip up others, since the /raw/ URL is the one you get when clicking the file in the Hugging Face UI to get the direct/raw URL. No change is necessarily needed here, just thought I'd let you know in case this is not desirable from your point of view 👍
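A minimal sketch of the difference (the repo path and tokenizer.json file name are hypothetical placeholders):

// Works: /resolve/ responses carry Access-Control-Allow-Origin: *
await fetch("https://huggingface.co/<user>/<repo>/resolve/main/tokenizer.json");

// Fails from another origin: /raw/ responses don't carry the CORS header
await fetch("https://huggingface.co/<user>/<repo>/raw/main/tokenizer.json");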

@josephrocca
Author

@julien-c I'm wondering what the long-term plan is here for managing bandwidth costs. It'd probably be a good idea to develop a policy on this sooner rather than later - since the WebML ecosystem is growing so fast.

If I had to guess, I think bandwidth for client-side model loading from HF would go up by several orders of magnitude over the next few years, and I think it'd end up becoming too much for Hugging Face to handle. Though it may be that for popular models, some sort of edge CDN caching makes it cheap enough for you to not worry about it.

For other models (or all of them), perhaps you could work with @xenova and related library authors to show a warning in the console if a user isn't hosting their own model files.

I'd imagine that the best approach for you is to look at the Referer header (which can't be set by in-browser JavaScript) and rate-limit by that, such that if a site gets popular enough and the dev hasn't started hosting their own models, that particular site will start getting bandwidth/rate-limited. If the dev doesn't notice the warning in the console, they'll eventually notice the slower model download speeds.
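To make the idea concrete, here's a rough Node/Express sketch of per-referrer accounting (purely hypothetical; the cap, window, and response behavior are all placeholders, and HF's real infra would differ):

const express = require("express");
const app = express();

// Bytes served per referring origin (a real version would reset this per time window).
const usage = new Map();
const BYTE_CAP = 1e12; // placeholder cap per origin per window (~1 TB)

app.use((req, res, next) => {
  const referrer = req.get("referer") || "unknown";
  const origin = referrer.split("/").slice(0, 3).join("/"); // scheme://host
  if ((usage.get(origin) || 0) > BYTE_CAP) {
    // A real implementation might throttle transfer speed instead of rejecting.
    res.set("Retry-After", "60");
    return res.status(429).send("Bandwidth limit reached; please self-host model files.");
  }
  res.on("finish", () => {
    const sent = Number(res.get("content-length") || 0);
    usage.set(origin, (usage.get(origin) || 0) + sent);
  });
  next();
});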

Obviously it'd be awesome if y'all at Hugging Face could "eat" all the bandwidth costs for the good of the ecosystem, but ML models are just massive compared to other web assets, and so I don't think that's possible, long-term.

The worst case scenario would be if HF stayed quiet on this (i.e. no policies announced), and then suddenly, at some point in the future (when everyone has started to rely on it), announces a policy which causes thousands of sites/demos/etc to suddenly (effectively) "break".

@julien-c
Member

Hi @josephrocca, I can share more privately, but if you have a good CDN partner, bandwidth cost is actually very sub-linear, i.e. you hit interesting economies of scale. CDN cost is not a huge % of our compute/infra spend. So I have reasonable conviction it will stay sustainable long-term.

@josephrocca
Author

Oh, that's really great to hear! Extremely excited for the dawn of this new HF WebML ecosystem!

@josephrocca
Author

josephrocca commented Nov 24, 2023

@julien-c RE this tweet:

Apple, are you trying to bankrupt us? At ~500 MB per average model download, 90k hits (Apple's IP addresses are the 17. block) translate to ~45TB of downstream bandwidth… which starts being costly for us.

https://twitter.com/julien_c/status/1173669642629537795

Individual WebML model files can already be in the hundreds of MB (I'm using a >150 MB embedding model in a production web app), and 90k hits is almost literally nothing when we're talking about web traffic. jsDelivr gets 234 billion requests per month. You're actively working yourself into the position of being the jsDelivr of ML models, and that's obviously awesome -- so long as you're ready to handle that traffic 😳

We're probably talking about ~200,000 TB of traffic a month even if you get to only a hundredth of jsDelivr's scale, and unless we can somehow stop @xenova (I think it's too late), I think there's a decent chance you'll get to that scale within a few years, possibly much higher.
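(Back-of-envelope for that figure; the ~85 MB average payload is my assumption, not a number from this thread:)

const requestsPerMonth = 234e9 / 100;  // a hundredth of jsDelivr's ~234B requests/month
const avgPayloadMB = 85;               // assumed average model payload size
const tbPerMonth = requestsPerMonth * avgPayloadMB / 1e6; // 1 TB = 1e6 MB
console.log(Math.round(tbPerMonth));   // ≈ 198,900 TB/month, i.e. ~200,000 TB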

Are you sure it's not a good idea to put a warning in the console telling people to switch to self-hosted in production? It would be really terrible if a bunch of web sites/demos/etc. started to break (or slow down) at some point because HF's business model can't support it. But if you are still quite sure you can handle it, I'll relax and take your word for it, and thank you for helping the WebML ecosystem either way 🫡 🙏

@julien-c
Member

Note re. the tweet: it's from 2019, and the scales are very different now.

@xenova

xenova commented Nov 24, 2023

@josephrocca Just a reminder that the tweet is from 2019 :) (Sep 16, 2019), but your points (re: becoming the jsDelivr of ML models) remain valid 🔥

edit: @julien-c beat me by 1 second 😠😆

@julien-c
Member

We serve many petabytes of data these days, and perhaps a bit counter-intuitively (and thanks to our cloud partner(s)), bandwidth is not that big of a % of our infra cost. I think at this point the Internet is pretty efficient at file distribution.

@josephrocca
Author

Oh wow lol 🤦 that tweet came up in my Twitter feed today (looking into it now, I see that it was a retweet from Omar) and I didn't look at the date. Sorry to you both for the noise!

@julien-c
Member

Hehe, yeah, thanks @osanseviero I guess, and you were not alone: https://twitter.com/julien_c/status/1728143089343058115
