Static files in GCS have too long cache times #116

Closed
gavento opened this issue Mar 17, 2020 · 6 comments
Labels
epimodel Tasks related to epimodel repository

Comments

@gavento
Contributor

gavento commented Mar 17, 2020

The GCS cache where the data files are stored has a default timeout of 1h, which is too long for any experiments and will also be a problem later in production.
Right now, when you update a data file (keeping the same name), the cache keeps serving the old data for 1h. Even if you delete the file in GCS, it is still served 🤷‍♂️

Individual files can be given Cache-Control: public, max-age=10, but that has to be done on every upload (otherwise we fall back into the 1h trap). Can we fix this in a better way somehow? @hnykda
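
For reference, setting the header on an already-uploaded object can be done with the google-cloud-storage Python client roughly like this (a minimal sketch; the bucket and object names below are placeholders, not our real ones):

```python
# Sketch: set a short Cache-Control on an existing GCS object.
# Bucket and object names are hypothetical placeholders.
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("epidemic-data-bucket")  # placeholder bucket name
blob = bucket.blob("data-v2.json")              # placeholder object name

blob.cache_control = "public, max-age=10"
blob.patch()  # push the metadata change to GCS
```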

@hnykda
Contributor

hnykda commented Mar 17, 2020

Up until now we hacked around that by changing the filename (that -v2 suffix). Sorry, I should have warned you: https://epidemicforecasting.slack.com/archives/C0100BY1EJZ/p1584368895020800?thread_ts=1584361377.006400 .

The reason I selected GCS is that serving the file from the BE was quite slow. Also, the GCS bucket is multi-regional, so it is fast from anywhere.

Anyway, I added setting the cache headers and making the files public in #122.

Do you think it's enough?

@hnykda hnykda mentioned this issue Mar 17, 2020
@gavento
Contributor Author

gavento commented Mar 17, 2020

Thanks!

  • 10 seconds may be too short for later loads. Maybe we should go for 60?
  • If anyone reloads the web page between a new upload and the cache-parameter update, the default 1h timeout is applied, and most (but not all, depending on region) clients will see the stale file for ~1 hour.
  • Can we trigger the pipeline without a push? (convenience)

@gavento
Contributor Author

gavento commented Mar 17, 2020

Maybe we should have an upload script that sets the cache params (along with the name, sharing, etc.)?
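
Something along these lines, perhaps (just a sketch of what such a script could look like, assuming the google-cloud-storage client; the helper name, bucket name and default max-age are made up):

```python
# Hypothetical upload helper, not our actual script: uploads a file,
# sets a short Cache-Control, and makes it publicly readable in one step.
from google.cloud import storage

def upload_public(bucket_name: str, local_path: str, object_name: str,
                  max_age: int = 60) -> str:
    client = storage.Client()
    blob = client.bucket(bucket_name).blob(object_name)
    # Set metadata before the upload so the object never sits in the
    # bucket with the default 1h caching applied.
    blob.cache_control = f"public, max-age={max_age}"
    blob.upload_from_filename(local_path)
    blob.make_public()
    return blob.public_url

# Example: upload_public("epidemic-data-bucket", "out/data.json", "data.json")
```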

@hnykda
Contributor

hnykda commented Mar 17, 2020

If anyone reloads the web page between a new upload and the cache-parameter update, the default 1h timeout is applied, and most (but not all, depending on region) clients will see the stale file for ~1 hour.

I was under the impression that we were going to update the files about once a day at most (and we are still quite far from that). And even if we did, I don't think users need to know anything about us updating the model.

Can we trigger the pipeline without a push? (convenience)

No :-( . Devs can use git commit --allow-empty -m "wakey wakey GitHub Actions" && git push 👎

Maybe we should have an upload script that sets the cache params (along with the name, sharing, etc.)?

We can, but you need the proper permissions to run it (that's why I put it in CI, where the service account is ready).

@hnykda hnykda added the epimodel (Tasks related to epimodel repository) and charting labels Mar 17, 2020
@hnykda
Contributor

hnykda commented Mar 20, 2020

I'm closing this, I don't think there is a way around this for now.

@hnykda hnykda closed this as completed Mar 20, 2020