Provide the ETag header #38

Closed
3 tasks
severo opened this issue Sep 23, 2021 · 4 comments
Labels
feature request Request for a new feature

Comments

@severo
Collaborator

severo commented Sep 23, 2021

  • set and manage the ETag header to save bandwidth when the client (browser) revalidates. See https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching and https://gist.github.com/timheap/1f4d9284e4f4d4f545439577c0ca6300
        # TODO: use key for ETag? It will need to be serialized
        # key = get_rows_json.__cache_key__(
        #     dataset=dataset, config=config, split=split, num_rows=num_rows, token=request.user.token
        # )
        # print(f"key={key} in cache: {cache.__contains__(key)}")
  • ETag: add an ETag header in the response (hash of the response)
  • ETag: if the request contains an If-None-Match header, parse its ETags (beware of "weak" ETags, prefixed with W/), compare them to the ETag of the cached response, and return an empty 304 response if the client's copy is still fresh (with or without changing the TTL), or a 200 response with the content if it has changed
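
The two ETag tasks above can be sketched as follows. This is only a minimal illustration, not the project's implementation: the function names (`compute_etag`, `parse_if_none_match`, `conditional_response`) are hypothetical, the ETag is assumed to be a SHA-256 hash of the response body, and the response is modeled as a plain `(status, headers, body)` tuple rather than a real framework object.

```python
import hashlib
from typing import Dict, List, Optional, Tuple


def compute_etag(body: bytes) -> str:
    """Return a strong ETag: the quoted SHA-256 hex digest of the body."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def parse_if_none_match(header: str) -> List[str]:
    """Split an If-None-Match value into tags, dropping the weak prefix.

    If-None-Match uses weak comparison, so W/"abc" and "abc" match.
    Simplification: splitting on commas assumes the tags themselves
    contain no commas, which holds for hex digests.
    """
    tags = []
    for raw in header.split(","):
        tag = raw.strip()
        if tag.startswith("W/"):
            tag = tag[2:]
        if tag:
            tags.append(tag)
    return tags


def conditional_response(
    body: bytes, if_none_match: Optional[str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET on a cached body."""
    etag = compute_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        tags = parse_if_none_match(if_none_match)
        if "*" in tags or etag in tags:
            # The client's copy is still fresh: empty 304 response.
            return 304, headers, b""
    # No validator sent, or the content has changed: full 200 response.
    return 200, headers, body


# Usage: the first request gets the body, the revalidation gets a 304.
status, headers, payload = conditional_response(b'{"rows": []}', None)
status_again, _, _ = conditional_response(b'{"rows": []}', headers["ETag"])
```

In a real endpoint the tuple would be mapped to the web framework's response object, and the body hash could be stored alongside the cache entry instead of being recomputed on every request.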
@severo severo added the question label Sep 23, 2021
@severo severo mentioned this issue Sep 23, 2021
13 tasks
@severo severo closed this as completed Feb 4, 2022
@severo severo reopened this May 3, 2022
@severo
Collaborator Author

severo commented May 11, 2022

Note that the files under /assets are served by nginx and already have an ETag, e.g. https://datasets-server.us.dev.moon.huggingface.tech/assets/mnist/--/mnist/train/0/image/image.jpg

[Image: screenshot, 2022-05-11 at 17:13:02]

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@severo
Collaborator Author

severo commented Sep 16, 2022

The ETag could be a hash of the response + the git version of the dataset repo + the datasets library version (or, better, the worker version). Related to #545
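
A minimal sketch of that idea, assuming SHA-256 as the hash. The function name `versioned_etag` and its parameters `dataset_git_revision` and `worker_version` are hypothetical names chosen for illustration; nothing in this thread specifies how these values are obtained.

```python
import hashlib


def versioned_etag(body: bytes, dataset_git_revision: str, worker_version: str) -> str:
    """Return a quoted ETag derived from the body and both version strings."""
    digest = hashlib.sha256()
    parts = (
        body,
        dataset_git_revision.encode("utf-8"),
        worker_version.encode("utf-8"),
    )
    for part in parts:
        # Length-prefix each part so that ("ab", "c") and ("a", "bc")
        # cannot produce the same hash input.
        digest.update(len(part).to_bytes(8, "big"))
        digest.update(part)
    return '"' + digest.hexdigest() + '"'


# Usage with made-up values: any change in a component changes the ETag.
etag = versioned_etag(b'{"rows": []}', "abc123", "1.0.0")
```

With this scheme a client's cached copy is invalidated whenever the dataset repo or the worker changes, even if the response bytes happen to be identical.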

@severo severo added the feature request and keep labels and removed the question and low-priority labels Sep 16, 2022
@severo severo changed the title from "Manage the ETag header?" to "Provide the ETag header" Sep 16, 2022
@severo severo removed the keep label Sep 19, 2022
@severo
Collaborator Author

severo commented Sep 19, 2022

Moving to the internal tracker

@severo severo closed this as completed Sep 19, 2022