Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generated etags stay the same after updating contents #7

Closed
ysugiura opened this issue Apr 3, 2014 · 12 comments
Closed

Generated etags stay the same after updating contents #7

ysugiura opened this issue Apr 3, 2014 · 12 comments

Comments

@ysugiura
Copy link

ysugiura commented Apr 3, 2014

I'm having a trouble to get etag decorator working. Generated etags don't change after updating the corresponding contents (responses from drf views are updated too).

By looking at the code in drf-extensions, the default behaviour of the default_etag_func seems to be calculating etags based on requests. Wouldn't it supposed to be based on responses?

Before attempting to write a custom etag_func, I would like to make sure that I'm not missing anything.

I'm testing with following versions.

  • drf-extensions==0.2.2
  • djangorestframework==2.3.13
@chibisov
Copy link
Owner

chibisov commented Apr 4, 2014

The main feature of etag calculation is prevention from heavy response calculation. It's not only about saving bandwith. Thats why i think etag calculation should be based on request data, not just taking hash function from calculated response.

I would love to help you to compose custom etag function if you described here your use cases in more details.

@ysugiura
Copy link
Author

ysugiura commented Apr 5, 2014

Thanks for the explanation and even offering me to help writing a custom etag_func. As a matter of fact, "prevention from heavy response calculation" is exactly what I want to do.

Now that I know it is the intended design, I'll try to create custom etag_func. I may ask you some questions if I encounter any problems.

Besides, I'm still wondering if default_etag_func is working. It uses DefaultKeyConstructor which purpose, I believe, is to create a cache key and has nothing to do with the data to be served.

@chibisov
Copy link
Owner

chibisov commented Apr 5, 2014

Key constructors are just the abstraction that helps to calculate keys. You can use it for cache key calculation, etag key calculation, etc. It's your choices whether to use it or not. The only their purpose is to create key string that could be use by @etag or @cache_response decorators for conditional requests or caching facilities. You can easily replace them with simple function.

DefaultKeyConstructor is fairly simple and it isn't trying to be very smart. It's constructed from unique_method_id, format and language of the request. Every such part is called a bit. You can create your custom bit by following example from documentation or use default bits for your custom use cases.

@ysugiura
Copy link
Author

ysugiura commented Apr 5, 2014

So @etag is not working out of the box?

In case you are not aware, DefaultKeyConstructor is assigned to default_etag_func.

rest_framework_extensions/utils.py:

    70: default_cache_key_func = DefaultKeyConstructor()
    ...
    74: default_etag_func = default_cache_key_func

@chibisov
Copy link
Owner

chibisov commented Apr 6, 2014

I thought from documentation it's clear enough that you should tune key construction. Maybe I should add more explicit information about it.

@ysugiura
Copy link
Author

ysugiura commented Apr 6, 2014

I admit I should have read the documentation more carefully. Anyways, I wrote a simple custom etag_func and it's working. Thanks for your time.

@chibisov chibisov closed this as completed Apr 6, 2014
@mbox
Copy link

mbox commented Oct 1, 2015

Sorry @chibisov but that is not the purpose of ETag. See https://en.wikipedia.org/wiki/HTTP_ETag - an ETag should change if the CONTENT ever changes for a resource, not if the REQUEST changes.

This default implementation is dangerously wrong - if put into use it will result in all sorts of hard-to-diagnose caching issues.

@chibisov
Copy link
Owner

chibisov commented Oct 1, 2015

@mbox but the request changes the way content rendered. Doesn't it?

@mbox
Copy link

mbox commented Oct 1, 2015

If the underlying entity has not changed, then the same request should return the same response. A different request can of course produce a different response.

So the Etag should be calculated based on the entity to be returned - I guess it would be fine to mix in things like the Language and Encoding headers when deciding if the content is the "same". But it's definitely not OK to generate the same Etag if the resource has changed.

Simple example:

1 Create a model and simple model-based APIView
2 A client requests the model via the API - server sends Etag: 12345
3 Edit the model via the admin
4 The client requests the model again, sending the Etag 12345 in the request headers.

Should the server say "hey, nothing's changed", or send back the updated data? Obviously the latter - but that only works if the Etag value calculated by the server at 2 & 4 depends on the content. However, the default etag function in drf-extensions will send a 304 response.

Say the server does the right thing, and sends Etag: 45678 along with a 200 response at 4. Now the client requests the same URL, but this time with a different Accept-Language header. If the bytes to be returned didn't change (hey, it's an API, maybe the language doesn't matter) then the Etag should still be 45678. If it did change, then the Etag is different.

Either way, the Etag is always logically calculated based on the response to be sent. It may of course be possible for the server to "cheat" rather than generate the entire response to calculate the Etag - but it should be invisible to the client.

@chibisov
Copy link
Owner

chibisov commented Oct 1, 2015

You're absolutely right. But DefaultKeyConstructor is simple and works under assumption that you'll have a stale cache only till CACHE_TIMEOUT time will come. If you need a custom cache invalidation logic just add it.

@mbox
Copy link

mbox commented Oct 2, 2015

You're right for reads that this just means a stale value will be served for CACHE_TIMEOUT seconds - which may or may not be an issue for a particular system.

However, Etags aren't just used for read implementations - they are the foundation of detecting conflicts on updates. Allowing a window for data destruction during CACHE_TIMEOUT doesn't seem like a good idea to me.

Fundamentally I think mixing the idea of server-side cache key generation and Etag generation is a mistake. Server side caching needs to calculate a cache key based on the request to look up the response data, but the Etag calculation needs to calculate a value based on the response data - which if using caching will be the cache value.

I've two suggestions for improving this:

a) Change the default etag function to simply take a hash of the response - this won't cost any more or less server-side computation than not using etags, but at least saves the network bandwidth
b) Update the documents to point out that the default etags function has the issues listed above

For model-based views, an default etag function that simply hashed the queryset or model object would work well, since it would avoid the serialization and data transfer overheads (and if the view is already using a cache to return the ORM results, then it plays nicely).

@aijogja
Copy link

aijogja commented Dec 30, 2015

I also have an issue like @mbox explanation. What is the custom etag_func that I should write to handling etag when the result was changed?

I use:
djangorestframework==3.3.1
drf-extensions==0.2.8

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants