Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using timestamps considered harmful #224

Open
milos-u opened this issue Feb 2, 2023 · 2 comments
Open

Using timestamps considered harmful #224

milos-u opened this issue Feb 2, 2023 · 2 comments

Comments

@milos-u
Copy link

milos-u commented Feb 2, 2023

What happened?

I have some real world concerns regarding cachalot design compared to old johnny cache.

Using timestamps as table generation keys (instead of uuid4 sha checksum as in johhny cache) seems to be prone to errors in production deployments.

First of all Python time.time() precision on Windows seems to be 100ns. Thus if time() is called multiple times in a row, it returns the same value.

This has severe consequences.

Simple calls like Django ORM get_or_create() can be executed so fast that the timestamp will be the same for the get() and for the create() and the table key will be set in create() to the same value like for the query in the get() (I hope it is clear what I mean).
Thus further calls to ORM get() will return the old value.

What should've happened instead?

Using timestamp as table generation value (at least on Windows) seems to be insufficient. I propose to use approach similar to jmoiron's johnny cache, i.e. uuid4() + sha1 hash.

Steps to reproduce

Django 4.x
OS Windows 10
Database PostgreSQL.
[//]: # (Include host configuration: OS, Django version, database, etc.)

@milos-u
Copy link
Author

milos-u commented Feb 2, 2023

I'm quite sure that this ticket will be closed, so just in case, if anyone is intersted, you can take a look at alternative implementation at:

master...milos-u:django-cachalot:upgrade

(sorry to include other modifications unrelated to this issue, I'm too busy to create separate branch)

the modifications above are passing test suite (except of test_missing_table_cache_key test).

I understand that my approach is much much slower, but I've no other options, to get cachalot up-and-running, reliably in production.

@milos-u
Copy link
Author

milos-u commented Feb 2, 2023

Using timestamps can be also source of various race conditions when having multiple servers. I understand FAQ recommending NTP to sync the time across servers, but this still can lead to race conditions, albeit not so easy to trigger like subsequent fast calls to get_or_create() within single process.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant