core: mark custom types as cacheable to prevent warnings during queries
also use sqlalchemy.JSON type for json -- sqlite supports json since 2015, so surely it's everywhere by now
karlicoss committed Feb 4, 2023
1 parent 58b1003 commit b873b5c
Showing 2 changed files with 27 additions and 40 deletions.
30 changes: 15 additions & 15 deletions README.md
@@ -120,10 +120,10 @@ Cachew gives the best of two worlds and makes it both **easy and efficient**. Th


# How it works
Basically, your data objects get [flattened out](src/cachew/__init__.py#L438)
and python types are mapped [onto sqlite types and back](src/cachew/__init__.py#L510).
Basically, your data objects get [flattened out](src/cachew/__init__.py#L426)
and python types are mapped [onto sqlite types and back](src/cachew/__init__.py#L498).

When the function is called, cachew [computes the hash of your function's arguments ](src/cachew/__init__.py:#L837)
When the function is called, cachew [computes the hash of your function's arguments ](src/cachew/__init__.py:#L826)
and compares it against the previously stored hash value.

- If they match, it would deserialize and yield whatever is stored in the cache database
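
The hash check described in this README hunk can be sketched with the standard library alone. This is a minimal illustration, not cachew's implementation: `args_hash`, `cached_call`, and the two-table layout are hypothetical, and the "recompute and overwrite on mismatch" branch is an assumption about the part of the list that the hunk cuts off.

```python
import hashlib
import sqlite3
from typing import Callable, List


def args_hash(*args) -> str:
    # Hypothetical helper: hash the string representation of the arguments.
    return hashlib.sha256(repr(args).encode()).hexdigest()


def cached_call(db: sqlite3.Connection, func: Callable[..., List[int]], *args) -> List[int]:
    # Hypothetical layout: one table for the stored hash, one for the cached rows.
    db.execute('CREATE TABLE IF NOT EXISTS hash (value TEXT)')
    db.execute('CREATE TABLE IF NOT EXISTS data (item INTEGER)')
    new_hash = args_hash(*args)
    row = db.execute('SELECT value FROM hash').fetchone()
    if row is not None and row[0] == new_hash:
        # Hashes match: read the rows straight from the database.
        return [r[0] for r in db.execute('SELECT item FROM data ORDER BY rowid')]
    # Mismatch or empty cache (assumed behaviour): recompute and overwrite.
    result = list(func(*args))
    db.execute('DELETE FROM hash')
    db.execute('DELETE FROM data')
    db.execute('INSERT INTO hash VALUES (?)', (new_hash,))
    db.executemany('INSERT INTO data VALUES (?)', [(x,) for x in result])
    db.commit()
    return result


calls: List[int] = []


def squares(n: int) -> List[int]:
    calls.append(n)  # record every real invocation
    return [i * i for i in range(n)]


db = sqlite3.connect(':memory:')
first = cached_call(db, squares, 4)   # computed
second = cached_call(db, squares, 4)  # same arguments: served from the cache
third = cached_call(db, squares, 5)   # arguments changed: recomputed
```

After the three calls, `calls` holds only the two invocations that actually ran the function, which is the point of comparing hashes before touching the computation.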
@@ -135,18 +135,18 @@ and compares it against the previously stored hash value.



* automatic schema inference: [1](src/cachew/tests/test_cachew.py#L343), [2](src/cachew/tests/test_cachew.py#L357)
* automatic schema inference: [1](src/cachew/tests/test_cachew.py#L352), [2](src/cachew/tests/test_cachew.py#L366)
* supported types:

* primitive: `str`, `int`, `float`, `bool`, `datetime`, `date`, `dict`, `list`, `Exception`

See [tests.test_types](src/cachew/tests/test_cachew.py#L623), [tests.test_primitive](src/cachew/tests/test_cachew.py#L670), [tests.test_dates](src/cachew/tests/test_cachew.py#L583)
* [Optional](src/cachew/tests/test_cachew.py#L482) types
* [Union](src/cachew/tests/test_cachew.py#L740) types
* [nested datatypes](src/cachew/tests/test_cachew.py#L399)
* [Exceptions](src/cachew/tests/test_cachew.py#L997)
See [tests.test_types](src/cachew/tests/test_cachew.py#L632), [tests.test_primitive](src/cachew/tests/test_cachew.py#L679), [tests.test_dates](src/cachew/tests/test_cachew.py#L592)
* [Optional](src/cachew/tests/test_cachew.py#L491) types
* [Union](src/cachew/tests/test_cachew.py#L749) types
* [nested datatypes](src/cachew/tests/test_cachew.py#L408)
* [Exceptions](src/cachew/tests/test_cachew.py#L1006)

* detects [datatype schema changes](src/cachew/tests/test_cachew.py#L429) and discards old data automatically
* detects [datatype schema changes](src/cachew/tests/test_cachew.py#L438) and discards old data automatically


# Performance
@@ -157,25 +157,25 @@ During reading cache all that happens is reading rows from sqlite and mapping th
I haven't set up proper benchmarks/performance regressions yet, so don't want to make specific claims, however that would almost certainly make your program faster if computations take more than several seconds.


If you want to experiment for yourself, check out [tests.test_many](src/cachew/tests/test_cachew.py#L288)
If you want to experiment for yourself, check out [tests.test_many](src/cachew/tests/test_cachew.py#L297)



# Using
See [docstring](src/cachew/__init__.py#L701) for up-to-date documentation on parameters and return types.
See [docstring](src/cachew/__init__.py#L690) for up-to-date documentation on parameters and return types.
You can also use [extensive unit tests](src/cachew/tests/test_cachew.py) as a reference.

Some useful (but optional) arguments of `@cachew` decorator:

* `cache_path` can be a directory, or a callable that [returns a path](src/cachew/tests/test_cachew.py#L379) and depends on function's arguments.
* `cache_path` can be a directory, or a callable that [returns a path](src/cachew/tests/test_cachew.py#L388) and depends on function's arguments.

By default, `settings.DEFAULT_CACHEW_DIR` is used.

* `depends_on` is a function which determines whether your inputs have changed, and the cache needs to be invalidated.

By default it just uses string representation of the arguments, you can also specify a custom callable.

For instance, it can be used to [discard cache](src/cachew/tests/test_cachew.py#L93) if the input file was modified.
For instance, it can be used to [discard cache](src/cachew/tests/test_cachew.py#L102) if the input file was modified.

* `cls` is the type that would be serialized.
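
To make the `depends_on` idea concrete, here is a small standalone sketch of a dependency function keyed on a file's modification time. `mtime_key` is a hypothetical name and the snippet does not call cachew itself; it only shows that the key changes when the input file is modified, which is the signal that would invalidate a cache.

```python
import os
import tempfile
from pathlib import Path


def mtime_key(path: Path) -> str:
    # Hypothetical dependency function: the key changes whenever the file is modified.
    return str(path.stat().st_mtime_ns)


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / 'input.txt'

    source.write_text('v1')
    # Pin the timestamps explicitly so the example does not depend on clock resolution.
    os.utime(source, ns=(1_000_000_000, 1_000_000_000))
    key_before = mtime_key(source)

    source.write_text('v2')
    os.utime(source, ns=(2_000_000_000, 2_000_000_000))
    key_after = mtime_key(source)

# A differing key means the previously cached data should be discarded.
cache_is_stale = key_before != key_after
```

A function of this shape could be passed as `depends_on`, assuming it receives the same arguments as the decorated function; the linked test is the authoritative reference.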

@@ -260,7 +260,7 @@ Now you can use `@mcachew` in place of `@cachew`, and be certain things don't br
## Settings


[cachew.settings](src/cachew/__init__.py#L57) exposes some parameters that allow you to control `cachew` behaviour:
[cachew.settings](src/cachew/__init__.py#L58) exposes some parameters that allow you to control `cachew` behaviour:
- `ENABLE`: set to `False` if you want to disable caching without removing the decorators (useful for testing and debugging).
You can also use [cachew.extra.disabled_cachew](src/cachew/__init__.py#L18) context manager to do it temporarily.
- `DEFAULT_CACHEW_DIR`: override to set a different base directory. The default is the "user cache directory" (see [appdirs docs](https://github.com/ActiveState/appdirs#some-example-output)).
37 changes: 12 additions & 25 deletions src/cachew/__init__.py
@@ -18,6 +18,7 @@
import logging
from itertools import chain, islice
import inspect
import json
from datetime import datetime, date
import stat
import tempfile
@@ -27,7 +28,7 @@
import sys
import typing
from typing import (Any, Callable, Iterator, List, NamedTuple, Optional, Tuple,
Type, Union, TypeVar, Generic, Sequence, Iterable, Set, cast)
Type, Union, TypeVar, Generic, Sequence, Iterable, Set, Dict, cast)
import dataclasses
import warnings

@@ -79,6 +80,8 @@ class IsoDateTime(sqlalchemy.TypeDecorator):
# but practically, the difference seems to be pretty small, so perhaps fine for now
impl = sqlalchemy.String

cache_ok = True

@property
def python_type(self): return datetime

@@ -93,7 +96,7 @@ def process_bind_param(self, value: Optional[datetime], dialect) -> Optional[str
if tz is None:
return iso
try:
import pytz # type: ignore
import pytz
except ImportError:
self.warn_pytz()
return iso
@@ -132,6 +135,8 @@ def warn_pytz(self) -> None:
class IsoDate(IsoDateTime):
impl = sqlalchemy.String

cache_ok = True

@property
def python_type(self): return date

@@ -143,26 +148,6 @@ def process_result_value(self, value: Optional[str], dialect) -> Optional[date]:
return None
return res.date()

import json
from typing import Dict
class Json(sqlalchemy.TypeDecorator):
impl = sqlalchemy.String

@property
def python_type(self): return Dict

def process_literal_param(self, value, dialect): raise NotImplementedError() # make pylint happy

def process_bind_param(self, value: Optional[Dict], dialect) -> Optional[str]:
if value is None:
return None
return json.dumps(value)

def process_result_value(self, value: Optional[str], dialect) -> Optional[datetime]:
if value is None:
return None
return json.loads(value)


jtypes = (int, float, bool, type(None))
class ExceptionAdapter(sqlalchemy.TypeDecorator):
@@ -173,7 +158,9 @@ class ExceptionAdapter(sqlalchemy.TypeDecorator):
I elaborate on it here: [mypy-driven error handling](https://beepb00p.xyz/mypy-error-handling.html#kiss).
'''
impl = Json
impl = sqlalchemy.JSON

cache_ok = True

@property
def python_type(self): return Exception
@@ -207,8 +194,8 @@ def process_result_value(self, value: Optional[str], dialect) -> Optional[Except
bool : sqlalchemy.Boolean,
datetime : IsoDateTime,
date : IsoDate,
dict : Json,
list : Json,
dict : sqlalchemy.JSON,
list : sqlalchemy.JSON,
Exception: ExceptionAdapter,
}
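
For context on what the removed `Json` decorator did by hand, the sketch below reproduces its bind/result logic (`json.dumps` on the way in, `json.loads` on the way out, `None` passed through) against a plain `sqlite3` text column. The table and variable names are hypothetical. After this commit `sqlalchemy.JSON` takes over that role for `dict` and `list`; details such as how `None` is stored may differ from this hand-written version.

```python
import json
import sqlite3

db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE cache (payload TEXT)')  # hypothetical table

values = [{'a': 1, 'b': [1, 2, 3]}, ['x', 'y'], None]
for value in values:
    # Mirrors the removed process_bind_param: None stays NULL, the rest becomes JSON text.
    encoded = None if value is None else json.dumps(value)
    db.execute('INSERT INTO cache VALUES (?)', (encoded,))

# Mirrors the removed process_result_value: NULL stays None, the rest is parsed back.
restored = [
    None if text is None else json.loads(text)
    for (text,) in db.execute('SELECT payload FROM cache ORDER BY rowid')
]
```

The round trip returns values equal to the originals, which is all the old decorator provided; delegating to the built-in type removes a custom class that would otherwise also need `cache_ok` set.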

