prepare next beta
jerch committed Apr 30, 2022
1 parent 92457b5 commit de528de
Showing 3 changed files with 12 additions and 40 deletions.
48 changes: 10 additions & 38 deletions README.md
@@ -30,7 +30,7 @@ MyModel.objects.fast_update(bunch_of_instances, ['field_a', 'field_b', 'field_c'

Alternatively `fast.fast_update` can be used directly with a queryset as first argument
(Warning - this skips most sanity checks with up to 30% speed gain,
-but make sure not to feed something totally off).
+so make sure not to feed something totally off).


### Compatibility ###
@@ -44,13 +44,17 @@ but make sure not to feed something totally off).

For unsupported database backends or outdated versions `fast_update` will fall back to `bulk_update`.
(It is possible to register fast update implementations for other db vendors with `register_implementation`.
-Plz see `fast_update/fast.py` for more details.)
+See `fast_update/fast.py` for more details.)

Note that with `fast_update` f-expressions cannot be used anymore.
This is a design decision to not penalize update performance by some swiss-army-knife functionality.
If you have f-expressions in your update data, consider re-grouping the update steps and update those
fields with `update` or `bulk_update` instead.
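
One way to regroup the update steps is sketched below. The helper `split_by_expressions` is hypothetical and not part of `fast_update`. It assumes the Django convention that expression objects such as `F()` expose a `resolve_expression` method; a small stand-in class replaces Django here so the snippet runs on its own.

```python
from types import SimpleNamespace


class FakeExpression:
    """Stand-in for a Django expression such as F('counter') + 1 (assumption)."""

    def resolve_expression(self, *args, **kwargs):
        return self


def split_by_expressions(objs, fields):
    """Split objs into (plain, with_expressions).

    An object lands in the second list if any listed field holds a value
    that exposes resolve_expression, as Django expression objects do.
    """
    plain, with_expressions = [], []
    for obj in objs:
        values = (getattr(obj, name) for name in fields)
        if any(hasattr(value, "resolve_expression") for value in values):
            with_expressions.append(obj)
        else:
            plain.append(obj)
    return plain, with_expressions


objs = [
    SimpleNamespace(pk=1, field_a=10, field_b="x"),
    SimpleNamespace(pk=2, field_a=FakeExpression(), field_b="y"),
    SimpleNamespace(pk=3, field_a=30, field_b="z"),
]
plain, with_expressions = split_by_expressions(objs, ["field_a", "field_b"])
```

In a real project the `plain` group could then be passed to `fast_update` and the `with_expressions` group to `bulk_update` or `update`.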

+Other than with `bulk_update` duplicates in a changeset are not allowed and will raise.
+This is mainly a safety guard to not let slip through duplicates, where the final update state
+would be undetermined or directly depend on the database's compatibility.
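
A caller can check a changeset for duplicates before handing it over. The following is a minimal sketch, not the library's own check: `find_duplicate_pks` and `keep_last_per_pk` are hypothetical helpers, and the "last instance wins" policy is an assumption chosen only for illustration.

```python
from collections import Counter
from types import SimpleNamespace


def find_duplicate_pks(objs):
    """Return the sorted primary keys that occur more than once in objs."""
    counts = Counter(obj.pk for obj in objs)
    return sorted(pk for pk, seen in counts.items() if seen > 1)


def keep_last_per_pk(objs):
    """Deduplicate explicitly: the last instance for each pk wins (assumed policy)."""
    latest = {}
    for obj in objs:
        latest[obj.pk] = obj
    return list(latest.values())


changeset = [
    SimpleNamespace(pk=1, field_a=1),
    SimpleNamespace(pk=2, field_a=2),
    SimpleNamespace(pk=1, field_a=9),
]
duplicates = find_duplicate_pks(changeset)
deduplicated = keep_last_per_pk(changeset)
```

Resolving duplicates in application code makes the final state explicit instead of leaving it to the database.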


### copy_update ###

@@ -59,15 +63,15 @@ than `fast_update` for medium to big changesets (but tends to be slower than `fa

`copy_update` follows the same interface idea as `bulk_update` and `fast_update`, minus a `batch_size`
argument (data is always transferred in one big batch). It can be used likewise from the `FastUpdateManager`.
-`copy_update` also has no support for f-expressions.
+`copy_update` also has no support for f-expressions, also duplicates will raise.

**Note** `copy_update` will probably never leave the alpha/PoC-state, as psycopg3 brings great COPY support,
which does a more secure value conversion and has a very fast C-version.


### Status ###

-Currently beta, still some TODOs left (including better docs).
+Currently beta, still some TODOs left.

The whole package is tested with Django 3.2 & 4.0 on Python 3.8 & 3.10.

@@ -77,43 +81,11 @@ The whole package is tested with Django 3.2 & 4.0 on Python 3.8 & 3.10.
There is a management command in the example app testing performance of updates on the `FieldUpdate`
model (`./manange.py perf`).

-Here are some numbers from my laptop (tested with `settings.DEBUG=False`,
-db engines freshly bootstrapped from docker as mentioned in `settings.py`):
-
-
-| Postgres | bulk_update | fast_update | bulk/fast | copy_update | bulk/copy | fast/copy |
-|----------|-------------|--------------|-----------|-------------|-----------|-----------|
-| 10 | 0.0471 | 0.0044 | 10.7 | 0.0083 | 5.7 | 0.5 |
-| 100 | 0.4095 | 0.0222 | 18.4 | 0.0216 | 18.9 | 1.0 |
-| 1000 | 4.4909 | 0.1571 | 28.6 | 0.0906 | 49.6 | 1.7 |
-| 10000 | 86.89 | 1.49 | 58.3 | 0.70 | 124.1 | 2.1 |
-
-| SQLite | bulk_update | fast_update | ratio |
-|--------|-------------|--------------|-------|
-| 10 | 0.0443 | 0.0018 | 24.6 |
-| 100 | 0.4408 | 0.0108 | 40.8 |
-| 1000 | 4.0178 | 0.0971 | 41.4 |
-| 10000 | 40.90 | 0.97 | 42.2 |
-
-| MariaDB | bulk_update | fast_update | ratio |
-|---------|-------------|--------------|-------|
-| 10 | 0.0448 | 0.0049 | 9.1 |
-| 100 | 0.4069 | 0.0252 | 16.1 |
-| 1000 | 5.0570 | 0.1759 | 28.7 |
-| 10000 | 139.20 | 1.74 | 80.0 |
-
-| MySQL8 | bulk_update | fast_update | ratio |
-|--------|-------------|--------------|-------|
-| 10 | 0.0442 | 0.0055 | 8.0 |
-| 100 | 0.4132 | 0.0278 | 14.9 |
-| 1000 | 5.2495 | 0.2115 | 24.8 |
-| 10000 | 136.61 | 1.99 | 68.6 |
-
-
`fast_update` is at least 8 times faster than `bulk_update`, and keeps making ground for bigger changesets.
This indicates different runtime complexity. `fast_update` grows almost linear for very big numbers of rows
(tested during some perf series against `copy_update` up to 10M), while `bulk_update` grows much faster
-(looks quadratic to me, did not further investigate this).
+(looks quadratic to me, which can be lowered to linear by applying a proper `batch_size`,
+but it stays very steep compared to `fast_update`).
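
Why a fixed `batch_size` would turn quadratic growth into linear growth can be illustrated with a toy cost model. This is a sketch under a stated assumption, not a measurement: it supposes that one statement over `b` rows costs `b**2` abstract units, which mirrors the "looks quadratic" observation above and nothing more. The function `modeled_cost` is hypothetical.

```python
def modeled_cost(n_rows, batch_size=None):
    """Toy model: one statement over b rows costs b**2 units (assumption)."""
    if n_rows == 0:
        return 0
    if batch_size is None:
        # No batching: all rows go into a single statement.
        batch_size = n_rows
    full_batches, remainder = divmod(n_rows, batch_size)
    return full_batches * batch_size ** 2 + remainder ** 2


for n in (1_000, 10_000, 100_000):
    print(n, modeled_cost(n), modeled_cost(n, batch_size=100))
```

Under this model the unbatched cost grows with the square of the row count, while the batched cost is roughly `n_rows * batch_size`, so it grows linearly in the row count with a slope set by the batch size.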

For very big changesets `copy_update` is the clear winner, and even shows a substantial increase in updated rows/s
(within my test range, as upper estimate this of course cannot grow slower than linear,
2 changes: 1 addition & 1 deletion fast_update/__init__.py
@@ -1 +1 @@
-__version__ = '0.1.0'
+__version__ = '0.2.0'
2 changes: 1 addition & 1 deletion setup.py
@@ -24,7 +24,7 @@ def get_version(path):
author='netzkolchose',
author_email='j.breitbart@netzkolchose.de',
url='https://github.com/netzkolchose/django-fast-update',
-download_url='https://github.com/netzkolchose/django-fast-update/archive/v0.1.0.tar.gz',
+download_url='https://github.com/netzkolchose/django-fast-update/archive/v0.2.0.tar.gz',
keywords=['django', 'bulk_update', 'fast', 'update', 'fast_update'],
classifiers=[
'Development Status :: 4 - Beta',
