Commit

Updated documentation for 0.2
updates_since() is now documented
simonw committed Dec 6, 2023
1 parent 8499d5e commit 1e06725
Showing 1 changed file with 43 additions and 7 deletions.
50 changes: 43 additions & 7 deletions README.md
@@ -5,7 +5,7 @@
[![Tests](https://github.com/simonw/sqlite-chronicle/workflows/Test/badge.svg)](https://github.com/simonw/sqlite-chronicle/actions?query=workflow%3ATest)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/sqlite-chronicle/blob/main/LICENSE)

Use triggers to track when rows in a SQLite table were updated or deleted, in order to synchronize that table with other databases.
Use triggers to track when rows in a SQLite table were updated or deleted.

## Installation

@@ -15,11 +15,11 @@ pip install sqlite-chronicle

## enable_chronicle(conn, table_name)

This module provides a single function: `sqlite_chronicle.enable_chronicle(conn, table_name)`, which does the following:
This module provides a function: `sqlite_chronicle.enable_chronicle(conn, table_name)`, which does the following:

1. Checks if a `_chronicle_{table_name}` table exists already. If so, it does nothing. Otherwise...
2. Creates that table, with the same primary key columns as the original table plus integer columns `added_ms`, `updated_ms`, `version` and `deleted`
3. Creates a new row in the chronicle table corresponding to every row in the original table, setting `added_ms` and `updated_ms` to the current timestamp in milliseconds, and `version` to 1.
3. Creates a new row in the chronicle table corresponding to every row in the original table, setting `added_ms` and `updated_ms` to the current timestamp in milliseconds, and setting `version` to a value that starts at 1 and increments for each subsequent row.
4. Sets up three triggers on the table:
- An after insert trigger, which creates a new row in the chronicle table, sets `added_ms` and `updated_ms` to the current time and increments the `version`
- An after update trigger, which updates the `updated_ms` timestamp and also updates any primary keys if they have changed (likely extremely rare) plus increments the `version`
@@ -35,11 +35,47 @@ The end result is a chronicle table that looks something like this:
|-----|---------------|---------|--------|---------|
| 47 | 1694408890954 | 1694408890954 | 2 | 0 |
| 48 | 1694408874863 | 1694408874863 | 3 | 1 |
| 1 | 1694408825192 | 1694408825192 | 1 | 0 |
| 2 | 1694408825192 | 1694408825192 | 1 | 0 |
| 3 | 1694408825192 | 1694408825192 | 1 | 0 |
| 1 | 1694408825192 | 1694408825192 | 4 | 0 |
| 2 | 1694408825192 | 1694408825192 | 5 | 0 |
| 3 | 1694408825192 | 1694408825192 | 6 | 0 |
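
The steps listed for `enable_chronicle()` can be approximated with nothing but the standard library's `sqlite3` module. The sketch below is not the library's implementation: the `dogs` table, the helper name `sketch_enable_chronicle`, the trigger names, the timestamp expression and the "highest version so far plus one" rule are all assumptions made for illustration, and the helper is hard-coded for a single integer primary key called `id`.

```python
import sqlite3
import time

# Assumed SQL expression for "now" as integer milliseconds since the Unix epoch.
NOW_MS = "CAST((julianday('now') - 2440587.5) * 86400000 AS INTEGER)"
# Assumed versioning rule: one more than the highest version recorded so far.
NEXT_VERSION = "COALESCE((SELECT MAX(version) FROM _chronicle_dogs), 0) + 1"


def sketch_enable_chronicle(conn):
    """Hypothetical helper, hard-coded for a `dogs` table keyed on `id`."""
    # Step 1: do nothing if the chronicle table already exists.
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master "
        "WHERE type = 'table' AND name = '_chronicle_dogs'"
    ).fetchone()
    if exists:
        return
    # Step 2: same primary key as the original table, plus tracking columns.
    conn.execute(
        "CREATE TABLE _chronicle_dogs ("
        "id INTEGER PRIMARY KEY, added_ms INTEGER, updated_ms INTEGER, "
        "version INTEGER, deleted INTEGER DEFAULT 0)"
    )
    # Step 3: backfill existing rows, numbering versions 1, 2, 3, ...
    now_ms = int(time.time() * 1000)
    ids = [r[0] for r in conn.execute("SELECT id FROM dogs ORDER BY id")]
    conn.executemany(
        "INSERT INTO _chronicle_dogs VALUES (?, ?, ?, ?, 0)",
        [(pk, now_ms, now_ms, version) for version, pk in enumerate(ids, start=1)],
    )
    # Step 4: three triggers that keep the chronicle table current.
    conn.execute(f"""
        CREATE TRIGGER _chronicle_dogs_ai AFTER INSERT ON dogs
        BEGIN
            INSERT INTO _chronicle_dogs (id, added_ms, updated_ms, version, deleted)
            VALUES (NEW.id, {NOW_MS}, {NOW_MS}, {NEXT_VERSION}, 0);
        END
    """)
    conn.execute(f"""
        CREATE TRIGGER _chronicle_dogs_au AFTER UPDATE ON dogs
        BEGIN
            UPDATE _chronicle_dogs
            SET id = NEW.id, updated_ms = {NOW_MS}, version = {NEXT_VERSION}
            WHERE id = OLD.id;
        END
    """)
    conn.execute(f"""
        CREATE TRIGGER _chronicle_dogs_ad AFTER DELETE ON dogs
        BEGIN
            UPDATE _chronicle_dogs
            SET updated_ms = {NOW_MS}, version = {NEXT_VERSION}, deleted = 1
            WHERE id = OLD.id;
        END
    """)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dogs (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO dogs VALUES (?, ?)", [(1, "Cleo"), (2, "Pancakes")])
sketch_enable_chronicle(conn)

conn.execute("INSERT INTO dogs VALUES (3, 'Mango')")
conn.execute("UPDATE dogs SET name = 'Cleopaws' WHERE id = 1")
conn.execute("DELETE FROM dogs WHERE id = 2")

rows = conn.execute(
    "SELECT id, version, deleted FROM _chronicle_dogs ORDER BY version"
).fetchall()
```

In this sketch a row keeps a single chronicle entry for its whole life, so the deleted row `2` is still present afterwards, flagged with `deleted = 1` and carrying the highest version.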

## Applications
## updates_since(conn, table_name, since=None, batch_size=1000)

The `sqlite_chronicle.updates_since()` function returns a generator that yields `Change` objects.

These objects represent changes that have occurred to rows in the table since the `since` version number, or since the beginning of time if `since` is not provided.

- `conn` is a SQLite connection object
- `table_name` is a string containing the name of the table to get changes for
- `since` is an optional integer version number - if not provided, all changes will be returned
- `batch_size` is an internal detail, controlling the number of rows that are returned from the database at a time. You should not need to change this as the function implements its own internal pagination.

Each `Change` returned from the generator looks something like this:

```python
Change(
    pks=(5,),
    added_ms=1701836971223,
    updated_ms=1701836971223,
    version=5,
    row={'id': 5, 'name': 'Simon'},
    deleted=0
)
```

A `Change` is a dataclass with the following properties:

- `pks` is a tuple of the primary key values for the row - this will be a tuple with a single item for normal primary keys, or multiple items for compound primary keys
- `added_ms` is the timestamp in milliseconds when the row was added
- `updated_ms` is the timestamp in milliseconds when the row was last updated
- `version` is the version number for the row - you can use this as a `since` value to get changes since that point
- `row` is a dictionary containing the current values for the row - these will be `None` if the row has been deleted (except for the primary keys)
- `deleted` is `0` if the row has not been deleted, or `1` if it has been deleted

Any time you call this you should track the last `version` number that you see, so you can pass it as the `since` value in future calls to get changes that occurred since that point.

Note that if a row had multiple updates in between calls to this function you will still only see one `Change` object for that row - the `updated_ms` and `version` will reflect the most recent update.
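
One way to follow the advice about tracking the last `version` is to fold each batch of changes into a local copy and return the highest version seen. The sketch below is illustrative only: `FakeChange` is a hypothetical stand-in that mirrors the documented `Change` fields so the example runs without the library, and `apply_changes` and the in-memory `replica` dictionary are assumed names, not part of `sqlite_chronicle`.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional, Tuple


@dataclass
class FakeChange:
    """Hypothetical stand-in with the same fields as the documented Change."""
    pks: Tuple[Any, ...]
    added_ms: int
    updated_ms: int
    version: int
    row: Dict[str, Any]
    deleted: int


def apply_changes(
    changes: Iterable[Any],
    replica: Dict[Tuple[Any, ...], Dict[str, Any]],
    last_version: Optional[int],
) -> Optional[int]:
    """Apply changes to an in-memory replica and return the newest version seen."""
    for change in changes:
        if change.deleted:
            # Deleted rows are dropped from the replica.
            replica.pop(change.pks, None)
        else:
            # Inserts and updates both overwrite the replica's copy of the row.
            replica[change.pks] = change.row
        if last_version is None or change.version > last_version:
            last_version = change.version
    return last_version


replica: Dict[Tuple[Any, ...], Dict[str, Any]] = {}

# First call: no `since` value yet, so every row arrives.
first_batch = [
    FakeChange((1,), 1701836971223, 1701836971223, 1, {"id": 1, "name": "Cleo"}, 0),
    FakeChange((2,), 1701836971223, 1701836971223, 2, {"id": 2, "name": "Pancakes"}, 0),
]
last_version = apply_changes(first_batch, replica, None)

# Later call: only changes newer than `last_version` arrive; row 2 was deleted.
second_batch = [
    FakeChange((2,), 1701836971223, 1701836999000, 3, {"id": 2, "name": None}, 1),
]
last_version = apply_changes(second_batch, replica, last_version)
```

With the real library, the iterable passed to `apply_changes` would presumably be the generator returned by `updates_since(conn, table_name, since=last_version)`, and `last_version` would be stored somewhere durable between runs.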

## Potential applications

Chronicle tables can be used to efficiently answer the question "what rows have been inserted, updated or deleted since I last checked?" by querying the `version` column, which is indexed to make that lookup fast.
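
As a rough illustration of that lookup, the sketch below builds a chronicle-shaped table by hand with the standard library's `sqlite3` module and queries it by version. The table name `_chronicle_dogs`, the index name and the helper `changed_since` are assumptions for this example; the rows reuse the sample values from the table shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE _chronicle_dogs ("
    "id INTEGER PRIMARY KEY, added_ms INTEGER, updated_ms INTEGER, "
    "version INTEGER, deleted INTEGER)"
)
# Assumed index name; the point is only that `version` is indexed.
conn.execute("CREATE INDEX _chronicle_dogs_version ON _chronicle_dogs (version)")
conn.executemany(
    "INSERT INTO _chronicle_dogs VALUES (?, ?, ?, ?, ?)",
    [
        (47, 1694408890954, 1694408890954, 2, 0),
        (48, 1694408874863, 1694408874863, 3, 1),
        (1, 1694408825192, 1694408825192, 4, 0),
        (2, 1694408825192, 1694408825192, 5, 0),
        (3, 1694408825192, 1694408825192, 6, 0),
    ],
)


def changed_since(conn, since):
    """Return (id, version, deleted) for rows changed after version `since`."""
    return conn.execute(
        "SELECT id, version, deleted FROM _chronicle_dogs "
        "WHERE version > ? ORDER BY version",
        (since,),
    ).fetchall()


recent = changed_since(conn, 4)
```

Because the filter and the ordering both use `version`, a caller only needs to remember one integer between checks.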
