Link development and design documentation #5050

Merged
merged 1 commit into from Dec 15, 2022
17 changes: 10 additions & 7 deletions README.md
@@ -25,7 +25,6 @@ these other resources:
- [Timescale Community Forum](https://www.timescale.com/forum/)
- [Timescale Release Notes & Future Plans](https://tsdb.co/GitHubTimescaleReleaseNotes)


For reference and clarity, all code files in this repository reference
licensing in their header (either the Apache-2-open-source license
or [Timescale License (TSL)](https://github.com/timescale/timescaledb/blob/main/tsl/LICENSE-TIMESCALE)
@@ -140,6 +139,12 @@ To build from source, see instructions

## Resources

### Architecture documents

- [Basic TimescaleDB Features](src/README.md)
- [Advanced TimescaleDB Features](tsl/README.md)
- [Testing TimescaleDB](test/README.md)

### Useful tools

- [timescaledb-tune](https://github.com/timescale/timescaledb-tune): Helps
@@ -164,13 +169,11 @@ multiple workers.

### Releases & updates

- [Timescale Release Notes & Future
Plans](https://tsdb.co/GitHubTimescaleReleaseNotes): see planned and
- [Timescale Release Notes & Future Plans](https://tsdb.co/GitHubTimescaleReleaseNotes): see planned and
in-progress updates and detailed information about current and past
releases.
- [Subscribe to Timescale Release
Notes](https://tsdb.co/GitHubTimescaleGetReleaseNotes) to get notified about
new releases, fixes, and early access/beta programs.
releases.
- [Subscribe to Timescale Release
Notes](https://tsdb.co/GitHubTimescaleGetReleaseNotes) to get
notified about new releases, fixes, and early access/beta programs.

### Contributing

7 changes: 7 additions & 0 deletions src/README.md
@@ -0,0 +1,7 @@
# Basic TimescaleDB Features

- [TimescaleDB Abstract Data Types](adts/README.md)
- [TimescaleDB Scheduler](bgw/README.md)
- [TimescaleDB Multi-version Loader](loader/README.md)


74 changes: 40 additions & 34 deletions src/bgw/README.md
@@ -9,51 +9,54 @@ extension versions which may require different scheduler logic.
## Schedules

The scheduler allows you to set a `schedule_interval` for every job.
That defines the interval the scheduler will wait after a job finishes to start
it again, if the job is successful. If the job fails, the scheduler uses `retry_period`
in an exponential backoff to decide when to run the job again.
That defines the interval the scheduler will wait after a job finishes
to start it again, if the job is successful. If the job fails, the
scheduler uses `retry_period` in an exponential backoff to decide when
to run the job again.
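
The scheduling rule above can be sketched as follows. This is only an illustration, not the scheduler's actual code: the text does not give the backoff formula, so the doubling per consecutive failure, the `MAX_BACKOFF_MULTIPLIER` cap, and the `next_start` helper signature are all assumptions.

```python
from datetime import datetime, timedelta

# Hypothetical cap on the backoff multiplier; not taken from the scheduler source.
MAX_BACKOFF_MULTIPLIER = 32


def next_start(finish_time, succeeded, schedule_interval, retry_period,
               consecutive_failures):
    """Return when a job should run again (illustrative only)."""
    if succeeded:
        # Successful run: wait one schedule_interval after the job finished.
        return finish_time + schedule_interval
    # Failed run: assume the wait doubles with each consecutive failure,
    # starting at one retry_period and stopping at the hypothetical cap.
    failures = max(consecutive_failures, 1)
    multiplier = min(2 ** (failures - 1), MAX_BACKOFF_MULTIPLIER)
    return finish_time + retry_period * multiplier


finish = datetime(2022, 12, 15, 12, 0)
# Success: next run is one hour after the finish time.
after_success = next_start(finish, True, timedelta(hours=1), timedelta(minutes=5), 0)
# Third consecutive failure: 5 minutes * 4 = 20 minutes after the finish time.
after_failure = next_start(finish, False, timedelta(hours=1), timedelta(minutes=5), 3)
```

Under these assumptions, a job that keeps failing is retried less and less often, while a successful run resets the wait to `schedule_interval`.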

## Design

The scheduler itself is a background job that continuously runs and waits
for a time when jobs need to be scheduled. It then launches jobs as new
background workers that it controls through the background worker handle.

Aggregate statistics about a job are kept in the job stat catalog table.
These statistics include the start and finish times of the last run of the job
as well as whether or not the job succeeded. The `next_start` is used to
figure out when next to run a job after a scheduler is restarted.
Aggregate statistics about a job are kept in the job stat catalog
table. These statistics include the start and finish times of the
last run of the job as well as whether or not the job succeeded. The
`next_start` is used to figure out when next to run a job after a
scheduler is restarted.

The statistics table also tracks consecutive failures and crashes for the job
which are used for calculating the exponential backoff after a crash or failure
(which is used to set the `next_start` after the crash/failure). Note also that
there is a minimum delay between the database scheduler starting up and a crashed job
being restarted. This is to allow the operator enough time to disable the job
if needed.
The statistics table also tracks consecutive failures and crashes for
the job which are used for calculating the exponential backoff after a
crash or failure (which is used to set the `next_start` after the
crash/failure). Note also that there is a minimum delay between the
database scheduler starting up and a crashed job being restarted. This is
to allow the operator enough time to disable the job if needed.

Note that the number of crashes is an overestimate of the actual number of crashes
for a job. This is so that we are conservative and never miss a crash and fail to
use the appropriate backoff logic. There is some complexity
in ensuring that all crashes are counted. A crash in Postgres causes /all/
processes to quit immediately, so we cannot write anything to the database once
any process has crashed. Thus, we must be able to deduce that a crash occurred
from a commit that happened before any crash. We accomplish
this by committing a change to the stats table before a job starts and
undoing the change after it finishes. If a job crashed, it will be left
in an intermediate state from which we deduce that it could have been the
crashing process.
Note that the number of crashes is an overestimate of the actual
number of crashes for a job. This is so that we are conservative and
never miss a crash and fail to use the appropriate backoff
logic. There is some complexity in ensuring that all crashes are
counted. A crash in Postgres causes *all* processes to quit
immediately, so we cannot write anything to the database once
any process has crashed. Thus, we must be able to deduce that a crash
occurred from a commit that happened before any crash. We accomplish
this by committing a change to the stats table before a job starts
and undoing the change after it finishes. If a job crashed, it will be
left in an intermediate state from which we deduce that it could have
been the crashing process.
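
The mark-before-start, undo-after-finish idea can be sketched as follows. This is a simplified model under stated assumptions: `JobStats` is an in-memory stand-in for a row of the job stat table, the field and function names are hypothetical, and a Python exception stands in for a process crash (in Postgres the process would simply disappear).

```python
class SimulatedCrash(Exception):
    """Stand-in for a process crash; a real crash runs no cleanup code at all."""


class JobStats:
    """Hypothetical in-memory stand-in for one row of the job stat table."""

    def __init__(self):
        self.total_runs = 0
        self.total_crashes = 0
        self.consecutive_crashes = 0


def mark_start(stats):
    # "Committed" before the job runs: pessimistically record a crash.
    stats.total_runs += 1
    stats.total_crashes += 1
    stats.consecutive_crashes += 1


def mark_end(stats):
    # "Committed" after the job finishes: undo the pessimistic crash record.
    stats.total_crashes -= 1
    stats.consecutive_crashes = 0


def run_job(stats, job):
    mark_start(stats)
    job()  # if this crashes, mark_end never runs and the crash stays counted
    mark_end(stats)


def crashing_job():
    raise SimulatedCrash()


stats = JobStats()
run_job(stats, lambda: None)      # clean run: crash record is undone
try:
    run_job(stats, crashing_job)  # crash: the pessimistic record is left behind
except SimulatedCrash:
    pass
```

Because the crash is recorded before the job runs, any job that was running when some other process crashed is also left marked, which is why the count is an overestimate.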

## Scheduler State Machine

The scheduler implements a state machine for each job.
Each job starts in the SCHEDULED state. As soon as a job starts
it enters the STARTING state. If the scheduler determines the
job should be terminated (e.g. it has reached a timeout), it moves
the job to a TERMINATING state. Once the background worker for
a job has stopped, the job returns to the SCHEDULED state.
The states and associated transitions are as follows.
The scheduler implements a state machine for each job. Each job
starts in the `SCHEDULED` state. As soon as a job starts it enters the
`STARTING` state. If the scheduler determines the job should be
terminated (e.g. it has reached a timeout), it moves the job to the
`TERMINATING` state. Once the background worker for a job has stopped,
the job returns to the `SCHEDULED` state. The states and associated
transitions are as follows.

```
```ditaa
+---------+ +--------+
+---> |SCHEDULED+-------> |DISABLED|
| +----+----+ +--------+
@@ -70,10 +73,13 @@ The states and associated transitions are as follows.
+<-----+TERMINATING|
+-----------+
```
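
A per-job state machine like the one described can be sketched as a transition table. The diagram is only partially visible in this diff, so the table below is reconstructed from the prose and should be read as an assumption. The event names (`start`, `disable`, `timeout`, `worker_stopped`) are hypothetical labels, not identifiers from the scheduler.

```python
from enum import Enum, auto


class JobState(Enum):
    DISABLED = auto()
    SCHEDULED = auto()
    STARTING = auto()
    TERMINATING = auto()


# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (JobState.SCHEDULED, "start"): JobState.STARTING,
    (JobState.SCHEDULED, "disable"): JobState.DISABLED,
    (JobState.STARTING, "timeout"): JobState.TERMINATING,
    (JobState.STARTING, "worker_stopped"): JobState.SCHEDULED,
    (JobState.TERMINATING, "worker_stopped"): JobState.SCHEDULED,
}


def transition(state, event):
    """Return the next state, or raise ValueError for an undefined transition."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.name} on {event!r}") from None


# A job that times out: SCHEDULED -> STARTING -> TERMINATING -> SCHEDULED.
state = JobState.SCHEDULED
for event in ("start", "timeout", "worker_stopped"):
    state = transition(state, event)
```

Keeping the transitions in one table makes it easy to reject anything the diagram does not allow, such as starting a job that is `DISABLED`.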

## Limitations

This first implementation has two limitations:

- The list of jobs to be run is read from the database when the scheduler is first started.
We do not update this list if the jobs table changes.
- The list of jobs to be run is read from the database when the
scheduler is first started. We do not update this list if the jobs
table changes.
- There is no prioritization for when to run jobs.

5 changes: 5 additions & 0 deletions test/README.md
@@ -0,0 +1,5 @@
# Testing TimescaleDB

- [Regression tests](pgtest/README.md)
- [Perl-based TAP tests](perl/README.md)
- [Background Worker Test Infrastructure](src/bgw/README.md)
9 changes: 7 additions & 2 deletions tsl/README.md
@@ -1,5 +1,10 @@
## TimescaleDB TsL Library ##
## TimescaleDB TSL Library ##

The TimescaleDB TSL library is licensed under the [Timescale License](LICENSE-TIMESCALE).

- [Continuous Aggregates](src/continuous_aggs/README.md)
- [Compression](src/compression/README.md)
- [Query optimization for time series](src/nodes/README.md)

The TimescaleDB TsL library is licensed under the [Timescale License](LICENSE-TIMESCALE).


7 changes: 7 additions & 0 deletions tsl/src/nodes/README.md
@@ -0,0 +1,7 @@
# TimescaleDB Optimizations

TimescaleDB has a number of optimizations to improve performance of
query execution.

- [Skip scan](skip_scan/README.md): optimizes queries involving `DISTINCT`
- [Gapfill](gapfill/README.md): supports gapfilling time series using LOCF (last observation carried forward) and interpolation
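
For readers unfamiliar with the two gapfilling strategies named above, here is a minimal sketch of the concepts. It is not TimescaleDB's implementation: the helper names `locf` and `interpolate` are illustrative, gaps are represented as `None`, and the buckets are assumed to be evenly spaced.

```python
def locf(values):
    """Fill None gaps with the last observed value (leading gaps stay None)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def interpolate(values):
    """Fill None gaps linearly between the nearest known neighbours.

    Gaps before the first or after the last known value are left as None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


series = [1.0, None, None, 4.0, None]
carried = locf(series)        # [1.0, 1.0, 1.0, 4.0, 4.0]
linear = interpolate(series)  # [1.0, 2.0, 3.0, 4.0, None]
```

LOCF repeats the previous reading, which suits values that change in steps, while interpolation suits values that change smoothly between readings.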