add information on responsibilities and lifecycle of workers (#2853)
cgardens committed Apr 14, 2021
1 parent cf45cc6 commit ab9e93f
Showing 2 changed files with 28 additions and 0 deletions.
Binary file added docs/.gitbook/assets/worker-lifecycle.png
28 changes: 28 additions & 0 deletions docs/architecture/jobs.md
@@ -7,6 +7,34 @@ In Airbyte, all interactions with connectors are run as jobs performed by a Work
* Discovery worker: retrieves the schema of the source underlying a connector
* Sync worker, used to sync data between a source and destination

## Worker Responsibilities

A worker has 4 main responsibilities in its lifecycle:
1. Spin up any connector docker containers that are needed for the job.
2. Facilitate message passing to or from a connector docker container (more on this [below](#message-passing)).
3. Shut down any connector docker containers that it started.
4. Return the output of the job. (See [Airbyte Specification](./airbyte-specification.md) to understand the output of each worker type.)
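
The four responsibilities above can be sketched for the single-connector case as follows. This is a minimal illustration, not Airbyte's actual worker code: `run_single_connector_job` and `fake_connector` are hypothetical names, a plain child process stands in for a connector docker container, and the 10-second wait is an assumed value.

```python
import subprocess
import sys


def run_single_connector_job(command):
    """Hypothetical worker: runs one connector-like process and returns its output."""
    # 1. Spin up the connector (a real worker would start a docker container).
    process = subprocess.Popen(command, stdout=subprocess.PIPE, text=True)
    try:
        # 2. Facilitate message passing: collect each line emitted on STDOUT.
        messages = [line.rstrip("\n") for line in process.stdout]
    finally:
        # 3. Shut down whatever this worker started.
        process.stdout.close()
        try:
            process.wait(timeout=10)  # assumed grace period
        except subprocess.TimeoutExpired:
            process.kill()
            process.wait()
    # 4. Return the output of the job.
    return messages


# Stand-in connector: a Python one-liner that prints a single JSON message.
fake_connector = [sys.executable, "-c", "print('{\"type\": \"SPEC\"}')"]
output = run_single_connector_job(fake_connector)
```

The `try`/`finally` keeps step 3 unconditional: the process is cleaned up even if reading its output fails.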

## Message Passing

There are 2 flavors of workers:
1. Workers that interact with a single connector (e.g. spec, check, discover).
2. Workers that interact with 2 connectors (e.g. sync, reset).

In the first case, the worker generally extracts data from the connector and reports it back to the scheduler. It does this by listening to the connector's STDOUT. In the second case, the worker passes data (via record messages) from the source to the destination. It does this by listening to the source's STDOUT and writing to the destination's STDIN.
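
The two-connector case can be sketched as a loop that reads the source's STDOUT line by line and writes record messages to the destination's STDIN. This is an illustrative sketch under stated assumptions: `sync`, `SOURCE_CODE`, and `DESTINATION_CODE` are hypothetical, child Python processes stand in for the connector containers, and forwarding only `RECORD` messages is a simplification.

```python
import json
import subprocess
import sys

# Stand-in source: emits one LOG and two RECORD messages as JSON lines.
SOURCE_CODE = r'''
import json
print(json.dumps({"type": "LOG", "log": {"level": "INFO", "message": "starting"}}))
for i in range(2):
    print(json.dumps({"type": "RECORD", "record": {"stream": "users", "data": {"id": i}}}))
'''

# Stand-in destination: counts the non-empty lines it receives on STDIN.
DESTINATION_CODE = r'''
import sys
print(sum(1 for line in sys.stdin if line.strip()))
'''


def sync(source_cmd, destination_cmd):
    """Forward RECORD messages from the source's STDOUT to the destination's STDIN."""
    source = subprocess.Popen(source_cmd, stdout=subprocess.PIPE, text=True)
    destination = subprocess.Popen(
        destination_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )
    forwarded = 0
    for line in source.stdout:
        message = json.loads(line)
        if message.get("type") == "RECORD":
            destination.stdin.write(line)
            forwarded += 1
    source.stdout.close()
    source.wait()
    # communicate() flushes and closes STDIN, signalling the end of the records,
    # then collects whatever the destination printed.
    destination_output, _ = destination.communicate()
    return forwarded, destination_output.strip()


forwarded, destination_output = sync(
    [sys.executable, "-c", SOURCE_CODE],
    [sys.executable, "-c", DESTINATION_CODE],
)
```

Dropping the destination process and returning the parsed messages instead gives the single-connector variant.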

For more information on the schema of the messages that are passed, refer to [Airbyte Specification](./airbyte-specification.md).

## Worker Lifecycle

This section depicts the lifecycle of a worker. It only shows the 2-connector version; the single-connector version is the same with one side removed.

Note: When a source has passed all of its messages, the docker process should exit automatically. After a destination has received all records, it should shut down automatically. The worker gives each a grace period to shut down on its own. If that grace period expires, the worker forces a shutdown.
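
The grace-period behavior described in the note can be sketched as a wait with a timeout followed by a forced kill. This is only an illustration: `shut_down` is a hypothetical helper, the grace-period values are assumptions (the text above does not state the real duration), and child Python processes stand in for connector containers.

```python
import subprocess
import sys


def shut_down(process, grace_period_seconds):
    """Wait for the process to exit on its own; force it once the grace period expires.

    Returns True if the process had to be killed, False if it exited by itself.
    """
    try:
        process.wait(timeout=grace_period_seconds)
        return False
    except subprocess.TimeoutExpired:
        process.kill()
        process.wait()  # reap the killed process
        return True


# A process that exits immediately, and one that would hang for a minute.
well_behaved = subprocess.Popen([sys.executable, "-c", "pass"])
stuck = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

well_behaved_was_forced = shut_down(well_behaved, grace_period_seconds=10)
stuck_was_forced = shut_down(stuck, grace_period_seconds=0.5)
```

In the two-connector case, a worker would apply the same wait to both the source and the destination.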

![Worker Lifecycle](../.gitbook/assets/worker-lifecycle.png)

[Image Source](https://docs.google.com/drawings/d/1k4v_m2M5o2UUoNlYM7mwtZicRkQgoGLgb3eTOVH8QFo/edit)

See the [architecture overview](high-level-view.md) for more information about workers.

## Job State Machine