Use Cases

Example Use Cases for Orbiteer

Large, Live Data Migration

Imagine a data migration that must be performed on a live, running database. If the table in question is very large, attempting the migration as a single query might lock the table for an unacceptably long time. The table could instead be processed in uniformly sized chunks, which may be acceptable as long as the rest of the system's load is stable and every chunk requires roughly the same amount of processing.

However, if those conditions do not hold, and system load rises or some chunks require more processing than others, the chosen chunk size may prove too large.

Orbiteer solves this by making chunk sizes flexible. By measuring how long each chunk's migration takes, Orbiteer can adjust the size of the next chunk to hit a target runtime. Interference can still occur and an individual chunk may occasionally run long, but Orbiteer keeps the process as a whole from stalling or contributing to a load cascade.

Alternatively, the chosen chunk size may turn out to be too small. While this is unlikely to cause problems directly, it is inefficient and incurs more per-chunk overhead than necessary. Orbiteer measures the faster-than-expected execution times and increases the chunk size to take bigger bites of the work.
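
To make the idea concrete, here is a minimal Python sketch of this feedback loop. It is not Orbiteer's actual API: the `migrate_rows` callable, the 2-second target, and the clamping bounds are all assumptions made for the example. Each chunk's runtime is measured and the next chunk is scaled toward the target duration.

```python
import time

TARGET_SECONDS = 2.0              # desired runtime per chunk (assumed)
MIN_CHUNK, MAX_CHUNK = 100, 50_000

def migrate_chunked(migrate_rows, total_rows, chunk_size=1_000):
    """Process total_rows in chunks, resizing each chunk toward TARGET_SECONDS.

    migrate_rows(offset, count) is a hypothetical callable that migrates
    `count` rows starting at `offset`.
    """
    offset = 0
    while offset < total_rows:
        count = min(chunk_size, total_rows - offset)
        start = time.monotonic()
        migrate_rows(offset, count)
        elapsed = time.monotonic() - start
        offset += count

        # Scale the next chunk proportionally: shrink it if this one ran long,
        # grow it if this one finished early, clamped to sane bounds.
        if elapsed > 0:
            chunk_size = int(count * TARGET_SECONDS / elapsed)
            chunk_size = max(MIN_CHUNK, min(MAX_CHUNK, chunk_size))
```

With a proportional rule like this, a chunk that takes twice the target time roughly halves the next chunk, while a chunk that finishes in half the target roughly doubles it.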

Querying Data with a Timeout

Imagine querying an API for time-based data where the API enforces a 30-second timeout. Queries covering a short time period work fine against this API, but large time periods tend to time out, and the API does not provide pagination to prevent this. The naive solution is to split the large time period into several roughly equal chunks, query each in turn, and concatenate the results.

However, if those chunks are still too large, or their queries run close to the timeout threshold, errors can occur and the process has to be manually inspected, fixed, and restarted.

Orbiteer solves this by setting a target duration somewhat under the limit and adjusting chunk sizes based on live feedback about the API's current performance. If the API comes under heavy load partway through, Orbiteer adapts the chunk sizes to keep the process on track to complete successfully.
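
As a rough illustration of what this adaptive windowing might look like (again, a sketch rather than Orbiteer's implementation), the following Python example fetches a long time range in windows that are resized toward a 25-second target, comfortably under the 30-second timeout. The `fetch_window(start, end)` client call, the target, and the window bounds are hypothetical.

```python
import time
from datetime import datetime, timedelta

TARGET_SECONDS = 25.0                  # aim comfortably under the 30 s timeout
MIN_WINDOW = timedelta(minutes=1)
MAX_WINDOW = timedelta(days=7)

def fetch_range(fetch_window, start: datetime, end: datetime,
                window: timedelta = timedelta(hours=6)):
    """Fetch [start, end) in adaptively sized windows and concatenate results.

    fetch_window(window_start, window_end) is a hypothetical API call that
    returns a list of records or raises if the request times out.
    """
    results = []
    cursor = start
    while cursor < end:
        window_end = min(cursor + window, end)
        began = time.monotonic()
        results.extend(fetch_window(cursor, window_end))
        elapsed = time.monotonic() - began
        fetched = window_end - cursor
        cursor = window_end

        # Resize the next window toward the target duration, within bounds.
        if elapsed > 0:
            window = fetched * (TARGET_SECONDS / elapsed)
            window = max(MIN_WINDOW, min(MAX_WINDOW, window))
    return results
```

If the API slows down under load, each window that takes longer than the target shrinks the next one, keeping individual requests under the timeout; when the API speeds up again, the windows grow back and the overall job finishes sooner.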