An ingestion, storage, and delivery system for meteorological observations that delivers reliability and performance, while remaining simple to understand, operate, and maintain.
TODO: Banner image?
Approaching beta, targeting summer 2025.
Lard is built around a Postgres database with two services that interact with it: one focused on ingestion, and one providing an API to access the data.
This architecture lets it scale down to run on a single machine, while also scaling up to respond to high query volume:
Here, one node takes responsibility for ingestion, using Postgres replication to sync the others. Meanwhile, the others focus on serving read-only requests from the API service, allowing read throughput to scale linearly with the number of replicas. Replicas are also able to take over from the primary in case of outages, minimising downtime.
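The split between a write-handling primary and read-only replicas can be illustrated with a minimal sketch. This is not how Lard itself routes traffic (in practice this may be handled by a load balancer or by deployment configuration); the `Cluster` type, its methods, and the host names are all hypothetical and exist only to show the idea.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical view of a deployment: one primary that accepts writes,
/// plus read-only replicas kept in sync by Postgres replication.
struct Cluster {
    primary: String,
    replicas: Vec<String>,
    next: AtomicUsize,
}

impl Cluster {
    fn new(primary: &str, replicas: &[&str]) -> Self {
        Cluster {
            primary: primary.to_string(),
            replicas: replicas.iter().map(|r| r.to_string()).collect(),
            next: AtomicUsize::new(0),
        }
    }

    /// Writes (ingestion) always go to the primary.
    fn write_target(&self) -> &str {
        &self.primary
    }

    /// Reads rotate over the replicas in round-robin order. With no
    /// replicas (a single-machine deployment) they fall back to the primary.
    fn read_target(&self) -> &str {
        if self.replicas.is_empty() {
            return &self.primary;
        }
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.replicas.len();
        &self.replicas[i]
    }
}

fn main() {
    // Host names are placeholders, not real Lard infrastructure.
    let cluster = Cluster::new("db-primary", &["db-replica-1", "db-replica-2"]);
    println!("write -> {}", cluster.write_target());
    for _ in 0..3 {
        println!("read  -> {}", cluster.read_target());
    }
}
```

Because every read lands on a replica, adding replicas adds read capacity without touching the ingestion path; the same structure with an empty replica list describes the single-machine case.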
Read throughput is not the only concern. Previous experience with database systems at Met has taught us that as a dataset grows (think past 1 billion observations), write throughput begins to slow to a problematic degree. This happens because the indexes (structures needed to speed up queries on large tables) become resource-intensive to maintain as they grow larger. In particular, the B-tree indices we use to represent time need to remain balanced, but because we always add data on one side of the tree (the present is one extreme of the time range our dataset covers), we are constantly unbalancing it, and the expense of rebalancing a tree scales with its size.
We've gotten around this by partitioning the main data table in time, breaking up the indices, while still maintaining a single logical table from the perspective of the services.
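As a rough illustration of time partitioning, the sketch below generates Postgres declarative-partitioning DDL for one partition per year. The table name `obs`, the yearly granularity, and the partition naming scheme are assumptions made for the example, not Lard's actual schema; it also assumes the parent table was created with `PARTITION BY RANGE` on a timestamp column.

```rust
/// Build the DDL for one yearly partition of a range-partitioned table.
/// `parent` is a hypothetical table name; the parent is assumed to have been
/// declared with `PARTITION BY RANGE (<timestamp column>)`.
fn yearly_partition_ddl(parent: &str, year: i32) -> String {
    format!(
        "CREATE TABLE IF NOT EXISTS {parent}_y{year:04} PARTITION OF {parent} \
         FOR VALUES FROM ('{year:04}-01-01 00:00:00+00') TO ('{next:04}-01-01 00:00:00+00');",
        next = year + 1
    )
}

fn main() {
    // Each partition carries its own, smaller indexes, so inserts for the
    // current year only touch that year's index.
    for year in 2023..=2025 {
        println!("{}", yearly_partition_ddl("obs", year));
    }
}
```

The lower bound of a range partition is inclusive and the upper bound exclusive, so consecutive years share a boundary value without overlapping. Queries against the parent table still see a single logical table.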
Deeper dives into the architecture of the components:
- TODO: Link egress architecture
- TODO: Products
- TODO: QC
TODO: Publish container image?
At Met Norway we use these Ansible playbooks to manage a VM-based deployment on our local OpenStack. They are somewhat specific to our infrastructure, but can serve as a good starting point for your own playbooks.
With Rust installed, compile the project with:
cargo build --workspace
We have integration tests that require a local Postgres instance to run. To save you from having to maintain a local Postgres, we provide a justfile that orchestrates setup and teardown in a container, and runs the tests with:
just test_all
This requires you to have Docker (or an equivalent substitute) installed.