Concepts Overview
codeQ is a task queue server. It accepts work items over the network, persists them durably, hands them out to workers in priority and arrival order, tracks who owns what for how long, and stores the completion record so the original submitter can read it back. The data plane is a single Go process speaking gRPC and HTTP, backed by an embedded Pebble LSM-tree on local disk. Everything else in this section is a refinement of that sentence.
It helps to start by saying what codeQ is not. A message broker like Apache Kafka or Apache Pulsar publishes events to topics and lets consumers replay them with their own offsets. The broker does not care whether a message is acted upon; it cares that it was delivered. The consumer is responsible for tracking progress and for idempotency. codeQ inverts the contract. Every entry that enters the queue is a unit of work with a result slot reserved for it. A worker that claims a task takes ownership of that slot for the duration of a lease. The server tracks the task's lifecycle from creation to terminal status and accepts a result only from the worker that currently holds the lease; if the lease expires first, another worker is allowed to try. The unit of consumption is not "the next message" but "a task I am now responsible for".
This is a different shape of system. Brokers optimize fan-out; codeQ optimizes single-consumer ownership with retry semantics. Brokers store offsets per consumer group; codeQ stores per-task state machines. Brokers persist messages until retention expires; codeQ persists a task until it reaches a terminal status or the operator-configured retention deletes it. The vocabulary used through this section reflects that orientation: enqueue, claim, lease, heartbeat, submit, abandon, nack, dead-letter. There is no notion of subscription offsets, consumer groups, or partition assignment in the broker sense. There is only the task table, the per-(command, tenant) FIFO queue layered on top of it, and the leasing protocol that lets workers hold a task without losing the server's record of who owns it.
The pages that follow build the concept stack from the inside out. They are not a tutorial; they explain why the system is shaped the way it is. If you want to run codeQ first and read the rationale later, start with the Get Started Overview and come back here.
Tasks and Results covers the two persistent records at the centre of the system. It walks through every field of the Task struct from pkg/domain/task.go, explains the four-state lifecycle (PENDING, IN_PROGRESS, COMPLETED, FAILED) with a state diagram, and traces the two paths that mutate task state: scheduler CreateTask and results Submit.
Queue Model explains how codeQ implements FIFO ordering within a priority bucket on top of a sorted KV. It covers the key layout from internal/repository/pebble/keys.go, the visibility timeout backed by a lease, the delayed-visibility sorted set keyed by score, and the dead-letter tombstone set. It is the page to read if you want to understand why the FIFO invariant survives crashes and concurrent claims.
Sharding explains how codeQ runs N independent Pebble shards on a single node and routes every task-keyed operation by FNV-1a hash of the task ID. It covers the atomic invariant that all keys derived from a task land on one shard, the cross-shard operations (Claim fan-out, scatter-gather admin queries, delayed-list sweeps), and why four shards is the empirical sweet spot on typical hardware.
Leases and Ownership covers the in-memory lease table, the recovery-on-Open path that rebuilds it from the durable inprog index, the heartbeat race between a slow worker and the reaper, and the ownership-transfer race that the WorkerID != req.WorkerID guard in the submit path prevents.
Multi-Tenancy explains what JWT tenant isolation buys you and what it does not. It covers the key-prefix scheme that segments queues by tenantId, the in-memory token-bucket rate limiter that enforces per-tenant ceilings, and the explicit limits of in-process isolation.
The later concept pages, which belong to part 2 of this section, cover Authentication and Authorization, the Persistence Engine, Consensus and Replication for the optional Raft-replicated mode, Cluster-Level Failover, the available Deployment Modes, and an Architecture Overview that ties the data plane to the I/O surface and the operational surface.
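As a concrete anchor for the Sharding summary above, the routing rule can be sketched in a few lines. This is a minimal illustration, not codeQ's code: the function name shardFor, the 32-bit hash width, and the modulo reduction are assumptions; only the use of FNV-1a over the task ID comes from the description above.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor maps a task ID to one of n shards using 32-bit FNV-1a.
// The hash width and the modulo step are assumptions made for illustration.
func shardFor(taskID string, n uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(taskID)) // Write on a hash.Hash never returns an error
	return h.Sum32() % n
}

func main() {
	const shards = 4
	for _, id := range []string{"task-a", "task-b", "task-c"} {
		fmt.Printf("%s -> shard %d\n", id, shardFor(id, shards))
	}
}
```

Because the shard is a pure function of the task ID, every key derived from that ID can be written to the same shard, which is what makes a single-shard atomic batch possible.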
The choice of "task queue" over "broker" is not a marketing label; it changes what the server has to do. Three properties follow from it.
The first is exactly-once completion modulo idempotency. codeQ guarantees at most one in-progress claim per task at any given time, enforced by the lease. A task is IN_PROGRESS for exactly one worker until that worker submits a terminal result, abandons the claim, nacks for retry, or stops heartbeating long enough for the lease to expire and another worker to claim it. The submitter sees a single terminal record per task. Idempotency on the producer side is a one-line opt-in: pass an idempotencyKey on Enqueue, and any subsequent enqueue with the same key returns the original task ID instead of creating a second task.
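The idempotency rule can be modelled with a small in-memory sketch. The queue type, the Enqueue signature, and the task-N ID format below are hypothetical stand-ins, not codeQ's API; the only behaviour taken from the text is that a repeated key returns the original task ID.

```go
package main

import (
	"fmt"
	"sync"
)

// queue is a hypothetical in-memory stand-in for the server's task table.
type queue struct {
	mu     sync.Mutex
	nextID int
	byKey  map[string]string // idempotency key -> task ID
}

func newQueue() *queue { return &queue{byKey: make(map[string]string)} }

// Enqueue returns the task ID for this key, creating a task only on first use.
// An empty key opts out of deduplication.
func (q *queue) Enqueue(command, idempotencyKey string) (taskID string, created bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if idempotencyKey != "" {
		if id, ok := q.byKey[idempotencyKey]; ok {
			return id, false // same key: hand back the original task
		}
	}
	q.nextID++
	id := fmt.Sprintf("task-%d", q.nextID)
	if idempotencyKey != "" {
		q.byKey[idempotencyKey] = id
	}
	return id, true
}

func main() {
	q := newQueue()
	first, _ := q.Enqueue("resize-image", "order-42")
	second, created := q.Enqueue("resize-image", "order-42")
	fmt.Println(first == second, created) // true false
}
```

A producer that retries after a network timeout can therefore resend the same request without risking a duplicate task.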
The second is server-side retry and dead-letter handling. A task carries an Attempts counter and a MaxAttempts ceiling. When a worker calls Nack or when a lease expires, codeQ increments the attempt counter and requeues the task after an exponential backoff with jitter. When Attempts >= MaxAttempts, the task is moved to the dead-letter set, where operators can inspect or replay it. None of this lives in the worker. The worker says "I failed, retry in 30 seconds" and codeQ does the bookkeeping. Compare this to a broker model, where retry topology (DLQ topic, retry topic, delay buckets) is the consumer's responsibility.
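The retry decision can be sketched as follows. The base delay, the cap, the "full jitter" formula, and the order of increment and check are all assumptions chosen for illustration; the text above only states that the backoff is exponential with jitter and that a task is dead-lettered once Attempts >= MaxAttempts.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// Hypothetical tuning values; codeQ's real defaults are not stated here.
const (
	baseDelay = time.Second
	maxDelay  = 5 * time.Minute
)

type task struct {
	Attempts    int
	MaxAttempts int
}

// backoffCeiling returns min(maxDelay, baseDelay * 2^(attempt-1)) for a 1-based attempt.
func backoffCeiling(attempt int) time.Duration {
	d := baseDelay
	for i := 1; i < attempt; i++ {
		d *= 2
		if d >= maxDelay {
			return maxDelay
		}
	}
	return d
}

// backoff scales the ceiling by u, a uniform random number in [0, 1) ("full jitter").
func backoff(attempt int, u float64) time.Duration {
	return time.Duration(u * float64(backoffCeiling(attempt)))
}

// onFailure records one failed attempt (Nack or lease expiry) and decides
// whether to requeue after a delay or move the task to the dead-letter set.
func onFailure(t *task, u float64) (delay time.Duration, deadLetter bool) {
	t.Attempts++
	if t.Attempts >= t.MaxAttempts {
		return 0, true
	}
	return backoff(t.Attempts, u), false
}

func main() {
	tk := &task{MaxAttempts: 4}
	for {
		delay, dead := onFailure(tk, rand.Float64())
		if dead {
			fmt.Printf("attempt %d: moved to dead-letter set\n", tk.Attempts)
			return
		}
		fmt.Printf("attempt %d: requeue after %v (ceiling %v)\n",
			tk.Attempts, delay, backoffCeiling(tk.Attempts))
	}
}
```

The jitter spreads retries out so that many tasks failing at the same moment do not all become visible again at the same moment.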
The third is the result rendezvous. Every task has a result slot. When the producer wants the outcome it calls GetResult(taskID) and either gets the ResultRecord or a "still in progress" signal. The result is durable; producers can crash and reconnect and still read their results until retention expires. This is the property that lets codeQ act as a synchronous backend for asynchronous workloads. A web request can submit a task, poll for the result, and return when it is ready, without the client speaking to the workers directly.
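The rendezvous can be pictured as a poll loop against a result store. This is a sketch under assumptions: resultStore, awaitResult, errNotReady, and the fields of ResultRecord are hypothetical, and an in-memory map stands in for the durable store. Only the GetResult name and the "record or still in progress" contract come from the text.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

var errNotReady = errors.New("result not ready before timeout")

// ResultRecord's fields are illustrative; the real struct is not shown here.
type ResultRecord struct {
	Status string
	Output string
}

// resultStore is an in-memory stand-in for the durable result slots.
type resultStore struct {
	mu      sync.Mutex
	results map[string]ResultRecord
}

func newResultStore() *resultStore {
	return &resultStore{results: make(map[string]ResultRecord)}
}

func (s *resultStore) Put(taskID string, r ResultRecord) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.results[taskID] = r
}

// GetResult returns the record and true, or false while the task is still in progress.
func (s *resultStore) GetResult(taskID string) (ResultRecord, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	r, ok := s.results[taskID]
	return r, ok
}

// awaitResult polls GetResult every interval until a record exists or the timeout passes.
func awaitResult(s *resultStore, taskID string, timeout, every time.Duration) (ResultRecord, error) {
	deadline := time.After(timeout)
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	for {
		if r, ok := s.GetResult(taskID); ok {
			return r, nil
		}
		select {
		case <-deadline:
			return ResultRecord{}, errNotReady
		case <-ticker.C:
		}
	}
}

func main() {
	s := newResultStore()
	go func() {
		time.Sleep(20 * time.Millisecond) // stands in for a worker finishing
		s.Put("task-1", ResultRecord{Status: "COMPLETED", Output: "thumbnail.png"})
	}()
	r, err := awaitResult(s, "task-1", time.Second, 5*time.Millisecond)
	fmt.Println(r.Status, r.Output, err)
}
```

The producer only ever needs the task ID; it never learns which worker produced the record.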
These three properties are why a task queue exists as a distinct category of system. A broker can be made to look like one by writing enough application code; codeQ ships them as the protocol.
A typical interaction is one Enqueue, one Claim, zero or more Heartbeats, and one Submit. The producer enqueues a task with a command, a payload, optional priority, optional webhook, and optional idempotency key. The server returns a task ID. A worker pool calls Claim, gets a task with a lease that expires in N seconds (typically 30 to 120), and starts work. If the work outlives the lease, the worker calls Heartbeat to extend it. When work finishes, the worker calls Submit with Status=COMPLETED and a result map, or Status=FAILED with an error string. The server writes a ResultRecord, updates the task to its terminal status, drops the lease, and removes the task from the in-progress index. The producer, having held the task ID, can call GetResult at any later point to read the outcome.
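The happy path above can be traced through a toy state machine. Everything here is a hypothetical model rather than codeQ's implementation: the server type, the method signatures, the single unprioritised queue, and the logical clock in whole seconds are simplifications; the status names and the sequence of calls follow the text.

```go
package main

import (
	"errors"
	"fmt"
)

type Status string

const (
	Pending    Status = "PENDING"
	InProgress Status = "IN_PROGRESS"
	Completed  Status = "COMPLETED"
	Failed     Status = "FAILED"
)

type Task struct {
	ID          string
	Command     string
	Status      Status
	WorkerID    string
	LeaseExpiry int64 // logical clock, in seconds
}

// server is a single-queue, in-memory model with no priorities or tenants.
type server struct {
	tasks   map[string]*Task
	fifo    []string          // pending task IDs in arrival order
	results map[string]string // task ID -> result payload
}

func newServer() *server {
	return &server{tasks: map[string]*Task{}, results: map[string]string{}}
}

func (s *server) Enqueue(command string) string {
	id := fmt.Sprintf("task-%d", len(s.tasks)+1)
	s.tasks[id] = &Task{ID: id, Command: command, Status: Pending}
	s.fifo = append(s.fifo, id)
	return id
}

// Claim hands the oldest pending task to workerID with a lease of ttl seconds.
func (s *server) Claim(workerID string, now, ttl int64) (*Task, bool) {
	if len(s.fifo) == 0 {
		return nil, false
	}
	t := s.tasks[s.fifo[0]]
	s.fifo = s.fifo[1:]
	t.Status, t.WorkerID, t.LeaseExpiry = InProgress, workerID, now+ttl
	return t, true
}

// Heartbeat extends the lease for the worker that holds it.
func (s *server) Heartbeat(taskID, workerID string, now, ttl int64) error {
	t, ok := s.tasks[taskID]
	if !ok || t.Status != InProgress || t.WorkerID != workerID {
		return errors.New("no active lease for this worker")
	}
	t.LeaseExpiry = now + ttl
	return nil
}

// Submit stores the result, moves the task to a terminal status, and drops the lease.
func (s *server) Submit(taskID, workerID string, status Status, result string) error {
	t, ok := s.tasks[taskID]
	if !ok || t.Status != InProgress || t.WorkerID != workerID {
		return errors.New("no active lease for this worker")
	}
	if status != Completed && status != Failed {
		return errors.New("status must be terminal")
	}
	s.results[taskID] = result
	t.Status, t.WorkerID, t.LeaseExpiry = status, "", 0
	return nil
}

func main() {
	s := newServer()
	id := s.Enqueue("resize-image")
	t, _ := s.Claim("worker-1", 0, 30)           // lease runs to t=30
	_ = s.Heartbeat(t.ID, "worker-1", 20, 30)    // extended to t=50
	_ = s.Submit(t.ID, "worker-1", Completed, "ok")
	fmt.Println(id, s.tasks[id].Status, s.results[id]) // task-1 COMPLETED ok
}
```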
Things that can break this happy path each have a defined behavior. If the worker crashes mid-task, the lease expires and the reaper requeues the task with the next exponential-backoff delay. If the worker calls Submit but its lease has been stolen (because heartbeat traffic dropped long enough for another worker to claim), the server rejects the Submit with not-owner and the worker can abandon cleanly. If the same idempotencyKey is enqueued twice, the second Enqueue returns the same task ID without creating a second row. If the queue is empty when a worker claims, the call returns (nil, false, nil) without blocking. Each of these cases is grounded in the lease table, the FIFO key layout, and the durable batch commit on every write; the pages that follow explain each in turn.
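Three of these failure cases (lease expiry, a stale Submit, and a claim on an empty queue) can be reproduced with a minimal lease table. The leaseTable type, its methods, and the integer clock are hypothetical, and the backoff delay on requeue is deliberately omitted; the not-owner rejection and the (nil, false, nil) return are taken from the paragraph above.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotOwner = errors.New("not-owner")

type claim struct {
	TaskID    string
	WorkerID  string
	ExpiresAt int64 // logical clock, in seconds
}

// leaseTable is a toy model: a pending queue plus the current lease per task.
type leaseTable struct {
	pending []string          // task IDs waiting to be claimed, oldest first
	leases  map[string]*claim // task ID -> lease held right now
}

func newLeaseTable(ids ...string) *leaseTable {
	return &leaseTable{pending: append([]string(nil), ids...), leases: map[string]*claim{}}
}

// Claim never blocks: an empty queue yields (nil, false, nil).
func (lt *leaseTable) Claim(workerID string, now, ttl int64) (*claim, bool, error) {
	if len(lt.pending) == 0 {
		return nil, false, nil
	}
	id := lt.pending[0]
	lt.pending = lt.pending[1:]
	c := &claim{TaskID: id, WorkerID: workerID, ExpiresAt: now + ttl}
	lt.leases[id] = c
	return c, true, nil
}

// Reap requeues every task whose lease has expired and reports how many it moved.
func (lt *leaseTable) Reap(now int64) int {
	n := 0
	for id, c := range lt.leases {
		if c.ExpiresAt <= now {
			delete(lt.leases, id) // deleting during range is safe in Go
			lt.pending = append(lt.pending, id)
			n++
		}
	}
	return n
}

// Submit succeeds only for the worker that currently holds the lease.
func (lt *leaseTable) Submit(taskID, workerID string) error {
	c, ok := lt.leases[taskID]
	if !ok || c.WorkerID != workerID {
		return errNotOwner
	}
	delete(lt.leases, taskID)
	return nil
}

func main() {
	lt := newLeaseTable("task-1")
	lt.Claim("worker-1", 0, 30)  // worker-1 then stops heartbeating
	lt.Reap(31)                  // lease expired: task goes back to pending
	lt.Claim("worker-2", 31, 30) // worker-2 takes over
	stale := lt.Submit("task-1", "worker-1")
	fresh := lt.Submit("task-1", "worker-2")
	_, ok, _ := lt.Claim("worker-3", 40, 30)
	fmt.Println(stale, fresh, ok) // not-owner <nil> false
}
```

The stale worker learns from the rejection that its work is no longer wanted, so it can stop without writing a conflicting result.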
Nothing in this section is about how to wire codeQ to your stack. There are no producer SDK snippets, no Kubernetes manifests, no Prometheus dashboards. The I/O section covers wire protocols, the Observability section covers metrics and tracing, the Performance section covers measurements and tuning, and the Get Started section covers running the server. This section is the conceptual ground those other sections stand on. Read it once if you intend to operate codeQ for any nontrivial workload; refer back to it when a specific behaviour surprises you. The behaviours are not arbitrary — they fall out of the data model and the leasing protocol described here.
A note on terminology before going further. The codeQ documentation uses "task" to mean the durable unit of work and "result" to mean the durable record of its completion. It uses "worker" to mean the process that claims and executes tasks and "producer" to mean any client that submits work into the queue. The same process can play both roles — a worker can be a producer for downstream tasks it spawns — but the protocol roles are distinct because the RPCs they call are distinct. "Lease" means the time-bounded claim a worker holds on a task; "lease expiry" is the event after which the server is allowed to give the same task to a different worker. "Sharding" in this documentation refers to single-node parallelism across N Pebble instances; "cluster" refers to multi-node deployments where shards are distributed across hosts. These are two separate concepts that share a word, and the documentation is careful to qualify which sense is meant.
A second note, on scope. codeQ is a server. It does not bundle producers or workers. The reference implementation provides command-line tools and Go libraries for both roles, but the protocol surface is the gRPC and HTTP API, and any client speaking that API is a legitimate participant. Throughout this section, "worker" and "producer" are abstract roles. An operator running codeQ does not need to use the supplied tooling; they need to understand the protocol, and that is what these pages are for.
Source: github.com/osvaldoandrade/codeq.