Ranking of records returned by Search and List #1475

adamtagscherer (Maintainer) · 2026-05-07T19:32:10Z

dirctl search and dirctl routing search currently return matching records in an order that has no relation to relevance. This RFC proposes adding a ranking layer on top of the existing match logic, exposing the resulting score and its component signals on the wire so the ordering is explainable.

Background

How search works today

The repository exposes two search-shaped APIs that share concepts but live in different layers:

agntcy.dir.search.v1.SearchService — local, SQL-backed. Implemented in server/controller/search.go, indexed in server/database/gorm/. Used by dirctl search.
agntcy.dir.routing.v1.RoutingService.Search and .List — distributed, KV + DHT. Implemented in server/routing/. Used by dirctl routing search. Already returns a match_score per result (number of RecordQuery items that matched), but does not use it to sort.

The only ordering anywhere in the codebase is one line in server/database/gorm/record.go:

	query = query.Order("records.created_at DESC")

Concretely:

| Path | Ordering today |
| --- | --- |
| `SearchService.SearchCIDs` | `created_at DESC` (when the record was added to the local index) |
| `SearchService.SearchRecords` | `created_at DESC` |
| `RoutingService.List` (local-only) | KV iteration order over `/records/*` |
| `RoutingService.Search` (remote) | KV iteration order over `/skills/`, `/domains/`, `/modules/*` |

match_score is computed for every routing result but only attached as metadata; results are streamed in iteration order.

Available ranking signals

The following are persisted today and can drive a ranking score without any new storage or schema changes:

In the SQL index (server/database/gorm/): name, version, schema_version, oasf_created_at, authors, and a signed boolean, plus joined tables for skills, locators, modules, and domains. Verification state lives in signature_verifications and name_verifications, exposed today as the RECORD_QUERY_TYPE_TRUSTED and RECORD_QUERY_TYPE_VERIFIED filters.

Outside SQL: the DHT provider count via server/routing/handler.go::GetProviders, the locally cached /skills/.../<CID>/<peerID> keys, and the per-query match_score from server/routing/query_matching.go.

Proposal

Add a per-result ranking score, computed at query time from a small linear combination of normalized signals, and stream results in descending score order. Expose the score and its component sub-scores on the wire so the ordering is explainable in the CLI and UI.

Signals

Each signal is normalized to [0, 1]. The final score is Σ (w_i · s_i) with weights configurable in daemon.config.yaml. Defaults below are starting points for discussion.

| Signal | Computation | Default weight | Source |
| --- | --- | --- | --- |
| Query relevance | `match_score / num_queries` | 0.30 | `server/routing/query_matching.go` |
| Trust — signed | 1 if `records.signed`, else 0 | 0.10 | `server/database/gorm/record.go` |
| Trust — sig-verified | 1 if any row in `signature_verifications` has `status='verified'` | 0.10 | `server/database/gorm/signature.go` |
| Trust — name-verified | 1 if `name_verifications.status='verified'` | 0.10 | `server/database/gorm/naming.go` |
| Popularity | `min(1, providers / K)`, K configurable (default 10) | 0.15 | `server/routing/handler.go::GetProviders` / KV cache |
| Taxonomy completeness | `min(1, (#skills + #domains + #modules + #locators) / N)`, N default 8 | 0.05 | Joined tables |
| Freshness | `exp(-Δdays / τ)` on `oasf_created_at` (fallback `created_at`), τ default 365 days | 0.15 | `records.oasf_created_at` |
| Schema recency | Higher for newer schema versions, semver-compared against the latest known | 0.05 | `records.schema_version` |
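
The saturating and decaying normalizations in the table are small enough to sketch directly. The Go functions below are a minimal illustration, not the proposed server/ranking code: the function names and signatures are assumptions, and clamping negative ages (clock skew) to zero is a choice the RFC does not specify.

```go
package main

import (
	"fmt"
	"math"
)

// Popularity saturates at k providers: min(1, providers/k).
func Popularity(providers, k int) float64 {
	if k <= 0 || providers <= 0 {
		return 0
	}
	return math.Min(1, float64(providers)/float64(k))
}

// Completeness counts taxonomy entries and saturates at n.
func Completeness(skills, domains, modules, locators, n int) float64 {
	total := skills + domains + modules + locators
	if n <= 0 || total <= 0 {
		return 0
	}
	return math.Min(1, float64(total)/float64(n))
}

// Freshness decays exponentially with age: exp(-ageDays/tauDays).
// Negative ages are clamped to zero (an assumption, see above).
func Freshness(ageDays, tauDays float64) float64 {
	if tauDays <= 0 {
		return 0
	}
	if ageDays < 0 {
		ageDays = 0
	}
	return math.Exp(-ageDays / tauDays)
}

func main() {
	fmt.Printf("popularity(7 providers, K=10) = %.2f\n", Popularity(7, 10))
	fmt.Printf("completeness(4 entries, N=8)  = %.2f\n", Completeness(2, 1, 0, 1, 8))
	fmt.Printf("freshness(365 days, tau=365)  = %.3f\n", Freshness(365, 365))
}
```

With τ = 365, a record exactly one year old scores e⁻¹, roughly 0.37, and 7 providers against K = 10 gives 0.7.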

Notes:

Per-query signals are evaluated per request. Per-record signals can be cached.
Signals already exposed as filters (verified, trusted) remain hard filters; the ranking uses them as soft signals when not filtered on.
Popularity is a runtime DHT signal — it changes as peers join and leave — so it cannot be baked into a static SQL column. It must be evaluated per query.
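
As a rough illustration of the weighted sum Σ (w_i · s_i), the sketch below combines already-normalized signals using the default weights from the table. The type and function names are hypothetical, and rounding to the nearest integer on the 0..1000 scale is an assumption: the RFC only asks for a fixed-point result.

```go
package main

import (
	"fmt"
	"math"
)

// Signals holds the normalized sub-scores, each expected in [0, 1].
type Signals struct {
	QueryRelevance float64
	Signed         float64
	SigVerified    float64
	NameVerified   float64
	Popularity     float64
	Completeness   float64
	Freshness      float64
	SchemaRecency  float64
}

// Weights reuses the same field layout as Signals.
type Weights Signals

// DefaultWeights copies the default weights from the signals table.
var DefaultWeights = Weights{
	QueryRelevance: 0.30, Signed: 0.10, SigVerified: 0.10, NameVerified: 0.10,
	Popularity: 0.15, Completeness: 0.05, Freshness: 0.15, SchemaRecency: 0.05,
}

// clamp01 guards against sub-scores that drift outside [0, 1].
func clamp01(x float64) float64 {
	return math.Max(0, math.Min(1, x))
}

// Score computes the weighted sum and scales it to an integer in [0, 1000].
func Score(s Signals, w Weights) uint32 {
	sum := w.QueryRelevance*clamp01(s.QueryRelevance) +
		w.Signed*clamp01(s.Signed) +
		w.SigVerified*clamp01(s.SigVerified) +
		w.NameVerified*clamp01(s.NameVerified) +
		w.Popularity*clamp01(s.Popularity) +
		w.Completeness*clamp01(s.Completeness) +
		w.Freshness*clamp01(s.Freshness) +
		w.SchemaRecency*clamp01(s.SchemaRecency)
	return uint32(math.Round(clamp01(sum) * 1000))
}

func main() {
	// A signed, signature-verified record with 3 of 10 providers.
	s := Signals{QueryRelevance: 1, Signed: 1, SigVerified: 1, Popularity: 0.3}
	fmt.Println(Score(s, DefaultWeights)) // 300 + 100 + 100 + 45 = 545
}
```

Because the default weights sum to 1, a record with every signal at 1 lands on 1000, the top of the range.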

Ranking options

Option A — static signals only

Per-record signals only, no query_relevance term.

Pro: simplest model, smallest API change.
Con: same ordering regardless of the query the user typed. A signed, replicated record about robotics will still rank above a perfectly-matching record about Python. The whole point of search is relevance — discarding it is a steep cost.

Option B — static signals with query relevance ("hybrid ranking", recommended)

All signals above, including query_relevance.

Pro: matches user intent. Builds on match_score, which already exists. Zero new persisted state, zero schema migrations.
Con: requires sorting in memory. Routing search already collects all candidates before streaming, so the sort adds little extra cost there. SearchService switches from ORDER BY ... LIMIT in SQL to "fetch candidates → score → sort → page", with a candidate cap to bound work. Pagination becomes more involved (see the Pagination section).

Implementation

API changes

Additive proto changes — no breaking changes for existing clients.

agntcy.dir.routing.v1.SearchResponse:

message SearchResponse {
  core.v1.RecordRef record_ref = 1;
  Peer peer = 2;
  repeated RecordQuery match_queries = 3;
  uint32 match_score = 4;

  // Composite ranking score in [0, 1000] (fixed-point for cross-SDK
  // determinism). Higher = better.
  uint32 rank_score = 5;

  // Per-signal sub-scores that contributed to rank_score, for
  // explainability ("why was this first?").
  RankExplanation rank_explanation = 6;
}

message RankExplanation {
  uint32 query_relevance = 1; // 0..1000
  uint32 trust           = 2; // composite of signed + sig-verified + name-verified
  uint32 popularity      = 3;
  uint32 completeness    = 4;
  uint32 freshness       = 5;
  uint32 schema_recency  = 6;
  uint32 provider_count  = 7; // raw, not normalized
}

The same rank_score and rank_explanation are added to agntcy.dir.search.v1.SearchRecordsResponse and SearchCIDsResponse.

Both endpoints stream results in rank_score DESC order with a deterministic tie-break: created_at DESC, then cid lexicographic.
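
The ordering rule can be expressed as a single comparison function. The sketch below is one possible shape, with a hypothetical Ranked type; it assumes "cid lexicographic" means ascending byte order and stores created_at as Unix seconds to stay self-contained.

```go
package main

import (
	"fmt"
	"sort"
)

// Ranked is a hypothetical in-memory view of one scored result.
type Ranked struct {
	CID       string
	RankScore uint32
	CreatedAt int64 // Unix seconds
}

// less implements rank_score DESC, then created_at DESC, then CID ascending.
func less(a, b Ranked) bool {
	if a.RankScore != b.RankScore {
		return a.RankScore > b.RankScore
	}
	if a.CreatedAt != b.CreatedAt {
		return a.CreatedAt > b.CreatedAt
	}
	return a.CID < b.CID
}

// SortRanked orders results in place. Since less is a total order over
// distinct CIDs, the outcome does not depend on the input order.
func SortRanked(rs []Ranked) {
	sort.Slice(rs, func(i, j int) bool { return less(rs[i], rs[j]) })
}

func main() {
	rs := []Ranked{
		{CID: "bafy-b", RankScore: 500, CreatedAt: 10},
		{CID: "bafy-a", RankScore: 500, CreatedAt: 10},
		{CID: "bafy-c", RankScore: 900, CreatedAt: 1},
		{CID: "bafy-d", RankScore: 500, CreatedAt: 20},
	}
	SortRanked(rs)
	for _, r := range rs {
		fmt.Println(r.CID, r.RankScore, r.CreatedAt)
	}
}
```

Ending the comparison on the CID, which is unique per record, is what makes the order reproducible across runs and across servers.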

Server changes

A new package server/ranking/ with:

ranking.go — pure Score(record, query, signals) Result.
signals.go — adapters that pull each signal from existing types (types.Record, types.SignatureVerification, etc.).
config.go — weights and bounds.

Both controllers import this package; scoring logic is not duplicated.

`SearchService` (SQL-backed)

server/controller/search.go currently does:

	recordCIDs, err := c.db.GetRecordCIDs(filterOptions...)
	if err != nil {
		return fmt.Errorf("failed to get record CIDs: %w", err)
	}

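The new flow is GetRecords(filterOptions) (with associations) → score in server/ranking → sort → apply Limit/Offset → stream. To keep work finite, bound the candidate set to N = max(1000, limit·10) before scoring. Records outside that cap are never scored, so a high-scoring record can be missed whenever more than N records pass the filters.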

Note: even Option A cannot precompute a static rank_score SQL column, because popularity is a runtime DHT signal. Materialising a partial score (everything except popularity) and combining it with the live signal at query time is possible, but adds complexity for little benefit until profiling shows query-time scoring is a bottleneck.
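
A minimal sketch of the "score → sort → page" tail of that flow, assuming the candidates have already been fetched and scored. Scored, CandidateCap, and Page are hypothetical names, and the tie-break is reduced to CID only to keep the example short.

```go
package main

import (
	"fmt"
	"sort"
)

// Scored is a hypothetical (cid, score) pair produced by the scoring step.
type Scored struct {
	CID   string
	Score uint32
}

// CandidateCap implements N = max(1000, limit*10).
func CandidateCap(limit int) int {
	if n := limit * 10; n > 1000 {
		return n
	}
	return 1000
}

// Page sorts a copy of the candidates by score (ties by CID) and then
// applies offset and limit on the sorted slice.
func Page(cands []Scored, offset, limit int) []Scored {
	sorted := append([]Scored(nil), cands...)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].Score != sorted[j].Score {
			return sorted[i].Score > sorted[j].Score
		}
		return sorted[i].CID < sorted[j].CID
	})
	if offset < 0 {
		offset = 0
	}
	if limit <= 0 || offset >= len(sorted) {
		return nil
	}
	end := offset + limit
	if end > len(sorted) {
		end = len(sorted)
	}
	return sorted[offset:end]
}

func main() {
	cands := []Scored{{"bafy-a", 510}, {"bafy-b", 872}, {"bafy-c", 654}}
	fmt.Println(CandidateCap(5), Page(cands, 1, 1))
}
```

Sorting a copy leaves the caller's slice untouched, which keeps the function easy to test in isolation.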

`RoutingService.Search` (KV / DHT)

server/routing/routing_remote.go already collects all (cid, peer, matchScore) tuples in memory before streaming. Add the ranking pass there:

After candidates are collected, look up provider count via handler.GetProviders (or count cached /skills/.../<cid>/<peer> entries — cheaper, already local).
For per-record signals (signed, trusted, verified, completeness, freshness, schema), look up the local SQL index by CID. If the CID isn't locally indexed (purely remote), fall back to documented defaults and mark the explanation accordingly.
Sort, then stream.
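
The fallback for CIDs that are not locally indexed might look like the sketch below. Everything here is hypothetical: RecordSignals is a cut-down stand-in for the per-record signals, Lookup stands in for a query against the local SQL index, and UsedDefaults is just one way to "mark the explanation accordingly".

```go
package main

import "fmt"

// RecordSignals is a hypothetical bundle of per-record signals.
type RecordSignals struct {
	Signed       bool
	Completeness float64
	Freshness    float64
}

// Lookup stands in for a query against the local SQL index by CID.
// The boolean reports whether the CID is indexed locally.
type Lookup func(cid string) (RecordSignals, bool)

// Resolved pairs the signals with a flag telling whether defaults were used.
type Resolved struct {
	Signals      RecordSignals
	UsedDefaults bool
}

// Resolve returns locally indexed signals when available and falls back
// to the supplied defaults for purely remote CIDs.
func Resolve(cid string, lookup Lookup, defaults RecordSignals) Resolved {
	if s, ok := lookup(cid); ok {
		return Resolved{Signals: s}
	}
	return Resolved{Signals: defaults, UsedDefaults: true}
}

func main() {
	index := map[string]RecordSignals{
		"bafy-local": {Signed: true, Completeness: 0.5, Freshness: 0.9},
	}
	lookup := func(cid string) (RecordSignals, bool) {
		s, ok := index[cid]
		return s, ok
	}
	defaults := RecordSignals{} // assumed neutral defaults

	fmt.Printf("%+v\n", Resolve("bafy-local", lookup, defaults))
	fmt.Printf("%+v\n", Resolve("bafy-remote", lookup, defaults))
}
```

Carrying the flag next to the signals lets the caller surface it in the explanation without re-querying the index.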

Configuration

ranking:
  enabled: true
  weights:
    query_relevance:  0.30
    trust:            0.30   # split internally across signed / sig-verified / name-verified
    popularity:       0.15
    completeness:     0.05
    freshness:        0.15
    schema_recency:   0.05
  freshness:
    half_life_days: 365   # applied as τ in exp(-Δdays / τ), so strictly an e-folding time, not a half-life
  popularity:
    saturation_at_providers: 10

If ranking.enabled = false, the server returns results in today's order with rank_score = 0 and an empty explanation. This is the safe-rollout switch.
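
One way to mirror this block in Go and catch bad operator input at startup is sketched below. The struct layout and the Validate rules are assumptions, in particular the requirement that weights sum to 1, which the RFC implies through the [0, 1000] range but does not state. YAML decoding is left out because the standard library has no YAML parser.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// Weights mirrors ranking.weights.
type Weights struct {
	QueryRelevance, Trust, Popularity float64
	Completeness, Freshness           float64
	SchemaRecency                     float64
}

// Config mirrors the ranking block of the daemon configuration.
type Config struct {
	Enabled               bool
	Weights               Weights
	HalfLifeDays          float64
	SaturationAtProviders int
}

// Default returns the values shown in the configuration block.
func Default() Config {
	return Config{
		Enabled: true,
		Weights: Weights{
			QueryRelevance: 0.30, Trust: 0.30, Popularity: 0.15,
			Completeness: 0.05, Freshness: 0.15, SchemaRecency: 0.05,
		},
		HalfLifeDays:          365,
		SaturationAtProviders: 10,
	}
}

// Validate rejects negative weights, weights that do not sum to 1
// (assumed rule), and non-positive bounds.
func (c Config) Validate() error {
	w := c.Weights
	sum := 0.0
	for _, v := range []float64{w.QueryRelevance, w.Trust, w.Popularity,
		w.Completeness, w.Freshness, w.SchemaRecency} {
		if v < 0 {
			return errors.New("ranking: weights must be non-negative")
		}
		sum += v
	}
	if math.Abs(sum-1) > 1e-9 {
		return fmt.Errorf("ranking: weights sum to %.3f, want 1", sum)
	}
	if c.HalfLifeDays <= 0 || c.SaturationAtProviders <= 0 {
		return errors.New("ranking: freshness and popularity bounds must be positive")
	}
	return nil
}

func main() {
	cfg := Default()
	fmt.Println("default config valid:", cfg.Validate() == nil)

	cfg.Weights.Trust = 0.5
	fmt.Println("after raising trust:", cfg.Validate())
}
```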

Pagination

Once we sort by computed score, naive LIMIT/OFFSET over the SQL table doesn't work because the score isn't in the DB. Two options:

Window-and-sort. Fetch up to N = max(1000, limit·10) filtered candidates from SQL, score in Go, sort, then apply offset and limit on the sorted slice. Simple, fine while the underlying filtered set is bounded.
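Stable cursor. Replace offset with a next_page_token that encodes the sort key of the last item streamed: (rank_score, created_at, cid), matching the tie-break order. The next call says "give me results that sort strictly after (last_score, last_created_at, last_cid)". More work, but matches gRPC streaming idioms. Because popularity is evaluated per query, scores can shift between calls, so the cursor is only as stable as the scores themselves.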
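
To make the cursor option more concrete, here is a small sketch of an opaque page token and the "sorts after the cursor" check. The token format (base64 of "score:cid") and all names are assumptions for illustration. created_at is omitted to keep the sketch short; a real token would need to carry every field of the sort key.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// Cursor identifies the last item streamed on the previous page.
type Cursor struct {
	Score uint32
	CID   string
}

// Encode packs the cursor into an opaque, URL-safe token.
func (c Cursor) Encode() string {
	raw := strconv.FormatUint(uint64(c.Score), 10) + ":" + c.CID
	return base64.RawURLEncoding.EncodeToString([]byte(raw))
}

// DecodeCursor reverses Encode and rejects malformed tokens.
func DecodeCursor(token string) (Cursor, error) {
	raw, err := base64.RawURLEncoding.DecodeString(token)
	if err != nil {
		return Cursor{}, fmt.Errorf("decode page token: %w", err)
	}
	scoreStr, cid, ok := strings.Cut(string(raw), ":")
	if !ok {
		return Cursor{}, errors.New("malformed page token")
	}
	score, err := strconv.ParseUint(scoreStr, 10, 32)
	if err != nil {
		return Cursor{}, fmt.Errorf("parse score in page token: %w", err)
	}
	return Cursor{Score: uint32(score), CID: cid}, nil
}

// After reports whether (score, cid) sorts strictly after the cursor
// under rank_score DESC, then CID ascending.
func (c Cursor) After(score uint32, cid string) bool {
	if score != c.Score {
		return score < c.Score
	}
	return cid > c.CID
}

func main() {
	token := Cursor{Score: 654, CID: "bafkrei-def"}.Encode()
	cur, err := DecodeCursor(token)
	if err != nil {
		panic(err)
	}
	// A lower-ranked tie is on the next page; a higher score is not.
	fmt.Println(cur.After(654, "bafkrei-ghi"), cur.After(872, "bafkrei-abc"))
}
```

Keeping the token opaque leaves room to add fields such as created_at later without changing the client contract.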

UX

CLI

dirctl search and dirctl routing search keep their current --format cid default, so existing scripts (dirctl search ... | xargs dirctl pull ...) keep working. The ordering changes — that is the point — and the changelog calls this out.

A new --format ranked prints a richer table:

$ dirctl search --skill "python*" --format ranked --limit 5

  RANK  SCORE  CID                  NAME             VER     SIGNED  VERIFIED  TRUSTED  PROVIDERS  AGE
   1     954   bafkrei...abc        cisco/web-agent  v1.2.0  yes     yes       yes      7          3d
   2     761   bafkrei...def        acme/scraper     v0.9.1  yes     no        yes      3          22d
   3     510   bafkrei...ghi        foo/python-bot   v0.1.0  no      no        no       1          90d

Tip: re-run with --explain to see how each score was computed.

--explain adds a per-row breakdown:

   1     954   bafkrei...abc
        query_relevance: 300/300  trust: 300/300  popularity: 105/150
        completeness: 50/50       freshness: 149/150  schema_recency: 50/50

--format record keeps printing record JSON/YAML; rank_score and rank_explanation are added as optional fields.

UI

The Directory currently has a separate web GUI. This RFC does not propose UI changes, but rank_score and rank_explanation are designed to be UI-friendly: a "Why this result?" tooltip can be rendered from rank_explanation without server-side help. Coordinate with frontend before implementation lands.

musaabhasan · 2026-05-08T17:23:00Z

I like the direction, especially exposing component signals instead of returning a single opaque score.

One design point I would be careful with is cross-type comparability. A score for a skill, a domain, and a module may not mean the same thing unless the features are normalized per record type. I would keep ranking namespace-aware first, then only merge across namespaces if the response also exposes record_type, ranking_version, and component scores.

A useful scoring split could be:

query coverage: how many requested terms/fields matched,
field specificity: name and declared capability should count more than broad descriptive text,
record quality: completeness, schema validity, signature/provenance, and deprecation status,
freshness: useful within the same logical record type, but not strong enough to outrank a more specific older result globally,
availability or route quality: health, locality, and route cost, preferably as a secondary tie-breaker.

I would also make tie-breaking deterministic: score desc, specificity desc, updated_at desc, stable CID or record id asc. That makes CLI output and tests much easier to reason about.

The metadata should probably include both score and score_components, plus a no_score_reason for records that were returned by legacy paths or could not be evaluated. That lets clients adopt ranking gradually without assuming all records have equal signal quality.

1 reply

adamtagscherer May 13, 2026
Maintainer Author

Thanks for the thoughtful comment.

On the namespace-aware bit: our model ranks records, not skills vs. domains vs. modules — those are query types the user filters by, but every result in the response stream is a record. So I don't think cross-type comparability applies here. Happy to be wrong if I'm missing what you meant.

Field specificity (name matches counting more than broader-field matches) is a fair point. I'm going to keep it out of the first iteration to avoid bloating query_relevance, but it's a reasonable v2 once we have data on whether the simple count-of-matched-queries is good enough.

Deterministic tie-break: agreed, and the RFC already specifies rank_score DESC, created_at DESC, CID for exactly this reason.

The strongest idea here is surfacing why a record couldn't be fully scored. Routing search does have records where we only know the CID via the DHT and can't look up per-record signals locally — today the RFC just says "fall back to defaults and mark the explanation accordingly," which isn't precise. I'll add a score_status field (FULL, PARTIAL, DEFAULTS_ONLY) to RankExplanation so clients can distinguish those cases. Thanks for flagging it.

Skipping ranking_version for now — at our scale the config is operator-local and stable per deployment, so I don't think it carries its weight.
