Feature: Add alias-backed Solr support for search indexing (ALT)#180

Draft
mdorf wants to merge 7 commits into development from feature/solrcloud-alias-indexing-claude

@mdorf mdorf commented Apr 14, 2026

Summary

Adds SolrCloud alias support to Goo, enabling zero-downtime full re-indexing of ontology search collections.

After the migration from static Solr XML schemas to dynamic SolrCloud, Goo lost the ability to index into an alternate core and swap it in. This PR restores that capability with SolrCloud aliases: models reference alias names, aliases point to versioned physical collections, and an atomic alias swap (CREATEALIAS) switches reads and writes to the new collection without downtime.

Key changes

  • Alias CRUD (solr_admin.rb): list_aliases, alias_exists?, resolve_alias, create_alias, delete_alias — all via SolrCloud Collections API
  • Two-mode connector (solr_connector.rb): init() accepts a bootstrap_collection: parameter. When the bootstrap name differs from the alias name → aliased mode (alias + physical collection, re-index capable). When same or nil → plain collection mode (no alias, backward-compatible for collections like :ontology_metadata)
  • Re-index workflow (solr_connector.rb + goo.rb):
    • create_reindex_collection(name) — creates a new collection with the correct schema
    • promote_alias(name) — atomically swaps the alias, keeps the old collection (for rollback)
    • swap_alias_and_delete_old(name) — swaps and deletes the old collection
    • Goo.create_reindex_connection, Goo.reindex_client, Goo.complete_reindex, Goo.promote_alias — module-level API
  • DSL passthrough (search.rb): enable_indexing now accepts and forwards bootstrap_collection: to the connector
  • Tests (test_solr.rb): cover alias-aware init, plain collection init, reindex collection creation, alias swap, promote, and cleanup
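
For reviewers less familiar with the Collections API, here is a minimal sketch of the admin requests the alias CRUD methods would map onto. The `AliasUrls` module, its method names, and the `http://localhost:8983` base URL are assumptions made for illustration and are not the code in solr_admin.rb. The sketch only builds URLs and sends nothing.

```ruby
require "uri"

# Hypothetical helper (not Goo's solr_admin.rb): builds Collections API URLs
# for alias management without performing any HTTP request.
module AliasUrls
  ADMIN_PATH = "/solr/admin/collections"

  def self.build(base_url, params)
    uri = URI.join(base_url, ADMIN_PATH)
    uri.query = URI.encode_www_form(params)
    uri.to_s
  end

  # LISTALIASES returns every alias and the collection(s) it points to.
  def self.list(base_url)
    build(base_url, action: "LISTALIASES", wt: "json")
  end

  # CREATEALIAS on an existing alias name re-points it, which is the swap.
  def self.create(base_url, alias_name, collection)
    build(base_url, action: "CREATEALIAS", name: alias_name, collections: collection)
  end

  def self.delete(base_url, alias_name)
    build(base_url, action: "DELETEALIAS", name: alias_name)
  end
end

puts AliasUrls.create("http://localhost:8983", "term_search", "term_search_bootstrap")
```

Because CREATEALIAS overwrites an existing alias in one request, the same call serves both first-boot alias creation and the later swap.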

How it works

# Aliased mode (term_search, prop_search):
enable_indexing(:term_search, :main, bootstrap_collection: :term_search_bootstrap)
# First boot: creates collection "term_search_bootstrap", creates alias "term_search" → "term_search_bootstrap"
# Subsequent boots: resolves alias, uses existing collection
# Re-index: create new collection → index into it → promote alias → optionally delete old

# Plain mode (ontology_metadata):
enable_indexing(:ontology_metadata)
# Creates collection "ontology_metadata" directly, no alias

Test plan

  • bundle exec ruby -Itest test/solr/test_solr.rb — alias lifecycle, reindex, promote, plain mode
  • Verify existing test/test_search.rb tests still pass
  • End-to-end: boot with aliased connectors, confirm alias creation in Solr Admin UI
  • End-to-end: create reindex collection, index data, promote alias, verify search resolves through new collection

mdorf added 7 commits April 8, 2026 16:27
Support for managing SolrCloud collection aliases, which will enable
zero-downtime full re-indexing by building a new index in an alternate
collection and atomically swapping the alias to point to it.

New methods: list_aliases, alias_exists?, resolve_alias, create_alias,
delete_alias. Includes tests for all operations including atomic
alias overwrite (the swap mechanism).

On first boot, init now creates a versioned physical collection
(e.g. term_search_v1) and an alias pointing to it, instead of
creating a collection with the bare name. Subsequent boots detect
the existing alias and skip creation.

RSolr client connects via the alias URL so reads and writes resolve
transparently. Schema operations target the physical collection.

Updates existing tests to work with alias bootstrapping and adds
tests for alias-aware init behavior.

Refactors SolrConnector init to support two modes:
- Aliased mode: when bootstrap_collection differs from alias name,
  creates a physical collection + alias. Enables re-indexing via
  create_reindex_collection and swap_alias_and_delete_old.
- Plain mode: when no bootstrap_collection given (or same name),
  creates a simple collection with no alias. Same as old behavior.

Adds Goo module-level orchestration for re-index workflow:
- Goo.create_reindex_connection(alias_name, new_collection_name)
- Goo.reindex_client(alias_name)
- Goo.complete_reindex(alias_name)

Models like :ontology_metadata that don't need re-index support
continue to work unchanged. Models like :term_search will opt in
by providing bootstrap_collection in enable_indexing.

Adds reindex_client to the model ClassMethods so scripts can access
the reindex SolrConnector through the model layer. Existing index
operations (indexClear, indexCommit, indexBatch, etc.) already accept
a connection_name parameter and can target the reindex collection.

Updates enable_indexing to accept and forward bootstrap_collection
keyword arg to Goo.add_search_connection, so models can declare
their alias and bootstrap collection names in a single call.

Enables a two-step workflow: rebuild into a new collection, then
promote later. Refactors swap_alias_and_delete_old to delegate to
promote_alias, keeping both destructive and non-destructive paths.
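
The relationship between the destructive and non-destructive paths can be illustrated with an in-memory stand-in for the cluster. `FakeCluster` and its two-argument method signatures are assumptions made for this sketch (the methods in this PR take a single name), and no Solr instance is involved.

```ruby
# In-memory stand-in for a SolrCloud cluster. Every name and signature here
# is hypothetical and only mirrors the workflow described in the commits.
class FakeCluster
  attr_reader :aliases, :collections

  def initialize
    @aliases = {}      # alias name => physical collection name
    @collections = []  # physical collection names
  end

  def create_collection(name)
    @collections << name unless @collections.include?(name)
  end

  # Non-destructive path: re-point the alias and return the previous target,
  # which stays in place so a rollback remains possible.
  def promote_alias(alias_name, new_collection)
    unless @collections.include?(new_collection)
      raise ArgumentError, "unknown collection #{new_collection}"
    end
    old = @aliases[alias_name]
    @aliases[alias_name] = new_collection
    old
  end

  # Destructive path: delegate the swap to promote_alias, then drop the
  # collection the alias used to point to.
  def swap_alias_and_delete_old(alias_name, new_collection)
    old = promote_alias(alias_name, new_collection)
    @collections.delete(old) if old && old != new_collection
    old
  end
end

cluster = FakeCluster.new
cluster.create_collection("term_search_bootstrap")
cluster.promote_alias("term_search", "term_search_bootstrap")
cluster.create_collection("term_search_v2")
cluster.swap_alias_and_delete_old("term_search", "term_search_v2")
p cluster.aliases
p cluster.collections
```

Keeping the swap in one method means both paths share the same alias update, and the destructive variant differs only in the cleanup step that follows it.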