AlexSkrypnyk (Member) commented Nov 17, 2025

Closes #2089

Summary by CodeRabbit

  • New Features

    • Added Estonian language resources: tokenization, stopwords, synonyms, accent mappings, spellchecker and suggester support.
  • Improvements

    • Upgraded search stack to Solr 9 (new Lucene version) and improved Ukrainian stemming.
    • Switched to a modern cache backend and enabled remote/stream streaming options.
    • Added circuit-breaker and other updated search defaults.
  • Documentation

    • Updated Docker and local dev docs for Solr 9 and added dependency wait-for script; clarified maintenance permission guidance.

coderabbitai bot commented Nov 17, 2025

Warning

Rate limit exceeded

@AlexSkrypnyk has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 1 minute and 55 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 7b690e1 and 7fa01c7.

⛔ Files ignored due to path filters (16)
  • .vortex/installer/tests/Fixtures/install/_baseline/.docker/config/solr/config-set/accents_et.txt is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/_baseline/.docker/config/solr/config-set/nouns_et.txt is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/_baseline/.docker/config/solr/config-set/protwords_et.txt is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/_baseline/.docker/config/solr/config-set/stopwords_et.txt is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/_baseline/.docker/config/solr/config-set/synonyms_et.txt is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/_baseline/.docker/solr.dockerfile is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/services_no_solr/.docker/config/solr/config-set/-accents_et.txt is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/services_no_solr/.docker/config/solr/config-set/-nouns_et.txt is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/services_no_solr/.docker/config/solr/config-set/-protwords_et.txt is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/services_no_solr/.docker/config/solr/config-set/-stopwords_et.txt is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/services_no_solr/.docker/config/solr/config-set/-synonyms_et.txt is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/services_none/.docker/config/solr/config-set/-accents_et.txt is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/services_none/.docker/config/solr/config-set/-nouns_et.txt is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/services_none/.docker/config/solr/config-set/-protwords_et.txt is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/services_none/.docker/config/solr/config-set/-stopwords_et.txt is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/services_none/.docker/config/solr/config-set/-synonyms_et.txt is excluded by !.vortex/installer/tests/Fixtures/**
📒 Files selected for processing (18)
  • .docker/config/solr/config-set/accents_et.txt (1 hunks)
  • .docker/config/solr/config-set/nouns_et.txt (1 hunks)
  • .docker/config/solr/config-set/protwords_et.txt (1 hunks)
  • .docker/config/solr/config-set/schema.xml (8 hunks)
  • .docker/config/solr/config-set/schema_extra_fields.xml (1 hunks)
  • .docker/config/solr/config-set/schema_extra_types.xml (4 hunks)
  • .docker/config/solr/config-set/solrconfig.xml (10 hunks)
  • .docker/config/solr/config-set/solrconfig_extra.xml (2 hunks)
  • .docker/config/solr/config-set/solrconfig_query.xml (1 hunks)
  • .docker/config/solr/config-set/solrcore.properties (1 hunks)
  • .docker/config/solr/config-set/stopwords_et.txt (1 hunks)
  • .docker/config/solr/config-set/synonyms_et.txt (1 hunks)
  • .docker/solr.dockerfile (1 hunks)
  • .vortex/CLAUDE.md (1 hunks)
  • .vortex/docs/content/getting-started/features.mdx (1 hunks)
  • .vortex/docs/content/tools/docker.mdx (1 hunks)
  • .vortex/docs/content/workflows/notifications.mdx (0 hunks)
  • .vortex/docs/cspell.json (2 hunks)

Walkthrough

Migrates Solr to version 9, adds Estonian language resources and analyzers, converts many dynamic fields to plural/multiValued types, swaps caches to Caffeine, enables remote streaming and circuit-breaker config, updates Docker image and Solr core properties, and adds spellcheck/suggester configs for Estonian.

Changes

Cohort / File(s) Summary
Estonian linguistic resources
.docker/config/solr/config-set/accents_et.txt, .docker/config/solr/config-set/nouns_et.txt, .docker/config/solr/config-set/protwords_et.txt, .docker/config/solr/config-set/stopwords_et.txt, .docker/config/solr/config-set/synonyms_et.txt
Add Estonian static resources: accent-to-ASCII mapping, noun file (empty), protected words (empty/placeholder), comprehensive stopwords list, and a small synonyms file.
Core schema
.docker/config/solr/config-set/schema.xml
Bump schema name/version, set _root_ stored="true", add _nest_path_ NestPathField, add plural/multiValued field types (strings, booleans, pints/pfloats/plongs/pdoubles, date_ranges), convert many dynamic fields to plural/multi-valued variants, and enable storeOffsetsWithPositions on text_ws.
Extra schema field types
.docker/config/solr/config-set/schema_extra_types.xml
Add text_et, text_unstemmed_et, and collated_et fieldTypes (Estonian analyzers with MappingCharFilter, EdgeNGram, DictionaryCompoundWordTokenFilter, etc.); replace Snowball Russian stemmer with Morfologik for Ukrainian fields and bump their version markers.
Extra schema fields
.docker/config/solr/config-set/schema_extra_fields.xml
Add Estonian dynamicField patterns (ts_, tm_, tos_, tom_, tus_, tum_, sort_*) referencing the new Estonian types.
Solr main config
.docker/config/solr/config-set/solrconfig.xml
Update config name/version, bump luceneMatchVersion to LUCENE_90, add circuitBreaker blocks (commented options), add numVersionBuckets to updateLog, enableRemoteStreaming and enableStreamBody, and remove XSLT response writer.
Query config / caches
.docker/config/solr/config-set/solrconfig_query.xml
Replace LRU/FastLRU cache implementations with CaffeineCache for documentCache, fieldValueCache, filterCache, perSegFilter, and queryResultCache; update version annotations.
Spellcheck & suggest
.docker/config/solr/config-set/solrconfig_extra.xml
Add Estonian spellchecker configuration and Estonian suggester configuration (AnalyzingInfixLookupFactory, text_et analyzer, contextField sm_context_tags).
Core properties
.docker/config/solr/config-set/solrcore.properties
Add Estonian files to replication confFiles (stopwords_et, nouns_et, protwords_et, accents_et) and bump solr.luceneMatchVersion to 9.10.0.
Docker image & startup tweaks
.docker/solr.dockerfile
Upgrade base image to solr-9-drupal:25.10.0; add RUN edits to solrconfig/solrcore.properties changing dataDir, disable native locking (solr.lock.type=none), and adjust autoSoftCommit.MaxTime.
Docs & tooling
.vortex/CLAUDE.md, .vortex/docs/content/getting-started/features.mdx, .vortex/docs/content/tools/docker.mdx, .vortex/docs/content/workflows/notifications.mdx, .vortex/docs/cspell.json
Update documentation to reference Solr 9 and new service details, add wait-for-dependencies service docs, expand CLAUDE.md permission guidance, minor formatting edits, and cspell updates (add “langid”, “uselagoon”; remove “redis”).

Sequence Diagram(s)

sequenceDiagram
  participant Build as Dockerfile
  participant Solr as SolrContainer
  participant Config as ConfigSet
  participant Analyzer as TextAnalyzers
  participant Indexer as Spellcheck/Suggester
  participant Query as QueryPath

  rect rgba(200,230,255,0.4)
  Build->>Solr: Start container from solr-9-drupal image
  Solr->>Config: Load config-set from .docker/config/solr/config-set/
  end

  rect rgba(220,255,220,0.25)
  Config->>Analyzer: Register new fieldTypes (text_et, collated_et, ...)
  Config->>Indexer: Register spellcheck/suggest components (et)
  Config->>Query: Apply cache backend -> CaffeineCache
  Config->>Query: Enable remote streaming & circuitBreaker config
  end

  rect rgba(255,245,200,0.25)
  Query->>Analyzer: Use analyzers for tokenization/stemming at index/query time
  Analyzer->>Indexer: Provide tokens for spell/suggest indexes
  Query->>Query: CircuitBreaker monitors and Cache caches results
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Areas needing focused review:
    • .docker/config/solr/config-set/schema.xml — verify consistency of plural/multiValued dynamic field mappings and any dependent clients.
    • .docker/config/solr/config-set/schema_extra_types.xml — analyzer chains for Estonian and Ukrainian (Morfologik) substitution correctness.
    • .docker/config/solr/config-set/solrconfig.xml & .docker/config/solr/config-set/solrconfig_query.xml — circuitBreaker placement and cache backend class changes; ensure settings compatible with Solr 9.
    • .docker/solr.dockerfile — confirm in-container edits are safe and idempotent across image variants.

Possibly related PRs

Poem

"I'm a rabbit in the index stream,
Hopping from token to token like a dream.
Estonian blooms, caches hum anew,
Solr nine's the garden — I sniff and chew.
Hooray for search! 🐇✨"

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The PR title clearly and concisely summarizes the main change: updating Solr to version 9. It directly reflects the primary objective and is specific enough for scanning history.
Linked Issues check | ✅ Passed | The PR comprehensively addresses issue #2089 objectives to migrate to Solr 9 through Docker image updates, configuration schema changes, Estonian language support, and caching improvements.
Out of Scope Changes check | ✅ Passed | All changes are directly related to the Solr 9 migration objective. Documentation updates, Estonian language support, and fixture guidelines are all integral to the Solr 9 upgrade scope.
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.docker/config/solr/config-set/solrconfig.xml (1)

510-536: Enable Solr authentication and configure access controls for remote streaming, or disable the feature if not required for production.

Verification confirms the security concern is valid. The codebase shows:

  • enableRemoteStreaming was changed from "false" to "true" and enableStreamBody="true" was added in commit 7b690e1
  • Port 8983 is exposed in docker-compose.yml without access restrictions
  • Upload limits are set to unlimited (-1)
  • solrconfig.xml itself warns: "Before enabling remote streaming, you should make sure your system has authentication enabled"
  • No security.json, authentication rules, or reverse proxy protections for Solr endpoints are present in the repository

Required actions:

  1. Configure Solr authentication (security.json with roles/permissions)
  2. Restrict access to /update/extract and other streaming handlers to authenticated users only
  3. Document the security rationale if remote streaming is necessary for production, or disable it
  4. Set reasonable upload limits instead of -1 (unlimited)
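
The first and last findings above could be checked mechanically before deployment. The following is a minimal Python sketch, assuming the flags appear as literal attributes on a `<requestParsers>` element; the sample XML and the helper name `find_risky_request_parsers` are hypothetical and not taken from the repository, and the real solrconfig.xml may use property substitution instead of literal values.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, not copied from the repository's solrconfig.xml.
SAMPLE = """
<config>
  <requestDispatcher>
    <requestParsers enableRemoteStreaming="true"
                    enableStreamBody="true"
                    multipartUploadLimitInKB="-1"
                    formdataUploadLimitInKB="-1"/>
  </requestDispatcher>
</config>
"""


def find_risky_request_parsers(xml_text):
    """Return human-readable findings for permissive requestParsers settings."""
    findings = []
    root = ET.fromstring(xml_text)
    for parsers in root.iter("requestParsers"):
        # Flag streaming options that are switched on.
        for flag in ("enableRemoteStreaming", "enableStreamBody"):
            if parsers.get(flag, "false").strip().lower() == "true":
                findings.append(f"{flag} is enabled")
        # Flag upload limits set to -1 (unlimited).
        for limit in ("multipartUploadLimitInKB", "formdataUploadLimitInKB"):
            if parsers.get(limit, "").strip() == "-1":
                findings.append(f"{limit} is unlimited")
    return findings


if __name__ == "__main__":
    for finding in find_risky_request_parsers(SAMPLE):
        print(finding)
```

A check like this only reports the configuration state; it does not replace configuring authentication in security.json.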
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between de4b6a3 and 7b690e1.

📒 Files selected for processing (16)
  • .docker/config/solr/config-set/accents_et.txt (1 hunks)
  • .docker/config/solr/config-set/nouns_et.txt (1 hunks)
  • .docker/config/solr/config-set/protwords_et.txt (1 hunks)
  • .docker/config/solr/config-set/schema.xml (8 hunks)
  • .docker/config/solr/config-set/schema_extra_fields.xml (1 hunks)
  • .docker/config/solr/config-set/schema_extra_types.xml (4 hunks)
  • .docker/config/solr/config-set/solrconfig.xml (10 hunks)
  • .docker/config/solr/config-set/solrconfig_extra.xml (2 hunks)
  • .docker/config/solr/config-set/solrconfig_query.xml (1 hunks)
  • .docker/config/solr/config-set/solrcore.properties (1 hunks)
  • .docker/config/solr/config-set/stopwords_et.txt (1 hunks)
  • .docker/config/solr/config-set/synonyms_et.txt (1 hunks)
  • .docker/solr.dockerfile (1 hunks)
  • .vortex/CLAUDE.md (1 hunks)
  • .vortex/docs/content/getting-started/features.mdx (1 hunks)
  • .vortex/docs/content/tools/docker.mdx (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-02T00:41:36.321Z
Learnt from: AlexSkrypnyk
Repo: drevops/vortex PR: 1697
File: .vortex/installer/tests/Fixtures/install/provision_database_lagoon/docs/onboarding.md:16-17
Timestamp: 2025-06-02T00:41:36.321Z
Learning: DO NOT EVER REVIEW FILES IN .vortex/installer/tests/Fixtures/ directory - these are test fixtures and should not be reviewed under any circumstances.

Applied to files:

  • .vortex/CLAUDE.md
🪛 GitHub Actions: Vortex - Test docs
.vortex/docs/content/tools/docker.mdx

[error] 218-218: CSpell: Unknown word (uselagoon)


[error] 219-219: CSpell: Unknown word (langid)

🪛 LanguageTool
.docker/config/solr/config-set/stopwords_et.txt

[duplication] ~791-~791: Possible typo: you repeated a word
Context: ...kõlk kilks kilks-kolks kilks-kõlks kill killadi killadi|-kolladi killadi-kõlladi killa-kolla ki...

(WORD_REPEAT_RULE)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: build (1)
  • GitHub Check: build (0)
  • GitHub Check: vortex-test-installer (8.3)
  • GitHub Check: vortex-test-installer (8.4)
  • GitHub Check: vortex-test-workflow (3)
  • GitHub Check: vortex-test-workflow (2)
  • GitHub Check: vortex-test-workflow (4)
  • GitHub Check: vortex-test-workflow (0)
  • GitHub Check: vortex-test-workflow (1)
🔇 Additional comments (28)
.vortex/CLAUDE.md (1)

1500-1519: Excellent addition of critical AI assistant guidance.

This new section appropriately formalizes permission requirements for fixture-updating commands, ensuring automated tools respect the time and scope implications of these operations. The guidance is accurate, well-justified, and complements existing fixture documentation (lines 402–421) without duplication. The timeframe ("10-15 minutes") is appropriately specific and aligns with the operational reality described earlier in the document. This reinforces best practices for maintaining the codebase and preventing unintended large-scale changes.

.docker/config/solr/config-set/synonyms_et.txt (1)

1-1: Verify the Estonian synonym mapping.

The entry "drupal, durpal" is unusual. "Drupal" is an English CMS name, and "durpal" does not appear to be a standard Estonian word or recognized transliteration. Please confirm:

  1. Is "durpal" an intentional Estonian variant or domain-specific term used in your project?
  2. Is this a typo? (e.g., should it be "drupaali" or another Estonian form?)
  3. What is the intended use case for this synonym mapping?

Without clarification, this synonym may cause unexpected search behavior in Estonian text processing.

.docker/config/solr/config-set/protwords_et.txt (1)

1-1: Empty protwords file is valid for Solr configuration.

The file is appropriately placed as part of the Estonian language support additions for the Solr 9 migration. An empty protwords file is valid in Solr (it simply means no words are explicitly protected from stemming). If there are Estonian-specific words that should not be stemmed or transformed, add them here (one per line).

.docker/config/solr/config-set/solrconfig_query.xml (1)

3-29: Cache backend migration to CaffeineCache aligns with Solr 9 best practices.

Updating from LRU-based caches to CaffeineCache with preserved tuning parameters is appropriate for Solr 9. Version annotations correctly reflect the upgrade.

.docker/config/solr/config-set/solrcore.properties (2)

13-13: Lucene version upgrade to 9.10.0 is correctly reflected.

This aligns with Solr 9 migration and matches the version in solrconfig.xml (LUCENE_90). Re-indexing after this change is recommended if not already done per Solr upgrade guidelines.


5-5: Verify all replication configuration files exist in config-set.

The solr.replication.confFiles list has been significantly expanded to include Estonian language resources. Please confirm that all referenced files (especially stopwords_et.txt, synonyms_et.txt, nouns_et.txt, protwords_et.txt, accents_et.txt) are present in .docker/config/solr/config-set/ to avoid replication failures.
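
One way to run this verification is a small script that reads the property and checks each entry on disk. This is a minimal Python sketch, assuming `solr.replication.confFiles` is a single-line, comma-separated value (Java properties line continuations are not handled); the helper name `missing_conf_files` is hypothetical.

```python
import tempfile
from pathlib import Path


def missing_conf_files(properties_text, config_dir):
    """Return confFiles entries that do not exist as files under config_dir."""
    conf_files = []
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip comments and lines that are not key=value pairs.
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "solr.replication.confFiles":
            conf_files = [name.strip() for name in value.split(",") if name.strip()]
    base = Path(config_dir)
    return [name for name in conf_files if not (base / name).is_file()]


if __name__ == "__main__":
    # Hypothetical demo: only stopwords_et.txt exists in a temporary directory.
    with tempfile.TemporaryDirectory() as config_dir:
        Path(config_dir, "stopwords_et.txt").write_text("ja\n", encoding="utf-8")
        props = "solr.replication.confFiles=stopwords_et.txt,nouns_et.txt\n"
        print(missing_conf_files(props, config_dir))
```

In practice the function would be called with the contents of solrcore.properties and the path .docker/config/solr/config-set/.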

.docker/config/solr/config-set/solrconfig_extra.xml (2)

122-136: Estonian spellchecker configuration follows established patterns.

The Estonian language spellchecker is correctly configured with parameters matching other European language configurations. Parameters (accuracy: 0.5, maxEdits: 2, minQueryLength: 4) are appropriate for Estonian morphology.

Verify that the spellcheck_et field and text_et fieldType are defined in the schema files.


695-706: Estonian suggester configuration is consistent and complete.

The Estonian suggester is properly configured with AnalyzingInfixLookupFactory and references the text_et analyzer fieldType for language-specific tokenization and analysis.

Confirm that the text_et fieldType is defined in schema_extra_types.xml and is suitable for Estonian language text analysis.

.docker/config/solr/config-set/solrconfig.xml (4)

14-14: Solr 9 configuration metadata and Lucene version correctly updated.

Config name updated to drupal-4.3.10-solr-9.x-1 and luceneMatchVersion changed from LUCENE_80 to LUCENE_90, consistent with PR objectives and related changes in solrcore.properties.

Also applies to: 41-41


449-493: Circuit breaker configuration enables stability management for Solr 9.

The CircuitBreakerManager is properly configured with defaults disabled but available for customization. Memory and CPU threshold configuration options are documented, allowing operators to tune based on their infrastructure. This is a best practice addition for preventing cascading failures under load.


293-293: Transaction log versioning parameter added with appropriate default.

The numVersionBuckets parameter (default 65536) improves transaction log performance for Solr 9. This default is suitable for most deployments and can be tuned via environment variable if needed.


75-94: Deprecation notices for lib directives are informative.

Comments correctly note that <lib/> directives are deprecated in Solr 9 and will be removed in 10.0, directing users to use SOLR_MODULES environment variable instead. This is helpful for future maintenance.

.vortex/docs/content/getting-started/features.mdx (1)

135-135: Documentation correctly reflects Solr 9 upgrade.

User-facing documentation accurately identifies the Solr 9 version in the pre-configured services stack, helping developers understand the environment.

.vortex/docs/content/tools/docker.mdx (3)

216-225: Solr 9 documentation is comprehensive and accurate.

The updated Solr section correctly identifies the uselagoon/solr-9-drupal image, documents pre-configured modules, specifies the configuration path, and provides access instructions. This level of detail helps developers understand and troubleshoot the Solr service.


231-236: wait-for-dependencies service documentation clarifies stack startup behavior.

The new service documentation explains the critical role of this container in determining when the entire stack is ready, which is important for understanding Docker Compose orchestration and CI/CD reliability.


218-219: CSpell warnings for technical terms should be suppressed.

The pipeline reports CSpell errors for "uselagoon" and "langid", which are proper nouns (Lagoon image name) and technical terms (Solr module name). These should be added to the CSpell dictionary or suppressed in the markdown linting configuration to prevent false positives.

Consider adding these terms to .cspellrc.json or similar CSpell configuration:

{
  "words": ["uselagoon", "langid", "solr-drupal"]
}

Or use inline suppressions if only needed in this file:

<!-- cspell:ignore uselagoon langid -->
.docker/solr.dockerfile (2)

5-8: ✓ Base image and documentation references updated correctly.

The transition to solr-9-drupal:25.10.0 with proper @see references aligns with the Solr 9 migration objective and referenced Lagoon patterns.


16-19: ✓ Solr configuration modifications for Lagoon compatibility are well-structured.

The three sed operations (dataDir, lock.type, autoSoftCommit.MaxTime) follow established Lagoon patterns and are correctly chained with && error handling. USER context switches are properly placed for security.

.docker/config/solr/config-set/schema_extra_fields.xml (1)

81-87: ✓ Estonian language fields properly integrated into schema.

The seven new dynamicField entries follow established naming conventions and attribute patterns consistent with other language fields (English, Finnish). Field type references (text_et, collated_et, text_unstemmed_et) are correctly defined in schema_extra_types.xml.

.docker/config/solr/config-set/schema_extra_types.xml (2)

486-546: ✓ Estonian field types are comprehensively configured.

The Estonian analyzer definitions follow established patterns with proper stemming, accent normalization, compound word handling, and edge n-grams. Both stemmed and unstemmed variants are provided alongside collation support.


2454-2454: Verify Ukrainian Morfologik dictionary availability at runtime.

Solr supports Ukrainian lemmatization through solr.MorfologikFilterFactory, which ships in the analysis-extras module, so that module must be enabled. The dictionary path ua/net/nlp/ukrainian.dict referenced at lines 2454 and 2463 is correct (the dictionary no longer exists at the legacy path, a change attributed to LUCENE-7785), but the dictionary itself cannot be found in the standard Solr distribution download.

To resolve this, confirm that:

  1. The ua.net.nlp:morfologik-ukrainian-search Maven artifact is declared as a dependency in the Solr or Docker build configuration
  2. The analysis-extras module is enabled in solrconfig.xml
  3. The dictionary file resolves at runtime from the ua.net.nlp package classpath

Missing this dependency will cause Solr initialization and indexing failures for Ukrainian content.

.docker/config/solr/config-set/schema.xml (7)

52-52: ✓ Schema metadata updated correctly for Solr 9 migration.

Version bumped from drupal-4.3.5-solr-8.x-1 to drupal-4.3.10-solr-9.x-1 and documentation reference updated to latest Solr guide.


125-127: ⚠️ Verify migration plan for _root_ field storage change.

Line 125 changes _root_ from stored="false" to stored="true". This is a schema-breaking change with significant implications:

Impact:

  • Index size will increase (every document stores the ID of its root document)
  • Enables proper nested document retrieval in Solr 9
  • Existing indexes will not have stored _root_ values

Action Required:

  • Confirm reindexing strategy before deploying to production
  • Document in deployment notes that indexes must be rebuilt
  • Consider performance implications of stored field growth

161-161: Verify behavioral impact of sm_context_tags type change.

Line 161 changes sm_context_tags from type="text_ws" to type="strings":

  • Old behavior: Whitespace tokenized text field (analyzed)
  • New behavior: String field (multi-valued, not analyzed)

Potential Issues:

  1. This changes how suggester context filters work (referenced in comment above line 159)
  2. Whitespace-based tokenization is replaced with exact string matching
  3. May affect suggestions that previously relied on partial word matching

Please confirm:

  • This change is intentional and has been tested with the suggester
  • No regression in suggestion quality or filtering accuracy
  • Documentation/comments updated to reflect the new behavior
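
The behavioral difference described above can be illustrated with a rough Python model. This is a simplified sketch, not Solr's analysis chain: `whitespace_tokens` stands in for a whitespace-tokenized text field, `string_terms` for a non-analyzed string field, and the sample tag value is hypothetical.

```python
def whitespace_tokens(value):
    """Approximate an analyzed whitespace-tokenized field: one term per token."""
    return value.split()


def string_terms(value):
    """Approximate a non-analyzed string field: the whole value is one term."""
    return [value]


def matches(indexed_terms, query_term):
    """Exact term match, as a context filter would perform against indexed terms."""
    return query_term in indexed_terms


if __name__ == "__main__":
    value = "drupal:node en"  # hypothetical context tag value
    print(matches(whitespace_tokens(value), "en"))
    print(matches(string_terms(value), "en"))
    print(matches(string_terms(value), value))
```

Under this model, a client that previously sent several tags in one whitespace-separated value would need to send each tag as a separate value of the multi-valued field to keep the same filtering behavior.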

272-272: ✓ New multi-valued primitive field types properly added.

Introduction of strings and booleans field types enables proper Solr 9 support for multi-valued primitive fields, complementing existing Point field types.

Also applies to: 276-276


175-227: ✓ Dynamic field updates systematically modernize to Point field types.

Comprehensive refactor of dynamic fields to use Point types (plong, pfloat, pdouble, pdate) with proper docValues configuration. Changes are consistent across all field variants (single-valued, multi-valued, indexed/unindexed). This aligns with Solr 9 best practices.


341-341: ✓ New date_ranges field type correctly supports multi-valued date ranges.

Addition of date_ranges as multi-valued variant of date_range (line 340) properly extends date range support for complex queries in Solr 9.


380-380: ✓ text_ws enhanced with offset position tracking.

Adding storeOffsetsWithPositions="true" to text_ws enables precise highlighting and phrase queries. Trade-off of increased storage is justified by improved query capabilities in Solr 9.

# Ç => C
"\u00C7" => "C"
# Ć => C
"\U0106" => "C"

⚠️ Potential issue | 🔴 Critical

Fix Unicode escape sequence syntax error on line 20.

Line 20 uses uppercase \U0106, but the correct syntax is lowercase \u0106. The uppercase variant is not a valid 4-digit hex escape in most text-processing contexts, including Solr's accent mapping configuration. Depending on how the parser treats the unknown escape, Solr may reject the file at load time or silently fail to map Ć.

Apply this diff to fix the error:

-# Ć => C
-"\U0106" => "C"
+# Ć => C
+"\u0106" => "C"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-"\U0106" => "C"
+"\u0106" => "C"
🤖 Prompt for AI Agents
In .docker/config/solr/config-set/accents_et.txt around line 20, the Unicode
escape uses an invalid uppercase escape "\U0106"; replace it with the correct
lowercase "\u0106" (and scan other entries for any similar uppercase \U
occurrences), save the file in UTF-8, and re-run Solr config validation to
ensure the mapping parses correctly.
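
The scan for similar uppercase escapes mentioned in the prompt can be sketched in Python. This is a minimal sketch, assuming one mapping rule per line and `#` comment lines as in the snippet above; the helper name `find_uppercase_escapes` is hypothetical.

```python
import re

# Matches a backslash, an uppercase U, and four hex digits.
UPPERCASE_ESCAPE = re.compile(r"\\U[0-9A-Fa-f]{4}")

# Raw string so the escapes stay literal, mirroring the file content.
SAMPLE = r'''# Ç => C
"\u00C7" => "C"
# Ć => C
"\U0106" => "C"
'''


def find_uppercase_escapes(mapping_text):
    """Return (line_number, line) pairs for rule lines with uppercase-U escapes."""
    hits = []
    for number, line in enumerate(mapping_text.splitlines(), start=1):
        # Comment lines carry no mapping rules, so skip them.
        if line.lstrip().startswith("#"):
            continue
        if UPPERCASE_ESCAPE.search(line):
            hits.append((number, line))
    return hits


if __name__ == "__main__":
    for number, line in find_uppercase_escapes(SAMPLE):
        print(f"line {number}: {line}")
```

Running it against the full accents_et.txt would list every rule that needs the same lowercase fix.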

@@ -0,0 +1 @@

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Empty file—verify intent and populate if needed.

The file is created as part of Estonian linguistic resources for Solr 9 language support but contains no content. Clarify whether this is:

  1. A placeholder that should be populated with Estonian nouns for text analysis, or
  2. An optional fallback file that should include a clarifying comment explaining its empty state.

Given that related linguistic files (stopwords_et.txt, synonyms_et.txt) are being added alongside schema field types for Estonian processing, this file likely needs content to be functional.


🏁 Script executed:

#!/bin/bash
# Check content and size of nouns_et.txt
echo "=== Checking nouns_et.txt ===" 
if [ -f ".docker/config/solr/config-set/nouns_et.txt" ]; then
  echo "File exists"
  echo "File size: $(wc -c < .docker/config/solr/config-set/nouns_et.txt) bytes"
  echo "Line count: $(wc -l < .docker/config/solr/config-set/nouns_et.txt) lines"
  echo "Content (character dump):"
  od -c .docker/config/solr/config-set/nouns_et.txt | head -5
else
  echo "File does not exist"
fi

# Check related Estonian language files
echo -e "\n=== Checking related Estonian files ==="
for file in stopwords_et.txt synonyms_et.txt; do
  path=".docker/config/solr/config-set/$file"
  if [ -f "$path" ]; then
    echo "File: $file - Size: $(wc -c < "$path") bytes, Lines: $(wc -l < "$path") lines"
    echo "First 3 lines:"
    head -3 "$path" || true
  else
    echo "$file: Not found"
  fi
done

# Search for references to nouns_et in configuration
echo -e "\n=== Searching for references to nouns_et ==="
rg -i "nouns_et" .docker/config/solr/config-set/ -A2 -B2 || echo "No references found"

# Check schema for Estonian language configuration
echo -e "\n=== Checking schema for Estonian references ==="
if [ -f ".docker/config/solr/config-set/managed-schema.xml" ]; then
  rg -i "et|estonia" .docker/config/solr/config-set/managed-schema.xml -C2 || echo "No Estonian references found"
else
  find .docker/config/solr/config-set/ -name "*.xml" | head -5 | xargs grep -i "estonia\|nouns" || true
fi

# List all files in the config-set directory to understand structure
echo -e "\n=== Files in config-set directory ==="
find .docker/config/solr/config-set/ -type f | sort

Length of output: 13810


Populate nouns_et.txt with Estonian nouns—file is actively used in Solr text analysis.

The file is currently empty (1 byte) but is referenced in schema_extra_types.xml by the DictionaryCompoundWordTokenFilterFactory filter, which needs a populated dictionary to tokenize compound words. With an empty dictionary the filter has nothing to match against, so Estonian compound words will most likely pass through unsplit and searches on their parts will produce incomplete results.

Related Estonian files (stopwords_et.txt, synonyms_et.txt) are properly populated. Populate nouns_et.txt with a comprehensive list of Estonian nouns to enable proper compound word analysis.
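
A check for this condition could be sketched as follows. It is a minimal sketch under stated assumptions: the function names are hypothetical, and it assumes that blank lines and lines starting with `#` carry no dictionary entries, so a file made up only of those counts as empty.

```shell
#!/bin/bash
# Hypothetical helper (not from the PR): succeed when the file has at least
# one line that is neither blank nor a '#' comment.
has_dictionary_entries() {
  # The pattern matches blank and comment lines; -v selects everything else,
  # and -q only reports through the exit status whether anything was selected.
  grep -qvE '^[[:space:]]*(#.*)?$' "$1"
}

# Hypothetical helper (not from the PR): report every *_et.txt file in the
# given directory that has no entries.
report_empty_wordlists() {
  for f in "$1"/*_et.txt; do
    [ -f "$f" ] || continue
    has_dictionary_entries "$f" || echo "no entries: $f"
  done
}
```

Calling `report_empty_wordlists .docker/config/solr/config-set` would flag a file such as the 1-byte nouns_et.txt. Some word lists may be intentionally empty, so treat the output as a prompt to review rather than a failure.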

</fieldType>
<!--
Estonian Text Field
8.2.0

🧹 Nitpick | 🔵 Trivial

Clarify version semantics for Estonian and Ukrainian field types.

Version comments indicate "8.2.0" for Estonian additions and Ukrainian updates, which is ambiguous given this PR migrates to Solr 9. Please confirm:

  1. Are these version numbers referring to the Drupal module version (search_api_solr) rather than Solr version?
  2. Should the comments reflect Solr 9.x version for new additions in this migration?
  3. Why is Ukrainian marked as "8.2.0" when the MorfologikFilterFactory is being added (suggesting new/updated functionality)?

Making the meaning of these version numbers explicit will help future maintainers.

Also applies to: 517-517, 522-522, 2445-2445, 2469-2469, 2474-2474

🤖 Prompt for AI Agents
In .docker/config/solr/config-set/schema_extra_types.xml around lines 488, 517,
522, 2445, 2469, and 2474, the inline version comments that currently read
"8.2.0" are ambiguous; update each comment to explicitly state whether it refers
to the search_api_solr Drupal module version or the Solr server version (e.g.,
"search_api_solr v8.2.0" or "Solr 9.x migration"), and for this PR that migrates
to Solr 9 change comments for newly added/updated functionality (like the
MorfologikFilterFactory for Ukrainian) to reference the appropriate target
(e.g., "Added MorfologikFilterFactory — Solr 9.x migration" or "search_api_solr
vX.Y.Z") so future maintainers clearly know the semantic intent of the version
tag.

@AlexSkrypnyk AlexSkrypnyk force-pushed the feature/2089-update-solr9 branch from 7b690e1 to 22f3de4 Compare November 17, 2025 09:16
@github-actions github-actions bot temporarily deployed to commit November 17, 2025 09:19 Inactive
@AlexSkrypnyk AlexSkrypnyk force-pushed the feature/2089-update-solr9 branch from 22f3de4 to 7fa01c7 Compare November 17, 2025 09:19
@github-actions github-actions bot temporarily deployed to commit November 17, 2025 09:21 Inactive

codecov bot commented Nov 17, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.02%. Comparing base (de4b6a3) to head (7fa01c7).
⚠️ Report is 2 commits behind head on develop.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #2103   +/-   ##
========================================
  Coverage    70.02%   70.02%           
========================================
  Files           97       97           
  Lines         4898     4898           
  Branches        44       44           
========================================
  Hits          3430     3430           
  Misses        1468     1468           


@AlexSkrypnyk AlexSkrypnyk merged commit 9da0ba0 into develop Nov 17, 2025
28 checks passed
@AlexSkrypnyk AlexSkrypnyk deleted the feature/2089-update-solr9 branch November 17, 2025 09:42
@github-project-automation github-project-automation bot moved this from BACKLOG to Release queue in Vortex Nov 17, 2025
Projects

Status: Released in 1.34.0

Development

Successfully merging this pull request may close these issues.

Migrate to Solr 9

2 participants