Search Relevance testing infrastructure #2243
reakaleek
left a comment
LGTM 🚀
builder.Configuration[$"Parameters:DocumentationElasticUrl"] = "http://localhost.example:9200";
var configBuilder = new ConfigurationBuilder();
_ = configBuilder.AddUserSecrets("72f50f33-6fb9-4d08-bff3-39568fe370b3");
Q: I can see this string 4 times across files. Is this just coincidence or should we put this into a constant?
It's the secret GUID defined in aspire.csproj.
I'll follow up to make sure this becomes a constant!
Add Search Integration Tests with Elasticsearch Explain API
Why
Search relevance is critical for documentation discoverability, but debugging why certain documents rank higher than others has been a black box. When search results don't match expectations, we need detailed insights into Elasticsearch's scoring decisions to improve our search queries and boost factors.
Additionally, as we continue to refine our hybrid search implementation (combining lexical and semantic search with RRF), we need automated tests that not only verify correct behavior but also help us understand and improve search ranking over time.
How
This PR introduces a comprehensive search testing infrastructure with two complementary test classes:
1. Infrastructure Changes
ElasticsearchGateway Refactoring (ElasticsearchGateway.cs):
Extracted query building logic into reusable static methods to eliminate duplication:
- BuildLexicalQuery() - Encapsulates traditional text search with multiple match types and boost factors
- BuildSemanticQuery() - Handles semantic search using semantic_text fields
- NormalizeSearchQuery() - Query normalization (e.g., "dotnet" → "net")
This extraction serves dual purposes: DRY principle and enabling the explain functionality to use the exact same queries as production searches.
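To make the normalization step concrete, here is a minimal sketch of what a helper like NormalizeSearchQuery() might do. Only the "dotnet" → "net" mapping comes from the description above; the class name, the whole-word matching, the case-insensitivity, and the trimming are assumptions for illustration and may not match the PR's actual implementation.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the PR's NormalizeSearchQuery(); the real rules may differ.
public static class QueryNormalizer
{
    // Assumption: only the standalone word "dotnet" is rewritten, ignoring case.
    private static readonly Regex DotnetWord =
        new(@"\bdotnet\b", RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public static string Normalize(string query)
    {
        // Assumption: empty or whitespace-only input normalizes to an empty query.
        if (string.IsNullOrWhiteSpace(query))
            return string.Empty;

        // Trim surrounding whitespace, then apply the "dotnet" -> "net" mapping.
        return DotnetWord.Replace(query.Trim(), "net");
    }
}
```

Keeping a helper like this static and free of I/O is what would let both the production search path and the explain path call it and end up with identical query text.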
Implemented Elasticsearch Explain API integration:
- ExplainDocumentAsync() - Uses Elasticsearch's _explain API to get detailed scoring breakdown for why a document matched (or didn't match) a query
- ExplainTopResultAndExpectedAsync() - Compares actual top result with expected result, providing side-by-side scoring analysis
- FormatExplanation() - Recursively formats Elasticsearch's ExplanationDetail tree into human-readable indented output
- ExplainResult record - Strongly-typed container for explain results (Found, Matched, Score, Explanation)
SearchBootstrapFixture (SearchTestBase.cs):
2. Test Classes
SearchIntegrationTests - Black-box API testing:
/docs/_api/v1/search)
SearchRelevanceTests - White-box relevance testing with explain output:
- Uses ElasticsearchGateway directly to bypass HTTP layer
- When a test fails (first result doesn't match expected), automatically:
- Includes Assert.SkipUnless(searchFixture.Connected) to gracefully handle Elasticsearch unavailability
- Uses the same test cases as SearchIntegrationTests for consistency
3. Configuration & DI
TestParameterProvider:
- IParameterProvider interface for test scenarios
Test Output Example
When a search relevance test fails, developers see:
This enables data-driven decisions about boost factors, query types, and field weights.
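The indented scoring breakdown comes from walking the explanation tree recursively, which is what FormatExplanation() is described as doing. The sketch below shows that technique in isolation. ExplanationNode is a hypothetical stand-in for the client's ExplanationDetail type (it mirrors the value / description / details shape of an _explain response), the output format is an assumption, and the sample values are invented for illustration, not taken from the PR's tests.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical stand-in for ExplanationDetail: one node of the scoring tree.
public sealed record ExplanationNode(
    double Value,
    string Description,
    IReadOnlyList<ExplanationNode> Details);

public static class ExplanationFormatter
{
    // Renders the tree as one line per node, indented two spaces per level.
    public static string Format(ExplanationNode node, int depth = 0)
    {
        var sb = new StringBuilder();
        Append(sb, node, depth);
        return sb.ToString();
    }

    private static void Append(StringBuilder sb, ExplanationNode node, int depth)
    {
        sb.Append(' ', depth * 2)
          .Append(node.Value.ToString("F4", CultureInfo.InvariantCulture))
          .Append(" - ")
          .Append(node.Description)
          .Append('\n');

        // Recurse into child explanations one level deeper.
        foreach (var child in node.Details)
            Append(sb, child, depth + 1);
    }
}

public static class Demo
{
    public static void Main()
    {
        // Invented sample data, for illustration only.
        var tree = new ExplanationNode(3.5, "sum of:", new[]
        {
            new ExplanationNode(2.0, "weight(title:search)", Array.Empty<ExplanationNode>()),
            new ExplanationNode(1.5, "weight(body:search)", Array.Empty<ExplanationNode>()),
        });

        Console.Write(ExplanationFormatter.Format(tree));
    }
}
```

Running Demo prints the root score on the first line and each contributing clause indented beneath it, so the clause that dominates a document's score is visible at a glance when comparing the actual top result with the expected one.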
Current Status
All 15 search integration tests passing:
The tests currently pass because search results match expectations, but the infrastructure is ready to provide detailed diagnostics the moment search relevance needs improvement.