Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-16561: Use autoSoftCommmitMaxTime as preferred poll interval of … #1189

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ooasis
Copy link

@ooasis ooasis commented Nov 24, 2022

Description

TLOG/PULL replicas use IndexFetcher to fetch segment files from leaders. Once new segment files are downloaded and merged into existing index, a new Searcher is opened so the updated data is made available to the clients. The poll interval is determined by following code in ReplicateFromLeader

if (uinfo.autoCommmitMaxTime != -1) {
   pollIntervalStr = toPollIntervalStr(uinfo.autoCommmitMaxTime/2);
} else if (uinfo.autoSoftCommmitMaxTime != -1) {
   pollIntervalStr = toPollIntervalStr(uinfo.autoSoftCommmitMaxTime/2);
}

In a typical config for replication using TLOG/PULL replicas where data visibility is less important (a trade-off to avoid NRT replicas), we set a short commit time to persist changes and long soft-commit time to make changes visible.

<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>3600000</maxTime>
</autoSoftCommit>

With about config, the poll interval will be 15/2 = 7 sec. This leads to frequent opening of new Searchers which causes huge impact on realtime user queries, especially if the new Searcher takes long time to warmup. This also makes changes visible on followers ahead of leaders.

Because the polling of new segment files is more about visibility because TLOG replicas still get updates to tlog files via UpdateHandler (this is my understanding). It seems more appropriate to use autoSoftCommmitMaxTime as the poll interval.

Solution

I would proposed change below where autoSoftCommmitMaxTime is chosen as the preferred interval. This will make the poll interval much longer and make the visibility order more inline with eventual consistency pattern.

if (uinfo.autoSoftCommmitMaxTime != -1) {
    pollIntervalStr = toPollIntervalStr(uinfo.autoSoftCommmitMaxTime);
} else if (uinfo.autoCommmitMaxTime != -1) {
    pollIntervalStr = toPollIntervalStr(uinfo.autoCommmitMaxTime);
}

Tests

The difference can only be tested with proper replication config and controlled indexing and user queries. The change has been tried in my environment and showed much less impact on realtime queries compared with previous tests.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide

Copy link

This PR had no visible activity in the past 60 days, labeling it as stale. Any new activity will remove the stale label. To attract more reviewers, please tag someone or notify the dev@solr.apache.org mailing list. Thank you for your contribution!

@github-actions github-actions bot added the stale PR not updated in 60 days label Feb 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale PR not updated in 60 days
Projects
None yet
2 participants