
Incremental snapshot not working correctly after running forcemerge. #102395

Open
merajblueshift opened this issue Nov 21, 2023 · 2 comments
Labels
>bug :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. Team:Distributed Meta label for distributed team

merajblueshift commented Nov 21, 2023

Elasticsearch Version

Version: 8.7.1, Build: rpm/f229ed3f893a515d590d0f39b05f68913e2d9b53/2023-04-27T04:33:42.127815583Z, JVM: 20.0.1

Installed Plugins

discovery-ec2

Java Version

bundled

OS Version

Linux 6.1.59-84.139.amzn2023.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Oct 24 20:57:25 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Problem Description

We had an index containing a large number of deleted documents (docs.deleted). To clean it up, the index was restored onto a cluster where no reads or writes were happening on it. We then ran a _forcemerge with only_expunge_deletes=true on the index to optimize it. Before the force merge started, SLM took one snapshot of the index to S3.
The force merge optimized the index and reduced the number of segments from 5k to 1.5k. However, when we took a fresh snapshot of the index and restored it, the restored index did not reflect the post-force-merge state; it was still in the state from before the force merge. On checking the snapshot details we also observed that the incremental snapshot taken after the force merge completed in 800 ms, despite the fact that all underlying segment files had changed.
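
For reference, the snapshot details and segment counts can be inspected with something like the following (repository, snapshot, and index names here are placeholders):

# Per-shard incremental file counts and sizes for the post-force-merge snapshot
curl -X GET "localhost:9200/_snapshot/my_repo/snap-after-forcemerge/_status?pretty"

# Segment count per shard, and docs.count / docs.deleted on the index
curl -X GET "localhost:9200/_cat/segments/my-index?v"
curl -X GET "localhost:9200/_cat/indices/my-index?v&h=index,docs.count,docs.deleted"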

Steps to Reproduce

  1. Create an index, then add and delete documents on it until the docs.deleted count reaches a very high number.
  2. Stop all reads and writes to the index.
  3. Change the index.merge.policy.expunge_deletes_allowed setting to 1 (steps 3–8 are sketched as curl calls after this list).
  4. Take a snapshot of the index to S3.
  5. Run a _forcemerge operation with only_expunge_deletes set to true.
  6. The docs.deleted count and the number of segments on the index should drop after the _forcemerge.
  7. Take a new snapshot of the index.
  8. Restore the index from the snapshot taken in step 7.
  9. The restored index still has the state of the index from before the _forcemerge operation.
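
For convenience, here is a rough curl sketch of steps 3–8, using placeholder index, repository, and snapshot names:

# 3. Lower the expunge-deletes threshold so the merge actually rewrites segments
curl -X PUT "localhost:9200/my-index/_settings" -H 'Content-Type: application/json' -d '{"index.merge.policy.expunge_deletes_allowed": 1}'

# 4. First snapshot (assumes a repository named my_repo is already registered)
curl -X PUT "localhost:9200/_snapshot/my_repo/snap-1?wait_for_completion=true"

# 5. Force-merge, expunging deletes only
curl -X POST "localhost:9200/my-index/_forcemerge?only_expunge_deletes=true"

# 7. Second snapshot, taken after the force merge
curl -X PUT "localhost:9200/_snapshot/my_repo/snap-2?wait_for_completion=true"

# 8. Restore the index from the second snapshot under a new name
curl -X POST "localhost:9200/_snapshot/my_repo/snap-2/_restore" -H 'Content-Type: application/json' -d '{"indices": "my-index", "rename_pattern": "(.+)", "rename_replacement": "restored-$1"}'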

Logs (if relevant)

No response

merajblueshift added the >bug and needs:triage (Requires assignment of a team area label) labels on Nov 21, 2023
DaveCTurner (Contributor) commented Nov 21, 2023

The reason for this behaviour is that we skip shard snapshots where the shard contents have not changed meaningfully since the last snapshot, where "meaningfully" is defined here:

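// Shard-state id used for snapshot no-op detection: history UUID + force-merge UUID + max seq no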
return userCommitData.get(Engine.HISTORY_UUID_KEY)
    + "-"
    + userCommitData.getOrDefault(Engine.FORCE_MERGE_UUID_KEY, "na")
    + "-"
    + maxSeqNo;

Note that this includes some consideration of force-merges, but that doesn't include the ?only_expunge_deletes case:

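// Note: forceMergeUUID is recorded only in the final branch; the only_expunge_deletes path never sets it.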
if (onlyExpungeDeletes) {
    indexWriter.forceMergeDeletes(true /* blocks and waits for merges */);
} else if (maxNumSegments <= 0) {
    indexWriter.maybeMerge();
} else {
    indexWriter.forceMerge(maxNumSegments, true /* blocks and waits for merges */);
    this.forceMergeUUID = forceMergeUUID;
}

As a workaround I suspect that another force-merge with ?max_num_segments=10000 would set the force-merge ID without actually merging anything, bypassing the no-op detection on the snapshot. But in the long run I do agree that we should probably consider an ?only_expunge_deletes force-merge as a meaningful change here too.
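
As a concrete sketch of that suggested workaround (index name is a placeholder, and whether it really refreshes the force-merge UUID without rewriting any segments would need verifying):

curl -X POST "localhost:9200/my-index/_forcemerge?max_num_segments=10000"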

DaveCTurner reopened this on Nov 21, 2023
DaveCTurner added the :Distributed/Engine label and removed the needs:triage label on Nov 21, 2023
elasticsearchmachine added the Team:Distributed label on Nov 21, 2023
elasticsearchmachine (Collaborator) commented

Pinging @elastic/es-distributed (Team:Distributed)
