Percolation requests can be executed before all percolator queries are loaded #10722

Closed
antonha opened this issue Apr 22, 2015 · 5 comments

antonha commented Apr 22, 2015

The percolator can sometimes fail to match queries right after shard recovery. Version observed in: 1.5.1.

The percolator keeps all queries in an in-memory collection (one per shard), which it populates from the index at startup. It does this by registering a listener with the IndicesLifecycle; the listener loads the queries when afterIndexShardPostRecovery is called.

This does not seem to block the shard from being reported as initialised, so loading is sometimes still in progress when the cluster reports yellow status. If a percolate request arrives before all queries have been loaded into the in-memory structure, the response erroneously reports no matches.
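
To make the ordering problem concrete, here is a minimal single-threaded model of it. This is a sketch, not Elasticsearch code: the class `PercolatorRaceSketch` and its methods `markStarted`, `loadQueries` and `percolate` are hypothetical names invented for illustration, and the "query" is reduced to a predicate on one field value.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical model of one shard's percolator registry (not Elasticsearch code).
class PercolatorRaceSketch {
    // In-memory registry: query id -> predicate over the document's field value.
    private final Map<String, Predicate<String>> queries = new ConcurrentHashMap<>();
    private volatile boolean started = false;

    // Models recovery finishing: the shard is reported as started
    // even though the registry has not been populated yet.
    void markStarted() {
        started = true;
    }

    // Models the post-recovery listener loading the stored queries.
    void loadQueries(Map<String, Predicate<String>> stored) {
        queries.putAll(stored);
    }

    // Counts the registered queries that match the given field value.
    long percolate(String fieldValue) {
        if (!started) {
            throw new IllegalStateException("shard not started");
        }
        return queries.values().stream().filter(q -> q.test(fieldValue)).count();
    }

    // Runs the problematic ordering: started, percolate, load, percolate.
    static long[] demo() {
        PercolatorRaceSketch shard = new PercolatorRaceSketch();
        shard.markStarted();
        long before = shard.percolate("b"); // registry still empty
        Predicate<String> matchesB = v -> v.equals("b");
        shard.loadQueries(Collections.singletonMap("1", matchesB));
        long after = shard.percolate("b");  // registry populated
        return new long[] {before, after};
    }

    public static void main(String[] args) {
        long[] counts = demo();
        System.out.println("before=" + counts[0] + " after=" + counts[1]);
    }
}
```

In this model the first percolate call sees an empty registry and reports zero matches, while the same call after loading reports one. In the real system the two steps run concurrently, which is why the failure is intermittent.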

I've been unable to write a test that fails deterministically. The test below sometimes exposes the error (by failing); for me, it fails roughly one run in five or six:

package org.elasticsearch.test.integration;

import org.elasticsearch.action.admin.cluster.health.ClusterHealthResponse;
import org.elasticsearch.action.percolate.PercolateRequestBuilder;
import org.elasticsearch.action.percolate.PercolateResponse;
import org.elasticsearch.action.percolate.PercolateSourceBuilder;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.Requests;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.percolator.PercolatorService;
import org.testng.annotations.AfterClass;
import org.testng.annotations.Test;

import java.io.IOException;
import java.util.concurrent.ExecutionException;

import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
import static org.elasticsearch.index.query.QueryBuilders.matchQuery;
import static org.hamcrest.CoreMatchers.equalTo;
import static org.hamcrest.CoreMatchers.is;
import static org.hamcrest.MatcherAssert.assertThat;

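/**
 * Restarts a node and percolates as soon as the cluster reports yellow,
 * to expose percolate requests that run before the registered queries are reloaded.
 */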
public class RecoveryTests extends AbstractNodesTests {
    @AfterClass
    public void closeNodes() {
        closeAllNodesAndClear();
    }

    @Test(enabled = true)
    public void testRestartNode() throws IOException, ExecutionException, InterruptedException {
        Settings extraSettings = ImmutableSettings.settingsBuilder()
                .put("index.gateway.type", "local").build();

        logger.info("--> Starting one node");
        startNode("node1", extraSettings);
        Client client = client("node1");

        logger.info("--> Add dummy doc");
        client.admin().indices().prepareDelete("_all").execute().actionGet();
        client.prepareIndex("test", "type", "1").setSource("field", "value").execute().actionGet();

        logger.info("--> Register query");
        client.prepareIndex("test", PercolatorService.TYPE_NAME, "1")
                .setSource(jsonBuilder()
                                .startObject()
                                .field("query", matchQuery("field", "b"))
                                .field("id", 1)
                                .field("group", "g1")
                                .field("query_hash", "hash1")
                                .endObject()
                ).setRefresh(true).execute().actionGet();
        logger.info("--> Restarting node");
        closeNode("node1");
        startNode("node1", extraSettings);
        client = client("node1");
        logger.info("Waiting for cluster health to be yellow");
        waitForYellowIndices(client);

        logger.info("--> Percolate doc with field=b");
        PercolateResponse response = new PercolateRequestBuilder(client).setIndices("test").setDocumentType("type")
                .setSource(new PercolateSourceBuilder().setDoc(PercolateSourceBuilder.docBuilder().setDoc(jsonBuilder().startObject().field("_id", "1").field("field", "b").endObject())))
                .execute().actionGet();

        assertThat(response.getCount(), is(1L));

        logger.info("--> Restarting node again (This will trigger another code-path since translog is flushed)");
        closeNode("node1");
        startNode("node1", extraSettings);
        client = client("node1");
        logger.info("Waiting for cluster health to be yellow");
        waitForYellowIndices(client);

        logger.info("--> Percolate doc with field=b");
        response = new PercolateRequestBuilder(client).setIndices("test").setDocumentType("type")
                .setSource(new PercolateSourceBuilder().setDoc(PercolateSourceBuilder.docBuilder().setDoc(jsonBuilder().startObject().field("_id", "1").field("field", "b").endObject())))
                .execute().actionGet();

        assertThat(response.getCount(), is(1L));

    }

    private void waitForYellowIndices(Client client) {
        ClusterHealthResponse health = client.admin().cluster().health(Requests.clusterHealthRequest(new String[]{}).waitForYellowStatus().waitForActiveShards(5)).actionGet();
        assertThat(health.isTimedOut(), equalTo(false));
    }
}

There are similar percolator tests that are supposed to cover the same thing. From what I understand, though, those are very particular about the cluster setup. Might that be why they can't catch this issue?

@clintongormley clintongormley added discuss :Search/Percolator Reverse search: find queries that match a document labels Apr 25, 2015

ekesken commented May 12, 2015

We are seeing the same problem during auto-scaling of our Elasticsearch cluster while running batch operations.

I described our problem and shared a script to reproduce it in the following Stack Overflow post:

http://stackoverflow.com/questions/30194246/percolate-returns-empty-matches-under-heavy-load-during-elasticsearch-cluster-re

Is there any workaround that we can apply? It's critical for our scenario not to miss any content during percolation.

ekesken commented May 13, 2015

Is there a way to check that every shard is really OK before sending percolation requests? Obviously, checking for green status does not work.
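
One client-side idea, offered only as a sketch and not as a confirmed workaround from the maintainers: register a known "canary" query and poll a canary document until it matches before sending real traffic. The helper below is plain Java; the class `PercolatorReadinessProbe`, the method `awaitCanary` and its parameters are hypothetical names. The `LongSupplier` is assumed to wrap a percolate call that returns the match count (for example `response.getCount()` as in the test above).

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Hypothetical client-side readiness probe (not part of Elasticsearch).
class PercolatorReadinessProbe {

    /**
     * Polls canaryMatchCount until it returns at least expectedMatches,
     * or until timeoutMillis has elapsed. Returns true if the canary matched.
     */
    static boolean awaitCanary(LongSupplier canaryMatchCount, long expectedMatches,
                               long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (canaryMatchCount.getAsLong() >= expectedMatches) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without seeing the canary match
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Simulated percolate call: reports no match on the first two polls.
        int[] calls = {0};
        boolean ready = awaitCanary(() -> ++calls[0] >= 3 ? 1L : 0L, 1L, 2000L, 10L);
        System.out.println("ready=" + ready + " after " + calls[0] + " polls");
    }
}
```

This only narrows the window. A canary match shows that the canary query itself has been loaded on its shard, so covering every shard would need one canary per shard, and other queries on the same shard may still be loading when the canary first matches.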

@clintongormley

@martijnvg can we somehow not mark a percolation shard as active until the percolator queries have been loaded?
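
The suggestion can be pictured with a small model in which "active" is derived from load completion rather than set independently. This is an illustrative sketch only: `GatedShardSketch` and its methods are hypothetical, and it is not the change made in #11799, whose details this thread does not show.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Predicate;

// Hypothetical model: the shard only counts as active once its queries are loaded.
class GatedShardSketch {
    // Completed exactly once, when the stored queries have been loaded.
    private final CompletableFuture<Map<String, Predicate<String>>> loaded =
            new CompletableFuture<>();

    // Models the post-recovery load finishing.
    void finishLoading(Map<String, Predicate<String>> queries) {
        loaded.complete(queries);
    }

    // "Active" is tied to load completion, so the two cannot disagree.
    boolean isActive() {
        return loaded.isDone();
    }

    // Rejects requests instead of silently answering from an empty registry.
    long percolate(String fieldValue) {
        if (!isActive()) {
            throw new IllegalStateException("percolator queries not loaded yet");
        }
        return loaded.join().values().stream().filter(q -> q.test(fieldValue)).count();
    }

    public static void main(String[] args) {
        GatedShardSketch shard = new GatedShardSketch();
        System.out.println("active before load: " + shard.isActive());
        Predicate<String> matchesB = v -> v.equals("b");
        shard.finishLoading(Collections.singletonMap("1", matchesB));
        System.out.println("active after load: " + shard.isActive()
                + ", matches: " + shard.percolate("b"));
    }
}
```

The design point is that a request arriving too early fails loudly (or could be made to wait) instead of returning a misleading "no matches" response.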

@clintongormley

(Note: this is not new in 1.5.1; it has worked this way since the beginning.)

@clintongormley

Closed by #11799
