[map] Decrease of performance indexed search on v3.7.6 and above #11231

Closed
illian64 opened this issue Aug 26, 2017 · 6 comments

@illian64 illian64 commented Aug 26, 2017

Hello!
I found a performance regression in indexed search.
I put 200,000 objects into a map. Each object has two fields, for example String field1 and String field2.
field1 is not unique; field2 is unique. Both fields are indexed.
I then search for values by field1 and field2 100 times.

Predicate predicate = entryObject.get("field1").equal(value1)
    .and(entryObject.get("field2").equal(value2));

On version 3.7.5 and below, the search takes about 50 milliseconds.
On version 3.7.6 and above (including 3.8.4 and 3.9 EA), the search takes about 5000 milliseconds.

If field1 is unique, the search on version 3.7.6 and above takes about 50 milliseconds again.

I wrote a very simple test to reproduce it.

Am I doing something wrong?

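import com.hazelcast.config.Config;
import com.hazelcast.config.MapConfig;
import com.hazelcast.config.MapIndexConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.query.EntryObject;
import com.hazelcast.query.Predicate;
import com.hazelcast.query.PredicateBuilder;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.junit.Test;

import java.io.Serializable;
import java.util.UUID;

import static org.junit.Assert.assertEquals;

@Slf4j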
public class SoHazelcastTest {
    public static final int COUNT = 200_000;
    public static final String MAP = "map";

    @Test
    public void testUnique() throws Exception {
        test(true); //approximately 50 milliseconds on 3.7.5 and 3.7.6
    }

    @Test
    public void testNotUnique() throws Exception {
        test(false); //approximately 50 milliseconds on 3.7.5 and approximately 5000 milliseconds on 3.7.6
    }

    private void test(boolean uniqueValue1) {
        Config hazelcastConfig = new Config();
        MapConfig mapConfig = hazelcastConfig.getMapConfig(MAP);
        mapConfig.addMapIndexConfig(new MapIndexConfig("field1", false));
        mapConfig.addMapIndexConfig(new MapIndexConfig("field2", false));
        HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance(hazelcastConfig);

        log.info("Fill in");
        for (int i = 0; i < COUNT; i++) {
            String value1 = uniqueValue1 ? "v1" + i : "v1";
            hazelcastInstance.getMap(MAP).put(UUID.randomUUID(), new StoredObject(value1, "value2_" + i));
        }

        log.info("Start search");
        long startTime = System.nanoTime();
        for (int i = 0; i < 100; i++) {
            String value1 = uniqueValue1 ? "v1" + i : "v1";
            EntryObject entryObject = new PredicateBuilder().getEntryObject();
            Predicate predicate = entryObject.get("field1").equal(value1)
                    .and(entryObject.get("field2").equal("value2_" + i));
            assertEquals(1, hazelcastInstance.getMap(MAP).values(predicate).size());
        }
        long durationInMills = (System.nanoTime() - startTime) / 1000000;
        log.info("Finish search, durationInMills = {}", durationInMills);

        Hazelcast.shutdownAll();
    }

    @Data
    @NoArgsConstructor
    @AllArgsConstructor
    private static class StoredObject implements Serializable {
        private String field1;
        private String field2;
    }
}

@pveentjer pveentjer commented Aug 27, 2017

@illian64 thanks for the reproducer and nailing it down to the patch release!

@pveentjer pveentjer changed the title Decrease of performance indexed search on v3.7.6 and above [map] Decrease of performance indexed search on v3.7.6 and above Aug 27, 2017
@mmedenjak mmedenjak added this to the 3.10 milestone Feb 19, 2018

@jerrinot jerrinot commented Mar 8, 2018

@taburet: isn't this a duplicate of the issue you have been working on?


@taburet taburet commented Mar 8, 2018

@jerrinot not sure yet, but most likely not, since here the degradation affects non-unique indexes only. I'm going to check this tomorrow.

@taburet taburet self-assigned this Mar 8, 2018

@taburet taburet commented Mar 9, 2018

I have profiled 3.7.5, 3.7.6 and 3.10-SNAPSHOT. It turns out the slowdown is caused by result set copying for indexed queries, which was introduced in 3.7.6. The copy makes the returned result set more consistent: index modifications performed after the result set is obtained do not affect its contents. In the provided test, this means a collection of 200k items is copied on every query invocation in the non-unique index case.
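
To make the cost described above concrete, here is a minimal sketch in plain Java. It is a toy model, not Hazelcast's index implementation: the class `IndexCopySketch` and its methods are hypothetical names, and the index is assumed to be a simple map from attribute value to a bucket of entry keys. It contrasts handing out a live view of the bucket with copying the bucket on every query.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of a non-unique index: attribute value -> keys of matching entries.
// Illustration only; this is NOT Hazelcast's index implementation.
class IndexCopySketch {
    private final Map<String, Set<Integer>> buckets = new HashMap<>();

    void put(String value, int key) {
        buckets.computeIfAbsent(value, v -> new HashSet<>()).add(key);
    }

    // NEVER-like: return a read-only live view of the bucket. No per-query copy,
    // but index updates made later are visible through the returned set.
    Set<Integer> lookupLive(String value) {
        Set<Integer> bucket = buckets.get(value);
        return bucket == null ? Collections.<Integer>emptySet()
                              : Collections.unmodifiableSet(bucket);
    }

    // COPY_ON_READ-like: snapshot the bucket on every query. The cost grows with
    // the bucket size, but the result is isolated from later index updates.
    Set<Integer> lookupCopy(String value) {
        Set<Integer> bucket = buckets.get(value);
        return bucket == null ? new HashSet<Integer>() : new HashSet<>(bucket);
    }

    public static void main(String[] args) {
        IndexCopySketch index = new IndexCopySketch();
        for (int i = 0; i < 200_000; i++) {
            index.put("v1", i); // every entry shares the same non-unique value
        }

        long start = System.nanoTime();
        for (int i = 0; i < 100; i++) {
            index.lookupLive("v1").contains(i);
        }
        long liveMs = (System.nanoTime() - start) / 1_000_000;

        start = System.nanoTime();
        for (int i = 0; i < 100; i++) {
            index.lookupCopy("v1").contains(i); // copies 200k keys each time
        }
        long copyMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("live view: " + liveMs + " ms, copy per query: " + copyMs + " ms");
    }
}
```

With a unique value per entry, each bucket holds a single key, so the copy is cheap. That matches the observation that the regression shows up only in the non-unique case.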

In the 3.9 series, the hazelcast.index.copy.behavior option was introduced to control this copying. If it is set to NEVER, the behavior and performance are exactly the same as in 3.7.5. The default value is COPY_ON_READ, which provides stricter consistency guarantees than NEVER. See Copying Indexes and IndexCopyBehavior for more in-depth details.
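
As a sketch of how the workaround might be applied, the snippet below sets the option as a JVM system property before the Hazelcast instance is created. It assumes Hazelcast picks up `hazelcast.*` properties from system properties; the class and method names are hypothetical. In the reproducer, calling `setProperty` with the same name and value on the `Config` object should be an equivalent programmatic route.

```java
// Hypothetical helper showing the workaround; only the property name and the
// NEVER value come from the discussion above.
class IndexCopyBehaviorWorkaround {
    static final String PROPERTY = "hazelcast.index.copy.behavior";

    // Assumption: this must run before Hazelcast.newHazelcastInstance(...) so the
    // member reads the value at startup.
    static void useNeverCopy() {
        System.setProperty(PROPERTY, "NEVER");
    }

    public static void main(String[] args) {
        useNeverCopy();
        System.out.println(PROPERTY + "=" + System.getProperty(PROPERTY));
    }
}
```

Note the trade-off: NEVER restores the 3.7.5 performance but gives up the stricter consistency that COPY_ON_READ provides, so later index modifications can affect a result set that was already returned.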


@mmedenjak mmedenjak commented Mar 12, 2018

Closing, as the slowdown stems from correctness constraints and there is a workaround.

@mmedenjak mmedenjak closed this Mar 12, 2018

@illian64 illian64 commented Mar 13, 2018

Thanks for your answer!
