
Not precise PER_NODE capacity calculation algorithm #11646

Closed
Mak-Sym opened this issue Oct 23, 2017 · 7 comments

Comments

@Mak-Sym Mak-Sym commented Oct 23, 2017

While upgrading Hazelcast from hazelcast-3.5.2, I observed some weird behavior that made some of my tests fail. Please consider the following simple unit test:

package com.hazelcast.cache.eviction;

import com.google.common.collect.Lists;
import com.hazelcast.config.Config;
import com.hazelcast.config.EntryListenerConfig;
import com.hazelcast.config.EvictionPolicy;
import com.hazelcast.config.InMemoryFormat;
import com.hazelcast.config.MapConfig;
import com.hazelcast.config.MapIndexConfig;
import com.hazelcast.config.MaxSizeConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;
import com.hazelcast.test.HazelcastSerialClassRunner;
import com.hazelcast.test.annotation.QuickTest;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.junit.experimental.categories.Category;
import org.junit.runner.RunWith;

import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.equalTo;

@RunWith(HazelcastSerialClassRunner.class)
@Category(QuickTest.class)
public class SimplePutGetTest {
    private HazelcastInstance hazelcast;

    @Before
    public void setUp() {
        hazelcast = Hazelcast.newHazelcastInstance(new Config());
    }

    @After
    public void tearDown() {
        Hazelcast.shutdownAll();
    }

    @Test
    public void testPut() throws InterruptedException {
        hazelcast.getConfig().addMapConfig(new MapConfig("test")
                .setMaxSizeConfig(new MaxSizeConfig(50, MaxSizeConfig.MaxSizePolicy.PER_NODE))
                .setTimeToLiveSeconds(3600)
                .setBackupCount(1)
                .setMaxIdleSeconds(3600)
                .setEvictionPolicy(EvictionPolicy.LRU)
                .setAsyncBackupCount(0)
                .setInMemoryFormat(InMemoryFormat.BINARY)
                .setReadBackupData(false)
                .setNearCacheConfig(null)
                .setMapStoreConfig(null)
                .setWanReplicationRef(null)
                .setEntryListenerConfigs(Lists.<EntryListenerConfig>newArrayList())
                .setMapIndexConfigs(Lists.<MapIndexConfig>newArrayList())
                .setMergePolicy("com.hazelcast.map.merge.PutIfAbsentMapMergePolicy")
        );
        final IMap<String, String> map = hazelcast.getMap("test");

        map.put("one", "two");

        assertThat(map.get("one"), equalTo("two")); // <- fails here! we get null but expect "two"
    }
}

This unit test passes in hazelcast-3.5.x, but fails starting from version 3.6.x.
After some debugging, it turned out that the failure was caused by two improvements:
https://github.com/hazelcast/hazelcast/pull/6671/files
https://github.com/hazelcast/hazelcast/pull/6674/files

But those fixes look incorrect for the following reasons:

  • The new eviction formula partitionSize > perNodeCapacity * nodesCount / numberOfPartitions is not always correct. In the case above, each partition contains at most one entry, yet eviction is still triggered. This is wrong, as my instance still has capacity for 49 more entries (see the arithmetic sketch after this list). To make it worse, versions 3.6 and 3.7 don't even log a warning about it
  • If, in the test above, the line .setEvictionPolicy(EvictionPolicy.LRU) is commented out (or the eviction policy is set to NONE), this test passes. That's not intuitive at all.
  • If max-size is less than the number of partitions, all entries are silently evicted immediately after the map PUT operation
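
To make the first point concrete, here is a minimal arithmetic sketch of that formula (a sketch only: the constants assume the default 271 partitions and the single member from the test above, and the threshold computation paraphrases the linked PRs rather than quoting the actual Hazelcast source):

public class PerNodeThresholdSketch {
    public static void main(String[] args) {
        int maxSizePerNode = 50;   // MaxSizeConfig(50, PER_NODE) from the test above
        int memberCount = 1;       // a single embedded instance
        int partitionCount = 271;  // Hazelcast's default partition count

        // the 3.6+ check evicts when:
        // partitionSize > maxSizePerNode * memberCount / partitionCount
        double perPartitionThreshold = (double) maxSizePerNode * memberCount / partitionCount;
        System.out.println(perPartitionThreshold); // ~0.1845

        // any partition holding even a single entry exceeds ~0.18, so the very
        // first put becomes evictable although the node could hold 49 more entries
    }
}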

So a proposed fix for the PER_NODE capacity calculation would be something like this:

protected boolean checkPerNodeEviction(RecordStore recordStore, MaxSizeConfig maxSizeConfig) {
    Collection<Integer> ownedPartitions = mapServiceContext.getOwnedPartitions();
    int totalSize = 0;
    // sum this map's record store sizes across all locally owned partitions
    for (Integer ownedPartition : ownedPartitions) {
        PartitionContainer partitionContainer = mapServiceContext.getPartitionContainer(ownedPartition);
        if (partitionContainer != null) {
            RecordStore store = partitionContainer.getExistingRecordStore(recordStore.getName());
            if (store != null) {
                totalSize += store.size();
            }
        }
    }

    // evict only when this member actually holds more entries than the configured per-node max
    return totalSize > maxSizeConfig.getSize();
}

And deletion of the just-inserted object has to be avoided in EvictorImpl.java:

private EntryView selectEvictableEntry(RecordStore recordStore, Data excludedKey) {
    Iterable<EntryView> samples = getSamples(recordStore);
    EntryView excluded = null;
    EntryView selected = null;

    for (EntryView candidate : samples) {
        // it doesn't make sense to insert a value and then immediately delete it
        if (excludedKey != null && getDataKey(candidate).equals(excludedKey)) {
            // remember the excluded entry as a last resort, in case every sample is excluded
            if (excluded == null) {
                excluded = candidate;
            }
            continue;
        }

        if (selected == null) {
            selected = candidate;
        } else if (mapEvictionPolicy.compare(candidate, selected) < 0) {
            selected = candidate;
        }
    }

    // fall back to the excluded entry only when no other candidate was sampled
    return selected == null ? excluded : selected;
}

Does it make sense?

Another issue is that those changes are poorly documented. Setting the per-node capacity to a value bigger than the number of partitions wasn't a requirement in 3.5. The documentation for 3.6 added it as a requirement, but without a clear explanation of what happens if max-size is set to a value lower than the partition count:

If you use this option, please note that you cannot set the max-size to a value lower than the partition count

As the unit test above shows, it's possible to set max-size lower than the number of partitions if the eviction policy is set to NONE.

And one last thing: in the release notes for 3.6, the link to the corresponding issue is broken (it's under the "3.6-EA2 Fixes" section):

Fixed wrong calculation of eviction removal size when PER_NODE max-size policy is used. [6675]

(it opens a PR for "Remove XML namespace injection during schema validation")

@mmedenjak mmedenjak added this to the 3.10 milestone Oct 23, 2017
@vbekiaris vbekiaris (Contributor) commented Nov 20, 2017

@Mak-Sym thanks a lot for the detailed issue report. Since Hazelcast version 3.8.3, a warning is logged when the per-node max size value does not allow any values to be inserted into the map, like this:

18:17:57,810  WARN |testPut| - [EvictionChecker] hz._hzInstance_2_dev.partition-operation.thread-5 - [127.0.0.1]:5702 [dev] [3.10-SNAPSHOT] The max size configuration for map "test" does not allow any data in the map. Given the current cluster size of 3 members with 271 partitions, max size should be at least 91.
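
For context, the suggested minimum in that warning is simply the smallest max size for which the per-partition threshold admits at least one entry; a quick sketch of the arithmetic (my paraphrase of the check, not the actual source):

int partitionCount = 271;
int memberCount = 3;
// smallest maxSize such that maxSize * memberCount / partitionCount >= 1
int minMaxSize = (int) Math.ceil((double) partitionCount / memberCount);
System.out.println(minMaxSize); // prints 91, matching the warning above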

Logging was introduced with #10821. I see the reasoning behind your attempt to find the exact number of entries stored on the local member, but there are some important caveats:

  • The eviction check is executed on a partition thread; traversing record stores in other partitions will have thread-visibility issues
  • It is prone to races
  • Eviction checks are performed frequently, and the cost of traversing the record stores to sum up their sizes is much higher than the cost of the current approach.

If, in the test above, the line .setEvictionPolicy(EvictionPolicy.LRU) is commented out (or the eviction policy is set to NONE), this test passes. That's not intuitive at all.

The reference manual and the XSD for the declarative configuration both state that max-size has no effect when the eviction policy is NONE (which is also its default value). Maybe a runtime validation that logs a warning when a non-default max-size is combined with the NONE eviction policy would be good here? A rough sketch follows. /cc @ahmetmircik wdyt?
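
A rough sketch of such a validation (the method name and its call site are hypothetical; the config getters are the 3.x API to the best of my knowledge):

private void warnOnIneffectiveMaxSize(MapConfig mapConfig, ILogger logger) {
    MaxSizeConfig maxSizeConfig = mapConfig.getMaxSizeConfig();
    boolean nonDefaultMaxSize = maxSizeConfig.getSize() != MaxSizeConfig.DEFAULT_MAX_SIZE;
    if (nonDefaultMaxSize && mapConfig.getEvictionPolicy() == EvictionPolicy.NONE) {
        logger.warning("Map \"" + mapConfig.getName() + "\" sets max-size to "
                + maxSizeConfig.getSize() + " but its eviction policy is NONE,"
                + " so the max-size setting will have no effect.");
    }
}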

@Mak-Sym Mak-Sym (Author) commented Nov 20, 2017

Thank you for the explanation, @vbekiaris.

I agree that the current algorithm is faster but, on the other hand, we only need to traverse the local partitions - is that operation really so expensive? It is definitely more expensive than a few arithmetic operations, but it doesn't require remote invocations or anything like that, so it should still be pretty fast. Am I right, or did I miss something?

Now, if we decide to stick with it, does it make sense to perform the put operation at all when the per-node capacity (according to the new algorithm) is < 1 and the eviction policy is not NONE? Instead of adding the data and then immediately removing it on the partition thread (which can be a remote operation), it would be better to avoid the put operation entirely, or to exempt just-inserted data from eviction; a sketch of the former follows.
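
A tiny sketch of that guard (all names here are hypothetical, including calculatePerNodeMaxRecordStoreSize, purely to illustrate the idea):

// before scheduling the put on the partition thread:
double perNodeCapacity = calculatePerNodeMaxRecordStoreSize(recordStore);
if (perNodeCapacity < 1 && mapConfig.getEvictionPolicy() != EvictionPolicy.NONE) {
    // the new entry would be evicted right away, so skip (or reject) the put
    return;
}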

Also, the issue link in the 3.6 release notes has to be fixed (please see my comment above).

@vbekiaris vbekiaris (Contributor) commented Nov 22, 2017

I agree that the current algorithm is faster but, on the other hand, we only need to traverse the local partitions - is that operation really so expensive? It is definitely more expensive than a few arithmetic operations, but it doesn't require remote invocations or anything like that, so it should still be pretty fast.

You are right, no remote operations are involved. Still, the main concern is that accessing other partitions' record stores from another partition thread is incompatible with Hazelcast's threading model.
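
To illustrate the constraint with plain Java (hypothetical names, not Hazelcast internals): each partition is served by exactly one owning thread, so safely summing record store sizes means submitting a task to every owning thread and waiting for the results, instead of reading foreign record stores directly:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PartitionThreadingSketch {
    public static void main(String[] args) throws Exception {
        int partitionCount = 4;
        // one single-threaded executor per partition: the "partition thread"
        ExecutorService[] partitionThreads = new ExecutorService[partitionCount];
        int[] partitionSizes = new int[partitionCount]; // stand-in for record store sizes
        for (int p = 0; p < partitionCount; p++) {
            partitionThreads[p] = Executors.newSingleThreadExecutor();
            partitionSizes[p] = p + 1;
        }

        // reading partitionSizes[p] from a foreign thread would race with the
        // owning thread's writes; the safe pattern is one task per owner
        List<Future<Integer>> futures = new ArrayList<>();
        for (int p = 0; p < partitionCount; p++) {
            final int partitionId = p;
            futures.add(partitionThreads[p].submit(() -> partitionSizes[partitionId]));
        }
        int totalSize = 0;
        for (Future<Integer> f : futures) {
            totalSize += f.get(); // blocking: the extra cost paid on every eviction check
        }
        System.out.println("total = " + totalSize); // prints "total = 10"

        for (ExecutorService e : partitionThreads) {
            e.shutdown();
        }
    }
}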

Also, the issue link in the 3.6 release notes has to be fixed (please see my comment above).

/cc @Serdaro, can this be fixed?

@devOpsHazelcast devOpsHazelcast (Contributor) commented Dec 7, 2017

It's working as designed; I agree we should improve the documentation.

@Serdaro Serdaro (Member) commented Dec 19, 2017

Fixed the link in the 3.6 release notes. Thank you, @Mak-Sym.

@Mak-Sym Mak-Sym (Author) commented Jan 3, 2018

Thank you, @Serdaro!
@vbekiaris @devOpsHazelcast, if the capacity calculation works as designed, does it make sense to avoid evicting the just-inserted object? For example, never allow maxExpectedRecordStoreSize to fall below 1:

protected boolean checkPerNodeEviction(RecordStore recordStore) {
    double maxExpectedRecordStoreSize = calculatePerNodeMaxRecordStoreSize(recordStore);
    // never let the computed per-node capacity drop below a single entry
    if (maxExpectedRecordStoreSize < 1) {
        maxExpectedRecordStoreSize = 1;
    }
    return recordStore.size() > maxExpectedRecordStoreSize;
}
or update the EvictorImpl implementation to skip removing the just-inserted value:

private EntryView selectEvictableEntry(RecordStore recordStore, Data excludedKey) {
    Iterable<EntryView> samples = getSamples(recordStore);
    EntryView excluded = null;
    EntryView selected = null;

    for (EntryView candidate : samples) {
        // it doesn't make sense to insert a value and then immediately delete it
        if (excludedKey != null && getDataKey(candidate).equals(excludedKey)) {
            // remember the excluded entry as a last resort, in case every sample is excluded
            if (excluded == null) {
                excluded = candidate;
            }
            continue;
        }

        if (selected == null) {
            selected = candidate;
        } else if (mapEvictionPolicy.compare(candidate, selected) < 0) {
            selected = candidate;
        }
    }

    // fall back to the excluded entry only when no other candidate was sampled
    return selected == null ? excluded : selected;
}

Does that make sense?

@Mak-Sym Mak-Sym (Author) commented Jan 25, 2018

I created PR #12195 with a fix proposal, so I'm closing this issue for now.
