Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server/cache: reorganize the cache #359

Merged
merged 7 commits into from
Nov 6, 2016

Conversation

huachaohuang
Copy link
Contributor

First, stores and regions are reorganized so that we can easily access
the information we need.

Second, the algorithm to handle region heartbeat is modified
accordingly, to handle Split and Merge.

It's hard to handle Split and Merge perfectly to keep the cache always
reflecting the newest information about which region is serving which
range. But we can keep the cache always updating according to the region
heartbeat.

Here's how we handle region heartbeat. First, if the region does not
exist, it is added to the cache directly. Second, if the region exists
but the heartbeat is stale, an error will be returned. Finally, if the
region exists and the heartbeat is not stale, it is updated to the
cache. Additionaly, every time a region is added or updated, all range
overlap regions will be deleted from the search tree first, and then
the region will be inserted. Note that regions deleted from the search
tree can still be found by the region id.

This PR relies on #353 and #356.

@siddontang siddontang changed the title server/cache: reorganize the cache [DNM] server/cache: reorganize the cache Oct 23, 2016
@huachaohuang huachaohuang mentioned this pull request Oct 24, 2016
@huachaohuang huachaohuang changed the base branch from huachaohuang/add-region-info to master October 25, 2016 08:46
@huachaohuang huachaohuang changed the title [DNM] server/cache: reorganize the cache server/cache: reorganize the cache Oct 25, 2016
@huachaohuang
Copy link
Contributor Author

PTAL @siddontang @overvenus
cache.go and cache_test.go are totally rewrote, you can review the file directly.

@siddontang
Copy link
Contributor

A huge PR, changed lines > 1024. Can you split it?

huachaohuang added a commit that referenced this pull request Oct 27, 2016
This commit is separated from #359, it
remove some unnecessary code and wrap some code in clusterInfo.
@huachaohuang huachaohuang changed the base branch from master to huachaohuang/add-stores-info October 27, 2016 05:10
@huachaohuang huachaohuang changed the title server/cache: reorganize the cache [DNM] server/cache: reorganize the cache Oct 27, 2016
huachaohuang added a commit that referenced this pull request Oct 30, 2016
* server/cache: wrap some methods in clusterInfo

This commit is separated from #359, it
remove some unnecessary code and wrap some code in clusterInfo.
@huachaohuang huachaohuang changed the base branch from huachaohuang/add-stores-info to master October 31, 2016 10:41
@huachaohuang huachaohuang changed the title [DNM] server/cache: reorganize the cache server/cache: reorganize the cache Oct 31, 2016
@huachaohuang
Copy link
Contributor Author

It's hard to split this PR any more, they depend on each other, and half of them are tests.
PTAL @overvenus @siddontang

oldRegion, ok := r.regions[region.GetId()]
if !ok {
log.Fatalf("updateRegion for none existed region in regions - %v", region)
if region.Leader == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if no leader, do we need to clear up the leaders and followers cache for the region?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the region exists, we will first remove the original region and add a new one, see setRegion.
addRegion and removeRegion are actually helper functions, only setRegion will be used outside.

func (c *clusterInfo) getRegions() []*regionInfo {
c.RLock()
defer c.RUnlock()
return c.regions.getRegions()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seem it is still not thread-safe to use []*regionInfo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getRegions will clone the regions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have too many regions, this function may have bad performance, is it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but seems no where use this function except for the get regions API.

}

func (r *regionsInfo) addRegionCount(region *metapb.Region) {
// Remove from leaders and followers.
for _, peer := range region.GetPeers() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we meet following case:

1, add region a with peers 1,2,3,4
2, remove region a with peers 1,2,3

So the followers cache may have a uncleared peer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we remove a region, we will get the original region by id, which means that the removed region must be the same as the previous added region.

@siddontang
Copy link
Contributor

LGTM

PTAL @overvenus @qiuyesuifeng

Copy link
Member

@overvenus overvenus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -14,14 +14,12 @@
package server

import (
"bytes"
"math/rand"
"log"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use github.com/ngaut/log.

"sync"
"time"

"github.com/golang/protobuf/proto"
"github.com/gogo/protobuf/proto"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why here use gogo proto?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, goimports do that for me, seems they are mostly the same, and we do use gogo/protobuf in kvproto, I think it's OK.

regionsInfo will be rewrote later, we should not access it directly.
First, stores and regions are reorganized so that we can easily access
the information we need.

Second, the algorithm to handle region heartbeat is modified
accordingly, to handle Split and Merge.

It's hard to handle Split and Merge perfectly to keep the cache always
reflecting the newest information about which region is serving which
range. But we can keep the cache always updating according to the region
heartbeat.

Here's how we handle region heartbeat. First, if the region does not
exist, it is added to the cache directly. Second, if the region exists
but the heartbeat is stale, an error will be returned. Finally, if the
region exists and the heartbeat is not stale, it is updated to the
cache. Additionaly, every time a region is added or updated, all range
overlap regions will be deleted from the search tree first, and then
the region will be inserted. Note that regions deleted from the search
tree can still be found by the region id.
@huachaohuang huachaohuang merged commit af843d1 into master Nov 6, 2016
@huachaohuang huachaohuang deleted the huachaohuang/reorganize-cache branch December 7, 2016 10:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants