issue=#1030 bugfix of multi-load at master initial phase #1038

Merged
merged 1 commit into from
Nov 14, 2016

@00k 00k commented Oct 17, 2016

@baidubot

Reviewers: @lylei @caijieming-baidu @taocp

    VLOG(8) << "OFFLINE Tablet with empty addr, " << tablet;
} else if (!tabletnode_manager_->FindTabletNode(server_addr, &node)) {
    tablet->SetStatus(kTableOffLine);
    VLOG(8) << "OFFLINE Tablet of Dead TS, " << tablet;
Collaborator

Shouldn't we first check whether the tablet is still in the not-init state before setting it to offline? The ts may have been kicked.

Collaborator Author

@caijieming-baidu That is already checked a few lines above; please see line 395.

Collaborator

What I mean is: between the init check above and the set-offline call, is there a mechanism that guarantees no state transition can happen?

Collaborator Author

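Yes, it is guaranteed. This function runs during the initialization phase, when no other thread operates on tablets yet. The only exception is events triggered by zk, and that case is guarded by tabletnode_mutex_.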

Collaborator

OK, no problem with this one then.
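
The guarantee discussed in this thread can be illustrated with a minimal sketch. It is not Tera's code: `InitGuardSketch`, `SetOfflineIfNotInit`, `OnTabletLoaded`, and the `TabletStatus` enum are hypothetical names, and the only thing borrowed from the thread is the idea that the init-state check and the offline transition happen while one mutex (named `tabletnode_mutex_` here, after the comment) is held, with event handlers taking the same mutex.

```cpp
#include <mutex>

// Hypothetical stand-ins for the tablet states named in the diff.
enum class TabletStatus { kNotInit, kOffLine, kReady };

struct Tablet {
    TabletStatus status = TabletStatus::kNotInit;
};

class InitGuardSketch {
public:
    // Initialization path: the state check and the transition to offline run
    // under one lock, so no other holder of the lock can interleave.
    bool SetOfflineIfNotInit(Tablet* tablet) {
        std::lock_guard<std::mutex> lock(tabletnode_mutex_);
        if (tablet->status != TabletStatus::kNotInit) {
            return false;  // already moved out of the init state, leave it alone
        }
        tablet->status = TabletStatus::kOffLine;
        return true;
    }

    // Event path (for example a coordination-service callback): it takes the
    // same lock, so it runs entirely before or entirely after the call above.
    void OnTabletLoaded(Tablet* tablet) {
        std::lock_guard<std::mutex> lock(tabletnode_mutex_);
        tablet->status = TabletStatus::kReady;
    }

private:
    std::mutex tabletnode_mutex_;
};
```

With this shape, a tablet that an event handler has already moved out of the init state is left untouched by the initialization path, which is the property the reviewer asked about.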

    VLOG(8) << "OFFLINE Tablet of Dead TS, " << tablet;
} else if (node->state_ == kReady) {
    tablet->SetStatus(kTableOffLine);
    VLOG(8) << "OFFLINE Tablet of Alive TS, " << tablet;
    TryLoadTablet(tablet, server_addr);
Collaborator

Same as above.

Collaborator Author

@caijieming-baidu Same reply as above.

dead_node_tablet_list.push_back(tablet);
// Ts not response, we count its tablets as Ready and wait for it to be kicked.
tablet->SetStatus(kTableReady);
VLOG(8) << "UNKNOWN Tablet of No-Response TS, " << tablet;
Collaborator

The ts-kicked event does not handle tablets that are in the init state. If the tablet is set to ready at this point, could it end up orphaned, with nothing managing it?

Collaborator Author

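@caijieming-baidu This function acquires tabletnode_mutex_ on entry, the same lock used by DeleteTabletNode/AddTabletNode, so handling of events such as a ts being kicked is serialized with it. There is no problem here.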
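
The serialization argument above can be sketched as follows. This is an illustration under assumptions, not the code in the commit: `RestoreSketch`, `RestoreUnresponsive`, `StateOf`, and the data layout are hypothetical, and `DeleteTabletNode` here only borrows the name from the comment. The sketch shows one way a tablet marked Ready for an unresponsive ts is still picked up later, because the kick handler cannot run in the middle of the init-time pass.

```cpp
#include <mutex>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for the tablet states named in the diff.
enum class TabletState { kNotInit, kOffLine, kReady };

struct TabletInfo {
    std::string name;
    std::string server_addr;
    TabletState state = TabletState::kNotInit;
};

class RestoreSketch {
public:
    void AddTablet(const std::string& name, const std::string& addr) {
        std::lock_guard<std::mutex> lock(tabletnode_mutex_);
        TabletInfo t;
        t.name = name;
        t.server_addr = addr;
        tablets_.push_back(t);
    }

    // Init-time pass: tablets on servers that did not answer are counted as
    // Ready and left for the kick handler to clean up.
    void RestoreUnresponsive(const std::set<std::string>& silent_servers) {
        std::lock_guard<std::mutex> lock(tabletnode_mutex_);
        for (TabletInfo& t : tablets_) {
            if (t.state == TabletState::kNotInit &&
                silent_servers.count(t.server_addr) > 0) {
                t.state = TabletState::kReady;
            }
        }
    }

    // Kick handler: takes the same mutex, so it sees either none or all of
    // the pass above. Ready tablets of the removed server go offline.
    std::vector<std::string> DeleteTabletNode(const std::string& addr) {
        std::lock_guard<std::mutex> lock(tabletnode_mutex_);
        std::vector<std::string> offlined;
        for (TabletInfo& t : tablets_) {
            if (t.server_addr == addr && t.state == TabletState::kReady) {
                t.state = TabletState::kOffLine;
                offlined.push_back(t.name);
            }
        }
        return offlined;
    }

    // Returns kNotInit for unknown names; good enough for a sketch.
    TabletState StateOf(const std::string& name) {
        std::lock_guard<std::mutex> lock(tabletnode_mutex_);
        for (const TabletInfo& t : tablets_) {
            if (t.name == name) {
                return t.state;
            }
        }
        return TabletState::kNotInit;
    }

private:
    std::mutex tabletnode_mutex_;
    std::vector<TabletInfo> tablets_;
};
```

Because both methods take the same mutex, a kick that arrives during restoration waits until the pass has finished and then finds the tablets in the Ready state, so none of them is left unmanaged.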

uint32_t meta_num = response->tabletmeta_list().meta_size();
for (uint32_t i = 0; i < meta_num; i++) {
Collaborator

Some tablets may already be loaded on such a new ts.

Collaborator Author

@caijieming-baidu We have to assume the new ts has no tablets loaded; otherwise 'multi-load' would be inevitable.
This is fundamental to Tera's design.

Collaborator

If the new ts has no tablets, why query it?

Collaborator Author

To make sure this TS is alive and to switch its status to kReady.

Collaborator

@caijieming-ng caijieming-ng Nov 3, 2016

zk/nexus notifies the master when a ts restarts, so there is no need to query it.

Collaborator Author

Good point. Do you mind if I remove the query logic and call this function within AddTabletNode?

Collaborator

I don't mind.
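
The direction agreed on in this thread, dropping the query and acting on the notification alone, might look roughly like the sketch below. Everything in it is an assumption made for illustration: `NodeRegistrySketch`, `NodeInfo`, `IsReady`, and `TabletCount` are hypothetical, and `AddTabletNode` only reuses the name mentioned in the comment. It does not show what the merged commit does.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <string>

// Hypothetical node states; only kReady appears in the diff.
enum class NodeState { kOffLine, kReady };

struct NodeInfo {
    NodeState state = NodeState::kOffLine;
    std::size_t tablet_count = 0;
};

class NodeRegistrySketch {
public:
    // Called when the coordination service reports a newly started ts. The
    // notification itself is taken as proof of liveness, so no query is sent,
    // and the node is assumed to start with no tablets loaded.
    void AddTabletNode(const std::string& addr) {
        std::lock_guard<std::mutex> lock(tabletnode_mutex_);
        NodeInfo info;
        info.state = NodeState::kReady;
        nodes_[addr] = info;
    }

    bool IsReady(const std::string& addr) {
        std::lock_guard<std::mutex> lock(tabletnode_mutex_);
        std::map<std::string, NodeInfo>::const_iterator it = nodes_.find(addr);
        return it != nodes_.end() && it->second.state == NodeState::kReady;
    }

    // Returns 0 for unknown nodes as well as for freshly added ones.
    std::size_t TabletCount(const std::string& addr) {
        std::lock_guard<std::mutex> lock(tabletnode_mutex_);
        std::map<std::string, NodeInfo>::const_iterator it = nodes_.find(addr);
        return it == nodes_.end() ? 0 : it->second.tablet_count;
    }

private:
    std::mutex tabletnode_mutex_;
    std::map<std::string, NodeInfo> nodes_;
};
```

The design choice being illustrated is that a freshly registered node starts Ready with zero tablets, which matches the assumption stated earlier in the thread that a new ts has nothing loaded.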

@caijieming-ng

LGTM

@caijieming-ng caijieming-ng merged commit 925e73d into baidu:master Nov 14, 2016
@00k 00k deleted the master_init_bug branch November 14, 2016 05:35
caijieming-ng pushed a commit to caijieming-ng/tera that referenced this pull request Nov 24, 2016
5 participants