feat: add cluster-client #191

Merged
merged 1 commit into from
Jan 9, 2017
Conversation

gxcsoccer (Contributor) commented Jan 7, 2017

Checklist
  • npm test passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)
Description of change
  • Built-in cluster-client
  • Documentation

@mention-bot

@gxcsoccer, thanks for your PR! By analyzing the history of the files in this pull request, we identified @popomore, @fengmk2 and @atian25 to be potential reviewers.

@codecov-io

codecov-io commented Jan 7, 2017

Current coverage is 97.38% (diff: 89.65%)

Merging #191 into master will decrease coverage by 0.29%

@@             master       #191   diff @@
==========================================
  Files            34         33     -1   
  Lines           863        881    +18   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits            843        858    +15   
- Misses           20         23     +3   
  Partials          0          0          

Powered by Codecov. Last update df3d7d4...bea6485

fengmk2 (Member) commented Jan 8, 2017

Nice! And it even comes with documentation.

@@ -56,4 +57,29 @@ module.exports = {
},
};
},

/**
* 将客户端封装为 "cluster" 模式 (Wrap the client in "cluster" mode)
Member

All code comments must be written in English.

@@ -272,4 +273,29 @@ module.exports = {
};
},

/**
* 将客户端封装为 "cluster" 模式 (Wrap the client in "cluster" mode)
Member

State explicitly that this one is forced to be a follower; that is the only difference from the agent version.

TBD
## Background

As everyone knows, Nodejs is single-process and single-threaded. To make full use of multi-core CPUs, it officially provides the `cluster` module, which lets multiple processes listen on the same port. This approach is mature and widely used, but it also brings extra overhead and problems. For example, some middleware needs to keep a long-lived connection to a server. Ideally one machine should hold only one such connection, because it is a scarce resource, but cluster mode creates n times as many connections (n = number of processes). In addition, some jobs can only be done by a single process, such as log rotation. So we need a way to coordinate responsibilities and shared resources across processes.
Member

Change 大家 ("everyone") to 我们 ("we").

Member

Refer to Nodejs uniformly as node.

Member

Only JavaScript execution is single-threaded; node itself uses multiple threads internally for I/O. Check how others have described this.

- Clients are divided into two roles:
- Leader: the instance responsible for maintaining the connection to the server; there is only one Leader per kind of client
- Follower: a proxy-like "fake" instance that connects to the Leader over a socket and forwards requests to it
- At startup, clients determine the Leader by competing for a local port. For example, every instance tries to listen on port 7777; only one will get it and become the Leader, and the rest are Followers
Member

The final period is missing.

| |
```

## How to use it in Egg
Member

Prefer "the framework" (框架) over "egg" wherever possible, so that readers keep a consistent context when the egg docs are later merged with the chair docs.

const startCluster = require('egg-cluster').startCluster;

module.exports = (options, callback) => {
options = options || {};
options.customEgg = options.customEgg || path.join(__dirname, '../..');
startCluster(options, callback);
utils.getFreePort((err, port) => {
Member

Shouldn't this be added to egg-cluster?

Contributor Author

I thought about it, and keeping it in egg seems better. Putting it in egg-cluster might make its purpose unclear.

Member

egg-cluster currently exists only to serve egg, so process-related logic is better placed in cluster.

Contributor Author

If it goes into egg-cluster, adding a feature means changing two modules, and the unit tests have to be split in two as well.

@@ -1,10 +1,18 @@
'use strict';

const path = require('path');
const utils = require('../core/util');
const startCluster = require('egg-cluster').startCluster;

module.exports = (options, callback) => {
options = options || {};
options.customEgg = options.customEgg || path.join(__dirname, '../..');
Member

I'll remove this later: egg-cluster should require customEgg to be passed and should not set a default.

* - {Number} [maxWaitTime] - maximum time a Follower waits for the Leader to start, 30 seconds by default
* @return {ClientWrapper} the wrapped instance
*/
cluster(clientClass, options) {
Member

After adding this, can the worker client be removed?

Contributor Author

Yes, in a separate PR.

@@ -1,6 +1,339 @@
title: Cluster
title: Multi-Process Model
Member

Should the process model and the client model be documented separately? There is almost nothing about processes here.

Contributor Author

I've added that. It is all in one document for now and can be split later if needed.

options.port = this._options.clusterPort;
// the agent worker acts as the leader
options.isLeader = true;
options.logger = this.coreLogger;
Member

Isn't the convention now not to pass a logger?

Contributor Author

That would require reworking cluster-client.

Member

cluster-client itself uses a logger too?

gxcsoccer (Contributor Author)

Still revising the docs.

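popomore (Member) commented Jan 8, 2017

Document the multi-process model and cluster-client separately. The process model should mainly cover the relationship between the master/agent/app processes, how they are started, what the options do, and inter-process communication. cluster-client is mainly for communication between the clients living in those processes.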

fengmk2 (Member) commented Jan 9, 2017

@gxcsoccer Are the changes done?


- Start multiple processes on the server at the same time
- Every process runs the same source code (as if the work of one process were split among several)
- Even more remarkably, these processes can listen on the same port at the same time (for how this works, see section 9.4, "Cluster模块", of 朴灵's book 《深入浅出 Nodejs》).
Member

@JacksonTian's article has no public link; use @DavidCai1993's post on how cluster is implemented instead: https://cnodejs.org/topic/56e84480833b7c8a0492e20c

gxcsoccer (Contributor Author)

@fengmk Please go ahead and review. I'm at the hospital and will address the review comments when I have time.


Among them:

- The process responsible for starting the others is called the Master process. It is like a "contractor" (“包工头”): it does no actual work itself and only starts the other processes.
Member

『包工头』: shouldn't Chinese quotation marks be written like this?

Member

Sogou input method settings

[image]

Contributor Author

Cool, learned something new.


The example above is simple, isn't it? But an enterprise-grade solution has much more to consider.

- What should happen after a worker process exits abnormally?
Member

Use one spelling consistently: it is "Worker" above, "worker" here, and "Worker" again below.


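### Exception handling

Robustness is something every enterprise application must consider. Besides ensuring the quality of the application code itself, the framework also needs to provide a "safety net" mechanism that keeps the application available in extreme cases.
Member

健壮性是企业级应用必须考虑的问题 ("Robustness is something every enterprise application must consider")

健壮性(又叫鲁棒性)是企业级应用必须考虑的问题 (the same sentence with "also called 鲁棒性" added after 健壮性)

Mentioning 鲁棒性 lets the English translation map onto "robustness" more accurately.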

1. Close all TCP servers in the current process (quickly drop existing connections and stop accepting new ones) and disconnect the IPC channel to the Master, so that the process can exit gracefully;
2. Once a Worker process has "died", the Master forks a new Worker so that the total number of "workers" online stays the same.

In the node community, modules such as supervisor, forever, nodemon and pm2 do similar things, but each has its pitfalls to some degree or is too heavyweight. So we developed two modules, [graceful](https://github.com/node-modules/graceful) and [egg-cluster](https://github.com/eggjs/egg-cluster), which have been validated over a long period in production at Alibaba and Ant and have proven to be a relatively reliable solution.
Member

Add external links for supervisor, forever, nodemon and pm2.

Member

I don't think we need to say the others have problems here, since we can't point to specifics right now. Isn't it more that we don't need that many complex features, so we built these two libraries, which have proven stable enough and lean enough?
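
The "fork a replacement when a Worker dies" step quoted above can be sketched roughly as follows. This is a minimal illustration, not the egg-cluster implementation: `superviseWorkers` and `fakeFork` are hypothetical names, and the fork function is injected so the sketch runs without spawning real processes.

```javascript
'use strict';

const { EventEmitter } = require('events');

// Keep `count` workers alive: whenever one emits 'exit', fork a replacement.
// `forkFn` is injected (an assumption of this sketch) so that it can run
// without real child processes.
function superviseWorkers(forkFn, count) {
  const workers = new Set();

  function spawn() {
    const worker = forkFn();
    workers.add(worker);
    worker.once('exit', () => {
      workers.delete(worker);
      spawn(); // keep the total number of online workers unchanged
    });
    return worker;
  }

  for (let i = 0; i < count; i++) spawn();
  return workers;
}

// Hypothetical stand-in for a Worker: only the 'exit' event is modelled.
function fakeFork() {
  return new EventEmitter();
}
```

In a real Master process one would presumably pass `() => cluster.fork()` from Node's `cluster` module, whose workers emit `'exit'` with the exit code and signal.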

close all | <----------+ |
tcp servers | |
| disconnect |
| ------------------------> |
Member

+------------------------> |

```

1. After starting, the Master first forks the Agent process
2. Once the Agent has initialized successfully, it notifies the Master over the IPC channel that it is ready (它已经 ready)
Member

Remove "它已经 ready" ("that it is ready").


### Using the Agent

You can implement your own logic in agent.js under the root directory of an application or plugin (usage is similar to app.js, except that the entry argument is the agent object)
Member

Mark agent.js as inline code; every file path needs that.

```js
// ${baseDir}/agent.js

module.exports = function(agent) {
Member

agent => {}: non-generator functions should all be arrow functions, right?

```js
// ${baseDir}/app.js

module.exports = function(app) {
Member

Same as above.

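- How do we decide who is the Leader and who is a Follower? There are two modes:
- Free competition mode: at startup, clients determine the Leader by competing for a local port. For example, every instance tries to listen on port 7777; only one will get it and become the Leader, and the rest are Followers.
- Forced assignment mode: the framework designates one Leader, and the rest are Followers
- In egg (egg 里面) we use forced assignment mode: a Leader can only be created in the Agent, which also matches our positioning of the Agent
Member

egg 里面 ("in egg")

框架里面 ("in the framework")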
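
For readers unfamiliar with the free competition mode mentioned in the quoted doc, here is a minimal sketch of electing a Leader by racing for a local port, using only Node's built-in `net` module. The function name `competeForLeader` and the shape of its result are assumptions for illustration. Note that the doc says the framework itself uses forced assignment instead.

```javascript
'use strict';

const net = require('net');

// Try to listen on `port`. Resolves with { role: 'leader', server } if the
// port was free, or { role: 'follower' } if another instance already holds it.
function competeForLeader(port) {
  return new Promise((resolve, reject) => {
    const server = net.createServer();
    server.once('error', err => {
      if (err.code === 'EADDRINUSE') {
        // Someone else won the race; this instance becomes a Follower.
        resolve({ role: 'follower' });
      } else {
        reject(err);
      }
    });
    server.listen(port, '127.0.0.1', () => resolve({ role: 'leader', server }));
  });
}
```

A caller would typically invoke `competeForLeader(7777)` in every process and have Followers connect to that same port as clients afterwards.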

We abstract client interfaces into the following two categories, which also serves as a specification for client interfaces. Clients that conform to it can be wrapped into Leader/Follower mode automatically

- Subscribe / publish (subscribe / publish)
- The subscribe interface takes two arguments: the first is the subscription info, the second is the subscription callback
Member

  • subscribe(info, listener)
  • publish(info)
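
A client that follows the `subscribe(info, listener)` / `publish(info)` shape suggested above might look like the sketch below. `MemoryRegistryClient` is a hypothetical in-memory stand-in, and the `dataId` and `publishData` fields on `info` are assumed names for this example rather than anything the framework requires.

```javascript
'use strict';

const { EventEmitter } = require('events');

// Hypothetical in-memory client following the subscribe(info, listener) /
// publish(info) shape. `info.dataId` and `info.publishData` are assumed
// field names for this sketch.
class MemoryRegistryClient {
  constructor() {
    this._emitter = new EventEmitter();
    this._latest = new Map(); // dataId -> last published value
  }

  // Register `listener` for a key; replay the latest value if there is one.
  subscribe(info, listener) {
    this._emitter.on(info.dataId, listener);
    if (this._latest.has(info.dataId)) listener(this._latest.get(info.dataId));
  }

  // Store the value and notify every current subscriber of that key.
  publish(info) {
    this._latest.set(info.dataId, info.publishData);
    this._emitter.emit(info.dataId, info.publishData);
  }
}
```

Keeping both methods to a single plain `info` object plus an optional callback is what makes it plausible to forward such calls over a socket from a Follower to a Leader.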


### Usage in detail

Below I use a simple example to show how to make a client support Leader/Follower mode in Egg (在 Egg 里面)
Member

介绍在 Egg 里面 ("show ... in Egg")

介绍在框架里面 ("show ... in the framework")


const RegistryClient = require('registry_client');

module.exports = function(agent) {
Member

agent => {}


module.exports = function(agent) {
const done = agent.readyCallback('register_client', {
isWeakDep: agent.config.runMode === 0,
Member

This has been removed.

};
```

- Step 3: in ${baseDir}/app.js, wrap RegistryClient using the app.cluster interface
Member

app.js

const co = require('co');
const RegistryClient = require('registry_client');

module.exports = function(app) {
Member

app => {}

Member

Change it to function* (app) {, since co is used below.


module.exports = function(app) {
const done = app.readyCallback('register_client', {
isWeakDep: app.config.runMode === 0,
Member

Same as above.

publishData: 'xxx',
});

co(function*() {
Member

co is no longer needed now.


app.js
```js
module.exports = function(agent) {
Member

This should be app, not agent, right?

module.exports = function(agent) {
agent.mockClient = agent.cluster(MockClient)
.delegate('sub', 'subscribe')
.create();
Member

The method as implemented on the app side is still sub, right? Add an example.

agent.mockClient = agent.cluster(MockClient)
.delegate('sub', 'subscribe')
// add a method named xxx
.override('xxx', function() {
Member

Give it a meaningful name, and show how users would call it.


If you want to add some APIs on top of the original client, you can use the override API

app.js
Member

Is this app or agent?

Contributor Author

It's app.js
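
To make the `delegate` / `override` discussion above more concrete, here is a toy builder that mimics the chained style seen in the quoted snippets. It is an illustration only and not the cluster-client implementation: `ToyWrapperBuilder`, the `type` tag and the `subOnce` method in the usage note are all hypothetical.

```javascript
'use strict';

// Toy builder, NOT the cluster-client implementation: it only illustrates the
// idea of mapping a custom method name to a standard interface type
// (delegate) and attaching an extra method to the wrapper (override).
class ToyWrapperBuilder {
  constructor(ClientClass) {
    this._ClientClass = ClientClass;
    this._delegates = new Map(); // standard name -> method name on the client
    this._overrides = new Map(); // extra method name -> function
  }

  delegate(from, to) {
    this._delegates.set(to, from);
    return this;
  }

  override(name, fn) {
    this._overrides.set(name, fn);
    return this;
  }

  create(...args) {
    const client = new this._ClientClass(...args);
    const wrapper = {};
    for (const [standard, method] of this._delegates) {
      // The wrapper keeps the client's own method name (e.g. `sub`);
      // the standard name records which interface type it belongs to.
      wrapper[method] = (...a) => client[method](...a);
      wrapper[method].type = standard;
    }
    for (const [name, fn] of this._overrides) wrapper[name] = fn.bind(wrapper);
    return wrapper;
  }
}

// Hypothetical client whose subscribe-style method is named `sub`.
class MockClient {
  sub(info, listener) {
    listener(`data for ${info.key}`);
  }
}
```

With this sketch, `new ToyWrapperBuilder(MockClient).delegate('sub', 'subscribe').override('subOnce', fn).create()` yields an object on which callers still use `sub`, plus the extra `subOnce` method, which matches the reviewers' point that the app-side method name stays `sub`.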

@gxcsoccer gxcsoccer force-pushed the feat-add-cluster branch 2 times, most recently from 05d8f31 to 55e0703 Compare January 9, 2017 15:31
gxcsoccer (Contributor Author)

All fixed.

@fengmk2 fengmk2 removed the WIP label Jan 9, 2017
@fengmk2 fengmk2 added this to the v1.x milestone Jan 9, 2017
fengmk2 (Member) left a comment

LGTM

callback(err);
return;
}
options.clusterPort = port;
Member

@popomore will migrate this part.

| exit |
+-------------------------> | +---------+
| | | Worker |
die | fork a new +----+----+
Member

The fork happens at the same time as the disconnect. Move this part up a bit, otherwise it is easy to misread.

In addition, a few more things to note about the Agent Worker:

1. Because App Workers depend on the Agent, App Workers can only be forked after the Agent has finished initializing
2. Although the Agent is the App Workers' "little secretary" (“小秘”), business-related work should not be placed on the Agent, or it may be overwhelmed
Member

『小秘』

fengmk2 (Member) commented Jan 9, 2017

@gxcsoccer Update the diagram and then merge it yourself. Also fix the commit log; it is missing an "f".


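In addition, passing data through messenger is relatively inefficient, because it is relayed through the Master; if the IPC channel runs into problems, it may even bring down the Master process.

So is there a better way? Yes: our framework provides something called "Cluster Client" to reduce the complexity of this scenario.
Member

"Provides something called 'Cluster Client'" (提供了一种叫 “Cluster Client”) does not read smoothly. If it is a pattern, shouldn't it be called something other than Cluster Client?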

+-----------------------------------------------------------------------------------------------+
```

1. After connecting to the server, the Follower first sends a register channel packet (the channel concept is introduced to distinguish different types of clients)
Member

Say "after connecting to the local server" (连接上 local server 后).

gxcsoccer (Contributor Author) commented Jan 9, 2017

@dead-horse @fengmk2 Done. I'll merge once CI finishes.

@popomore popomore mentioned this pull request Jan 9, 2017
31 tasks
@gxcsoccer gxcsoccer merged commit 00b7eb3 into master Jan 9, 2017
@gxcsoccer gxcsoccer deleted the feat-add-cluster branch January 9, 2017 16:48
fengmk2 (Member) commented Jan 9, 2017

The docs path is wrong; I'll fix it.

6 participants