Skip to content

Conversation

@DuodenumL
Copy link
Contributor

关于资源插件化的PR。

设计想法:对于core来说,它不需要知道各种资源的作用&分配方式,只需要把请求的参数转发给资源插件即可。资源插件会进行资源的具体分配工作,并对core返回engine args,用于后续engine层创建workload。

资源插件采用可执行文件的方式,core会加载指定目录下的所有可执行文件作为资源插件。

在创建workload的流程中,eru-cli要把所有关于资源的请求参数转化为map[string][]string的形式,例如{"cpu": ["1.2"]}。由于core不知道哪些参数对应哪个资源插件,所以它会把所有的参数依次转发给所有的资源插件。各个插件自己负责解析自己需要的参数,并做出正确的处理。

由于要实现插件化,所以资源相关的信息并不会再存储在node metadata里了,而是由各个资源插件自己负责存储,存储也不一定用ETCD,处理得当的话MySQL / MongoDB之类的也行(就是部署麻烦点)。

因此在workload metadata中,关于资源的描述换成了map[string]map[string][]string的方式(是不是看着有点绕),例如:

{
  "cpu-plugin": {
    "cpu": ["1.2"],
    "file": ["some-file"]
  },
  "mem-plugin": {
    "mem": ["1PB"],
    "file": ["some-other-file"]
  }
}

要分别记录各个资源插件分配的资源,否则在rollback / remove要释放资源的时候不知道怎么释放。

主要要做的事情:

  • 把当前资源处理的逻辑改成插件的形式。
  • cluster层和engine层要基于resource args的改动做出变化。

@DuodenumL
Copy link
Contributor Author

针对当前commit的一些说明:

  • cluster/calcalcium就是“使用资源插件的calcium”,为了防止混乱复制出来一份,不然IDE一顿报错难顶。这里大部分的文件没啥改动,主要就是在create.go及与其关联的其它文件上有变动。
  • PluginManager目前还没抽象化,但是Plugin做成了Interface。先撸实现再优化吧...

@CMGS
Copy link
Contributor

CMGS commented Oct 15, 2021

所以我应该看哪些- -

@DuodenumL
Copy link
Contributor Author

所以我应该看哪些- -

现在完成度还没那么高,主要动的是

  • calcalium/create.go以及这里面调用到的各种东西
  • resources/plugin.go + manager.go

Copy link
Member

@jschwinger233 jschwinger233 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

整体方向还行, 就是有些实现有点别扭, 现在先不用在意.

res[node] = &resourcetypes.NodeResourceInfo{
NodeName: node,
Capacity: utils.Min(info1.Capacity, info2.Capacity),
Rate: info1.Rate + info2.Rate*info2.Weight,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

加权平均

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里是加权平均,L134的是当前带加权的总和,最后会除以总数来算平均

UnlockNodes(ctx context.Context, nodes []string) error

// GetAvailableNodes returns available nodes and total capacity
GetAvailableNodes(ctx context.Context, rawRequest RawParams) (map[string]*types.NodeResourceInfo, int, error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

名字不好, 之后换一个.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are only two hard things in Computer Science...

utils/file.go Outdated

// ListAllExecutableFiles returns all the executable files in the given path
func ListAllExecutableFiles(path string) ([]string, error) {
entries, err := os.ReadDir(path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filepath.Walk 轻松重构

}

// ResourcePluginOptions .
type ResourcePluginOptions map[string][]string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interface map 呢

types/options.go Outdated
type DeployOptions struct {
ResourceOpts ResourceOptions
//ResourceOpts ResourceOptions
ResourceOpts ResourcePluginOptions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为啥不叫 ResourceOptions

@DuodenumL DuodenumL force-pushed the feature/plugin branch 4 times, most recently from 221192b to b12f3f8 Compare October 21, 2021 04:51
@DuodenumL DuodenumL force-pushed the feature/plugin branch 3 times, most recently from cae47ef to 39843ee Compare October 29, 2021 09:47
@DuodenumL
Copy link
Contributor Author

纪念一下,终于能build了...下一步是把cpu和mem先写成插件,持续debug

return node, err
// node, err := c.scheduler.MaxIdleNode(nodes)
// todo: node, err := c.resource.MaxIdleNode(nodes)
node := nodes[0]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

标记一下,这里还得改改...

计划是让每个plugin都实现max idle node,并且返回一个优先级,选择优先级最大的那个node。

实际操作的时候只有cpu插件返回比较大的优先级。

// Plugin resource plugin
type Plugin interface {
// LockNodes locks the given nodes
LockNodes(ctx context.Context, nodes []string) error
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里也是有点纠结,感觉插件这边不一定需要有lock的功能,依赖core里那个ETCD的锁也不是不行?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个已经讨论过了, 先依然用 core 的锁, 如果有什么问题再来讨论

core.go Outdated

if config.Profile != "" {
http.Handle("/metrics", metrics.Client.ResourceMiddleware(cluster)(promhttp.Handler()))
// todo: http.Handle("/metrics", metrics.Client.ResourceMiddleware(cluster)(promhttp.Handler()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这是做啥

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

metrics里删了一大堆东西,后续也许还要补上,这里放一个todo...

// `prepare` should be a pure calculation process without side effects.
// `commit` writes the calculation result of `prepare` into database.
// if `commit` returns error, `rollback` will be performed.
func PCR(ctx context.Context, prepare func() error, commit func() error, rollback func() error) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

按照之前的风格也应该叫 Pcr

// `commit` writes the calculation result of `prepare` into database.
// if `commit` returns error, `rollback` will be performed.
func PCR(ctx context.Context, prepare func() error, commit func() error, rollback func() error) error {
err := prepare()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if err := prepare(); err != nil { bla bla } 紧凑点

log.Errorf(ctx, "[PCR] failed to prepare, err: %v", err)
return err
}
err = commit()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

同样这里也是,风格保持一致

}
return err
}
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

而且按照这个注释写的逻辑,我感觉其实用 Txn 包一下就可以了啊

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

类似于 if prepare then commit else rollback,为什么需要一个新函数

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

其实是完全可以用Txn来替换的,用这个PCR主要是为了体现Prepare过程是无副作用的,这样感觉逻辑清晰一些...

目前的设计里plugin的大部分API都是纯计算无副作用的,例如Alloc / Realloc / Remap,它们就适用于Prepare过程

types/config.go Outdated
MaxConcurrency int64 `yaml:"max_concurrency" default:"20"` // concurrently call single runtime in the same time
Store string `yaml:"store" default:"etcd"` // store type

ResourcePluginDir string `yaml:"resource_plugin_dir" default:"/etc/eru/resource_plugins"` // resource plugins path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plugins,注意是复数

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

另外折腾了这里的话,你的 yaml sample 也得改

log.Errorf(nil, "[SendNodeInfo] Error occurred while sending cpu used data to statsd: %v", err) //nolint
}
}
//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WTF?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

因为Node里的ResourceMeta也换成map[string]interface{}了,所以不能直接用之前这个metrics,这里等流程跑通了之后要改...

//
//// ResourceMiddleware to make sure update resource correct
//// todo: remove resource types
//func (m *Metrics) ResourceMiddleware(cluster cluster.Cluster) func(http.Handler) http.Handler {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WTF??


func FromTemplate() lock.DistributedLock {
m := &DistributedLock{}
m.On("Lock", mock.Anything).Return(context.Background(), nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这是用在哪里的,既不是测试又不是 mock?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个是store的FromTemplate里调用的...

if err != nil {
return res, err
}
err = json.Unmarshal(body, &res)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

紧凑点

if err != nil {
return nil, err
}
return files, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

files := []string{}
	if err := filepath.Walk(basedir, func(path string, info fs.FileInfo, err error) error {
		if err != nil {
			return err
		}
		if info.IsDir() && path != basedir {
			return filepath.SkipDir
		}
		if isExecutable(info.Mode().Perm()) {
			files = append(files, path)
		}
		return nil
	}); err!= nil {
        return nil, err
    }

可以写紧凑点的

}

message StringSlice {
repeated string slice = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

好像没有必要单独定义这个 message

UpdateNodeResourceUsage(ctx context.Context, node string, resourceArgs []coretypes.RawParams, direction bool) error

// UpdateNodeResourceCapacity updates node resource capacity
UpdateNodeResourceCapacity(ctx context.Context, node string, resourceOpts coretypes.RawParams, direction bool) error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

node 这个名字不好, 这个仓库里 node 变量都是 *types.Node 类型, string 类型的变量名是 nodename

rollbackPluginChan <- plugin
}
})
close(rollbackPluginChan)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rollbackPluginChan 听起来像是要回滚的 plugin, 但是 len()==all 的时候又不回滚, 这个逻辑虽然正确但是很奇怪, 想办法调整一下

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

我终于知道这里哪里最别扭了, 是你把 fan-out fan-in 的逻辑的一部分放到 callPlugins 里, 而这个拆分的结果是, 如你在当前这一层来观察 rollbackPluginChan 这个 chan, 你会发现没有并发场景去使用它, 但是如果改成 []Plugin slice 也不对, 因为并发被 callPlugins 藏起来的, 但是必要的并发同步却又留给了调用方.
我觉得这里可以使用泛型了, callPlugins[T any](xxx, func(Plugin) T) []T, 让 callPlugin 内部的 goroutine 完成所有的同步, 最后返回 slice.

// Plugin resource plugin
type Plugin interface {
// LockNodes locks the given nodes
LockNodes(ctx context.Context, nodes []string) error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个已经讨论过了, 先依然用 core 的锁, 如果有什么问题再来讨论

UnlockNodes(ctx context.Context, nodes []string) error

// SelectAvailableNodes returns available nodes and total capacity
SelectAvailableNodes(ctx context.Context, nodes []string, resourceOpts coretypes.RawParams) (map[string]*types.NodeResourceInfo, int, error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

声明类型的时候, 对那些读上去不知道是啥的最好加上变量名增加可读性, 比如这里的返回值 (map, int, error) 可以写成 (_ map, variableName int, _ error), 可以让人知道这个 int 是啥, 这样注释都可以直接省略


// GetNodeResourceInfo returns total resource info and available resource info of the node, format: {"cpu": 2}
// also returns diffs, format: ["node.VolumeUsed != sum(workload.VolumeRequest"]
GetNodeResourceInfo(ctx context.Context, node string, workloads []*coretypes.Workload, fix bool) (coretypes.RawParams, coretypes.RawParams, []string, error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个接口拆两个吧, Get 操作的同时在改数据...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

大力支持!之前的Get我就觉得有点奇怪...

// SetNodeResourceInfo sets both total node resource info and allocated resource info
// used for rollback of RemoveNode
// notice: here uses absolute values, not delta values
SetNodeResourceInfo(ctx context.Context, resourceCapacity coretypes.RawParams, resourceUsage coretypes.RawParams) error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个接口第三参数是 resourceUsage, 和 UpdateNodeResourceUsage 这个接口是不是语义重复了

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

还有 UpdateNodeResourceCapacity, 功能重复?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里的Set是设置绝对值,Update用的是相对值...

// Alloc allocates resource, returns engine args for each workload, format: [{"cpus": 1.2}, {"cpus": 1.2}]
// also returns resource args for each workload, format: [{"cpus": 1.2}, {"cpus": 1.2}]
// pure calculation
Alloc(ctx context.Context, node string, deployCount int, resourceOpts coretypes.RawParams) ([]coretypes.RawParams, []coretypes.RawParams, error)
Copy link
Member

@jschwinger233 jschwinger233 Nov 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

没看懂前两个 return 值是啥, 不如你把变量名加到返回值的类型声明里; RawParam 这个类型是只和资源有关的吗? 从名字来看像是能承载万物

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

打算把RawParams改成具体的类型,WorkloadResourceOpts / NodeResourceOpts / WorkloadResourceArgs / NodeResourceArgs / EngineArgs


// AddNode adds a node with requested resource, returns resource capacity and (empty) resource usage
// should return error if the node already exists
AddNode(ctx context.Context, node string, resourceOpts coretypes.RawParams) (coretypes.RawParams, coretypes.RawParams, error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

第二返回值是啥

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resource usage

Copy link
Member

@jschwinger233 jschwinger233 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

感觉还是 callPlugins 太恶心了, 改成泛型就好了

Remap(ctx context.Context, node string, workloadMap map[string]*coretypes.Workload) (map[string]coretypes.RawParams, error)

// UpdateNodeResourceUsage updates node resource usage
UpdateNodeResourceUsage(ctx context.Context, node string, resourceArgs []coretypes.RawParams, direction bool) error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

direction 是啥?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incr / Decr,表示usage的改变方向

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

那把这个类型建立一个 alias 吧, 不要用 bool, 语义模糊

// SelectAvailableNodes returns available nodes which meet all the requirements
// the caller should require locks
// pure calculation
func (pm *PluginManager) SelectAvailableNodes(ctx context.Context, nodes []string, resourceOpts types.RawParams) (map[string]*resourcetypes.NodeResourceInfo, int, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doCalculateCapacity 和 SelectAvailableNodes 这两个函数几乎拥有一样的语义吧, 我觉得前者的名字好一点, 然后似乎不需要 doCalculateCapacity 这个函数了

}

// Alloc .
func (pm *PluginManager) Alloc(ctx context.Context, node string, deployCount int, resourceOpts types.RawParams) ([]types.RawParams, []map[string]types.RawParams, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pm.Alloc 是有 IO 的而 plugin.Alloc 是纯计算的, 最好统一一下 Alloc 这个函数的语义

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

不只是Alloc,为了方便回滚 + PCR,plugin里大多数func都是纯计算的...但是在plugin manager里为了方便外部调用,就成了有IO的。

如果plugin的Alloc也有IO,那么plugin manager回滚的时候就不能统一调用UpdateNodeResourceUsage,而是要反向调用Alloc,也就是说Alloc本身要提供rollback的能力,感觉这样就啰嗦了很多。

*当然不可以在plugin manager里将所有resource args取反向然后直接UpdateNodeResourceUsage,因为默认core这边是不知道如何计算“反向”的,而且不是所有resource args都能通过加个负号就变成“反向”的。

所以这里的解决方案...可能是把plugin里的Alloc改成一个看起来无副作用的名字?

Copy link
Member

@jschwinger233 jschwinger233 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

有个问题, 我们对命令行接口好像有点理解不同, 我心想的其实是那种..以文档 spec 来规定接口的命令行, 比如 https://github.com/containernetworking/cni/blob/master/SPEC.md#cni-operations, spec 规定一个正确的实现应该有哪些命令行调用选项, 最后的实现方只要提供一个二进制满足 spec 就好;
但是你的接口好像是让 plugin 去实现一个 golang interface?

// deploy scheduled workloads on one node
func (c *Calcium) doDeployWorkloadsOnNode(ctx context.Context, ch chan *types.CreateWorkloadMessage, nodename string, opts *types.DeployOptions, deploy int, plans []resourcetypes.ResourcePlans, seq int) (indices []int, err error) {
logger := log.WithField("Calcium", "doDeployWorkloadsOnNode").WithField("nodename", nodename).WithField("opts", opts).WithField("deploy", deploy).WithField("plans", plans).WithField("seq", seq)
func (c *Calcium) doDeployWorkloadsOnNode(ctx context.Context, ch chan *types.CreateWorkloadMessage, nodename string, opts *types.DeployOptions, deploy int, engineArgs []types.RawParams, resourceArgs []map[string]types.RawParams, seq int) (indices []int, err error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个参数太多了, 手动格式化一下, 一个参数一行

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

借个楼,最终实现的时候是“提供一个二进制满足spec”的,plugin这个interface一方面是为了mock,一方面是“万一以后用rpc插件呢?”

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

我对这个 go interface 没意见, 我是看到你没有实现具体的命令行调用方式, 如果按照 spec 来调用, 那么命令行的调用方式都是一样的, 只需要换二进制.

if err != nil {
return res, err
}
if err = json.Unmarshal(body, &res); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

聪明!


command := exec.CommandContext(ctx, cmd, args...)
if err := command.Run(); err != nil {
log.Errorf(ctx, "[callBinaryPlugin] failed to run plugin %s, command %s %v, err %s", bp.path, cmd, args, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里 err 的时候输出一下 stderr, 非常有帮助

@jschwinger233
Copy link
Member

cluster 层的逻辑整体方向没太大问题, plugin 那里需要再讨论一下, 我可以直接看看你实现的一个具体的 plugin 仓库先.
不过还有很多实现细节要打磨.

@DuodenumL
Copy link
Contributor Author

用泛型重写了一下plugin manager,显著减少了编码过程的痛苦。但是现在各种工具对泛型的支持还不全,包括但并不限于:

  • Goland识别不出来[T any]里这个any,试了下VSCode也不太行
  • Goland还有各种奇奇怪怪的报错,比如说会提示func() int不能匹配func() T
  • mockery在Parse的过程中会报错,看了下现在mockery自己还在用go1.16。
  • go vet在泛型的地方会报错,导致go test的时候还要手动指定一下-vet=off
  • golangci-lint跟mockery有一样的问题。

mockery这个问题我想办法解决了:要pull下来mockery的代码,然后把go版本改成1.17,在go install的时候要这么用:`go install -gcflags=all=-G=3 -tags="typeparams"。但是golangci-lint还没搞...

package utils

// Map like map in Python
func Map[T1, T2 any](slice []T1, f func(T1) T2) []T2 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

太骚了,测试加起来,这部分一看 ut 的覆盖率急剧下降了

return ramInBytes, nil
}

flag := int64(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

测试 too

// `prepare` should be a pure calculation process without side effects.
// `commit` writes the calculation result of `prepare` into database.
// if `commit` returns error, `rollback` will be performed.
func Pcr(ctx context.Context, prepare func(ctx context.Context) error, commit func(ctx context.Context) error, rollback func(ctx context.Context) error, ttl time.Duration) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

缩写用 PCR

func (p *VolumePlan) UnmarshalJSON(b []byte) (err error) {
if *p == nil {
*p = VolumePlan{}
// StringSlice .
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unit test

@DuodenumL
Copy link
Contributor Author

迁移工具 & 回滚工具做好啦

@DuodenumL DuodenumL force-pushed the feature/plugin branch 2 times, most recently from 60efde5 to 38de27a Compare March 31, 2022 07:55
@DuodenumL DuodenumL force-pushed the master branch 2 times, most recently from 0c06f3a to 34dacc4 Compare April 4, 2022 09:15
Copy link
Member

@jschwinger233 jschwinger233 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

整体结构还行, 兼容性可能要考虑再改改.

Storage: resourceMeta.StorageLimit,
// note here will change the node resource meta (stored in resource plugin)
// todo: add wal here
engineArgs, deltaResourceArgs, resourceArgs, err = c.resource.Realloc(ctx, workload.Nodename, workload.ResourceArgs, opts.ResourceOpts)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里的 txn 有没有可能做得对称一点, 比如这里用 source.SetNodeResourceUsage + store.UpdateWorkload 来回滚 resource.Realloc, 语义是非常隐晦的, 最好是有对称的 API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

其实应该理解为Realloc的回滚是resource.SetNodeResourceUsage,store.UpdateWorkload回滚的是上面的store.UpdateWorkload,算是一一对应的,就是Realloc回滚这里名字不大对称...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

manager里把realloc和alloc的rollback包了一下,这样看起来就是对称的了。

@DuodenumL
Copy link
Contributor Author

准备合啦,接下来的工作都要基于插件化来做了。

@DuodenumL DuodenumL merged commit ac037e7 into projecteru2:master Apr 14, 2022
Aceralon added a commit to Aceralon/core that referenced this pull request May 19, 2022
Aceralon added a commit to Aceralon/core that referenced this pull request May 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants