v1.3.0
What's Changed
- Update ADOPTERS.md by @zhouzijiang in #1193
- koord-scheduler: optimize ElasticQuota preemption performance by @eahydra in #1196
- koordlet: improve koordlet profiling and logs by @xigang in #1180
- koord-scheduler: fix pods outside the Reservation preempt pods in the Reservation by @eahydra in #1197
- chore: update license ignore by @zwzhang0107 in #1202
- ci: include test folder in license check by @jasonliu747 in #1203
- koordlet: cleanup runtimehooks and remove the unused code. by @xigang in #1208
- e2e: add reservation reserves cpu cores tests by @eahydra in #1210
- koordlet: improve newKubeletStub method. by @xigang in #1212
- koordlet: fix statsinformer interface function implementation. by @xigang in #1209
- koordlet: fix koordlet batchresource when batch resource limit is not set by @saintube in #1222
- koordlet: reduce the frequency of updating NodeResourceTopology by @wangxiaoq in #1221
- koordlet: support cpuset for system qos pods by @zwzhang0107 in #1215
- koord-manager: add webhook to check slo-config by @chzhj in #1205
- koord-scheduler: support System QOS reserved cpus by @eahydra in #1225
- apis: Node Resource Reservation supports apply policy by @eahydra in #1217
- tests: add tests for LoadAwareScheduling estimator by @eahydra in #1227
- koordlet: revise metrics labels and states_informer by @saintube in #1234
- koord-scheduler: optimize the criteria for whether the Reservation is allocated by @ZiMengSheng in #1239
- koord-descheduler: LowNodeLoad supports configuring resourceWeights by @eahydra in #1240
- koord-manager: refactor config validation by github.com/go-playground by @chzhj in #1233
- koordlet: adjust node comparison method to improve program performance by @xigang in #1223
- koordlet: support block I/O QoS by @TheBeatles1994 in #1144
- koord-scheduler: skip daemonset usageThreshold check by @ZiMengSheng in #1243
- koordlet: update dockerfile and yaml file of koordlet by @TheBeatles1994 in #1254
- koordlet: disable unnecessary collector and informer is blkio qos fea… by @zwzhang0107 in #1253
- koordlet: support sandbox cpuset for rund by @zwzhang0107 in #1248
- koordlet: compare the nodeMetric generation to improve the nodemetric informer updateFunc performance by @xigang in #1255
- koordlet: generate mock states informer by @zwzhang0107 in #1257
- koord-descheduler: add leaderelection metric by @ZiMengSheng in #1242
- koord-manager: add args for common controllers by @saintube in #1235
- bugfix: when cpuset is an empty string, error log by @leason00 in #1259
- koord-manager: make fieldIndex register more scalable by @eahydra in #1261
- apis: add system usage in NodeMetric by @zwzhang0107 in #1262
- koordlet: improve metric GC by @saintube in #1267
- chore: more comments in injectForOrigin function by @Tiana2018 in #1264
- refactor: replace deprecated pointer function with the recommended pointer function by @Gala-R in #1271
- apis: defined Reservation.Spec.AllocateOnce as pointer by @eahydra in #1273
- koordlet: fix host cgroup check failed when the config is not the default by @saintube in #1275
- koord-manager: add controllers options by @saintube in #1274
- koord-scheduler: take over snapshot generation by @eahydra in #1268
- koord-scheduler: support select reservation via reservation affinity by @eahydra in #1265
- koordlet: change metric cache to tsdb by @zwzhang0107 in #1228
- koord-scheduler: refactor ReservationInfo as public by @eahydra in #1280
- koordlet: fix log error by @wangxiaoq in #1281
- koordlet: fix some unstable ut by @zwzhang0107 in #1282
- koord-scheduler: optimize the signature of the transformer interface by @eahydra in #1283
- koord-scheduler: refactor deviceShare nodeDevice by @eahydra in #1285
- koord-manager: fix variable typo by @stulzq in #1286
- runtime-proxy: extract annotations from labels for docker by @Solomonwisdom in #1220
- koord-scheduler: implement EnqueueExtensions by @eahydra in #1288
- koord-scheduler: fix duplicate allocate GPU after leader selection changed by @eahydra in #1289
- koord-scheduler: optimize the flow of restore Reservation by @eahydra in #1284
- koord-scheduler: DeviceShare skips zero requests by @eahydra in #1293
- koordlet: fix custom runtime endpoint path by @saintube in #1294
- koord-scheduler: optimize the patch implementation of the PreBind stage by @eahydra in #1296
- koord-scheduler: enhance DeviceShare with reservation by @eahydra in #1290
- chore: move koord-manager scheme under options, add host run mount by @saintube in #1298
- koord-scheduler: coscheduling supports sensing PodGroup changes by @ZiMengSheng in #1299
- koord-scheduler: support exporting scheduler cache and queue to plugin by @eahydra in #1300
- koord-manager: fix options by @saintube in #1313
- koord-scheduler: podgroup support minMember = 0 by @ZiMengSheng in #1318
- koord-scheduler: complement Reservation plugin's Pod onUpdate handler by @eahydra in #1314
- koordlet: add pod filter for metricsadvisor by @saintube in #1319
- koord-scheduler: implement PreBindExtensions to support one-time apply patches by @eahydra in #1317
- koord-scheduler: support Reservation allocate policy by @eahydra in #1291
- koord-descheduler: allow annotated pod only pass non-retriable filter by @ZiMengSheng in #1321
- koord-manager: ClusterColocationProfile support grayscale by @ZiMengSheng in #1322
- koord-descheduler: make pvc reservation switchable by @ZiMengSheng in #1323
- koord-scheduler: support dispatching scheduling error by @eahydra in #1324
- proposal: NUMA Topology Scheduling by @eahydra in #1224
- koord-scheduler: support reserve by pod by @eahydra in #1315
- koord-manager: allow skip update resource by @ZiMengSheng in #1325
- koord-scheduler: fix status is not handled correctly in DeviceShare by @eahydra in #1328
- koordlet: refactor node metrics to tsdb by @LambdaHJ in #1316
- koord-scheduler: add forget pod handler in extended handle by @eahydra in #1330
- koord-manager: remove priorityClassname validation in profile by @ZiMengSheng in #1331
- koord-scheduler: coscheduling supports reservation by @eahydra in #1335
- koord-scheduler: fix reservation affinity in filter stage by @eahydra in #1338
- koord-scheduler: update coscheduling comments by @VinceCui in #1340
- koordlet: revise cgroup update err when unsupported or path not exist by @saintube in #1339
- koord-yarn: add pkg dir for YARN related modules by @zwzhang0107 in #1337
- koord-scheduler: fix leaky operating pod in reservation by @eahydra in #1341
- koord-scheduler: export ReservationInfos via service endpoint by @eahydra in #1342
- koordlet: refactor pod/container metrics to tsdb by @LambdaHJ in #1334
- chore: resolve failed unit tests by @eahydra in #1347
- apis: support Hygon DCU device resource API by @eahydra in #1352
- koordlet: fix cgroup driver detection on cgroups-v2, revise UTs by @saintube in #1353
- koord-scheduler: coscheduling supports skipping check schedule cycle by @eahydra in #1351
- koordlet: fix flaky test of BEResourceCollector by @saintube in #1358
- koord-descheduler: compatible with k8s v1.21.x no policy/v1/eviction by @leason00 in #1362
- koord-scheduler: strictly check whether ReservationInfo is available or terminating by @eahydra in #1354
- add a histogram util by @hormes in #1365
- koord-scheduler: optimize the Reservation deletion process by @eahydra in #1360
- koord-scheduler: only remove unallocated ports if Reservation released by @eahydra in #1364
- koordlet: pod/container PSI/CPI metrics cache to tsdb by @zshmmm in #1336
- koord-scheduler: reject waiting reserve pod if reservation has been deleted by @eahydra in #1369
- koord-scheduler: complement Coscheduling plugin's onPodUpdate handler by @eahydra in #1370
- koordlet: fix node memory collection by @saintube in #1392
- koordlet: refactor info-type resource to kv by @LambdaHJ in #1384
- koord-scheduler: enhance the preemption when common resources are reserved by @eahydra in #1371
- koord-scheduler: eliminate unnecessary memory overhead and lock conflicts by @eahydra in #1390
- koord-manager: use GenerationChangedPredicate to prevent unnecessary repeated reconcile calls by @jasonliu747 in #1396
- koordlet: add a resource usage predictserver to estimate long-term reclaimed resource by @hormes in #1389
- apis: add ProdReclaimableMetric field in NodeMetrics by @hormes in #1397
- koordlet: refactor statesinformer to solve the circular dependency problem by @hormes in #1400
- apis: add Mid tier api by @saintube in #1375
- koord-scheduler: fix pod outside reservation unexpected preemption by @eahydra in #1406
- koordlet: fix can not collect psi metrics by @lucming in #1402
- koordlet: report the amount of reclaimable resources of the node by @hormes in #1407
- all: upgrade to k8s v1.24.15 by @eahydra in #1398
- koord-manager: add noderesource plugin for Mid by @saintube in #1401
- koordlet: add various configurations to the prediction package by @hormes in #1411
- koord-scheduler: clean up the adaptation code in NodeNUMAResource by @eahydra in #1410
- koord-scheduler: refactor the implementation of Validate/Convert of request in DeviceShare by @eahydra in #1413
- util: remove eviction-related duplicate implementations by @leason00 in #1393
- koord-manager: enhance noderesource framework for updating node labels by @saintube in #1408
- koordlet: fix prediction setup by @saintube in #1416
- proposal: NRI Mode Resource Management by @kangclzjc in #1366
- koord-scheduler: remove CompatibleDefaultPreemption plugin by @eahydra in #1414
- koord-scheduler: use informer transformer to be compatible or modify objects by @eahydra in #1417
- koord-controller-manager: noderesource supports configuration plugins by @hormes in #1418
- koord-scheduler: remove unused tests in DeviceShare by @eahydra in #1420
- koordlet: exclude LSE from share cpu pool by @j4ckstraw in #1404
- koord-scheduler: load-aware supports estimate node allocatable by @eahydra in #1424
- koord-scheduler: load-aware estimates pod that only has requests by @eahydra in #1425
- koordlet: refactor cgroup driver setup and delay resource check by @saintube in #1426
- koord-scheduler: refresh specific NodeInfos on demand instead of all by @eahydra in #1422
- koord-scheduler: coscheduling support multiple gang match policy by @xulinfei1996 in #1380
- docs: propose node prediction by @saintube in #1385
- chore: remove the tag in internal config as usual by @eahydra in #1432
- koord-descheduler: fix GetMaxUnavailable not properly handling percentage type param by @wenchezhao in #1433
- koord-descheduler: support ignore expected replicas argument by @leason00 in #1419
- koord-scheduler: improve throughput with reservation via return PreFilterResult by @eahydra in #1434
- koord-scheduler: pod must declare gpu-memory or gpu-memory-ratio when applying for GPU by @eahydra in #1439
- koord-scheduler: fix deletion of Reservation in concurrent scenarios by @eahydra in #1437
- koord-scheduler: ElasticQuotaPlugin add RunDecoratePod in Reserve/Unreserve by @xulinfei1996 in #1443
- koord-manager: move expiring metrics into util, add prom metrics by @saintube in #1440
- koord-scheduler: fix missed json tag of internal configuration by @eahydra in #1447
- doc: add forecasting controller proposal by @zwzhang0107 in #1394
- koord-scheduler: run plugin transformers in configured order by @eahydra in #1448
- apis: add default PriorityClass and QoSClass, revise batch resource calculation by @saintube in #1446
- apis: add moduleID in Device CRD by @eahydra in #1466
- koordlet: fix scope of prodReclaimablePredictor resource reclaim by @hormes in #1469
- koordlet: change the memory.high calculation formula as 1.27 by @novahe in #1455
- apis: add priority-class compatible label by @hormes in #1461
- koordlet: enhance node prediction by @saintube in #1462
- koord-manager: add leader elect resource lock arg by @saintube in #1471
- koord-scheduler: optimize Coscheduling error message and logs by @eahydra in #1475
- koord-scheduler: add elastic quota informer transformer by @xulinfei1996 in #1472
- koordlet: evict pod event object use pod by @haoyann in #1468
- koord-scheduler: optimize nominate reservation by @eahydra in #1456
- koord-descheduler: LowNodeLoad supports descheduling by node pool by @eahydra in #1459
- koordlet: refactor becpu resource store by @zshmmm in #1451
- all: fix GetPriorityClass with default by @hormes in #1478
- koordlet: cgroupv2 support cpuburst by @lucming in #1485
- apis: Device CRD add topology and virtual functions fields by @eahydra in #1487
- koordlet: cleanup koordlet get the cgroup driver logic by @xigang in #1488
- koord-descheduler: podMigrationJob filterExpectedReplicas by @ZiMengSheng in #1490
- koord-scheduler: Add unschedulable field to Reservation by @FillZpp in #1381
- koordlet: rename kubeClinet to kubeClient by @xigang in #1492
- fix: reduce PMU multiplexing influence by @bowen-intel in #1489
- koordlet: remove dependencies of sqlite3 by @LambdaHJ in #1484
- koord-manager: add noderesource plugin setup options by @saintube in #1491
- koordlet: fix util/perf compilation errors on darwin by @eahydra in #1497
- koordlet: fix flaky tests by @saintube in #1498
- koord-manager: add README for noderesource framework by @saintube in #1496
- koordlet: refactor resmanager to qosmanager with plug-in framework by @zwzhang0107 in #1500
- koordlet: fix cpi compute with CpuCyclesProfiler by @bowen-intel in #1482
- koord-descheduler: PodMigrationJob supports skipping all filters by @ZiMengSheng in #1505
- chore: add unittests by @hormes in #1515
- koordlet: add error info on eviction event by @zwzhang0107 in #1520
- koordlet: fix the cpu.shares conversion on cgroups-v2 by @saintube in #1521
- koordlet: support numa topology reporting by @saintube in #1516
- chore: adjust the order of reviewers by @eahydra in #1525
- koord-scheduler: set default priority in reserve pod by @eahydra in #1524
- proposal: DeviceShare supports scoring by @eahydra in #1522
- koord-scheduler: add scoring strategy in DeviceShareArgs by @eahydra in #1526
- apis: exclude slo pkg imports from extension to avoid cycle ref by @zwzhang0107 in #1512
- proposal: enhance GPU Share API by @eahydra in #1528
- apis: add gang match policy alias annotation by @eahydra in #1530
- koord-descheduler: ignore create pmj when pod.schedulerName is not koord-scheduler by @lucming in #1533
- koordlet: improve peak predictor server by @xigang in #1532
- koordlet: report system resource usage in node metric by @zwzhang0107 in #1538
- koord-descheduler: accumulates the sum of system and Pod usage as node usage by @eahydra in #1535
- koord-scheduler: invalidate nodeDevice when deleting Device CRD object by @eahydra in #1537
- koord-scheduler: DeviceShare supports scoring by @eahydra in #1534
- koord-descheduler: pmj controller support assign many schedulers that can handle reservation by @lucming in #1540
- koordlet: add nri server for runtimehooks by @kangclzjc in #1501
- koord-descheduler: handle reservation filter in preparePendingJob by @lucming in #1542
- koord-manager: Optimize logs for reporting node devices by @FillZpp in #1545
- koord-scheduler: make sure NodeInfo and other objects have synced before scheduling by @eahydra in #1543
- Proposal: support cold memory collect and report by @BUPT-wxq in #1514
- CI: fix goreleaser.yaml for the deprecated replacement field by @saintube in #1546
- fix crio version by @kangclzjc in #1548
- koord-scheduler: fix missing AddEventHandler with forceSyncEventHandler by @eahydra in #1551
- koord-manager: fix noderesource update when compared values are zero by @saintube in #1550
- koord-manager: fix node slo extension got overwritten by @zwzhang0107 in #1552
New Contributors
- @zhouzijiang made their first contribution in #1193
- @TheBeatles1994 made their first contribution in #1144
- @Tiana2018 made their first contribution in #1264
- @Gala-R made their first contribution in #1271
- @stulzq made their first contribution in #1286
- @Solomonwisdom made their first contribution in #1220
- @VinceCui made their first contribution in #1340
- @kangclzjc made their first contribution in #1366
- @wenchezhao made their first contribution in #1433
- @haoyann made their first contribution in #1468
- @bowen-intel made their first contribution in #1489
- @BUPT-wxq made their first contribution in #1514
Full Changelog: v1.2.0...v1.3.0