A Firmament-based Kubernetes scheduler


Introduction

The Poseidon/Firmament scheduler incubation project brings the Firmament scheduler, described in the Firmament OSDI paper, to Kubernetes. At a high level, Poseidon/Firmament augments the current Kubernetes scheduling capabilities by adding flow-network-graph-based scheduling alongside the default Kubernetes scheduler. Firmament models workloads on a cluster as flow networks and runs min-cost flow optimizations over these networks to make scheduling decisions.
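To make the flow-network idea concrete, here is a minimal sketch of scheduling as min-cost flow. It is a toy, not Firmament's actual graph structure, cost model, or solver: the names `FlowNetwork` and `schedule`, the per-node "free slots" capacity, and the integer placement costs are all assumptions made for illustration. Each pod receives one unit of flow from a source, each pod-to-node arc carries a placement cost, and each node drains to a sink with capacity equal to its free slots.

```python
from collections import deque


class FlowNetwork:
    """Residual graph; arc i and its reverse arc i ^ 1 are stored as a pair."""

    def __init__(self, n):
        self.n = n
        self.graph = [[] for _ in range(n)]  # graph[u] = indices of arcs leaving u
        self.to, self.cap, self.cost = [], [], []

    def add_arc(self, u, v, cap, cost):
        for a, b, c, w in ((u, v, cap, cost), (v, u, 0, -cost)):
            self.graph[a].append(len(self.to))
            self.to.append(b)
            self.cap.append(c)
            self.cost.append(w)

    def min_cost_flow(self, s, t):
        """Successive shortest paths; returns (total_flow, total_cost)."""
        total_flow = total_cost = 0
        while True:
            # Queue-based Bellman-Ford: residual arcs may have negative cost.
            dist = [float("inf")] * self.n
            prev_arc = [-1] * self.n
            in_queue = [False] * self.n
            dist[s] = 0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                in_queue[u] = False
                for a in self.graph[u]:
                    v = self.to[a]
                    if self.cap[a] > 0 and dist[u] + self.cost[a] < dist[v]:
                        dist[v] = dist[u] + self.cost[a]
                        prev_arc[v] = a
                        if not in_queue[v]:
                            in_queue[v] = True
                            queue.append(v)
            if prev_arc[t] == -1:  # sink unreachable: no augmenting path left
                return total_flow, total_cost
            # Find the bottleneck capacity along the path, then push flow.
            push, v = float("inf"), t
            while v != s:
                push = min(push, self.cap[prev_arc[v]])
                v = self.to[prev_arc[v] ^ 1]
            v = t
            while v != s:
                a = prev_arc[v]
                self.cap[a] -= push
                self.cap[a ^ 1] += push
                v = self.to[a ^ 1]
            total_flow += push
            total_cost += push * dist[t]


def schedule(pods, free_slots, cost):
    """Hypothetical helper: pods is a list of names, free_slots maps
    node -> capacity, cost maps (pod, node) -> integer placement cost."""
    nodes = list(free_slots)
    source, sink = 0, 1
    pod_id = {p: 2 + i for i, p in enumerate(pods)}
    node_id = {n: 2 + len(pods) + j for j, n in enumerate(nodes)}
    net = FlowNetwork(2 + len(pods) + len(nodes))
    placement_arcs = {}
    for p in pods:
        net.add_arc(source, pod_id[p], 1, 0)
        for n in nodes:
            if (p, n) in cost:  # a missing entry means the node is ineligible
                placement_arcs[len(net.to)] = (p, n)
                net.add_arc(pod_id[p], node_id[n], 1, cost[(p, n)])
    for n in nodes:
        net.add_arc(node_id[n], sink, free_slots[n], 0)
    _, total = net.min_cost_flow(source, sink)
    # A saturated pod -> node arc means the pod was placed on that node.
    placements = {p: n for a, (p, n) in placement_arcs.items() if net.cap[a] == 0}
    return placements, total


if __name__ == "__main__":
    costs = {("a", "n1"): 1, ("a", "n2"): 4, ("b", "n1"): 2,
             ("b", "n2"): 3, ("c", "n1"): 1, ("c", "n2"): 5}
    print(schedule(["a", "b", "c"], {"n1": 1, "n2": 2}, costs))
```

In this example the solver places `c` on `n1` and `a` and `b` on `n2`, for a total cost of 8. A one-pod-at-a-time pass that handled `a` first would take the cheap `n1` slot and end up at a total cost of 9, which illustrates why solving for all pods together can beat per-pod decisions.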

Because rescheduling is built into Firmament, the new scheduler provides globally optimal scheduling for a given policy and keeps refining workload placements dynamically.

With Kubernetes' support for multiple schedulers, each new pod is normally scheduled by the default scheduler, but Kubernetes can be instructed to use a different one by specifying the name of a custom scheduler (Poseidon in our case) at pod deployment time. The default scheduler then ignores that pod and lets the Poseidon scheduler place it on a suitable node. We plug Poseidon in as an add-on scheduler to K8s: setting 'schedulerName' to Poseidon in the pod template bypasses the default scheduler.
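As an illustration, a pod manifest that opts in to Poseidon might look like the sketch below. The pod name, container name, and image are hypothetical, and the value `poseidon` is an assumption: it must match whatever scheduler name your Poseidon deployment actually uses.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod          # hypothetical pod name
spec:
  schedulerName: poseidon    # assumed name; must match the deployed Poseidon scheduler
  containers:
  - name: app                # hypothetical container
    image: nginx
```

Pods that omit `schedulerName` continue to be handled by the default scheduler.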

Key Advantages

  • Flow graph scheduling provides the following:

    • Support for high-volume workload placement.
    • Complex rule constraints.
    • Globally optimal scheduling for a given policy.
    • Extremely high scalability.

    NOTE: Firmament scales much better than the default scheduler as the number of nodes in a cluster increases.

Current Project Stage

Alpha Release

Design

Poseidon/Firmament Integration architecture

For more details about the design of this project, see the design document.

Installation

For in-cluster installation of Poseidon, please start here.

Development

For development instructions, please refer here.

Release Process

For details of the coordinated release process between the Firmament and Poseidon repos, refer here.

Roadmap

  • Release 0.1 – Currently Available:

    • Baseline Poseidon/Firmament scheduling capabilities using the new multi-dimensional CPU/memory cost model are part of this release. It does not yet include node-level and pod-level affinity/anti-affinity capabilities; as shown below, these are being built out in upcoming releases.
    • All test-infra bot automation jobs are in place as part of this release.
  • Release 0.2 – Target Date: 25th May 2018:

    • Node-level affinity and anti-affinity implementation.
  • Release 0.3 – Target Date: 15th June 2018:

    • Pod-level affinity and anti-affinity implementation based on multi-round scheduling.
  • Release 0.4 – Tentative Target Date: 15th August 2018:

    • Taints & Tolerations.
    • Support for Pod anti-affinity symmetry.
    • Throughput Performance Optimizations.
  • Release 0.5 onwards:

    • Support for a maximum number of pods per node.
    • Co-existence with the default scheduler.
    • Optimizations that reduce the number of arcs by limiting the number of eligible nodes in a cluster.
    • CPU/memory combination optimizations.
    • Transition to the Metrics Server API. Upstreaming our new Heapster sink is no longer possible because Heapster is being deprecated.
    • A continuously running scheduling loop instead of the scheduling-interval mechanism.
    • High availability/failover for the in-memory Firmament/Poseidon processes.
    • Gang scheduling.
    • Priority preemption support.
    • Resource Utilization benchmark.