Add sites manager to nodeunit and nodegroup (#289)
* sites-manager: Add site controller

* sites-manager: Add sites-manager controller

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: make build site-manager pass

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: Rename sites-manager to site-manager

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: Update nodeunit && nodegroup df

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: Add nodeunit controller handler

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: Add site-manager deploy && dockerfile

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: Update logs

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: Update

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: Update site-manager client-set

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: Add ready output

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: Update nodeunit version

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: Get nodeunit node status

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: Update node status with nodeunit status

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: Add init node annotations

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: Fix node not found unit error

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: Add nodeGroup logic

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: Update nodeGroup pkg

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: Add node unit CUD with node Group

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: Add default unit create

Signed-off-by: attleewang <attleewang@tencent.com>

* site-manager: Add node role

Signed-off-by: attleewang <attleewang@tencent.com>

* apps-manager: Add nodeunit && nodeGroup crd

Signed-off-by: attleewang <attleewang@tencent.com>

* apps-manager: clean codes

Signed-off-by: attleewang <attleewang@tencent.com>

* apps-manager: Move site-manager shell

Signed-off-by: attleewang <attleewang@tencent.com>

* apps-manager: Add site manager user docs

Signed-off-by: attleewang <attleewang@tencent.com>

* apps-manager: Update site-manager docs

Signed-off-by: attleewang <attleewang@tencent.com>

* apps-manager: Update site-manager docs with action

Signed-off-by: attleewang <attleewang@tencent.com>

* apps-manager: Update node unit error

Signed-off-by: attleewang <attleewang@tencent.com>
attlee-wang committed Nov 29, 2021
1 parent b49a0f7 commit 7de1ffd
Showing 58 changed files with 5,262 additions and 15 deletions.
5 changes: 5 additions & 0 deletions build/docker/site-manager/Dockerfile
@@ -0,0 +1,5 @@
FROM alpine:3.9

ADD site-manager /usr/local/bin

ENTRYPOINT ["/usr/local/bin/site-manager"]
107 changes: 107 additions & 0 deletions cmd/site-manager/app/options/options.go
@@ -0,0 +1,107 @@
/*
Copyright 2020 The SuperEdge Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package options

import (
"runtime"
"strings"
"time"

"github.com/spf13/pflag"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
utilfeature "k8s.io/apiserver/pkg/util/feature"
cliflag "k8s.io/component-base/cli/flag"
"k8s.io/component-base/config"
)

const (
SiteManagerDaemonUserAgent = "site-manager-daemon"
)

type Options struct {
Burst int
SyncPeriod int
SyncPeriodAsWhole int
Worker int
QPS float32
Master string
Kubeconfig string
FeatureGates map[string]bool
config.LeaderElectionConfiguration
}

func NewSiteManagerDaemonOptions() *Options {
featureGates := make(map[string]bool)
featureGates["ServiceTopology"] = true
featureGates["EndpointSlice"] = true

return &Options{
SyncPeriod: 30,
SyncPeriodAsWhole: 30,
Burst: 1000,
QPS: float32(1000),
Worker: runtime.NumCPU(),
LeaderElectionConfiguration: config.LeaderElectionConfiguration{
// Must be a lock type accepted by resourcelock.New.
ResourceLock: "endpoints",
ResourceNamespace: metav1.NamespaceSystem,
ResourceName: SiteManagerDaemonUserAgent,
RetryPeriod: metav1.Duration{Duration: time.Second * time.Duration(2)},
LeaseDuration: metav1.Duration{Duration: time.Second * time.Duration(15)},
RenewDeadline: metav1.Duration{Duration: time.Second * time.Duration(10)},
},
FeatureGates: featureGates,
}
}

func (o *Options) AddFlags(fs *pflag.FlagSet) {
fs.StringVar(&o.Master, "master", o.Master, "apiserver master address")
fs.StringVar(&o.Kubeconfig, "kubeconfig", o.Kubeconfig, "kubeconfig path, empty path means in cluster mode")
fs.Float32Var(&o.QPS, "kube-qps", o.QPS, "kubeconfig qps setting")
fs.IntVar(&o.Burst, "kube-burst", o.Burst, "kubeconfig burst setting")
fs.Var(cliflag.NewMapStringBool(&o.FeatureGates), "feature-gates", "A set of key=value pairs that describe feature gates for alpha/experimental features. "+
"Options are:\n"+strings.Join(utilfeature.DefaultMutableFeatureGate.KnownFeatures(), "\n"))
fs.IntVar(&o.SyncPeriod, "sync-period", o.SyncPeriod, "Period, in seconds, for syncing the objects")
fs.IntVar(&o.SyncPeriodAsWhole, "sync-period-as-whole", o.SyncPeriodAsWhole, "Period for syncing the dns hosts as whole")
fs.IntVar(&o.Worker, "worker", o.Worker, "worker number of controller")

fs.BoolVar(&o.LeaderElect, "leader-elect", o.LeaderElect, ""+
"Start a leader election client and gain leadership before "+
"executing the main loop. Enable this when running replicated "+
"components for high availability.")
fs.DurationVar(&o.LeaseDuration.Duration, "leader-elect-lease-duration", o.LeaseDuration.Duration, ""+
"The duration that non-leader candidates will wait after observing a leadership "+
"renewal until attempting to acquire leadership of a led but unrenewed leader "+
"slot. This is effectively the maximum duration that a leader can be stopped "+
"before it is replaced by another candidate. This is only applicable if leader "+
"election is enabled.")
fs.DurationVar(&o.RenewDeadline.Duration, "leader-elect-renew-deadline", o.RenewDeadline.Duration, ""+
"The interval between attempts by the acting master to renew a leadership slot "+
"before it stops leading. This must be less than or equal to the lease duration. "+
"This is only applicable if leader election is enabled.")
fs.DurationVar(&o.RetryPeriod.Duration, "leader-elect-retry-period", o.RetryPeriod.Duration, ""+
"The duration the clients should wait between attempting acquisition and renewal "+
"of a leadership. This is only applicable if leader election is enabled.")
fs.StringVar(&o.ResourceLock, "leader-elect-resource-lock", o.ResourceLock, ""+
"The type of resource object that is used for locking during "+
"leader election. Supported options are `endpoints` (default) and `configmaps`.")
fs.StringVar(&o.ResourceName, "leader-elect-resource-name", o.ResourceName, ""+
"The name of resource object that is used for locking during "+
"leader election.")
fs.StringVar(&o.ResourceNamespace, "leader-elect-resource-namespace", o.ResourceNamespace, ""+
"The namespace of resource object that is used for locking during "+
"leader election.")
}
142 changes: 142 additions & 0 deletions cmd/site-manager/app/server.go
@@ -0,0 +1,142 @@
/*
Copyright 2020 The SuperEdge Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package app

import (
"context"
"os"
"time"

"github.com/spf13/cobra"
"k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/util/uuid"
clientset "k8s.io/client-go/kubernetes"
clientgokubescheme "k8s.io/client-go/kubernetes/scheme"
v1core "k8s.io/client-go/kubernetes/typed/core/v1"
restclient "k8s.io/client-go/rest"
"k8s.io/client-go/tools/clientcmd"
"k8s.io/client-go/tools/leaderelection"
"k8s.io/client-go/tools/leaderelection/resourcelock"
"k8s.io/client-go/tools/record"
"k8s.io/klog/v2"

"github.com/superedge/superedge/cmd/site-manager/app/options"
"github.com/superedge/superedge/pkg/site-manager/config"
"github.com/superedge/superedge/pkg/site-manager/controller"
crdClientset "github.com/superedge/superedge/pkg/site-manager/generated/clientset/versioned"
"github.com/superedge/superedge/pkg/util"
"github.com/superedge/superedge/pkg/version"
"github.com/superedge/superedge/pkg/version/verflag"
)

func NewSiteManagerDaemonCommand() *cobra.Command {
siteOptions := options.NewSiteManagerDaemonOptions()
cmd := &cobra.Command{
Use: "site-manager",
Run: func(cmd *cobra.Command, args []string) {
verflag.PrintAndExitIfRequested()

klog.Infof("Site-manager Versions: %#v\n", version.Get())
util.PrintFlags(cmd.Flags())

kubeconfig, err := clientcmd.BuildConfigFromFlags(siteOptions.Master, siteOptions.Kubeconfig)
if err != nil {
klog.Fatalf("Failed to create kubeconfig: %v", err)
}

kubeconfig.QPS = siteOptions.QPS
kubeconfig.Burst = siteOptions.Burst
kubeClient := clientset.NewForConfigOrDie(kubeconfig)
crdClient := crdClientset.NewForConfigOrDie(kubeconfig)

// Leader election disabled: run the controller directly.
if !siteOptions.LeaderElect {
runController(context.TODO(), kubeClient, crdClient, siteOptions.Worker, siteOptions.SyncPeriod, siteOptions.SyncPeriodAsWhole)
panic("Start site-manager failed\n")
}

hostname, err := os.Hostname()
if err != nil {
klog.Fatalf("Failed to get hostname %#v", err)
}
identityId := hostname + "_" + string(uuid.NewUUID())

// Create resource lock
copyConfig := *kubeconfig
// RenewDeadline.Duration is already a time.Duration; use it as the timeout directly.
copyConfig.Timeout = siteOptions.RenewDeadline.Duration
leaderElectionClient := clientset.NewForConfigOrDie(restclient.AddUserAgent(&copyConfig, "leader-election"))
resourceLock, err := resourcelock.New(siteOptions.ResourceLock, siteOptions.ResourceNamespace, siteOptions.ResourceName,
leaderElectionClient.CoreV1(),
leaderElectionClient.CoordinationV1(),
resourcelock.ResourceLockConfig{
Identity: identityId,
EventRecorder: createRecorder(kubeClient, options.SiteManagerDaemonUserAgent),
})
if err != nil {
klog.Fatalf("Creating leader elect lock error %#v", err)
}

// Run the controller only while this instance holds leadership.
electionChecker := leaderelection.NewLeaderHealthzAdaptor(time.Second * 20)
leaderelection.RunOrDie(context.TODO(), leaderelection.LeaderElectionConfig{
Lock: resourceLock,
LeaseDuration: siteOptions.LeaseDuration.Duration,
RenewDeadline: siteOptions.RenewDeadline.Duration,
RetryPeriod: siteOptions.RetryPeriod.Duration,
Callbacks: leaderelection.LeaderCallbacks{
OnStartedLeading: func(ctx context.Context) {
runController(ctx, kubeClient, crdClient, siteOptions.Worker, siteOptions.SyncPeriod, siteOptions.SyncPeriodAsWhole)
},
OnStoppedLeading: func() {
klog.Fatalf("Leader election lost")
},
},
WatchDog: electionChecker,
Name: options.SiteManagerDaemonUserAgent,
})
panic("Start site-manager failed\n")
},
}

fs := cmd.Flags()
siteOptions.AddFlags(fs)

return cmd
}

func runController(parent context.Context, kubeClient *clientset.Clientset,
crdClient *crdClientset.Clientset, workerNum, syncPeriod, syncPeriodAsWhole int) {

controllerConfig := config.NewControllerConfig(kubeClient, crdClient, time.Second*time.Duration(syncPeriod))
sitesManagerDaemonController := controller.NewSitesManagerDaemonController(controllerConfig.NodeInformer,
controllerConfig.NodeUnitInformer, controllerConfig.NodeGroupInformer, kubeClient, crdClient)

ctx, cancel := context.WithCancel(parent)
defer cancel()

controllerConfig.Run(ctx.Done())
go sitesManagerDaemonController.Run(workerNum, syncPeriodAsWhole, ctx.Done())
<-ctx.Done()
}

func createRecorder(kubeClient clientset.Interface, userAgent string) record.EventRecorder {
eventBroadcaster := record.NewBroadcaster()
eventBroadcaster.StartLogging(klog.Infof)
eventBroadcaster.StartRecordingToSink(&v1core.EventSinkImpl{Interface: kubeClient.CoreV1().Events("")})
return eventBroadcaster.NewRecorder(clientgokubescheme.Scheme, v1.EventSource{Component: userAgent})
}
66 changes: 66 additions & 0 deletions cmd/site-manager/site-manager.go
@@ -0,0 +1,66 @@
/*
Copyright 2020 The SuperEdge Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package main

import (
"flag"
"math/rand"
"os"
"time"

"github.com/spf13/pflag"
cliflag "k8s.io/component-base/cli/flag"
"k8s.io/klog/v2"

"github.com/superedge/superedge/cmd/site-manager/app"
)

func main() {
klog.InitFlags(nil)
rand.Seed(time.Now().UnixNano())

pflag.CommandLine.SetNormalizeFunc(cliflag.WordSepNormalizeFunc)
pflag.CommandLine.AddGoFlagSet(flag.CommandLine)

klogSet()
defer klog.Flush()

command := app.NewSiteManagerDaemonCommand()
if err := command.Execute(); err != nil {
os.Exit(1)
}
}

func klogSet() {
pflag.CommandLine.MarkHidden("log-dir")
pflag.CommandLine.MarkHidden("version")
pflag.CommandLine.MarkHidden("vmodule")
pflag.CommandLine.MarkHidden("one-output")
pflag.CommandLine.MarkHidden("logtostderr")
pflag.CommandLine.MarkHidden("skip-headers")
pflag.CommandLine.MarkHidden("add-dir-header")
pflag.CommandLine.MarkHidden("alsologtostderr")
pflag.CommandLine.MarkHidden("stderrthreshold")
pflag.CommandLine.MarkHidden("log-backtrace-at")
pflag.CommandLine.MarkHidden("skip-log-headers")
pflag.CommandLine.MarkHidden("log-file-max-size")
pflag.CommandLine.MarkHidden("log-flush-frequency")

flag.Set("v", "4")
flag.Set("logtostderr", "false")
flag.Set("alsologtostderr", "true")
}
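
`klogSet` calls `flag.Set` before the command line is parsed, which in effect changes the starting values of the logging flags. A minimal sketch of that ordering with a standalone standard-library `FlagSet` follows; the flag set name `demo` and the helper `effectiveVerbosity` are hypothetical, and the `v` flag here only stands in for the one klog registers.

```go
package main

import (
	"flag"
	"fmt"
)

// effectiveVerbosity registers a stand-in "v" flag, applies a programmatic
// value with Set, and then parses args, so an explicit argument wins.
func effectiveVerbosity(args []string) (int, error) {
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	v := fs.Int("v", 0, "log verbosity")

	// Applied before parsing, so it behaves like a new default.
	if err := fs.Set("v", "4"); err != nil {
		return 0, err
	}
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return *v, nil
}

func main() {
	withoutArgs, err := effectiveVerbosity(nil)
	fmt.Println(withoutArgs, err)

	withArgs, err := effectiveVerbosity([]string{"-v=2"})
	fmt.Println(withArgs, err)
}
```

In this sketch a value given on the command line is parsed after the `Set` call and therefore takes precedence. Unlike `klogSet`, the sketch also checks the error returned by `Set`.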
