add torque example
chriskery committed Sep 28, 2023
1 parent aa23948 commit 5c2b38a
Showing 3 changed files with 104 additions and 35 deletions.
48 changes: 14 additions & 34 deletions README.md
@@ -1,52 +1,31 @@
# kubecluster
[![Build Status](https://github.com/chriskery/kubecluster/actions/workflows/test-go.yaml/badge.svg?branch=master)](https://github.com/chriskery/kubecluster/actions/workflows/test-go.yaml?branch=master)
[![Coverage Status](https://coveralls.io/repos/github/kubeflow/training-operator/badge.svg?branch=master)](https://coveralls.io/github/kubeflow/training-operator?branch=master)
[![Build Status](https://github.com/chriskery/kubecluster/actions/workflows/test-go.yml/badge.svg?branch=master)](https://github.com/chriskery/kubecluster/actions/workflows/test-go.yaml?branch=master)
[![Coverage Status](https://coveralls.io/repos/github/chriskery/kubecluster/badge.svg?branch=master)](https://coveralls.io/github/chriskery/kubecluster?branch=master)
[![Go Report Card](https://goreportcard.com/badge/github.com/chriskery/kubecluster)](https://goreportcard.com/report/github.com/chriskery/kubecluster)

### A simple way to create multiple type of clusters in the kubernetes cluster
### Kubecluster makes it easy to build Slurm/Torque clusters on Kubernetes.

## Description
Just with a
## Features
Kubecluster uses Pods to simulate cluster nodes. It currently supports the following cluster types:

- [Slurm](pkg/controller/slurm_schema)
- [Torque (PBS)](pkg/controller/torque_schema)
## Getting Started
You’ll need a Kubernetes cluster to run against. You can use [KIND](https://sigs.k8s.io/kind) to get a local cluster for testing, or run against a remote cluster.
**Note:** Your controller will automatically use the current context in your kubeconfig file (i.e. whatever cluster `kubectl cluster-info` shows).

### Running on the cluster
1. Install Instances of Custom Resources:
## Installation

```sh
kubectl apply -k manifests/samples/
```
### Master Branch

2. Build and push your image to the location specified by `IMG`:

```sh
make docker-build docker-push IMG=<some-registry>/kubecluster:tag
```bash
kubectl apply -k "github.com/chriskery/kubecluster/manifests/default"
```

3. Deploy the controller to the cluster with the image specified by `IMG`:
## Quick Start

```sh
make deploy IMG=<some-registry>/kubecluster:tag
```
Please refer to the [quick-start.md](docs/quick-start.md) and [Kubeflow Training User Guide](https://www.kubeflow.org/docs/guides/components/tftraining/) for more information.

### Uninstall CRDs
To delete the CRDs from the cluster:

```sh
make uninstall
```

### Undeploy controller
UnDeploy the controller from the cluster:

```sh
make undeploy
```

## Contributing
// TODO(user): Add detailed information on how you would like others to contribute to this project

### How it works
This project aims to follow the Kubernetes [Operator pattern](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/).
@@ -96,3 +75,4 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


89 changes: 89 additions & 0 deletions docs/quick-start.md
@@ -0,0 +1,89 @@
## Create a Torque Cluster

**Create Torque YAML**

```
kubectl create -f ../manifests/samples/torque-centos.yaml
```

The torque-centos example creates a Torque cluster with 1 server and 1 worker,
so two pods are created to simulate the two nodes of the cluster.

**Get Torque Status**

Execute the following command:
```
kubectl get kubeclusters
```
The output looks like this:
```shell
> kubectl get kubeclusters
NAME                   AGE   STATE
torque-centos-sample   3s    Running
```
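
To check the state from a script instead of reading it by eye, the STATE column of that output can be parsed. This is a minimal sketch, assuming the NAME / AGE / STATE column layout shown above; the helper name `cluster_state` is hypothetical and not part of kubecluster.

```shell
# Hypothetical helper (not part of kubecluster): print the STATE column
# for one cluster. Reads `kubectl get kubeclusters` output on stdin and
# assumes the NAME / AGE / STATE column layout shown above.
cluster_state() {
  awk -v name="$1" '$1 == name { print $3 }'
}

# Usage (requires a running cluster):
#   kubectl get kubeclusters | cluster_state torque-centos-sample
```

The helper prints nothing when no cluster with the given name is listed, so a caller can treat an empty result as "not created yet".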

Now you can enter the "server node" and use torque-centos-sample as if it were a physical Torque cluster. First, list the pods:
```
> kubectl get pods
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-5bc4c45dc9-npwxp   1/1     Running   16         46h
torque-centos-sample-cpu-0          1/1     Running   0          2m43s
torque-centos-sample-server-0       1/1     Running   0          2m43s
```
`torque-centos-sample-server-0` is the server node of the cluster `torque-centos-sample`. Open a shell in it:
```
> kubectl exec -it torque-centos-sample-server-0 -- /bin/bash
[root@torque-centos-sample-server-0 /]#
```

**Using Torque Cluster**

View the status of the nodes in torque-centos-sample:
```
[root@torque-centos-sample-server-0 pbs]# pbsnodes -a
torque-centos-sample-server-0
     Mom = torque-centos-sample-server-0
     Port = 15002
     pbs_version = 19.0.0
     ntype = PBS
     state = free
     pcpus = 16
     resources_available.arch = linux
     resources_available.host = torque-centos-sample-server-0
     resources_available.mem = 64756484kb
     resources_available.ncpus = 16
     resources_available.vnode = torque-centos-sample-server-0
     resources_assigned.accelerator_memory = 0kb
     resources_assigned.hbmem = 0kb
     resources_assigned.mem = 0kb
     resources_assigned.naccelerators = 0
     resources_assigned.ncpus = 0
     resources_assigned.vmem = 0kb
     resv_enable = True
     sharing = default_shared
     last_state_change_time = Thu Sep 28 07:05:43 2023

torque-centos-sample-cpu-0
     Mom = 10-244-0-56.torque-centos-sample-cpu-0.default.svc.cluster.local
     Port = 15002
     pbs_version = 19.0.0
     ntype = PBS
     state = free
     pcpus = 16
     resources_available.arch = linux
     resources_available.host = 10-244-0-56
     resources_available.mem = 64756484kb
     resources_available.ncpus = 16
     resources_available.vnode = torque-centos-sample-cpu-0
     resources_assigned.accelerator_memory = 0kb
     resources_assigned.hbmem = 0kb
     resources_assigned.mem = 0kb
     resources_assigned.naccelerators = 0
     resources_assigned.ncpus = 0
     resources_assigned.vmem = 0kb
     resv_enable = True
     sharing = default_shared
     last_state_change_time = Thu Sep 28 07:05:43 2023
```
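
The full `pbsnodes -a` listing is verbose. A short summary per node can be extracted with `awk`. This is a minimal sketch, assuming the `key = value` attribute layout shown above, where each node block starts with a line holding only the node name; the helper name `pbs_summary` is hypothetical.

```shell
# Hypothetical helper: condense `pbsnodes -a` output (read from stdin)
# to one line per node: "<node> <state> <available ncpus>".
# Assumes each node block begins with a line containing only the node
# name and that "state" appears before "resources_available.ncpus".
pbs_summary() {
  awk '
    NF == 1                           { node = $1 }
    $1 == "state" && $2 == "="        { state = $3 }
    $1 == "resources_available.ncpus" { print node, state, $3 }
  '
}

# Usage from outside the pod (assumes pbsnodes is on the PATH of a
# non-interactive shell in the server pod):
#   kubectl exec torque-centos-sample-server-0 -- pbsnodes -a | pbs_summary
```

Because `awk` splits fields on whitespace, the helper works whether or not the attribute lines are indented.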


2 changes: 1 addition & 1 deletion manifests/manager/kustomization.yaml
@@ -1,6 +1,6 @@
resources:
- manager.yaml
apiVersion: kustomize.manifests.k8s.io/v1beta1
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
images:
- name: controller