
Commit 00f9252

committed: Moved deployment readme file to docs folder.
Signed-off-by: Diana Arroyo <darroyo@us.ibm.com>
1 parent f682f5a commit 00f9252

File tree: 1 file changed (+190 −0 lines)


doc/deploy/deployment.md

# Deploying Multi-Cluster-App-Wrapper Controller

Follow the instructions below to deploy the Multi-Cluster-App-Wrapper controller in an existing Kubernetes cluster:

## Pre-Reqs

### - Cluster running Kubernetes v1.10 or higher.
```
kubectl version
```

### - Access to the `kube-system` namespace.
```
kubectl get pods -n kube-system
```

### - Install the Helm Package Manager
Install the Helm client on your local machine and the Helm server on your Kubernetes cluster. Helm installation documentation is [here](https://docs.helm.sh/using_helm/#installing-helm). After you install Helm, you can list the installed Helm packages with the following command:
```
helm list
```

### Determine if the cluster has enough resources for installing the Helm chart for the Multi-Cluster-App-Dispatcher.

The default memory resource demand for the multi-cluster-app-dispatcher controller is `2G`. If your cluster is a small installation, such as Minikube, you will want to adjust the Helm installation resource requests accordingly.

To list the available compute nodes in your cluster, enter the following command:
```
kubectl get nodes
```
For example:
```
$ kubectl get nodes
NAME       STATUS    ROLES     AGE       VERSION
minikube   Ready     master    91d       v1.10.0
```

To find out the available resources in your cluster, inspect each node from the command output above with the following command:
```
kubectl describe node <node_name>
```
For example:
```
$ kubectl describe node minikube
...
Name:               minikube
Roles:              master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
...
Capacity:
 cpu:                2
 ephemeral-storage:  16888216Ki
 hugepages-2Mi:      0
 memory:             2038624Ki
 pods:               110
Allocatable:
 cpu:                2
 ephemeral-storage:  15564179840
 hugepages-2Mi:      0
 memory:             1936224Ki
 pods:               110
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests      Limits
  --------  --------      ------
  cpu       1915m (95%)   1 (50%)
  memory    1254Mi (66%)  1364Mi (72%)
Events:     <none>
```
In the example above, there is only one node (`minikube`) in the cluster, with the majority of the cluster memory already in use (`1,254Mi` requested out of `1,936Mi` allocatable capacity), leaving less than `700Mi` of available capacity for new pod deployments. Since the default memory demand for the Enhanced QueueJob Controller pod is `2G`, the cluster has insufficient memory to deploy the controller. The instruction notes below show how to override the defaults according to the available capacity in your cluster.

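The headroom arithmetic above can be sketched as a quick shell calculation. The figures are the sample values from the `kubectl describe node` output; substitute your own node's numbers:

```shell
# Allocatable memory and already-requested memory in Mi,
# taken from the sample node output above (sample values only).
allocatable_mi=1936
requested_mi=1254

# Headroom left for new pods on this node.
available_mi=$((allocatable_mi - requested_mi))
echo "available: ${available_mi}Mi"   # well short of the controller's default 2G request
```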
## Installation Instructions

### 1. Download the GitHub project.
Download this GitHub project to your local machine.
```
git clone -b queuejob-dispatcher --single-branch git@github.ibm.com:ARMS/extended-queuejob.git
```
### 2. Navigate to the Helm deployment directory.
```
cd extended-queuejob/contrib/DLaaS/deployment
```

### 3. Run the installation using Helm.
Install the Multi-Cluster-App-Dispatcher Controller using the commands below. The `--wait` parameter in the Helm command below ensures all pods of the Helm chart are running; the command will not return until either the default timeout expires (*typically 300 seconds*) or all the pods are in the `Running` state.

Before submitting the command below, ensure you have enough resources in your cluster to deploy the Helm chart (*see the Pre-Reqs section above*). If you do not have enough compute resources in your cluster, you can adjust the resource requests via the command line. See the example in the `NOTE` below.

All Helm parameters are described in the table below.
#### 3.a Start the Multi-Cluster-App-Dispatcher Controller on All Target Deployment Clusters (*Agent Mode*).
__Agent Mode__: Install and set up the Multi-Cluster-App-Dispatcher Controller (XQJ) in *Agent Mode*, using Helm, on each cluster that will orchestrate the resources defined within an XQJ. *Agent Mode* is the default mode when deploying the XQJ controller.
```
helm install kube-arbitrator --namespace kube-system --wait --set image.repository=<image repository and name> --set image.tag=<image tag> --set imagePullSecret.name=<Name of image pull kubernetes secret> --set imagePullSecret.password=<REPLACE_WITH_REGISTRY_TOKEN_GENERATED_IN_PREREQs_STAGE1_REGISTRY.d)> --set localConfigName=<Local Kubernetes Config File for Current Cluster> --set volumes.hostPath=<Host_Path_location_of_local_Kubernetes_config_file>
```

For example (*assuming the defaults for `image.repository` and `image.tag`*):
```
helm install kube-arbitrator --namespace kube-system
```
or
```
helm install kube-arbitrator --namespace kube-system --wait --set imagePullSecret.name=extended-queuejob-controller-registry-secret --set imagePullSecret.password=eyJhbGc...y8gJNcpnipUu0 --set image.pullPolicy=Always --set localConfigName=config_110 --set volumes.hostPath=/etc/kubernetes
```
NOTE: You can adjust the CPU and memory demands of the deployment with command-line overrides. For example:

```
helm install kube-arbitrator --namespace kube-system --wait --set resources.requests.cpu=1000m --set resources.requests.memory=1024Mi --set resources.limits.cpu=1000m --set resources.limits.memory=1024Mi --set image.repository=k8s-spark-mcm-dispatcher-master-1:8443/xqueuejob-controller --set image.tag=v1.11 --set image.pullPolicy=Always
```
#### 3.b Start the Multi-Cluster-App-Dispatcher Controller on the Controller Cluster (*Dispatcher Mode*).
__Dispatcher Mode__: Install and set up the Multi-Cluster-App-Dispatcher Controller (XQJ) in *Dispatcher Mode*, using Helm, on the control cluster that will dispatch the XQJ to an *Agent* cluster.
```
helm install kube-arbitrator --namespace kube-system --wait --set image.repository=<image repository and name> --set image.tag=<image tag> --set configMap.name=<Config> --set configMap.dispatcherMode='"true"' --set configMap.agentConfigs=agent101config:uncordon --set volumes.hostPath=<Host_Path_location_of_all_agent_Kubernetes_config_files>
```

For example:
```
helm install kube-arbitrator --namespace kube-system --wait --set image.repository=tonghoon --set image.tag=both --set configMap.name=xqj-deployer --set configMap.dispatcherMode='"true"' --set configMap.agentConfigs=agent101config:uncordon --set volumes.hostPath=/etc/kubernetes
```
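The `configMap.agentConfigs` value used above is a comma-separated list of `<agent-config-file>:<dispatch-mode>` pairs. A small parsing sketch (with hypothetical config file names) illustrates the format:

```shell
# Hypothetical agentConfigs value: one <config-file>:<mode> pair per agent cluster.
agent_configs="agent101config:uncordon,agent110config:uncordon"

# Expand the pairs into one "config -> mode" line per agent cluster.
parsed=$(echo "$agent_configs" | tr ',' '\n' | while IFS=':' read -r cfg mode; do
  echo "$cfg -> $mode"
done)
echo "$parsed"
```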
### 4. Chart configuration

The following table lists the configurable parameters of the Helm chart and their default values.

| Parameter | Description | Default | Sample values |
| ----------------------- | ------------------------------------ | ------------- | ------------------------------------------------ |
| `configMap.agentConfigs` | *For every agent cluster, separated by commas (`,`):* name of the *agent* config file _:_ dispatching mode for the _*Agent Cluster*_. Note: only the dispatching mode `uncordon`, indicating that the XQJ controller is allowed to dispatch jobs to the _*Agent Cluster*_, is supported. | &lt;_No default for agent config file_&gt;:`uncordon` | `agent101config:uncordon,agent110config:uncordon` |
| `configMap.dispatcherMode` | Whether the XQJ Controller should be launched in Dispatcher mode or not | `false` | `true` |
| `configMap.name` | Name of the Kubernetes *ConfigMap* resource that configures the Enhanced QueueJob Controller | | `xqj-deployer` |
| `deploymentName` | Name of the XQJ Controller Deployment object | `xqueuejob-controller` | `my-xqj-controller` |
| `image.pullPolicy` | Policy that dictates when the specified image is pulled | `Always` | `Never` |
| `imagePullSecret.name` | Name of the Kubernetes secret that stores the image registry password | | `extended-queuejob-controller-registry-secret` |
| `imagePullSecret.password` | Image registry pull secret password | | `eyJhbGc...y8gJNcpnipUu0` |
| `imagePullSecret.username` | Image registry pull user name | `iamapikey` | `token` |
| `image.repository` | Name of the repository containing the XQueueJob Controller image | `registry.stage1.ng.bluemix.net/ibm/kube-arbitrator` | `my-repository` |
| `image.tag` | Tag of the desired image within the repository | `latest` | `my-image` |
| `namespace` | Namespace in which the XQJ Controller Deployment is created | `kube-system` | `my-namespace` |
| `nodeSelector.hostname` | Host name for the XQJ Controller pod node selector | | `example-host` |
| `replicaCount` | Number of replicas of the XQJ Controller Deployment | `1` | `2` |
| `resources.limits.cpu` | CPU limit for the XQJ Controller Deployment | `2000m` | `1000m` |
| `resources.limits.memory` | Memory limit for the XQJ Controller Deployment | `2048Mi` | `1024Mi` |
| `resources.requests.cpu` | CPU request for the XQJ Controller Deployment (must not exceed the CPU limit) | `2000m` | `1000m` |
| `resources.requests.memory` | Memory request for the XQJ Controller Deployment (must not exceed the memory limit) | `2048Mi` | `1024Mi` |
| `serviceAccount` | Name of the service account for the XQJ Controller | `xqueuejob-controller` | `my-service-account` |
| `volumes.hostPath` | Full path on the host where the `localConfigName` file is stored | | `/etc/kubernetes` |

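As an alternative to long `--set` chains, the parameters in the table can be collected into a Helm values file. The sketch below is illustrative: the file name is hypothetical, and the values mirror the resource overrides shown in the `NOTE` of step 3.a.

```shell
# Write a hypothetical values file with reduced resource demands for a small cluster.
cat > my-values.yaml <<'EOF'
resources:
  requests:
    cpu: 1000m
    memory: 1024Mi
  limits:
    cpu: 1000m
    memory: 1024Mi
EOF

# It would then be passed to Helm with the standard -f (--values) flag, e.g.:
#   helm install kube-arbitrator --namespace kube-system --wait -f my-values.yaml
cat my-values.yaml
```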
### 5. Verify the installation.
List the Helm installation. The `STATUS` should be `DEPLOYED`.

NOTE: The `--wait` parameter in the Helm installation command from *step #3* above ensures that all resources are deployed and running when the `STATUS` indicates `DEPLOYED`. Installing the Helm chart without the `--wait` parameter does not ensure all resources are successfully running, even though the `STATUS` may still show `DEPLOYED`.

A `STATUS` value of `FAILED` indicates that not all resources were created and running before the timeout occurred. Usually this means pod creation failed due to insufficient resources to create the Multi-Cluster-App-Dispatcher Controller pod. Example instructions on how to adjust the resources requested for the Helm chart are described in the `NOTE` of *step #3.a* above.
```
$ helm list
NAME                    REVISION    UPDATED                     STATUS      CHART                   NAMESPACE
opinionated-antelope    1           Mon Jan 21 00:52:39 2019    DEPLOYED    kube-arbitrator-0.1.0   kube-system
```
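The `STATUS` column can also be checked in a script. This is a sketch: the hardcoded line mimics the sample output above, and on a real cluster you would capture the `helm list` output instead.

```shell
# Sample line mimicking the `helm list` output above; on a real cluster use:
#   helm_list_output=$(helm list)
helm_list_output='opinionated-antelope  1  Mon Jan 21 00:52:39 2019  DEPLOYED  kube-arbitrator-0.1.0  kube-system'

# Report success only when the release reports DEPLOYED.
if echo "$helm_list_output" | grep -q 'DEPLOYED'; then
  echo "chart deployed"
else
  echo "chart not deployed"
fi
```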

Verify the new resource by listing the Extended QueueJobs.
```bash
kubectl get xqueuejobs
```

Since no `xqueuejobs` have been deployed yet to your cluster, you should receive a message indicating `No resources found.` for `xqueuejobs`, but your cluster now has `xqueuejobs` enabled. Use the [tutorial](../doc/usage/tutorial.md) to deploy an example `xqueuejob`.

### 6. Remove the Multi-Cluster-App-Dispatcher Controller from your cluster.

List the deployed Helm charts and identify the name of the Multi-Cluster-App-Dispatcher Controller installation.
```bash
helm list
```
For example:
```
$ helm list
NAME                    REVISION    UPDATED                     STATUS      CHART                   NAMESPACE
opinionated-antelope    1           Mon Jan 21 00:52:39 2019    DEPLOYED    kube-arbitrator-0.1.0   kube-system
```
Delete the Helm deployment.
```
helm delete <deployment_name>
```
For example:
```bash
helm delete opinionated-antelope
```
