## Kubernetes on the Google Cloud Platform

This notebook can be used to launch a Kubernetes cluster on the [Google Cloud Platform](https://cloud.google.com/).

1. First make sure that [gcloud and kubectl are  installed on your computer](https://zero-to-jupyterhub-with-kubernetes.readthedocs.io/en/latest/create-k8s-cluster.html#setting-up-kubernetes-on-google-cloud). 
2. Make sure [helm is installed](https://zero-to-jupyterhub-with-kubernetes.readthedocs.io/en/latest/setup-helm.html).

3. Create a project on the Google Cloud Platform. 
4. Enable the Google Cloud API. 

*Note: If `gcloud` needs to be updated, you are better off launching a terminal and doing it there. Updating from a Jupyter notebook can time out.*

How big of a cluster do you need? Check out [this spreadsheet](https://docs.google.com/spreadsheets/d/1EvGMgS2JiGm8UuB9eDOQRjm79LapjJz4ubk7YDVqplY/edit?usp=sharing) or this 
[Notebook](https://github.com/data-8/jupyterhub-k8s/blob/master/docs/cost-estimation/gce_budgeting.ipynb) to estimate the size. 
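
The arithmetic behind such an estimate is simple: nodes times hourly price times hours of use. Here is a rough sketch, assuming a hypothetical helper named `estimate_monthly_cost`. The hourly rate in the example is a placeholder, not a real Google Cloud price, so look up the current rate for your machine type.

```python
def estimate_monthly_cost(num_nodes, hourly_rate_usd, hours_per_month=730):
    """Return a rough monthly cost in USD for a fixed-size cluster.

    hourly_rate_usd is the per-node price and must be supplied by the caller.
    """
    if num_nodes < 0 or hourly_rate_usd < 0 or hours_per_month < 0:
        raise ValueError("inputs must be non-negative")
    return num_nodes * hourly_rate_usd * hours_per_month


# 0.25 USD/hour is a placeholder chosen only to show the arithmetic.
print(estimate_monthly_cost(num_nodes=2, hourly_rate_usd=0.25))
```

Autoscaling changes the node count over time, so treat the result as a lower bound when `max_nodes` is larger than `num_nodes`.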



## Configuration Setup

Several parameters need to be set for your cluster. The defaults are set in the Kubernetes.yaml file.

### Cluster Properties
*Cluster Name (cluster_name)*: There are no special requirements for the cluster name. You just have to choose one.

*Namespace*: You can read more about namespaces [here](https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/).

*Releasename*: A release name is a way of versioning your system.

*Zone*: See this advice from Google on [choosing a zone](https://cloud.google.com/compute/docs/regions-zones/regions-zones#choosing_a_region_and_zone).

### Cluster Size and Autoscaling

*Number of Nodes (num_nodes)*: This is the number of servers you are asking Google to launch. It is one of the main factors driving the overall cost.

*Machine Type (machine_type)*: This is the type of server to launch. See the [list of machine types](https://cloud.google.com/compute/docs/machine-types).

*Maximum Number of Nodes (max_nodes)*: Autoscaling is the process of increasing the number of servers based on demand. If you enable autoscaling, Google will launch additional servers as needed, up to the `max_nodes` limit.

**If you would like to make changes, just update the Kubernetes.yaml file.** 
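
Before launching anything, it can help to sanity-check the values you set. Below is a minimal sketch, assuming a hypothetical helper named `validate_cluster_config` and the key names described above. It is not part of `kuberutils`.

```python
def validate_cluster_config(cfg):
    """Return a list of problems found in the cluster settings (empty if none)."""
    problems = []
    required = ("cluster_name", "zone", "machine_type", "num_nodes", "max_nodes")
    for key in required:
        if key not in cfg:
            problems.append("missing key: " + key)
    if problems:
        # The numeric checks below need every key to be present.
        return problems
    if cfg["num_nodes"] < 0:
        problems.append("num_nodes must not be negative")
    if cfg["max_nodes"] < cfg["num_nodes"]:
        problems.append("max_nodes must be at least num_nodes")
    return problems


example = {
    "cluster_name": "kuberlytics",
    "zone": "us-east1-b",
    "machine_type": "n1-highmem-4",
    "num_nodes": 1,
    "max_nodes": 4,
}
print(validate_cluster_config(example))
```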


In [6]:
#This will import some required libraries.
import sys 
import ruamel.yaml 
#This is your configuration file. 
general_yaml='../../config/config.yaml'
with open(general_yaml, 'r') as yaml:
    cf=ruamel.yaml.round_trip_load(yaml, preserve_quotes=True)

google_yaml='../../config/gcloud/config.yaml'
with open(google_yaml, 'r') as g_yaml:
    cf_g=ruamel.yaml.round_trip_load(g_yaml, preserve_quotes=True)

#This will allow us to import some useful utilities. 
if cf['docker']:
    cf_g['path']=cf['docker_path']
else:
    cf_g['path']=cf['local_path']

sys.path.append(cf_g['path']+"/lib/kuberutils") 
print(cf_g['path']+"/lib/kuberutils")
import importlib
import kuberutils as ku
importlib.reload(ku)
#This will load common commands for your cluster
cf_g=ku.gcloud_commands(cf_g)
#dump() writes directly to sys.stdout and returns None, so no print() is needed.
ruamel.yaml.dump(cf_g, sys.stdout, Dumper=ruamel.yaml.RoundTripDumper)

/Users/jasonkuruzovich/githubdesktop/0_class/admin-tools/lib/kuberutils
project: kuberlytics             #Project name can be anything, but it should already be created.
cluster_name: kuberlytics         #Name your cluster whatever you like
region: us-east1               #Selection from gcloud compute regions list.
zone: us-east1-b                  #Selection from gcloud compute regions list.
machine_type: n1-highmem-4        #Type of Server
num_nodes: 1                      #The default number of nodes (servers)
num_nodes_class: 2
max_nodes: 4                      #Maximum number of nodes (servers)
account: jkuruzovich@gmail.com    #Email Associated with the account.
authorization_file: auth.json     #Service account authorization file.
service_account_name: kuberlytics2  #Service account name.
fixedip_namespace: jupyterhub-dojo
path: /Users/jasonkuruzovich/githubdesktop/0_class/admin-tools
create_service_account: gcloud iam service-accounts create kuberlytics2 --display-name
  kuberl
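
The `ku.gcloud_commands` helper is not shown in this notebook. Judging only from the commands echoed in later cells, it appears to add ready-made `gcloud` command strings to the configuration. Here is a minimal sketch of that idea, assuming a hypothetical function named `build_gcloud_commands`. This is not the actual `kuberutils` code.

```python
def build_gcloud_commands(cfg):
    """Return a copy of cfg with a few gcloud command strings added."""
    out = dict(cfg)
    out["set_project"] = "gcloud config set project {project}".format(**cfg)
    out["set_zone"] = "gcloud config set compute/zone {zone}".format(**cfg)
    out["create_cluster"] = (
        "gcloud container clusters create {cluster_name} "
        "--num-nodes={num_nodes} --machine-type={machine_type} --zone={zone}"
    ).format(**cfg)
    return out


example_cfg = {
    "project": "kuberlytics",
    "cluster_name": "kuberlytics",
    "zone": "us-east1-b",
    "machine_type": "n1-highmem-4",
    "num_nodes": 1,
}
commands = build_gcloud_commands(example_cfg)
print(commands["create_cluster"])
```

Keeping the command strings in the configuration means you can inspect exactly what will run before you run it.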

### Web Login

In order to use the web login, you have to open a terminal session.  Click on the `Jupyter` icon on the top left of the notebook and then click `new`->`terminal`.

Once in the terminal, issue the following command to log in to your Google account. It will provide you with a code that you will then enter into the container. 

```
gcloud init
```

```
sudo docker run -t -i --name gcloud-config kuberlytics/gcloud-sdk gcloud init
```

Once you log in, you can use the terminal to create a key for a service account:  (NEED TO TEST THIS)

```
sudo docker run -t -i --name gcloud-config kuberlytics/gcloud-sdk gcloud iam service-accounts keys create --iam-account <youremail> /kuberlytics/config/gcloud/newkey.json
```

### Service Account Login
This is the easier way to manage login programmatically.  Please be very careful about sharing the key file, and DON'T commit it!

Follow [this link](https://cloud.google.com/compute/docs/access/create-enable-service-accounts-for-instances#createanewserviceaccount) to set up your service account. 

In your configuration file, set the name of your key file:  
`authorization_file: auth.json`

Then place the file in the `/config/gcloud`  directory.
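
A quick check that the file you placed there looks like a service account key can save a confusing login failure later. Here is a minimal sketch, assuming a hypothetical helper named `looks_like_service_account_key`. It only checks for fields that service account key files normally contain and does not verify that the key is valid.

```python
import json
import os
import tempfile


def looks_like_service_account_key(path):
    """Return True if path is readable JSON with the usual key-file fields."""
    try:
        with open(path, "r") as handle:
            data = json.load(handle)
    except (OSError, ValueError):
        # Missing file, unreadable file, or invalid JSON.
        return False
    return (
        isinstance(data, dict)
        and data.get("type") == "service_account"
        and "client_email" in data
        and "private_key" in data
    )


# Demonstration with a throwaway file; the field values are dummies.
with tempfile.TemporaryDirectory() as tmp:
    key_path = os.path.join(tmp, "auth.json")
    with open(key_path, "w") as handle:
        json.dump({"type": "service_account",
                   "client_email": "dummy",
                   "private_key": "dummy"}, handle)
    good = looks_like_service_account_key(key_path)
    missing = looks_like_service_account_key(os.path.join(tmp, "nope.json"))
print(good, missing)
```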

To see the IAM policy attached to a service account (see also https://cloud.google.com/iam/docs/granting-roles-to-service-accounts), run:

    gcloud iam service-accounts get-iam-policy \
    kuberlytics@kuberlytics.iam.gserviceaccount.com \
    --format json > policy.json

In [7]:
#Login
ku.bash_command('login',cf_g)

Executing login:
 gcloud auth activate-service-account  --key-file /Users/jasonkuruzovich/githubdesktop/0_class/admin-tools/config/gcloud/auth.json


'Activated service account credentials for: [kuberlytics2@kuberlytics.iam.gserviceaccount.com]\n'
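
The `ku.bash_command` helper is not shown in this notebook either. Judging only from what it prints, it appears to look up a command, echo it, run it, and return the output as text. Here is a minimal sketch of that pattern, assuming a hypothetical function named `run_named_command`. This is not the actual `kuberutils` code.

```python
import subprocess


def run_named_command(name, commands=None):
    """Echo and run a shell command, returning its combined output as text.

    If commands is given, name is looked up in it; otherwise name itself
    is treated as the command line.
    """
    command = commands[name] if commands is not None else name
    print("Executing {}:\n {}".format(name, command))
    completed = subprocess.run(
        command,
        shell=True,                 # the command is a single string
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # fold error text into the result
        universal_newlines=True,    # return str rather than bytes
    )
    return completed.stdout


result = run_named_command("greet", {"greet": "echo hello"})
print(result)
```

Because `shell=True` hands the string to the shell, only run commands that come from a configuration file you trust.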

### Set Project and Zone
`gcloud` must be configured to use the appropriate project. In the output above you will see a response that says `the current project is [PROJECT_NAME]`. If you have multiple projects, you probably already know this. You can see all of your projects by using the command: 

`!gcloud projects list`

and then set the appropriate project:

`!gcloud config set project PROJECT_NAME`


In [8]:
#This will set the project. 
ku.bash_command('set_project',cf_g)


Executing set_project:
 gcloud config set project kuberlytics


'Updated property [core/project].\n'

In [4]:
#This will set the zone.
ku.bash_command('set_zone',cf_g)

Executing set_zone:
 gcloud config set compute/zone us-east1-b


'Updated property [compute/zone].\n'

### Create the Cluster
You are ready to go!  We are now going to tell the Google Cloud Platform to create a Kubernetes cluster for us.  We do this with the `gcloud container clusters create` command.

To see the full range of possible configurations, enter `gcloud container clusters create --help`. This will show all the possible parameters that can be passed:

```markdown
    gcloud container clusters create NAME [--additional-zones=ZONE,[ZONE,...]]
        [--async] [--cluster-ipv4-cidr=CLUSTER_IPV4_CIDR]
        [--cluster-version=CLUSTER_VERSION]
        [--disable-addons=[DISABLE_ADDONS,...]] [--disk-size=DISK_SIZE]
        [--no-enable-cloud-endpoints] [--no-enable-cloud-logging]
        [--no-enable-cloud-monitoring] [--image-type=IMAGE_TYPE]
        [--machine-type=MACHINE_TYPE, -m MACHINE_TYPE]
        [--max-nodes-per-pool=MAX_NODES_PER_POOL] [--network=NETWORK]
        [--node-labels=[NODE_LABEL,...]] [--num-nodes=NUM_NODES; default="3"]
        [--password=PASSWORD] [--scopes=SCOPE,[SCOPE,...]]
        [--subnetwork=SUBNETWORK] [--tags=TAG,[TAG,...]]
        [--username=USERNAME, -u USERNAME; default="admin"]
        [--zone=ZONE, -z ZONE] [GLOBAL-FLAG ...]
        
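```

If cluster creation fails with an error such as `The user does not have access to service account "default"`, check whether the account you logged in with has been granted the "Service Account Actor" role.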

In [5]:
#This will create the cluster.
print(ku.bash_command('create_cluster',cf_g))

Executing create_cluster:
 gcloud container clusters create kuberlytics --num-nodes=1 --machine-type=n1-highmem-4 --zone=us-east1-b
Creating cluster kuberlytics...
..............................................................................................................................................................done.
Created [https://container.googleapis.com/v1/projects/kuberlytics/zones/us-east1-b/clusters/kuberlytics].
kubeconfig entry generated for kuberlytics.
NAME         ZONE        MASTER_VERSION  MASTER_IP     MACHINE_TYPE  NODE_VERSION  NUM_NODES  STATUS
kuberlytics  us-east1-b  1.6.9           35.196.99.23  n1-highmem-4  1.6.9         1          RUNNING



### Verifying the Servers
We can now verify that the cluster has successfully launched by asking gcloud to list the instances. This should report the number of instances specified in the num_nodes variable. 

`!gcloud compute instances list`


In [9]:
print(ku.bash_command("gcloud compute instances list"))

Executing gcloud compute instances list:
 gcloud compute instances list
NAME                                        ZONE        MACHINE_TYPE  PREEMPTIBLE  INTERNAL_IP  EXTERNAL_IP    STATUS
gke-kuberlytics-default-pool-3af057b0-grlb  us-east1-b  n1-highmem-4               10.142.0.3   35.196.222.73  RUNNING
gke-kuberlytics-default-pool-3af057b0-q2hx  us-east1-b  n1-highmem-4               10.142.0.2   104.196.9.241  RUNNING
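
If you would rather check the count in code than by eye, you can parse the listing. Here is a minimal sketch, assuming a hypothetical helper named `count_running_instances` and the column layout shown in the output, where the status is the last column.

```python
def count_running_instances(listing):
    """Count rows of `gcloud compute instances list` output marked RUNNING."""
    rows = [line.split() for line in listing.splitlines() if line.strip()]
    # Skip the header row (it starts with NAME) and any row whose last
    # column is not RUNNING.
    return sum(1 for row in rows if row[0] != "NAME" and row[-1] == "RUNNING")


sample = """NAME  ZONE  MACHINE_TYPE  PREEMPTIBLE  INTERNAL_IP  EXTERNAL_IP  STATUS
gke-kuberlytics-default-pool-3af057b0-grlb  us-east1-b  n1-highmem-4  10.142.0.3  35.196.222.73  RUNNING
gke-kuberlytics-default-pool-3af057b0-q2hx  us-east1-b  n1-highmem-4  10.142.0.2  104.196.9.241  RUNNING
"""
print(count_running_instances(sample))
```

You could then compare the result with the node count you expect from your configuration.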



### Get Credentials for Kubectl
We need to add the credentials for Kubectl to work. 

In [28]:
#gcloud container clusters get-credentials kuberlytics
print(ku.bash_command('get_credentials',cf_g))


Executing get_credentials:
 gcloud container clusters get-credentials kuberlytics
Fetching cluster endpoint and auth data.
kubeconfig entry generated for kuberlytics.



In [20]:
print(ku.bash_command("kubectl cluster-info"))


Executing kubectl cluster-info:
 kubectl cluster-info
Kubernetes master is running at https://35.196.99.23
GLBCDefaultBackend is running at https://35.196.99.23/api/v1/namespaces/kube-system/services/default-http-backend/proxy
Heapster is running at https://35.196.99.23/api/v1/namespaces/kube-system/services/heapster/proxy
KubeDNS is running at https://35.196.99.23/api/v1/namespaces/kube-system/services/kube-dns/proxy
kubernetes-dashboard is running at https://35.196.99.23/api/v1/namespaces/kube-system/services/kubernetes-dashboard/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
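
`kubectl` colors this output with ANSI escape sequences, which can show up as stray codes such as `[0;32m` when the captured text is printed in a notebook. Here is a minimal sketch for removing them, assuming a hypothetical helper named `strip_ansi`. The pattern covers color and style sequences only.

```python
import re

# Matches sequences such as ESC[0;32m and ESC[0m.
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text):
    """Remove ANSI color and style escape sequences from text."""
    return ANSI_ESCAPE.sub("", text)


colored = ("\x1b[0;32mKubernetes master\x1b[0m is running at "
           "\x1b[0;33mhttps://35.196.99.23\x1b[0m")
print(strip_ansi(colored))
```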



### Helm Installation
We are going to use Helm to install a variety of analytics tools.  This command will install Tiller on your cluster.  As they say, "Happy Helming!"  

In [29]:
# Provided helm is installed, this will install Tiller on the Cluster.
print(ku.bash_command("helm init --upgrade"))

Executing helm init --upgrade:
 helm init --upgrade
$HELM_HOME has been configured at /Users/jasonkuruzovich/.helm.

Tiller (the helm server side component) has been upgraded to the current version.
Happy Helming!



In [43]:
#give this a few minutes, but it should show a client and a server version. 
print(ku.bash_command("helm version"))

Executing helm version:
 helm version
Client: &version.Version{SemVer:"v2.5.0", GitCommit:"012cb0ac1a1b2f888144ef5a67b8dab6c2d45be6", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.5.0", GitCommit:"012cb0ac1a1b2f888144ef5a67b8dab6c2d45be6", GitTreeState:"clean"}
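
To confirm in code that the client and server agree, you can pull the `SemVer` values out of the text. Here is a minimal sketch, assuming a hypothetical helper named `helm_versions` and the Helm 2 style output format shown here.

```python
import re


def helm_versions(output):
    """Return (client, server) SemVer strings, or None where not found."""
    pairs = re.findall(
        r'^(Client|Server): .*?SemVer:"([^"]+)"', output, flags=re.MULTILINE
    )
    found = dict(pairs)
    return found.get("Client"), found.get("Server")


sample = (
    'Client: &version.Version{SemVer:"v2.5.0", GitTreeState:"clean"}\n'
    'Server: &version.Version{SemVer:"v2.5.0", GitTreeState:"clean"}\n'
)
client, server = helm_versions(sample)
print(client, server, client == server)
```

A missing server version (returned as `None`) usually just means Tiller is not ready yet, so wait and try again.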



### Enabling Autoscaling (optional)

This should launch a pod within your Kubernetes cluster that will handle autoscaling of the cluster. Note that this seems to take a while and may even time out. Consider opening a terminal session and running it there. 

In [41]:
ku.bash_command(cf_g['autoscale'])

Executing gcloud alpha container clusters update kuberlytics --enable-autoscaling --min-nodes=1 --max-nodes=4 --zone=us-east1-b --node-pool=default-pool:
 gcloud alpha container clusters update kuberlytics --enable-autoscaling --min-nodes=1 --max-nodes=4 --zone=us-east1-b --node-pool=default-pool


'Updating kuberlytics...\n.................................................................................................................................................................................................................done.\nUpdated [https://container.googleapis.com/v1/projects/kuberlytics/zones/us-east1-b/clusters/kuberlytics].\n'

### Fixed IP Address (optional)
Often it can be useful to have a fixed IP address so that you can point a DNS record at an application. 

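If you reserve an address without specifying a region, `gcloud` asks you to choose between a global address and a region:

     [1] global
     [2] region: asia-east1
     [3] region: asia-northeast1
     [4] region: asia-southeast1
     [5] region: europe-west1
     [6] region: us-central1
     [7] region: us-east1
     [8] region: us-east4
     [9] region: us-west1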

In [None]:
#Reserve a fixed IP (note: you can only do this once.)
print(ku.bash_command(cf_g['create_fixedip']))


In [12]:
fixed_ip=ku.get_fixed_ip(cf_g)
print(fixed_ip)

Executing gcloud compute addresses describe jupyterhub-dojo --region=us-east1:
 gcloud compute addresses describe jupyterhub-dojo --region=us-east1
35.185.85.199


In [13]:
#Write the public IP to the Jupyterhub file.
jupyterhub_yaml='../../config/jupyterhub/config.yaml'
with open(jupyterhub_yaml, 'r') as j_yaml:
    cf_j=ruamel.yaml.round_trip_load(j_yaml, preserve_quotes=True)
cf_j['fixed_ip']=fixed_ip
#Use a with block so the file is closed (and flushed) after writing.
with open(jupyterhub_yaml, 'w') as j_yaml:
    ruamel.yaml.round_trip_dump(cf_j, j_yaml)
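
Since the address is captured from command output, it is worth confirming that it really is a public IPv4 address before saving it. Here is a minimal sketch using the standard `ipaddress` module, assuming a hypothetical helper named `is_public_ipv4`.

```python
import ipaddress


def is_public_ipv4(value):
    """Return True if value parses as a globally routable IPv4 address."""
    try:
        address = ipaddress.IPv4Address(value.strip())
    except ValueError:
        # Not an IPv4 address at all (for example, an error message).
        return False
    return address.is_global


print(is_public_ipv4("35.185.85.199"))
print(is_public_ipv4("10.142.0.3"))
```

The second call shows why this matters: an internal cluster address parses fine but is not something you can point a public DNS record at.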

### Kubernetes Web Dashboard
The Kubernetes web dashboard can be used to launch applications. 

The user for the dashboard is `admin`, and the password can be found using the commands below. Go ahead and open the Kubernetes dashboard now. It is a great place to see the usage of your cluster, among other things.

In [4]:
#You can use this to show the Kubernetes Dashboard.
result=ku.bash_command("kubectl cluster-info")
print(result)
result=ku.bash_command('describe_cluster',cf_g)
result=result.split("\n")
password=[x for x in result if "password:" in x]
print (password)

Executing kubectl cluster-info:
 kubectl cluster-info
Kubernetes master is running at https://35.196.99.23
GLBCDefaultBackend is running at https://35.196.99.23/api/v1/namespaces/kube-system/services/default-http-backend/proxy
Heapster is running at https://35.196.99.23/api/v1/namespaces/kube-system/services/heapster/proxy
KubeDNS is running at https://35.196.99.23/api/v1/namespaces/kube-system/services/kube-dns/proxy
kubernetes-dashboard is running at https://35.196.99.23/api/v1/namespaces/kube-system/services/kubernetes-dashboard/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

Executing describe_cluster:
 gcloud container clusters describe kuberlytics
['  password: ddRfSjvzwokwQvX6']


#### That is it! You now have your own Kubernetes cluster that is ready to go. 

### Resize a Cluster
To stop a cluster without deleting it, resize it to 0 nodes. The cells below resize the cluster to the class size, to 0, and back to the normal size.

In [48]:
ku.bash_command(cf_g['class_size_cluster'])

Executing gcloud container clusters resize kuberlytics --size=2 --quiet:
 gcloud container clusters resize kuberlytics --size=2 --quiet


'Resizing kuberlytics...\ndone.\nUpdated [https://container.googleapis.com/v1/projects/kuberlytics/zones/us-east1-b/clusters/kuberlytics].\n'

In [44]:
ku.bash_command(cf_g['stop_cluster'])

Executing gcloud container clusters resize kuberlytics --size=0 --quiet:
 gcloud container clusters resize kuberlytics --size=0 --quiet


'Resizing kuberlytics...\ndone.\nUpdated [https://container.googleapis.com/v1/projects/kuberlytics/zones/us-east1-b/clusters/kuberlytics].\n'

In [54]:
ku.bash_command(cf_g['normal_size_cluster'])

Executing gcloud container clusters resize kuberlytics --size=1 --quiet:
 gcloud container clusters resize kuberlytics --size=1 --quiet


'Resizing kuberlytics...\ndone.\nUpdated [https://container.googleapis.com/v1/projects/kuberlytics/zones/us-east1-b/clusters/kuberlytics].\n'

### Deleting a Kubernetes Cluster

# **WARNING: DELETE THE JUPYTERHUB INSTANCE FIRST**
If you don't delete the JupyterHub instance and you are using a fixed IP, the forwarding rules for the fixed IP won't be deleted. This check should really be added to the codebase.

This will delete the Kubernetes cluster.

In [14]:
#Always delete the namespace first. 
print(ku.bash_command('delete_cluster',cf_g))

Executing delete_cluster:
 gcloud container clusters delete kuberlytics --zone=us-east1-b --quiet
Deleting cluster kuberlytics...
.............................................................done.
Deleted [https://container.googleapis.com/v1/projects/kuberlytics/zones/us-east1-b/clusters/kuberlytics].



### Deleting Fixed IP Address

In [None]:
ku.bash_command('delete_fixedip',cf_g)

In [56]:
print(ku.bash_command('describe_fixedip',cf_g))

Executing describe_fixedip:
 gcloud compute addresses describe jupyterhub-dojo --region=us-east1
address: 35.185.85.199
creationTimestamp: '2017-07-09T07:36:35.439-07:00'
description: ''
id: '1353409085614031260'
kind: compute#address
name: jupyterhub-dojo
region: https://www.googleapis.com/compute/v1/projects/kuberlytics/regions/us-east1
selfLink: https://www.googleapis.com/compute/v1/projects/kuberlytics/regions/us-east1/addresses/jupyterhub-dojo
status: IN_USE
users:
- https://www.googleapis.com/compute/v1/projects/kuberlytics/regions/us-east1/forwardingRules/a507e076f966b11e7adad42010a8e001
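
The `users:` list tells you which forwarding rules still hold the address. Here is a minimal sketch for extracting their names from this output, assuming a hypothetical helper named `forwarding_rule_names`. It uses plain string handling rather than a YAML parser.

```python
def forwarding_rule_names(describe_output):
    """Return forwarding rule names listed under `users:` in the output."""
    names = []
    in_users = False
    for line in describe_output.splitlines():
        if line.startswith("users:"):
            in_users = True
        elif in_users and line.startswith("- "):
            url = line[2:].strip()
            if "/forwardingRules/" in url:
                # The rule name is the last path segment of the URL.
                names.append(url.rsplit("/", 1)[-1])
        elif in_users:
            # Any other line means the users list has ended.
            in_users = False
    return names


sample = (
    "name: jupyterhub-dojo\n"
    "status: IN_USE\n"
    "users:\n"
    "- https://www.googleapis.com/compute/v1/projects/kuberlytics/regions/"
    "us-east1/forwardingRules/a507e076f966b11e7adad42010a8e001\n"
)
print(forwarding_rule_names(sample))
```

Each name returned is a rule you would pass to `gcloud compute forwarding-rules delete`, together with the `--region` flag.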



In [13]:
!gcloud compute forwarding-rules delete forwarding_rule --quiet


ERROR: (gcloud.compute.forwarding-rules.delete) Underspecified resource [forwarding_rule]. Specify one of the [--global, --region] flags.


In [None]:
!gcloud compute forwarding-rules delete jupyterhub-dojo  --region=us-east1

The following forwarding rules will be deleted:
 - [jupyterhub-dojo] in [us-east1]

Do you want to continue (Y/n)?  