Spin up AWS Kubernetes cluster for workshop #1

Closed
jmunroe opened this issue Apr 15, 2019 · 5 comments

jmunroe commented Apr 15, 2019

Follow the installation instructions from ZeroToJupyterHub:

https://zero-to-jupyterhub.readthedocs.io/en/latest/amazon/step-zero-aws-eks.html

and Pangeo:

https://pangeo.io/setup_guides/cloud.html

Related issues with good suggestions/advice:

jmunroe commented Apr 15, 2019

There are two ways to deploy Kubernetes on AWS. Until recently, kops was needed, but AWS now supports Kubernetes natively with EKS. My plan is to start with EKS and use kops as a fallback if I encounter any show-stopping issues.

jmunroe commented Apr 15, 2019

We will have AWS credits for the workshop but in the near term, I am using my own AWS account for testing.

jmunroe commented Apr 15, 2019

(Following the style of ESIPFed/esiphub-dev#26 (comment) here)

https://zero-to-jupyterhub.readthedocs.io/en/latest/amazon/step-zero-aws-eks.html

1. Create an IAM Role for the EKS Service Role.

It should have the following policies:

AmazonEKSClusterPolicy
AmazonEKSServicePolicy
AmazonEC2ContainerRegistryReadOnly

(From the user interface, select EKS as the service, then follow the default steps.)

I created an IAM Role (not a user) associated with EKS. I expect that soon AWS will automatically create a service-linked role for me, but for now I created a role called eksServiceRole. I added the AmazonEC2ContainerRegistryReadOnly policy after the role was created.
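
For reference, roughly the same role can be created from the CLI. A sketch, where the trust-policy file name is my own and the role name matches the one above:

# Trust policy allowing the EKS service to assume the role
cat > eks-trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "eks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

aws iam create-role --role-name eksServiceRole \
    --assume-role-policy-document file://eks-trust-policy.json

# Attach the three policies listed above
for policy in AmazonEKSClusterPolicy AmazonEKSServicePolicy AmazonEC2ContainerRegistryReadOnly; do
    aws iam attach-role-policy --role-name eksServiceRole \
        --policy-arn "arn:aws:iam::aws:policy/$policy"
done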

2. Create a VPC if you don’t already have one.
3. Create a Security Group for the EKS Control Plane to use

I followed the defaults given at https://docs.aws.amazon.com/eks/latest/userguide/getting-started.html

Called my VPC eks-vpc. After the VPC has been created the VpcID, SecurityGroups and the SubnetIds are available in the output tab.
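
The same outputs can also be read from the CLI; a sketch, assuming the stack is named eks-vpc:

aws cloudformation describe-stacks --stack-name eks-vpc \
    --query "Stacks[0].Outputs" --output table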

4. Create your EKS cluster (using the user interface)

I created a Kubernetes cluster named PangeoC3DISKubernetesCluster. The default version of Kubernetes on EKS is 1.11, so let's use that. The VPC, subnets, and security groups all have the root name eks-vpc in them, so they were easy to identify. I left public API access enabled and all of the logging disabled for now.
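
The equivalent CLI call, with the role ARN, subnet IDs, and security group ID left as placeholders to be filled in from the eks-vpc stack outputs:

aws eks create-cluster --name PangeoC3DISKubernetesCluster \
    --kubernetes-version 1.11 \
    --role-arn arn:aws:iam::<account-id>:role/eksServiceRole \
    --resources-vpc-config subnetIds=<subnet-1>,<subnet-2>,<subnet-3>,securityGroupIds=<sg-id>

# Cluster creation takes a while; poll until the status is ACTIVE
aws eks describe-cluster --name PangeoC3DISKubernetesCluster --query "cluster.status"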

5. Install kubectl and aws-iam-authenticator

Instructions for installing kubectl were found at
https://docs.aws.amazon.com/eks/latest/userguide/install-kubectl.html

Instructions for installing aws-iam-authenticator were found at
https://docs.aws.amazon.com/eks/latest/userguide/getting-started.html

I also installed the AWS CLI tools at this point: https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html
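
Quick sanity checks that all three tools are on the PATH:

kubectl version --client
aws-iam-authenticator version
aws --version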

6. Configure kubeconfig

To set up a ~/.kube/config file, I used
aws eks update-kubeconfig --name PangeoC3DISKubernetesCluster --region us-east-1

7. Verify kubectl works

It took a couple of attempts to get the permissions sorted out. I initially tried to create a new IAM user and attached the AdministratorAccess policy to try to give it access to everything. Then I found out that the new Kubernetes cluster has to be accessed with the same IAM identity that created it (something about Kubernetes RBAC). There appeared to be ways of assuming the eksServiceRole role, but I kept hitting issues, and I wondered why none of this is covered in the default documentation. So I resorted to creating an AWS access key on my root account and configuring it with aws configure. Lo and behold, kubectl get svc now works. This is probably something I will need to revisit.
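
A quick way to confirm which identity kubectl will authenticate as, and that the API server responds:

# Show the identity the AWS CLI (and aws-iam-authenticator) will use
aws sts get-caller-identity

# Should list the default kubernetes service if authentication works
kubectl get svc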

8. Create the nodes using CloudFormation

At this point I am recognizing that AWS EKS still requires a significant amount of manual setup. I think it is good for me to go through this to understand the nuts and bolts, but I am tempted to use eksctl the next time through, which may automate much of this for me.

For creating the worker nodes, I followed the instructions here: https://docs.aws.amazon.com/eks/latest/userguide/getting-started.html . Important to note that these seem to be on-demand nodes only. I'll have to investigate further how to leverage spot pricing.

pangeo-worker-nodes

Min Nodes 1, Default Nodes 2, Max Nodes 5; chose the AMI for us-west-1; used m5.large instances.

9. Create an AWS authentication ConfigMap

The NodeInstanceRole was available in the Output tab for the worker nodes stack. It only took about 1 minute for two nodes to be READY.
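
For reference, a sketch of the aws-auth ConfigMap this step applies, with the role ARN left as a placeholder for the NodeInstanceRole value from the stack outputs:

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: <NodeInstanceRole-ARN>
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
EOF

# Watch the worker nodes register
kubectl get nodes --watch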

10. Preparing authenticator for Helm

Created the new RBAC role as suggested:

kubectl create clusterrolebinding cluster-system-anonymous --clusterrole=cluster-admin --user=system:anonymous

The ZTJH instructions also reference a Cluster Autoscaler step: "If you'd like to do some optimizations, you need to deploy Cluster Autoscaler (CA) first. See https://eksworkshop.com/scaling/deploy_ca/" That is something to check up on later.

That completes the instructions for setting up a K8s cluster. Now for helm.

curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get | bash
kubectl --namespace kube-system create serviceaccount tiller
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
helm init --service-account tiller --wait
kubectl patch deployment tiller-deploy --namespace=kube-system --type=json --patch='[{"op": "add", "path": "/spec/template/spec/containers/0/command", "value": ["/tiller", "--listen=localhost:44134"]}]'

Following the instructions at https://zero-to-jupyterhub.readthedocs.io/en/latest/setup-helm.html I installed helm on my new K8s cluster. Other than remembering to wait a minute for tiller to start on the cluster, helm looks like it installed just fine.

~ $ helm version                                                                                                                                                      
Client: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}

And quickly going through https://zero-to-jupyterhub.readthedocs.io/en/latest/setup-jupyterhub.html resulted in JupyterHub being up and running.


I've now shut down this node group and the EKS cluster. Some AWS credits have been made available on a CSIRO linked account. I'll try this again, but this time with eksctl.

jmunroe commented Apr 18, 2019

I use macOS locally, so Homebrew seems like the easiest way to get the required CLI tools installed:

brew tap weaveworks/tap
brew install weaveworks/tap/eksctl
brew install awscli
brew install aws-iam-authenticator
brew install kubernetes-helm

I already have an IAM user set up with an access key. Since we are setting up the cluster for an audience in Australia, it makes sense to me to use the region ap-southeast-2 (Sydney).

~ $ aws configure
AWS Access Key ID [****************33N7]:
AWS Secret Access Key [****************1s5c]:
Default region name [ap-southeast-2]:
Default output format [None]:

Creating a K8s cluster requires setting up EKS, a VPC, security groups, IAM roles, and a NodeGroup. The claim is that eksctl will handle all of that. Let's give it a shot with the defaults:

~ $ eksctl create cluster
[ℹ]  using region ap-southeast-2
[ℹ]  setting availability zones to [ap-southeast-2b ap-southeast-2a ap-southeast-2c]
[ℹ]  subnets for ap-southeast-2b - public:192.168.0.0/19 private:192.168.96.0/19
[ℹ]  subnets for ap-southeast-2a - public:192.168.32.0/19 private:192.168.128.0/19
[ℹ]  subnets for ap-southeast-2c - public:192.168.64.0/19 private:192.168.160.0/19
[ℹ]  nodegroup "ng-9785e504" will use "ami-0f0121e9e64ebd3dc" [AmazonLinux2/1.12]
[ℹ]  creating EKS cluster "floral-creature-1555603083" in "ap-southeast-2" region
[ℹ]  will create 2 separate CloudFormation stacks for cluster itself and the initial nodegroup
[ℹ]  if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=ap-southeast-2 --name=floral-creature-1555603083'
[ℹ]  2 sequential tasks: { create cluster control plane "floral-creature-1555603083", create nodegroup "ng-9785e504" }
[ℹ]  building cluster stack "eksctl-floral-creature-1555603083-cluster"
[ℹ]  deploying stack "eksctl-floral-creature-1555603083-cluster"
[ℹ]  building nodegroup stack "eksctl-floral-creature-1555603083-nodegroup-ng-9785e504"
[ℹ]  --nodes-min=2 was set automatically for nodegroup ng-9785e504
[ℹ]  --nodes-max=2 was set automatically for nodegroup ng-9785e504
[ℹ]  deploying stack "eksctl-floral-creature-1555603083-nodegroup-ng-9785e504"
[✔]  all EKS cluster resource for "floral-creature-1555603083" had been created
[✔]  saved kubeconfig as "/Users/jmunroe/.kube/config"
[ℹ]  adding role "arn:aws:iam::669462648944:role/eksctl-floral-creature-1555603083-NodeInstanceRole-1W1R2AENPS0IJ" to auth ConfigMap
[ℹ]  nodegroup "ng-9785e504" has 0 node(s)
[ℹ]  waiting for at least 2 node(s) to become ready in "ng-9785e504"
[ℹ]  nodegroup "ng-9785e504" has 2 node(s)
[ℹ]  node "ip-192-168-35-242.ap-southeast-2.compute.internal" is ready
[ℹ]  node "ip-192-168-87-86.ap-southeast-2.compute.internal" is ready
[ℹ]  kubectl command should work with "/Users/jmunroe/.kube/config", try 'kubectl get nodes'
[✔]  EKS cluster "floral-creature-1555603083" in "ap-southeast-2" region is ready

Took a few minutes, but it definitely appears we have a K8s cluster up and running.
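
A couple of quick checks that the cluster and its nodes really exist:

eksctl get cluster --region ap-southeast-2
kubectl get nodes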

Now we set up helm on this new cluster.

kubectl create clusterrolebinding cluster-admin-binding --clusterrole=cluster-admin --user=<your-iam-username>
kubectl --namespace kube-system create serviceaccount tiller
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
helm init --service-account tiller --wait
kubectl --namespace=kube-system patch deployment tiller-deploy --type=json --patch='[{"op": "add", "path": "/spec/template/spec/containers/0/command", "value": ["/tiller", "--listen=localhost:44134"]}]'

Check that helm is up and running

~ $ helm version
Client: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}

Add the pangeo helm repository

helm repo add pangeo https://pangeo-data.github.io/helm-chart/
helm repo update

Create secret.yaml and config.yaml files. Deploy the BinderHub helm chart:

helm install jupyterhub/binderhub --version=0.2.0-7b2c4f8  \
             --name=c3dis --namespace=c3dis \
             -f secret.yaml -f config.yaml

Find out the external IP for the proxy-public service

kubectl --namespace=c3dis get svc proxy-public
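
On AWS the LoadBalancer service is usually fronted by an ELB, so the "external IP" is typically a hostname rather than a numeric address. A one-liner to pull it out:

kubectl --namespace=c3dis get svc proxy-public \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'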

Add the lines

config:
  BinderHub:
    hub_url: http://<external-IP>

to config.yaml. Then upgrade the helm release

helm upgrade c3dis jupyterhub/binderhub --version=0.2.0-7b2c4f8 -f secret.yaml -f config.yaml

Finally, get the IP for reaching BinderHub:

kubectl --namespace=c3dis get svc binder

To deploy the Pangeo helm chart instead, with the secret_config.yaml and jupyter_config.yaml files:

helm install pangeo/pangeo --devel \
     --namespace=c3dis --name=c3dis \
     -f secret_config.yaml \
     -f jupyter_config.yaml

jmunroe commented Apr 29, 2019

This has taken several kicks at the can, but I now have BinderHub up and running on AWS. I could not manage to get AWS ECR working, so I am using DockerHub instead. I also don't have any autoscaling going on, so my plan is to scale this cluster up for the training and bring it back down manually afterwards.
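
Manual scaling of the nodegroup can be done with eksctl; a sketch, with the cluster and nodegroup names as placeholders:

# Scale up before the workshop...
eksctl scale nodegroup --cluster=<cluster-name> --name=<nodegroup-name> --nodes=5

# ...and back down afterwards
eksctl scale nodegroup --cluster=<cluster-name> --name=<nodegroup-name> --nodes=1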

The only remaining issue is figuring out how to do ingress properly with an NGINX server. Right now, the binderhub URL needs to be retrieved with

kubectl get svc --namespace pangeo binder

I should create a DNS record that is not going to change and set up something that points at the ephemeral address AWS is assigning to me.
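
One way to get a stable name would be a Route 53 CNAME pointing at the ELB hostname; a sketch, with the hosted zone ID and domain entirely hypothetical:

aws route53 change-resource-record-sets \
    --hosted-zone-id <hosted-zone-id> \
    --change-batch '{
      "Changes": [{
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "binder.example.org",
          "Type": "CNAME",
          "TTL": 300,
          "ResourceRecords": [{"Value": "<elb-hostname>"}]
        }
      }]
    }'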

@jmunroe jmunroe closed this as completed Sep 7, 2023