[OK!] Chapter 5. Data Engineering


Validating the JupyterHub installation


$ kubectl get ingress -n ml-workshop
NAME                   CLASS   HOSTS                            ADDRESS        PORTS     AGE
ap-airflow2            nginx   airflow.192.168.49.2.nip.io      192.168.49.2   80, 443   3m36s
grafana                nginx   grafana.192.168.49.2.nip.io      192.168.49.2   80, 443   3m36s
jupyterhub             nginx   jupyterhub.192.168.49.2.nip.io   192.168.49.2   80, 443   3m39s
minio-ml-workshop-ui   nginx   minio.192.168.49.2.nip.io        192.168.49.2   80, 443   3m36s
mlflow                 nginx   mlflow.192.168.49.2.nip.io       192.168.49.2   80, 443   3m36s
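
The HOSTS column above holds the URLs used in the rest of these notes. A minimal sketch for listing them, assuming the same column layout as the output above (HOSTS in the third column); the `ingress_urls` name is made up for illustration:

```shell
# Hypothetical helper: print an https:// URL for every ingress host.
# Reads `kubectl get ingress` table output on stdin.
# Assumes HOSTS is the third whitespace-separated column and skips the header row.
ingress_urls() {
  awk 'NR > 1 && NF >= 3 { print "https://" $3 }'
}
```

Usage: `kubectl get ingress -n ml-workshop | ingress_urls`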

// Log in as mluser / mluser (username / password)
https://jupyterhub.192.168.49.2.nip.io

Image: Base Elyra Notebook
Container size: Default

// Check that the notebook pod was created
$ kubectl get pods -n ml-workshop | grep mluser
jupyterhub-nb-mluser                           1/1     Running     0             111s
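
Instead of re-running `kubectl get pods` until the pod shows Running, `kubectl wait` can block until it is Ready. A minimal sketch; the `wait_pod_ready` name and the 180s default timeout are assumptions, not part of the original setup:

```shell
# Hypothetical helper: block until a pod reports Ready or the timeout expires.
# Usage: wait_pod_ready <pod-name> [namespace] [timeout]
# Defaults (assumed): namespace ml-workshop, timeout 180s.
wait_pod_ready() {
  kubectl wait --for=condition=Ready "pod/$1" \
    --namespace="${2:-ml-workshop}" --timeout="${3:-180s}"
}
```

Usage: `wait_pod_ready jupyterhub-nb-mluser`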

Jupyter > Git (left panel) > clone a repo > https://github.com/webmakaka/Machine-Learning-on-Kubernetes.git


File > Hub Control Panel > Stop My Server


Creating a Spark cluster


$ kubectl get pods -n ml-workshop | grep spark-operator
spark-operator-545676669f-nnz84                1/1     Running     0             80m

$ kubectl create -f Chapter05/simple-spark-cluster.yaml -n ml-workshop

$ kubectl get pods -n ml-workshop | grep simple-spark
simple-spark-cluster-m-vpvk2                   1/1     Running     0             85s
simple-spark-cluster-w-b7qgp                   1/1     Running     0             84s
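
The check above can be scripted so that it fails unless the master and worker pods are all Running. A rough sketch, assuming the default `kubectl get pods` column order (NAME, READY, STATUS); the `all_running` name is made up for illustration:

```shell
# Hypothetical helper: succeed only if at least one pod name starts with the
# given prefix and every such pod has STATUS Running.
# Reads `kubectl get pods` table output on stdin; STATUS is assumed to be column 3.
all_running() {
  awk -v prefix="$1" '
    index($1, prefix) == 1 { seen++; if ($3 != "Running") bad++ }
    END { exit (seen + 0 > 0 && bad + 0 == 0) ? 0 : 1 }
  '
}
```

Usage: `kubectl get pods -n ml-workshop | all_running simple-spark-cluster && echo "cluster is up"`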

$ kubectl delete sparkcluster simple-spark-cluster -n ml-workshop

// Log in as mluser / mluser (username / password)
https://jupyterhub.192.168.49.2.nip.io

Image: Elyra Notebook Image with Spark
Container size: Small

$ kubectl get pods -n ml-workshop | grep mluser
jupyterhub-nb-mluser                           1/1     Running     0             6m55s
spark-cluster-mluser-m-bckpw                   1/1     Running     0             6m55s
spark-cluster-mluser-w-2sc6h                   1/1     Running     0             6m55s
spark-cluster-mluser-w-jth79                   1/1     Running     0             6m55s

// Spark UI
https://spark-cluster-mluser.192.168.49.2.nip.io

Jupyter > Chapter05/hellospark.ipynb

Run > Run All Cells


File > Hub Control Panel > Stop My Server