# Current Architecture Diagram

<a href="https://ncar.github.io/cisl-cloud/_static/ccpp-diagram.png"><img src="https://ncar.github.io/cisl-cloud/_static/ccpp-diagram.png" alt="CISL Cloud architecture diagram"/></a>

# Architecture Explanation

All CISL Cloud infrastructure, including the hardware and everything installed on it, is deployed inside the NCAR network and is accessible only while connected to that network. Communication with external services passes through a firewall that controls which traffic is permitted and which is denied.

## Kubernetes

We have installed [Kubernetes (k8s)](https://kubernetes.io/) control plane (master) nodes on Linux virtual machines and worker nodes on bare-metal Linux servers. The cluster currently has 5 Supermicro servers, each with 2 Intel Xeon Gold 6326 processors (16 cores each), 512 GB of RAM, and NVIDIA A2 Tensor Core GPUs with 1280 CUDA cores and 16 GB of memory.

## Securely Exposing Services

On the k8s cluster, an [Nginx Ingress controller](https://docs.nginx.com/nginx-ingress-controller/) is configured to expose Services on the network. It is coupled with [ExternalDNS](https://bitnami.com/stack/external-dns/helm), which creates DNS records so that the fully qualified domain name (FQDN) of each deployed service resolves and the service can be reached by URL. To secure the exposed URLs, we use [cert-manager](https://cert-manager.io/) to issue valid certificates to applications and manage the lifecycle of those certificates. As a result, all services are accessible only via HTTPS with valid certificates.
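As a rough illustration of how the Ingress controller, ExternalDNS, and cert-manager can meet on a single resource, the sketch below builds a Kubernetes Ingress manifest as a Python dictionary. The hostname `myapp.example.org`, the Service name `myapp`, the issuer name `example-issuer`, and the ingress class name `nginx` are hypothetical placeholders, not values taken from the CISL deployment. `cert-manager.io/cluster-issuer` is the standard cert-manager annotation, and ExternalDNS can typically derive the DNS record from the `host` field of the rule.

```python
import json


def build_ingress(name, host, service, port, issuer="example-issuer"):
    """Return a networking.k8s.io/v1 Ingress manifest as a plain dict.

    All argument values are illustrative; `issuer` is a hypothetical
    cert-manager ClusterIssuer name.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": name,
            "annotations": {
                # Ask cert-manager to issue and renew a certificate for this Ingress.
                "cert-manager.io/cluster-issuer": issuer,
            },
        },
        "spec": {
            # Assumed class name; it depends on how the controller was installed.
            "ingressClassName": "nginx",
            # cert-manager stores the issued certificate in this Secret.
            "tls": [{"hosts": [host], "secretName": f"{name}-tls"}],
            "rules": [
                {
                    # ExternalDNS can create a DNS record for this host.
                    "host": host,
                    "http": {
                        "paths": [
                            {
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": service,
                                        "port": {"number": port},
                                    }
                                },
                            }
                        ]
                    },
                }
            ],
        },
    }


manifest = build_ingress("myapp", "myapp.example.org", "myapp", 80)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML.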

## Storage Options

### Rook

[Rook](https://rook.io/docs/rook/v1.11/Getting-Started/intro/) provides storage orchestration for k8s workloads. Rook uses Ceph as a distributed storage system to provide file, block, and object storage to the k8s cluster and the workloads it hosts.
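To make the storage request path more concrete, here is a minimal sketch of how a workload might claim Rook-backed storage through a PersistentVolumeClaim. The StorageClass names `rook-ceph-block` and `rook-cephfs` are the names used in Rook's example manifests and are assumed here; the class names, claim names, and sizes on the CISL cluster may differ.

```python
import json


def build_pvc(name, size, storage_class, access_mode="ReadWriteOnce"):
    """Return a v1 PersistentVolumeClaim manifest as a plain dict.

    `storage_class` selects which Rook-provisioned backend serves the claim;
    the names used below are assumptions borrowed from Rook's examples.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [access_mode],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Block storage: a volume mounted by a single node at a time.
block_claim = build_pvc("scratch", "10Gi", "rook-ceph-block")

# Shared file storage: a volume that many pods can mount read-write.
shared_claim = build_pvc(
    "shared-data", "50Gi", "rook-cephfs", access_mode="ReadWriteMany"
)

print(json.dumps([block_claim, shared_claim], indent=2))
```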

#### GLADE Access

Read-only access to data stored on GLADE is provided to the k8s nodes via NFS and is then exposed to objects in the cluster via Rook.

### Stratus Object Storage

The [Stratus](https://arc.ucar.edu/knowledge_base/70549594) object storage system is for long-term data storage and is provided by the [Advanced Research Computing](https://arc.ucar.edu/) division of NCAR | CISL.

## Applications

### JupyterHub

[JupyterHub](https://jupyter.org/hub) provides a way to spin up dedicated personal JupyterLab environments for users. We currently use GitHub authentication to control access. Rook provides additional storage and mounts for the Jupyter user environments. The user environments are containerized versions of JupyterLab that can be pulled from Docker Hub or from Harbor, our internal container registry.

### Dask Gateway

[Dask](https://www.dask.org/) enables parallel computing in Python and offers options to create separate Dask clusters with dedicated resources. In the NCAR JupyterHub, `dask_gateway` is used to provision a Dask [GatewayCluster](https://gateway.dask.org/api-client.html#gatewaycluster) via Python.
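A minimal sketch of provisioning such a cluster from a notebook is shown below. It assumes the `dask-gateway` client package is installed and that the gateway address is already set in the Dask configuration of the user environment. The helper name `start_dask_cluster` and its `cluster_factory` parameter are hypothetical conveniences added for illustration; they are not part of `dask_gateway`.

```python
def start_dask_cluster(n_workers=2, cluster_factory=None):
    """Provision a Dask cluster and return (cluster, client).

    `cluster_factory` is a hypothetical hook for substituting another
    cluster class; by default dask_gateway.GatewayCluster is used.
    """
    if cluster_factory is None:
        # Imported lazily so this module loads even without dask-gateway.
        from dask_gateway import GatewayCluster

        cluster_factory = GatewayCluster

    # With no arguments, GatewayCluster reads the gateway address
    # from the Dask configuration.
    cluster = cluster_factory()
    # Request a fixed number of workers.
    cluster.scale(n_workers)
    # Connect a Dask client to the new cluster.
    client = cluster.get_client()
    return cluster, client


# Usage inside a JupyterHub session (requires a reachable Dask Gateway):
#   cluster, client = start_dask_cluster(n_workers=4)
#   ... run Dask computations ...
#   cluster.shutdown()
```

Shutting the cluster down when the work is finished releases its dedicated resources for other users.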

### Rancher

[Rancher](https://www.rancher.com/) is an open source container management platform built for organizations that deploy containers in production. CISL operates a Rancher cluster to help manage k8s clusters.

### Harbor

We use [Harbor](https://goharbor.io/) to provide an open source container registry that is closer to the infrastructure running the containers. A local registry lets us take advantage of the network infrastructure and available bandwidth between hardware to speed up pushing and pulling images locally. Harbor also includes an image scanner that reports any vulnerabilities an image contains, so we can address security concerns with images directly.

### Argo CD

[Argo CD](https://argo-cd.readthedocs.io/en/stable/) is a continuous delivery application for Kubernetes. It is responsible for deploying and continuously monitoring running applications and comparing their live state with the desired state set in the associated Git repository.
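The comparison of live state against desired state can be pictured with a toy example. The sketch below is a conceptual illustration only and is not how Argo CD is implemented: the function names and the flat dictionaries standing in for manifests are hypothetical, while `Synced` and `OutOfSync` mirror the sync status names Argo CD reports.

```python
def find_drift(desired, live):
    """Return {key: (desired_value, live_value)} for every top-level key that differs.

    A key missing on one side is reported with None for that side.
    """
    drift = {}
    for key in sorted(set(desired) | set(live)):
        want = desired.get(key)
        have = live.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift


def sync_status(desired, live):
    """Summarize the comparison the way a GitOps tool might report it."""
    return "Synced" if not find_drift(desired, live) else "OutOfSync"


# Desired state as declared in Git versus the state observed in the cluster.
desired = {"image": "myapp:1.2.0", "replicas": 3}
live = {"image": "myapp:1.2.0", "replicas": 2}

print(sync_status(desired, live))  # OutOfSync
print(find_drift(desired, live))   # {'replicas': (3, 2)}
```

In this picture, a sync corresponds to changing the live state until `find_drift` returns an empty dictionary.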

### Web Applications

In addition to the web applications already mentioned, CISL can host other containerized web applications on k8s. Hosting these workloads on k8s makes it easier to ensure applications have valid TLS certificates, to add DNS A records for them, and to provide highly available, redundant compute resources. Examples of web applications users could host in the scientific space include Panel and Bokeh apps, HTML and JavaScript sites, and Jupyter Book documentation to accompany them.