# About this Project
CISL is currently deploying a ***pilot*** on-premise prototype cloud environment for compute and storage.

2i2c deployed a JupyterHub instance in AWS. 
Working with this deployment lets us learn from 2i2c's expertise. The instance can also be used to run planned tests that help determine the costs associated with users running on AWS. 

## On-premise cloud

An on-premise (on-prem) cloud consists of storage, compute, and networking resources hosted on fully redundant hardware that is installed in personal or organizational facilities and made available to users.

### [Kubernetes (k8s)](../how-to/K8s/k8s-intro)
```{note}
This page does not cover what Kubernetes is. If you are unsure exactly what Kubernetes is, take a look at these [search results](https://www.google.com/search?q=What+is+kubernetes) or visit the Kubernetes [home page](https://kubernetes.io/).
```

We utilize a k8s cluster to host JupyterHub along with other services to support open science development. 

This cluster is configured with an ingress controller, an API to create DNS entries, and a certificate manager to offer HTTPS access to user workloads via unique URLs. Documentation on how to use this is available in the [web app docs](../how-to/K8s/Hosting/web-intro). 
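As a rough illustration of how an ingress controller, DNS, and certificates fit together, the sketch below builds a standard `networking.k8s.io/v1` Ingress manifest as a Python dictionary. Everything specific in it is an assumption made for illustration: that the certificate manager is cert-manager (hence the `cert-manager.io/cluster-issuer` annotation), the issuer name `example-issuer`, the `nginx` ingress class, and the host `my-web-app.example.org`. The linked web app docs describe the values this cluster actually expects.

```python
import json


def build_ingress(app_name: str, host: str, service_port: int,
                  cluster_issuer: str = "example-issuer",  # hypothetical issuer name
                  ingress_class: str = "nginx") -> dict:   # assumed ingress class
    """Return a networking.k8s.io/v1 Ingress manifest as a plain dict."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": app_name,
            # Assumption: cert-manager issues the TLS certificate for this host.
            "annotations": {"cert-manager.io/cluster-issuer": cluster_issuer},
        },
        "spec": {
            "ingressClassName": ingress_class,
            # The certificate is stored in this Secret and served for `host`.
            "tls": [{"hosts": [host], "secretName": f"{app_name}-tls"}],
            "rules": [{
                "host": host,
                "http": {"paths": [{
                    "path": "/",
                    "pathType": "Prefix",
                    # Route all traffic for `host` to the app's Service.
                    "backend": {"service": {"name": app_name,
                                            "port": {"number": service_port}}},
                }]},
            }],
        },
    }


manifest = build_ingress("my-web-app", "my-web-app.example.org", 8080)
print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f` accepts JSON as well as YAML, the printed manifest could be saved to a file and applied directly once the placeholder values are replaced.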

### [JupyterHub on k8s](../how-to/k8sJH/k8sJH-intro)
```{note}
The NCAR deployment used the Zero to JupyterHub [documentation](https://z2jh.jupyter.org/en/stable/) to get started with customizing the environment.
```
There is a JupyterHub instance hosted on-prem at NWSC on an RKE2-provisioned k8s cluster. 

Dask Gateway is installed to enable scalable parallel computing within JupyterHub, with the intention of exposing it to services outside JupyterHub. 

JupyterHub KubeSpawner creates single-user environments with access to shared and persistent personal storage space. The spawned user environments come in several default resource sizes, including a GPU option. The spawner uses a customized Docker image that provides the packages, kernels, and extensions the scientific research community relies on for data analysis. The custom environment also gives users read-only access to the campaign and collections directories on GLADE, as well as a shared directory whose specific use case is still being worked out. 
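Resource sizes like these are commonly expressed through KubeSpawner's `profile_list` setting. The sketch below is only an illustration of that mechanism, not the configuration used on this hub: the profile names, CPU and memory limits, and the `nvidia.com/gpu` resource name are all assumed values.

```python
# Hypothetical profiles; sizes and names are placeholders, not this hub's values.
profile_list = [
    {
        "display_name": "Small (CPU only)",
        "description": "Example default size",
        "default": True,
        "kubespawner_override": {"cpu_limit": 2, "mem_limit": "4G"},
    },
    {
        "display_name": "Large (CPU only)",
        "description": "Example larger size",
        "kubespawner_override": {"cpu_limit": 8, "mem_limit": "32G"},
    },
    {
        "display_name": "GPU",
        "description": "Example GPU size",
        "kubespawner_override": {
            "cpu_limit": 4,
            "mem_limit": "16G",
            # Assumes NVIDIA GPUs exposed through the usual device plugin.
            "extra_resource_limits": {"nvidia.com/gpu": "1"},
        },
    },
]


def default_profile(profiles: list) -> dict:
    """Return the single profile flagged as default, or raise if ambiguous."""
    defaults = [p for p in profiles if p.get("default")]
    if len(defaults) != 1:
        raise ValueError(f"expected exactly one default profile, found {len(defaults)}")
    return defaults[0]


print(default_profile(profile_list)["display_name"])
```

In a `jupyterhub_config.py` file this list would be assigned with `c.KubeSpawner.profile_list = profile_list`; a Zero to JupyterHub deployment typically sets the same structure under `singleuser.profileList` in its Helm values.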

Access to this JupyterHub is handled via GitHub authentication and a team under the NCAR organization in GitHub.

### [Rancher](../how-to/K8s/Rancher/rancher-intro)
```{admonition} Rancher
[Rancher](https://www.rancher.com/) is an open source container management platform built for organizations that deploy containers in production. 
```
CISL operates a Rancher cluster to help manage k8s clusters. It provides the ability to manage multiple clusters and services.

### [Harbor - Container Registry](../how-to/harbor/harbor-intro)

We use [Harbor](https://goharbor.io/) to provide a container registry based on open source software that is closer to the infrastructure running containers. A local registry lets us take advantage of the network infrastructure and available bandwidth between our hardware, which speeds up pushing and pulling images locally. Harbor also includes an image scanner that reports any vulnerabilities an image contains, so we can address security concerns with images directly. 

### [Argo CD (Continuous Delivery)](../how-to/argocd/argo-user)

We have an instance of [Argo CD](https://argo-cd.readthedocs.io/en/stable/) installed to help us handle Continuous Delivery (CD). If your application's Git repository is set up in Argo CD, it can be configured to automatically deploy any changes made to that repository without intervention by the user or admins. This allows users to deploy their applications to k8s without having to interact directly with Kubernetes.
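To make the idea concrete, the sketch below builds an Argo CD `Application` manifest as a Python dictionary with automated sync enabled. It is a minimal sketch under assumptions: the repository URL, path, application name, target namespace, the `default` project, and the `argocd` namespace are all placeholders, and the process actually used here may differ from creating this resource by hand.

```python
import json


def build_application(name: str, repo_url: str, path: str,
                      namespace: str, revision: str = "main") -> dict:
    """Return an argoproj.io/v1alpha1 Application manifest as a plain dict."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        # Assumption: Argo CD itself runs in a namespace called "argocd".
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            # Git location that Argo CD watches for changes.
            "source": {"repoURL": repo_url, "path": path,
                       "targetRevision": revision},
            # Deploy into the same cluster that Argo CD runs in.
            "destination": {"server": "https://kubernetes.default.svc",
                            "namespace": namespace},
            # Automated sync: apply new commits, prune removed resources,
            # and revert manual drift in the cluster.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


app = build_application(
    name="my-web-app",                                    # hypothetical app name
    repo_url="https://github.com/example-org/my-web-app",  # hypothetical repo
    path="deploy",                                         # hypothetical manifest dir
    namespace="my-web-app",
)
print(json.dumps(app, indent=2))
```

With `syncPolicy.automated` set, a commit to the watched path is what triggers a deployment, which is why no direct interaction with Kubernetes is needed after the initial setup.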

### Storage

#### Rook

[Rook](https://rook.io/docs/rook/v1.11/Getting-Started/intro/) is used to provide storage orchestration to k8s workloads. Rook uses Ceph as a distributed storage system to provide file, block, and object storage capabilities to the k8s cluster and the workloads it hosts. 

#### [GLADE](https://arc.ucar.edu/knowledge_base/68878466)
NFS is used to provide read-only (RO) access to GLADE in the spawned JupyterHub user environments. Currently the collections and campaign directories on GLADE are available as RO.  

#### [Stratus](https://arc.ucar.edu/knowledge_base/70549594)
S3 buckets are provided via CISL's object storage platform, Stratus.

## 2i2c

### [JupyterHub](../how-to/2i2cJH/2i2cJH-intro) 
2i2c deployed a JupyterHub instance in AWS. Access to this JupyterHub instance is provided by [GitHub Teams](https://github.com/orgs/NCAR/teams/2i2c-cloud-users). Costs will be incurred for any AWS compute resources used. For now, the 2i2c-deployed JupyterHub will be used to validate the 2i2c notebook configuration for research community users. These validations will be structured to produce an estimate of the potential costs of use at scale.
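The arithmetic behind such an estimate can be sketched as follows. This is a simplified model under stated assumptions, not the project's costing method: it treats cost as linear in users and hours, and every number in it (user counts, hours, the hourly rate) is a made-up placeholder rather than an AWS price or a measured result.

```python
def estimate_monthly_cost(users: int, hours_per_user: float,
                          hourly_rate_usd: float,
                          storage_gb: float = 0.0,
                          storage_rate_usd_per_gb: float = 0.0) -> float:
    """Estimate a monthly bill as compute cost plus storage cost.

    Assumes a simple linear model: each user runs one instance for
    `hours_per_user` hours at a flat hourly rate.
    """
    compute = users * hours_per_user * hourly_rate_usd
    storage = storage_gb * storage_rate_usd_per_gb
    return round(compute + storage, 2)


# Placeholder inputs: a small validation group versus a larger user base.
pilot = estimate_monthly_cost(users=5, hours_per_user=20, hourly_rate_usd=0.10)
at_scale = estimate_monthly_cost(users=200, hours_per_user=20, hourly_rate_usd=0.10)
print(f"pilot: ${pilot:.2f}, at scale: ${at_scale:.2f}")
```

Real usage is unlikely to scale this cleanly, since idle time, autoscaling, and data transfer all affect the bill, which is part of what the validation runs are meant to uncover.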

### Storage
Data storage for the 2i2c JupyterHub instance is provided by AWS Elastic File System ([EFS](https://aws.amazon.com/efs/)). NCAR internal data from GLADE and Stratus will not be available from the 2i2c JupyterHub instance.

## Data Access
The [AWS S3 Open Data Registry](https://registry.opendata.aws/) uses AWS S3 API calls the same way Stratus does. By using S3 API calls, we can make data accessible in a familiar way both on the web and on-premise. 
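One way to see the similarity is that the same bucket and key map to a predictable object URL on either platform, with only the endpoint changing. The sketch below uses nothing beyond the Python standard library and rests on assumptions: the bucket, key, and `https://objectstore.example.org` endpoint are placeholders, and path-style addressing for the on-premise endpoint is assumed rather than confirmed for Stratus.

```python
from typing import Optional
from urllib.parse import quote


def object_url(bucket: str, key: str, endpoint: Optional[str] = None,
               region: str = "us-east-1") -> str:
    """Build the HTTPS URL for an S3 object.

    With no endpoint, use AWS virtual-hosted-style addressing.
    With a custom endpoint, use path-style addressing (an assumption
    about how an on-premise S3-compatible service is reached).
    """
    encoded_key = quote(key, safe="/")  # keep "/" so key prefixes stay readable
    if endpoint is None:
        return f"https://{bucket}.s3.{region}.amazonaws.com/{encoded_key}"
    return f"{endpoint.rstrip('/')}/{bucket}/{encoded_key}"


# Same bucket and key, two different endpoints (all values are placeholders).
aws_url = object_url("example-bucket", "data/file 1.nc")
onprem_url = object_url("example-bucket", "data/file 1.nc",
                        endpoint="https://objectstore.example.org/")
print(aws_url)
print(onprem_url)
```

Plain URLs like these only work for publicly readable objects. Private buckets need signed requests, which an S3 client library handles when it is pointed at the appropriate endpoint.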

## Agile Program Management
**[Kanban Board](https://jira.ucar.edu/secure/RapidBoard.jspa?rapidView=220&projectKey=CCPP)**

This project is implementing a hybrid Agile Project Management workflow. Waterfall techniques will be used for high-level project management. Kanban will be used for day-to-day tasks and for creating a continuous flow of value to users. 