# Overview

In this notebook we will explore the options for installing Pachyderm locally (i.e. in a self-hosted format). This means no commercial requirements such as a cloud provider, managed service, or appliance.

Currently the latest stable version is [2.5.5](https://github.com/pachyderm/pachyderm/releases/tag/v2.5.5) which was released 4/27/2023.

# Installation Options

Looking through the documentation we see two options for installing Pachyderm: locally or in the cloud. These two options break down further. The list below tries to enumerate the options presented or implied in the documentation.

- [Local Installation](https://docs.pachyderm.com/2.3.x/getting-started/local-installation/)
    - [Docker Desktop](https://docs.pachyderm.com/latest/getting-started/local-deploy/docker/) - A commercial application that allows running containers as well as a single-node Kubernetes cluster
    - [Minikube](https://docs.pachyderm.com/latest/getting-started/local-deploy/minikube/) - An open source tool that allows running a single-node Kubernetes cluster locally
- [On-premises Installation](https://docs.pachyderm.com/latest/deploy-manage/deploy/on-premises/)
    - Kubernetes cluster
- Cloud Installation
    - AWS
    - Azure
    - GCP

Because the Pachyderm instructions assume that a local installation is performed on a desktop environment, they are tailored to Windows, Mac, and Linux.

As mentioned earlier, we will be installing to a self-hosted kubernetes cluster. 

The official instructions for the local installation can be found [here](https://docs.pachyderm.com/latest/getting-started/local-deploy/).

# Installation On Kubernetes Cluster

Regardless of which local installation method we have chosen, the steps should be relatively similar, as we are essentially deploying an application to Kubernetes.

## Install Homebrew (Docker Desktop / Minikube only)

The first step in the installation is to install Homebrew (if using Linux or Mac; for Windows, the documentation lists manual steps that must be undertaken).

Homebrew is a package manager. It is used to install other components including:
- The Kubernetes environment (Docker Desktop / Minikube)
- pachctl
- helm

Homebrew was originally developed for macOS to give users a convenient way to install open source software. After the tool gained popularity for its large selection of applications and ease of use, a Linux port, originally named Linuxbrew, was created and later merged into Homebrew itself.

Homebrew is an “add-on” package manager. Homebrew installs packages alongside whatever system it runs on.

Because we are running on a full-blown Kubernetes cluster, we do not need to use brew.

## Install pachctl
`pachctl` is a command-line tool for interacting with a Pachyderm cluster from your terminal. It is provided as a precompiled binary available from the [github releases page](https://github.com/pachyderm/pachyderm/releases/tag/v2.5.5).

```
[root@os004k8-master001 ~]# curl -L -O https://github.com/pachyderm/pachyderm/releases/download/v2.5.5/pachctl_2.5.5_linux_amd64.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 37.1M  100 37.1M    0     0  33.6M      0  0:00:01  0:00:01 --:--:-- 82.4M

[root@os004k8-master001 ~]# tar -xzvf pachctl_2.5.5_linux_amd64.tar.gz
pachctl_2.5.5_linux_amd64/pachctl

[root@os004k8-master001 ~]# cp pachctl_2.5.5_linux_amd64/pachctl /usr/bin/

[root@os004k8-master001 ~]# pachctl version
COMPONENT           VERSION
pachctl             2.5.5


```

**Note**: The official installation instructions can be found [here](https://docs.pachyderm.com/2.3.x/getting-started/local-installation/#install-pachctl).
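
The download steps above can be parameterized by version. The following is only a sketch, not part of the official instructions: the function names `pachctl_url` and `install_pachctl` are hypothetical, and the URL pattern is inferred from the single 2.5.5 download link used above, so it may not hold for every release.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the release tarball URL for a given pachctl version.
# The pattern is inferred from the v2.5.5 link above (an assumption).
pachctl_url() {
  local version="$1"
  echo "https://github.com/pachyderm/pachyderm/releases/download/v${version}/pachctl_${version}_linux_amd64.tar.gz"
}

# Hypothetical helper: download, unpack, and copy pachctl into /usr/bin.
# Needs root, as in the transcript above. Defined here but not called.
install_pachctl() {
  local version="$1"
  curl -fL -O "$(pachctl_url "$version")" &&
    tar -xzf "pachctl_${version}_linux_amd64.tar.gz" &&
    cp "pachctl_${version}_linux_amd64/pachctl" /usr/bin/
}

# Print the URL for the version used in this notebook.
pachctl_url 2.5.5
```

Running `install_pachctl 2.5.5` as root would then repeat the three manual steps shown in the transcript.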

## Install Helm (full Kubernetes only)

We can think of helm as a package and deployment manager for kubernetes. Helm automates the creation, packaging, configuration, and deployment of Kubernetes applications. It does this through a packaging structure that combines your configuration files into a single reusable format that can be understood and managed by the utility.

### Helm Compatibility

In order to install Helm, we need to figure out which version is compatible with the version of our Kubernetes cluster. The Helm documentation lists the [compatibility matrix](https://helm.sh/docs/topics/version_skew/) as seen below:


|Helm Version|Supported Kubernetes Versions|
|------------|-----------------------------|
|3.11.x |1.26.x - 1.23.x|
|3.10.x|1.25.x - 1.22.x|
|3.9.x|1.24.x - 1.21.x|
|3.8.x|1.23.x - 1.20.x|
|3.7.x|1.22.x - 1.19.x|
|3.6.x|1.21.x - 1.18.x|
|3.5.x|1.20.x - 1.17.x|
|3.4.x|1.19.x - 1.16.x|


In my case, my Kubernetes cluster was running version 1.21.14:
```
[root@os004k8-master001 ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.9", GitCommit:"b631974d68ac5045e076c86a5c66fba6f128dc72", GitTreeState:"clean", BuildDate:"2022-01-19T17:51:12Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.14", GitCommit:"0f77da5bd4809927e15d1658fb4aa8f13ad890a5", GitTreeState:"clean", BuildDate:"2022-06-15T14:11:36Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}

```

This means I can run Helm 3.6 through 3.9. I will go with 3.9, as it is the newest release line that still supports my version of Kubernetes.
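
The lookup can also be done mechanically. In the table rows above, Helm 3.N covers Kubernetes 1.(N+12) through 1.(N+15); the sketch below encodes just that observation for the rows listed (3.4 to 3.11) and should not be read as a general rule for other Helm releases. The function name `compatible_helm_minors` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the Helm 3 release lines from the table above
# that support a given Kubernetes minor version (e.g. 21 for 1.21.x).
compatible_helm_minors() {
  local k8s_minor="$1"
  local helm_minor lo hi
  for helm_minor in 4 5 6 7 8 9 10 11; do
    # Each table row covers Kubernetes 1.(helm_minor+12) to 1.(helm_minor+15).
    lo=$((helm_minor + 12))
    hi=$((helm_minor + 15))
    if [ "$k8s_minor" -ge "$lo" ] && [ "$k8s_minor" -le "$hi" ]; then
      echo "3.${helm_minor}.x"
    fi
  done
}

# Kubernetes 1.21, as reported by kubectl above.
compatible_helm_minors 21
# prints:
# 3.6.x
# 3.7.x
# 3.8.x
# 3.9.x
```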

### Download Binaries
The official installation instructions can be found [here](https://helm.sh/docs/intro/install/). Each Helm release is distributed as precompiled binaries, including builds for x64 architectures. The binaries can be downloaded from the [github releases page](https://github.com/helm/helm/releases).

In my case, [3.9.4](https://github.com/helm/helm/releases/tag/v3.9.4) is the latest release available in the 3.9 line.


```
[root@os004k8-master001 ~]# curl -O https://get.helm.sh/helm-v3.9.4-linux-amd64.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 13.3M  100 13.3M    0     0  20.1M      0 --:--:-- --:--:-- --:--:-- 20.1M

[root@os004k8-master001 ~]# tar -zxvf helm-v3.9.4-linux-amd64.tar.gz
linux-amd64/
linux-amd64/helm
linux-amd64/LICENSE
linux-amd64/README.md

[root@os004k8-master001 ~]# linux-amd64/helm version
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/.kube/config
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /root/.kube/config
version.BuildInfo{Version:"v3.9.4", GitCommit:"dbc6d8e20fe1d58d50e6ed30f09a04a77e4c68db", GitTreeState:"clean", GoVersion:"go1.17.13"}

[root@os004k8-master001 ~]# cp linux-amd64/helm /usr/bin/
[root@os004k8-master001 ~]# helm version
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/.kube/config
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /root/.kube/config
version.BuildInfo{Version:"v3.9.4", GitCommit:"dbc6d8e20fe1d58d50e6ed30f09a04a77e4c68db", GitTreeState:"clean", GoVersion:"go1.17.13"}

```


### Connect helm to kubernetes cluster
For Helm to install packages on Kubernetes, it needs access to information about the cluster. This is typically provided by the kubeconfig file, a plain text file that contains the configuration and secrets a CLI needs to connect to and authenticate against a Kubernetes cluster. The kubectl and kubeadm programs, for example, use this file.

Helm defaults to your current Kubernetes context, as specified in the `$HOME/.kube/config` file.
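
The `helm version` output earlier warns that `/root/.kube/config` is group-readable and world-readable. One way to address this is to restrict the file to its owner. The helper below is a minimal sketch; `secure_kubeconfig` is a hypothetical name, and it assumes the kubeconfig lives at the default location unless a path is passed in.

```shell
#!/usr/bin/env bash
# Hypothetical helper: make a kubeconfig readable and writable by its owner only.
# Defaults to $HOME/.kube/config, the location named in Helm's warnings.
secure_kubeconfig() {
  local cfg="${1:-$HOME/.kube/config}"
  if [ ! -f "$cfg" ]; then
    echo "kubeconfig not found: $cfg" >&2
    return 1
  fi
  chmod 600 "$cfg"
}

# Example usage (commented out so that sourcing this file changes nothing):
# kubectl config current-context   # show the context Helm will use by default
# secure_kubeconfig                # tighten permissions on the default kubeconfig
```

After the permissions are tightened, the two `WARNING` lines should no longer appear in Helm's output.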

### Add Helm Chart Repository

The Helm package format is referred to as a chart. Similar to regular OS packages, Helm charts are provided by repositories. The package manager (helm) is configured to point to repositories so that users can download and install packages from them. Artifact Hub is a public catalog for finding open source Helm charts.

We want to add the Pachyderm chart repository so we can install the app on our cluster.

```
[root@os004k8-master001 ~]# helm repo add pachyderm https://helm.pachyderm.com
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/.kube/config
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /root/.kube/config
"pachyderm" has been added to your repositories

[root@os004k8-master001 ~]# helm repo update
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/.kube/config
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /root/.kube/config
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "pachyderm" chart repository
Update Complete. ⎈Happy Helming!⎈

```

### Inspect Helm Chart
We can ask Helm for the definition of the Pachyderm chart:

```
[root@os004k8-master001 ~]# helm show chart pachyderm/pachyderm
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/.kube/config
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /root/.kube/config
annotations:
  artifacthub.io/license: Apache-2.0
  artifacthub.io/links: |
    - name: "Pachyderm"
      url: https://www.pachyderm.com/
    - name: "Pachyderm repo"
      url: https://github.com/pachyderm/pachyderm
    - name: "Chart repo"
      url: https://github.com/pachyderm/helmchart
  artifacthub.io/prerelease: "false"
apiVersion: v2
appVersion: 2.5.5
dependencies:
- condition: postgresql.enabled
  name: postgresql
  repository: file://./dependencies/postgresql
  version: 10.8.0
- condition: pachd.lokiDeploy
  name: loki-stack
  repository: https://grafana.github.io/helm-charts
  version: 2.8.1
description: Explainable, repeatable, scalable data science
home: https://www.pachyderm.com/
icon: https://www.pachyderm.com/wp-content/themes/pachyderm/assets/img/favicons/favicon-32x32.png
keywords:
- data science
kubeVersion: '>= 1.16.0-0'
name: pachyderm
sources:
- https://github.com/pachyderm/pachyderm
- https://github.com/pachyderm/helmchart
type: application
version: 2.5.5
```

**Note**: More information about the helm chart format can be found here: https://helm.sh/docs/topics/charts/

#### Inspect Dependencies

Taking a closer look at the chart, we can see there are dependencies listed for this chart:

```
...
dependencies:
- condition: postgresql.enabled
  name: postgresql
  repository: file://./dependencies/postgresql
  version: 10.8.0
- condition: pachd.lokiDeploy
  name: loki-stack
  repository: https://grafana.github.io/helm-charts
  version: 2.8.1
...
```

For the first dependency, we see that there is an instruction to install the postgresql chart from a local source (the chart's location is specified as a relative file reference). Looking at the repository on GitHub, I was able to find the code [here](https://github.com/pachyderm/pachyderm/blob/master/etc/helm/pachyderm/dependencies/postgresql/Chart.yaml). This points to a vanilla PostgreSQL installation [provided by Bitnami](https://github.com/bitnami/charts/tree/main/bitnami/postgresql).

The second dependency points to a package called [loki-stack](https://github.com/grafana/helm-charts/blob/main/charts/loki-stack/Chart.yaml). This package is provided by the Grafana project and hosts the Loki service. Grafana is an open source analytics and monitoring solution. Loki is a log aggregation system designed to store and query logs from applications and infrastructure. The two work together: Loki stores and queries the logs, and Grafana displays them.
The official instructions for installing Grafana Loki can be found [here](https://grafana.com/docs/loki/latest/installation/helm/).
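
To pull just the dependency names and versions out of the `helm show chart` output, a small filter is enough. This is a sketch, not a YAML parser: it assumes the layout shown above, where each dependency entry lists `name` before `version`. The function name `list_chart_deps` is hypothetical, and the sample input is copied from the chart output in this notebook.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read chart metadata on stdin and print "name version"
# for every entry under the top-level "dependencies:" key.
list_chart_deps() {
  awk '
    /^dependencies:/ { in_deps = 1; next }
    in_deps && /^[^ -]/ { in_deps = 0 }       # a new top-level key ends the list
    in_deps && $1 == "name:" { name = $2 }
    in_deps && $1 == "version:" { print name, $2 }
  '
}

# Sample input taken from the chart definition shown earlier.
list_chart_deps <<'EOF'
apiVersion: v2
appVersion: 2.5.5
dependencies:
- condition: postgresql.enabled
  name: postgresql
  repository: file://./dependencies/postgresql
  version: 10.8.0
- condition: pachd.lokiDeploy
  name: loki-stack
  repository: https://grafana.github.io/helm-charts
  version: 2.8.1
description: Explainable, repeatable, scalable data science
name: pachyderm
version: 2.5.5
EOF
# prints:
# postgresql 10.8.0
# loki-stack 2.8.1
```

Against a live repository the same filter can be fed directly: `helm show chart pachyderm/pachyderm | list_chart_deps`.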

#### Inspect Values
Helm was designed so that the helm charts could be defined in a flexible and customizable way. One of the ways this is facilitated is through the values object. Helm assumes the chart is a template and allows users to specify values which map into the configurations hosted in the chart. In this way a user might have a single chart for multiple database deployments; a separate values file could be used to configure each instance (i.e. set the password etc.).
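
As a concrete illustration of a values file, the sketch below writes a small override file and defines a helper that renders the chart with it. The key names (`deployTarget`, `global.postgresql.postgresqlPassword`) come from the chart's default values; the values assigned to them, the file name `pachyderm-overrides.yaml`, and the function name `render_pachyderm` are placeholders chosen for this example, not a recommended configuration.

```shell
#!/usr/bin/env bash
# Write a small override file. Keys left out keep the chart defaults.
# The key names come from the chart; the values here are placeholders.
cat > pachyderm-overrides.yaml <<'EOF'
deployTarget: "LOCAL"
global:
  postgresql:
    postgresqlPassword: "change-me"
EOF

# Hypothetical helper: render the chart with the overrides applied, without
# installing anything. Requires helm and the pachyderm chart repository.
# Extra arguments (for example --set key=value) are passed through to helm.
render_pachyderm() {
  helm template pachyderm pachyderm/pachyderm --version 2.5.5 \
    -f pachyderm-overrides.yaml "$@"
}
```

Calling `render_pachyderm --set console.enabled=false` would layer a single command-line override on top of the file, which is a convenient way to see how a value changes the rendered manifests before installing.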

We can see the values that are packaged with the helm chart in the [git repository](https://github.com/pachyderm/pachyderm/blob/master/etc/helm/pachyderm/values.yaml) or we can ask helm to tell us what values are associated with a given chart with the following:

```
[root@os004k8-master001 ~]# helm show values pachyderm/pachyderm
# SPDX-FileCopyrightText: Pachyderm, Inc. <info@pachyderm.com>
# SPDX-License-Identifier: Apache-2.0

# Deploy Target configures the storage backend to use and cloud provider
# settings (storage classes, etc). It must be one of GOOGLE, AMAZON,
# MINIO, MICROSOFT, CUSTOM or LOCAL.
deployTarget: ""

global:
  postgresql:
    # postgresqlUsername is the username to access the pachyderm and dex databases
    postgresqlUsername: "pachyderm"
    # postgresqlPassword to access the postgresql database.  We set a default well-known password to
    # facilitate easy upgrades when testing locally.  Any sort of install that needs to be secure
    # must specify a secure password here, or provide the postgresqlExistingSecretName and
    # postgresqlExistingSecretKey secret.  If using an external Postgres instance (CloudSQL / RDS /
    # etc.), this is the password that Pachyderm will use to connect to it.
    postgresqlPassword: "insecure-user-password"
    # When installing a local Postgres instance, postgresqlPostgresPassword defines the root
    # ('postgres') user's password.  It must remain consistent between upgrades, and must be
    # explicitly set to a value if security is desired.  Pachyderm does not use this account; this
    # password is only required so that administrators can manually perform administrative tasks.
    postgresqlPostgresPassword: "insecure-root-password"
    # The auth type to use with postgres and pg-bouncer. md5 is the default
    postgresqlAuthType: "md5"
    # If you want to supply the postgresql password in an existing secret, leave Password blank and
    # Supply the name of the existing secret in the namespace and the key in that secret with the password
    postgresqlExistingSecretName: ""
    postgresqlExistingSecretKey: ""
    # postgresqlDatabase is the database name where pachyderm data will be stored
    postgresqlDatabase: "pachyderm"
    # The postgresql database host to connect to. Defaults to postgres service in subchart
    postgresqlHost: "postgres"
    # The postgresql database port to connect to. Defaults to postgres server in subchart
    postgresqlPort: "5432"
    # postgresqlSSL is the SSL mode to use for pg-bouncer connecting to Postgres, for the default local postgres it is disabled
    postgresqlSSL: "disable"
    # CA Certificate required to connect to Postgres
    postgresqlSSLCACert: ""
    # TLS Secret with cert/key to connect to Postgres
    postgresqlSSLSecret: ""
    # Indicates the DB name that dex connects to
    # Indicates the DB name that dex connects to. Defaults to "Dex" if not set.
    identityDatabaseFullNameOverride: ""
  # imagePullSecrets allow you to pull images from private repositories, these will also be added to pipeline workers
  # https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
  # Example:
  # imagePullSecrets:
  #   - regcred
  imagePullSecrets: []
  # when set, the certificate file in pachd-tls-cert will be loaded as the root certificate for pachd, console, and enterprise-server pods
  customCaCerts: false
  # Sets the HTTP/S proxy server address for console, pachd, and enterprise server.  (This is for
  # traffic leaving the cluster, not traffic coming into the cluster.)
  proxy: ""
  # If proxy is set, this allows you to set a comma-separated list of destinations that bypass the proxy
  noProxy: ""
  # Set security context runAs users. If running on openshift, set enabled to false as openshift creates its own contexts.
  securityContexts:
    enabled: true

console:
  # enabled controls whether the console manifests are created or not.
  enabled: true
  annotations: {}
  image:
    # repository is the image repo to pull from; together with tag it
    # replicates the --console-image & --registry arguments to pachctl
    # deploy.
    repository: "pachyderm/haberdashery"
    pullPolicy: "IfNotPresent"
    # tag is the image repo to pull from; together with repository it
    # replicates the --console-image argument to pachctl deploy.
    tag: "2.5.5-1"
  priorityClassName: ""
  nodeSelector: {}
  tolerations: []
  # podLabels specifies labels to add to the console pod.
  podLabels: {}
  # resources specifies the resource request and limits.
  resources:
    {}
    #limits:
    #  cpu: "1"
    #  memory: "2G"
    #requests:
    #  cpu: "1"
    #  memory: "2G"
  config:
    reactAppRuntimeIssuerURI: "" # Inferred if running locally or using ingress
    oauthRedirectURI: "" # Infered if running locally or using ingress
    oauthClientID: "console"
    oauthClientSecret: "" # Autogenerated on install if blank
    # oauthClientSecretSecretName is used to set the OAuth Client Secret via an existing k8s secret.
    # The value is pulled from the key, "OAUTH_CLIENT_SECRET".
    oauthClientSecretSecretName: ""
    graphqlPort: 4000
    pachdAddress: "pachd-peer:30653"
    disableTelemetry: false # Disables analytics and error data collection

  service:
    annotations: {}
    # labels specifies labels to add to the console service.
    labels: {}
    # type specifies the Kubernetes type of the console service.
    type: ClusterIP

etcd:
  affinity: {}
  annotations: {}
  # dynamicNodes sets the number of nodes in the etcd StatefulSet.  It
  # is analogous to the --dynamic-etcd-nodes argument to pachctl
  # deploy.
  dynamicNodes: 1
  image:
    repository: "pachyderm/etcd"
    tag: "v3.5.5"
    pullPolicy: "IfNotPresent"
  # maxTxnOps sets the --max-txn-ops in the container args
  maxTxnOps: 10000
  priorityClassName: ""
  nodeSelector: {}
  # podLabels specifies labels to add to the etcd pod.
  podLabels: {}
  # resources specifies the resource request and limits
  resources:
    {}
    #limits:
    #  cpu: "1"
    #  memory: "2G"
    #requests:
    #  cpu: "1"
    #  memory: "2G"
  # storageClass indicates the etcd should use an existing
  # StorageClass for its storage.  It is analogous to the
  # --etcd-storage-class argument to pachctl deploy.
  # More info for setting up storage classes on various cloud providers:
  # AWS: https://docs.aws.amazon.com/eks/latest/userguide/storage-classes.html
  # GCP: https://cloud.google.com/compute/docs/disks/performance#disk_types
  # Azure: https://docs.microsoft.com/en-us/azure/aks/concepts-storage#storage-classes
  storageClass: ""
  # storageSize specifies the size of the volume to use for etcd.
  # Recommended Minimum Disk size for Microsoft/Azure: 256Gi  - 1,100 IOPS https://azure.microsoft.com/en-us/pricing/details/managed-disks/
  # Recommended Minimum Disk size for Google/GCP: 50Gi        - 1,500 IOPS https://cloud.google.com/compute/docs/disks/performance
  # Recommended Minimum Disk size for Amazon/AWS: 500Gi (GP2) - 1,500 IOPS https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volume-types.html
  storageSize: 10Gi
  service:
    # annotations specifies annotations to add to the etcd service.
    annotations: {}
    # labels specifies labels to add to the etcd service.
    labels: {}
    # type specifies the Kubernetes type of the etcd service.
    type: ClusterIP
  tolerations: []

enterpriseServer:
  enabled: false
  affinity: {}
  annotations: {}
  tolerations: []
  priorityClassName: ""
  nodeSelector: {}
  service:
    type: ClusterIP
    apiGRPCPort: 31650
    prometheusPort: 31656
    oidcPort: 31657
    identityPort: 31658
    s3GatewayPort: 31600
  # There are three options for TLS:
  # 1. Disabled
  # 2. Enabled, existingSecret, specify secret name
  # 3. Enabled, newSecret, must specify cert, key and name
  tls:
    enabled: false
    secretName: ""
    newSecret:
      create: false
      crt: ""
      key: ""
  resources:
    {}
    #limits:
    #  cpu: "1"
    #  memory: "2G"
    #requests:
    #  cpu: "1"
    #  memory: "2G"
  # podLabels specifies labels to add to the pachd pod.
  podLabels: {}
  clusterDeploymentID: ""
  image:
    repository: "pachyderm/pachd"
    pullPolicy: "IfNotPresent"
    # tag defaults to the chart’s specified appVersion.
    tag: ""

ingress:
  enabled: false
  annotations: {}
  host: ""
  # when set to true, uriHttpsProtoOverride will add the https protocol to the ingress URI routes without configuring certs
  uriHttpsProtoOverride: false
  # There are three options for TLS:
  # 1. Disabled
  # 2. Enabled, existingSecret, specify secret name
  # 3. Enabled, newSecret, must specify cert, key, secretName and set newSecret.create to true
  tls:
    enabled: false
    secretName: ""
    newSecret:
      create: false
      crt: ""
      key: ""

# loki-stack contains values that will be passed to the loki-stack subchart
loki-stack:
  loki:
    serviceAccount:
      automountServiceAccountToken: false
    persistence:
      enabled: true
      accessModes:
        - ReadWriteOnce
      size: 10Gi
      # More info for setting up storage classes on various cloud providers:
      # AWS: https://docs.aws.amazon.com/eks/latest/userguide/storage-classes.html
      # GCP: https://cloud.google.com/compute/docs/disks/performance#disk_types
      # Azure: https://docs.microsoft.com/en-us/azure/aks/concepts-storage#storage-classes
      storageClassName: ""
      annotations: {}
      priorityClassName: ""
      nodeSelector: {}
      tolerations: []
    config:
      server:
        grpc_server_max_recv_msg_size: 67108864 # 64MiB
      query_scheduler:
        grpc_client_config:
          max_send_msg_size: 67108864 # 64MiB
      limits_config:
        retention_period: 24h
        retention_stream:
          - selector: '{suite="pachyderm"}'
            priority: 1
            period: 168h # = 1 week
  grafana:
    enabled: false
  promtail:
    config:
      clients:
        - url: "http://{{ .Release.Name }}-loki:3100/loki/api/v1/push"
      snippets:
        # The scrapeConfigs section is copied from loki-stack-2.6.4
        # The pipeline_stages.match stanza has been added to prevent multiple lokis in a cluster from mixing their logs.
        scrapeConfigs: |
          - job_name: kubernetes-pods
            pipeline_stages:
              {{- toYaml .Values.config.snippets.pipelineStages | nindent 4 }}
              - match:
                  selector: '{namespace!="{{ .Release.Namespace }}"}'
                  action: drop
            kubernetes_sd_configs:
              - role: pod
            relabel_configs:
              - source_labels:
                  - __meta_kubernetes_pod_controller_name
                regex: ([0-9a-z-.]+?)(-[0-9a-f]{8,10})?
                action: replace
                target_label: __tmp_controller_name
              - source_labels:
                  - __meta_kubernetes_pod_label_app_kubernetes_io_name
                  - __meta_kubernetes_pod_label_app
                  - __tmp_controller_name
                  - __meta_kubernetes_pod_name
                regex: ^;*([^;]+)(;.*)?$
                action: replace
                target_label: app
              - source_labels:
                  - __meta_kubernetes_pod_label_app_kubernetes_io_instance
                  - __meta_kubernetes_pod_label_release
                regex: ^;*([^;]+)(;.*)?$
                action: replace
                target_label: instance
              - source_labels:
                  - __meta_kubernetes_pod_label_app_kubernetes_io_component
                  - __meta_kubernetes_pod_label_component
                regex: ^;*([^;]+)(;.*)?$
                action: replace
                target_label: component
              {{- if .Values.config.snippets.addScrapeJobLabel }}
              - replacement: kubernetes-pods
                target_label: scrape_job
              {{- end }}
              {{- toYaml .Values.config.snippets.common | nindent 4 }}
              {{- with .Values.config.snippets.extraRelabelConfigs }}
              {{- toYaml . | nindent 4 }}
              {{- end }}
        pipelineStages:
          - cri: {}
        common:
          # This is copy and paste of existing actions, so we don't lose them.
          # Cf. https://github.com/grafana/loki/issues/3519#issuecomment-1125998705
          - action: replace
            source_labels:
              - __meta_kubernetes_pod_node_name
            target_label: node_name
          - action: replace
            source_labels:
              - __meta_kubernetes_namespace
            target_label: namespace
          - action: replace
            replacement: $1
            separator: /
            source_labels:
              - namespace
              - app
            target_label: job
          - action: replace
            source_labels:
              - __meta_kubernetes_pod_name
            target_label: pod
          - action: replace
            source_labels:
              - __meta_kubernetes_pod_container_name
            target_label: container
          - action: replace
            replacement: /var/log/pods/*$1/*.log
            separator: /
            source_labels:
              - __meta_kubernetes_pod_uid
              - __meta_kubernetes_pod_container_name
            target_label: __path__
          - action: replace
            regex: true/(.*)
            replacement: /var/log/pods/*$1/*.log
            separator: /
            source_labels:
              - __meta_kubernetes_pod_annotationpresent_kubernetes_io_config_hash
              - __meta_kubernetes_pod_annotation_kubernetes_io_config_hash
              - __meta_kubernetes_pod_container_name
            target_label: __path__
          - action: keep
            regex: pachyderm
            source_labels:
              - __meta_kubernetes_pod_label_suite
          # this gets all kubernetes labels as well
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
    # Tolerations for promtail pods. Promtail must run on any node where pachyderm resources will run or you won't get any logs for them
    # For example, GKE gpu nodes have a default taint of nvidia.com/gpu=present:NoSchedule so if you use GPUs we wouldn't have logs
    tolerations: []
    livenessProbe:
      failureThreshold: 5
      tcpSocket:
        port: http-metrics
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1

# The pachw controller creates a pool of pachd instances running in 'pachw' mode which can dynamically scale to handle
# storage related tasks
pachw:
  # When set to true, inheritFromPachd defaults below configuration options like 'resources' and 'tolerations' to
  # values from pachd. These values can be overridden by defining the corresponding pachw values below.
  # When set to false, a nil value will be used by default instead. Some configuration variables will always use their
  # corresponding pachd value, regardless of whether 'inheritFromPachd' is true, such as 'serviceAccountName'
  inheritFromPachd: true
  # inSidecars is enabled by default to also process storage related tasks in pipeline storage sidecars like version 2.4 or less.
  # when enabled, pachw instances can still run in their own dedicated kubernetes deployment if maxReplicas is greater than 0.
  # For more control of where pachw instances run, 'inSidecars' can be disabled.
  inSidecars: true
  maxReplicas: 1
  # minReplicas: 0
  # We recommend defining resources when running pachw with a high value of maxReplicas.
  #resources:
  #  limits:
  #    cpu: "1"
  #    memory: "2G"
  #  requests:
  #    cpu: "1"
  #  memory: "2G"
  #
  #tolerations: []
  #affinity: {}
  #nodeSelector: {}
pachd:
  enabled: true
  preflightChecks:
    # if enabled runs kube validation preflight checks.
    enabled: true
  affinity: {}
  annotations: {}
  # clusterDeploymentID sets the Pachyderm cluster ID.
  clusterDeploymentID: ""
  configJob:
    annotations: {}
  # goMaxProcs is passed as GOMAXPROCS to the pachd container.  pachd can automatically pick an
  # optimal GOMAXPROCS from the configured CPU limit, but this overrides it.
  goMaxProcs: 0
  # goMemLimit is passed as GOMEMLIMIT to the pachd container. pachd can automatically pick an
  # optimal GOMEMLIMIT from the configured memory request or limit, but this overrides it.  This is a string
  # because it can be something like '256MiB'.
  goMemLimit: ""
  # gcPercent sets the initial garbage collection target percentage.
  gcPercent: 0
  image:
    repository: "pachyderm/pachd"
    pullPolicy: "IfNotPresent"
    # tag defaults to the chart’s specified appVersion.
    # This sets the worker image tag as well (they should be kept in lock step)
    tag: ""
  logLevel: "info"
  disableLogSampling: false
  developmentLogger: false
  # If lokiDeploy is true, a Pachyderm-specific instance of Loki will
  # be deployed.
  lokiDeploy: true
  # lokiLogging enables Loki logging if set.
  lokiLogging: true
  metrics:
    # enabled sets the METRICS environment variable if set.
    enabled: true
    # endpoint should be the URL of the metrics endpoint.
    endpoint: ""
  priorityClassName: ""
  nodeSelector: {}
  # podLabels specifies labels to add to the pachd pod.
  podLabels: {}
  # resources specifies the resource requests and limits
  # replicas sets the number of pachd running pods
  replicas: 1
  resources:
    {}
    #limits:
    #  cpu: "1"
    #  memory: "2G"
    #requests:
    #  cpu: "1"
    #  memory: "2G"
  # requireCriticalServersOnly only requires the critical pachd
  # servers to startup and run without errors.  It is analogous to the
  # --require-critical-servers-only argument to pachctl deploy.
  requireCriticalServersOnly: false
  # If enabled, External service creates a service which is safe to
  # be exposed externally
  externalService:
    enabled: false
    # (Optional) specify the existing IP Address of the load balancer
    loadBalancerIP: ""
    apiGRPCPort: 30650
    s3GatewayPort: 30600
    annotations: {}
  service:
    # labels specifies labels to add to the pachd service.
    labels: {}
    # type specifies the Kubernetes type of the pachd service.
    type: "ClusterIP"
    annotations: {}
    apiGRPCPort: 30650
    prometheusPort: 30656
    oidcPort: 30657
    identityPort: 30658
    s3GatewayPort: 30600
    #apiGrpcPort:
    #  expose: true
    #  port: 30650
  # DEPRECATED: activateEnterprise is no longer used.
  activateEnterprise: false
  ## if pachd.activateEnterpriseMember is set, enterprise will be activated and connected to an existing enterprise server.
  ## if pachd.enterpriseLicenseKey is set, enterprise will be activated.
  activateEnterpriseMember: false
  ## if pachd.activateAuth is set, auth will be bootstrapped by the config-job.
  activateAuth: true
  ## the license key used to activate enterprise features
  enterpriseLicenseKey: ""
  # enterpriseLicenseKeySecretName is used to pass the enterprise license key value via an existing k8s secret.
  # The value is pulled from the key, "enterprise-license-key".
  enterpriseLicenseKeySecretName: ""
  # if a token is not provided, a secret will be autogenerated on install and stored in the k8s secret 'pachyderm-bootstrap-config.rootToken'
  rootToken: ""
  # rootTokenSecretName is used to pass the rootToken value via an existing k8s secret
  # The value is pulled from the key, "root-token".
  rootTokenSecretName: ""
  # if a secret is not provided, a secret will be autogenerated on install and stored in the k8s secret 'pachyderm-bootstrap-config.enterpriseSecret'
  enterpriseSecret: ""
  # enterpriseSecretSecretName is used to pass the enterprise secret value via an existing k8s secret.
  # The value is pulled from the key, "enterprise-secret".
  enterpriseSecretSecretName: ""
  # if a secret is not provided, a secret will be autogenerated on install and stored in the k8s secret 'pachyderm-bootstrap-config.authConfig.clientSecret'
  oauthClientID: pachd
  oauthClientSecret: ""
  # oauthClientSecretSecretName is used to set the OAuth Client Secret via an existing k8s secret.
  # The value is pulled from the key, "pachd-oauth-client-secret".
  oauthClientSecretSecretName: ""
  oauthRedirectURI: ""
  # DEPRECATED: enterpriseRootToken is deprecated, in favor of enterpriseServerToken
  # NOTE only used if pachd.activateEnterpriseMember == true
  enterpriseRootToken: ""
  # DEPRECATED: enterpriseRootTokenSecretName is deprecated in favor of enterpriseServerTokenSecretName
  # enterpriseRootTokenSecretName is used to pass the enterpriseRootToken value via an existing k8s secret.
  # The value is pulled from the key, "enterprise-root-token".
  enterpriseRootTokenSecretName: ""
  # enterpriseServerToken represents a token that can authenticate to a separate pachyderm enterprise server,
  # and is used to complete the enterprise member registration process for this pachyderm cluster.
  # The user backing this token should have either the licenseAdmin & identityAdmin roles assigned, or
  # the clusterAdmin role.
  # NOTE: only used if pachd.activateEnterpriseMember == true
  enterpriseServerToken: ""
  # enterpriseServerTokenSecretName is used to pass the enterpriseServerToken value via an existing k8s secret.
  # The value is pulled from the key, "enterprise-server-token".
  enterpriseServerTokenSecretName: ""
  # only used if pachd.activateEnterpriseMember == true
  enterpriseServerAddress: ""
  enterpriseCallbackAddress: ""
  # Indicates to pachd whether dex is embedded in its process.
  localhostIssuer: "" # "true", "false", or "" (used string as bool doesn't support empty value)
  # set the initial pachyderm cluster role bindings, mapping a user to their list of roles
  # ex.
  # pachAuthClusterRoleBindings:
  #   robot:wallie:
  #   - repoReader
  #   robot:eve:
  #   - repoWriter
  pachAuthClusterRoleBindings: {}
  # additionalTrustedPeers is used to configure the identity service to recognize additional OIDC clients as trusted peers of pachd.
  # For example, see the following example or the dex docs (https://dexidp.io/docs/custom-scopes-claims-clients/#cross-client-trust-and-authorized-party).
  # additionalTrustedPeers:
  #   - example-app
  additionalTrustedPeers: []
  serviceAccount:
    create: true
    additionalAnnotations: {}
    name: "pachyderm" #TODO Set default in helpers / Wire up in templates
  storage:
    # backend configures the storage backend to use.  It must be one
    # of GOOGLE, AMAZON, MINIO, MICROSOFT or LOCAL. This is set automatically
    # if deployTarget is GOOGLE, AMAZON, MICROSOFT, or LOCAL
    backend: ""
    amazon:
      # bucket sets the S3 bucket to use.
      bucket: ""
      # cloudFrontDistribution sets the CloudFront distribution in the
      # storage secrets.  It is analogous to the
      # --cloudfront-distribution argument to pachctl deploy.
      cloudFrontDistribution: ""
      customEndpoint: ""
      # disableSSL disables SSL.  It is analogous to the --disable-ssl
      # argument to pachctl deploy.
      disableSSL: false
      # id sets the Amazon access key ID to use.  Together with secret
      # and token, it implements the functionality of the
      # --credentials argument to pachctl deploy.
      id: ""
      # logOptions sets various log options in Pachyderm’s internal S3
      # client.  Comma-separated list containing zero or more of:
      # 'Debug', 'Signing', 'HTTPBody', 'RequestRetries',
      # 'RequestErrors', 'EventStreamBody', or 'all'
      # (case-insensitive).  See 'AWS SDK for Go' docs for details.
      # logOptions is analogous to the --obj-log-options argument to
      # pachctl deploy.
      logOptions: ""
      # maxUploadParts sets the maximum number of upload parts.  It is
      # analogous to the --max-upload-parts argument to pachctl
      # deploy.
      maxUploadParts: 10000
      # verifySSL performs SSL certificate verification.  It is the
      # inverse of the --no-verify-ssl argument to pachctl deploy.
      verifySSL: true
      # partSize sets the part size for object storage uploads.  It is
      # analogous to the --part-size argument to pachctl deploy.  It
      # has to be a string due to Helm and YAML parsing integers as
      # floats.  Cf. https://github.com/helm/helm/issues/1707
      partSize: "5242880"
      # region sets the AWS region to use.
      region: ""
      # retries sets the number of retries for object storage
      # requests.  It is analogous to the --retries argument to
      # pachctl deploy.
      retries: 10
      # reverse reverses object storage paths.  It is analogous to the
      # --reverse argument to pachctl deploy.
      reverse: true
      # secret sets the Amazon secret access key to use.  Together with id
      # and token, it implements the functionality of the
      # --credentials argument to pachctl deploy.
      secret: ""
      # timeout sets the timeout for object storage requests.  It is
      # analogous to the --timeout argument to pachctl deploy.
      timeout: "5m"
      # token optionally sets the Amazon token to use.  Together with
      # id and secret, it implements the functionality of the
      # --credentials argument to pachctl deploy.
      token: ""
      # uploadACL sets the upload ACL for object storage uploads.  It
      # is analogous to the --upload-acl argument to pachctl deploy.
      uploadACL: "bucket-owner-full-control"
    google:
      bucket: ""
      # cred is a string containing a GCP service account private key,
      # in object (JSON or YAML) form.  A simple way to pass this on
      # the command line is with the set-file flag, e.g.:
      #
      #  helm install pachd -f my-values.yaml --set-file storage.google.cred=creds.json pachyderm/pachyderm
      cred: ""
      # Example:
      # cred: |
      #  {
      #    "type": "service_account",
      #    "project_id": "…",
      #    "private_key_id": "…",
      #    "private_key": "-----BEGIN PRIVATE KEY-----\n…\n-----END PRIVATE KEY-----\n",
      #    "client_email": "…@….iam.gserviceaccount.com",
      #    "client_id": "…",
      #    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
      #    "token_uri": "https://oauth2.googleapis.com/token",
      #    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
      #    "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/…%40….iam.gserviceaccount.com"
      #  }
    local:
      # hostPath indicates the path on the host where the PFS metadata
      # will be stored.  It must end in /.  It is analogous to the
      # --host-path argument to pachctl deploy.
      hostPath: ""
      requireRoot: true #Root required for hostpath, but we run rootless in CI
    microsoft:
      container: ""
      id: ""
      secret: ""
    minio:
      # minio bucket name
      bucket: ""
      # the minio endpoint. Should only be the hostname:port, no http/https.
      endpoint: ""
      # the username/id with readwrite access to the bucket.
      id: ""
      # the secret/password of the user with readwrite access to the bucket.
      secret: ""
      # enable https for minio with "true" defaults to "false"
      secure: ""
      # Enable S3v2 support by setting signature to "1". This feature is being deprecated
      signature: ""
    # putFileConcurrencyLimit sets the maximum number of files to
    # upload or fetch from remote sources (HTTP, blob storage) using
    # PutFile concurrently.  It is analogous to the
    # --put-file-concurrency-limit argument to pachctl deploy.
    putFileConcurrencyLimit: 100
    # uploadConcurrencyLimit sets the maximum number of concurrent
    # object storage uploads per Pachd instance.  It is analogous to
    # the --upload-concurrency-limit argument to pachctl deploy.
    uploadConcurrencyLimit: 100
    # The shard size corresponds to the total size of the files in a shard.
    # The shard count corresponds to the total number of files in a shard.
    # If either criteria is met, a shard will be created.
    # values are strings
    compactionShardSizeThreshold: "0"
    compactionShardCountThreshold: "0"
    memoryThreshold: 0
    levelFactor: 0
    maxFanIn: 10
    maxOpenFileSets: 50
    # diskCacheSize and memoryCacheSize are defined in units of 8 Mb chunks. The default is 100 chunks which is 800 Mb.
    diskCacheSize: 100
    memoryCacheSize: 100
  ppsWorkerGRPCPort: 1080
  # the number of seconds between pfs's garbage collection cycles.
  # if this value is set to 0, it will default to pachyderm's internal configuration.
  # if this value is less than 0, it will turn off garbage collection.
  storageGCPeriod: 0
  # the number of seconds between chunk garbage collection cycles.
  # if this value is set to 0, it will default to pachyderm's internal configuration.
  # if this value is less than 0, it will turn off chunk garbage collection.
  storageChunkGCPeriod: 0
  # There are three options for TLS:
  # 1. Disabled
  # 2. Enabled, existingSecret, specify secret name
  # 3. Enabled, newSecret, must specify cert, key and name
  tls:
    enabled: false
    secretName: ""
    newSecret:
      create: false
      crt: ""
      key: ""
  tolerations: []
  worker:
    image:
      repository: "pachyderm/worker"
      pullPolicy: "IfNotPresent"
      # Worker tag is set under pachd.image.tag (they should be kept in lock step)
    serviceAccount:
      create: true
      additionalAnnotations: {}
      # name sets the name of the worker service account.  Analogous to
      # the --worker-service-account argument to pachctl deploy.
      name: "pachyderm-worker" #TODO Set default in helpers / Wire up in templates
  rbac:
    # create indicates whether RBAC resources should be created.
    # Setting it to false is analogous to passing --no-rbac to pachctl
    # deploy.
    create: true
  # Set up default resources for pipelines that don't include any requests or limits.  The values
  # are k8s resource quantities, so "1Gi", "2", etc.  Set to "0" to disable setting any defaults.
  defaultPipelineCPURequest: ""
  defaultPipelineMemoryRequest: ""
  defaultPipelineStorageRequest: ""
kubeEventTail:
  # Deploys a lightweight app that watches kubernetes events and echos them to logs.
  enabled: true
  # clusterScope determines whether kube-event-tail should watch all events or just events in its namespace.
  clusterScope: false
  image:
    repository: pachyderm/kube-event-tail
    pullPolicy: "IfNotPresent"
    tag: "v0.0.7"
  resources:
    limits:
      cpu: "1"
      memory: 100Mi
    requests:
      cpu: 100m
      memory: 45Mi

pgbouncer:
  service:
    type: ClusterIP
  annotations: {}
  priorityClassName: ""
  nodeSelector: {}
  tolerations: []
  image:
    repository: pachyderm/pgbouncer
    tag: 1.16.2
  resources:
    {}
    #limits:
    #  cpu: "1"
    #  memory: "2G"
    #requests:
    #  cpu: "1"
    #  memory: "2G"
  # maxConnections specifies the maximum number of concurrent connections into pgbouncer.
  maxConnections: 1000
  # defaultPoolSize specifies the maximum number of concurrent connections from pgbouncer to the postgresql database.
  defaultPoolSize: 20

# Note: Postgres values control the Bitnami Postgresql Subchart
postgresql:
  # enabled controls whether to install postgres or not.
  # If not using the built in Postgres, you must specify a Postgresql
  # database server to connect to in global.postgresql
  # The enabled value is watched by the 'condition' set on the Postgresql
  # dependency in Chart.yaml
  enabled: true
  image:
    repository: pachyderm/postgresql
    tag: "13.3.0"
  # DEPRECATED from pachyderm 2.1.5
  initdbScripts:
    dex.sh: |
      #!/bin/bash
      set -e
      psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<-EOSQL
        CREATE DATABASE dex;
        GRANT ALL PRIVILEGES ON DATABASE dex TO "$POSTGRES_USER";
      EOSQL
  fullnameOverride: postgres
  persistence:
    # Specify the storage class for the postgresql Persistent Volume (PV)
    # See notes in Bitnami chart values.yaml file for more information.
    # More info for setting up storage classes on various cloud providers:
    # AWS: https://docs.aws.amazon.com/eks/latest/userguide/storage-classes.html
    # GCP: https://cloud.google.com/compute/docs/disks/performance#disk_types
    # Azure: https://docs.microsoft.com/en-us/azure/aks/concepts-storage#storage-classes
    storageClass: ""
    # storageSize specifies the size of the volume to use for postgresql
    # Recommended Minimum Disk size for Microsoft/Azure: 256Gi  - 1,100 IOPS https://azure.microsoft.com/en-us/pricing/details/managed-disks/
    # Recommended Minimum Disk size for Google/GCP: 50Gi        - 1,500 IOPS https://cloud.google.com/compute/docs/disks/performance
    # Recommended Minimum Disk size for Amazon/AWS: 500Gi (GP2) - 1,500 IOPS https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volume-types.html
    size: 10Gi
    labels:
      suite: pachyderm
  primary:
    priorityClassName: ""
    nodeSelector: {}
    tolerations: []
  readReplicas:
    priorityClassName: ""
    nodeSelector: {}
    tolerations: []

cloudsqlAuthProxy:
  # connectionName may be found by running `gcloud sql instances describe INSTANCE_NAME --project PROJECT_ID`
  connectionName: ""
  serviceAccount: ""
  iamLogin: false
  port: 5432
  enabled: false
  image:
    # repository is the image repo to pull from; together with tag it
    # replicates the --dash-image & --registry arguments to pachctl
    # deploy.
    repository: "gcr.io/cloudsql-docker/gce-proxy"
    pullPolicy: "IfNotPresent"
    # tag is the image repo to pull from; together with repository it
    # replicates the --dash-image argument to pachctl deploy.
    tag: "1.23.0"
  priorityClassName: ""
  nodeSelector: {}
  tolerations: []
  # podLabels specifies labels to add to the dash pod.
  podLabels: {}
  # resources specifies the resource request and limits.
  resources: {}
  #  requests:
  #    # The proxy's memory use scales linearly with the number of active
  #    # connections. Fewer open connections will use less memory. Adjust
  #    # this value based on your application's requirements.
  #    memory: ""
  #    # The proxy's CPU use scales linearly with the amount of IO between
  #    # the database and the application. Adjust this value based on your
  #    # application's requirements.
  #    cpu: ""
  service:
    # labels specifies labels to add to the cloudsql auth proxy service.
    labels: {}
    # type specifies the Kubernetes type of the cloudsql auth proxy service.
    type: ClusterIP

oidc:
  issuerURI: "" #Inferred if running locally or using ingress
  requireVerifiedEmail: false
  # IDTokenExpiry is parsed into golang's time.Duration: https://pkg.go.dev/time#example-ParseDuration
  IDTokenExpiry: 24h
  # (Optional) If set, enables OIDC rotation tokens, and specifies the duration where they are valid.
  # RotationTokenExpiry is parsed into golang's time.Duration: https://pkg.go.dev/time#example-ParseDuration
  RotationTokenExpiry: 48h
  # (Optional) Only set in cases where the issuerURI is not user accessible (ie. localhost install)
  userAccessibleOauthIssuerHost: ""
  ## to set up upstream IDPs, set oidc.mockIDP to false,
  ## and populate oidc.upstreamIDPs with an array of Dex Connector configurations.
  ## See the example below or https://dexidp.io/docs/connectors/
  # upstreamIDPs:
  #   - id: idpConnector
  #     config:
  #       issuer: ""
  #       clientID: ""
  #       clientSecret: ""
  #       redirectURI: "http://localhost:30658/callback"
  #       insecureEnableGroups: true
  #       insecureSkipEmailVerified: true
  #       insecureSkipIssuerCallbackDomainCheck: true
  #       forwardedLoginParams:
  #       - login_hint
  #     name: idpConnector
  #     type: oidc
  #
  #   - id: okta
  #     config:
  #       issuer: "https://dev-84362674.okta.com"
  #       clientID: "client_id"
  #       clientSecret: "notsecret"
  #       redirectURI: "http://localhost:30658/callback"
  #       insecureEnableGroups: true
  #       insecureSkipEmailVerified: true
  #       insecureSkipIssuerCallbackDomainCheck: true
  #       forwardedLoginParams:
  #       - login_hint
  #     name: okta
  #     type: oidc
  upstreamIDPs: []
  # upstreamIDPsSecretName is used to pass the upstreamIDPs value via an existing k8s secret.
  # The value is pulled from the secret key, "upstream-idps".
  upstreamIDPsSecretName: ""
  # Some dex configurations (like Google) require a credential file. Whatever secret is included in this
  # below secret will be mounted to the pachd pod at /dexcreds/ so for example serviceAccountFilePath: /dexcreds/googleAuth.json
  dexCredentialSecretName: ""
  mockIDP: true
  # additionalClients specifies a list of clients for the cluster to recognize
  # See the example below or the dex docs (https://dexidp.io/docs/using-dex/#configuring-your-app).
  # additionalClients:
  #   - id: example-app
  #     secret: example-app-secret
  #     name: 'Example App'
  #     redirectURIs:
  #     - 'http://127.0.0.1:5555/callback'
  additionalClients: []
  additionalClientsSecretName: ""
  #TODO scopes:

testConnection:
  image:
    repository: alpine
    tag: latest

# The proxy is a service to handle all Pachyderm traffic (S3, Console, OIDC, Dex, GRPC) on a single
# port; good for exposing directly to the Internet.
proxy:
  # If enabled, create a proxy deployment (based on the Envoy proxy) and a service to expose it.  If
  # ingress is also enabled, any Ingress traffic will be routed through the proxy before being sent
  # to pachd or Console.
  enabled: true
  # The external hostname (including port if nonstandard) that the proxy will be reachable at.
  # If you have ingress enabled and an ingress hostname defined, the proxy will use that.
  # Ingress will be deprecated in the future so configuring the proxy host instead is recommended.
  host: ""
  # The number of proxy replicas to run.  1 should be fine, but if you want more for higher
  # availability, that's perfectly reasonable.  Each replica can handle 50,000 concurrent
  # connections.  There is an affinity rule to prefer scheduling the proxy pods on the same node as
  # pachd, so a number here that matches the number of pachd replicas is a fine configuration.
  # (Note that we don't guarantee to keep the proxy<->pachd traffic on-node or even in-region.)
  replicas: 1
  # The envoy image to pull.
  image:
    repository: "envoyproxy/envoy-distroless"
    tag: "v1.24.1"
    pullPolicy: "IfNotPresent"
  # Set up resources.  The proxy is configured to shed traffic before using 500MB of RAM, so that's
  # a reasonable memory limit.  It doesn't need much CPU.
  resources:
    requests:
      cpu: 100m
      memory: 512Mi
    limits:
      memory: 512Mi
  # Any additional labels to add to the pods.  These are also added to the deployment and service
  # selectors.
  labels: {}
  # Any additional annotations to add to the pods.
  annotations: {}
  # A nodeSelector statement for each pod in the proxy Deployment, if desired.
  nodeSelector: {}
  # A tolerations statement for each pod in the proxy Deployment, if desired.
  tolerations: []
  # A priority class name for each pod in the proxy Deployment, if desired.
  priorityClassName: ""
  # Configure the service that routes traffic to the proxy.
  service:
    # The type of service can be ClusterIP, NodePort, or LoadBalancer.
    type: ClusterIP
    # If the service is a LoadBalancer, you can specify the IP address to use.
    loadBalancerIP: ""
    # The port to serve plain HTTP traffic on.
    httpPort: 80
    # The port to serve HTTPS traffic on, if enabled below.
    httpsPort: 443
    # If the service is a NodePort, you can specify the port to receive HTTP traffic on.
    httpNodePort: 30080
    httpsNodePort: 30443
    # Any additional annotations to add.
    annotations: {}
    # Any additional labels to add to the service itself (not the selector!).
    labels: {}
    # The proxy can also serve each backend service on a numbered port, and will do so for any port
    # not numbered 0 here.  If this service is of type NodePort, the port numbers here will be used
    # for the node port, and will need to be in the node port range.
    legacyPorts:
      console: 0 # legacy 30080, conflicts with default httpNodePort
      grpc: 0 # legacy 30650
      s3Gateway: 0 # legacy 30600
      oidc: 0 # legacy 30657
      identity: 0 # legacy 30658
      metrics: 0 # legacy 30656
    # externalTrafficPolicy determines cluster-wide routing policy; see "kubectl explain
    # service.spec.externalTrafficPolicy".
    externalTrafficPolicy: ""
  # Configuration for TLS (SSL, HTTPS).
  tls:
    # If true, enable TLS serving.  Enabling TLS is incompatible with support for legacy ports (you
    # can't get a generally-trusted certificate for port numbers), and disables support for
    # cleartext communication (cleartext requests will redirect to the secure server, and HSTS
    # headers are set to prevent downgrade attacks).
    #
    # Note that if you are planning on putting the proxy behind an ingress controller, you probably
    # want to configure TLS for the ingress controller, not the proxy.  This is intended for the
    # case where the proxy is exposed directly to the Internet.  (It is possible to have your
    # ingress controller talk to the proxy over TLS, in which case, it's fine to enable TLS here in
    # addition to in the ingress section above.)
    enabled: false
    # The secret containing "tls.key" and "tls.crt" keys that contain PEM-encoded private key and
    # certificate material.  Generate one with "kubectl create secret tls <name> --key=tls.key
    # --cert=tls.cert".  This format is compatible with the secrets produced by cert-manager, and
    # the proxy will pick up new data when cert-manager rotates the certificate.
    secretName: ""
    # If set, generate the secret from values here.  This is intended only for unit tests.
    secret: {}
```
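You do not need to copy this whole file to customize a deployment. Helm merges any file passed with `-f` over the chart defaults, so a custom values file only has to list the keys that change. As a minimal sketch (the file name `proxy-values.yaml` and the choice of `NodePort` are illustrative assumptions, not something these notes prescribe), the following overrides only the proxy service type:

```shell
# Write a small override file; every key comes from the proxy section of the defaults above.
cat > proxy-values.yaml <<'EOF'
proxy:
  enabled: true
  service:
    # Assumption: NodePort suits a self-hosted cluster that has no load balancer.
    type: NodePort
    # Same value as the chart default for httpNodePort.
    httpNodePort: 30080
EOF
```

The file could then be passed at install time, for example `helm install pachd -f proxy-values.yaml pachyderm/pachyderm` (the release and chart names are the ones used in the chart's own comments).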


## Deploy Persistent Volumes on Kubernetes

Pachyderm relies on several supporting services, such as etcd, PostgreSQL, and Loki. These services need access to persistent storage.

We will provide storage to services hosted on Kubernetes through a Kubernetes resource called a Persistent Volume.

### Understanding Volumes
> Volumes
>
> On-disk files in a container are ephemeral, which presents some problems for non-trivial applications when running in containers. One problem occurs when a container crashes or is stopped. Container state is not saved so all of the files that were created or modified during the lifetime of the container are lost. During a crash, kubelet restarts the container with a clean state. Another problem occurs when multiple containers are running in a Pod and need to share files. It can be challenging to setup and access a shared filesystem across all of the containers. The Kubernetes volume abstraction solves both of these problems.
>
> ...
>
> Kubernetes supports many types of volumes. A Pod can use any number of volume types simultaneously. Ephemeral volume types have a lifetime of a pod, but persistent volumes exist beyond the lifetime of a pod. When a pod ceases to exist, Kubernetes destroys ephemeral volumes; however, Kubernetes does not destroy persistent volumes. For any kind of volume in a given pod, data is preserved across container restarts.
>
> At its core, a volume is a directory, possibly with some data in it, which is accessible to the containers in a pod. How that directory comes to be, the medium that backs it, and the contents of it are determined by the particular volume type used.
>
> https://kubernetes.io/docs/concepts/storage/volumes/
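
To make the quoted description concrete, here is a minimal sketch of a Pod whose two containers share an ephemeral `emptyDir` volume. The pod name `scratch-demo`, the container names, and the file name are hypothetical; the data disappears when the pod is deleted, which is exactly the limitation that Persistent Volumes address.

```shell
# Write a manifest for a pod with two containers sharing one ephemeral volume.
cat > scratch-demo.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo
spec:
  restartPolicy: Never
  containers:
    # The writer creates a file in the shared directory.
    - name: writer
      image: busybox:stable
      command: ["/bin/sh", "-c", "echo hello > /scratch/msg && sleep 30"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
    # The reader sees the same directory and prints the file.
    - name: reader
      image: busybox:stable
      command: ["/bin/sh", "-c", "sleep 5 && cat /scratch/msg"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    # emptyDir lives only as long as the pod does.
    - name: scratch
      emptyDir: {}
EOF
```

Applying it with `kubectl apply -f scratch-demo.yaml` and reading `kubectl logs scratch-demo -c reader` is one way to see the shared file.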


### Understanding Persistent Volumes
Persistent Volumes are a non-ephemeral volume implementation. They are a Kubernetes resource that provides storage.

> A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual Pod that uses the PV. This API object captures the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.
>
> https://kubernetes.io/docs/concepts/storage/persistent-volumes/


A Persistent Volume Claim is how a workload requests storage from the available Persistent Volumes.

> A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or ReadWriteMany, see AccessModes).
>
> https://kubernetes.io/docs/concepts/storage/persistent-volumes/
>
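
As a sketch of how the two objects relate, the manifest below statically defines one NFS-backed Persistent Volume and a claim that binds to it. The object names `nfs-static-pv` and `nfs-static-claim` and the 1Gi size are hypothetical; the server address and export path are the ones used for the NFS server later in these notes. This is only an illustration, since the deployment below relies on dynamic provisioning instead.

```shell
# Write a statically provisioned PV and a PVC that binds to it by name.
cat > nfs-static.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-static-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  # Retain keeps the data on the export after the claim is deleted.
  persistentVolumeReclaimPolicy: Retain
  # An empty class name opts out of dynamic provisioning.
  storageClassName: ""
  nfs:
    server: 15.4.22.101
    path: /nfs_exports
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-static-claim
spec:
  storageClassName: ""
  # Bind to the specific PV defined above.
  volumeName: nfs-static-pv
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
EOF
```

After `kubectl apply -f nfs-static.yaml`, `kubectl get pv,pvc` should eventually show both objects as `Bound`.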

### Understanding Storage Classes

A Persistent Volume Claim allows a user to request access to specific storage resources. In some cases, though, we want to classify our storage logically and then pick from a particular class, for example a 7200 rpm disk rather than a 5400 rpm disk. Storage Classes provide this abstraction:

> A StorageClass provides a way for administrators to describe the "classes" of storage they offer. Different classes might map to quality-of-service levels, or to backup policies, or to arbitrary policies determined by the cluster administrators. Kubernetes itself is unopinionated about what classes represent. This concept is sometimes called "profiles" in other storage systems
> 
> https://kubernetes.io/docs/concepts/storage/storage-classes/
>
> While PersistentVolumeClaims allow a user to consume abstract storage resources, it is common that users need PersistentVolumes with varying properties, such as performance, for different problems. Cluster administrators need to be able to offer a variety of PersistentVolumes that differ in more ways than size and access modes, without exposing users to the details of how those volumes are implemented. For these needs, there is the StorageClass resource.
> 
> https://kubernetes.io/docs/concepts/storage/persistent-volumes/
>
> Each StorageClass has a provisioner that determines what volume plugin is used for provisioning PVs. This field must be specified.
>
> https://kubernetes.io/docs/concepts/storage/storage-classes/#provisioner

Kubernetes supports the following volume plugins and provisioners:

|Volume Plugin | Internal Provisioner | Config Example |
|--------------|----------------------|----------------|
|AWSElasticBlockStore | ✓ | AWS EBS |
|AzureFile | ✓ | Azure File |
|AzureDisk | ✓ | Azure Disk |
|CephFS | - | - |
|Cinder | ✓ | OpenStack Cinder |
|FC | - | - |
|FlexVolume | - | - |
|GCEPersistentDisk | ✓ | GCE PD |
|iSCSI | - | - |
|NFS | - | NFS |
|RBD | ✓ | Ceph RBD |
|VsphereVolume | ✓ | vSphere |
|PortworxVolume | ✓ | Portworx Volume |
|Local | - | Local |

### The NFS Storage Class

For my deployment I will keep things simple and use an NFS-backed storage class. I have used Ceph in the past, but NFS is enough for this setup.

**Note**: Kubernetes doesn't include an internal NFS provisioner. You need to use an external provisioner to create a StorageClass for NFS. Here are some examples:

- NFS Ganesha server and external provisioner
- NFS subdir external provisioner

Configuration example and documentation can be found [here](https://kubernetes.io/docs/concepts/storage/storage-classes/#nfs).
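
For orientation, a StorageClass backed by the NFS subdir external provisioner might look like the sketch below. The class name `nfs-example` is hypothetical, and the provisioner string and policies mirror what the Helm chart reports further down; there is no need to apply this by hand, because the chart creates an equivalent class itself.

```shell
# Write an illustrative StorageClass manifest for an external NFS provisioner.
cat > nfs-storageclass-example.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-example
# Must match the name the external provisioner registers under.
provisioner: cluster.local/nfs-subdir-external-provisioner
parameters:
  # Provisioner-specific parameter: delete data instead of archiving it.
  archiveOnDelete: "false"
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
EOF
```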

#### Installing the NFS server

The steps below follow the guide found [here](https://dev.to/prajwalmithun/setup-nfs-server-client-in-linux-and-unix-27id).

```
[root@localhost ~]# yum -y install nfs-utils
[root@localhost ~]# vi /etc/exports
[root@localhost ~]# cat /etc/exports
/nfs_exports   *(rw,root_squash,sync,no_subtree_check)
[root@localhost ~]# mkdir /nfs_exports
[root@localhost ~]# chmod 777 /nfs_exports
[root@localhost ~]# systemctl enable rpcbind
[root@localhost ~]# systemctl enable nfs-server
[root@localhost ~]# systemctl enable nfs-lock
[root@localhost ~]# systemctl enable nfs-idmap
[root@localhost ~]# systemctl start rpcbind
[root@localhost ~]# systemctl start nfs-server
[root@localhost ~]# systemctl start nfs-lock
[root@localhost ~]# systemctl start nfs-idmap
[root@localhost ~]# systemctl status nfs
● nfs-server.service - NFS server and services
   Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; disabled; vendor preset: disabled)
   Active: active (exited) since Mon 2023-05-01 18:12:08 EDT; 2s ago
  Process: 27255 ExecStartPost=/bin/sh -c if systemctl -q is-active gssproxy; then systemctl reload gssproxy ; fi (code=exited, status=0/SUCCESS)
  Process: 27253 ExecStart=/usr/sbin/rpc.nfsd $RPCNFSDARGS (code=exited, status=0/SUCCESS)
  Process: 27250 ExecStartPre=/usr/sbin/exportfs -r (code=exited, status=0/SUCCESS)
 Main PID: 27253 (code=exited, status=0/SUCCESS)
    Tasks: 0
   Memory: 0B
   CGroup: /system.slice/nfs-server.service
   
[root@localhost ~]# exportfs
/nfs_exports    <world>
```

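**Note**: Make sure the firewall on the NFS server allows NFS traffic from every Kubernetes node. NFSv4 needs TCP port 2049; NFSv3 additionally needs rpcbind (port 111) and mountd.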
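
If the NFS server runs firewalld (an assumption; these notes do not record which firewall is in use), the rules could be opened roughly as follows. The commands are wrapped in a function with the hypothetical name `open_nfs_firewall` so that nothing changes until you call it as root.

```shell
# Define a helper that opens the NFS-related firewalld services permanently.
open_nfs_firewall() {
  firewall-cmd --permanent --add-service=nfs       # NFS itself (port 2049)
  firewall-cmd --permanent --add-service=rpc-bind  # rpcbind, needed for NFSv3
  firewall-cmd --permanent --add-service=mountd    # mount daemon, needed for NFSv3
  firewall-cmd --reload                            # apply the permanent rules
}
```

Run `open_nfs_firewall` on the NFS server, then check the result with `firewall-cmd --list-services`.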

#### Install NFS Client on K8s Nodes
This needs to be installed on the master and on every worker node.

```
[root@os004k8-master001 ~]# yum -y install nfs-utils
```

#### Test NFS Connection
Test that we can mount the NFS export and have permission to create files and directories. The mount point (`/mnt/test` here) must already exist on the node.

```
[root@os004k8-master001 ~]# mount -t nfs 15.4.22.101:/nfs_exports /mnt/test
[root@os004k8-master001 ~]# mkdir /mnt/test/test
```

#### Installing NFS Subdir External Provisioner Using Helm

Official instructions can be found [here](https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner).
```
[root@os004k8-master001 ~]# helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/.kube/config
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /root/.kube/config
"nfs-subdir-external-provisioner" has been added to your repositories

[root@os004k8-master001 ~]# helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner  --set nfs.server=15.4.22.101 --set nfs.path=/nfs_exports
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/.kube/config
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /root/.kube/config
NAME: nfs-subdir-external-provisioner
LAST DEPLOYED: Mon May  1 18:24:08 2023
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None

[root@os004k8-master001 ~]# kubectl get deployment nfs-subdir-external-provisioner
NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
nfs-subdir-external-provisioner   1/1     1            1           102s

```
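The same settings can be kept in a values file instead of `--set` flags, which is easier to review and reuse. The sketch below assumes the chart's `storageClass.name` and `storageClass.defaultClass` values (check them against the README of your chart version); the file name is hypothetical. Marking the class as the cluster default is a design choice: a chart that leaves its storage class empty, as the Pachyderm defaults above do for PostgreSQL, typically falls back to the default class.

```shell
# Write a values file for the NFS subdir external provisioner chart.
cat > nfs-provisioner-values.yaml <<'EOF'
nfs:
  server: 15.4.22.101
  path: /nfs_exports
storageClass:
  name: nfs-client
  # Assumption: make this class the cluster default so claims without
  # an explicit storageClassName are provisioned on NFS.
  defaultClass: true
EOF
```

It would be used as `helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner -f nfs-provisioner-values.yaml`.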

#### Define Storage Class

When we deployed the Helm chart in the previous step, a storage class named `nfs-client` was created for us:

```
[root@os004k8-master001 pachyderm]# kubectl get storageclass
NAME         PROVISIONER                                     RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
nfs-client   cluster.local/nfs-subdir-external-provisioner   Delete          Immediate           true                   6m31s

```

#### Test the installation

```
[root@os004k8-master001 pachyderm]# cat test-claim.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-claim
spec:
  storageClassName: nfs-client
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
      
[root@os004k8-master001 pachyderm]# kubectl apply -f test-claim.yaml
persistentvolumeclaim/test-claim created
[root@os004k8-master001 pachyderm]# kubectl get persistentvolumeclaim/test-claim
NAME         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
test-claim   Bound    pvc-c83b7c08-3ff0-4a3d-948e-16436326f31f   1Mi        RWX            nfs-client     10s
```


```
[root@os004k8-master001 pachyderm]# cat test-pod.yaml
kind: Pod
apiVersion: v1
metadata:
  name: test-pod
spec:
  containers:
  - name: test-pod
    image: busybox:stable
    command:
      - "/bin/sh"
    args:
      - "-c"
      - "touch /mnt/SUCCESS && exit 0 || exit 1"
    volumeMounts:
      - name: nfs-pvc
        mountPath: "/mnt"
  restartPolicy: "Never"
  volumes:
    - name: nfs-pvc
      persistentVolumeClaim:
        claimName: test-claim

[root@os004k8-master001 pachyderm]# kubectl apply -f test-pod.yaml
pod/test-pod created

[root@os004k8-master001 pachyderm]# kubectl get pod/test-pod
NAME       READY   STATUS      RESTARTS   AGE
test-pod   0/1     Completed   0          7s

```

### Deploy Minio Object Store

> An object store is used by Pachyderm’s pachd for storing all your data. The object store you use must be accessible via a low-latency, high-bandwidth connection.
>
> Storage providers like MinIO (the most common and officially supported option), EMC’s ECS, Ceph, or SwiftStack provide S3-compatible access to enterprise storage for on-premises deployment.
> 
> https://docs.pachyderm.com/latest/deploy-manage/deploy/on-premises/#on-premises-sizing-and-configuring-the-object-store

## Create Values File For Helm

We need to create a values file to tell helm which storage class to use.

```
[root@os004k8-master001 pachyderm]# vi values.yaml
[root@os004k8-master001 pachyderm]# cat values.yaml
etcd:
  storageClass: nfs-client
  size: 10Gi

postgresql:
  persistence:
    storageClass: nfs-client
    size: 10Gi

```

## Install Pachyderm using helm

Pachyderm uses a client-server model. The pachd daemon is packaged as a kubernetes pod, and the pachctl CLI connects to the server (the daemon running in the pod) to execute commands.

We can ask helm to list all the charts it can find. In my case I only have one repo (the pachyderm repo), so we can list every chart version available in that repo:

```
[root@os004k8-master001 ~]# helm repo list
NAME            URL
pachyderm       https://helm.pachyderm.com

[root@os004k8-master001 ~]# helm search repo -l -r pachyderm 2>/dev/null
NAME                    CHART VERSION   APP VERSION     DESCRIPTION
pachyderm/pachyderm     2.5.5           2.5.5           Explainable, repeatable, scalable data science
pachyderm/pachyderm     2.5.4           2.5.4           Explainable, repeatable, scalable data science
pachyderm/pachyderm     2.5.3           2.5.3           Explainable, repeatable, scalable data science
pachyderm/pachyderm     2.5.2           2.5.2           Explainable, repeatable, scalable data science
pachyderm/pachyderm     2.5.1           2.5.1           Explainable, repeatable, scalable data science
pachyderm/pachyderm     2.5.0           2.5.0           Explainable, repeatable, scalable data science
pachyderm/pachyderm     2.4.6           2.4.6           Explainable, repeatable, scalable data science


```

We can then ask helm to install our preferred version:

```
[root@os004k8-master001 ~]# helm install pachd pachyderm/pachyderm --set deployTarget=LOCAL --set proxy.enabled=true --set proxy.service.type=LoadBalancer --version 2.5.5
W0501 12:39:42.848489   21435 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0501 12:39:43.115008   21435 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
NAME: pachd
LAST DEPLOYED: Mon May  1 12:39:40 2023
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
```

**Note**: In the command above, we pass several parameters to `helm install`. The `--version` parameter tells helm which specific chart version to install. The `--set` parameter overrides a value defined in the helm chart. Values, as we will discuss, are arbitrary settings that override defaults or provide values for the templates in a helm chart. Because a chart's values are written in YAML and can be nested, the key passed to `--set` is a dotted path to the field being set. For example, `proxy.service.type` sets the `type` attribute of the `service` object on the `proxy` object. For more information on the arguments `helm install` accepts, see the [official documentation](https://helm.sh/docs/helm/helm_install/).

**Note**: While the `--set` parameter can be used to override individual chart settings, `helm install` also accepts a values file (e.g. `values.yaml`) through the `-f`/`--values` flag to override many settings at once.
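
To make the dotted-path idea concrete, here is a minimal Python sketch of how `--set` style keys could be expanded into the nested structure a values file would hold. The function name `expand_set_flags` is hypothetical, and the sketch is deliberately simplified: real helm also handles lists, escaped dots and type coercion, none of which are modelled here.

```python
def expand_set_flags(pairs):
    """Expand 'a.b.c=value' strings into a nested dict (simplified model of --set)."""
    values = {}
    for pair in pairs:
        path, _, raw = pair.partition("=")
        keys = path.split(".")
        node = values
        # Walk (and create) one nested dict per path segment except the last.
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        # Values are kept as plain strings; helm itself would coerce types.
        node[keys[-1]] = raw
    return values


# The same overrides that were passed to `helm install` above.
flags = [
    "deployTarget=LOCAL",
    "proxy.enabled=true",
    "proxy.service.type=LoadBalancer",
]
overrides = expand_set_flags(flags)
print(overrides)
```

The resulting dictionary has the same shape as the equivalent values file would have: a top-level `deployTarget` key and a `proxy` object containing `enabled` and a nested `service.type`.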

#### Review Install Parameters

The `deployTarget` parameter.

- [Example values.yaml with MinIO (S3)](https://git.app.uib.no/caleno/helm-charts/-/blob/57dc2b1fe0d1f0a2d3400610e411eda5c3e1417c/stable/pachyderm/values.yaml)
- [MinIO deployment instructions](https://academic.oup.com/bioinformatics/article/35/5/839/5068160)

We can then ask kubernetes about the status of the pods it has spun up to host the pachyderm solution.

```
[root@os004k8-master001 ~]# kubectl get pods -A
NAMESPACE     NAME                                                   READY   STATUS              RESTARTS   AGE
default       console-587c67787f-cm8sg                               0/1     ContainerCreating   0          38s
default       etcd-0                                                 0/1     Pending             0          38s
default       pachd-bd45db8cd-4bh5l                                  0/1     ContainerCreating   0          35s
default       pachd-loki-0                                           0/1     Pending             0          38s
default       pachd-promtail-64qpn                                   0/1     ContainerCreating   0          39s
default       pachd-promtail-jtrkt                                   0/1     ContainerCreating   0          39s
default       pachd-promtail-kjlcq                                   0/1     ContainerCreating   0          38s
default       pachd-promtail-nljnp                                   0/1     ContainerCreating   0          38s
default       pachd-promtail-xp82l                                   0/1     ContainerCreating   0          39s
default       pachd-promtail-z6n65                                   0/1     ContainerCreating   0          38s
default       pachyderm-kube-event-tail-6c6598cd5-pcthr              0/1     ContainerCreating   0          38s
default       pachyderm-proxy-7f4545985c-zf2bv                       0/1     ContainerCreating   0          38s
default       pg-bouncer-88dbc966b-l4xzs                             0/1     ContainerCreating   0          38s
default       postgres-0                                             0/1     Pending             0          38s
kube-system   coredns-558bd4d5db-478nt                               1/1     Running             1          17h
kube-system   coredns-558bd4d5db-x42cd                               1/1     Running             1          17h
kube-system   etcd-os004k8-master001.foobar.com                      1/1     Running             125        17h
kube-system   kube-apiserver-os004k8-master001.foobar.com            1/1     Running             1          17h
kube-system   kube-controller-manager-os004k8-master001.foobar.com   1/1     Running             1          17h
kube-system   kube-proxy-2l94p                                       1/1     Running             1          16h
kube-system   kube-proxy-2m6hr                                       1/1     Running             1          17h
kube-system   kube-proxy-f4vbn                                       1/1     Running             1          16h
kube-system   kube-proxy-kzz98                                       1/1     Running             1          17h
kube-system   kube-proxy-plhkk                                       1/1     Running             1          17h
kube-system   kube-proxy-sm9wf                                       1/1     Running             1          17h
kube-system   kube-proxy-wcl4n                                       1/1     Running             1          16h
kube-system   kube-scheduler-os004k8-master001.foobar.com            1/1     Running             1          17h
kube-system   weave-net-472vx                                        2/2     Running             4          17h
kube-system   weave-net-4lmgh                                        2/2     Running             4          17h
kube-system   weave-net-6w7qx                                        2/2     Running             4          16h
kube-system   weave-net-fkzm5                                        2/2     Running             3          16h
kube-system   weave-net-g4sjk                                        2/2     Running             2          17h
kube-system   weave-net-k7l4s                                        2/2     Running             3          17h
kube-system   weave-net-qs929                                        2/2     Running             4          16h
```

**Note**: It will take some time before all the pods are running. The kubernetes cluster first has to download a number of container images from Docker Hub. It then needs to start the pods (containers) and wait for their internal services to come online.
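
Rather than scanning the table by eye, the READY column can be checked programmatically. The following is a rough Python sketch that parses text in the shape of the `kubectl get pods -A` output above; the `SAMPLE` text is a shortened copy of that output and the function name `pods_not_ready` is hypothetical. It assumes every column is populated and contains no spaces, which holds for the output shown here but may not hold for every kubectl version.

```python
# Shortened copy of the `kubectl get pods -A` output shown above.
SAMPLE = """\
NAMESPACE     NAME                       READY   STATUS              RESTARTS   AGE
default       console-587c67787f-cm8sg   0/1     ContainerCreating   0          38s
default       etcd-0                     0/1     Pending             0          38s
kube-system   weave-net-472vx            2/2     Running             4          17h
"""


def pods_not_ready(table):
    """Return (namespace, name, status) for pods whose READY count is incomplete."""
    waiting = []
    for line in table.strip().splitlines()[1:]:  # skip the header row
        namespace, name, ready, status = line.split()[:4]
        have, want = ready.split("/")
        if have != want:
            waiting.append((namespace, name, status))
    return waiting


for namespace, name, status in pods_not_ready(SAMPLE):
    print(f"{namespace}/{name}: {status}")
```

Note that a pod that has run to completion also reports `0/1`, so this check is only meaningful for long-running pods such as the ones deployed by the chart.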

### Troubleshooting Pods Not Ready
If the helm installation fails (the pods never become ready) we can uninstall using the following:

```
[root@os004k8-master001 ~]# helm uninstall pachd
W0501 12:58:20.207995   30002 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
release "pachd" uninstalled

```

We can use the `describe` command to investigate why a pod is not running:

```
[root@os004k8-master001 ~]# kubectl describe pod pachd-loki-0
Name:           pachd-loki-0
Namespace:      default
Priority:       0
Node:           <none>
Labels:         app=loki
                controller-revision-hash=pachd-loki-5bc57fd4dd
                name=pachd-loki
                release=pachd
                statefulset.kubernetes.io/pod-name=pachd-loki-0
Annotations:    checksum/config: 9688827d4f9db7e59b48154e6433ef91fdc762d5b7545bfbe786f8d75c4de68a
                prometheus.io/port: http-metrics
                prometheus.io/scrape: true
Status:         Pending
IP:
IPs:            <none>
Controlled By:  StatefulSet/pachd-loki
Containers:
  loki:
    Image:       grafana/loki:2.6.1
    Ports:       3100/TCP, 9095/TCP, 7946/TCP
    Host Ports:  0/TCP, 0/TCP, 0/TCP
    Args:
      -config.file=/etc/loki/loki.yaml
    Liveness:     http-get http://:http-metrics/ready delay=45s timeout=1s period=10s #success=1 #failure=3
    Readiness:    http-get http://:http-metrics/ready delay=45s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /data from storage (rw)
      /etc/loki from config (rw)
      /tmp from tmp (rw)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  storage-pachd-loki-0
    ReadOnly:   false
  tmp:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  config:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  pachd-loki
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  45s (x3 over 2m12s)  default-scheduler  0/7 nodes are available: 7 pod has unbound immediate PersistentVolumeClaims.

```

In the events section, I see the pod is not running because it failed to schedule. The failure carries the message "pod has unbound immediate PersistentVolumeClaims". I also see that the pod is trying to run a Loki container provided by the Grafana project. I googled this error to try to understand the root cause.

I managed to find a [thread](https://community.grafana.com/t/helm-installation-with-persisistent-storage-does-not-bind-storage/45672) on the Grafana community forum describing a similar problem.

The first suggestion there was to list the persistent volume claims with `kubectl get pvc`. Mine looked like this:

```
[root@os004k8-master001 ~]# kubectl get PersistentVolumeClaims
NAME                   STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-postgres-0        Pending                                                     128m
etcd-storage-etcd-0    Pending                                                     128m
storage-pachd-loki-0   Pending                                                     128m
```

Describing the resource I see the following:

```
[root@os004k8-master001 ~]# kubectl describe pvc storage-pachd-loki-0
Name:          storage-pachd-loki-0
Namespace:     default
StorageClass:
Status:        Pending
Volume:
Labels:        app=loki
               release=pachd
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Used By:       pachd-loki-0
Events:
  Type    Reason         Age                     From                         Message
  ----    ------         ----                    ----                         -------
  Normal  FailedBinding  4m25s (x502 over 129m)  persistentvolume-controller  no persistent volumes available for this claim and no storage class is set

```

Having a look at the manifest for the claim, I see:

```
[root@os004k8-master001 ~]# kubectl get pvc storage-pachd-loki-0 -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  creationTimestamp: "2023-05-01T16:39:45Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app: loki
    release: pachd
  name: storage-pachd-loki-0
  namespace: default
  resourceVersion: "34516"
  uid: 33d258a0-32a3-4911-a54a-46f57a2c757e
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  volumeMode: Filesystem
status:
  phase: Pending

```
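
The manifest above has no `storageClassName` under `spec`, which matches the "no storage class is set" event. With several claims in play, the same check could be scripted. The sketch below is a minimal Python example that works on JSON in the shape returned by `kubectl get pvc -o json`; the `SAMPLE` document is a trimmed, hand-written stand-in (an assumption, not captured output), and the function name `claims_without_class` is hypothetical.

```python
import json

# Trimmed, hand-written stand-in for `kubectl get pvc -o json` (assumption).
SAMPLE = json.loads("""
{
  "items": [
    {"metadata": {"name": "storage-pachd-loki-0"},
     "spec": {"accessModes": ["ReadWriteOnce"],
              "resources": {"requests": {"storage": "10Gi"}}},
     "status": {"phase": "Pending"}},
    {"metadata": {"name": "test-claim"},
     "spec": {"accessModes": ["ReadWriteMany"],
              "storageClassName": "nfs-client",
              "resources": {"requests": {"storage": "1Mi"}}},
     "status": {"phase": "Bound"}}
  ]
}
""")


def claims_without_class(pvc_list):
    """Names of Pending claims whose spec carries no storageClassName."""
    names = []
    for item in pvc_list["items"]:
        pending = item["status"]["phase"] == "Pending"
        has_class = bool(item["spec"].get("storageClassName"))
        if pending and not has_class:
            names.append(item["metadata"]["name"])
    return names


print(claims_without_class(SAMPLE))
```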

I wanted to understand what a PVC was. I had a look at the official kubernetes documentation to dig in.

> The PersistentVolume subsystem provides an API for users and administrators that abstracts details of how storage is provided from how it is consumed. To do this, we introduce two new API resources: PersistentVolume and PersistentVolumeClaim.
>
> A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual Pod that uses the PV. This API object captures the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.
>
> A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or ReadWriteMany, see AccessModes).
>
> https://kubernetes.io/docs/concepts/storage/persistent-volumes/

At this point I think I understand the problem: the helm chart deployed pods that require PersistentVolumes, but those volumes do not exist. I wondered why and continued reading:

> Provisioning
>
> There are two ways PVs may be provisioned: statically or dynamically.
>
> Static
>
> A cluster administrator creates a number of PVs. They carry the details of the real storage, which is available for use by cluster users. They exist in the Kubernetes API and are available for consumption.
>
> Dynamic
>
> When none of the static PVs the administrator created match a user's PersistentVolumeClaim, the cluster may try to dynamically provision a volume specially for the PVC. This provisioning is based on StorageClasses: the PVC must request a storage class and the administrator must have created and configured that class for dynamic provisioning to occur. Claims that request the class "" effectively disable dynamic provisioning for themselves.
>
> To enable dynamic storage provisioning based on storage class, the cluster administrator needs to enable the DefaultStorageClass admission controller on the API server. This can be done, for example, by ensuring that DefaultStorageClass is among the comma-delimited, ordered list of values for the --enable-admission-plugins flag of the API server component. For more information on API server command-line flags, check kube-apiserver documentation.


So, if there are no volumes, the volume claims can never be satisfied. My next question is: "Are resources being defined/requested but not being deployed correctly (i.e. something is failing), or are resources not being defined but are being requested?"
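
The provisioning rules quoted above can be modelled roughly as follows. This is a simplified Python sketch of my reading of the documentation, not the actual kubernetes algorithm: it ignores capacity, access modes and selectors, and the function name `binding_outcome` and its parameters are hypothetical.

```python
def binding_outcome(claim_class, static_pv_classes, storage_classes, default_class=None):
    """Rough model of what happens to a PVC.

    claim_class:       None (field omitted), "" (dynamic provisioning disabled),
                       or the name of a storage class.
    static_pv_classes: class names of the pre-created PVs ("" for classless PVs).
    storage_classes:   names of the storage classes that have a provisioner.
    default_class:     name of the default storage class, if one is marked.
    """
    # An omitted class is filled in with the default class, when one exists.
    if claim_class is None and default_class is not None:
        claim_class = default_class
    wanted = claim_class or ""  # None or "" can only match classless PVs
    if wanted in static_pv_classes:
        return "bound to an existing PV"
    if wanted and wanted in storage_classes:
        return "dynamically provisioned"
    return "Pending"


# No class on the claim, no PVs, no storage classes: the situation seen above.
print(binding_outcome(None, [], set()))
# The claim names a class that has a provisioner behind it.
print(binding_outcome("nfs-client", [], {"nfs-client"}))
# The class exists but is not the default, and the claim does not name it.
print(binding_outcome(None, [], {"nfs-client"}))
```

Under this model, a claim that omits the class only gets a volume if a matching static PV exists or a default storage class is marked, which is one reason to name the storage class explicitly in a values file.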

I checked and confirmed I do not have any persistent volumes provisioned on my cluster:

```
[root@os004k8-master001 ~]# kubectl get PersistentVolumes
No resources found

```

I would expect that if something was defined and a deployment failed I would see it listed in that output. This is telling me that nothing was provisioned.


To make sure this suspicion was correct I had a look at the Loki documentation. It does [mention](https://grafana.com/docs/loki/latest/installation/helm/configure-storage/) that storage must be configured in order to use Loki. The storage configuration section in the values.yaml file accepts a number of parameters.

> Configure storage
>
> The scalable installation requires a managed object store such as AWS S3 or Google Cloud Storage or a self-hosted store such as Minio. The single binary installation can only use the filesystem for storage.
>
> This guide assumes Loki will be installed in one of the modes above and that a values.yaml has been created.
>
> https://grafana.com/docs/loki/latest/installation/helm/configure-storage/

I had a look at the [values.yaml file from the git repository](https://github.com/grafana/helm-charts/blob/main/charts/loki-stack/values.yaml) and saw that it did not have any definitions specific to a storage provider.


I was wondering if the documented installation ever worked. Does Docker Desktop or minikube allow dynamic storage provisioning by default? My guess was no, and after looking at the docs and how-to articles I believe that assumption was correct. So how does this thing deploy!?

I can ask kubernetes to show me the deployments on the system.

```
[root@os004k8-master001 ~]# kubectl get deployments
NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
console                     1/1     1            1           99m
pachd                       0/1     1            0           99m
pachw                       0/0     0            0           99m
pachyderm-kube-event-tail   1/1     1            1           99m
pachyderm-proxy             1/1     1            1           99m
pg-bouncer                  1/1     1            1           99m
```

I can then ask for the deployment manifest (yaml file) that describes a given deployment.

```
[root@os004k8-master001 ~]# kubectl get deployment pachd -o yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    meta.helm.sh/release-name: pachd
    meta.helm.sh/release-namespace: default
  creationTimestamp: "2023-05-01T16:59:51Z"
  generation: 1
  labels:
    app: pachd
    app.kubernetes.io/managed-by: Helm
    suite: pachyderm
  name: pachd
  namespace: default
  resourceVersion: "38824"
  uid: 98bd124c-ae4f-4b32-9ba9-798861aee2c9
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: pachd
      suite: pachyderm
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        checksum/helm-values: 96d4707a15cfe2139b4a2fed4b70dd84276812f4f292cdf18fdc1d3ea8d30486
        checksum/storage-secret: 44d530a6561604c3aa1902c08580e6d18618f932f5199448520b26bc345bc88d
      creationTimestamp: null
      labels:
        app: pachd
        suite: pachyderm
      name: pachd
      namespace: default
    spec:
      automountServiceAccountToken: true
      containers:
      - args:
        - --mode
        - $(MODE)
        command:
        - /pachd
        env:
        - name: PACHW_IN_SIDECARS
          value: "true"
        - name: PACHW_MIN_REPLICAS
          value: "0"
        - name: PACHW_MAX_REPLICAS
          value: "1"
        - name: POSTGRES_HOST
          value: postgres
        - name: POSTGRES_PORT
          value: "5432"
        - name: POSTGRES_USER
          value: pachyderm
        - name: POSTGRES_DATABASE
          value: pachyderm
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              key: postgresql-password
              name: postgres
        - name: PG_BOUNCER_HOST
          value: pg-bouncer
        - name: PG_BOUNCER_PORT
          value: "5432"
        - name: LOKI_LOGGING
          value: "true"
        - name: LOKI_SERVICE_HOST
          value: $(PACHD_LOKI_SERVICE_HOST)
        - name: LOKI_SERVICE_PORT
          value: $(PACHD_LOKI_SERVICE_PORT)
        - name: PACH_ROOT
          value: /pach
        - name: ETCD_PREFIX
        - name: STORAGE_BACKEND
          value: LOCAL
        - name: STORAGE_HOST_PATH
          value: /var/pachyderm-gApqb/pachd
        - name: WORKER_IMAGE
          value: pachyderm/worker:2.5.5
        - name: WORKER_USES_ROOT
          value: "True"
        - name: WORKER_SIDECAR_IMAGE
          value: pachyderm/pachd:2.5.5
        - name: WORKER_IMAGE_PULL_POLICY
          value: IfNotPresent
        - name: WORKER_SERVICE_ACCOUNT
          value: pachyderm-worker
        - name: METRICS
          value: "true"
        - name: PACHYDERM_LOG_LEVEL
          value: info
        - name: PACH_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: REQUIRE_CRITICAL_SERVERS_ONLY
          value: "false"
        - name: PACHD_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: PPS_WORKER_GRPC_PORT
          value: "1080"
        - name: STORAGE_UPLOAD_CONCURRENCY_LIMIT
          value: "100"
        - name: STORAGE_PUT_FILE_CONCURRENCY_LIMIT
          value: "100"
        - name: STORAGE_COMPACTION_SHARD_SIZE_THRESHOLD
          value: "0"
        - name: STORAGE_COMPACTION_SHARD_COUNT_THRESHOLD
          value: "0"
        - name: STORAGE_COMPACTION_MAX_FANIN
          value: "10"
        - name: STORAGE_FILESETS_MAX_OPEN
          value: "50"
        - name: STORAGE_DISK_CACHE_SIZE
          value: "100"
        - name: STORAGE_MEMORY_CACHE_SIZE
          value: "100"
        - name: CONSOLE_OAUTH_ID
          value: console
        - name: CONSOLE_OAUTH_SECRET
          valueFrom:
            secretKeyRef:
              key: OAUTH_CLIENT_SECRET
              name: pachyderm-console-secret
        - name: ENABLE_WORKER_SECURITY_CONTEXTS
          value: "true"
        - name: ENABLE_PREFLIGHT_CHECKS
          value: "true"
        - name: UNPAUSED_MODE
          value: full
        - name: K8S_MEMORY_REQUEST
          valueFrom:
            resourceFieldRef:
              containerName: pachd
              divisor: "0"
              resource: requests.memory
        - name: K8S_MEMORY_LIMIT
          valueFrom:
            resourceFieldRef:
              containerName: pachd
              divisor: "0"
              resource: limits.memory
        envFrom:
        - secretRef:
            name: pachyderm-storage-secret
        - secretRef:
            name: pachyderm-deployment-id-secret
        - configMapRef:
            name: pachd-config
            optional: true
        image: pachyderm/pachd:2.5.5
        imagePullPolicy: IfNotPresent
        livenessProbe:
          exec:
            command:
            - /pachd
            - --readiness
          failureThreshold: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 30
        name: pachd
        ports:
        - containerPort: 1600
          name: s3gateway-port
          protocol: TCP
        - containerPort: 1650
          name: api-grpc-port
          protocol: TCP
        - containerPort: 1653
          name: peer-port
          protocol: TCP
        - containerPort: 1657
          name: oidc-port
          protocol: TCP
        - containerPort: 1658
          name: identity-port
          protocol: TCP
        - containerPort: 1656
          name: prom-metrics
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - /pachd
            - --readiness
          failureThreshold: 3
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources: {}
        startupProbe:
          exec:
            command:
            - /pachd
            - --readiness
          failureThreshold: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 30
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /tmp
          name: tmp
        - mountPath: /pach
          name: pach-disk
        - mountPath: /pachyderm-storage-secret
          name: pachyderm-storage-secret
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        runAsUser: 0
      serviceAccount: pachyderm
      serviceAccountName: pachyderm
      terminationGracePeriodSeconds: 30
      volumes:
      - emptyDir: {}
        name: tmp
      - hostPath:
          path: /var/pachyderm-gApqb/pachd
          type: DirectoryOrCreate
        name: pach-disk
      - name: pachyderm-storage-secret
        secret:
          defaultMode: 420
          secretName: pachyderm-storage-secret
status:
  conditions:
  - lastTransitionTime: "2023-05-01T16:59:52Z"
    lastUpdateTime: "2023-05-01T16:59:52Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: "2023-05-01T17:09:53Z"
    lastUpdateTime: "2023-05-01T17:09:53Z"
    message: ReplicaSet "pachd-7f5f57bd7d" has timed out progressing.
    reason: ProgressDeadlineExceeded
    status: "False"
    type: Progressing
  observedGeneration: 1
  replicas: 1
  unavailableReplicas: 1
  updatedReplicas: 1

```
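
The manifest is long, so it can help to pull out only the storage-related parts. Below is a minimal Python sketch that does this for a dictionary in the shape of the manifest above; the `manifest` literal is a hand-copied subset of that output, and the function name `storage_summary` is hypothetical.

```python
# Hand-copied subset of the pachd deployment manifest shown above.
manifest = {
    "spec": {"template": {"spec": {
        "containers": [{
            "name": "pachd",
            "env": [
                {"name": "POSTGRES_HOST", "value": "postgres"},
                {"name": "STORAGE_BACKEND", "value": "LOCAL"},
                {"name": "STORAGE_HOST_PATH", "value": "/var/pachyderm-gApqb/pachd"},
            ],
        }],
        "volumes": [
            {"name": "tmp", "emptyDir": {}},
            {"name": "pach-disk",
             "hostPath": {"path": "/var/pachyderm-gApqb/pachd",
                          "type": "DirectoryOrCreate"}},
            {"name": "pachyderm-storage-secret",
             "secret": {"secretName": "pachyderm-storage-secret"}},
        ],
    }}}
}


def storage_summary(deployment):
    """Return (STORAGE_* env vars, volume name -> volume kind) for a deployment."""
    pod = deployment["spec"]["template"]["spec"]
    env = {
        entry["name"]: entry.get("value")
        for container in pod["containers"]
        for entry in container.get("env", [])
        if entry["name"].startswith("STORAGE_")
    }
    # Each volume has a "name" key plus exactly one key naming its kind.
    kinds = {
        volume["name"]: next(key for key in volume if key != "name")
        for volume in pod["volumes"]
    }
    return env, kinds


env, kinds = storage_summary(manifest)
print(env)
print(kinds)
```

In this subset, none of the pachd volumes is a `persistentVolumeClaim`: the pod uses an `emptyDir`, a `hostPath` and a `secret`, with `STORAGE_BACKEND` set to `LOCAL`.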