Deploying Fybrik on Operate First

Operate First is a concept of bringing pre-release open source software to a production cloud environment. The Mass Open Cloud (MOC) is a production cloud resource where projects are run. Deploying Fybrik on Operate First is the first step toward integrating Fybrik with Open Data Hub, making Fybrik more easily accessible to data scientists. For further questions about Operate First, or about contributing to the Operate First GitHub repositories, join the Slack channel here.

Accessing the MOC Smaug cluster

The Smaug cluster is where all user workloads are deployed. A deployment of Open Data Hub (ODH) is also managed on the Smaug cluster. This is valuable to Fybrik since this ODH Deployment includes JupyterHub.

Getting Access

Your GitHub username must be listed in this file for you to get access to the Fybrik user group on the Smaug cluster and log in successfully. If you would like to be added as a user, create a PR in the operate-first/apps repository that adds your GitHub username to group.yaml.
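For illustration, an OpenShift user group of this kind typically looks like the sketch below. The exact path and contents of group.yaml may differ, so check the file in operate-first/apps before editing:

```yaml
# Hypothetical sketch of a group.yaml entry; the group name and
# existing users are placeholders, not the real file contents.
apiVersion: user.openshift.io/v1
kind: Group
metadata:
  name: fybrik
users:
  - existing-user
  - your-github-username   # add your GitHub username here
```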

Logging In

You can access the Smaug cluster with this OpenShift console login link. Click on operate-first to log in with GitHub authentication. Once logged into the OpenShift console, you can use this link to get an oc login command with a token that lets you log in to the Smaug OpenShift cluster from your terminal.
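The copied command has roughly the following shape; the token and server URL below are placeholders, not real values:

```shell
# Placeholders: substitute the token and server from the console's
# "Copy login command" page.
oc login --token=<your-token> --server=https://<smaug-api-server>:6443

# Verify the login by listing the projects you can access:
oc projects
```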

Deploying Fybrik Cluster-Scoped Resources

In the Operate First environment, cluster-scoped resource manifests must be added to the operate-first/apps repository to be deployed on the Smaug cluster, for security reasons. The integration/operate-first directory contains the raw YAML files of the cluster-scoped resources deployed for Fybrik, mainly the custom resource definitions (CRDs) used by Fybrik. These files are generated from the Helm charts in charts/fybrik.

If the Helm chart has been updated, follow the steps below to generate the new YAML files:

  1. Install yq and Helm. Run these commands from the root directory of this repo:

```shell
cd hack/tools
./install_yq.sh
./install_helm.sh
```

  2. Go back to the integration/operate-first folder and set up the Python environment there:

```shell
cd ../../integration/operate-first
pipenv install
pipenv shell
```

  3. Run the Makefile to generate new YAML files from the Helm charts:

```shell
make all
```

After the cluster-scoped YAML files are generated, create a PR to the operate-first/apps repository that adds the YAML files to the cluster-scope/base directory, in subdirectories organized by resource type. Any resource added to base must also be added to kustomization.yaml in cluster-scope/overlays: resources are only deployed to the Smaug cluster if they are included in this kustomization.yaml file. Namespaces must also be added to this file to be created on the Smaug cluster. We currently have three namespaces that anyone in the Fybrik user group can access: fybrik-system, fybrik-blueprints, and fybrik-applications. More documentation about contributing to the operate-first/apps repository can be found here.
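A kustomization.yaml of this kind lists each resource path explicitly. The entries below are illustrative only; the actual paths in cluster-scope/overlays may differ:

```yaml
# Illustrative kustomization.yaml fragment; the relative paths are assumptions.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base/core/namespaces/fybrik-system
  - ../../base/core/namespaces/fybrik-blueprints
  - ../../base/core/namespaces/fybrik-applications
  # CRDs and other cluster-scoped resources added for Fybrik:
  - ../../base/apiextensions.k8s.io/customresourcedefinitions/fybrikapplications
```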

Deploying Namespace-Scoped Resources

Operate First has an ArgoCD instance deployed on MOC that can be used to deploy OpenShift resources located in a Git repository. Only namespace-scoped resources can be deployed with ArgoCD; any cluster-scoped resource, such as a CRD or cluster role, will be blocked by ArgoCD. The namespace-scoped resources required for Fybrik have been onboarded to ArgoCD by following these instructions, and an ArgoCD project has been created for Fybrik. You can log in to the ArgoCD instance with the same login method as above. We have deployed two ArgoCD applications which are automatically synced with the latest release of Fybrik. The fybrik and vault ArgoCD applications deployed on the Smaug cluster are in sync with the fybrik/charts repository.

The following are the ArgoCD application manifests which have been added to the operate-first/apps repository:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: fybrik
spec:
  destination:
    name: smaug
    namespace: fybrik-system
  source:
    path: charts/fybrik
    repoURL: 'https://github.com/fybrik/charts'
    targetRevision: HEAD
    helm:
      parameters:
        # Disable deploying Fybrik cluster-scoped resources
        - name: clusterScoped
          value: "false"
        # Only watch for FybrikApplication resources from fybrik-applications
        - name: applicationNamespace
          value: fybrik-applications
  project: fybrik
```
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: vault
spec:
  project: fybrik
  source:
    repoURL: 'https://github.com/fybrik/charts'
    path: charts/vault
    targetRevision: HEAD
    helm:
      valueFiles:
        - env/dev/vault-single-cluster-values.yaml
      parameters:
        # authDelegator enables a cluster role binding to be attached to the service account.
        # The cluster role binding is already deployed in the smaug cluster and thus authDelegator can be disabled.
        - name: vault.server.authDelegator.enabled
          value: 'false'
        - name: vault.global.openshift
          value: 'true'
        - name: vault.injector.enabled
          value: 'false'
        - name: vault.server.dev.enabled
          value: 'true'
      values: |
        plugins:
          vaultPluginSecretsKubernetesReader:
            enabled: true
            clusterScope: false
            namespaces:
              - fybrik-applications
              - fybrik-system
        modulesNamespace: "fybrik-blueprints"
  destination:
    namespace: fybrik-system
    name: smaug
```
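Once the applications are deployed, their sync status can also be checked from a terminal with the argocd CLI. This is a sketch assuming you have the CLI installed and have logged in to the Operate First ArgoCD instance; the application names match the manifests above:

```shell
# List the Fybrik project's applications and their sync/health status.
argocd app list

# Inspect or manually trigger a sync of one application.
argocd app get fybrik
argocd app sync vault
```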

Running the Fybrik notebook sample on Operate First

  1. Follow the steps in Fybrik notebook sample to prepare a dataset to be accessed by the notebook, register the dataset in a data catalog, and define data access policies. Make sure to use the fybrik-applications namespace instead of the fybrik-notebook-sample namespace, since fybrik-applications has already been created on the Smaug cluster.
  2. Access the JupyterHub instance deployed on the Smaug OpenShift cluster here and log in with the above method.
  3. Start a notebook server using the Elyra Notebook Image or any image of your choosing.
  4. Create a notebook and insert a new notebook cell with the Python code in Step 2 of Read the dataset from the notebook. Make sure to change the asset to fybrik-applications/paysim-csv instead of fybrik-notebook-sample/paysim-csv.
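The only change relative to the sample's read code is the asset ID. As a minimal sketch (the request shape and column names here are assumptions for illustration; the authoritative code is in the notebook sample itself), the request the notebook builds would name the new asset like this:

```python
import json

# Build the request payload the notebook sends to the Fybrik read module.
# The asset ID is namespace-qualified, so only the namespace part changes
# when running on Operate First. The "columns" list is illustrative.
request = {
    "asset": "fybrik-applications/paysim-csv",  # was fybrik-notebook-sample/paysim-csv
    "columns": ["step", "type", "amount"],
}
ticket = json.dumps(request)
print(ticket)
```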