diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
deleted file mode 100644
index 0751889e..00000000
--- a/CONTRIBUTING.md
+++ /dev/null
@@ -1,33 +0,0 @@
-# How to contribute
-
-We love your input, and we want to make it as easy as possible for you to contribute, whether that's by:
-* Highlighting a bug
-* Discussing an idea
-* Proposing a new feature
-* Creating a pull request
-
-## Getting started
-* Make sure you have a [GitHub account](https://github.com/).
-* Consider creating a [GitHub issue](https://github.com/gchq/gaffer-docker/issues): does an issue already exist? If yours is new, describe it in as much detail as you can, e.g. step-by-step instructions to reproduce it.
-* Fork the repository on GitHub.
-* Clone the repo: `git clone https://github.com/gchq/gaffer-docker.git`
-* Create a branch for your change, probably from the develop branch. Please don't work on develop. Try this: `git checkout -b gh--my_contribution develop`
-
-## Making changes
-* Make sure you can reproduce any bugs you find.
-* Make your changes and test them. Make sure you include new or updated tests if you need to.
-* Run the tests locally by following this guide on [Deploying using Kind](kubernetes/kind-deployment.md).
-
-## Submitting changes
-* Sign the [GCHQ Contributor Licence Agreement](https://github.com/gchq/Gaffer/wiki/GCHQ-OSS-Contributor-License-Agreement-V1.0).
-* Push your changes to your fork.
-* Submit a [pull request](https://github.com/gchq/gaffer-docker/pulls).
-* Link the issue by putting `closes #` in the description.
-* We'll look at it soon after it's submitted, and we aim to respond within one week.
-
-## Getting it accepted
-Here are some things you can do to make this all smoother:
-* If you think it might be controversial, discuss it with us beforehand via a GitHub issue.
-* Add tests.
-* Avoid hardcoded values in templates or Docker Compose files. Try to extract them to the values.yaml or .env files if you can.
-* Include the copyright banner on new files.
diff --git a/README.md b/README.md
index 986fe353..085a6be0 100644
--- a/README.md
+++ b/README.md
@@ -1,26 +1,28 @@
-Gaffer Docker
-================
+# Gaffer Docker

-This repo contains the code needed to run Gaffer using Docker or Kubernetes.
+This repo contains the code needed to run Gaffer using Docker or Kubernetes. There are two main sub-folders, 'docker' and 'kubernetes', which contain the project files you need for starting Gaffer using those services.

# Running Gaffer Using Docker
-For information on how to run Gaffer using Docker containers, please see the README in the docker directory: [Gaffer Docker README](docker/README.md)
+
+For information on how to run Gaffer using Docker containers, please see the documentation: [Gaffer Docker Docs](https://gchq.github.io/gaffer-doc/latest/dev/docker/)

# Running Gaffer Using Kubernetes
-For information on how to run Gaffer using Kubernetes, please see the README in the kubernetes directory: [Kubernetes README](kubernetes/README.md)
+
+For information on how to run Gaffer using Kubernetes, please see the documentation: [Gaffer Kubernetes Docs](https://gchq.github.io/gaffer-doc/latest/dev/kubernetes-guide/kubernetes/)

# Versioning
+
Each of our released images is tagged with the version of the software it represents. Every release, we update the `latest` tag for each image and add a new release which has the corresponding version tag.
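As a quick illustration of consuming these tags, pulling by the full version pins your build, while `latest` floats (the exact tags here are examples only, matching the list that follows):

```bash
# Floating tag: moves to the newest release over time
docker pull gchq/gaffer:latest

# Fully-pinned tag: use this form for reproducible builds
docker pull gchq/gaffer:2.1.2-accumulo-2.0.1
```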
If we release Gaffer version 2.1.2, the following images would be uploaded:
-* gchq/gaffer:latest
-* gchq/gaffer:2
-* gchq/gaffer:2.1
-* gchq/gaffer:2.1.2
-* gchq/gaffer:2.1.2-accumulo-2.0.1
+- gchq/gaffer:latest
+- gchq/gaffer:2
+- gchq/gaffer:2.1
+- gchq/gaffer:2.1.2
+- gchq/gaffer:2.1.2-accumulo-2.0.1

We maintain mutable versions of latest, as well as the major, minor and bugfix versions of Gaffer. For reproducibility, make sure to use the full version in your build metadata. For `gaffer`/`gaffer-rest` images, we also create a tag including the
@@ -30,7 +32,9 @@
are not published but can be built locally if required.
The release process is automated by GitHub Actions.

# Known Compatible Docker Versions
-* 20.10.23
+
+- 20.10.23

# Contributing
-If you would like to make a Contribution, we have all the details for doing that [here](CONTRIBUTING.md)
+
+We welcome contributions to this project. Detailed information on our ways of working can be found in our [developer docs](https://gchq.github.io/gaffer-doc/latest/dev/ways-of-working/).
diff --git a/docker/README.md b/docker/README.md
deleted file mode 100644
index e3fd37b1..00000000
--- a/docker/README.md
+++ /dev/null
@@ -1,21 +0,0 @@
-Docker
-=======
-
-In this directory you can find the Dockerfiles and Docker Compose files for building container images for:
-* [Gaffer](gaffer/)
-* Gaffer's [REST API](gaffer-rest/)
-* Gaffer's [Road Traffic Data Loader](gaffer-road-traffic-loader/)
-* [HDFS](hdfs/)
-* [Accumulo](accumulo/)
-* Gaffer [Integration Test Runner](gaffer-integration-tests/)
-* [gafferpy Jupyter Notebook](gaffer-pyspark-notebook/)
-* Gaffer [options server for JupyterHub](gaffer-jhub-options-server/)
-* [Spark](spark-py/)
-
-For more specific information on what these images are for and how to build them, please see their respective READMEs.
-
-Please note that some of these containers will only be useful if utilised by the Helm Charts under [Kubernetes](/kubernetes/), and may not be runnable on their own.
-
-# Requirements
-Before you can build and run these containers you will need to install Docker along with the compose plugin. Information on this can be found in the Docker docs:
-* [Installing Docker](https://docs.docker.com/get-docker/)
diff --git a/kubernetes/README.md b/kubernetes/README.md
deleted file mode 100644
index 74c25914..00000000
--- a/kubernetes/README.md
+++ /dev/null
@@ -1,28 +0,0 @@
-Kubernetes
-==========
-In this directory you can find the Helm charts required to deploy various applications onto Kubernetes clusters.
-The Helm charts and associated information for each application can be found in the following places:
-* [HDFS](hdfs/)
-* [Accumulo](accumulo/)
-* [Gaffer](gaffer/)
-* [Example Gaffer Graph containing Road Traffic Dataset](gaffer-road-traffic/)
-* [JupyterHub with Gaffer integrations](gaffer-jhub/)
-
-These charts can be accessed by cloning our repository or by using our Helm repo hosted on our [GitHub Pages Site](https://gchq.github.io/gaffer-docker).
-
-
-## Adding this repo to Helm
-To add the gaffer-docker repo to Helm, run:
-```bash
-helm repo add gaffer-docker https://gchq.github.io/gaffer-docker
-```
-
-# Kubernetes How-to Guides
-We have a number of [guides](./docs/guides.md) to help you deploy Gaffer on Kubernetes. It is important you look at these before you get started; they provide the initial steps for running these applications.
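Once the gaffer-docker repo is added to Helm (see above), charts can be installed straight from it. A minimal sketch, using the `gaffer` chart with an arbitrary release name:

```bash
# List the charts published in the repo
helm search repo gaffer-docker

# Install the Gaffer chart under a release name of your choosing
helm install my-graph gaffer-docker/gaffer
```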
-
-# Requirements
-Before you can deploy any of these applications, you need to have Kubernetes installed.
-* [Installing Kubernetes](https://kubernetes.io/docs/setup/)
-
-You will also need to install Docker and the compose plugin.
-* [Installing Docker](https://docs.docker.com/get-docker/)
diff --git a/kubernetes/accumulo/docs/kind-deployment.md b/kubernetes/accumulo/docs/kind-deployment.md
deleted file mode 100644
index b2d95cb8..00000000
--- a/kubernetes/accumulo/docs/kind-deployment.md
+++ /dev/null
@@ -1,48 +0,0 @@
-Deploying Accumulo using kind
-=================================
-
-All the scripts found here are designed to be run from the kubernetes/accumulo folder.
-
-First follow the [instructions here](../../docs/kind-deployment.md) to provision and configure a local Kubernetes cluster, using [kind](https://kind.sigs.k8s.io/) (Kubernetes IN Docker), that the Accumulo Helm Chart can be deployed on.
-
-```bash
-export HADOOP_VERSION=${HADOOP_VERSION:-3.3.3}
-export GAFFER_VERSION=${GAFFER_VERSION:-2.0.0}
-
-helm dependency update
-
-helm install accumulo . \
-  --set hdfs.namenode.tag=${HADOOP_VERSION} \
-  --set hdfs.datanode.tag=${HADOOP_VERSION} \
-  --set hdfs.shell.tag=${HADOOP_VERSION} \
-  --set accumulo.image.tag=${GAFFER_VERSION}
-
-helm test accumulo
-```
-
-# Accessing Web UIs (via `kubectl port-forward`)
-| Component | Command                                                         | URL                    |
-| --------- | --------------------------------------------------------------- | ---------------------- |
-| Accumulo  | `kubectl port-forward svc/road-traffic-gaffer-monitor 9995:80`  | http://localhost:9995/ |
-
-
-# Accessing Web UIs (via [Nginx Ingress Controller](https://github.com/kubernetes/ingress-nginx))
-Register the FQDNs for each component in DNS, e.g.:
-```
-echo "127.0.0.1 gaffer.k8s.local accumulo.k8s.local hdfs.k8s.local" | sudo tee -a /etc/hosts
-```
-
-Update the Accumulo deployment to route ingress based on FQDNs:
-```
-helm upgrade accumulo . -f ./values-host-based-ingress.yaml --reuse-values
-```
-
-Set up port forwarding to the nginx ingress controller:
-```
-sudo KUBECONFIG=$HOME/.kube/config kubectl port-forward -n ingress-nginx svc/ingress-nginx 80:80
-```
-
-Access the web UIs using the following URLs:
-| Component | URL                        |
-| --------- | -------------------------- |
-| Accumulo  | http://accumulo.k8s.local/ |
\ No newline at end of file
diff --git a/kubernetes/docs/add-libraries.md b/kubernetes/docs/add-libraries.md
deleted file mode 100644
index 24d98c9f..00000000
--- a/kubernetes/docs/add-libraries.md
+++ /dev/null
@@ -1,63 +0,0 @@
-Adding your own libraries and functions
-=======================================
-By default, the Gaffer deployment gives you access to the:
-* Sketches library
-* Time library
-* Bitmap library
-* JCS cache library
-
-If you want more libraries than these (either one of ours or one of your own) you will need to customise the Docker images and use them in place of the defaults.
-
-You will need a basic Gaffer instance deployed on Kubernetes. Here is [how to do that](./deploy-empty-graph.md).
-
-# Overwrite the REST war file
-At the moment, Gaffer uses a runnable jar file located at /gaffer/jars. When it runs, it includes /gaffer/jars/lib on the classpath. There is nothing in there by default because all the dependencies are bundled into the JAR. However, if you wanted to add your own jars, you can do it like this:
-```Dockerfile
-FROM gchq/gaffer-rest:latest
-COPY ./my-custom-lib-1.0-SNAPSHOT.jar /gaffer/jars/lib/
-```
-
-Build the image using:
-```bash
-docker build -t custom-rest:latest .
-```
-
-# Add the extra libraries to the Accumulo image
-Gaffer's Accumulo image includes support for the following Gaffer libraries:
-* The Bitmap Library
-* The Sketches Library
-* The Time Library
-
-In order to push down any extra value objects and filters to Accumulo that are not in those libraries, we have to add the jars to Accumulo's /lib/ext directory. Here is an example `Dockerfile`:
-```Dockerfile
-FROM gchq/gaffer:latest
-COPY ./my-library-1.0-SNAPSHOT.jar /opt/accumulo/lib/ext
-```
-Then build the image:
-```bash
-docker build -t custom-gaffer-accumulo:latest .
-```
-
-# Switch the images in the deployment
-You will need a way of making the custom images visible to the Kubernetes cluster. With EKS, you can do this by uploading the images to ECR. There is an example of how to do that in one of our [other guides](./aws-eks-deployment.md#Container+Images). With KinD, you just run `kind load docker-image <image>`.
-
-Once visible you can switch them out. Create a `custom-images.yaml` file with the following contents:
-```yaml
-api:
-  image:
-    repository: custom-rest
-    tag: latest
-
-accumulo:
-  image:
-    repository: custom-gaffer-accumulo
-    tag: latest
-```
-
-To switch them, run:
-```bash
-helm upgrade my-graph gaffer-docker/gaffer -f custom-images.yaml --reuse-values
-```
-
-# What next?
-See our [guides](./guides.md) for other things you can do with Gaffer on Kubernetes.
diff --git a/kubernetes/docs/aws-eks-deployment.md b/kubernetes/docs/aws-eks-deployment.md
deleted file mode 100644
index b80e80f7..00000000
--- a/kubernetes/docs/aws-eks-deployment.md
+++ /dev/null
@@ -1,156 +0,0 @@
-Deploying Gaffer on AWS EKS
-===========================
-The following instructions will guide you through provisioning and configuring an [AWS EKS](https://aws.amazon.com/eks/) cluster that our Helm Charts can be deployed on.
-
-# Install CLI Tools
-* [docker compose](https://github.com/docker/compose/releases/latest)
-* [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/)
-* [helm](https://github.com/helm/helm/releases)
-* [aws-cli](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html)
-* [eksctl](https://github.com/weaveworks/eksctl/releases/latest)
-
-
-# Container Images
-If the versions of the container images you would like to deploy are not available in [Docker Hub](https://hub.docker.com/u/gchq) then you will need to host them in a registry yourself.
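A quick way to check whether a tag is already published is `docker manifest inspect`, which fails when the tag does not exist (a sketch; the tag shown is only an example):

```bash
# Exit code 0 only if the tag exists in the registry
docker manifest inspect gchq/gaffer:2.0.0 > /dev/null \
  && echo "tag already published" \
  || echo "tag not found: build and host it yourself"
```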
-
-The following instructions build all the container images and host them in AWS ECR when run from the ./kubernetes folder:
-
-
-```bash
-export HADOOP_VERSION=${HADOOP_VERSION:-3.3.3}
-export GAFFER_VERSION=${GAFFER_VERSION:-2.0.0}
-
-docker compose --project-directory ../docker/accumulo/ -f ../docker/accumulo/docker-compose.yaml build
-docker compose --project-directory ../docker/gaffer-road-traffic-loader/ -f ../docker/gaffer-road-traffic-loader/docker-compose.yaml build
-
-HADOOP_IMAGES="hdfs"
-GAFFER_IMAGES="gaffer gaffer-rest gaffer-road-traffic-loader"
-
-ACCOUNT=$(aws sts get-caller-identity --query Account --output text)
-[ "${REGION}" = "" ] && REGION=$(aws configure get region)
-[ "${REGION}" = "" ] && REGION=$(curl --silent -m 5 http://169.254.169.254/latest/dynamic/instance-identity/document | grep region | cut -d'"' -f 4)
-REPO_PREFIX="${ACCOUNT}.dkr.ecr.${REGION}.amazonaws.com/gchq"
-
-for repo in ${HADOOP_IMAGES} ${GAFFER_IMAGES}; do
-  aws ecr create-repository --repository-name gchq/${repo}
-done
-
-aws ecr get-login-password | docker login -u AWS --password-stdin https://${ACCOUNT}.dkr.ecr.${REGION}.amazonaws.com
-
-for repo in ${HADOOP_IMAGES}; do
-  docker image tag gchq/${repo}:${HADOOP_VERSION} ${REPO_PREFIX}/${repo}:${HADOOP_VERSION}
-  docker image push ${REPO_PREFIX}/${repo}:${HADOOP_VERSION}
-done
-
-for repo in ${GAFFER_IMAGES}; do
-  docker image tag gchq/${repo}:${GAFFER_VERSION} ${REPO_PREFIX}/${repo}:${GAFFER_VERSION}
-  docker image push ${REPO_PREFIX}/${repo}:${GAFFER_VERSION}
-done
-```
-
-# EKS Cluster
-There are a number of ways to provision an AWS EKS cluster. This guide uses a CLI tool called `eksctl`. Documentation is available at https://docs.aws.amazon.com/eks/latest/userguide/getting-started.html for some of the other methods.
-
-Before issuing any commands, the subnets that will be used by your EKS cluster need to be tagged accordingly:
-| Subnet Type | Tag Key                         | Tag Value |
-| ----------- | ------------------------------- | --------- |
-| Public      | kubernetes.io/role/elb          | 1         |
-| Private     | kubernetes.io/role/internal-elb | 1         |
-
-If you want the cluster to spin up in a VPC that is not the default, then set `$VPC_ID`.
-
-```bash
-EKS_CLUSTER_NAME=${EKS_CLUSTER_NAME:-gaffer}
-KUBERNETES_VERSION=${KUBERNETES_VERSION:-1.15}
-
-[ "${VPC_ID}" = "" ] && VPC_ID=$(aws ec2 describe-vpcs --filters Name=isDefault,Values=true --query Vpcs[0].VpcId --output text)
-[ "${VPC_ID}" = "" ] && echo "Unable to detect default VPC ID, please set \$VPC_ID" && exit 1
-
-# Obtain a list of public and private subnets that the cluster will be deployed into by querying for the required 'elb' tags
-PUBLIC_SUBNET_IDS=$(aws ec2 describe-subnets --filters Name=vpc-id,Values=${VPC_ID} Name=tag-key,Values=kubernetes.io/role/elb --query Subnets[].SubnetId --output text | tr -s '[:blank:]' ',')
-PRIVATE_SUBNET_IDS=$(aws ec2 describe-subnets --filters Name=vpc-id,Values=${VPC_ID} Name=tag-key,Values=kubernetes.io/role/internal-elb --query Subnets[].SubnetId --output text | tr -s '[:blank:]' ',')
-[ "${PUBLIC_SUBNET_IDS}" = "" ] && echo "Unable to detect any public subnets. Make sure they are tagged: kubernetes.io/role/elb=1" && exit 1
-[ "${PRIVATE_SUBNET_IDS}" = "" ] && echo "Unable to detect any private subnets. Make sure they are tagged: kubernetes.io/role/internal-elb=1" && exit 1
-
-eksctl create cluster \
-  -n "${EKS_CLUSTER_NAME}" \
-  --version "${KUBERNETES_VERSION}" \
-  --managed \
-  --nodes 3 \
-  --nodes-min 3 \
-  --nodes-max 12 \
-  --node-volume-size 20 \
-  --full-ecr-access \
-  --alb-ingress-access \
-  --vpc-private-subnets "${PRIVATE_SUBNET_IDS}" \
-  --vpc-public-subnets "${PUBLIC_SUBNET_IDS}"
-
-aws eks update-kubeconfig --name ${EKS_CLUSTER_NAME}
-```
-
-# Ingress
-Deploy the AWS ALB Ingress Controller using the docs at https://docs.aws.amazon.com/eks/latest/userguide/alb-ingress.html
-
-At the time of writing, this involves issuing the following commands:
-
-```bash
-EKS_CLUSTER_NAME=${EKS_CLUSTER_NAME:-gaffer}
-
-[ "${ACCOUNT}" = "" ] && ACCOUNT=$(aws sts get-caller-identity --query Account --output text)
-[ "${REGION}" = "" ] && REGION=$(aws configure get region)
-[ "${REGION}" = "" ] && REGION=$(curl --silent -m 5 http://169.254.169.254/latest/dynamic/instance-identity/document | grep region | cut -d'"' -f 4)
-[ "${REGION}" = "" ] && echo "Unable to detect AWS region, please set \$REGION" && exit 1
-
-eksctl utils associate-iam-oidc-provider \
-  --region "${REGION}" \
-  --cluster "${EKS_CLUSTER_NAME}" \
-  --approve
-
-aws iam create-policy \
-  --policy-name ALBIngressControllerIAMPolicy \
-  --policy-document https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.1.4/docs/examples/iam-policy.json
-
-kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.1.4/docs/examples/rbac-role.yaml
-
-eksctl create iamserviceaccount \
-  --region "${REGION}" \
-  --name alb-ingress-controller \
-  --namespace kube-system \
-  --cluster "${EKS_CLUSTER_NAME}" \
-  --attach-policy-arn arn:aws:iam::${ACCOUNT}:policy/ALBIngressControllerIAMPolicy \
-  --override-existing-serviceaccounts \
-  --approve
-
-curl https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.1.4/docs/examples/alb-ingress-controller.yaml | sed "s/# - --cluster-name=devCluster/- --cluster-name=${EKS_CLUSTER_NAME}/" | kubectl apply -f -
-```
-
-# Deploy Helm Charts
-* [HDFS](../hdfs/docs/aws-eks-deployment.md)
-* [Gaffer](../gaffer/docs/aws-eks-deployment.md)
-* [Example Gaffer Graph containing Road Traffic Dataset](../gaffer-road-traffic/docs/aws-eks-deployment.md)
-
-
-# Access Web UIs
-The AWS ALB Ingress Controller will create an application load balancer (ALB) for each Ingress resource deployed into the EKS cluster.
-
-You can find the URL to use to access each Ingress with `kubectl get ing`.
-
-**⚠️ WARNING ⚠️**\
-By default, the security group assigned to the ALBs will allow anyone to access them. We highly recommend attaching a combination of the [other annotations available](https://kubernetes-sigs.github.io/aws-alb-ingress-controller/guide/ingress/annotation/#security-groups) to each of your Ingress resources to control access to them.
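For instance, a minimal sketch of annotations that restrict an Ingress (the CIDR shown is a placeholder; the annotation keys come from the ALB Ingress Controller's documentation):

```yaml
metadata:
  annotations:
    # Only accept traffic from an internal address range (placeholder CIDR)
    alb.ingress.kubernetes.io/inbound-cidrs: "10.0.0.0/8"
    # Provision an internal ALB instead of an internet-facing one
    alb.ingress.kubernetes.io/scheme: internal
```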
-
-
-# Uninstall
-```bash
-EKS_CLUSTER_NAME=${EKS_CLUSTER_NAME:-gaffer}
-
-# Use helm to uninstall any deployed charts
-for release in $(helm ls --short); do
-  helm uninstall ${release}
-done
-
-# Ensure EBS volumes are deleted
-kubectl get pvc --output name | xargs kubectl delete
-
-# Delete the EKS cluster
-eksctl delete cluster --name "${EKS_CLUSTER_NAME}"
-```
diff --git a/kubernetes/docs/change-accumulo-passwords.md b/kubernetes/docs/change-accumulo-passwords.md
deleted file mode 100644
index f16ebe78..00000000
--- a/kubernetes/docs/change-accumulo-passwords.md
+++ /dev/null
@@ -1,37 +0,0 @@
-Changing the Accumulo Passwords
-===============================
-
-When deploying Accumulo, either as part of a Gaffer stack or standalone, the passwords for all the users and the instance.secret are set to default values and should be changed. The instance.secret cannot be changed once deployed as it is used in initialisation.
-
-When deploying the Accumulo Helm chart, the following values are set. If you are using the Gaffer Helm chart with the Accumulo integration, the values will be prefixed with "accumulo":
-
-| Name                 | Value                                          | Default value |
-| -------------------- | ---------------------------------------------- | ------------- |
-| Instance Secret      | `config.accumuloSite."instance.secret"`        | "DEFAULT"     |
-| Root password        | `config.userManagement.rootPassword`           | "root"        |
-| Tracer user password | `config.userManagement.users.tracer.password`  | "tracer"      |
-
-When you deploy the Gaffer Helm chart with Accumulo, a "gaffer" user with a password of "gaffer" is used by default, following the same pattern as the tracer user.
-
-So to install a new Gaffer with Accumulo store, create an `accumulo-passwords.yaml` with the following contents:
-
-```yaml
-accumulo:
-  enabled: true
-  config:
-    accumuloSite:
-      instance.secret: "changeme"
-    userManagement:
-      rootPassword: "changeme"
-      users:
-        tracer:
-          password: "changeme"
-        gaffer:
-          password: "changeme"
-```
-
-You can install the graph with:
-
-```bash
-helm install my-graph gaffer-docker/gaffer -f accumulo-passwords.yaml
-```
diff --git a/kubernetes/docs/change-graph-metadata.md b/kubernetes/docs/change-graph-metadata.md
deleted file mode 100644
index a8daa3b3..00000000
--- a/kubernetes/docs/change-graph-metadata.md
+++ /dev/null
@@ -1,61 +0,0 @@
-Changing the Graph Id and Description
-=======================================
-By default, the Gaffer deployment ships with the graph name "simpleGraph" and the description "A graph for demo purposes". These are just placeholders and can be overwritten. This guide will show you how.
-
-The first thing you will need to do is [deploy an empty graph](./deploy-empty-graph.md).
-
-# Changing the description
-Create a file called `graph-meta.yaml`. We will use this file to add our description and graph Id.
-Changing the description is as easy as changing the `graph.config.description` value.
-```yaml
-graph:
-  config:
-    description: "My graph description"
-```
-Feel free to be a bit more imaginative.
-
-# Deploy the new description
-Upgrade your deployment using Helm:
-
-```bash
-helm upgrade my-graph gaffer-docker/gaffer -f graph-meta.yaml --reuse-values
-```
-
-The `--reuse-values` argument means we do not override any passwords that we set in the initial construction.
-
-You can see your new description if you go to the Swagger UI and call the /graph/config/description endpoint.
-
-# Updating the Graph Id
-This may be simple or complicated depending on your store type. If you are using the Map or Federated store, you can just set the `graph.config.graphId` value in the same way, though if you are using a MapStore the graph will be emptied as a result.
-
-However, if you are using the Accumulo store, updating the graph Id is a little more complicated since the graph Id corresponds to an Accumulo table. We have to change the gaffer user's permissions to read and write to that table. To do that, update the `graph-meta.yaml` file with the following contents:
-```yaml
-graph:
-  config:
-    graphId: "MyGraph"
-    description: "My Graph description"
-
-accumulo:
-  config:
-    userManagement:
-      users:
-        gaffer:
-          permissions:
-            table:
-              MyGraph:
-                - READ
-                - WRITE
-                - BULK_IMPORT
-                - ALTER_TABLE
-```
-
-# Deploy your changes
-Upgrade your deployment using Helm:
-```bash
-helm upgrade my-graph gaffer-docker/gaffer -f graph-meta.yaml --reuse-values
-```
-
-If you take a look at the Accumulo monitor, you will see your new Accumulo table.
-
-# What next?
-See our [guides](./guides.md) for other things you can do with Gaffer on Kubernetes.
diff --git a/kubernetes/docs/deploy-empty-graph.md b/kubernetes/docs/deploy-empty-graph.md
deleted file mode 100644
index b8589ebe..00000000
--- a/kubernetes/docs/deploy-empty-graph.md
+++ /dev/null
@@ -1,69 +0,0 @@
-How to deploy a simple graph
-=============================
-This guide will describe how to deploy a simple empty graph with the minimum configuration.
-
-You will need:
-1. Helm
-2. Kubectl
-3. A Kubernetes cluster (local or remote)
-4. An ingress controller running (for accessing UIs)
-
-# Add the Gaffer Docker repo
-To start with, you should add the Gaffer Docker repo to your Helm repos. This will save the need to clone this Git repository. If you have already done this, you can skip this step.
-```bash
-helm repo add gaffer-docker https://gchq.github.io/gaffer-docker
-```
-
-# Choose the store
-Gaffer's store can be backed by a number of different technologies. Which one you want depends on the use case, but as a rule of thumb:
-* If you just want something to spin up quickly at small scale and are not worried about persistence, use the [Map Store](#deploy-the-map-store).
-* If you want to back it with a key-value datastore, you can deploy the [Accumulo Store](#deploy-the-accumulo-store).
-* If you want to join two or more graphs together to query them as one, you will want to use the [Federated Store](#deploy-the-federated-store).
-
-Other stores, such as Parquet or HBase, could be supported by this Helm chart, but support for them is not available yet.
-
-## Deploy the Map Store
-The Map Store is just an in-memory store that can be used for demos or if you need something small-scale and short-term. It is our default store, so there is no need for any extra configuration.
-
-You can install a Map Store by just running:
-```bash
-helm install my-graph gaffer-docker/gaffer
-```
-
-## Deploy the Accumulo Store
-If you want to deploy Accumulo with your graph, it is relatively easy to do so with some small additional configuration.
-Create a file called `accumulo.yaml` and add the following:
-```yaml
-accumulo:
-  enabled: true
-```
-
-By default, the gaffer user is created with a password of "gaffer", the `CREATE_TABLE` system permission, and full access to the `simpleGraph` table (which is coupled to the graphId). All the default Accumulo passwords are in place, so if you were to deploy this in production you should consider [changing the default Accumulo passwords](./change-accumulo-passwords.md); a minimal inline override is sketched below.
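As a sketch, `accumulo.yaml` could be extended with an override of the gaffer user's password, reusing the values paths from the passwords guide (the password shown is a placeholder):

```yaml
accumulo:
  enabled: true
  config:
    userManagement:
      users:
        gaffer:
          password: "changeme" # placeholder - choose your own
```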
-
-You can stand up the Accumulo store by running:
-```bash
-helm install my-graph gaffer-docker/gaffer -f accumulo.yaml
-```
-
-## Deploy the Federated Store
-If you want to deploy the Federated Store, all that you really need to do is set the store.properties.
-To do this, add the following to a `federated.yaml` file:
-
-```yaml
-graph:
-  storeProperties:
-    gaffer.store.class: uk.gov.gchq.gaffer.federatedstore.FederatedStore
-    gaffer.store.properties.class: uk.gov.gchq.gaffer.federatedstore.FederatedStoreProperties
-    gaffer.serialiser.json.modules: uk.gov.gchq.gaffer.sketches.serialisation.json.SketchesJsonModules
-```
-
-The addition of the SketchesJsonModules is just to ensure that if the FederatedStore was connecting to a store which used sketches, they could be rendered nicely in JSON.
-
-We can create the graph with:
-
-```bash
-helm install federated gaffer-docker/gaffer -f federated.yaml
-```
-
-# What next?
-See our [guides](./guides.md) for other things you can do with Gaffer on Kubernetes.
diff --git a/kubernetes/docs/guides.md b/kubernetes/docs/guides.md
deleted file mode 100644
index c50747b1..00000000
--- a/kubernetes/docs/guides.md
+++ /dev/null
@@ -1,11 +0,0 @@
-Guides
-========
-Here you will find all our guides for deploying Gaffer on Kubernetes.
-
-1. [Deploy on a KinD cluster locally](./kind-deployment.md)
-2. [Deploy to EKS](./aws-eks-deployment.md)
-3. [Deploy a simple empty graph](./deploy-empty-graph.md)
-4. [Add your schema](./schema.md)
-5. [Change the Graph Id and Description](./change-graph-metadata.md)
-6. [Adding your own functions and libraries](./add-libraries.md)
-7. [Changing the passwords on the Accumulo store](./change-accumulo-passwords.md)
\ No newline at end of file
diff --git a/kubernetes/docs/kind-deployment.md b/kubernetes/docs/kind-deployment.md
deleted file mode 100644
index da8ea47a..00000000
--- a/kubernetes/docs/kind-deployment.md
+++ /dev/null
@@ -1,63 +0,0 @@
-How to deploy a Kubernetes cluster using Kind
-=============================================
-The following instructions will guide you through provisioning and configuring a local Kubernetes cluster, using [kind](https://kind.sigs.k8s.io/) (Kubernetes IN Docker), that our Helm Charts can be deployed on.
-
-# Install CLI Tools
-* [docker compose](https://github.com/docker/compose/releases/latest)
-* [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/)
-* [helm](https://github.com/helm/helm/releases)
-* [kind](https://kind.sigs.k8s.io/docs/user/quick-start/)
-
-
-# Kubernetes Cluster
-Simply run the following command to spin up a local Kubernetes cluster, running inside a Docker container:
-```
-kind create cluster --image kindest/node:v1.24.4
-```
-
-
-# Container Images
-If the versions of the container images you would like to deploy are not available in [Docker Hub](https://hub.docker.com/u/gchq) then you will need to build them yourself and import them into your kind cluster.
-
-To import the images, run this from the kubernetes directory:
-
-```bash
-export HADOOP_VERSION=${HADOOP_VERSION:-3.3.3}
-export GAFFER_VERSION=${GAFFER_VERSION:-2.0.0}
-# Accumulo version baked into the gaffer images (assumed default; match your image tags)
-export ACCUMULO_VERSION=${ACCUMULO_VERSION:-2.0.1}
-
-docker compose --project-directory ../docker/accumulo/ -f ../docker/accumulo/docker-compose.yaml build
-docker compose --project-directory ../docker/gaffer-road-traffic-loader/ -f ../docker/gaffer-road-traffic-loader/docker-compose.yaml build
-
-kind load docker-image gchq/hdfs:${HADOOP_VERSION}
-kind load docker-image gchq/gaffer:${GAFFER_VERSION}-accumulo-${ACCUMULO_VERSION}
-kind load docker-image gchq/gaffer-rest:${GAFFER_VERSION}-accumulo-${ACCUMULO_VERSION}
-kind load docker-image gchq/gaffer-road-traffic-loader:${GAFFER_VERSION}
-```
-
-From here you should be able to follow the respective kind-deployment files for the services you would like to run.
-
-* [Accumulo](../accumulo/docs/kind-deployment.md)
-* [Gaffer](../gaffer/docs/kind-deployment.md)
-* [Gaffer JupyterHub](../gaffer-jhub/docs/kind-deployment.md)
-* [Gaffer Road Traffic Dataset](../gaffer-road-traffic/docs/kind-deployment.md)
-* [HDFS](../hdfs/docs/kind-deployment.md)
-
-
-# Ingress
-Deploy the Nginx Ingress Controller:
-```bash
-INGRESS_NGINX_VERSION="nginx-0.30.0"
-kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/${INGRESS_NGINX_VERSION}/deploy/static/mandatory.yaml
-kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/${INGRESS_NGINX_VERSION}/deploy/static/provider/baremetal/service-nodeport.yaml
-```
-
-# Deploy Helm Charts
-* [HDFS](../hdfs/docs/kind-deployment.md)
-* [Gaffer](../gaffer/docs/kind-deployment.md)
-* [Example Gaffer Graph containing Road Traffic Dataset](../gaffer-road-traffic/docs/kind-deployment.md)
-
-
-# Uninstall
-```
-kind delete cluster
-```
diff --git a/kubernetes/docs/schema.md b/kubernetes/docs/schema.md
deleted file mode 100644
index 333c706b..00000000
--- a/kubernetes/docs/schema.md
+++ /dev/null
@@ -1,67 +0,0 @@
-How to Deploy your own Schema
-==============================
-Gaffer uses schema files to describe the data contained in a Graph. This guide will tell you how to deploy your own schemas with a Gaffer Graph.
-
-The first thing you will need to do is deploy a simple graph. We have a guide for how to do that [here](./deploy-empty-graph.md).
-
-Once you have that deployed, we can change the schema.
-
-# Edit the schema
-If you run a GetSchema operation against the graph, you will notice that the count property is of type `java.lang.Integer` - change that property to be of type `java.lang.Long`.
-
-The easiest way to deploy a schema file is to use Helm's `--set-file` option, which lets you set a value from the contents of a file.
-
-Example of a `schema.json` file:
-
-```json
-{
-  "edges": {
-    "BasicEdge": {
-      "source": "vertex",
-      "destination": "vertex",
-      "directed": "true",
-      "properties": {
-        "count": "count"
-      }
-    }
-  },
-  "entities": {
-    "BasicEntity": {
-      "vertex": "vertex",
-      "properties": {
-        "count": "count"
-      }
-    }
-  },
-  "types": {
-    "vertex": {
-      "class": "java.lang.String"
-    },
-    "count": {
-      "class": "java.lang.Long",
-      "aggregateFunction": {
-        "class": "uk.gov.gchq.koryphe.impl.binaryoperator.Sum"
-      }
-    },
-    "true": {
-      "description": "A simple boolean that must always be true.",
-      "class": "java.lang.Boolean",
-      "validateFunctions": [
-        { "class": "uk.gov.gchq.koryphe.impl.predicate.IsTrue" }
-      ]
-    }
-  }
-}
-```
-
-# Update deployment with the new schema
-For our deployment to pick up the changes, we need to run a Helm upgrade:
-```bash
-helm upgrade my-graph gaffer-docker/gaffer --set-file graph.schema."schema\.json"=./schema.json --reuse-values
-```
-The `--reuse-values` argument tells Helm to re-use the passwords that we defined earlier.
-
-Now if you inspect the schema, you will see that the `count` property has changed to a Long.
-
-
-# What next?
-See our [guides](./guides.md) for other things you can do with Gaffer on Kubernetes.
diff --git a/kubernetes/gaffer-jhub/docs/kind-deployment.md b/kubernetes/gaffer-jhub/docs/kind-deployment.md
deleted file mode 100644
index e61bdfa4..00000000
--- a/kubernetes/gaffer-jhub/docs/kind-deployment.md
+++ /dev/null
@@ -1,56 +0,0 @@
-Deploying JupyterHub for Gaffer using kind
-==========================================
-
-All scripts listed here are intended to be run from the `kubernetes/gaffer-jhub` folder.
-
-First, follow the [instructions here](../../gaffer-road-traffic/docs/kind-deployment.md) to provision and configure a local Kubernetes cluster, using [kind](https://kind.sigs.k8s.io/) (Kubernetes IN Docker), that has an instance of the Gaffer Road Traffic example graph deployed into it.
-
-
-# Container Images
-
-Use the following commands to build and deploy the extra containers used by JupyterHub:
-```bash
-source ../../docker/gaffer-pyspark-notebook/.env
-source ../../docker/gaffer-jhub-options-server/get-version.sh
-
-# Build Container Images
-docker compose --project-directory ../../docker/gaffer-pyspark-notebook/ -f ../../docker/gaffer-pyspark-notebook/docker-compose.yaml build notebook
-docker compose --project-directory ../../docker/spark-py/ -f ../../docker/spark-py/docker-compose.yaml build
-docker compose --project-directory ../../docker/gaffer-jhub-options-server/ -f ../../docker/gaffer-jhub-options-server/docker-compose.yaml build
-
-# Deploy Images to Kind
-kind load docker-image gchq/gaffer-pyspark-notebook:${GAFFER_VERSION}
-kind load docker-image gchq/spark-py:${SPARK_VERSION}
-kind load docker-image gchq/gaffer-jhub-options-server:${JHUB_OPTIONS_SERVER_VERSION}
-```
-
-# Deploy Helm Chart
-
-Once that's done, use the following commands to deploy a JupyterHub instance with Gaffer extensions:
-```bash
-helm dependency update
-helm install jhub . -f ./values-insecure.yaml
-
-helm test jhub
-```
-
-# Accessing JupyterHub Web UIs (via `kubectl port-forward`)
-
-Run the following on the command line:
-```bash
-kubectl port-forward svc/proxy-public 8080:80
-```
-
-Access the following URL in your browser:
-http://localhost:8080
-
-By default, JupyterHub's Dummy Authenticator is used, so you can log in using any username and password.
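For reference, a sketch of how the Dummy Authenticator is typically switched on in JupyterHub chart values; the keys below follow the upstream Zero to JupyterHub chart, and the settings actually used here live in `values-insecure.yaml`:

```yaml
hub:
  config:
    JupyterHub:
      authenticator_class: dummy # accept any username/password - local development only
```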
-
-
-# Accessing example notebooks
-
-There are some example notebooks that demonstrate how to interact with HDFS, Gaffer and Spark. Copy them into your working directory to make them easier to view and execute, by starting a Terminal tab and submitting the following command:
-```bash
-cp -r /examples .
-```
-
diff --git a/kubernetes/gaffer-road-traffic/docs/aws-eks-deployment.md b/kubernetes/gaffer-road-traffic/docs/aws-eks-deployment.md
deleted file mode 100644
index ba376bc4..00000000
--- a/kubernetes/gaffer-road-traffic/docs/aws-eks-deployment.md
+++ /dev/null
@@ -1,52 +0,0 @@
-# Deploying Road Traffic Gaffer Graph on AWS EKS
-All scripts listed here are intended to be run from the kubernetes/gaffer-road-traffic folder.
-
-First follow the [instructions here](../../docs/aws-eks-deployment.md) to provision and configure an [AWS EKS](https://aws.amazon.com/eks/) cluster that the Gaffer Road Traffic Helm Chart can be deployed on.
-
-## Using ECR
-If you are hosting the container images in your AWS account, using ECR, then run the following commands to configure the Helm Chart to use them:
-
-```bash
-ACCOUNT=$(aws sts get-caller-identity --query Account --output text)
-[ "${REGION}" = "" ] && REGION=$(aws configure get region)
-[ "${REGION}" = "" ] && REGION=$(curl --silent -m 5 http://169.254.169.254/latest/dynamic/instance-identity/document | grep region | cut -d'"' -f 4)
-if [ "${REGION}" = "" ]; then
-  echo "Unable to detect AWS region, please set \$REGION"
-else
-  REPO_PREFIX="${ACCOUNT}.dkr.ecr.${REGION}.amazonaws.com/gchq"
-
-  EXTRA_HELM_ARGS=""
-  EXTRA_HELM_ARGS+="--set gaffer.accumulo.hdfs.namenode.repository=${REPO_PREFIX}/hdfs "
-  EXTRA_HELM_ARGS+="--set gaffer.accumulo.hdfs.datanode.repository=${REPO_PREFIX}/hdfs "
-  EXTRA_HELM_ARGS+="--set gaffer.accumulo.hdfs.shell.repository=${REPO_PREFIX}/hdfs "
-  EXTRA_HELM_ARGS+="--set gaffer.accumulo.image.repository=${REPO_PREFIX}/gaffer "
-  EXTRA_HELM_ARGS+="--set gaffer.api.image.repository=${REPO_PREFIX}/gaffer-rest "
-  EXTRA_HELM_ARGS+="--set loader.image.repository=${REPO_PREFIX}/gaffer-road-traffic-loader "
-fi
-```
-
-## Deploy Helm Chart
-
-The last thing before deploying is to set the passwords for the various Accumulo users in the values.yaml file. These are found under `accumulo.config.userManagement`.
-
-Finally, deploy the Helm Chart by running this from the kubernetes/gaffer-road-traffic folder:
-
-```bash
-export HADOOP_VERSION=${HADOOP_VERSION:-3.3.3}
-export GAFFER_VERSION=${GAFFER_VERSION:-2.0.0}
-
-helm dependency update ../accumulo/
-helm dependency update ../gaffer/
-helm dependency update
-
-helm install road-traffic . -f ./values-eks-alb.yaml \
-  ${EXTRA_HELM_ARGS} \
-  --set gaffer.hdfs.namenode.tag=${HADOOP_VERSION} \
-  --set gaffer.hdfs.datanode.tag=${HADOOP_VERSION} \
-  --set gaffer.hdfs.shell.tag=${HADOOP_VERSION} \
-  --set gaffer.accumulo.image.tag=${GAFFER_VERSION} \
-  --set gaffer.api.image.tag=${GAFFER_VERSION} \
-  --set loader.image.tag=${GAFFER_VERSION}
-
-helm test road-traffic
-```
diff --git a/kubernetes/gaffer-road-traffic/docs/kind-deployment.md b/kubernetes/gaffer-road-traffic/docs/kind-deployment.md
deleted file mode 100644
index 4f91ab44..00000000
--- a/kubernetes/gaffer-road-traffic/docs/kind-deployment.md
+++ /dev/null
@@ -1,60 +0,0 @@
-# Deploying Road Traffic Gaffer Graph using kind
-All scripts listed here are intended to be run from the kubernetes/gaffer-road-traffic folder.
-
-First follow the [instructions here](../../docs/kind-deployment.md) to provision and configure a local Kubernetes cluster, using [kind](https://kind.sigs.k8s.io/) (Kubernetes IN Docker), that the Gaffer Road Traffic Helm Chart can be deployed on.
-
-After the cluster is provisioned, update the values.yaml with the passwords for the various Accumulo users. These are found under `accumulo.config.userManagement`.
-
-Once that's done, run this to deploy and test the Road Traffic Graph:
-```bash
-export HADOOP_VERSION=${HADOOP_VERSION:-3.3.3}
-export GAFFER_VERSION=${GAFFER_VERSION:-2.0.0}
-
-helm dependency update ../accumulo/
-helm dependency update ../gaffer/
-helm dependency update
-
-helm install road-traffic . \
-  --set gaffer.hdfs.namenode.tag=${HADOOP_VERSION} \
-  --set gaffer.hdfs.datanode.tag=${HADOOP_VERSION} \
-  --set gaffer.hdfs.shell.tag=${HADOOP_VERSION} \
-  --set gaffer.accumulo.image.tag=${GAFFER_VERSION} \
-  --set gaffer.api.image.tag=${GAFFER_VERSION} \
-  --set loader.image.tag=${GAFFER_VERSION}
-
-helm test road-traffic
-```
-
-
-## Accessing Web UIs (via `kubectl port-forward`)
-
-| Component   | Command                                                           | URL                         |
-| ----------- | ----------------------------------------------------------------- | --------------------------- |
-| HDFS        | `kubectl port-forward svc/road-traffic-hdfs-namenodes 9870:9870`  | http://localhost:9870/      |
-| Accumulo    | `kubectl port-forward svc/road-traffic-gaffer-monitor 9995:80`    | http://localhost:9995/      |
-| Gaffer REST | `kubectl port-forward svc/road-traffic-gaffer-api 8080:80`        | http://localhost:8080/rest/ |
-
-
-## Accessing Web UIs (via [Nginx Ingress Controller](https://github.com/kubernetes/ingress-nginx))
-
-Register the FQDNs for each component in DNS, e.g.:
-```
-echo "127.0.0.1 gaffer.k8s.local accumulo.k8s.local hdfs.k8s.local" | sudo tee -a /etc/hosts
-```
-
-Update the Gaffer deployment to route ingress based on FQDNs:
-```
-helm upgrade road-traffic . -f ./values-host-based-ingress.yaml --reuse-values
-```
-
-Set up port forwarding to the nginx ingress controller:
-```
-sudo KUBECONFIG=$HOME/.kube/config kubectl port-forward -n ingress-nginx svc/ingress-nginx 80:80
-```
-
-Access the web UIs using the following URLs:
-| Component   | URL                           |
-| ----------- | ----------------------------- |
-| HDFS        | http://hdfs.k8s.local/        |
-| Accumulo    | http://accumulo.k8s.local/    |
-| Gaffer REST | http://gaffer.k8s.local/rest/ |
diff --git a/kubernetes/gaffer/docs/aws-eks-deployment.md b/kubernetes/gaffer/docs/aws-eks-deployment.md
deleted file mode 100644
index aa8597f3..00000000
--- a/kubernetes/gaffer/docs/aws-eks-deployment.md
+++ /dev/null
@@ -1,51 +0,0 @@
-# Deploying Gaffer on AWS EKS
-All scripts listed here are intended to be run from the kubernetes/gaffer folder.
-
-First follow the [instructions here](../../docs/aws-eks-deployment.md) to provision and configure an [AWS EKS](https://aws.amazon.com/eks/) cluster that the Gaffer Helm Chart can be deployed on.
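Before configuring the charts, it can help to confirm that kubectl is pointed at the right cluster (a quick sanity check; `gaffer` is the default cluster name used in the EKS guide):

```bash
# Point kubectl at the EKS cluster and confirm the worker nodes are Ready
aws eks update-kubeconfig --name "${EKS_CLUSTER_NAME:-gaffer}"
kubectl get nodes
```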
-
-## Using ECR
-If you are hosting the container images in your AWS account, using ECR, then run the following commands to configure the Helm Chart to use them:
-
-```bash
-ACCOUNT=$(aws sts get-caller-identity --query Account --output text)
-[ "${REGION}" = "" ] && REGION=$(aws configure get region)
-[ "${REGION}" = "" ] && REGION=$(curl --silent -m 5 http://169.254.169.254/latest/dynamic/instance-identity/document | grep region | cut -d'"' -f 4)
-if [ "${REGION}" = "" ]; then
-  echo "Unable to detect AWS region, please set \$REGION"
-else
-  REPO_PREFIX="${ACCOUNT}.dkr.ecr.${REGION}.amazonaws.com/gchq"
-
-  EXTRA_HELM_ARGS=""
-  EXTRA_HELM_ARGS+="--set gaffer.hdfs.namenode.repository=${REPO_PREFIX}/hdfs "
-  EXTRA_HELM_ARGS+="--set gaffer.hdfs.datanode.repository=${REPO_PREFIX}/hdfs "
-  EXTRA_HELM_ARGS+="--set gaffer.hdfs.shell.repository=${REPO_PREFIX}/hdfs "
-  EXTRA_HELM_ARGS+="--set gaffer.accumulo.image.repository=${REPO_PREFIX}/gaffer "
-  EXTRA_HELM_ARGS+="--set gaffer.api.image.repository=${REPO_PREFIX}/gaffer-rest "
-  EXTRA_HELM_ARGS+="--set loader.image.repository=${REPO_PREFIX}/gaffer-road-traffic-loader "
-fi
-```
-
-## Deploy Helm Chart
-
-By default, the Gaffer graph uses the in-memory MapStore. If you want to use an alternative store, we have a guide for that [here](../../docs/deploy-empty-graph.md).
-
-
-```bash
-export HADOOP_VERSION=${HADOOP_VERSION:-3.3.3}
-export GAFFER_VERSION=${GAFFER_VERSION:-2.0.0}
-
-helm dependency update ../accumulo/
-helm dependency update ../gaffer/
-helm dependency update
-
-helm install gaffer . -f ./values-eks-alb.yaml \
-  ${EXTRA_HELM_ARGS} \
-  --set gaffer.accumulo.hdfs.namenode.tag=${HADOOP_VERSION} \
-  --set gaffer.accumulo.hdfs.datanode.tag=${HADOOP_VERSION} \
-  --set gaffer.accumulo.hdfs.shell.tag=${HADOOP_VERSION} \
-  --set gaffer.accumulo.image.tag=${GAFFER_VERSION} \
-  --set gaffer.api.image.tag=${GAFFER_VERSION} \
-  --set loader.image.tag=${GAFFER_VERSION}
-
-helm test gaffer
-```
diff --git a/kubernetes/gaffer/docs/kind-deployment.md b/kubernetes/gaffer/docs/kind-deployment.md
deleted file mode 100644
index 21080ee7..00000000
--- a/kubernetes/gaffer/docs/kind-deployment.md
+++ /dev/null
@@ -1,54 +0,0 @@
-Deploying Gaffer using kind
-============================
-
-All the scripts found here are designed to be run from the kubernetes/gaffer folder.
-
-First follow the [instructions here](../../docs/kind-deployment.md) to provision and configure a local Kubernetes cluster, using [kind](https://kind.sigs.k8s.io/) (Kubernetes IN Docker), that the Gaffer Helm Chart can be deployed on.
-
-The standard Gaffer deployment will give you an in-memory store. To change this, see [our comprehensive guide](../../docs/deploy-empty-graph.md) on changing the store type.
-
-```bash
-export HADOOP_VERSION=${HADOOP_VERSION:-3.3.3}
-export GAFFER_VERSION=${GAFFER_VERSION:-2.0.0}
-
-helm dependency update
-
-helm install gaffer . \
-  --set hdfs.namenode.tag=${HADOOP_VERSION} \
-  --set hdfs.datanode.tag=${HADOOP_VERSION} \
-  --set hdfs.shell.tag=${HADOOP_VERSION} \
-  --set accumulo.image.tag=${GAFFER_VERSION} \
-  --set api.image.tag=${GAFFER_VERSION}
-
-helm test gaffer
-```
-
-
-# Accessing Web UIs (via `kubectl port-forward`)
-
-| Component   | Command                                       | URL                         |
-| ----------- | --------------------------------------------- | --------------------------- |
-| Gaffer REST | `kubectl port-forward svc/gaffer-api 8080:80` | http://localhost:8080/rest/ |
-
-
-# Accessing Web UIs (via [Nginx Ingress Controller](https://github.com/kubernetes/ingress-nginx))
-
-Register the FQDNs for each component in DNS, e.g.:
-```
-echo "127.0.0.1 gaffer.k8s.local accumulo.k8s.local hdfs.k8s.local" | sudo tee -a /etc/hosts
-```
-
-Update the Gaffer deployment to route ingress based on FQDNs:
-```
-helm upgrade gaffer . -f ./values-host-based-ingress.yaml --reuse-values
-```
-
-Set up port forwarding to the nginx ingress controller:
-```
-sudo KUBECONFIG=$HOME/.kube/config kubectl port-forward -n ingress-nginx svc/ingress-nginx 80:80
-```
-
-Access the web UIs using the following URLs:
-| Component   | URL                           |
-| ----------- | ----------------------------- |
-| Gaffer REST | http://gaffer.k8s.local/rest/ |
\ No newline at end of file
diff --git a/kubernetes/hdfs/docs/aws-eks-deployment.md b/kubernetes/hdfs/docs/aws-eks-deployment.md
deleted file mode 100644
index e423ba43..00000000
--- a/kubernetes/hdfs/docs/aws-eks-deployment.md
+++ /dev/null
@@ -1,37 +0,0 @@
-# Deploying HDFS on AWS EKS
-All scripts listed here are intended to be run from the kubernetes/hdfs folder.
-
-First follow the [instructions here](../../docs/aws-eks-deployment.md) to provision and configure an [AWS EKS](https://aws.amazon.com/eks/) cluster that the HDFS Helm Chart can be deployed on.
-
-## Using ECR
-If you are hosting the container images in your AWS account, using ECR, then run the following commands to configure the Helm Chart to use them:
-
-```bash
-ACCOUNT=$(aws sts get-caller-identity --query Account --output text)
-[ "${REGION}" = "" ] && REGION=$(aws configure get region)
-[ "${REGION}" = "" ] && REGION=$(curl --silent -m 5 http://169.254.169.254/latest/dynamic/instance-identity/document | grep region | cut -d'"' -f 4)
-if [ "${REGION}" = "" ]; then
-  echo "Unable to detect AWS region, please set \$REGION"
-else
-  REPO_PREFIX="${ACCOUNT}.dkr.ecr.${REGION}.amazonaws.com/gchq"
-
-  EXTRA_HELM_ARGS=""
-  EXTRA_HELM_ARGS+="--set namenode.repository=${REPO_PREFIX}/hdfs "
-  EXTRA_HELM_ARGS+="--set datanode.repository=${REPO_PREFIX}/hdfs "
-  EXTRA_HELM_ARGS+="--set shell.repository=${REPO_PREFIX}/hdfs "
-fi
-```
-
-## Deploy Helm chart
-
-```bash
-export HADOOP_VERSION=${HADOOP_VERSION:-3.3.3}
-
-helm install hdfs . -f ./values-eks-alb.yaml \
-  ${EXTRA_HELM_ARGS} \
-  --set hdfs.namenode.tag=${HADOOP_VERSION} \
-  --set hdfs.datanode.tag=${HADOOP_VERSION} \
-  --set hdfs.shell.tag=${HADOOP_VERSION}
-
-helm test hdfs
-```
diff --git a/kubernetes/hdfs/docs/kind-deployment.md b/kubernetes/hdfs/docs/kind-deployment.md
deleted file mode 100644
index e21ced73..00000000
--- a/kubernetes/hdfs/docs/kind-deployment.md
+++ /dev/null
@@ -1,45 +0,0 @@
-# Deploying HDFS using kind
-All scripts listed here are intended to be run from the kubernetes/hdfs folder.
-
-First follow the [instructions here](../../docs/kind-deployment.md) to provision and configure a local Kubernetes cluster, using [kind](https://kind.sigs.k8s.io/) (Kubernetes IN Docker), that the HDFS Helm Chart can be deployed on.
-
-## Deploying Helm charts
-
-```bash
-export HADOOP_VERSION=${HADOOP_VERSION:-3.3.3}
-
-helm install hdfs . \
-  --set hdfs.namenode.tag=${HADOOP_VERSION} \
-  --set hdfs.datanode.tag=${HADOOP_VERSION} \
-  --set hdfs.shell.tag=${HADOOP_VERSION}
-
-helm test hdfs
-```
-
-## Accessing Web UI (via `kubectl port-forward`)
-
-```
-kubectl port-forward svc/hdfs-namenodes 9870:9870
-```
-
-Then browse to: http://localhost:9870
-
-
-## Accessing Web UI (via [Nginx Ingress Controller](https://github.com/kubernetes/ingress-nginx))
-
-Register the FQDNs for each component in DNS, e.g.:
-```bash
-echo "127.0.0.1 hdfs.k8s.local" | sudo tee -a /etc/hosts
-```
-
-Update the HDFS deployment to route ingress based on FQDNs:
-```bash
-helm upgrade hdfs . -f ./values-host-based-ingress.yaml --reuse-values
-```
-
-Set up port forwarding to the nginx ingress controller:
-```bash
-sudo KUBECONFIG=$HOME/.kube/config kubectl port-forward -n ingress-nginx svc/ingress-nginx 80:80
-```
-
-Then browse to: http://hdfs.k8s.local
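When you are finished, the release and the local cluster can be removed with the same tools used above:

```bash
# Remove the HDFS release, then tear down the kind cluster itself
helm uninstall hdfs
kind delete cluster
```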