
Riley Swish Challenge - DevSecOps Style Automated Repository


Objective

The objective of this repo is to create an automated, DevSecOps-style repo for use in k8s. It should contain automated code reporting and fixes, plus the ability to build Docker images in a workflow pipeline via GitHub Actions.

Technical Requirements

  • Image must contain python2, python3, and R runtimes.
  • Image must be k8s compatible.
  • Automated GitHub Actions to build, scan, and push the image to Docker Hub.
  • Scan for CVEs and remedy them. As simple as it would be to use Docker Scout here, Trivy is probably more efficient imo. Trivy plus Dependabot PRs is practically a no-brainer in automated pipelines/GH Actions.
  • I included a bad version of Python's 'requests' in my requirements.txt to show a HIGH vuln (CVE-2018-18074); see the snippet after this list.
  • Repo must offer automated code reporting and fixing.
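
For reference, any requests pin older than the 2.20.0 fix reproduces that finding. The exact line in requirements.txt may differ, but it amounts to something like:

requests==2.19.1

Trivy will flag it in filesystem mode as well as in the image scan:

trivy fs --severity HIGH,CRITICAL .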

Docker Overview

This image uses the following on a base of ubuntu:22.04:

  • R base
  • Python 2.7
  • Python 3.10
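
A minimal sketch of that Dockerfile, assuming stock jammy packages (the repo's actual file may pin versions or differ in cleanup):

FROM ubuntu:22.04

# python2.7, python3 (3.10 on jammy), and r-base all come from the Ubuntu archives
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        python2.7 \
        python3 \
        python3-pip \
        r-base && \
    rm -rf /var/lib/apt/lists/*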

User Story / Implementation Notes

Why am I using ubuntu22.04 and not a multi-stage build?

Ubuntu 22.04 is still LTS and supports python2 + python3. You could definitely do a multi-stage build, but for the sake of having something to talk about, I wanted to cover how this could be improved on.

Right now, with no cache, the image builds locally in about 35s according to Docker BuildKit. Obviously, if I were in an enterprise GitHub org rather than on shared runners, builds on self-hosted runners could be much faster.

I generally find myself leaning on the Actions Runner Controller Helm chart for faster builds on the dedicated runners themselves.

At some level with this challenge, there are a few limitations from not having access to:

  • A real production-grade k8s cluster
  • An enterprise GitHub org (GitHub Security SARIF report posting only works within private repos in enterprise orgs). It'd be nice to have Trivy post to this; a sketch of what that step would look like follows this list.
  • Some kind of ALB, ingress route setup, etc. (a publicly exposable endpoint for the Service in front of the Deploy); the challenge specifically asked me to touch on this.
  • Probably Argo to deploy things from Helm charts
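
For what it's worth, the Trivy-to-Security-tab step would look roughly like this (step names and action versions are illustrative):

- name: Scan image with Trivy
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: sadminriley/python-test
    format: sarif
    output: trivy-results.sarif

- name: Upload SARIF to GitHub Security
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: trivy-results.sarif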

Why did you leave our GPU requirement out?

TBH - it's poor practice to put in the GPU requirement without a real templating system like Helm/Kustomize/etc. (chef's choice here, really; whatever templating you may be using). You could indeed make a completely separate template file for GPU nodes, but for simplicity's sake I left it out and wanted to explain myself. I wanted to show some GitHub Actions purism on purpose here.

aka - 'Here is what I can do in a private k8s + Docker msvc repo with automated sec. using nothing but GitHub Actions' :)

You could of course use nvidia.com/gpu: "0" and define the GPUs as 0, but depending on what you are running, I wouldn't want to present this as a GPU workload when it is not. Obviously nodeSelectors + taints/tolerations could be used too (probably), but again that sort of lands on "hacky" by definition. A sketch of the relevant Deployment fragment is below.
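
If the requirement were templated in, the relevant piece of the Deployment's pod spec would look something like this (the node label and taint are illustrative, and the NVIDIA device plugin must be running on the node):

spec:
  containers:
    - name: app
      resources:
        limits:
          nvidia.com/gpu: 1   # requires the NVIDIA device plugin
  nodeSelector:
    gpu: "true"               # illustrative label
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule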

Minikube setup

This can be run locally with minikube for testing purposes, to verify k8s compatibility, and to check the run-forever pod. minikube was the practical choice for this demo, imo.

  • Please follow the appropriate minikube install for your OS from the official source
  • Enable metrics-server via minikube addons enable metrics-server - this is for our HPA step later
  • Load the image with minikube image load sadminriley/python-test
  • Verify you've loaded the image if needed with the following commands:
minikube ssh
docker@minikube:~$ crictl images | grep python-test
  • Apply the k8s Deployment + Service with kubectl apply -f ops/
  • Expose the service via the Deployment itself:
kubectl expose deployment python-swish-r-deploy --port 8080
service/python-swish-r-deploy exposed
  • You can also forward the port via kubectl port-forward svc/python-swish-r-deploy 8080:8080
  • Optionally, open a shell on the container and run Python to verify the port works

kubectl exec -it deploy/python-swish-r-deploy -- bash

I just launched a basic http.server in the image via shell to demonstrate this.

appuser@python-swish-r-deploy-6959f9c5c6-ktv58:/app$ python3 -m http.server 8080
Serving HTTP on 0.0.0.0 port 8080 (http://0.0.0.0:8080/) ...
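
With the kubectl port-forward from earlier running, you can confirm the endpoint from the host:

curl -I http://localhost:8080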

HPA

HPA works alongside metrics-server as usual with minikube; however, I obviously cannot really demo it against a real k8s workload.

kubectl get hpa
NAME                 REFERENCE                          TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
python-swish-r-hpa   Deployment/python-swish-r-deploy   cpu: 2%/50%   1         3         1          67m

HPA scales based on pod metrics, which are supplied by metrics-server.
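
The rendered hpa.yaml behind that output would look roughly like this (reconstructed from the kubectl output above, so treat the exact fields as illustrative):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: python-swish-r-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: python-swish-r-deploy
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50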

How do we scale resources based on events rather than resources?

  • I mostly go for KEDA in this situation. Remember, HPA scales based on your pods' reported metrics usage and only scales ReplicaSets (the k8s-managed resource behind a Deployment's defined replica count). KEDA scales based on external events and, in my experience, works really well alongside HPA.

Some reasons to scale with KEDA:

  • Cron schedules
  • Custom external metrics
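
A cron-based ScaledObject for this Deployment would look something like this (the schedule and timezone are illustrative):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: python-swish-r-scaler
spec:
  scaleTargetRef:
    name: python-swish-r-deploy
  triggers:
    - type: cron
      metadata:
        timezone: America/Chicago
        start: 0 8 * * *
        end: 0 18 * * *
        desiredReplicas: "3"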

How do we scale nodes though?!

On EKS??! We're always going to handle nodes with Karpenter NodePools. It's an absolute must-have in most EKS clusters, in my opinion. The money-saving measures combined with KEDA are just plain awesome (that's less an opinion than well known at this point). Or just cluster-autoscaler for incredibly basic clusters.

Observability/Logging

Most real enterprise orgs are going to be using Datadog (my personal favorite), New Relic, or the Prometheus/Grafana stack we see in pure open-source k8s. With that last one, Prometheus's native Alertmanager can also be used.

Bringing data to Memory?

I feel like you would probably default to a shared-memory volume of some kind here, but to be honest, I'd love to hear the actual answer to this.
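
If that guess is right, the standard k8s primitive is a tmpfs-backed emptyDir (names and size are illustrative):

spec:
  containers:
    - name: app
      volumeMounts:
        - name: shm
          mountPath: /dev/shm
  volumes:
    - name: shm
      emptyDir:
        medium: Memory   # backs the volume with tmpfs (RAM)
        sizeLimit: 1Gi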

Room for improvement

  • Lots, probably! Out of respect for everyone's time (yours and mine both), I wanted to keep this mostly simple.

Actions and How to Use This Repo

There are a total of three usable workflows in this repo.

Ephemeral Docker Build and Push + Generate K8s artifacts

  • dev_dispatch.yml - This is a manual dispatch workflow that builds base Ubuntu/Debian images with python2, python3, and R installed, per our Technical Requirements. It builds with no arguments, and you can click Run Workflow if you do not need to change the defaults. It looks like the below:
[screenshot: Run Workflow dispatch form]

This also bundles the k8s manifests rendered in the pipeline into the uploaded build artifact: [screenshot: build artifact with rendered manifests]

This outputs the following rendered with your values: deploy.yaml, hpa.yaml, service.yaml

You definitely could output the k8s manifests in the summary too, but it just looks messy.
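
For context, the manual trigger in dev_dispatch.yml presumably looks something like this (the input name and default are illustrative):

on:
  workflow_dispatch:
    inputs:
      image_tag:
        description: Tag for the built image
        required: false
        default: latest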

Build and Push to Dockerhub

  • dockerhub.yml - This workflow builds and pushes to Docker Hub on every push to the main branch. You may want to make this a manual trigger depending on the use case, or on where you place governance in the git repo. I made it push to Docker Hub automatically for automation demo reasons.
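
The core of that workflow would be roughly the following (action versions and secret names are illustrative):

on:
  push:
    branches: [main]

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: sadminriley/python-test:latest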

Trivy + Dependabot

  • trivy.yml - This scans every new PR for CVEs on the image. Dependabot then opens PRs for fixes related to the image's open-source packages, GitHub Actions, and Docker base images.
[screenshot: Trivy scan results]

There is no application code in this repo, so there is no need to enable code scanning. In a more realistic prod setup, we would obviously want scanning enabled on the code.

Going to Insights > Dependency graph will show you not only the dependency graph, but will also generate an SBOM if you need one.

The Dependabot config is pretty basic, covering the ecosystems listed above. It opens PRs based on recommended fixes, and you can certainly set them to auto-merge, but I prefer to review them.

[screenshot: Dependabot PRs]
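
That config amounts to roughly the following (the schedule interval is illustrative):

version: 2
updates:
  - package-ecosystem: pip
    directory: /
    schedule:
      interval: weekly
  - package-ecosystem: github-actions
    directory: /
    schedule:
      interval: weekly
  - package-ecosystem: docker
    directory: /
    schedule:
      interval: weekly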

In theory, as long as we stay vigilant and actually use the Dependabot PRs to remedy the CVEs it finds and opens, we are in a decent spot for automated security.
