
Riley Swish Challenge - DevSecOps Style Automated Repository


Objective

The objective of this repo is to create an automated, DevSecOps-style repo for use in k8s. It should contain automated code reporting and fixes, plus the ability to build Docker images in a workflow pipeline via GitHub Actions.

Technical Requirements

  • Image must contain python2, python3, and R runtimes.
  • Image must be k8s compatible.
  • Automated GitHub Actions to build, scan, and push the image to Docker Hub.
  • Scan for CVEs and remedy them. As simple as it would be to use Docker Scout here, Trivy is probably more efficient imo. Trivy plus Dependabot PRs is practically a no-brainer in automated pipelines/GH Actions.
  • I included a bad version of Python's 'requests' in my requirements.txt to show a HIGH vuln (CVE-2018-18074); see the snippet after this list.
  • Repo must offer automated code reporting and fixing.
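
For reference, any requests pin older than the 2.20.0 fix reproduces that finding. The exact line in requirements.txt may differ, but it amounts to something like:

requests==2.19.1

Trivy will flag it in filesystem mode as well as in the image scan:

trivy fs --severity HIGH,CRITICAL .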

Docker Overview

This image uses the following on a base of ubuntu:22.04:

  • R base
  • Python 2.7
  • Python 3.10
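
A minimal sketch of that Dockerfile, assuming stock jammy packages (the repo's actual file may pin versions or differ in cleanup):

FROM ubuntu:22.04

# python2.7, python3 (3.10 on jammy), and r-base all come from the Ubuntu archives
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        python2.7 \
        python3 \
        python3-pip \
        r-base && \
    rm -rf /var/lib/apt/lists/*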

User Story / Implementation Notes

Why am I using ubuntu22.04 and not a multi-stage build?

Ubuntu 22.04 is still LTS and supports python2 + python3. You could definitely do a multi-stage build, but for the sake of having something to talk about, I wanted to cover how this could be improved on.

Right now, with no cache, the image builds locally in about 35s according to Docker BuildKit. Obviously, if I were in an enterprise GitHub org rather than on shared runners, builds on self-hosted runners could be much faster.

I generally find myself leaning on the Actions Runner Controller Helm chart for faster builds on the dedicated runners themselves.

At some level with this challenge, there are a few limitations from not having access to:

  • A real production-grade k8s cluster
  • An enterprise GitHub org (GitHub Security SARIF report posting only works within private repos in enterprise orgs). It'd be nice to have Trivy post to this; a sketch of what that step would look like follows this list.
  • Some kind of ALB, ingress route setup, etc. (a publicly exposable endpoint for the Service in front of the Deploy); the challenge specifically asked me to touch on this.
  • Probably Argo to deploy things from Helm charts
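
For what it's worth, the Trivy-to-Security-tab step would look roughly like this (step names and action versions are illustrative):

- name: Scan image with Trivy
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: sadminriley/python-test
    format: sarif
    output: trivy-results.sarif

- name: Upload SARIF to GitHub Security
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: trivy-results.sarif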

Why did you leave our GPU requirement out?

TBH - it's poor practice to put in the GPU requirement without a real templating system like Helm/Kustomize/etc. (chef's choice here, really; whatever templating you may be using). You could indeed make a completely separate template file for GPU nodes, but for simplicity's sake I left it out and wanted to explain myself. I wanted to show some GitHub Actions purism on purpose here.

aka - 'Here is what I can do in a private k8s + Docker msvc repo with automated sec. using nothing but GitHub Actions' :)

You could of course use nvidia.com/gpu: "0" and define the GPUs as 0, but depending on what you are running, I wouldn't want to present this as a GPU workload when it is not. Obviously nodeSelectors + taints/tolerations could be used too (probably), but again that sort of lands on "hacky" by definition. A sketch of the relevant Deployment fragment is below.
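
If the requirement were templated in, the relevant piece of the Deployment's pod spec would look something like this (the node label and taint are illustrative, and the NVIDIA device plugin must be running on the node):

spec:
  containers:
    - name: app
      resources:
        limits:
          nvidia.com/gpu: 1   # requires the NVIDIA device plugin
  nodeSelector:
    gpu: "true"               # illustrative label
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule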

Minikube setup

This can be run locally with minikube for testing purposes, to verify k8s compatibility, and to check the run-forever pod. minikube was the practical choice for this demo, imo.

  • Please follow the appropriate minikube install for your OS from the official source
  • Enable metrics-server via minikube addons enable metrics-server - this is for our HPA step later
  • Load the image with minikube image load sadminriley/python-test
  • Verify you've loaded the image if needed with the following commands:
minikube ssh
docker@minikube:~$ crictl images | grep python-test
  • Apply the k8s Deployment + Service with kubectl apply -f ops/
  • Expose the service via the Deployment itself:
kubectl expose deployment python-swish-r-deploy --port 8080
service/python-swish-r-deploy exposed
  • You can also forward the port via kubectl port-forward svc/python-swish-r-deploy 8080:8080
  • Optionally, open a shell on the container and run Python to verify the port works

kubectl exec -it deploy/python-swish-r-deploy -- bash

I just launched a basic http.server in the image via shell to demonstrate this.

appuser@python-swish-r-deploy-6959f9c5c6-ktv58:/app$ python3 -m http.server 8080
Serving HTTP on 0.0.0.0 port 8080 (http://0.0.0.0:8080/) ...
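
With the kubectl port-forward from earlier running, you can confirm the endpoint from the host:

curl -I http://localhost:8080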

HPA

HPA works alongside metrics-server as usual with minikube; however, I obviously cannot really demo it against a real k8s workload.

kubectl get hpa
NAME                 REFERENCE                          TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
python-swish-r-hpa   Deployment/python-swish-r-deploy   cpu: 2%/50%   1         3         1          67m

HPA scales based on pod metrics, which are supplied by metrics-server.
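
The rendered hpa.yaml behind that output would look roughly like this (reconstructed from the kubectl output above, so treat the exact fields as illustrative):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: python-swish-r-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: python-swish-r-deploy
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50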

How do we scale resources based on events rather than resources?

  • I mostly go for KEDA in this situation. Remember, HPA scales based on your pods' reported metrics usage and only scales ReplicaSets (the k8s-managed resource behind a Deployment's defined replica count). KEDA scales based on external events and, in my experience, works really well alongside HPA.

Some reasons to scale with KEDA:

  • Cron schedules
  • Custom external metrics
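
A cron-based ScaledObject for this Deployment would look something like this (the schedule and timezone are illustrative):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: python-swish-r-scaler
spec:
  scaleTargetRef:
    name: python-swish-r-deploy
  triggers:
    - type: cron
      metadata:
        timezone: America/Chicago
        start: 0 8 * * *
        end: 0 18 * * *
        desiredReplicas: "3"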

How do we scale nodes though?!

On EKS??! We're always going to handle nodes with Karpenter NodePools. It's an absolute must-have in most EKS clusters, in my opinion. The money-saving measures combined with KEDA are just plain awesome (that's less an opinion than well known at this point). Or just cluster-autoscaler for incredibly basic clusters.

Observability/Logging

Most real enterprise orgs are going to be using Datadog (my personal favorite), New Relic, or the Prometheus/Grafana stack we see in pure open-source k8s. With that last one, Prometheus's native Alertmanager can also be used.

Bringing data to Memory?

I feel like you would probably default to a shared-memory volume of some kind here, but to be honest, I'd love to hear the actual answer to this.
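
If that guess is right, the standard k8s primitive is a tmpfs-backed emptyDir (names and size are illustrative):

spec:
  containers:
    - name: app
      volumeMounts:
        - name: shm
          mountPath: /dev/shm
  volumes:
    - name: shm
      emptyDir:
        medium: Memory   # backs the volume with tmpfs (RAM)
        sizeLimit: 1Gi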

Room for improvement

  • Lots, probably! Out of respect for everyone's time (yours and mine both), I wanted to keep this mostly simple.

Actions and How to Use This Repo

There are a total of three usable workflows in this repo.

Ephemeral Docker Build and Push + Generate K8s artifacts

  • dev_dispatch.yml - This is a manual dispatch workflow that builds base Ubuntu/Debian images with python2, python3, and R installed, per our Technical Requirements. It builds with no arguments, and you can click Run Workflow if you do not need to change the defaults. It looks like the below:
[screenshot: Run Workflow dispatch form]

This also bundles the k8s manifests rendered in the pipeline into the uploaded build artifact: [screenshot: build artifact with rendered manifests]

This outputs the following rendered with your values: deploy.yaml, hpa.yaml, service.yaml

You definitely could output the k8s manifests in the summary too, but it just looks messy.
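
For context, the manual trigger in dev_dispatch.yml presumably looks something like this (the input name and default are illustrative):

on:
  workflow_dispatch:
    inputs:
      image_tag:
        description: Tag for the built image
        required: false
        default: latest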

Build and Push to Dockerhub

  • dockerhub.yml - This workflow builds and pushes to Docker Hub on every push to the main branch. You may want to make this a manual trigger depending on the use case, or on where you place governance in the git repo. I made it push to Docker Hub automatically for automation demo reasons.
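
The core of that workflow would be roughly the following (action versions and secret names are illustrative):

on:
  push:
    branches: [main]

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: sadminriley/python-test:latest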

Trivy + Dependabot

  • trivy.yml - This scans every new PR for CVEs on the image. Dependabot then opens PRs for fixes related to the image's open-source packages, GitHub Actions, and Docker base images.
[screenshot: Trivy scan results]

There is no application code in this repo, so there is no need to enable code scanning. In a more realistic prod setup, we would obviously want scanning enabled on the code.

Going to Insights > Dependency graph will show you not only the dependency graph, but will also generate an SBOM if you need one.

The Dependabot config is pretty basic, covering the ecosystems listed above. It opens PRs based on recommended fixes, and you can certainly set them to auto-merge, but I prefer to review them.

[screenshot: Dependabot PRs]
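
That config amounts to roughly the following (the schedule interval is illustrative):

version: 2
updates:
  - package-ecosystem: pip
    directory: /
    schedule:
      interval: weekly
  - package-ecosystem: github-actions
    directory: /
    schedule:
      interval: weekly
  - package-ecosystem: docker
    directory: /
    schedule:
      interval: weekly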

In theory, as long as we stay vigilant and actually use the Dependabot PRs to remedy the CVEs it finds and opens, we are in a decent spot for automated security.
