Added demo.
- Added helper scripts for running the apps.
- Added templates to express affinity and anti-affinity.
- Updated/Added readme appropriately.
- Added asciicast where appropriate.
balajismaniam authored and ConnorDoyle committed Sep 30, 2016
1 parent 41abde1 commit 320845f
Showing 15 changed files with 445 additions and 15 deletions.
1 change: 1 addition & 0 deletions .dockerignore
@@ -1,3 +1,4 @@
node-feature-discovery
node-feature-discovery-job.json
vendor/
demo/
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,6 +1,8 @@
node-feature-discovery
node-feature-discovery-job.json
vendor/
demo/helper-scripts/*.pdf
demo/helper-scripts/*.log
intel-cmt-cat/
rdt-discovery/l2-alloc-discovery
rdt-discovery/l3-alloc-discovery
24 changes: 9 additions & 15 deletions README.md
@@ -14,6 +14,7 @@
- [Targeting nodes with specific features](#targeting-nodes-with-specific-features)
- [References](#references)
- [License](#license)
- [Demo](#demo)

_**NOTE:** We are gathering evidence in order to graduate from the Kubernetes incubator. If you are a user of the project, please add yourself to [this list](https://github.com/kubernetes-incubator/node-feature-discovery/wiki/Users) with as much detail as you are comfortable providing (name and email optional)._

@@ -25,20 +26,7 @@ those features using node labels.

## Command line interface

```
node-feature-discovery.
Usage:
node-feature-discovery [--no-publish --sources=<sources>]
node-feature-discovery -h | --help
node-feature-discovery --version
Options:
-h --help Show this screen.
--version Output version and exit.
--sources=<sources> Comma separated list of feature sources. [Default: cpuid,rdt,pstate]
--no-publish Do not publish discovered features to the cluster-local Kubernetes API server.
```
[![asciicast](https://asciinema.org/a/arabw7ch52jev90sjjk242vh9.png)](https://asciinema.org/a/arabw7ch52jev90sjjk242vh9)

## Feature discovery

@@ -115,7 +103,9 @@ repo that demonstrates how to deploy the job to unlabeled nodes.

The discovery script will launch a job on each unlabeled node in the
cluster. When the job runs, it contacts the Kubernetes API server to add labels
to the node to advertise hardware features (initially, from `cpuid` and RDT).
to the node to advertise hardware features (initially, from `cpuid`, RDT and p-state).

[![asciicast](https://asciinema.org/a/11wir751y89617oemwnsgli4a.png)](https://asciinema.org/a/11wir751y89617oemwnsgli4a)

## Building from source

@@ -202,6 +192,10 @@ This is a [Kubernetes Incubator project](https://github.com/kubernetes/community

This is open source software released under the [Apache 2.0 License](LICENSE).

## Demo

A demo on the benefits of using node feature discovery can be found in [demo](demo/).

<!-- Links -->
[cpuid]: http://man7.org/linux/man-pages/man4/cpuid.4.html
[intel-rdt]: http://www.intel.com/content/www/us/en/architecture-and-technology/resource-director-technology.html
88 changes: 88 additions & 0 deletions demo/README.md
@@ -0,0 +1,88 @@
# Demo on Node Feature Discovery
- [Demo Overview](#demo-overview)
- [Instructions to Reproduce the Demo](#instructions-to-reproduce-the-demo)

## Demo Overview
In order to show the potential performance benefit from the node feature discovery project, we ran an experiment on three identical Kubernetes nodes. Each node consists of a single-socket Intel(R) Xeon(R) D-1521 with eight cores.

We wanted to demonstrate how to target nodes with turbo boost using node feature discovery. Turbo boost is a hardware feature that allows dynamic overclocking of CPUs. Using this feature can improve performance, but naively using turbo boost can be detrimental to performance for some applications [[1]][ref-1]. In such scenarios, the ability to target nodes with or without turbo boost, depending on the application, is useful. We intentionally disabled turbo boost on two of the three nodes for demo purposes.

Our experiment involved running the same application ten times with and without node feature discovery. We used the Ferret benchmark from the PARSEC benchmark suite [[2]][parsec] as our application. The benchmark implements an image similarity search. Because it is CPU intensive, it is expected to benefit from turbo boost [[3][ref-3], [4][ref-4]].

Without node feature discovery, about two-thirds of the application instances are expected to run on nodes without turbo boost and, as a result, run slower. By using feature discovery, we are able to target the node with turbo boost and gain performance. A pod template to express affinity to nodes with turbo boost can be found [here](helper-scripts/demo-pod-with-discovery.json.parsec.template).

The figure below shows box plots that illustrate the variability in normalized execution time across ten application instances run with and without node feature discovery. Execution times are normalized to the best-performing run, and the plot shows the percentage change relative to that run (0 represents the best-performing run). With node feature discovery, under this experimental setup, we see a significant improvement in performance. Moreover, we also reduce the performance variability between application instances.

![Performance benefit from affinity to nodes with turbo boost.](docs/performance-comparison-parsec-norm.png)
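
The normalization described above can be sketched in a few lines of shell. This is a minimal illustration rather than the demo's own script: it assumes input lines of the form `<label> <seconds>`, the function name `normalize` is hypothetical, and the sample values are made up rather than taken from the experiment.

```sh
#!/bin/sh
# Sketch: express each run's execution time as a percentage above the
# fastest run (0 = best-performing run).
# Assumed input on stdin: "<label> <seconds>" per line.
normalize() {
  awk '{
         label[NR] = $1
         secs[NR] = $2 + 0            # force numeric interpretation
         if (NR == 1 || secs[NR] < min) min = secs[NR]
       }
       END {
         for (i = 1; i <= NR; i++)
           printf "%s %.1f\n", label[i], (secs[i] / min) * 100 - 100
       }'
}

# Hypothetical sample data, not measurements from the demo.
printf '%s\n' 'with-discovery 100' 'without-discovery 125' | normalize
```

A value of 25.0 for a run means it took 25% longer than the fastest run in the data set.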

In contrast, some applications do not benefit from using turbo boost. We used the CloverLeaf mini-app from the Mantevo benchmark suite [[5]][ref-5] to show how certain applications benefit from running on nodes without turbo boost [[6]][ref-6]. The mini-app solves the compressible Euler equations on a Cartesian grid. Our experiment was the same as before (i.e., run ten instances of CloverLeaf with and without node feature discovery).

Without node feature discovery, about one-third of the instances are expected to run slower. We use node anti-affinity to target nodes without turbo boost and improve the performance. A pod template to express anti-affinity to nodes with turbo boost can be found [here](helper-scripts/demo-pod-with-discovery.yaml.cloverleaf.template).

The figure below shows the variation in normalized execution time of the application instances run with and without node feature discovery. As in the previous example, anti-affinity to nodes with turbo boost improves performance and decreases performance variability.

![Performance benefit from anti-affinity to nodes with turbo boost](docs/performance-comparison-cloverleaf-norm.png)

While our examples illustrate the benefits of using node feature discovery to target nodes with and without turbo boost, it can also be used to gain performance predictability and improvement for other applications by targeting nodes with other features and configurations in a Kubernetes cluster. For example, many scientific and machine learning applications can benefit from targeting nodes with the AVX instruction set [[7][ref-7], [8][ref-8]], and many web services can take advantage of the AES-NI instruction set [[9][ref-9]]. Moreover, complex user requirements can be expressed by targeting nodes with multiple features and a combination of configurations.

## Instructions to Reproduce the Demo

Scripts to reproduce our demo results can be found in [helper-scripts](helper-scripts/).
### Prerequisites
1. `kubectl` should be configured properly to work with your Kubernetes cluster.
2. Node feature discovery should already be deployed on your Kubernetes cluster.
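
One way to check the second prerequisite is to look for the turbo label on your nodes. The sketch below is only an illustration: it assumes the label key ends in `pstate-turbo` (as in the pod templates in this directory) and that labels appear in the last column of `kubectl get nodes --show-labels`. The function name `nodes_with_turbo` is hypothetical.

```sh
#!/bin/sh
# Sketch: print the names of nodes whose label list contains a
# "...pstate-turbo=true" label. Reads `kubectl get nodes --show-labels`
# style output on stdin; the first line is the column header.
nodes_with_turbo() {
  awk 'NR > 1 && $NF ~ /pstate-turbo=true/ { print $1 }'
}

# Usage against a live cluster (left commented out so the sketch can be
# read and run without one):
# kubectl get nodes --show-labels | nodes_with_turbo
```

If no node is listed, either discovery has not labeled the nodes yet or no node has turbo boost enabled.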

### Instructions
Follow these easy steps to reproduce the demo.

1. `cd <helper-script-root>`
2. `./run-with-discovery.sh -v <node-feature-discovery-version> -a parsec`
3. `./run-with-discovery.sh -v <node-feature-discovery-version> -a cloverleaf`
4. `./aggregate-logs-and-plot.sh -a parsec`
5. `./aggregate-logs-and-plot.sh -a cloverleaf`

Following the above steps will produce the performance and normalized performance logs and their corresponding plots for each application.

### Script Documentation
`run-with-discovery.sh` takes the node feature discovery version and the application name as the input and runs ten application instances using node feature discovery.
```sh
> ./run-with-discovery.sh -h
Usage: run-with-discovery.sh [-v DISCOVERY_VERSION] [-a APPLICATION_NAME]
Runs pods ten times with discovery enabled.

-v DISCOVERY_VERSION target discovery version DISCOVERY_VERSION.
-a APPLICATION_NAME run the pods with APPLICATION_NAME application.
APPLICATION_NAME can be one of parsec or cloverleaf.
```
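
The pod templates in this directory contain placeholder tokens such as `NUM`, `APP` and `VER`. How `run-with-discovery.sh` fills them in is not shown here, so the following is only a sketch that assumes plain `sed` substitution. The function name `render_template` and the version string `v0.1.0` are hypothetical.

```sh
#!/bin/sh
# Sketch: fill in template placeholders read from stdin.
#   $1 = discovery version (replaces VER)
#   $2 = application name  (replaces APP)
#   $3 = instance number   (replaces NUM)
# Assumes the values contain no "/" or "&", which sed would treat specially.
render_template() {
  sed -e "s/VER/$1/g" -e "s/APP/$2/g" -e "s/NUM/$3/g"
}

# Hypothetical one-line template fragment.
printf '%s\n' '"node.alpha.intel.com/VER-pstate-turbo": "true"' |
  render_template v0.1.0 parsec 1
```

The rendered output could then be passed to `kubectl create -f -` to start a pod.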

`run-without-discovery.sh` takes the application name as the input and runs ten application instances without using node feature discovery.
```sh
>./run-without-discovery.sh -h
Usage: run-without-discovery.sh [-a APPLICATION_NAME]
Runs ten pods without discovery enabled with the specified application.

-a APPLICATION_NAME run the pods with APPLICATION_NAME application.
APPLICATION_NAME can be one of parsec or cloverleaf.
```

`aggregate-logs-and-plot.sh` takes the application name as the input and aggregates the logs from the runs with and without node feature discovery and plots the result.
```sh
>./aggregate-logs-and-plot.sh -h
Usage: aggregate-logs-and-plot.sh [-a APPLICATION_NAME]
Aggregate the results from the specified application and plot the result.

-a APPLICATION_NAME run the pods with APPLICATION_NAME application.
APPLICATION_NAME can be one of parsec or cloverleaf.
```
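
When aggregating, the script converts the `real` time reported in each pod's log (for example `1m30.500s`) into seconds. The sketch below illustrates that conversion. It uses `awk` instead of the `sed` and `bc` pipeline in `aggregate-logs-and-plot.sh`, and the function name `to_seconds` is hypothetical.

```sh
#!/bin/sh
# Sketch: convert a duration of the form "<minutes>m<seconds>s" to seconds.
to_seconds() {
  printf '%s\n' "$1" |
    awk -F'm' '{
      sub(/s$/, "", $2)               # drop the trailing "s"
      printf "%.3f\n", $1 * 60 + $2   # minutes * 60 + seconds
    }'
}

to_seconds 1m30.500s
```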

<!-- Links -->
[parsec]: http://parsec.cs.princeton.edu/
[ref-1]: http://csl.stanford.edu/~christos/publications/2014.autoturbo.hpca.pdf
[ref-3]: http://parsec.cs.princeton.edu/publications/bienia08characterization.pdf
[ref-4]: http://parsec.cs.princeton.edu/publications/bienia08comparison.pdf
[ref-5]: https://mantevo.org
[ref-6]: https://mantevo.org/about/publications
[ref-7]: https://software.intel.com/en-us/intel-mkl
[ref-8]: https://software.intel.com/en-us/blogs/daal
[ref-9]: https://software.intel.com/en-us/articles/intel-aes-ni-performance-enhancements-hytrust-datacontrol-case-study

Binary file added demo/docs/performance-comparison-parsec-norm.png
65 changes: 65 additions & 0 deletions demo/helper-scripts/aggregate-logs-and-plot.sh
@@ -0,0 +1,65 @@
#!/usr/bin/env bash
show_help() {
  cat << EOF
Usage: ${0##*/} [-a APPLICATION_NAME]
Aggregate the results from the specified application and plot the result.
-a APPLICATION_NAME run the pods with APPLICATION_NAME application.
APPLICATION_NAME can be one of parsec or cloverleaf.
EOF
}

if [ $# -eq 0 ]
then
  show_help
  exit 1
fi

app="parsec"

OPTIND=1
options="ha:"
while getopts $options option
do
  case $option in
    a)
      if [ "$OPTARG" == "parsec" ] || [ "$OPTARG" == "cloverleaf" ]
      then
        app=$OPTARG
      else
        echo "Invalid application name."
        show_help
        # An invalid argument is an error, so exit with a non-zero status.
        exit 1
      fi
      ;;
    h)
      show_help
      exit 0
      ;;
    '?')
      show_help
      exit 1
      ;;
  esac
done

# Collect the "real" time of each run without discovery, converted from
# the XmY.YYYs format to seconds.
for i in {1..10}
do
  kubectl logs -f demo-$app-$i-wo-discovery | grep real | cut -f2 | sed -e "s/m/*60+/" -e "s/s//" | bc >> temp.log
done
paste labels-without-discovery-$app.log temp.log > performance.log
rm -f temp.log labels-without-discovery-$app.log

# Do the same for the runs with discovery and append them to the same log.
for i in {1..10}
do
  kubectl logs -f demo-$app-$i-with-discovery | grep real | cut -f2 | sed -e "s/m/*60+/" -e "s/s//" | bc >> temp.log
done
paste labels-with-discovery-$app.log temp.log >> performance.log
rm -f temp.log labels-with-discovery-$app.log

# Normalize every run to the fastest one: 0 means best-performing run,
# other values are the percentage increase in execution time.
minimum=$(awk 'min=="" || $2 < min {min=$2} END {print min}' performance.log)
awk -v min=$minimum '{print $1,((($2/min)*100))-100}' performance.log > performance-norm.log
./box-plot.R performance.log performance-comparison-$app.pdf
./box-plot-norm.R performance-norm.log performance-comparison-$app-norm.pdf

./clean-up.sh -a $app
10 changes: 10 additions & 0 deletions demo/helper-scripts/box-plot-norm.R
@@ -0,0 +1,10 @@
#!/usr/bin/env Rscript
library(ggplot2)

argv <- commandArgs(T)
inFile <- argv[1]
outFile <- argv[2]
tab = read.table(inFile)
dat <- data.frame(Discovery = tab[,1], Time = tab[,2])
bplot <- ggplot(dat, aes(x=Discovery, y=Time, fill=Discovery)) + geom_boxplot() + guides(fill=FALSE) + ggtitle("Performance Comparison With and Without Discovery Enabled") + xlab("") + ylab("% Variation in Normalized Execution Time") + expand_limits(y=c(0,50))
ggsave(outFile, device="pdf")
10 changes: 10 additions & 0 deletions demo/helper-scripts/box-plot.R
@@ -0,0 +1,10 @@
#!/usr/bin/env Rscript
library(ggplot2)

argv <- commandArgs(T)
inFile <- argv[1]
outFile <- argv[2]
tab = read.table(inFile)
dat <- data.frame(Discovery = tab[,1], Time = tab[,2])
bplot <- ggplot(dat, aes(x=Discovery, y=Time, fill=Discovery)) + geom_boxplot() + guides(fill=FALSE) + ggtitle("Performance Comparison With and Without Discovery Enabled") + xlab("") + ylab("Time in Seconds") + expand_limits(y=0)
ggsave(outFile, device="pdf")
55 changes: 55 additions & 0 deletions demo/helper-scripts/clean-up.sh
@@ -0,0 +1,55 @@
#!/usr/bin/env bash
show_help() {
  cat << EOF
Usage: ${0##*/} [-a APPLICATION_NAME]
Clean-up pods with and without discovery enabled for the specified application.
-a APPLICATION_NAME clean-up the pods with APPLICATION_NAME application.
APPLICATION_NAME can be one of parsec or cloverleaf.
EOF
}

if [ $# -eq 0 ]
then
  show_help
  exit 1
fi

app="parsec"

OPTIND=1
options="ha:"
while getopts $options option
do
  case $option in
    a)
      if [ "$OPTARG" == "parsec" ] || [ "$OPTARG" == "cloverleaf" ]
      then
        app=$OPTARG
      else
        echo "Invalid application name."
        show_help
        # An invalid argument is an error, so exit with a non-zero status.
        exit 1
      fi
      ;;
    h)
      show_help
      exit 0
      ;;
    '?')
      show_help
      exit 1
      ;;
  esac
done

echo "Using application name = $app."
# Delete the pods from the runs without discovery.
for i in {1..10}
do
  kubectl delete po demo-$app-$i-wo-discovery
done

# Delete the pods from the runs with discovery.
for i in {1..10}
do
  kubectl delete po demo-$app-$i-with-discovery
done
25 changes: 25 additions & 0 deletions demo/helper-scripts/demo-pod-with-discovery.json.parsec.template
@@ -0,0 +1,25 @@
{
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": {
    "name": "demo-parsec-NUM"
  },
  "spec": {
    "containers": [
      {
        "image": "intelsdi/node-feature-discovery-APP",
        "name": "demo-container-parsec-NUM",
        "ports": [
          {
            "containerPort": 3351,
            "hostPort": 10001
          }
        ]
      }
    ],
    "nodeSelector": {
      "node.alpha.intel.com/VER-pstate-turbo": "true"
    },
    "restartPolicy": "Never"
  }
}
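30 changes: 30 additions & 0 deletions demo/helper-scripts/demo-pod-with-discovery.yaml.cloverleaf.template
@@ -0,0 +1,30 @@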
apiVersion: v1
kind: Pod
metadata:
  name: demo-cloverleaf-NUM
  annotations:
    scheduler.alpha.kubernetes.io/affinity: |
      {
        "nodeAffinity": {
          "requiredDuringSchedulingIgnoredDuringExecution": {
            "nodeSelectorTerms": [
              {
                "matchExpressions": [
                  {
                    "key": "node.alpha.intel.com/VER-pstate-turbo",
                    "operator": "DoesNotExist"
                  }
                ]
              }
            ]
          }
        }
      }
spec:
  containers:
    - name: demo-container-cloverleaf-NUM
      image: intelsdi/node-feature-discovery-APP
      ports:
        - containerPort: 3551
          hostPort: 10001
  restartPolicy: Never
22 changes: 22 additions & 0 deletions demo/helper-scripts/demo-pod-without-discovery.json.template
@@ -0,0 +1,22 @@
{
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": {
    "name": "demo-APP-NUM"
  },
  "spec": {
    "containers": [
      {
        "image": "intelsdi/node-feature-discovery-IMG",
        "name": "demo-container-APP-NUM",
        "ports": [
          {
            "containerPort": 3351,
            "hostPort": 10001
          }
        ]
      }
    ],
    "restartPolicy": "Never"
  }
}
