Metrics Exporter

gvdongen edited this page Aug 11, 2021 · 5 revisions

Mainly copied from metrics exporter README on 9/7/21

The code can be found here.

Description

This component scrapes the JMX endpoints of each framework's workers and publishes the metrics to a Kafka topic. By default, it scrapes every second. The output consumer component later reads the published data from Kafka and writes it to S3. The exporter also scrapes some cAdvisor metrics.

It collects the following metrics:

  • Heap memory: used and committed
  • Off-heap memory: used and committed
  • System CPU load, process CPU load and system load average
  • GC metrics per memory pool (eden, survivor and old generation):
    • collection time
    • collection count
    • duration of last GC
    • memory before and after last GC

The CPU and load metrics are not used in our evaluation suite, since we use the container-level ones from cAdvisor instead.
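
To check that the exporter is publishing, it can help to read a few records straight from the metrics topic. The sketch below assumes the standard Kafka command-line tools are on the PATH under the name `kafka-console-consumer.sh` (some distributions drop the `.sh` suffix). The helper names `metrics_topic` and `peek_metrics` are made up for illustration and are not part of the exporter. The topic name follows the `metrics-$TOPICNAME` convention described under Deployment.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting the metrics topic; not part of the exporter.

# Derive the metrics topic name from the benchmark topic name.
metrics_topic() {
  echo "metrics-$1"
}

# Print the first N records (default 5) from the metrics topic.
# Assumes kafka-console-consumer.sh is on the PATH and that
# TOPICNAME and KAFKA_BOOTSTRAP_SERVERS are set.
peek_metrics() {
  local topic
  topic="$(metrics_topic "$TOPICNAME")"
  kafka-console-consumer.sh \
    --bootstrap-server "$KAFKA_BOOTSTRAP_SERVERS" \
    --topic "$topic" \
    --from-beginning \
    --max-messages "${1:-5}"
}
```

With `TOPICNAME` and `KAFKA_BOOTSTRAP_SERVERS` set, calling `peek_metrics 10` should print the first ten records and exit.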

Deployment

This component requires the following environment variables:

  • FRAMEWORK: one of FLINK, KAFKASTREAMS, SPARK or STRUCTUREDSTREAMING

  • TOPICNAME: name of the topic currently used for the benchmark run. This component will publish to the topic metrics-$TOPICNAME.

  • JMX_HOSTS: the hosts from which metrics should be scraped, so the host of each framework cluster component.

  • CLUSTER_URL: URL of the DC/OS cluster master. We retrieve this in our scripts with:

    DCOS_DNS_ADDRESS=$(aws cloudformation describe-stacks --region eu-west-1 --stack-name=streaming-benchmark | jq '.Stacks[0].Outputs | .[] | select(.Description=="Master") | .OutputValue' |  awk '{print tolower($0)}')
    export CLUSTER_URL=http://${DCOS_DNS_ADDRESS//\"}
    echo $CLUSTER_URL
    
  • DCOS_ACCESS_TOKEN: token to access DC/OS. We retrieve this in our scripts with:

    dcos config show core.dcos_acs_token
    
  • CADVISOR_HOSTS: the cAdvisor hosts from which container metrics should be scraped

  • KAFKA_BOOTSTRAP_SERVERS: the Kafka brokers to publish to

You can run this component as a Docker container next to cAdvisor and a framework cluster.
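
A launch script along these lines could verify that all seven variables are set before starting the container. This is only a sketch: the image name `metrics-exporter:latest` and the helper names `missing_vars` and `run_exporter` are placeholders that are not taken from the repository, and any networking flags your cluster needs are omitted.

```shell
#!/usr/bin/env bash
# Sketch of a pre-flight check plus container start.
# The image name is a placeholder, not taken from the repository.
EXPORTER_IMAGE="${EXPORTER_IMAGE:-metrics-exporter:latest}"

REQUIRED_VARS="FRAMEWORK TOPICNAME JMX_HOSTS CLUSTER_URL DCOS_ACCESS_TOKEN CADVISOR_HOSTS KAFKA_BOOTSTRAP_SERVERS"

# Print the name of every required variable that is unset or empty
# in the exported environment; return non-zero if any is missing.
missing_vars() {
  local rc=0
  local name
  for name in $REQUIRED_VARS; do
    if [ -z "$(printenv "$name")" ]; then
      echo "$name"
      rc=1
    fi
  done
  return $rc
}

# Start the exporter, passing each variable through from the environment.
run_exporter() {
  local name
  local env_args=""
  if ! missing_vars >&2; then
    echo "error: set the variables listed above first" >&2
    return 1
  fi
  for name in $REQUIRED_VARS; do
    env_args="$env_args -e $name"
  done
  # -e NAME (without a value) forwards NAME from the current environment.
  # Word splitting of $env_args is intended here.
  docker run --rm $env_args "$EXPORTER_IMAGE"
}
```

After exporting the variables, call `run_exporter`. If any variable is missing, the function prints its name to stderr and returns without starting the container.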