Extension of the Yahoo Streaming Benchmark with task-level performance analysis and detailed CPU measurements

rankj/YSB-task-level

 
 

TLC - Task Level CPU Evaluation for Distributed Stream Processing Systems

Mission

Our mission is to help developers optimize the performance of big data systems through better performance tools and greater performance awareness. As IT systems grow more complex, the ways in which we monitor and tune performance must keep advancing to ensure efficient and effective operation.

Background

This repository provides a CPU monitoring concept and a benchmark integration for fine-grained CPU evaluations of Distributed Stream Processing Systems (DSPS). Traditional performance analysis of DSPS focuses on latency and throughput. Our goal is instead to evaluate CPU efficiency, which is becoming increasingly important for energy savings, resource-restricted environments (e.g., IoT edge computing), and cost savings in pay-as-you-go cloud deployments. A special feature of this approach is that CPU performance is not measured at the process level but for each streaming task individually. This gives detailed insight into the actual performance behavior and lets you monitor how different factors affect the performance of individual parts of the code.

The monitoring uses stack trace sampling based on the extended Berkeley Packet Filter (eBPF) in combination with performance monitoring counters (PMC). It can be integrated into any streaming system. For demonstration purposes, we provide an integration with an extended version of the Yahoo Streaming Benchmark (YSB).
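To illustrate the idea of task-level attribution, the following minimal sketch aggregates sampled stack traces into per-task CPU shares. It is not the code used in this repository. It assumes the samples have already been collapsed into lines of the form `frame;frame;frame count`, and the frame names and the `TASK_MARKERS` mapping are hypothetical placeholders chosen for the example.

```python
from collections import Counter

# Hypothetical mapping from a frame-name fragment to a streaming task label.
# Real frame names depend on the engine and the benchmark job.
TASK_MARKERS = {
    "EventFilter": "filter",
    "RedisJoin": "join",
    "CampaignWindow": "window",
}


def attribute_samples(folded_lines, task_markers):
    """Sum sample counts per task from lines of the form 'a;b;c <count>'."""
    totals = Counter()
    for line in folded_lines:
        line = line.strip()
        if not line:
            continue
        # The sample count is the last whitespace-separated field.
        stack, _, count = line.rpartition(" ")
        task = "other"
        # Walk from the leaf towards the root so the innermost task frame wins.
        for frame in reversed(stack.split(";")):
            label = next(
                (lbl for marker, lbl in task_markers.items() if marker in frame),
                None,
            )
            if label is not None:
                task = label
                break
        totals[task] += int(count)
    return totals


def cpu_shares(totals):
    """Convert per-task sample counts into fractions of all samples."""
    total = sum(totals.values())
    if total == 0:
        return {}
    return {task: n / total for task, n in totals.items()}


# Made-up samples for illustration only (not measured data).
SAMPLES = [
    "java;Thread.run;EventFilter.filter;String.equals 30",
    "java;Thread.run;RedisJoin.map;Jedis.get 50",
    "java;Thread.run;CampaignWindow.process 15",
    "java;GCTaskThread::run 5",
]

if __name__ == "__main__":
    shares = cpu_shares(attribute_samples(SAMPLES, TASK_MARKERS))
    for task, share in sorted(shares.items()):
        print(f"{task}: {share:.0%}")
```

Samples whose stacks contain no known task frame (for example JVM garbage collection threads) are grouped under `other`, so the shares always sum to the whole process.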

Benchmark Integration

Our benchmark integration supports the following engines:

  • Apache Flink
  • Apache Spark Structured Streaming
  • Apache Spark Structured Streaming Continuous Processing Mode

For other engines we do not provide a YSB.jar right now. However, you may use the monitoring capabilities for other DSPS as well.

Instructions for Running the Example Benchmark (Video)

Stay tuned for further updates on this project!

Related Work

  • Yahoo Streaming Benchmark: https://github.com/yahoo/streaming-benchmarks
  • Perf-map-agent: https://github.com/jvm-profiling-tools/perf-map-agent
  • Bpftrace: https://github.com/iovisor/bpftrace
