LoadWeaver is a lightweight and flexible Hadoop workload generator.
Current Hadoop workload generators are either trace-driven (e.g. ) or model-driven (e.g. ). Both kinds dictate the workload arrival pattern, taking it either from previously recorded traces or from statistical models. However, in our work on optimizing Hadoop systems, we often need to produce a particular arrival pattern of Hadoop jobs, and such patterns are very hard to extract from a huge production trace file or to describe with a stochastic model.
LoadWeaver fills this gap. It can not only generate workloads from statistical models, so that they resemble the workloads of production clusters, but also lets users supply their own synthetic trace to generate workloads with a specified arrival pattern, size, resource usage characteristics, etc.
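
To make the two modes concrete, here is a minimal Python sketch of the idea. It does not use LoadWeaver's actual trace format or API. The function names, the `submit_sec` and `input_mb` fields, and the burst tuple layout are all hypothetical, chosen only for illustration. The first function draws arrivals from a Poisson process, as a model-driven generator might. The second builds a hand-specified trace with an exact arrival pattern.

```python
import random


def model_driven_arrivals(rate_per_sec, n_jobs, seed=0):
    """Hypothetical model-driven mode: arrival times of a Poisson process,
    i.e. exponentially distributed gaps between consecutive submissions."""
    rng = random.Random(seed)
    t = 0.0
    times = []
    for _ in range(n_jobs):
        t += rng.expovariate(rate_per_sec)
        times.append(t)
    return times


def synthetic_trace(bursts):
    """Hypothetical user-specified mode: each burst is a tuple
    (start_sec, n_jobs, gap_sec, input_mb) describing n_jobs submissions
    spaced gap_sec apart, each reading input_mb of input."""
    trace = []
    for start, n_jobs, gap, input_mb in bursts:
        for i in range(n_jobs):
            trace.append({"submit_sec": start + i * gap, "input_mb": input_mb})
    # Order jobs by submission time so the trace can be replayed sequentially.
    trace.sort(key=lambda job: job["submit_sec"])
    return trace


# Three small jobs one second apart, then two large jobs submitted together.
trace = synthetic_trace([(0, 3, 1, 64), (60, 2, 0, 1024)])
```

A pattern such as "two large jobs arriving at the same instant, one minute after a run of small jobs" is trivial to write down in the second form, but awkward to obtain from a stochastic model or to locate in a production trace.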