Currently, benchmarks are only run manually. It would be nice to keep a history of the results and possibly fail if there are significant regressions.
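
To make the request concrete, here is a minimal sketch of what such a check might look like. It is only an illustration under stated assumptions: results are assumed to be available as a JSON object mapping benchmark name to time in seconds, the history is assumed to be a JSON list of such objects, and the 10% threshold, the file layout, and the names `find_regressions` and `main` are all hypothetical rather than anything that exists in this project.

```python
import json
from pathlib import Path

# Hypothetical threshold: flag a benchmark that is more than 10% slower.
THRESHOLD = 0.10


def find_regressions(baseline, current, threshold=THRESHOLD):
    """Return {name: (old, new)} for benchmarks slower than baseline by more than threshold."""
    regressions = {}
    for name, new_time in current.items():
        old_time = baseline.get(name)
        if old_time is None or old_time <= 0:
            continue  # new benchmark or unusable baseline: nothing to compare
        if (new_time - old_time) / old_time > threshold:
            regressions[name] = (old_time, new_time)
    return regressions


def main(history_path, current_path):
    """Compare the current run to the last recorded run, then append it to the history.

    Both file formats are assumptions made for this sketch.
    Returns 1 if any regression was found, otherwise 0.
    """
    history_file = Path(history_path)
    history = json.loads(history_file.read_text()) if history_file.exists() else []
    current = json.loads(Path(current_path).read_text())

    # Compare against the most recent recorded run, if there is one.
    regressions = find_regressions(history[-1], current) if history else {}

    # Record the run either way so the history stays complete.
    history.append(current)
    history_file.write_text(json.dumps(history, indent=2))

    for name, (old, new) in regressions.items():
        print(f"REGRESSION {name}: {old:.6f}s -> {new:.6f}s")
    return 1 if regressions else 0
```

A CI step could pass the return value of `main(...)` to `sys.exit` so that the job fails on a regression. Comparing against only the previous run is the simplest choice; on noisy runners, comparing against a median of the last several runs would probably be more robust.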