An extensible benchmark for evaluating large language models on planning
Updated Mar 16, 2025 - PDDL
A Benchmark Suite for Cloud Services.
The Turing Change Point Detection Benchmark: An Extensive Benchmark Evaluation of Change Point Detection Algorithms on real-world data
Optical Flow Dataset and Benchmark for Visual Crowd Analysis
RiVEC Benchmark Suite
Machine Learning Benchmark Scripts
A selection of ANSI C benchmarks and programs useful as benchmarks
Featherlight benchmark framework, drop-in replacement for criterion and gauge.
Generate performance reports from your django database performance tests.
Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite
LagrangeBench: A Lagrangian Fluid Mechanics Benchmarking Suite
This repository contains the code base for the Open Stream Processing Benchmark.
A list of benchmark suites used in research on compilers, program performance, scientific computation, etc.
Controlgym: Large-Scale Control Environments for Benchmarking Reinforcement Learning Algorithms
A framework for benchmarking clustering algorithms
A JuMP-based library of Non-Linear and Mixed-Integer Non-Linear Programs
A CLI for benchmarking Scrapy.
GARDENIA: Graph Analytics Repository for Designing Efficient Next-generation Accelerators