# benchmarking-framework

Here are 11 public repositories matching this topic...

Python Multi-Process Execution Pool: a concurrent, asynchronous execution pool for external applications with custom resource constraints (memory limits, timeouts, CPU affinity, core counts, and caching), load balancing, and profiling on NUMA architectures.

  • Updated Aug 28, 2019
  • Python
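
The core pattern here is launching external commands concurrently while enforcing per-task limits. Below is a minimal sketch of that pattern using only the Python standard library; it is not the repository's actual API, and `run_app`, the example commands, and the CPU sets are illustrative assumptions (CPU pinning via `os.sched_setaffinity` is Linux-only).

```python
# Minimal sketch (not this project's API): run external commands concurrently
# with a per-task timeout and an optional CPU-affinity mask.
import os
import subprocess
from concurrent.futures import ProcessPoolExecutor, as_completed

def run_app(cmd, timeout=60, cpus=None):
    """Execute an external command with a timeout and optional CPU affinity."""
    if cpus is not None and hasattr(os, "sched_setaffinity"):
        # Pin this worker process (and the child it spawns) to the given cores.
        os.sched_setaffinity(0, cpus)
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        return cmd, proc.returncode, proc.stdout
    except subprocess.TimeoutExpired:
        return cmd, None, "<timed out>"

if __name__ == "__main__":
    # (command, timeout in seconds, CPU set) -- values are made up for illustration.
    tasks = [
        (["sleep", "1"], 5, {0}),
        (["echo", "hello"], 5, {1}),
    ]
    with ProcessPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(run_app, cmd, t, cpus) for cmd, t, cpus in tasks]
        for fut in as_completed(futures):
            cmd, code, out = fut.result()
            print(cmd, code, out.strip())
```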

MixEval: a ground-truth-based dynamic benchmark derived from off-the-shelf benchmark mixtures. It evaluates LLMs with highly accurate model ranking (0.96 correlation with Chatbot Arena) while running locally and quickly (about 6% of the time and cost of running MMLU), and its queries are updated every month to avoid contamination.

  • Updated Jun 1, 2024
  • Python
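
To make the "0.96 correlation with Chatbot Arena" figure concrete, the sketch below shows one common way such agreement can be measured: a Spearman rank correlation between a benchmark's model scores and Arena ratings. This is not MixEval's code; the model names and scores are made-up placeholders, and SciPy is assumed to be available.

```python
# Illustration only: compare a benchmark's model ranking against Chatbot Arena
# ratings via Spearman rank correlation. All names and numbers are placeholders.
from scipy.stats import spearmanr

benchmark_scores = {"model-a": 71.2, "model-b": 64.5, "model-c": 58.9, "model-d": 49.3}
arena_ratings    = {"model-a": 1251, "model-b": 1190, "model-c": 1142, "model-d": 1098}

models = sorted(benchmark_scores)
rho, pvalue = spearmanr(
    [benchmark_scores[m] for m in models],
    [arena_ratings[m] for m in models],
)
print(f"Rank correlation with Arena: {rho:.2f} (p={pvalue:.3f})")
```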

Add this topic to your repo

To associate your repository with the benchmarking-framework topic, visit your repo's landing page and select "manage topics."
