Eval Harness Documentation

Welcome to the docs for the LM Evaluation Harness!

Table of Contents

  • To learn about the library's public interface, including how to run evaluations from the command line or from within an external library, see the Interface guide.
  • To learn how to add support for a new library, API, or model type, and for a quick explainer on the different ways to evaluate an LM, see the Model Guide.
    • For a detailed description of how to extend the library to new model classes served over an API, see the API Guide.
  • For a crash course on adding new tasks to the library, see our New Task Guide.
  • To learn how to push the limits of the task configuration that the Eval Harness supports, see the Task Configuration Guide.