Commit 04ec956: "docs: Add documentation for evaluate module"
Committed by sweep-ai[bot] on Dec 22, 2023 (parent commit: c7a522a)
1 changed file: docs/evaluate.md (75 additions, 0 deletions)

# Evaluate Module

The `evaluate` module in the DSPy framework measures the performance of DSPy programs. It plays a central role in the framework by quantifying how well a program and its configuration perform on a development set under a chosen metric.

## Evaluate Class

The `Evaluate` class is the main class in the `evaluate` module. It provides methods to run an evaluation on a single thread or across multiple threads and to report progress while the evaluation runs, and its instances can be called directly as functions.

### `__init__(self, *, devset, metric=None, num_threads=1, display_progress=False, display_table=False, display=True, max_errors=5)`

This method initializes the `Evaluate` class with the following parameters (a construction sketch follows the list):

- `devset`: The development set for evaluation.
- `metric`: The metric used for evaluation. Defaults to `None`.
- `num_threads`: The number of threads used for evaluation. Defaults to `1`.
- `display_progress`: A boolean indicating whether to display the progress of the evaluation. Defaults to `False`.
- `display_table`: A boolean indicating whether to display the evaluation results in a table. Defaults to `False`.
- `display`: A boolean indicating whether to display the evaluation results. Defaults to `True`.
- `max_errors`: The maximum number of errors allowed during the evaluation. Defaults to `5`.
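
The sketch below shows one plausible way to construct an `Evaluate` instance. The toy `devset` and the `exact_match` metric are illustrative stand-ins defined here for the example, not part of the module; DSPy metrics conventionally accept `(example, prediction, trace=None)`.

```python
import dspy
from dspy.evaluate import Evaluate

# A toy development set; real devsets come from your own data.
devset = [
    dspy.Example(question="What is 2 + 2?", answer="4").with_inputs("question"),
    dspy.Example(question="What is the capital of France?", answer="Paris").with_inputs("question"),
]

# An illustrative metric: exact string match on the answer field.
def exact_match(example, prediction, trace=None):
    return example.answer == prediction.answer

evaluator = Evaluate(
    devset=devset,
    metric=exact_match,
    num_threads=1,          # evaluate examples sequentially
    display_progress=True,  # show a progress bar while running
    max_errors=5,           # tolerate at most five raised exceptions
)
```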

### `_execute_single_thread(self, wrapped_program, devset, display_progress)`

This method runs the evaluation sequentially on a single thread. It takes the following parameters (a conceptual sketch follows the list):

- `wrapped_program`: The program to be evaluated.
- `devset`: The development set for evaluation.
- `display_progress`: A boolean indicating whether to display the progress of the evaluation.
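
For intuition, here is a conceptual sketch of a single-threaded evaluation loop. It is not the actual DSPy implementation; in particular, it assumes `wrapped_program(example)` returns a `(prediction, score)` pair with a numeric score.

```python
from tqdm import tqdm

def execute_single_thread_sketch(wrapped_program, devset, display_progress):
    # Conceptual sketch only: iterate over the devset, score each
    # example, and keep a running tally of correct predictions.
    ncorrect, ntotal = 0, 0
    pbar = tqdm(total=len(devset), disable=not display_progress)
    for example in devset:
        _prediction, score = wrapped_program(example)
        ncorrect += score
        ntotal += 1
        pbar.set_description(f"Average: {ncorrect}/{ntotal} ({100 * ncorrect / ntotal:.1f}%)")
        pbar.update(1)
    pbar.close()
    return ncorrect, ntotal
```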

### `_execute_multi_thread(self, wrapped_program, devset, num_threads, display_progress)`

This method runs the evaluation across a pool of worker threads. It takes the following parameters (a conceptual sketch follows the list):

- `wrapped_program`: The program to be evaluated.
- `devset`: The development set for evaluation.
- `num_threads`: The number of threads used for evaluation.
- `display_progress`: A boolean indicating whether to display the progress of the evaluation.
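
A conceptual sketch of the multi-threaded variant, under the same assumptions as the single-threaded sketch above, using a standard-library thread pool rather than the actual DSPy internals:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from tqdm import tqdm

def execute_multi_thread_sketch(wrapped_program, devset, num_threads, display_progress):
    # Conceptual sketch only: fan examples out across a thread pool
    # and aggregate scores as the futures complete.
    ncorrect, ntotal = 0, 0
    pbar = tqdm(total=len(devset), disable=not display_progress)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [pool.submit(wrapped_program, example) for example in devset]
        for future in as_completed(futures):
            _prediction, score = future.result()
            ncorrect += score
            ntotal += 1
            pbar.set_description(f"Average: {ncorrect}/{ntotal}")
            pbar.update(1)
    pbar.close()
    return ncorrect, ntotal
```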

### `_update_progress(self, pbar, ncorrect, ntotal)`

This method updates the progress bar with the running score during evaluation. It takes the following parameters (a short sketch follows the list):

- `pbar`: The progress bar.
- `ncorrect`: The number of correct predictions.
- `ntotal`: The total number of predictions.
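
Assuming `pbar` is a `tqdm` progress bar, the update step might look like this sketch:

```python
def update_progress_sketch(pbar, ncorrect, ntotal):
    # Refresh the bar's label with the running accuracy, then
    # advance it by one completed example.
    pbar.set_description(f"Average: {ncorrect}/{ntotal} ({100 * ncorrect / ntotal:.1f}%)")
    pbar.update(1)
```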

### `__call__(self, program, metric=None, devset=None, num_threads=None, display_progress=None, display_table=None, display=None, return_all_scores=False)`

This method makes an `Evaluate` instance callable as a function; any parameter left as `None` falls back to the value given at initialization. It takes the following parameters (an example call follows the list):

- `program`: The program to be evaluated.
- `metric`: The metric used for evaluation. Defaults to `None`.
- `devset`: The development set for evaluation. Defaults to `None`.
- `num_threads`: The number of threads used for evaluation. Defaults to `None`.
- `display_progress`: A boolean indicating whether to display the progress of the evaluation. Defaults to `None`.
- `display_table`: A boolean indicating whether to display the evaluation results in a table. Defaults to `None`.
- `display`: A boolean indicating whether to display the evaluation results. Defaults to `None`.
- `return_all_scores`: A boolean indicating whether to return all scores. Defaults to `False`.
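
Reusing the `evaluator` from the construction sketch, a call with per-invocation overrides might look like this. The `program` here is a hypothetical one-step module, and actually running it assumes a language model has been configured:

```python
import dspy

# Hypothetical program; any DSPy module can be evaluated. Running it
# assumes an LM was configured, e.g. dspy.settings.configure(lm=...).
program = dspy.Predict("question -> answer")

# Per-call arguments override the values given at construction time.
score = evaluator(program, num_threads=4, display_table=True)
print(score)
```

Per the parameter description above, setting `return_all_scores=True` additionally returns the individual per-example scores alongside the aggregate.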

## Usage

Here is a minimal end-to-end example of how to use the `evaluate` module. The names `devset`, `metric`, and `program` are placeholders for your own development set, metric callable, and DSPy program:

```python
from dspy.evaluate import Evaluate

# Initialize the evaluator with a development set and a metric.
evaluator = Evaluate(devset=devset, metric=metric)

# Call the instance as a function to score the program on the devset.
score = evaluator(program)
```

## Conclusion

The `evaluate` module is a core tool in the DSPy framework. It provides a flexible and efficient way to measure how well your programs and their configurations perform. By using it, you can confirm that your programs behave as expected and make targeted adjustments to improve them.
