llm-evals

This repository contains a web-based application for evaluating Large Language Models.

Features

  • Generate test questions on a given topic.
  • Provide your own custom questions.
  • Customize the evaluation prompt.
  • View evaluation results and average scores.
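The "average scores" feature boils down to averaging per-question ratings from the judge model. A minimal sketch of that aggregation step (hypothetical helper; the actual scoring logic lives in app.py):

```python
def average_score(scores):
    """Average a list of per-question scores, e.g. 1-10 ratings
    assigned by the evaluation prompt. Illustrative only."""
    if not scores:
        raise ValueError("no scores to average")
    return sum(scores) / len(scores)

print(average_score([7, 9, 8]))  # -> 8.0
```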

Setup

  1. Clone the repository:

    git clone https://github.com/BurnyCoder/llm-evals.git
    cd llm-evals
  2. Create a virtual environment and install dependencies:

    python -m venv venv
    venv\Scripts\activate  # On Windows
    # source venv/bin/activate  # On macOS/Linux
    pip install -r requirements.txt
  3. Create a .env file in the root directory and add your OpenAI API key:

    OPENAI_API_KEY=your_api_key_here
    
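The .env file above holds simple KEY=value pairs. The app most likely reads it with a library such as python-dotenv (an assumption, not confirmed by the repository), but the mechanism can be sketched with the standard library alone:

```python
import os

def load_env(path=".env"):
    """Minimal .env loader: parse KEY=value lines and export them.
    Illustrative sketch; a library like python-dotenv handles quoting,
    interpolation, and edge cases more robustly."""
    values = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            # Skip blanks, comments, and malformed lines
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    os.environ.update(values)
    return values
```

After loading, the OpenAI client can pick up OPENAI_API_KEY from the environment.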

Usage

Run the Flask application:

python app.py

Then open your web browser and go to http://127.0.0.1:5000 (Flask's default development address).
