GitHub - LHL3341/api

Branches Tags

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
api_tool		api_tool
examples		examples
.gitignore		.gitignore
README		README
requirements.txt		requirements.txt
setup.py		setup.py

Repository files navigation

# 🤖 LLM-as-Judge

> 自动化的模型输出评估系统，基于 LLM 自评（LLM-as-a-Judge）理念。  
> 支持文本与多模态输入，兼容 OpenAI API。

---

## 🚀 功能特性

- ✅ **流式推理**（实时消费输出）
- ✅ **并发任务执行**（asyncio + semaphore）
- ✅ **多模态支持**（图像 Base64 编码）
- ✅ **Token 统计**（自动计算输入/输出 Token）
- ✅ **可视化输出分布**（matplotlib 绘制得分直方图）
- ✅ **易扩展架构**（Evaluator 插拔式设计）

---

## 🧩 安装

```bash
git clone https://github.com/LHL3341/api_tool.git
cd llm-as-judge
pip install -r requirements.txt


## 项目结构

llm_as_judge/
├── __init__.py
├── main.py                    # CLI 启动器
├── config.py                  # 配置加载 & OpenAI 客户端
│
├── evaluator/                 # 核心评估逻辑
│   ├── __init__.py
│   ├── base.py                # 抽象 Evaluator 基类
│   ├── stream_handler.py      # 流式响应消费器
│   └── llm_evaluator.py       # LLM 评测主逻辑
│
├── utils/                     # 通用工具函数
│   ├── __init__.py
│   ├── io_utils.py            # 文件读写（JSONL）
│   ├── image_utils.py         # 图像编码 & 缩放
│   ├── token_utils.py         # Token 统计
│   ├── prompt_utils.py        # Prompt 模板安全填充
│   ├── plot_utils.py          # 绘制得分分布图
│   └── progress_utils.py      # Rich 进度条封装
│
├── prompts/
│   └── eval_prompt.txt        # 模板示例（可自定义）
│
├── data/
│   └── dataset.jsonl          # 评测数据（示例）
│
├── outputs/                   # 评测结果输出
│   └── results.jsonl
│
├── config.yaml                # 主配置文件
└── config_template.yaml       # 模板配置