holotherapper/adlib


ADLIB

A language-aware ASR (Automatic Speech Recognition) benchmark framework for Japanese.

Existing Japanese ASR benchmarks typically apply English-oriented evaluation frameworks directly, ignoring language-specific challenges such as notation variance, code-switching, and the simultaneous use of four writing systems (kanji, hiragana, katakana, and the Latin alphabet). ADLIB addresses these issues with minimal normalization and structured handling of orthographic variation.

Domains

| Domain | Description | Status |
|---|---|---|
| devterm | Software development terminology | v1.0.0 |

Quick Start

1. Install

git clone https://github.com/holotherapper/adlib
cd adlib
pip install .

2. Download Dataset

Audio files are hosted on HuggingFace.

pip install huggingface-hub
hf download holotherapper/adlib-devterm --repo-type dataset --local-dir domains/devterm/dataset/ --include "data/audio/*"

3. Prepare Predictions

Create a JSONL file with id and text fields:

{"id": "dev-0000", "text": "Dockerのコンテナをデプロイした。"}
{"id": "dev-0001", "text": "Next.jsでApp Routerを使う。"}
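
As a minimal sketch, a predictions file in this shape could be produced from Python as shown below. The `write_predictions` helper and the sample transcripts are illustrative assumptions, not part of ADLIB; only the `id` and `text` field names come from the format above.

```python
import json
from pathlib import Path


def write_predictions(predictions, path):
    """Write (id, text) pairs as one JSON object per line (hypothetical helper)."""
    with Path(path).open("w", encoding="utf-8") as f:
        for case_id, text in predictions:
            # ensure_ascii=False keeps Japanese text readable instead of \u escapes.
            f.write(json.dumps({"id": case_id, "text": text}, ensure_ascii=False) + "\n")


# Hypothetical transcripts; in practice these come from your ASR model.
example = [
    ("dev-0000", "Dockerのコンテナをデプロイした。"),
    ("dev-0001", "Next.jsでApp Routerを使う。"),
]
write_predictions(example, "predictions.jsonl")
```

Writing the model output without any cleanup is deliberate here, since ADLIB evaluates punctuation, case, and character width as-is.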

4. Run Evaluation

adlib predictions.jsonl --test-cases domains/devterm/dataset/test_cases.jsonl --model-name "your-model"

Metrics

CER (Character Error Rate)

Character-level error rate between reference text and model output. Lower is better.

  • Micro-averaged (pooled across all test cases)
  • Flexible term substitution applied before computation
  • Bootstrap 95% CI (B=10,000, seed=42)
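
The following sketch illustrates what micro-averaging and a bootstrap confidence interval mean for CER. It is not ADLIB's implementation: the function names are hypothetical, and the choice of a percentile bootstrap that resamples whole test cases is an assumption. Only B=10,000 and seed=42 come from the list above.

```python
import random


def edit_distance(ref, hyp):
    """Character-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def micro_cer(pairs):
    """Pool edits and reference lengths over all (reference, hypothesis) pairs."""
    errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    ref_chars = sum(len(ref) for ref, _ in pairs)
    return errors / ref_chars


def bootstrap_ci(pairs, n_resamples=10_000, seed=42, alpha=0.05):
    """Percentile bootstrap CI for micro-averaged CER (assumed resampling scheme)."""
    rng = random.Random(seed)
    # Precompute per-case (errors, reference length) so resampling is cheap.
    stats = [(edit_distance(ref, hyp), len(ref)) for ref, hyp in pairs]
    samples = []
    for _ in range(n_resamples):
        draw = [stats[rng.randrange(len(stats))] for _ in stats]
        samples.append(sum(e for e, _ in draw) / sum(n for _, n in draw))
    samples.sort()
    low = samples[int((alpha / 2) * n_resamples)]
    high = samples[int((1 - alpha / 2) * n_resamples) - 1]
    return low, high


# Hypothetical reference/hypothesis pairs for illustration.
pairs = [
    ("Dockerのコンテナ", "Dockerのコンテナ"),
    ("Next.jsを使う", "Nextjsを使う"),
]
cer = micro_cer(pairs)
low, high = bootstrap_ci(pairs, n_resamples=1000)
```

Pooling means long utterances weigh more than short ones, in contrast to averaging per-utterance CER values.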

Term Accuracy

Whether technical terms appear correctly in the model output. Higher is better.

| Type | Description | Examples |
|---|---|---|
| exact | English form only | Docker, useEffect, package.json |
| flexible | English or katakana | deploy/デプロイ, component/コンポーネント |
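
The two term types can be sketched as follows. The term dictionary layout, the substring matching, and the pooling of terms across test cases are all assumptions made for illustration; ADLIB's actual data format and matching rules may differ.

```python
def term_correct(output, term):
    """Return True if any accepted surface form of the term appears in the output.

    `term` is a hypothetical dict: {"type": "exact" | "flexible", "forms": [...]}.
    """
    # For "exact" terms, forms holds only the English spelling;
    # for "flexible" terms it also lists the katakana alternative(s).
    # Matching is case-sensitive, since case is not normalized.
    return any(form in output for form in term["forms"])


def term_accuracy(cases):
    """Fraction of expected terms found, pooled over all (output, terms) cases."""
    results = [term_correct(output, term) for output, terms in cases for term in terms]
    return sum(results) / len(results)


# Hypothetical term definitions and model outputs.
docker = {"type": "exact", "forms": ["Docker"]}
deploy = {"type": "flexible", "forms": ["deploy", "デプロイ"]}
cases = [
    ("Dockerのコンテナをデプロイした。", [docker, deploy]),
    ("ドッカーのコンテナをdeployした。", [docker, deploy]),
]
accuracy = term_accuracy(cases)
```

In the second output, the katakana rendering ドッカー fails the exact term Docker, while deploy still satisfies the flexible term.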

Composite Score

Composite = 0.4 × (1 - CER) + 0.6 × Term Accuracy
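
As a worked example of the formula, a CER of 0.10 and a term accuracy of 0.80 give 0.4 × 0.90 + 0.6 × 0.80 = 0.84. The helper below is a hypothetical sketch with made-up input values. It applies the formula literally and does not clamp CER, which can exceed 1 when the output contains many insertions.

```python
def composite_score(cer, term_accuracy):
    """Composite = 0.4 * (1 - CER) + 0.6 * Term Accuracy."""
    return 0.4 * (1.0 - cer) + 0.6 * term_accuracy


# Hypothetical values for illustration only.
score = composite_score(cer=0.10, term_accuracy=0.80)
```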

Normalization Rules

Normalized:

  • NFC normalization (canonical composition)
  • Newline removal
  • Substitution of flexible term alternatives (applied before CER computation only)

NOT normalized (evaluated as-is):

  • Punctuation (、。!?)
  • Whitespace presence/position
  • Letter case (A vs a)
  • Fullwidth/halfwidth distinction
  • Long vowel mark presence (サーバー vs サーバ)
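
The rules above can be sketched with the standard library as follows. Both function names are hypothetical. Treating `\r` as part of a newline and rewriting flexible alternatives to whichever form the reference uses are assumptions about how the rules might be implemented.

```python
import unicodedata


def normalize(text):
    """Apply NFC composition and strip newlines; leave everything else untouched."""
    text = unicodedata.normalize("NFC", text)
    return text.replace("\r", "").replace("\n", "")


def align_flexible_terms(reference, hypothesis, alternatives):
    """Rewrite accepted alternatives in the hypothesis to the reference's form.

    `alternatives` is a hypothetical list of interchangeable forms per term,
    e.g. [["deploy", "デプロイ"]].
    """
    for forms in alternatives:
        ref_form = next((f for f in forms if f in reference), None)
        if ref_form is None:
            continue  # the reference does not contain this term
        for form in forms:
            if form != ref_form:
                hypothesis = hypothesis.replace(form, ref_form)
    return hypothesis


# NFC composes a base kana plus combining dakuten into one code point.
composed = normalize("\u30c6\u3099\n")
# Fullwidth "A" (U+FF21) is untouched: NFC does not fold compatibility forms.
fullwidth = normalize("\uff21")
aligned = align_flexible_terms(
    "コンテナをデプロイした", "コンテナをdeployした", [["deploy", "デプロイ"]]
)
```

NFC rather than NFKC matters here: NFKC would fold fullwidth and halfwidth characters together, which contradicts the list of distinctions that are evaluated as-is.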

Adding New Domains

Add a new directory under domains/ with config/ and dataset/. The evaluation engine (adlib/) is domain-agnostic.

domains/
├── devterm/     ← software development (current)
├── medical/     ← medical terminology (future)
└── general/     ← general Japanese (future)
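
A new domain skeleton could be created with a few lines of Python. The `scaffold_domain` helper is hypothetical; only the `config/` and `dataset/` subdirectory names come from the description above, and the files each directory must contain are not covered here. The example runs in a temporary directory so it does not touch a real checkout.

```python
import tempfile
from pathlib import Path


def scaffold_domain(name, root="domains"):
    """Create <root>/<name>/config and <root>/<name>/dataset if missing."""
    domain_dir = Path(root) / name
    for sub in ("config", "dataset"):
        (domain_dir / sub).mkdir(parents=True, exist_ok=True)
    return domain_dir


# Demonstrate in a throwaway directory; pass root="domains" inside the repository.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_domain("medical", root=Path(tmp) / "domains")
    created_subdirs = sorted(p.name for p in created.iterdir())
```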

Terms can be added to or removed from the term lists via community PRs.

Running Tests

python -m pytest tests/ -v

License

  • Code (adlib/, tests/): Apache License 2.0
  • Data (domains/): CC BY-NC-SA 4.0

For commercial use of the data, please contact us for a separate license.
