iEvals: iKala's Evaluator for Large Language Models

iEvals is a framework for evaluating Chinese large language models (LLMs), with a focus on performance in the Traditional Chinese domain. Our goal is to provide an easy-to-set-up, fast evaluation library that helps guide the use and improvement of existing Chinese LLMs.

Currently, we only support evaluation on TMMLU+. In the future, we plan to explore more domains, such as knowledge-intensive datasets (CMMLU, C-Eval) as well as context retrieval and multi-turn conversation datasets.

Installation

pip install git+https://github.com/ikala-corp/ievals.git

Usage

ieval <model name> <series: optional> --top_k <number of in-context examples>

For more details, please refer to the models section.
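
To illustrate what the `--top_k` option controls, the snippet below sketches how a few-shot prompt might be assembled from `top_k` in-context examples. This is only a minimal illustration, not the iEvals implementation: the function names `format_question` and `build_few_shot_prompt`, the four-option A-D layout, and the dictionary field names are all assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical item layout (an assumption, not the iEvals data format):
# {"question": str, "A": str, "B": str, "C": str, "D": str, "answer": str}
Item = Dict[str, str]


def format_question(item: Item, include_answer: bool) -> str:
    """Render one multiple-choice item; the answer is shown only for solved examples."""
    lines = [item["question"]]
    for label in ("A", "B", "C", "D"):
        lines.append(f"{label}. {item[label]}")
    lines.append(f"Answer: {item['answer']}" if include_answer else "Answer:")
    return "\n".join(lines)


def build_few_shot_prompt(dev_items: List[Item], test_item: Item, top_k: int) -> str:
    """Prepend the first `top_k` solved examples to the unsolved test question."""
    blocks = [format_question(ex, include_answer=True) for ex in dev_items[:top_k]]
    blocks.append(format_question(test_item, include_answer=False))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Toy data, invented purely for this illustration.
    dev = [
        {"question": "1 + 1 = ?", "A": "1", "B": "2", "C": "3", "D": "4", "answer": "B"},
        {"question": "2 + 3 = ?", "A": "5", "B": "6", "C": "7", "D": "8", "answer": "A"},
    ]
    test = {"question": "4 + 4 = ?", "A": "6", "B": "7", "C": "8", "D": "9", "answer": "C"}
    print(build_few_shot_prompt(dev, test, top_k=2))
```

With `top_k` set to 0, the prompt in this sketch contains only the test question (zero-shot); larger values add more solved examples ahead of it.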

Coming soon

  • Chain-of-Thought (CoT) with few-shot prompting

  • arXiv paper: detailed analysis of model interior and exterior relations

  • More tasks

Citation

@article{ikala2023eval,
  title={An Improved Traditional Chinese Evaluation Suite for Foundation Model},
  author={Tam, Zhi-Rui and Pai, Ya-Ting},
  journal={arXiv},
  year={2023}
}

Disclaimer

This is not an officially supported iKala product.

This research code is provided "as-is" to the broader research community. iKala does not promise to maintain or otherwise support this code in any way.

About

Official GitHub repository for TMMLU+: large-scale Traditional Chinese massive multitask language understanding
