Model Score is an open-source rating system designed to evaluate LLMs (large language models) based on a variety of criteria.

morgendigital/model-score

⭐ Model Score

ATTENTION: This repo is a work in progress!

Model Score is a proposal for an open-source rating system that evaluates LLMs (large language models) against a variety of criteria. It aims to provide accurate quantitative measures for comparing language models and assessing their capabilities. It also aims to help assess the legal compliance of an AI model under regulatory frameworks such as the EU AI Act.

Contributions that improve the Model Score are welcome! Commit your ideas to the Model Score repository on GitHub.

Evaluation Criteria

We are currently collecting ideas on which aspects of an LLM to evaluate. Once this list is complete, each evaluation point will need to be examined in more detail.

Scoring Model

Capabilities

| Capability | Metric | Type |
| --- | --- | --- |
| Natural Language Understanding | 0 to 100 | Score (%) |
| Code Generation | 0 to 100 | Score (%) |
| ...and more | 0 to 100 | Score (%) |

Performance

| Category | Metric | Type |
| --- | --- | --- |
| Reasoning | 0 to 100 | Score (%) |
| Logic | 0 to 100 | Score (%) |
| Math | 0 to 100 | Score (%) |
| ...more | 0 to 100 | Score (%) |

Performance (Crowd Preference)

| Category | Metric | Type |
| --- | --- | --- |
| Reasoning | 0 to 100 | Score (%) |
| Logic | 0 to 100 | Score (%) |
| Math | 0 to 100 | Score (%) |
| ...more | 0 to 100 | Score (%) |

Ethical Considerations

| Consideration | Metric | Type |
| --- | --- | --- |
| Bias | 9 | Score (0 to 10) |
| Transparency | 8 | Score (0 to 10) |

Legal Compliance

| Regulation | Compliance Score (%) |
| --- | --- |
| EU AI Act | 60 |

Usability

| Aspect | Rating (1-10) |
| --- | --- |
| Documentation | 7 |
| Compatibility | 6 |

Adaptability

| Aspect | Rating (1-10) |
| --- | --- |
| Language Support | 8 |
| Domain Adaptation | 7 |

Technical Specifications

| Specification | Rating (1-10) |
| --- | --- |
| Model Size | 6 |
| Compute Efficiency | 7 |
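
The tables in the Scoring Model use three different scales: percentages (0 to 100), scores from 0 to 10, and ratings from 1 to 10. To compare rows across sections, the values would first need to share one scale. The following Python sketch shows one possible way to do that. It is not part of the proposal: the `Metric` class, the linear normalization to the range 0 to 1, and the unweighted per-section average in `section_averages` are all assumptions made for illustration, and the example values are the placeholder numbers from the tables above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """One table row plus the scale it is reported on (hypothetical type)."""
    section: str
    name: str
    value: float
    lo: float  # lowest value on the row's scale
    hi: float  # highest value on the row's scale

    def normalized(self) -> float:
        """Map the value linearly onto [0, 1]; reject out-of-range values."""
        if not self.lo <= self.value <= self.hi:
            raise ValueError(
                f"{self.name}: {self.value} is outside [{self.lo}, {self.hi}]"
            )
        return (self.value - self.lo) / (self.hi - self.lo)


def section_averages(rows):
    """Unweighted mean of normalized values per section.

    Equal weighting is an assumption; the proposal does not define
    how rows or sections should be combined.
    """
    grouped = {}
    for m in rows:
        grouped.setdefault(m.section, []).append(m.normalized())
    return {section: sum(vals) / len(vals) for section, vals in grouped.items()}


# Placeholder values copied from the example tables.
metrics = [
    Metric("Ethical Considerations", "Bias", 9, 0, 10),
    Metric("Ethical Considerations", "Transparency", 8, 0, 10),
    Metric("Legal Compliance", "EU AI Act", 60, 0, 100),
    Metric("Usability", "Documentation", 7, 1, 10),
    Metric("Usability", "Compatibility", 6, 1, 10),
]

if __name__ == "__main__":
    for section, avg in section_averages(metrics).items():
        print(f"{section}: {avg:.2f}")
```

Note that a rating of 1 on a 1-10 scale normalizes to 0 here, while a score of 1 on a 0-10 scale normalizes to 0.1. Whether that difference is intended is one of the points the proposal would still need to settle.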

Honorable Mentions & Credits

The Model Score rating system is proposed by Localmind and open to contributions.
