--------- ATTENTION --------- This repo is a work in progress!
The Model Score is a proposal for an open-source rating system that evaluates LLMs (large language models) against a variety of criteria. It aims to provide accurate quantitative measures for comparing different language models and assessing their capabilities. It also aims to help with the legal compliance of an AI model, based on regulatory frameworks such as the EU AI Act.
Contributions to improve the Model Score are welcome! Commit your ideas to the Model Score repository on GitHub.
Here, we are currently collecting ideas on which factors of an LLM to evaluate. Once this list is complete, each evaluation point needs to be examined in more detail.
Capability | Range | Type |
---|---|---|
Natural Language Understanding | 0 to 100 | Score (%) |
Code Generation | 0 to 100 | Score (%) |
...and more | 0 to 100 | Score (%) |

Category | Range | Type |
---|---|---|
Reasoning | 0 to 100 | Score (%) |
Logic | 0 to 100 | Score (%) |
Math | 0 to 100 | Score (%) |
...more | 0 to 100 | Score (%) |

Consideration | Example Score | Type |
---|---|---|
Bias | 9 | Score (0 to 10) |
Transparency | 8 | Score (0 to 10) |

Regulation | Compliance Score (%) |
---|---|
EU AI Act | 60 |

Aspect | Rating (1-10) |
---|---|
Documentation | 7 |
Compatibility | 6 |

Aspect | Rating (1-10) |
---|---|
Language Support | 8 |
Domain Adaptation | 7 |

Specification | Rating (1-10) |
---|---|
Model Size | 6 |
Compute Efficiency | 7 |

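The tables above mix 0-100 percentage scores and 1-10 ratings. One open question is how these could be combined into a single Model Score. The sketch below is an illustrative assumption only: the repository has not yet defined an aggregation method, and the `model_score` function, its equal-weight default, and the example values (taken from the tables above) are all hypothetical.

```python
# Hypothetical aggregation sketch for the Model Score.
# Assumption: each category is normalized to 0-100 and combined
# as a weighted average. This is NOT an official formula.

def normalize(value, scale_max):
    """Map a raw rating onto a 0-100 scale."""
    return value / scale_max * 100


def model_score(ratings, weights=None):
    """Weighted average of normalized category scores (0-100).

    ratings: dict of category -> (value, scale_max)
    weights: dict of category -> weight; defaults to equal weights.
    """
    if weights is None:
        weights = {name: 1.0 for name in ratings}
    total_weight = sum(weights[name] for name in ratings)
    weighted = sum(
        weights[name] * normalize(value, scale_max)
        for name, (value, scale_max) in ratings.items()
    )
    return weighted / total_weight


# Example with illustrative values from the tables above:
score = model_score({
    "Bias": (9, 10),            # -> 90
    "Transparency": (8, 10),    # -> 80
    "EU AI Act": (60, 100),     # -> 60
    "Documentation": (7, 10),   # -> 70
})
print(round(score, 1))  # -> 75.0
```

Per-category weights would let the final score reflect priorities such as regulatory compliance more heavily; how to choose them is one of the points still to be examined.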
The Model Score rating system is proposed by Localmind and open to contributions.