GitHub - shruti-singh/modelcards: Dataset of Model Cards for Language Models.

Unlocking LLM Insights: A Dataset for Automatic Model Card Generation

Language models (LMs) are no longer restricted to the ML community, and instruction-following LMs have led to a rise in autonomous AI agents. As the accessibility of LMs grows, it is imperative that an understanding of their capabilities, intended usage, and development cycle also improves. Model cards are a widespread practice for documenting detailed information about an ML model. To automate model card generation, we introduce a dataset of 500 question-answer pairs for 25 LMs that cover crucial aspects of the model, such as its training configurations, datasets, biases, architecture details, and training resources. We employ annotators to extract the answers from the original paper. Further, we explore the capabilities of LMs in generating model cards by answering questions. We experiment with three configurations: zero-shot generation, retrieval-augmented generation, and fine-tuning on our dataset. The fine-tuned Llama 3 model shows an improvement of 7 points over the retrieval-augmented generation setup. This indicates that our dataset can be used to train models to automatically generate model cards from paper text and reduce the human effort in the model card curation process.
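To make the retrieval-augmented generation setup above more concrete, here is a minimal sketch of the retrieval step: the paper text is split into chunks, the chunks are ranked against a model card question, and the top ones are placed in a prompt. This is an illustration only, not the code in the qa directory. The token-overlap scoring, the chunk size, the prompt template, and all function names (`chunk_text`, `retrieve`, `build_prompt`) are assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_text(text, chunk_size=50):
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def overlap_score(question, chunk):
    """Count question tokens that also occur in the chunk (with multiplicity)."""
    q_counts = Counter(tokenize(question))
    c_counts = Counter(tokenize(chunk))
    return sum(min(count, c_counts[tok]) for tok, count in q_counts.items())


def retrieve(question, chunks, top_k=2):
    """Return the top_k chunks ranked by token overlap with the question.

    Hypothetical stand-in for a real retriever (e.g. a dense or BM25 retriever).
    """
    ranked = sorted(chunks, key=lambda ch: overlap_score(question, ch),
                    reverse=True)
    return ranked[:top_k]


def build_prompt(question, contexts):
    """Assemble a simple prompt from retrieved contexts (assumed template)."""
    context_block = "\n\n".join(contexts)
    return f"Context:\n{context_block}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    paper_chunks = [
        "The model was trained on 64 GPUs for two weeks.",
        "We thank the reviewers for their feedback.",
        "The training data is a web crawl corpus.",
    ]
    question = "What hardware was the model trained on?"
    print(build_prompt(question, retrieve(question, paper_chunks, top_k=1)))
```

The resulting prompt would then be passed to a language model. In the zero-shot configuration, the question is asked without any retrieved context.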

Directory Organization

The qa directory contains the code to reproduce the zero-shot, retrieval-augmented generation (rag), and supervised fine-tuning (sft) results.
The evals directory contains the code to compute the evaluation metrics.

The data directory contains the model card dataset. The train and test splits are in the data/modelcards directory.

