Alignment on modelcard metadata specification #39

Merged: 3 commits into main on May 21, 2021
Conversation

@LysandreJik (Member) commented on May 6, 2021

Hi all, opening this PR so that we can all align on the metadata spec for model cards. This metadata is important as it bridges tasks, datasets, and metrics for a given checkpoint. It will eventually allow programmatic analysis and handling of model cards' metadata on the hub.

The metadata was drafted during the collaboration with papers-with-code in order to have a ranked leaderboard for the XLSR sprint. The following format was adopted:

model-index:
- name: {model_id}
  results:                      # one entry per evaluated task/dataset pair
  - task:
      name: {task_name}
      type: {task_type}
    dataset:
      name: {dataset_name}
      type: {dataset_type}
      args: {arg_0}
    metrics:                    # one or more metrics for this task/dataset
      - name: {metric_name}
        type: {metric_type}
        value: {metric_value}
        args: {arg_0}

This format should allow for multiple tasks and multiple metrics within each task.
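
As an illustration, here is what a hypothetical filled-in entry with one task and two metrics could look like (the model name, dataset, and metric values below are made up):

model-index:
- name: my-org/wav2vec2-large-xlsr-53-demo   # hypothetical model id
  results:
  - task:
      name: Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Common Voice zh-CN
      type: common_voice
      args: zh-CN
    metrics:
      - name: Test WER
        type: wer
        value: 28.5     # made-up value
      - name: Test CER
        type: cer
        value: 9.2      # made-up value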

Existing examples of model cards using this format include those uploaded during the XLSR sprint, such as ydshieh/wav2vec2-large-xlsr-53-chinese-zh-cn-gpt.

Looking forward to your feedback.

Transformers: @sgugger @patrickvonplaten
Datasets: @lhoestq
AutoNLP: @abhi1thakur @SBrandeis
Evaluation: @lewtun

@Pierrci @julien-c @thomwolf

@lhoestq (Member) commented on May 6, 2021

Cool, thanks!

One note regarding metrics: they might need args as well.
For example, model-based metrics such as BLEURT, BERTScore, or COMET may require a model_name argument.
Another example is BLEU, which has a max_order parameter (the maximum n-gram order to use when computing the BLEU score), as well as smoothing parameters and the tokenizer to use.
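
For illustration, here is one possible way such metric args could be expressed in this format (the exact field shape and values are just a sketch, not part of the adopted spec):

metrics:
  - name: BLEU
    type: bleu
    value: 34.1            # made-up value
    args:
      max_order: 4         # max n-gram order used when computing BLEU
      smooth: true         # whether smoothing is applied
  - name: BERTScore
    type: bertscore
    value: 0.89            # made-up value
    args:
      model_name: roberta-large   # model backing the metric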

@julien-c (Member) commented on May 6, 2021

Very excited about this, and let's also ping the Paperswithcode team here for validation?

@LysandreJik (Member, Author)

Absolutely! Pinged them offline for a review.

@jspisak commented on May 6, 2021

validation and integration? :)

@rstojnic commented on May 6, 2021

Thanks for looping us in! Yeah the format looks good. Since we already have the integration for wav2vec2 running, it should be pretty easy from our side to extend it to any result on any benchmark.

A couple of things to consider:

  1. To make sure we are using the same task/dataset names, I think it would be great to ask users to check the task and dataset names on https://paperswithcode.com/sota and https://paperswithcode.com/datasets. This would ensure all results land on the correct leaderboard on Papers with Code, and that the data remains interoperable with anyone else wanting to use it.
  2. It would be great to have an automated badge or similar that links from the model page to the Papers with Code leaderboard. We could create a URL that is easy to construct from the metadata, e.g. for the example above it could be https://paperswithcode.com/sota/?task=Speech Recognition&dataset=Common Voice zh-CN, which would redirect you to the correct leaderboard. Alternatively, there could be a small embeddable graph or just a badge with the ranking.
  3. For us to be able to efficiently track this metadata, it would be useful to have an API endpoint where we can access all the latest model card changes, i.e. something similar to what we are using in the current integration (https://huggingface.co/api/models), but with the ability to order by last changed.

@lewtun (Member) commented on May 11, 2021

One question re the front-end: can we use this schema to enable sorting by metric value on the Hub?

For example, it would be cool to help users answer questions like "Which model achieves the highest metric value X on dataset Y?" I realise this overlaps with PWC's leaderboards (example), but still think there's value in providing this kind of overview to Hub users.

@julien-c (Member)

@lewtun At some point we might want to display some sort of leaderboard-lite on the hf.co hub, but for now I feel like our main goal is on the data side, i.e. to ensure that as many models as possible contain the correct metadata in a format that's easily validated and leveraged by tools, including Paperswithcode.

@LysandreJik (Member, Author)

I'll add an args field to the metrics, as mentioned by @lhoestq, and merge by Friday if no one is opposed to that change.

@LysandreJik mentioned this pull request on May 19, 2021
@lhoestq (Member) commented on May 20, 2021

One question regarding dataset versioning: is this also something we want to include in the dataset card? IMO this would be nice for reproducibility.
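
As an illustration only: if dataset versions were recorded in this metadata, one hypothetical shape could be an extra field on the dataset entry (not part of the adopted spec):

dataset:
  name: Common Voice zh-CN
  type: common_voice
  args: zh-CN
  version: 6.1    # hypothetical field for the dataset version / revision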

@julien-c (Member)

@rstojnic Regarding your point 3, you'll have this here when #41 is merged (or you're welcome to just call the underlying API if it's simpler; its params should be mostly stable now).

@LysandreJik merged commit 2828953 into main on May 21, 2021
@LysandreJik deleted the modelcard-spec branch on May 21, 2021 at 07:54