Hi guys,
I want to evaluate models like ModernBERT, Llama, and many others on SuperGLUE and on a benchmark of my own. In my setting, every model has to be fine-tuned on the specific task before evaluation, even the decoder models.
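For context, here is roughly what I mean, sketched with plain transformers on SuperGLUE's BoolQ rather than with LightEval (the checkpoint, hyperparameters, and accuracy metric are just illustrative choices):

```python
# Illustrative sketch: fine-tune an encoder on one SuperGLUE task (BoolQ),
# then evaluate on the validation split. Not a LightEval API.
import numpy as np
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorWithPadding,
    Trainer,
    TrainingArguments,
)

model_name = "answerdotai/ModernBERT-base"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

dataset = load_dataset("super_glue", "boolq")

def tokenize(batch):
    # BoolQ is a (question, passage) -> yes/no classification task
    return tokenizer(batch["question"], batch["passage"], truncation=True)

dataset = dataset.map(tokenize, batched=True)

def accuracy(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": (preds == labels).mean()}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="boolq-finetune", num_train_epochs=3),
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    data_collator=DataCollatorWithPadding(tokenizer),
    compute_metrics=accuracy,
)
trainer.train()
print(trainer.evaluate())
```

I would want this fine-tune-then-evaluate loop for every model/task pair, including decoder models (e.g. via a classification head or generation-based scoring).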
Is this currently supported by LightEval? From reading the code, my impression is that evaluations are done only by prompting (zero- or few-shot), with no fine-tuning step.
Thanks.