Skip to content

csisc/BoolV

Repository files navigation

BoolV

A method to evaluate the response of lightweight LLMs to TRUE-FALSE questions

Source Code for the Data Visualization: https://github.com/csisc/BoolV-Analysis.

To Cite the Work: Turki, H., Dossou, B. F. P., Nebli, A., & Valdelli, I. (2025). Evaluating the Behavior of Small Language Models in Answering Binary Questions. In 3rd International Workshop on Generalizing from Limited Resources in the Open World (GLOW@IJCAI 2025).

Models

Model Hyperparameters
llama-3.2-1b-instruct-q8_0 1.24 B
llama-3.2-3b-instruct-q8_0 3.21 B
Phi-3.5-mini-instruct.Q8_0 3.82 B
Mistral-7B-Instruct-v0.3.Q8_0 7.25 B
llama-3.2-8b-instruct-q8_0 8.03 B

Dataset

Dependencies

  • llama-cpp-python
  • pathlib
  • pandas
  • math
  • jsonlines

Funding

This research work has been done thanks to the computer resources of Wikimedia Switzerland.

About

A method to evaluate the response of lightweight LLMs to TRUE-FALSE questions

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages