# LM-eval-harness

In this notebook, we compute the unnormalized and byte-level normalized likelihoods of moral actions. We treat the task as multiple choice, with the following input: Norm + Context + Intention.
We integrate our datasets into the lm-evaluation-harness framework to ensure compatibility with other multiple-choice benchmarks: https://github.com/EleutherAI/lm-evaluation-harness.
We conduct these experiments on French and English datasets, using the Mistral and CroissantLLM base models.
Note that any model supported by lm-evaluation-harness can be used: https://github.com/EleutherAI/lm-evaluation-harness#model-apis-and-inference-servers.
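
To make the two scores concrete, here is a minimal sketch of how an unnormalized and a byte-level normalized choice can be derived from per-option log-likelihoods. The function name `pick_choice`, the example options, and the log-likelihood values are assumptions made for illustration; they are not taken from the harness or from our results.

```python
def pick_choice(loglikelihoods, choices):
    """Return (raw_pick, normalized_pick) as indices into `choices`."""
    # Byte length of each candidate continuation (UTF-8), used as the normalizer.
    byte_lengths = [len(c.encode("utf-8")) for c in choices]
    indices = range(len(choices))
    # Unnormalized: take the option with the highest total log-likelihood.
    raw_pick = max(indices, key=lambda i: loglikelihoods[i])
    # Normalized: divide each log-likelihood by the option's byte length first.
    norm_pick = max(indices, key=lambda i: loglikelihoods[i] / byte_lengths[i])
    return raw_pick, norm_pick


# Hypothetical options and scores, for illustration only.
choices = ["He apologizes.", "He walks away without saying anything to anyone."]
logliks = [-12.0, -20.0]
raw, norm = pick_choice(logliks, choices)
print(raw, norm)
```

In this toy case the shorter option wins on the raw score, while the longer option wins once the score is divided by its length, which is why both variants are worth reporting.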

The Mistral models were developed by the Mistral AI team. More information can be found in the model card: https://huggingface.co/mistralai/Mistral-7B-v0.1.
Information about CroissantLLMBase can also be found in its model card: https://huggingface.co/croissantllm/CroissantLLMBase.

The Mistral and CroissantLLM models have 7B and 1.3B parameters, respectively.
We chose these models because of their good performance on FrenchBench and English benchmarks. These results are reported in Tables 3 and 5 of the CroissantLLM paper: https://arxiv.org/pdf/2402.00786.

We report the results in Section 6.1.2 and in Table 4, "Results for moral action choice on HISTOIRESMORALES and MORALSTORIES".

We provide the task code to be integrated into the framework.
We ran these experiments on a single A100 GPU. Running this notebook takes around 20 minutes.
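
As a rough illustration of what the task code has to do, the sketch below turns one story into a multiple-choice item with the Norm + Context + Intention input. The field names (`norm`, `situation`, `intention`, `moral_action`, `immoral_action`), the function `build_item`, the fixed position of the moral action, and the example record are all assumptions for illustration; the task files shipped with this notebook are the reference.

```python
def build_item(story):
    """Turn one story record into a two-way multiple-choice item."""
    # Input shown to the model: Norm + Context + Intention.
    query = " ".join([story["norm"], story["situation"], story["intention"]])
    # Candidate continuations; index 0 is the moral action in this sketch.
    choices = [story["moral_action"], story["immoral_action"]]
    return {"query": query, "choices": choices, "gold": 0}


# Hypothetical record, written for illustration only.
example = {
    "norm": "It is good to return lost property.",
    "situation": "Sam finds a wallet on the bus.",
    "intention": "Sam wants to deal with the wallet.",
    "moral_action": "Sam hands the wallet to the driver.",
    "immoral_action": "Sam keeps the cash from the wallet.",
}
item = build_item(example)
print(item["query"])
print(item["choices"])
```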

In [None]:
!git clone https://github.com/EleutherAI/lm-evaluation-harness
%cd lm-evaluation-harness
!pip install -e .

In [None]:
!pip install -q omegaconf  # required for logging examples

In [None]:
%cd ..
!ls
!mkdir ./lm-evaluation-harness/lm_eval/tasks/moral_stories
!mv ./lm-eval-harness-moral-stories/* ./lm-evaluation-harness/lm_eval/tasks/moral_stories

In [7]:
%cd lm-evaluation-harness

/content/lm-evaluation-harness


In [None]:
# !huggingface-cli login

In [None]:
!mkdir -p croissant mistralai  # names match the --output_path values used below

# CroissantLLMBase Evaluation on MoralStories

In [None]:
!lm_eval --model hf --model_args pretrained="croissantllm/CroissantLLMBase" --tasks moral_stories --device cuda:0 --batch_size 8 --log_samples --output_path "./croissant"

2024-05-18 15:48:43.048463: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-05-18 15:48:43.048522: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-05-18 15:48:43.050117: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-05-18:15:48:47,684 INFO     [__main__.py:254] Verbosity set to INFO
2024-05-18:15:49:02,560 INFO     [__main__.py:341] Selected Tasks: ['moral_stories']
2024-05-18:15:49:02,565 INFO     [evaluator.py:141] Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
2024-05-18:15:49:02,565 INFO     [evaluator.py:178] Initializing 

# Mistral-7B Evaluation on MoralStories

In [None]:
!lm_eval --model hf --model_args pretrained="mistralai/Mistral-7B-v0.1" --tasks moral_stories --device cuda:0 --batch_size 8 --log_samples --output_path "./mistralai"

2024-05-18 15:53:59.541901: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-05-18 15:53:59.541952: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-05-18 15:53:59.543471: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-05-18:15:54:04,302 INFO     [__main__.py:254] Verbosity set to INFO
2024-05-18:15:54:18,929 INFO     [__main__.py:341] Selected Tasks: ['moral_stories']
2024-05-18:15:54:18,933 INFO     [evaluator.py:141] Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
2024-05-18:15:54:18,933 INFO     [evaluator.py:178] Initializing 

# Mistral-7B Evaluation on HistoiresMorales

In [None]:
!lm_eval --model hf --model_args pretrained="mistralai/Mistral-7B-v0.1" --tasks moral_stories_fr --device cuda:0 --batch_size 8 --log_samples --output_path "./mistralai"

2024-05-18 15:59:48.467489: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-05-18 15:59:48.467545: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-05-18 15:59:48.469195: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-05-18:15:59:53,107 INFO     [__main__.py:254] Verbosity set to INFO
2024-05-18:16:00:08,096 INFO     [__main__.py:341] Selected Tasks: ['moral_stories_fr']
2024-05-18:16:00:08,102 INFO     [evaluator.py:141] Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
2024-05-18:16:00:08,102 INFO     [evaluator.py:178] Initializi

# CroissantLLMBase Evaluation on HistoiresMorales

In [None]:
!lm_eval --model hf --model_args pretrained="croissantllm/CroissantLLMBase" --tasks moral_stories_fr --device cuda:0 --batch_size 8 --log_samples --output_path "./croissant"

2024-05-18 15:39:26.913910: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-05-18 15:39:26.913961: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-05-18 15:39:26.915681: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-05-18:15:39:31,560 INFO     [__main__.py:254] Verbosity set to INFO
2024-05-18:15:39:45,756 INFO     [__main__.py:341] Selected Tasks: ['moral_stories_fr']
2024-05-18:15:39:45,761 INFO     [evaluator.py:141] Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
2024-05-18:15:39:45,761 INFO     [evaluator.py:178] Initializi
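
# Collecting the results

Once the runs finish, the scores can be gathered from the output directories. The helper below is a minimal sketch: it assumes the harness writes one or more `results*.json` files somewhere under `--output_path`, each with a top-level `results` mapping keyed by task name. The exact file layout and metric key names depend on the installed harness version, so treat both as assumptions; `load_metrics` is a hypothetical helper, not part of lm-eval.

```python
import json
from pathlib import Path


def load_metrics(output_dir, task):
    """Collect the metrics of `task` from every results*.json under output_dir."""
    root = Path(output_dir)
    if not root.is_dir():
        return []
    rows = []
    for path in sorted(root.rglob("results*.json")):
        with open(path, encoding="utf-8") as f:
            payload = json.load(f)
        # Assumed layout: {"results": {<task name>: {<metric name>: value, ...}}}
        task_metrics = payload.get("results", {}).get(task)
        if task_metrics is not None:
            rows.append((str(path), task_metrics))
    return rows


if __name__ == "__main__":
    for out_dir in ("./croissant", "./mistralai"):
        for task in ("moral_stories", "moral_stories_fr"):
            for path, metrics in load_metrics(out_dir, task):
                print(out_dir, task, metrics)
```

The function returns an empty list when a directory or task is missing, so it can be run before all four evaluations have completed.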