## OpenCompass Installation

- Init OpenCompass environment
```bash
conda create -p /root/autodl-tmp/open-compass-env python=3.10 -y
conda init  # first-time setup only; restart the shell afterwards so activation works
conda activate /root/autodl-tmp/open-compass-env
```

- Install OpenCompass
```bash
cd /root/autodl-tmp  # later steps expect the repository at /root/autodl-tmp/opencompass
git clone https://github.com/open-compass/opencompass opencompass
cd opencompass
pip install -e .
```
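After the editable install, a quick check from Python can confirm that the package is visible in the active environment. This is a minimal sketch using only the standard library; the helper name `installed_version` is illustrative and not part of OpenCompass.

```python
import importlib.util
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it cannot be imported."""
    if importlib.util.find_spec(package) is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata was found for this name.
        return "unknown"


if __name__ == "__main__":
    print("opencompass:", installed_version("opencompass"))
```

If this prints `None`, the environment created above is probably not the active one.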

## OpenCompass Dataset Preparation

- Download the dataset
```bash
cd /root/autodl-tmp/opencompass
wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip
unzip OpenCompassData-core-20240207.zip
```
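Before unzipping, it can help to see what the archive will create in the repository root. The sketch below lists the top-level entries of a zip file with the standard `zipfile` module. The helper name `top_level_entries` is illustrative, and the expectation that this archive unpacks into a `data/` folder is an assumption to verify, not something stated above.

```python
import zipfile
from pathlib import Path, PurePosixPath


def top_level_entries(zip_path):
    """Return the sorted top-level file and folder names inside a zip archive."""
    with zipfile.ZipFile(zip_path) as zf:
        return sorted(
            {PurePosixPath(name).parts[0] for name in zf.namelist() if name.strip("/")}
        )


if __name__ == "__main__":
    archive = Path("OpenCompassData-core-20240207.zip")
    if archive.exists():
        # Assumption: a single `data` entry is expected here.
        print(top_level_entries(archive))
```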

## Run OpenCompass

```bash
cd /root/autodl-tmp/opencompass

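# Option 1: evaluate a predefined model config through the CLI entry point
opencompass --models hf_internlm2_5_1_8b_chat --datasets demo_gsm8k_chat_gen

# Option 2: evaluate a Hugging Face model by path through run.py
python run.py --datasets demo_gsm8k_chat_gen demo_math_chat_gen \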
    --hf-type chat \
    --hf-path internlm/internlm2-chat-1_8b \
    --debug

# Optional flags that can be appended to the run.py command:
#    --max-out-len 1024 \
#    --min-out-len 1 \
#    --hf-num-gpus 2 \
#    --generation-kwargs do_sample=True temperature=0.6 \
#    --stop-words '<|im_end|>' '<|im_start|>' \
```

The results are saved in the `outputs/` directory. A run prints log output like the following:

```text
02/02 18:48:21 - OpenCompass - INFO - Partitioned into 2 tasks.
Parameter 'function'=<function OpenICLEvalTask._score.<locals>.postprocess at 0x7f8a846cec20> of the transform datasets.arrow_dataset.Dataset._map_single couldn't be hashed properly, a random hash was used instead. Make sure your transforms and parameters are serializable with pickle or dill for the dataset fingerprinting and caching to work. If you reuse this transform, the caching mechanism will consider it to be different from the previous calls and recompute everything. This warning is only showed once. Subsequent hashing failures won't be showed.
Map: 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 64/64 [00:00<00:00, 14663.80 examples/s]
02/02 18:48:23 - OpenCompass - INFO - Task [internlm2-chat-1_8b_hf/demo_gsm8k]: {'accuracy': 31.25}
02/02 18:48:24 - OpenCompass - INFO - Task [internlm2-chat-1_8b_hf/demo_math]: {'accuracy': 14.0625}
dataset     version    metric    mode      internlm2-chat-1_8b_hf
----------  ---------  --------  ------  ------------------------
demo_gsm8k  1d7fe4     accuracy  gen                        31.25
demo_math   393424     accuracy  gen                        14.06
02/02 18:48:24 - OpenCompass - INFO - write summary to /root/autodl-tmp/opencompass/outputs/default/20250202_184237/summary/summary_20250202_184237.txt
02/02 18:48:24 - OpenCompass - INFO - write csv to /root/autodl-tmp/opencompass/outputs/default/20250202_184237/summary/summary_20250202_184237.csv


The markdown format results is as below:

| dataset | version | metric | mode | internlm2-chat-1_8b_hf |
|----- | ----- | ----- | ----- | -----|
| demo_gsm8k | 1d7fe4 | accuracy | gen | 31.25 |
| demo_math | 393424 | accuracy | gen | 14.06 |

02/02 18:48:24 - OpenCompass - INFO - write markdown summary to /root/autodl-tmp/opencompass/outputs/default/20250202_184237/summary/summary_20250202_184237.md
```
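The summary paths in the log follow the pattern `outputs/default/<timestamp>/summary/summary_<timestamp>.csv`. Here is a minimal sketch for loading the most recent one. It assumes the CSV has the same columns as the table printed above (`dataset`, `version`, `metric`, `mode`, then one column per model). The helper names are illustrative and not part of OpenCompass.

```python
import csv
from pathlib import Path


def latest_summary_csv(outputs_root):
    """Return the newest summary CSV under outputs_root, or None if there is none."""
    # Run folders are named YYYYMMDD_HHMMSS, so lexicographic order is chronological.
    candidates = sorted(Path(outputs_root).glob("*/summary/summary_*.csv"))
    return candidates[-1] if candidates else None


def load_scores(csv_path):
    """Read a summary CSV into a list of dicts keyed by the header row."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


if __name__ == "__main__":
    path = latest_summary_csv("outputs/default")
    if path is not None:
        for row in load_scores(path):
            print(row)
```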

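```python
# Convert a .json file (already using the new field names) into a .jsonl file
import json
import os


def convert_json_to_jsonl(json_file, jsonl_file):
    # Expects the JSON file to contain a top-level list of records
    with open(json_file, 'r', encoding='utf-8') as f:
        data = json.load(f)
    # Write one JSON object per line
    with open(jsonl_file, 'w', encoding='utf-8') as f:
        for d in data:
            f.write(json.dumps(d) + '\n')
```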
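The comment above mentions new field names, but the function writes each record unchanged. If the renaming still has to happen during conversion, one way to do it is sketched below. The mapping is hypothetical (the actual old and new field names are not given here), and `rename_fields` and `convert_with_rename` are illustrative helpers.

```python
import json


def rename_fields(record, mapping):
    """Return a copy of `record` with keys renamed per `mapping`; other keys are kept."""
    return {mapping.get(key, key): value for key, value in record.items()}


def convert_with_rename(json_file, jsonl_file, mapping):
    """Convert a JSON list of records to JSONL, renaming fields along the way."""
    with open(json_file, "r", encoding="utf-8") as f:
        data = json.load(f)
    with open(jsonl_file, "w", encoding="utf-8") as f:
        for record in data:
            f.write(json.dumps(rename_fields(record, mapping)) + "\n")


# Hypothetical mapping from old to new field names; replace with the real ones.
EXAMPLE_MAPPING = {"q": "question", "a": "answer"}
```

A call would look like `convert_with_rename("input.json", "output.jsonl", EXAMPLE_MAPPING)`, where both file names are placeholders.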