# AskQE Pipeline - Qwen2.5-0.5B-Instruct

This notebook runs the complete AskQE pipeline using the **Qwen/Qwen2.5-0.5B-Instruct** model.

## 1. Question Generation (QG)

Generate questions for each prompt variant: vanilla, atomic, and semantic.

In [56]:
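# Install the model dependencies into the kernel's own interpreter (sys.executable).
# Note: later cells also import nltk, textstat, and comet, which this command does not install.
import subprocess
import sys
subprocess.run([sys.executable, '-m', 'pip', 'install', 'transformers', 'torch', 'accelerate'], check=True)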

CompletedProcess(args=['c:\\Users\\andos\\AppData\\Local\\Programs\\Python\\Python310\\python.exe', '-m', 'pip', 'install', 'transformers', 'torch', 'accelerate'], returncode=0)
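The install command above covers only `transformers`, `torch`, and `accelerate`, while other cells in this notebook fail on `nltk`, `textstat`, and `comet`. A minimal sketch for checking which modules the current interpreter can find is shown here. The module list is taken from the tracebacks in this notebook; the pip names are assumptions (in particular, `comet` is published on PyPI as `unbabel-comet`), and `missing_packages` is a hypothetical helper, not part of the AskQE code.

```python
import importlib.util

# Import names come from the tracebacks in this notebook; the pip names on
# the right are assumptions (`comet` is distributed as `unbabel-comet`).
REQUIRED = {
    "transformers": "transformers",
    "torch": "torch",
    "accelerate": "accelerate",
    "nltk": "nltk",
    "textstat": "textstat",
    "comet": "unbabel-comet",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of modules this interpreter cannot find."""
    return sorted(
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    )


print("Missing:", missing_packages())
```

Anything reported as missing could then be installed with the same `sys.executable -m pip install` pattern used in the cell above, so that the packages land in the interpreter the kernel actually runs.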

In [36]:
# Print the current working directory (on Windows, `cd` with no arguments shows it)
!cd

c:\Users\andos\DNLP-Project\askqe\evaluation\desiderata


In [None]:
# Run QG for vanilla prompt
import os
import subprocess
import sys

# Later cells use relative paths, so switch to the QG code directory first
os.chdir(r'C:\Users\andos\DNLP-Project\askqe\QG\code')

# Use sys.executable so the script runs with the kernel's interpreter and its installed packages
result = subprocess.run(
    [sys.executable, '-u', 'qwen-0.5b.py',
     '--output_path', '../qwen-0.5b/vanilla_qwen-0.5b.jsonl',
     '--prompt', 'vanilla'],
    capture_output=True, text=True,
)
print(result.stdout)
if result.stderr:
    print("STDERR:", result.stderr)
if result.returncode != 0:
    print(f"Exit code: {result.returncode}")

In [2]:
# Run QG for atomic prompt
!python -u qwen-0.5b.py --output_path ../qwen-0.5b/atomic_qwen-0.5b.jsonl --prompt atomic

Traceback (most recent call last):
  File "c:\Users\andos\DNLP-Project\askqe\QG\code\qwen-0.5b.py", line 1, in <module>
    from transformers import AutoTokenizer, AutoModelForCausalLM
ModuleNotFoundError: No module named 'transformers'


In [3]:
# Run QG for semantic prompt
!python -u qwen-0.5b.py --output_path ../qwen-0.5b/semantic_qwen-0.5b.jsonl --prompt semantic

Traceback (most recent call last):
  File "c:\Users\andos\DNLP-Project\askqe\QG\code\qwen-0.5b.py", line 1, in <module>
    from transformers import AutoTokenizer, AutoModelForCausalLM
ModuleNotFoundError: No module named 'transformers'
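The atomic and semantic cells fail with `ModuleNotFoundError`. The output does not say why. Two plausible explanations are that the packages had not been installed yet when these cells ran, or that `!python` resolves to a different interpreter than the one the kernel uses. The sketch below guards against the second case by always launching scripts with `sys.executable`, the same approach the vanilla cell takes. `run_script` is a hypothetical helper introduced for illustration.

```python
import subprocess
import sys


def run_script(script, *args, cwd=None):
    """Run a Python script with the kernel's interpreter and echo its output."""
    cmd = [sys.executable, "-u", str(script), *map(str, args)]
    result = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
    print(result.stdout, end="")
    if result.stderr:
        print("STDERR:", result.stderr, end="")
    if result.returncode != 0:
        print(f"Exit code: {result.returncode}")
    return result


# Hypothetical usage, mirroring the atomic cell:
# run_script("qwen-0.5b.py",
#            "--output_path", "../qwen-0.5b/atomic_qwen-0.5b.jsonl",
#            "--prompt", "atomic")
```

Because the helper returns the `CompletedProcess` object, a later cell can check `result.returncode` before moving on to the QA stage.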


## 2. Question Answering (QA)

Answer the generated questions twice: once using the source sentence as context, and once using the backtranslated MT output.

In [4]:
# Run QA based on source sentences
%cd ../../QA/code
!python -u qwen-0.5b.py --output_path ../qwen-0.5b/source --sentence_type source

c:\Users\andos\DNLP-Project\askqe\QA\code


Traceback (most recent call last):
  File "c:\Users\andos\DNLP-Project\askqe\QA\code\qwen-0.5b.py", line 1, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'


In [5]:
# Run QA based on backtranslated MT
!python -u qwen-0.5b.py --output_path ../qwen-0.5b/bt --sentence_type bt

Traceback (most recent call last):
  File "c:\Users\andos\DNLP-Project\askqe\QA\code\qwen-0.5b.py", line 1, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'


## 3. BioMQM Pipeline

In [6]:
# Run BioMQM pipeline
%cd ../../biomqm/askqe
!python -u qwen-0.5b.py --output_path askqe_qg_qwen.jsonl --prompt atomic

c:\Users\andos\DNLP-Project\askqe\biomqm\askqe


Traceback (most recent call last):
  File "c:\Users\andos\DNLP-Project\askqe\biomqm\askqe\qwen-0.5b.py", line 1, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'


---

## 4. Evaluation Metrics

### 4.1 SBERT (Sentence-BERT Cosine Similarity)

In [7]:
# Run SBERT evaluation
%cd ../../evaluation/sbert
!python sbert.py --model qwen-0.5b --output_file qwen-0.5b.csv

c:\Users\andos\DNLP-Project\askqe\evaluation\sbert


Traceback (most recent call last):
  File "c:\Users\andos\DNLP-Project\askqe\evaluation\sbert\sbert.py", line 2, in <module>
    import nltk
ModuleNotFoundError: No module named 'nltk'
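For orientation, the quantity named in the heading is the cosine similarity between two sentence embeddings. The sketch below computes it in plain Python on toy vectors. It is an illustration of the formula only, not the code in `sbert.py`, and the vectors stand in for real SBERT embeddings, which have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen in this sketch for zero vectors
    return dot / (norm_u * norm_v)


# Toy "embeddings": identical vectors score close to 1, orthogonal ones score 0.
print(cosine_similarity([1.0, 0.0, 1.0], [1.0, 0.0, 1.0]))
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))
```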


### 4.2 String Comparison (F1, EM, BLEU, chrF)

In [8]:
# Run String Comparison evaluation
# Note: edit the model path inside string_comparison.py so it points to qwen-0.5b, if needed
%cd ../string-comparison
!python string_comparison.py

c:\Users\andos\DNLP-Project\askqe\evaluation\string-comparison


Traceback (most recent call last):
  File "c:\Users\andos\DNLP-Project\askqe\evaluation\string-comparison\string_comparison.py", line 2, in <module>
    import nltk
ModuleNotFoundError: No module named 'nltk'
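Two of the metrics in the heading, exact match (EM) and token-level F1, are simple enough to sketch without external libraries. The version below uses lowercasing and whitespace tokenization, which is an assumption made for this example; `string_comparison.py` may normalize and tokenize differently (it imports `nltk`).

```python
from collections import Counter


def exact_match(prediction, reference):
    """1 if the two answers are identical after trimming and lowercasing, else 0."""
    return int(prediction.strip().lower() == reference.strip().lower())


def token_f1(prediction, reference):
    """Harmonic mean of token precision and recall over whitespace tokens."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both answers.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Yes ", "yes"))
print(token_f1("the cat sat", "the cat"))
```

In the AskQE setting, the two strings would be the answers to the same question obtained from the source sentence and from the backtranslation.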


---

## 5. Baseline Metrics (QE)

### 5.1 BT-Score (BERTScore on backtranslation)

In [9]:
# Run BT-Score
%cd ../bt-score
!python run_bt.py

c:\Users\andos\DNLP-Project\askqe\evaluation\bt-score


Traceback (most recent call last):
  File "c:\Users\andos\DNLP-Project\askqe\evaluation\bt-score\run_bt.py", line 3, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'


### 5.2 xCOMET-QE

In [10]:
# Run xCOMET-QE
%cd ../xcomet-qe
!python xcomet.py

c:\Users\andos\DNLP-Project\askqe\evaluation\xcomet-qe


Traceback (most recent call last):
  File "c:\Users\andos\DNLP-Project\askqe\evaluation\xcomet-qe\xcomet.py", line 1, in <module>
    from comet import download_model, load_from_checkpoint
ModuleNotFoundError: No module named 'comet'


---

## 6. Desiderata Evaluation (Question Quality)

### 6.1 Empty Questions Count

In [11]:
# Run Empty Questions evaluation
%cd ../desiderata
!python i_avg_questions.py

c:\Users\andos\DNLP-Project\askqe\evaluation\desiderata


Traceback (most recent call last):
  File "c:\Users\andos\DNLP-Project\askqe\evaluation\desiderata\i_avg_questions.py", line 16, in <module>
    with open(jsonl_file, "r", encoding="utf-8") as file:
FileNotFoundError: [Errno 2] No such file or directory: '../QG/gemma-9b/atomic_gemma-9b.jsonl'


File:  ../QG/gemma-9b/atomic_gemma-9b.jsonl
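The traceback shows the script looking for `../QG/gemma-9b/atomic_gemma-9b.jsonl`, whereas this notebook writes its QG output under a `qwen-0.5b` directory with names such as `atomic_qwen-0.5b.jsonl`. A minimal sketch for building and checking the expected path per model and prompt follows. `qg_output_path` is a hypothetical helper; the naming pattern is inferred from the paths that appear in this notebook, and the `qg_root` default simply copies the relative prefix from the traceback, so it may need adjusting to wherever the QG outputs actually live.

```python
from pathlib import Path


def qg_output_path(model, prompt, qg_root="../QG"):
    """Build a path like <qg_root>/<model>/<prompt>_<model>.jsonl.

    The naming pattern is an assumption inferred from paths in this notebook.
    """
    return Path(qg_root) / model / f"{prompt}_{model}.jsonl"


path = qg_output_path("qwen-0.5b", "atomic")
print(path.as_posix(), "exists" if path.exists() else "is missing")
```

Checking `path.exists()` before opening the file gives a clearer message than the raw `FileNotFoundError` and makes it obvious which model name the evaluation is pointed at.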


### 6.2 Duplicate Questions

In [12]:
# Run Duplicate Questions evaluation
!python i_duplicate.py

File:  ../QG/gemma-9b/atomic_gemma-9b.jsonl


Traceback (most recent call last):
  File "c:\Users\andos\DNLP-Project\askqe\evaluation\desiderata\i_duplicate.py", line 15, in <module>
    with open(jsonl_file, "r", encoding="utf-8") as file:
FileNotFoundError: [Errno 2] No such file or directory: '../QG/gemma-9b/atomic_gemma-9b.jsonl'
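As a rough illustration of what a duplicate-question check can look like, the sketch below counts questions that repeat an earlier one after lowercasing and collapsing whitespace. This is an assumed definition of "duplicate" chosen for the example; `i_duplicate.py` may use a different criterion, and `count_duplicates` is a hypothetical name.

```python
def count_duplicates(questions):
    """Count questions that repeat an earlier one after light normalization."""
    seen = set()
    duplicates = 0
    for question in questions:
        # Normalize case and whitespace so trivial variations still match.
        key = " ".join(question.lower().split())
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates


sample = ["What is X?", "what is  x?", "Who did it?"]
print(count_duplicates(sample))
```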


### 6.3 Diversity

In [13]:
# Run Diversity evaluation
!python i_diversity.py

Traceback (most recent call last):
  File "c:\Users\andos\DNLP-Project\askqe\evaluation\desiderata\i_diversity.py", line 2, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'


### 6.4 Answerability

In [14]:
# Run Answerability evaluation
!python q_answerability.py

Traceback (most recent call last):
  File "c:\Users\andos\DNLP-Project\askqe\evaluation\desiderata\q_answerability.py", line 1, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'


### 6.5 Readability

In [15]:
# Run Readability evaluation
!python q_readability.py

Traceback (most recent call last):
  File "c:\Users\andos\DNLP-Project\askqe\evaluation\desiderata\q_readability.py", line 2, in <module>
    import textstat
ModuleNotFoundError: No module named 'textstat'
