<a href="https://colab.research.google.com/github/Reemaalt/Detection-of-Hallucination-in-Arabic/blob/main/Semantic_Entropy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!huggingface-cli login


    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|

    To log in, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Enter your token (input will not be visible): 
Add token as git credential? (Y/n) N
Token is valid (permission: fineGrained).
The token `week1 test` has been saved to /root/.cache/huggingface/stored_tokens
Your token has been saved to /root/.cache/huggingface/token
Login successful.
The current active token is: `week1 t

In [2]:
# Semantic Entropy code Based on the original implementation by the new githup
import json
import os
import pickle
import random
from functools import lru_cache
from tqdm import tqdm
import zipfile
import numpy as np
import torch
import torch.nn.functional as F
from transformers import ElectraTokenizerFast, ElectraForSequenceClassification, AutoModelForCausalLM, AutoTokenizer
from google.colab import files
from tqdm.notebook import tqdm
import os
import json
from google.colab import files
# Set up device and logging
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {DEVICE}")


Using device: cuda


In [3]:
# Optimize TensorFlow and PyTorch operations
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "1"
os.environ["TF_GPU_THREAD_MODE"] = "gpu_private"
os.environ["TF_GPU_THREAD_COUNT"] = "4"

# For PyTorch
torch.backends.cudnn.benchmark = True
torch.set_float32_matmul_precision('high')

#EntailmentModel

In [4]:
#ues our trained fine-tunied model
#get the finetuned model from drive
from google.colab import drive
drive.mount('/content/drive')

zip_path = "/content/drive/My Drive/araelectra-nli-finetuned.zip"  # Adjust the path if needed
extract_path = "/content/araelectra"

# Extract the zip file
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    zip_ref.extractall(extract_path)

print("Extraction complete.")


Mounted at /content/drive
Extraction complete.


In [5]:
class ArabicEntailmentModel:
    """Arabic entailment checker using AraELECTRA model."""
    def __init__(self, model_path="/content/araelectra/araelectra-nli-finetuned"):
        """Initialize the model with better caching."""
        print("Loading AraELECTRA model for Arabic entailment checking...")
        self.tokenizer = ElectraTokenizerFast.from_pretrained(model_path)
        self.model = ElectraForSequenceClassification.from_pretrained(model_path)
        self.model = self.model.to(DEVICE)
        self.model.eval()  # Set to evaluation mode

        # More efficient cache implementation
        self.cache_file = "entailment_cache.pkl"
        self.cache = {}
        self._load_cache()

        # Add batch processing capabilities
        self.batch_size = 16
        print("AraELECTRA model loaded successfully")

    def _load_cache(self):
        try:
            if os.path.exists(self.cache_file):
                with open(self.cache_file, 'rb') as f:
                    self.cache = pickle.load(f)
                print(f"Loaded {len(self.cache)} cached entailment results")
        except Exception as e:
            print(f"Cache loading failed: {e}")
            self.cache = {}

    def _save_cache(self):
        try:
            with open(self.cache_file, 'wb') as f:
                pickle.dump(self.cache, f)
        except Exception as e:
            print(f"Cache saving failed: {e}")

    def check_implications_batch(self, text_pairs):
        """Process multiple text pairs in one batch."""
        results = []
        uncached_pairs = []
        uncached_indices = []

        # Check cache first
        for i, (text1, text2) in enumerate(text_pairs):
            cache_key = (text1, text2)
            if cache_key in self.cache:
                results.append(self.cache[cache_key])
            else:
                uncached_pairs.append((text1, text2))
                uncached_indices.append(i)

        # Process uncached pairs in batches
        if uncached_pairs:
            batch_results = []
            for i in range(0, len(uncached_pairs), self.batch_size):
                batch = uncached_pairs[i:i+self.batch_size]
                batch_inputs = []

                for text1, text2 in batch:
                    encoded = self.tokenizer(
                        text1,
                        text2,
                        padding=True,
                        truncation=True,
                        max_length=128,
                        return_tensors="pt"
                    )
                    batch_inputs.append({k: v.unsqueeze(0) for k, v in encoded.items()})

                # Concatenate all inputs into one batch tensor
                batch_tensors = {
                    k: torch.cat([inp[k] for inp in batch_inputs], dim=0).to(DEVICE)
                    for k in batch_inputs[0].keys()
                }

                # Process the batch
                with torch.no_grad():
                    outputs = self.model(**batch_tensors)
                    logits = outputs.logits
                    probs = F.softmax(logits, dim=1)
                    predictions = torch.argmax(probs, dim=1).tolist()

                # Convert to correct format and cache results
                for j, pred in enumerate(predictions):
                    text1, text2 = batch[j]
                    result_map = {0: 2, 1: 1, 2: 0}
                    result = result_map[pred]
                    self.cache[(text1, text2)] = result
                    batch_results.append(result)

            # Insert batch results back into the right positions
            for i, index in enumerate(uncached_indices):
                results.insert(index, batch_results[i])

            # Save cache after processing batch
            if len(self.cache) % 100 == 0:
                self._save_cache()

        return results

    def check_implication(self, text1, text2, example=None):
        """Check entailment for a single pair (backward compatible)."""
        cache_key = (text1, text2)
        if cache_key in self.cache:
            return self.cache[cache_key]

        # Prepare input
        inputs = self.tokenizer(
            text1,
            text2,
            padding=True,
            truncation=True,
            max_length=128,
            return_tensors="pt"
        )

        # Move inputs to device
        inputs = {k: v.to(DEVICE) for k, v in inputs.items()}

        # Get prediction
        with torch.no_grad():
            outputs = self.model(**inputs)
            logits = outputs.logits
            probs = F.softmax(logits, dim=1)
            predicted_class = torch.argmax(probs, dim=1).item()

        # Map prediction
        result_map = {0: 2, 1: 1, 2: 0}
        result = result_map[predicted_class]

        # Cache the result
        self.cache[cache_key] = result

        return result

# Clustering

#also from their github got this

```
def get_semantic_ids(strings_list, model, strict_entailment=False, example=None):
    """Group list of predictions into semantic meaning."""

    def are_equivalent(text1, text2):

        implication_1 = model.check_implication(text1, text2, example=example)
        implication_2 = model.check_implication(text2, text1, example=example)  # pylint: disable=arguments-out-of-order
        assert (implication_1 in [0, 1, 2]) and (implication_2 in [0, 1, 2])

        if strict_entailment:
            semantically_equivalent = (implication_1 == 2) and (implication_2 == 2)

        else:
            implications = [implication_1, implication_2]
            # Check if none of the implications are 0 (contradiction) and not both of them are neutral.
            semantically_equivalent = (0 not in implications) and ([1, 1] != implications)

        return semantically_equivalent

    # Initialise all ids with -1.
    semantic_set_ids = [-1] * len(strings_list)
    # Keep track of current id.
    next_id = 0
    for i, string1 in enumerate(strings_list):
        # Check if string1 already has an id assigned.
        if semantic_set_ids[i] == -1:
            # If string1 has not been assigned an id, assign it next_id.
            semantic_set_ids[i] = next_id
            for j in range(i+1, len(strings_list)):
                # Search through all remaining strings. If they are equivalent to string1, assign them the same id.
                if are_equivalent(string1, strings_list[j]):
                    semantic_set_ids[j] = next_id
            next_id += 1

    assert -1 not in semantic_set_ids

    return semantic_set_ids


def logsumexp_by_id(semantic_ids, log_likelihoods, agg='sum_normalized'):
    """Sum probabilities with the same semantic id.

    Log-Sum-Exp because input and output probabilities in log space.
    """
    unique_ids = sorted(list(set(semantic_ids)))
    assert unique_ids == list(range(len(unique_ids)))
    log_likelihood_per_semantic_id = []

    for uid in unique_ids:
        # Find positions in `semantic_ids` which belong to the active `uid`.
        id_indices = [pos for pos, x in enumerate(semantic_ids) if x == uid]
        # Gather log likelihoods at these indices.
        id_log_likelihoods = [log_likelihoods[i] for i in id_indices]
        if agg == 'sum_normalized':
            # log_lik_norm = id_log_likelihoods - np.prod(log_likelihoods)
            log_lik_norm = id_log_likelihoods - np.log(np.sum(np.exp(log_likelihoods)))
            logsumexp_value = np.log(np.sum(np.exp(log_lik_norm)))
        else:
            raise ValueError
        log_likelihood_per_semantic_id.append(logsumexp_value)

    return log_likelihood_per_semantic_id
```



In [6]:
def get_semantic_ids(strings_list, model, strict_entailment=False, example=None):

    # Group list of predictions into semantic meaning

    @lru_cache(maxsize=None)
    def are_equivalent(text1, text2):
        # Check if text1 entails text2
        implication_1 = model.check_implication(text1, text2, example=example)
        # Check if text2 entails text1
        implication_2 = model.check_implication(text2, text1, example=example)
        assert (implication_1 in [0, 1, 2]) and (implication_2 in [0, 1, 2])

        if strict_entailment:
            # Both must indicate entailment (2) for semantic equivalence
            semantically_equivalent = (implication_1 == 2) and (implication_2 == 2)
        else:
            implications = [implication_1, implication_2]
            # Check if none of the implications are 0 (contradiction) and not both of them are neutral.)
            semantically_equivalent = (0 not in implications) and ([1, 1] != implications)

        return semantically_equivalent

    # Initialize all ids with -1
    semantic_set_ids = [-1] * len(strings_list)
    # Keep track of current id
    next_id = 0

    for i, string1 in enumerate(strings_list):
        # Check if string1 already has an id assigned
        if semantic_set_ids[i] == -1:
            # If string1 has not been assigned an id, assign it next_id
            semantic_set_ids[i] = next_id
            for j in range(i+1, len(strings_list)):
                # Search through all remaining strings. If they are equivalent to string1, assign them the same id.
                if semantic_set_ids[j] == -1 and are_equivalent(string1, strings_list[j]):
                    semantic_set_ids[j] = next_id
            next_id += 1

    assert -1 not in semantic_set_ids

    return semantic_set_ids

def logsumexp_by_id(semantic_ids, log_likelihoods, agg='sum_normalized'):
    """
    Sum probabilities with the same semantic ID.
    Log-Sum-Exp because input and output probabilities in log space.
    """
    unique_ids = sorted(list(set(semantic_ids)))
    assert unique_ids == list(range(len(unique_ids)))
    log_likelihood_per_semantic_id = []

    for uid in unique_ids:
        # Find positions in `semantic_ids` which belong to the active `uid`
        id_indices = [pos for pos, x in enumerate(semantic_ids) if x == uid]
        # Gather log likelihoods at these indices
        id_log_likelihoods = [log_likelihoods[i] for i in id_indices]

        if agg == 'sum_normalized':
            # Normalize by subtracting the log sum exp of all log likelihoods
            # log_lik_norm = id_log_likelihoods - np.prod(log_likelihoods)
            log_lik_norm = id_log_likelihoods - np.log(np.sum(np.exp(log_likelihoods)))
            logsumexp_value = np.log(np.sum(np.exp(log_lik_norm)))
        else:
            raise ValueError(f"Unknown aggregation method: {agg}")

        log_likelihood_per_semantic_id.append(logsumexp_value)

    return log_likelihood_per_semantic_id

#entropy

#from their github i got this code

```

def predictive_entropy(log_probs):
    """Compute MC estimate of entropy.

    `E[-log p(x)] ~= -1/N sum_i log p(x_i)`, i.e. the average token likelihood.
    """

    entropy = -np.sum(log_probs) / len(log_probs)

    return entropy


def predictive_entropy_rao(log_probs):
    entropy = -np.sum(np.exp(log_probs) * log_probs)
    return entropy


def cluster_assignment_entropy(semantic_ids):
    """Estimate semantic uncertainty from how often different clusters get assigned.

    We estimate the categorical distribution over cluster assignments from the
    semantic ids. The uncertainty is then given by the entropy of that
    distribution. This estimate does not use token likelihoods, it relies soley
    on the cluster assignments. If probability mass is spread of between many
    clusters, entropy is larger. If probability mass is concentrated on a few
    clusters, entropy is small.

    Input:
        semantic_ids: List of semantic ids, e.g. [0, 1, 2, 1].
    Output:
        cluster_entropy: Entropy, e.g. (-p log p).sum() for p = [1/4, 2/4, 1/4].
    """

    n_generations = len(semantic_ids)
    counts = np.bincount(semantic_ids)
    probabilities = counts/n_generations
    assert np.isclose(probabilities.sum(), 1)
    entropy = - (probabilities * np.log(probabilities)).sum()
    return entropy
```



In [7]:
def predictive_entropy_rao(log_probs):
    """
    Compute entropy from log probabilities.

    Parameters:
    - log_probs: Log probabilities

    Returns:
    - Entropy value
    """
    entropy = -np.sum(np.exp(log_probs) * log_probs)
    return entropy

def predictive_entropy(log_probs):
    """Compute MC estimate of entropy.

    `E[-log p(x)] ~= -1/N sum_i log p(x_i)`, i.e. the average token likelihood.
    """

    entropy = -np.sum(log_probs) / len(log_probs)

    return entropy

# call function
The actual full computation in the original repo happens in compute_uncertainty_measures.py:
 This  code is where they call the functions to compute the entropy measures

```

if args.compute_predictive_entropy:
    # Token log likelihoods. Shape = (n_sample, n_tokens)
    if not args.use_all_generations:
        log_liks = [r[1] for r in full_responses[:args.use_num_generations]]
    else:
        log_liks = [r[1] for r in full_responses]

    for i in log_liks:
        assert i

    if args.compute_context_entails_response:
        # Compute context entails answer baseline.
        entropies['context_entails_response'].append(context_entails_response(
            context, responses, entailment_model))

    if args.condition_on_question and args.entailment_model == 'deberta':
        responses = [f'{question} {r}' for r in responses]

    # Compute semantic ids.
    semantic_ids = get_semantic_ids(
        responses, model=entailment_model,
        strict_entailment=args.strict_entailment, example=example)

    result_dict['semantic_ids'].append(semantic_ids)

    # Compute entropy from frequencies of cluster assignments.
    entropies['cluster_assignment_entropy'].append(cluster_assignment_entropy(semantic_ids))

    # Length normalization of generation probabilities.
    log_liks_agg = [np.mean(log_lik) for log_lik in log_liks]

    # Compute naive entropy.
    entropies['regular_entropy'].append(predictive_entropy(log_liks_agg))

    # Compute semantic entropy.
    log_likelihood_per_semantic_id = logsumexp_by_id(semantic_ids, log_liks_agg, agg='sum_normalized')
    pe = predictive_entropy_rao(log_likelihood_per_semantic_id)
    entropies['semantic_entropy'].append(pe)
```



In [8]:
def experiment_semantic_entropy(question, original_answer ,llm_model, llm_tokenizer, entailment_model, num_samples):
    """
    Compute semantic entropy for a given question.
    Returns:
        Dictionary containing results of the experiment
    """
    # Step 1: Generate multiple answers with their token log likelihoods
    #print(f"Generating {num_samples} answers for question: {question}")
    results = generate_answer(question, num_samples, llm_model, llm_tokenizer)

    # Step 2: Extract answers and their log likelihoods
    answers = [result['text'] for result in results]
    # length normalization of the log probabilities
    log_likelihoods = [np.mean(result['token_log_probs']) for result in results]

   # print(f"Generated {len(answers)} answers")

    # Step 3: Create an example dictionary for entailment checking
    example = {'question': question}

    # Step 4: Compute semantic clusters
   # print("Computing semantic clusters...")
    semantic_ids = get_semantic_ids(answers, entailment_model, strict_entailment=False, example=example)
    unique_clusters = len(set(semantic_ids))
   # print(f"Found {unique_clusters} semantic clusters")

    # Step 5: Calculate entropy measures
    # naive_entropy calculation (based on log likelihoods only)
    naive_entropy = predictive_entropy(log_likelihoods)

    # Semantic entropy calculation (based on semantic clusters and log likelihoods)
    log_likelihood_per_semantic_id = logsumexp_by_id(semantic_ids, log_likelihoods, agg='sum_normalized')
    semantic_entropy = predictive_entropy_rao(log_likelihood_per_semantic_id)

    # Step 6: Print results
    print(f"\nEntropy Analysis for: '{question}'")
    print(f"Generated {len(answers)} answers in {unique_clusters} semantic clusters")
    #print the entropy values with 4 decimal places
    print(f" naive Entropy: {naive_entropy:.4f}")
    print(f"Semantic Entropy: {semantic_entropy:.4f}")


    # Step 7: Display cluster information
    #print("\nSemantic Clusters:")
    unique_clusters_list = sorted(list(set(semantic_ids)))
    for cluster_id in unique_clusters_list:
        cluster_items = [answers[i] for i, sid in enumerate(semantic_ids) if sid == cluster_id]
        count = len(cluster_items)
        '''
        print(f"\nCluster {cluster_id} ({count} items):")
        for item in cluster_items:
            print(f"  - {item}")
        '''
    # Format answers in the  structure
    formatted_answers = []
    for cluster_id in unique_clusters_list:
        cluster_answers = [answers[i] for i, sid in enumerate(semantic_ids) if sid == cluster_id]
        formatted_answers.append(cluster_answers)

    # Return  results
    # number like 2.220446049250313e-16 is extremely close to zero (basically floating-point noise).
    # so rounds the number to 4 decimal places
  # Return results with details and formatted answers
    return {
        'question': question,
        'original_answer': original_answer,  # Include the original answer from the dataset
        'answers': formatted_answers,  # Answers formatted
        'log_likelihoods': log_likelihoods,
        'semantic_ids': semantic_ids,
        'naive_entropy': round(naive_entropy, 4),
        'semantic_entropy': round(semantic_entropy, 4),
        'num_clusters': unique_clusters,
    }

#get likelihood

#from their code



```
#at this function
def predict(self, input_data, temperature, return_full=False):


        # Get log_likelihoods.
        # outputs.scores are the logits for the generated token.
        # outputs.scores is a tuple of len = n_generated_tokens.
        # Each entry is shape (bs, vocabulary size).
        # outputs.sequences is the sequence of all tokens: input and generated.
        transition_scores = self.model.compute_transition_scores(
            outputs.sequences, outputs.scores, normalize_logits=True)
        # Transition_scores[0] only contains the scores for the first generated tokens.

        log_likelihoods = [score.item() for score in transition_scores[0]]
        if len(log_likelihoods) == 1:
            logging.warning('Taking first and only generation for log likelihood!')
            log_likelihoods = log_likelihoods
        else:
            log_likelihoods = log_likelihoods[:n_generated]

        if len(log_likelihoods) == self.max_new_tokens:
            logging.warning('Generation interrupted by max_token limit.')

        if len(log_likelihoods) == 0:
            raise ValueError

        return sliced_answer, log_likelihoods, last_token_embedding


all function if needed
f predict(self, input_data, temperature, return_full=False):

        # Implement prediction.
        inputs = self.tokenizer(input_data, return_tensors="pt").to("cuda")

        if 'llama' in self.model_name.lower() or 'falcon' in self.model_name or 'mistral' in self.model_name.lower():
            if 'token_type_ids' in inputs:  # Some HF models have changed.
                del inputs['token_type_ids']
            pad_token_id = self.tokenizer.eos_token_id
        else:
            pad_token_id = None

        if self.stop_sequences is not None:
            stopping_criteria = StoppingCriteriaList([StoppingCriteriaSub(
                stops=self.stop_sequences,
                initial_length=len(inputs['input_ids'][0]),
                tokenizer=self.tokenizer)])
        else:
            stopping_criteria = None

        logging.debug('temperature: %f', temperature)
        with torch.no_grad():
            outputs = self.model.generate(
                **inputs,
                max_new_tokens=self.max_new_tokens,
                return_dict_in_generate=True,
                output_scores=True,
                output_hidden_states=True,
                temperature=temperature,
                do_sample=True,
                stopping_criteria=stopping_criteria,
                pad_token_id=pad_token_id,
            )

        if len(outputs.sequences[0]) > self.token_limit:
            raise ValueError(
                'Generation exceeding token limit %d > %d',
                len(outputs.sequences[0]), self.token_limit)

        full_answer = self.tokenizer.decode(
            outputs.sequences[0], skip_special_tokens=True)

        if return_full:
            return full_answer

        # For some models, we need to remove the input_data from the answer.
        if full_answer.startswith(input_data):
            input_data_offset = len(input_data)
        else:
            raise ValueError('Have not tested this in a while.')

        # Remove input from answer.
        answer = full_answer[input_data_offset:]

        # Remove stop_words from answer.
        stop_at = len(answer)
        sliced_answer = answer
        if self.stop_sequences is not None:
            for stop in self.stop_sequences:
                if answer.endswith(stop):
                    stop_at = len(answer) - len(stop)
                    sliced_answer = answer[:stop_at]
                    break
            if not all([stop not in sliced_answer for stop in self.stop_sequences]):
                error_msg = 'Error: Stop words not removed successfully!'
                error_msg += f'Answer: >{answer}< '
                error_msg += f'Sliced Answer: >{sliced_answer}<'
                if 'falcon' not in self.model_name.lower():
                    raise ValueError(error_msg)
                else:
                    logging.error(error_msg)

        # Remove whitespaces from answer (in particular from beginning.)
        sliced_answer = sliced_answer.strip()

        # Get the number of tokens until the stop word comes up.
        # Note: Indexing with `stop_at` already excludes the stop_token.
        # Note: It's important we do this with full answer, since there might be
        # non-trivial interactions between the input_data and generated part
        # in tokenization (particularly around whitespaces.)
        token_stop_index = self.tokenizer(full_answer[:input_data_offset + stop_at], return_tensors="pt")['input_ids'].shape[1]
        n_input_token = len(inputs['input_ids'][0])
        n_generated = token_stop_index - n_input_token

        if n_generated == 0:
            logging.warning('Only stop_words were generated. For likelihoods and embeddings, taking stop word instead.')
            n_generated = 1

        # Get the last hidden state (last layer) and the last token's embedding of the answer.
        # Note: We do not want this to be the stop token.

        # outputs.hidden_state is a tuple of len = n_generated_tokens.
        # The first hidden state is for the input tokens and is of shape
        #     (n_layers) x (batch_size, input_size, hidden_size).
        # (Note this includes the first generated token!)
        # The remaining hidden states are for the remaining generated tokens and is of shape
        #    (n_layers) x (batch_size, 1, hidden_size).

        # Note: The output embeddings have the shape (batch_size, generated_length, hidden_size).
        # We do not get embeddings for input_data! We thus subtract the n_tokens_in_input from
        # token_stop_index to arrive at the right output.

        if 'decoder_hidden_states' in outputs.keys():
            hidden = outputs.decoder_hidden_states
        else:
            hidden = outputs.hidden_states

        if len(hidden) == 1:
            logging.warning(
                'Taking first and only generation for hidden! '
                'n_generated: %d, n_input_token: %d, token_stop_index %d, '
                'last_token: %s, generation was: %s',
                n_generated, n_input_token, token_stop_index,
                self.tokenizer.decode(outputs['sequences'][0][-1]),
                full_answer,
                )
            last_input = hidden[0]
        elif ((n_generated - 1) >= len(hidden)):
            # If access idx is larger/equal.
            logging.error(
                'Taking last state because n_generated is too large'
                'n_generated: %d, n_input_token: %d, token_stop_index %d, '
                'last_token: %s, generation was: %s, slice_answer: %s',
                n_generated, n_input_token, token_stop_index,
                self.tokenizer.decode(outputs['sequences'][0][-1]),
                full_answer, sliced_answer
                )
            last_input = hidden[-1]
        else:
            last_input = hidden[n_generated - 1]

        # Then access last layer for input
        last_layer = last_input[-1]
        # Then access last token in input.
        last_token_embedding = last_layer[:, -1, :].cpu()

        # Get log_likelihoods.
        # outputs.scores are the logits for the generated token.
        # outputs.scores is a tuple of len = n_generated_tokens.
        # Each entry is shape (bs, vocabulary size).
        # outputs.sequences is the sequence of all tokens: input and generated.
        transition_scores = self.model.compute_transition_scores(
            outputs.sequences, outputs.scores, normalize_logits=True)
        # Transition_scores[0] only contains the scores for the first generated tokens.

        log_likelihoods = [score.item() for score in transition_scores[0]]
        if len(log_likelihoods) == 1:
            logging.warning('Taking first and only generation for log likelihood!')
            log_likelihoods = log_likelihoods
        else:
            log_likelihoods = log_likelihoods[:n_generated]

        if len(log_likelihoods) == self.max_new_tokens:
            logging.warning('Generation interrupted by max_token limit.')

        if len(log_likelihoods) == 0:
            raise ValueError

        return sliced_answer, log_likelihoods, last_token_embedding

```



##log probabilities vs negative log likelihoods:

- Log probabilities are the logarithm of the probability: log(p)
- Negative log likelihoods are the negative logarithm of the probability: -log(p)

The original code is working with log probabilities, not negative log likelihoods. We see normalize_logits=True in compute_transition_scores, it means the model is returning log probabilities.


**The original code is using predictive_entropy_rao, which expects log probabilities as input.

In [9]:
def generate_answer(question, num_samples, model, tokenizer):
    """
    Generate multiple answers to a question using LLM with direct token log likelihoods.
    """
    # Create the prompt with the question
    prompt = f"أجب على السؤال التالي بجملة واحدة فقط موجزة ولكن كاملة باللغة العربية\nQuestion: {question}\nAnswer:"

    results = []

    for _ in range(num_samples):
        # Tokenize the prompt
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        prompt_length = inputs.input_ids.shape[1]  # Number of tokens in the prompt

        # Generate with return_dict_in_generate=True and output_scores=True to get scores
        with torch.no_grad():
            outputs = model.generate(
                inputs.input_ids,
                max_new_tokens=100,
                do_sample=True,
                temperature=0.5,
                return_dict_in_generate=True,
                output_scores=True,
            )

        # Calculate token log probabilities using compute_transition_scores
        # normalize_logits=True ensures we get log probabilities
        transition_scores = model.compute_transition_scores(
            outputs.sequences,
            outputs.scores,
            normalize_logits=True
        )

        # Extract log likelihoods like they did exactly, but no handeling of cases
        log_likelihoods = [score.item() for score in transition_scores[0]]


        # Decode the generated text
        generated_text = tokenizer.decode(outputs.sequences[0], skip_special_tokens=True)
        answer = generated_text.split("Answer:")[-1].strip()

        # Clean the output
        strings_to_filter_on = ['.', '\n', 'Q:', 'A:', 'question:', 'answer:', 'Question:', 'Answer:',
                               'Questions:', 'questions:', 'QUESTION:', 'ANSWER:']
        for string in strings_to_filter_on:
            if string in answer:
                answer = answer.split(string)[0].strip()

        results.append({
            'text': answer,
            'token_log_probs': log_likelihoods,  # Store raw log probabilities

        })

    return results # Return results AFTER the loop completes

# Main

In [10]:
def load_qa_dataset(dataset_name, file_path=None):
    data = []

    try:
        if dataset_name == 'arabicaqa' and file_path:
            if os.path.exists(file_path):  # Use the loaded file
                print("Using ArabicaQA")
                with open(file_path, 'r', encoding='utf-8') as f:
                    custom_data = json.load(f)
                    for idx, item in enumerate(custom_data):
                        data.append({
                            "question_id": idx,
                            "Question": item["question"],
                            "Answer": item["answer"]
                        })
            else:
                raise FileNotFoundError(f"ArabicaQA file not found at {file_path}")

        elif dataset_name == 'xor_tydiqa' and file_path:
            if os.path.exists(file_path):  # Use the loaded file
                print("Using XOR-TyDiQA")
                print("Filtering Arabic QA pairs from XOR-TyDi...")

                # Load the jsonl dataset
                with open(file_path, 'r', encoding='utf-8') as f:
                    custom_data = [json.loads(line) for line in f]  # Handling jsonl

                # Filter Arabic samples ("lang" = "ar")
                arabic_data = []
                for idx, item in enumerate(custom_data):
                    if item["lang"] == "ar":  # Arabic language code
                        arabic_data.append({
                            "question_id": item["id"],
                            "Question": item["question"],
                            "Answer": item["answers"][0]  # First answer in list
                        })

                data.extend(arabic_data)
            else:
                raise FileNotFoundError(f"XOR-TyDiQA file not found at {file_path}")
        else:
            raise ValueError(f"Unsupported dataset: {dataset_name}")

    except Exception as e:
        print(f"Error loading {dataset_name}: {str(e)}")
        return []

    return data

In [None]:
# Function to safely load existing partial results
def load_partial_results(partial_file_path):
    if os.path.exists(partial_file_path):
        with open(partial_file_path, 'r', encoding='utf-8') as f:
            saved_results = json.load(f)
        print(f"✓ Loaded {len(saved_results)} saved results from checkpoint.")
        return saved_results
    else:
        print("✓ No previous checkpoint found. Starting fresh.")
        return []


# Main
if __name__ == "__main__":
    MODEL_CHOICE = "jais"  # Options: "llama", "allam", "jais", "qwen"
    START_FROM_QUESTION = 2202

    print("Loading language model...")

    if MODEL_CHOICE == "llama":
        model_id = "meta-llama/Llama-3.1-8B-Instruct"
        llm_model = AutoModelForCausalLM.from_pretrained(
            model_id,
            torch_dtype=torch.float16,
            device_map="auto"
        )
        llm_tokenizer = AutoTokenizer.from_pretrained(model_id)
        print(f"Loaded Llama 3.1 model successfully")

    elif MODEL_CHOICE == "allam":
        model_id = "ALLaM-AI/ALLaM-7B-Instruct-preview"
        llm_tokenizer = AutoTokenizer.from_pretrained(model_id)
        llm_model = AutoModelForCausalLM.from_pretrained(
            model_id,
            torch_dtype=torch.float16,
            device_map="auto"
        )
        print(f"Loaded ALLaM model successfully")

    elif MODEL_CHOICE == "jais":
        model_id = "inceptionai/jais-family-6p7b-chat"
        llm_tokenizer = AutoTokenizer.from_pretrained(model_id)
        llm_model = AutoModelForCausalLM.from_pretrained(
            model_id,
            torch_dtype=torch.float16,
            device_map="auto"
        )
        print(f"Loaded Jais model successfully")

    elif MODEL_CHOICE == "qwen":
        model_id = "Qwen/Qwen2-7B-Instruct"
        llm_tokenizer = AutoTokenizer.from_pretrained(model_id)
        llm_model = AutoModelForCausalLM.from_pretrained(
            model_id,
            torch_dtype=torch.float16,
            device_map="auto"
        )
        print(f"Loaded Qwen2 model successfully")
    else:
        raise ValueError(f"Unknown model choice: {MODEL_CHOICE}")

    print("Loading entailment model...")
    entailment_model = ArabicEntailmentModel()
    print("Entailment model loaded successfully")

    print("Loading dataset...")
    DATASET_CHOICE = 'arabicaqa'  # Options: 'arabicaqa', 'xor_tydiqa'
    FILE_PATH = '/content/test-open.json'
    data = load_qa_dataset(DATASET_CHOICE, FILE_PATH)
    print(f"Loaded {len(data)} questions from {DATASET_CHOICE}")

    print("\nStarting semantic entropy experiments...")

    partial_save_path = f'semantic_entropy_{MODEL_CHOICE}_partial_results.json'

    # Load previous partial results if available
    results = load_partial_results(partial_save_path)

    #  starting point
    if START_FROM_QUESTION is not None:

        already_processed = START_FROM_QUESTION
        # Trim results to match the starting point if needed
        if len(results) > already_processed:
            results = results[:already_processed]
            print(f"Truncated results to match starting point at question {already_processed}")
    else:
        # Use checkpoint
        already_processed = len(results)

    print(f"Starting from question {already_processed + 1}")

    MAX_QUESTIONS = 6000  # the last question in a run
    test_questions = data[:MAX_QUESTIONS] if MAX_QUESTIONS else data

    for i, item in tqdm(enumerate(test_questions[already_processed:], start=already_processed),
                         total=len(test_questions)-already_processed,
                         desc="Processing questions"):
        question = item["Question"]
        original_answer = item["Answer"]
        print(f"\n[{i+1}/{len(test_questions)}] Processing question")

        result = experiment_semantic_entropy(
            question=question,
            original_answer=original_answer,
            llm_model=llm_model,
            llm_tokenizer=llm_tokenizer,
            entailment_model=entailment_model,
            num_samples=10
        )
        results.append(result)

        # Save progress after every 5 questions
        if (i+1) % 5 == 0:
            with open(partial_save_path, 'w', encoding='utf-8') as f:
                json.dump(results, f, ensure_ascii=False, indent=2)
            print(f"✓ Saved partial progress after {i+1} questions.")

    # Save final results
    final_save_path = f'semantic_entropy_{MODEL_CHOICE}_{DATASET_CHOICE}_resultsP2.json'
    print("\nSaving final results...")
    with open(final_save_path, 'w', encoding='utf-8') as f:
        json.dump(results, f, ensure_ascii=False, indent=2)


    files.download(final_save_path)
    print(f"✓ Experiment completed successfully with {MODEL_CHOICE} model!")

In [12]:
# Function to safely load existing partial results
def load_partial_results(partial_file_path):
    if os.path.exists(partial_file_path):
        with open(partial_file_path, 'r', encoding='utf-8') as f:
            saved_results = json.load(f)
        print(f"✓ Loaded {len(saved_results)} saved results from checkpoint.")
        return saved_results
    else:
        print("✓ No previous checkpoint found. Starting fresh.")
        return []


# Main
if __name__ == "__main__":
    MODEL_CHOICE = "jais"  # Options: "llama", "allam", "jais", "qwen"

    print("Loading language model...")
    if MODEL_CHOICE == "llama":
        model_id = "meta-llama/Llama-3.1-8B-Instruct"
        llm_model = AutoModelForCausalLM.from_pretrained(
            model_id,
            torch_dtype=torch.float16,
            device_map="auto"
        )
        llm_tokenizer = AutoTokenizer.from_pretrained(model_id)
        print(f"Loaded Llama 3.1 model successfully")

    elif MODEL_CHOICE == "allam":
        model_id = "ALLaM-AI/ALLaM-7B-Instruct-preview"
        llm_tokenizer = AutoTokenizer.from_pretrained(model_id)
        llm_model = AutoModelForCausalLM.from_pretrained(
            model_id,
            torch_dtype=torch.float16,
            device_map="auto"
        )
        print(f"Loaded ALLaM model successfully")

    elif MODEL_CHOICE == "jais":
        model_id = "inceptionai/jais-family-6p7b-chat"
        llm_tokenizer = AutoTokenizer.from_pretrained(model_id)
        llm_model = AutoModelForCausalLM.from_pretrained(
            model_id,
            torch_dtype=torch.float16,
            device_map="auto"
        )
        print(f"Loaded Jais model successfully")

    elif MODEL_CHOICE == "qwen":
        model_id = "Qwen/Qwen2-7B-Instruct"
        llm_tokenizer = AutoTokenizer.from_pretrained(model_id)
        llm_model = AutoModelForCausalLM.from_pretrained(
            model_id,
            torch_dtype=torch.float16,
            device_map="auto"
        )
        print(f"Loaded Qwen2 model successfully")
    else:
        raise ValueError(f"Unknown model choice: {MODEL_CHOICE}")

    print("Loading entailment model...")
    entailment_model = ArabicEntailmentModel()
    print("Entailment model loaded successfully")

    print("Loading dataset...")
    DATASET_CHOICE = 'arabicaqa'  # Options: 'arabicaqa', 'xor_tydiqa'
    FILE_PATH = '/content/test-open.json'
    data = load_qa_dataset(DATASET_CHOICE, FILE_PATH)
    print(f"Loaded {len(data)} questions from {DATASET_CHOICE}")

    print("\nStarting semantic entropy experiments...")
    partial_save_path = f'semantic_entropy_{MODEL_CHOICE}_partial_results.json'

    # Load previous partial results if available
    results = load_partial_results(partial_save_path)

    already_processed = len(results)
    print(f"Starting from question {already_processed + 1}")

    MAX_QUESTIONS = 6000  # Set to a number or None
    test_questions = data[:MAX_QUESTIONS] if MAX_QUESTIONS else data

    for i, item in tqdm(enumerate(test_questions[already_processed:], start=already_processed),
                         total=len(test_questions)-already_processed,
                         desc="Processing questions"):
        question = item["Question"]
        original_answer = item["Answer"]
        print(f"\n[{i+1}/{len(test_questions)}] Processing question")

        result = experiment_semantic_entropy(
            question=question,
            original_answer=original_answer,
            llm_model=llm_model,
            llm_tokenizer=llm_tokenizer,
            entailment_model=entailment_model,
            num_samples=10
        )
        results.append(result)

        # Save progress after every 5 questions
        if (i+1) % 5 == 0:
            with open(partial_save_path, 'w', encoding='utf-8') as f:
                json.dump(results, f, ensure_ascii=False, indent=2)
            print(f"✓ Saved partial progress after {i+1} questions.")

    # Save final results
    final_save_path = f'semantic_entropy_{MODEL_CHOICE}_{DATASET_CHOICE}_results.json'
    print("\nSaving final results...")
    with open(final_save_path, 'w', encoding='utf-8') as f:
        json.dump(results, f, ensure_ascii=False, indent=2)


    files.download(final_save_path)
    print(f"✓ Experiment completed successfully with {MODEL_CHOICE} model!")


Loading language model...
The repository for inceptionai/jais-family-6p7b-chat contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co/inceptionai/jais-family-6p7b-chat.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.

Do you wish to run the custom code? [y/N] Y


configuration_jais.py:   0%|          | 0.00/9.52k [00:00<?, ?B/s]

A new version of the following files was downloaded from https://huggingface.co/inceptionai/jais-family-6p7b-chat:
- configuration_jais.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


The repository for inceptionai/jais-family-6p7b-chat contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co/inceptionai/jais-family-6p7b-chat.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.

Do you wish to run the custom code? [y/N] Y


modeling_jais.py:   0%|          | 0.00/71.8k [00:00<?, ?B/s]

A new version of the following files was downloaded from https://huggingface.co/inceptionai/jais-family-6p7b-chat:
- modeling_jais.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


model.safetensors.index.json:   0%|          | 0.00/35.8k [00:00<?, ?B/s]

Fetching 3 files:   0%|          | 0/3 [00:00<?, ?it/s]

model-00002-of-00003.safetensors:   0%|          | 0.00/9.85G [00:00<?, ?B/s]

model-00003-of-00003.safetensors:   0%|          | 0.00/8.82G [00:00<?, ?B/s]

model-00001-of-00003.safetensors:   0%|          | 0.00/9.90G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

Loaded Jais model successfully
Loading entailment model...
Loading AraELECTRA model for Arabic entailment checking...
AraELECTRA model loaded successfully
Entailment model loaded successfully
Loading dataset...
Using ArabicaQA
Loaded 12592 questions from arabicaqa

Starting semantic entropy experiments...
✓ No previous checkpoint found. Starting fresh.
Starting from question 1


Processing questions:   0%|          | 0/6000 [00:00<?, ?it/s]

The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.


[1;30;43mStreaming output truncated to the last 5000 lines.[0m
Semantic Entropy: 1.0626

[5368/6000] Processing question

Entropy Analysis for: 'متى يتم إغلاق الدرز بين العظام في الجمجمة؟
'
Generated 10 answers in 1 semantic clusters
 naive Entropy: 0.2722
Semantic Entropy: -0.0000

[5369/6000] Processing question

Entropy Analysis for: 'ما هي الطرق التي استخدمت في المجتمعات القديمة لتغيير شكل الرأس؟
'
Generated 10 answers in 5 semantic clusters
 naive Entropy: 0.6398
Semantic Entropy: 1.2649

[5370/6000] Processing question

Entropy Analysis for: 'كيف يختلف القبو القحفي في البرمائيات والزواحف عن تلك في الثدييات والطيور؟
'
Generated 10 answers in 3 semantic clusters
 naive Entropy: 0.2573
Semantic Entropy: 0.6294
✓ Saved partial progress after 5370 questions.

[5371/6000] Processing question

Entropy Analysis for: 'ما هو دور مدينة هافانا في كوبا؟'
Generated 10 answers in 3 semantic clusters
 naive Entropy: 0.2651
Semantic Entropy: 0.6520

[5372/6000] Processing question

Entropy Anal

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

✓ Experiment completed successfully with jais model!


# **extra**

In [None]:
files.download(f'semantic_entropy_{MODEL_CHOICE}_{DATASET_CHOICE}_results.json')
