<a href="https://colab.research.google.com/github/tubagokhan/RegNLPDataset/blob/main/RIRAGEvaluationMetricFull.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!gdown 1BB21XL5geQ0Uw6gk0_uRFk6tthPQ0vSh
!gdown 1BGb3EYmhmTMVx3qTihCBpUrJ3WO2lM8H

Downloading...
From: https://drive.google.com/uc?id=1BB21XL5geQ0Uw6gk0_uRFk6tthPQ0vSh
To: /content/ObligationClassificationDataset.json
100% 545k/545k [00:00<00:00, 114MB/s]
Downloading...
From: https://drive.google.com/uc?id=1BGb3EYmhmTMVx3qTihCBpUrJ3WO2lM8H
To: /content/selected_test_samples.json
100% 135k/135k [00:00<00:00, 120MB/s]


In [2]:
import json
import torch
from torch.utils.data import Dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments, EarlyStoppingCallback
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Check if CUDA is available and use it if possible
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Step 1: Load and preprocess the data
json_path = "./ObligationClassificationDataset.json"
with open(json_path, 'r') as file:
    data = json.load(file)

texts = [item['Text'] for item in data]
labels = [1 if item['Obligation'] else 0 for item in data]  # Converting True/False to 1/0

# Step 2: Tokenization using LegalBERT tokenizer
tokenizer = AutoTokenizer.from_pretrained('nlpaueb/legal-bert-base-uncased')

class ObligationDataset(Dataset):
    def __init__(self, texts, labels, tokenizer, max_len=128):
        self.texts = texts
        self.labels = labels
        self.tokenizer = tokenizer
        self.max_len = max_len

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        text = self.texts[idx]
        label = self.labels[idx]
        encoding = self.tokenizer.encode_plus(
            text,
            add_special_tokens=True,
            max_length=self.max_len,
            return_token_type_ids=False,
            truncation=True,
            padding='max_length',
            return_attention_mask=True,
            return_tensors='pt',
        )
        return {
            'text': text,
            'input_ids': encoding['input_ids'].flatten(),
            'attention_mask': encoding['attention_mask'].flatten(),
            'labels': torch.tensor(label, dtype=torch.long)
        }

# Splitting data into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(texts, labels, test_size=0.2, random_state=42)

train_dataset = ObligationDataset(X_train, y_train, tokenizer)
val_dataset = ObligationDataset(X_val, y_val, tokenizer)

# Step 3: Fine-tuning LegalBERT for sequence classification
model = AutoModelForSequenceClassification.from_pretrained('nlpaueb/legal-bert-base-uncased', num_labels=2)
model.to(device)  # Move model to the GPU

training_args = TrainingArguments(
    output_dir='./results',
    num_train_epochs=10,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir='./logs',
    logging_steps=10,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
)

def compute_metrics(pred):
    labels = pred.label_ids
    preds = pred.predictions.argmax(-1)
    precision, recall, f1, _ = precision_recall_fscore_support(labels, preds, average='binary')
    acc = accuracy_score(labels, preds)
    return {
        'accuracy': acc,
        'f1': f1,
        'precision': precision,
        'recall': recall
    }

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)]
)

# Step 4: Train the model
trainer.train()

# Step 5: Evaluate the model
trainer.evaluate()

# Step 6: Save the model and tokenizer for future use
model.save_pretrained('./obligation-classifier-legalbert')
tokenizer.save_pretrained('./obligation-classifier-legalbert')

print("Model fine-tuning and evaluation completed.")


Using device: cuda


tokenizer_config.json:   0%|          | 0.00/48.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/1.02k [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/222k [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/112 [00:00<?, ?B/s]

pytorch_model.bin:   0%|          | 0.00/440M [00:00<?, ?B/s]

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at nlpaueb/legal-bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,F1,Precision,Recall
1,0.4774,0.384716,0.85,0.88707,0.813814,0.97482
2,0.0512,0.080557,0.980435,0.983957,0.975265,0.992806
3,0.0662,0.02558,0.991304,0.992806,0.992806,0.992806
4,0.0002,0.034434,0.993478,0.994595,0.99639,0.992806
5,0.0208,0.100885,0.984783,0.98725,1.0,0.97482
6,0.0478,0.057023,0.991304,0.992754,1.0,0.985612


Model fine-tuning and evaluation completed.


In [3]:
!pip install sentence_transformers

Collecting sentence_transformers
  Downloading sentence_transformers-3.0.1-py3-none-any.whl.metadata (10 kB)
Downloading sentence_transformers-3.0.1-py3-none-any.whl (227 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m227.1/227.1 kB[0m [31m3.8 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: sentence_transformers
Successfully installed sentence_transformers-3.0.1


In [67]:
import json
import pandas as pd
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
from nltk.tokenize import sent_tokenize as sent_tokenize_uncached
import nltk
from functools import cache
import tqdm

nltk.download('punkt')

# Set up device for torch operations
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load the tokenizer and model for obligation detection
model_name = './obligation-classifier-legalbert'
obligation_tokenizer = AutoTokenizer.from_pretrained(model_name)
obligation_model = AutoModelForSequenceClassification.from_pretrained(model_name)
obligation_model.to(device)
obligation_model.eval()

# Load NLI model and tokenizer for obligation coverage using Microsoft's model
coverage_nli_model = pipeline("text-classification", model="microsoft/deberta-large-mnli", device=device)

# Load NLI model and tokenizer for entailment and contradiction checks
nli_tokenizer = AutoTokenizer.from_pretrained('cross-encoder/nli-deberta-v3-xsmall')
nli_model = AutoModelForSequenceClassification.from_pretrained('cross-encoder/nli-deberta-v3-xsmall')
nli_model.to(device)
nli_model.eval()

# Define a cached version of sentence tokenization
@cache
def sent_tokenize(passage: str):
  return sent_tokenize_uncached(passage)

def softmax(logits):
    e_logits = np.exp(logits - np.max(logits, axis=1, keepdims=True))
    return e_logits / np.sum(e_logits, axis=1, keepdims=True)

def get_nli_probabilities(premises, hypotheses):
    features = tokenizer(premises, hypotheses, padding=True, truncation=True, return_tensors="pt").to("cuda")
    nli_model.eval()
    with torch.no_grad():
        logits = nli_model(**features).logits.cpu().numpy()
    probabilities = softmax(logits)
    return probabilities

def get_nli_matrix(passages, answers):
    print(f"{len(passages)} passages and {len(answers)} answers.")
    entailment_matrix = np.zeros((len(passages), len(answers)))
    contradiction_matrix = np.zeros((len(passages), len(answers)))

    batch_size = 16
    for i, pas in enumerate(tqdm.tqdm(passages)):
      for b in range(0, len(answers), batch_size):
        e = b + batch_size
        probs = get_nli_probabilities([pas] * len(answers[b:e]), answers[b:e])  # Get NLI probabilities
        entailment_matrix[i, b:e] = probs[:, 1]
        contradiction_matrix[i, b:e] = probs[:, 0]
    return entailment_matrix, contradiction_matrix

def calculate_scores_from_matrix(nli_matrix, score_type='entailment'):
    if nli_matrix.size == 0:
        print("Warning: NLI matrix is empty. Returning default score of 0.")
        return 0.0  # or some other default score or handling as appropriate for your use case

    if score_type == 'entailment':
        reduced_vector = np.max(nli_matrix, axis=0)
    elif score_type == 'contradiction':
        reduced_vector = np.min(nli_matrix, axis=0)
    score = np.round(np.mean(reduced_vector), 5)
    return score

def calculate_obligation_coverage_score(passages, answers):
    obligation_sentences_source = [sent for passage in passages for sent in sent_tokenize(passage)]
    obligation_sentences_answer = [sent for answer in answers for sent in sent_tokenize(answer)]
    covered_count = 0

    for obligation in obligation_sentences_source:
        obligation_covered = False
        for answer_sentence in obligation_sentences_answer:
            nli_result = coverage_nli_model(f"{answer_sentence} [SEP] {obligation}")
            if nli_result[0]['label'].lower() == 'entailment' and nli_result[0]['score'] > 0.7:
                covered_count += 1
                obligation_covered = True
                break


    coverage_score = covered_count / len(obligation_sentences_source) if obligation_sentences_source else 0
    return coverage_score

def calculate_final_composite_score(passages, answers):
    passage_sentences = [sent for passage in passages for sent in sent_tokenize(passage)]
    answer_sentences = [sent for answer in answers for sent in sent_tokenize(answer)]
    entailment_matrix, contradiction_matrix = get_nli_matrix(passage_sentences, answer_sentences)
    entailment_score = calculate_scores_from_matrix(entailment_matrix, 'entailment')
    contradiction_score = calculate_scores_from_matrix(contradiction_matrix, 'contradiction')
    obligation_coverage_score = calculate_obligation_coverage_score(passages, answers)
    print(f"Entailment Score: {entailment_score}")
    print(f"Contradiction Score: {contradiction_score}")
    print(f"Obligation Coverage Score: {obligation_coverage_score}")


    # New formula: (O + E - C + 1) / 3
    composite_score = (obligation_coverage_score + entailment_score - contradiction_score + 1) / 3
    print(f"Final Composite Score: {composite_score}")
    return np.round(composite_score, 5)



[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
Some weights of the model checkpoint at microsoft/deberta-large-mnli were not used when initializing DebertaForSequenceClassification: ['config']
- This IS expected if you are initializing DebertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DebertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [74]:
test_data=[
    {
    "QuestionID": "1",
    "Question": "What can I do sometimes",
    "RetrievedPassage": "The proposed merger aims to combine resources. While this move is expected to result in significant market influence and revenue increase, it also brings risks such as integration challenges, potential layoffs, and regulatory scrutiny.",
    "Answer": "to be or not to be. Is there anybody there? You may carefully apply the rules."
  },
  {
    "QuestionID": "1",
    "Question": "What are the potential risks and benefits associated with the new merger proposal?",
    "RetrievedPassage": "The proposed merger aims to combine resources of two leading tech firms to drive innovation and market expansion. While this move is expected to result in significant market influence and revenue increase, it also brings risks such as integration challenges, potential layoffs, and regulatory scrutiny.",
    "Answer": "The merger is poised to significantly enhance our market position by pooling resources and expertise. However, we must navigate challenges such as the harmonization of corporate cultures and potential regulatory issues that may arise."
  },
  {
    "QuestionID": "2",
    "Question": "How will the new environmental regulation impact our manufacturing processes?",
    "RetrievedPassage": "The new environmental regulation mandates reductions in carbon emissions and requires the use of cleaner technologies. This will necessitate a shift in our manufacturing processes, involving substantial investment in new technology and potential restructuring of our facilities.",
    "Answer": "To comply with the new regulation, we will be overhauling our current manufacturing processes, introducing energy-efficient technologies, and potentially facing short-term disruptions but ultimately improving our sustainability and compliance."
  },
  {
    "QuestionID": "3",
    "Question": "What steps are being taken to improve employee satisfaction and retention?",
    "RetrievedPassage": "In response to recent surveys indicating low employee satisfaction, the company plans to implement several initiatives. These include increasing compensation, offering more flexible working conditions, enhancing career development opportunities, and improving workplace culture.",
    "Answer": "We are committed to improving employee satisfaction through comprehensive measures including competitive salary adjustments, flexible work policies, expanded professional development, and fostering a more inclusive and supportive work environment."
  },
  {
    "QuestionID": "4",
    "Question": "Can you outline the new product development plan for the next year?",
    "RetrievedPassage": "The product development plan for the next year focuses on leveraging advanced technologies to introduce a new line of smart devices. This plan includes phases of market research, prototype development, user testing, and a final market launch aimed for the fourth quarter.",
    "Answer": "Our strategy involves a step-by-step rollout of new smart devices, starting with intensive market research, moving through several stages of design refinement and user testing, and culminating in a targeted launch by year-end to capture holiday sales."
  },
  {
    "QuestionID": "5",
    "Question": "What are the primary objectives of our latest strategic initiative?",
    "RetrievedPassage": "The primary objectives of our latest strategic initiative include increasing global market share, diversifying our product line to include eco-friendly options, and enhancing digital transformation across all departments to improve operational efficiency.",
    "Answer": "Our strategic initiative aims to position us as a leader in the global market by expanding our reach, introducing sustainable products that meet modern environmental standards, and adopting cutting-edge digital solutions to streamline operations and reduce costs."
  },
  {
    "QuestionID": "6",
    "Question": "What are the challenges facing the company's expansion into Asian markets?",
    "RetrievedPassage": "Expanding into Asian markets presents several challenges including navigating diverse regulatory landscapes, understanding cultural nuances in consumer behavior, and competing with established local brands. Additionally, logistical complexities and the need for local partnerships complicate market entry.",
    "Answer": "Our expansion strategy must address the intricate regulatory environments and cultural diversity in Asia. We plan to forge strong local partnerships, adapt our marketing strategies to align with consumer preferences, and build a logistics network that supports efficient distribution."
  },
  {
    "QuestionID": "7",
    "Question": "Describe the cybersecurity measures in place to protect client data.",
    "RetrievedPassage": "Our cybersecurity measures include multi-layered security protocols, regular system audits, and compliance with international data protection regulations. We utilize advanced encryption techniques, continuous monitoring systems, and have implemented strict access controls and training programs for all staff.",
    "Answer": "We ensure the highest level of data security through robust encryption standards, ongoing surveillance of our IT infrastructure, and adherence to global best practices in data protection. Our proactive approach involves regular audits and updates to our security protocols to guard against emerging threats."
  },
  {
    "QuestionID": "8",
    "Question": "How do current economic conditions affect our business strategy?",
    "RetrievedPassage": "The current economic conditions, characterized by fluctuating market trends and consumer spending patterns, necessitate a flexible approach to our business strategy. This involves adapting our product offerings and pricing models to meet changing demands, optimizing supply chains to reduce costs, and enhancing our online presence.",
    "Answer": "In response to volatile economic conditions, we are adjusting our business model to be more agile. This includes revising our pricing strategies, expanding our digital marketing efforts, and streamlining operations to maintain profitability despite economic pressures."
  },
  {
    "QuestionID": "9",
    "Question": "What improvements have been made in customer service this year?",
    "RetrievedPassage": "This year, we've made significant improvements in customer service by integrating AI technology into our support systems, training staff extensively, and expanding our online service platforms to ensure 24/7 availability. These initiatives have been aimed at reducing response times and enhancing user satisfaction.",
    "Answer": "Our customer service has been transformed through the adoption of AI-driven support tools that offer real-time assistance and predictive problem-solving. Enhanced training programs for our team ensure a high standard of service, while expanded online platforms provide constant accessibility and support."
  },
  {
    "QuestionID": "10",
    "Question": "Explain the financial implications of the recent merger.",
    "RetrievedPassage": "The recent merger is expected to streamline operations and achieve cost synergies by combining resources, which should result in increased profitability. However, it also involves significant upfront costs related to integration, potential redundancy payouts, and increased debt load.",
    "Answer": "While the merger presents initial financial challenges, including integration costs and the assumption of some debt, it positions us for greater long-term profitability through operational efficiencies and a broader market presence. Cost synergies should start to be realized within the first two years post-merger."
  }
]


In [76]:
# Load the selected test samples JSON
input_file_path = '/content/GoldenQuestionAnswerforEvaluationMetric.json'
with open(input_file_path, 'r') as file:
    test_data = json.load(file)

# Prepare the data
composite_scores = []
entailment_scores = []
contradiction_scores = []
obligation_coverage_scores = []

for item in test_data:
    question = [item['Question']]
    passages = [item['RetrievedPassage']]
    answers = [item['Answer']]
    print(f"Question: {question}")

    # Calculate and store scores
    passage_sentences = [sent for passage in passages for sent in sent_tokenize(passage)]
    answer_sentences = [sent for answer in answers for sent in sent_tokenize(answer)]
    entailment_matrix, contradiction_matrix = get_nli_matrix(passage_sentences, answer_sentences)
    entailment_score = calculate_scores_from_matrix(entailment_matrix, 'entailment')
    contradiction_score = calculate_scores_from_matrix(contradiction_matrix, 'contradiction')
    obligation_coverage_score = calculate_obligation_coverage_score(passages, answers)
    final_composite_score = (obligation_coverage_score + entailment_score - contradiction_score + 1) / 3

    # Append to respective lists
    entailment_scores.append(entailment_score)
    contradiction_scores.append(contradiction_score)
    obligation_coverage_scores.append(obligation_coverage_score)
    composite_scores.append(final_composite_score)

    print("\n")

# Calculate averages
avg_entailment = np.mean(entailment_scores)
avg_contradiction = np.mean(contradiction_scores)
avg_obligation_coverage = np.mean(obligation_coverage_scores)
avg_composite = np.mean(composite_scores)

print("\n")
print("Average Entailment Score:", avg_entailment)
print("Average Contradiction Score:", avg_contradiction)
print("Average Obligation Coverage Score:", avg_obligation_coverage)
print("Average Final Composite Score:", avg_composite)


Question: ["Do we regulate a CFD based upon a spot trade 'rolling a spot forex contract'?"]
3 passages and 4 answers.


100%|██████████| 3/3 [00:00<00:00, 26.77it/s]




Question: ['Does a set of threshold conditions set in a written undertaking be legally binding?']
6 passages and 7 answers.


100%|██████████| 6/6 [00:00<00:00, 24.42it/s]




Question: ['Can the FSRA exercise its powers to object to a change in control based on the unsuitability of the Controller?']
11 passages and 12 answers.


100%|██████████| 11/11 [00:00<00:00, 24.64it/s]




Question: ['For the calculation of EBCM and AAE must the firm use the multiplier (18/52nds) or the multiplier of (13/52nds)?']
3 passages and 4 answers.


100%|██████████| 3/3 [00:00<00:00, 32.46it/s]




Question: ['Would an incorporation of an ADGM holding company that resulted from a subsequent share swap from shareholders of another firm constitute a Public Offer?']
4 passages and 5 answers.


100%|██████████| 4/4 [00:00<00:00, 37.62it/s]




Question: ['Are onshore banks (banks licensed by CBUAE) allowed to market their products and services in ADGM without triggering an ADGM licensing requirement?']
1 passages and 2 answers.


100%|██████████| 1/1 [00:00<00:00, 38.29it/s]




Question: ['Can a firm with an IPA apply for a CIC?']
9 passages and 10 answers.


100%|██████████| 9/9 [00:00<00:00, 11.42it/s]




Question: ['Is offering UAE Dirham escrow account facilities (in this particular case, through a UAE Central Bank licensed Bank for the settlement of Virtual Asset exchange-based transactions in AED) required to be authorised to Provide Money Services?']
2 passages and 3 answers.


100%|██████████| 2/2 [00:00<00:00, 42.00it/s]




Question: ['Does a press launch in ADGM (for a specific listing transaction), on the assumption that the prospectus will not be made available, be deemed a financial promotion/offer in the ADGM?']
92 passages and 93 answers.


100%|██████████| 92/92 [01:14<00:00,  1.24it/s]




Question: ['Can the FSRA apply discretion on collecting the Fees for late submission of Regulatory Filings?']
3 passages and 4 answers.


100%|██████████| 3/3 [00:00<00:00, 40.04it/s]




Question: ['What are the options to remediate non-payment of fees other than late fees?']
19 passages and 28 answers.


100%|██████████| 19/19 [00:04<00:00,  4.52it/s]




Question: ['Does the CDD rules in AML 8.3.1 apply to all Relevant Persons including DNFBP firms such as, law firms, notary firms or other independent legal business?']
19 passages and 20 answers.


100%|██████████| 19/19 [00:06<00:00,  2.99it/s]




Question: ['Are investments in short-term credit facilities, revolving facilities, and trade finance receivable financing prohibited under the private credit fund regime?']
36 passages and 40 answers.


100%|██████████| 36/36 [00:03<00:00, 10.06it/s]




Question: ['Can an Investment Partnership structured as a GP-LP fund be registered in ADGM if the fund manager is a foreign entity?']
4 passages and 8 answers.


100%|██████████| 4/4 [00:00<00:00, 24.60it/s]




Question: ["If a Payment Service Provider 'PSP' holding and controlling client money held in 2 client money accounts (2 different banks). Bank A will be held responsible for funds for the e-Wallet for customers. While the Payment Service Provider will also launch a VISA card (physical and virtual) via Bank B which would act as a BIN Sponsor were it would allow customers to withdraw cash. In that case Bank B would actually enter into a separate bilateral agreement with the customer and not the 'PSP'. This means an agreement between Visa and the PSP User which permits the User to access the national ATM system with their card and withdraw cash. Would the PSP be at risk of breaching COB 19.7 by allowing withdrawal of cash by their users? 19.7"]
6 passages and 7 answers.


100%|██████████| 6/6 [00:00<00:00, 36.93it/s]




Question: ['What are the rules FSRA have in relation to outsourced activities and the avoidance of concentration risk?']
15 passages and 19 answers.


100%|██████████| 15/15 [00:01<00:00, 10.65it/s]




Question: ['If an entity would invest in digital assets business, including those within ADGM ecosystem, and also acquire direct interests in Virtual Assets, primarily investing in coins/tokens issued to fund use cases rather than trading. What circumstances an ADGM vehicle of such with direct holdings could be considered to be dealing as principal and thus require an FSRA license?']
4 passages and 6 answers.


100%|██████████| 4/4 [00:00<00:00, 15.44it/s]




Question: ['With reference to the definition of Fund Administrator in the GLO module and the requirements of FUNDS 7.1.2(1)(a), if a DIFC-based firm in good standing that is licensed and authorised by the DFSA to carry on the Financial Service of Providing Fund Administration need not make any application or notification to the FSRA in connection with it being appointed by a Foreign Fund Manager to be the Fund Administrator for a Domestic Fund that is not a Public Fund?']
4 passages and 10 answers.


100%|██████████| 4/4 [00:00<00:00,  8.97it/s]




Question: ['What are the remedies FSRA have upon non-payment of annual supervision fees?']
7 passages and 9 answers.


100%|██████████| 7/7 [00:00<00:00,  8.95it/s]




Question: ['What are the provisions in FSRA in case a request from a company licensed by your authority, to acquire 30% or more of a non-licensed company (does not have a license from your authority) that is located outside your jurisdiction or outside the UAE?']
11 passages and 17 answers.


100%|██████████| 11/11 [00:02<00:00,  4.42it/s]




Question: ['Do the Waiver/Modification application Notices have to go through PWMC? Or an approval from the CEO would be considered enough?']
3 passages and 4 answers.


100%|██████████| 3/3 [00:00<00:00, 38.70it/s]






Average Entailment Score: 0.8372866666666667
Average Contradiction Score: 0.06893619047619048
Average Obligation Coverage Score: 1.0
Average Final Composite Score: 0.9227834920634921


In [77]:
# Load the selected test samples JSON
input_file_path = '/content/selected_test_samples.json'
with open(input_file_path, 'r') as file:
    test_data = json.load(file)

# Prepare the data
composite_scores = []
entailment_scores = []
contradiction_scores = []
obligation_coverage_scores = []

for item in test_data:
    question = [item['Question']]
    passages = [item['RetrievedPassage']]
    answers = [item['Answer']]
    print(f"Question: {question}")

    # Calculate and store scores
    passage_sentences = [sent for passage in passages for sent in sent_tokenize(passage)]
    answer_sentences = [sent for answer in answers for sent in sent_tokenize(answer)]
    entailment_matrix, contradiction_matrix = get_nli_matrix(passage_sentences, answer_sentences)
    entailment_score = calculate_scores_from_matrix(entailment_matrix, 'entailment')
    contradiction_score = calculate_scores_from_matrix(contradiction_matrix, 'contradiction')
    obligation_coverage_score = calculate_obligation_coverage_score(passages, answers)
    final_composite_score = (obligation_coverage_score + entailment_score - contradiction_score + 1) / 3

    # Append to respective lists
    entailment_scores.append(entailment_score)
    contradiction_scores.append(contradiction_score)
    obligation_coverage_scores.append(obligation_coverage_score)
    composite_scores.append(final_composite_score)

    print("\n")

# Calculate averages
avg_entailment = np.mean(entailment_scores)
avg_contradiction = np.mean(contradiction_scores)
avg_obligation_coverage = np.mean(obligation_coverage_scores)
avg_composite = np.mean(composite_scores)

print("\n")
print("Average Entailment Score:", avg_entailment)
print("Average Contradiction Score:", avg_contradiction)
print("Average Obligation Coverage Score:", avg_obligation_coverage)
print("Average Final Composite Score:", avg_composite)


Question: ['Are client classification requirements applicable to an Authorised Person operating a Representative Office when engaging with a Person regarding Regulated Activities or Specified Investments?']
2 passages and 3 answers.


100%|██████████| 2/2 [00:00<00:00, 37.72it/s]




Question: ['When categorizing Clients, what are the two Client categories that an Authorised Person must consider?']
2 passages and 2 answers.


100%|██████████| 2/2 [00:00<00:00, 40.06it/s]




Question: ['How does an Authorised Person determine the appropriate Client category for each Client?']
2 passages and 8 answers.


100%|██████████| 2/2 [00:00<00:00, 34.63it/s]




Question: ['How should an Authorised Person determine when a Person becomes their Client?']
1 passages and 3 answers.


100%|██████████| 1/1 [00:00<00:00, 34.71it/s]




Question: ['Why is it important for an Authorised Person to assess when a Person transitions to being their Client based on the nature of the Regulated Activity or Specified Investment involved?']
1 passages and 4 answers.


100%|██████████| 1/1 [00:00<00:00, 31.07it/s]




Question: ['In what order must Client classification and marketing activities take place before an Authorised Person engages in a Regulated Activity?']
2 passages and 2 answers.


100%|██████████| 2/2 [00:00<00:00, 39.66it/s]




Question: ['Can an Authorised Person conduct marketing activities before formal Client classification has been completed and documented?']
1 passages and 4 answers.


100%|██████████| 1/1 [00:00<00:00, 35.47it/s]




Question: ['How does the client classification process help in providing an appropriate level of regulatory protection to Clients based on their resources and expertise?']
2 passages and 3 answers.


100%|██████████| 2/2 [00:00<00:00, 37.04it/s]




Question: ['How can a person be classified into different categories of clients depending on the regulated activities, services, products, or transactions being provided?']
1 passages and 4 answers.


100%|██████████| 1/1 [00:00<00:00, 35.77it/s]




Question: ['Can a person be classified as one type of client for a particular regulated activity and a different type for another unrelated regulated activity?']
1 passages and 3 answers.


100%|██████████| 1/1 [00:00<00:00, 34.71it/s]




Question: ['What is the required action for an Authorised Person when a Client is acting as an agent for another person in a specific transaction, and the Client is not an Authorised Person, Recognised Body, Remote Body, or Regulated Financial Institution?']
1 passages and 2 answers.


100%|██████████| 1/1 [00:00<00:00, 32.66it/s]




Question: ['When providing a service or product to a trust, who should an Authorised Person treat as its Client according to the Rules?']
1 passages and 1 answers.


100%|██████████| 1/1 [00:00<00:00, 38.16it/s]




Question: ['What additional procedures do Authorised Persons need to follow when providing Regulated Activities to Retail Clients compared to Professional Clients?']
2 passages and 2 answers.


100%|██████████| 2/2 [00:00<00:00, 40.42it/s]




Question: ['What are the two routes through which a Person can be classified as a Professional Client?']
1 passages and 1 answers.


100%|██████████| 1/1 [00:00<00:00, 35.91it/s]




Question: ['How is a Person classified as a "deemed" Professional Client?']
1 passages and 3 answers.


100%|██████████| 1/1 [00:00<00:00, 38.03it/s]




Question: ['What is the process for classifying a Person as an "assessed" Professional Client?']
2 passages and 3 answers.


100%|██████████| 2/2 [00:00<00:00, 37.13it/s]




Question: ['What criteria must a client meet in order to be classified as a Professional Client, whether "deemed" or "assessed"?']
1 passages and 4 answers.


100%|██████████| 1/1 [00:00<00:00, 35.58it/s]




Question: ['What should an Authorised Person do to ensure they have a reasonable basis for classifying a Person as a "deemed" Professional Client?']
1 passages and 2 answers.


100%|██████████| 1/1 [00:00<00:00, 39.86it/s]




Question: ['How should an Authorised Person keep records related to the classification of a Person as a "deemed" Professional Client?']
2 passages and 4 answers.


100%|██████████| 2/2 [00:00<00:00, 33.62it/s]




Question: ['How are "family members" defined for the purposes of Rule \u200e2.4.4, in accordance with the Companies Regulations?']
1 passages and 1 answers.


100%|██████████| 1/1 [00:00<00:00, 34.60it/s]




Question: ["How does an individual's classification as a Professional Client differ from that of a Retail Client, according to the outlined regulations?"]
2 passages and 5 answers.


100%|██████████| 2/2 [00:00<00:00, 36.70it/s]




Question: ["Under what circumstances can a legal structure or vehicle set up exclusively to manage an individual's investment portfolio be classified as a Professional Client by an Authorised Person?"]
1 passages and 2 answers.


100%|██████████| 1/1 [00:00<00:00, 36.17it/s]




Question: ['How can an Undertaking, trust, or foundation be designated as a Professional Client by meeting the criteria outlined in Rule \u200e\u200e2.4.4(b)?']
1 passages and 3 answers.


100%|██████████| 1/1 [00:00<00:00, 41.18it/s]




Question: ['When can an Authorised Person classify an individual as a Professional Client based on joint account ownership?']
2 passages and 3 answers.


100%|██████████| 2/2 [00:00<00:00, 36.67it/s]




Question: ['What are the criteria for an individual to be classified as a Professional Client under joint account holder status?']
2 passages and 4 answers.


100%|██████████| 2/2 [00:00<00:00, 37.63it/s]




Question: ['What documentation is required for the joint account holder to be considered a Professional Client based on their relationship with the primary account holder?']
1 passages and 7 answers.


100%|██████████| 1/1 [00:00<00:00, 21.09it/s]




Question: ['Can a Professional Client act as a primary account holder on a joint account with multiple family members who also meet the requirements to be classified as Professional Clients?']
2 passages and 2 answers.


100%|██████████| 2/2 [00:00<00:00, 39.34it/s]




Question: ['How does Rule \u200e2.4.4(d) allow for multiple family members on a joint account to be classified as Professional Clients if they meet the specified requirements?']
2 passages and 2 answers.


100%|██████████| 2/2 [00:00<00:00, 34.36it/s]




Question: ['If multiple family members meet the criteria outlined in Rule \u200e2.4.4(d), can they all be classified as Professional Clients when operating a joint account with a primary account holder who is already classified as a Professional Client?']
2 passages and 2 answers.


100%|██████████| 2/2 [00:00<00:00, 36.01it/s]




Question: ['In the context of investment purposes, can a legal structure created by a Professional Client choose to be treated as a Retail Client?']
1 passages and 2 answers.


100%|██████████| 1/1 [00:00<00:00, 40.90it/s]




Question: ['Can a joint account holder request to be classified as a Retail Client if investment decisions are being made by a Professional Client who is the primary account holder?']
2 passages and 3 answers.


100%|██████████| 2/2 [00:00<00:00, 35.77it/s]




Question: ['When can a withdrawing individual be classified as a Professional Client after opting out of having investment decisions made by the primary account holder in a joint account arrangement?']
2 passages and 2 answers.


100%|██████████| 2/2 [00:00<00:00, 36.07it/s]




Question: ['When can an Authorised Person classify an Undertaking as an "assessed" Professional Client?']
1 passages and 6 answers.


100%|██████████| 1/1 [00:00<00:00, 28.23it/s]




Question: ['What criteria must the Controller of an Undertaking meet to be classified as a Professional Client?']
1 passages and 7 answers.


100%|██████████| 1/1 [00:00<00:00, 23.66it/s]




Question: ['What minimum own funds or called up capital must an Undertaking have to be classified as an "assessed" Professional Client?']
1 passages and 1 answers.


100%|██████████| 1/1 [00:00<00:00, 35.14it/s]




Question: ['How can an Undertaking demonstrate sufficient experience and understanding of financial markets to be classified as a Professional Client?']
1 passages and 7 answers.


100%|██████████| 1/1 [00:00<00:00, 22.79it/s]




Question: ['In what scenario can an Undertaking set up by joint venture participants be treated as a Professional Client?']
3 passages and 3 answers.


100%|██████████| 3/3 [00:00<00:00, 37.57it/s]




Question: ['What distinguishes a key decision maker in a joint venture partnership from a silent partner in terms of Professional Client status?']
3 passages and 5 answers.


100%|██████████| 3/3 [00:00<00:00, 38.07it/s]




Question: ['Under Rule 4.3.3, in what specific scenarios can an Undertaking that does not meet the criteria of a Professional Client be considered as a Professional Client for certain Regulated Activities?']
3 passages and 2 answers.


100%|██████████| 3/3 [00:00<00:00, 39.89it/s]




Question: ['Under what conditions can an Authorised Person classify a Person as a Market Counterparty?']
1 passages and 1 answers.


100%|██████████| 1/1 [00:00<00:00, 37.92it/s]




Question: ['How does a Person qualify as a "deemed" Professional Client according to Rule 2.4.2?']
1 passages and 3 answers.


100%|██████████| 1/1 [00:00<00:00, 37.41it/s]




Question: ['What steps must an Authorised Person take before classifying a Professional Client as a Market Counterparty?']
1 passages and 4 answers.


100%|██████████| 1/1 [00:00<00:00, 33.48it/s]




Question: ['What must be provided to a Professional Client before they are classified as a Market Counterparty?']
1 passages and 2 answers.


100%|██████████| 1/1 [00:00<00:00, 35.48it/s]




Question: ['Under what circumstances can a Professional Client be reclassified as a Retail Client?']
2 passages and 2 answers.


100%|██████████| 2/2 [00:00<00:00, 34.97it/s]




Question: ['What information must an Authorised Person provide to a Professional Client when first establishing a relationship with them?']
2 passages and 8 answers.


100%|██████████| 2/2 [00:00<00:00, 36.00it/s]




Question: ['What must the Authorised Person inform the Professional Client about regarding the higher level of protection available to Retail Clients?']
2 passages and 5 answers.


100%|██████████| 2/2 [00:00<00:00, 38.79it/s]




Question: ['What happens if a Person does not make a specific choice to be classified as a Retail Client within the given timeframe by the Authorised Person?']
1 passages and 1 answers.


100%|██████████| 1/1 [00:00<00:00, 35.97it/s]




Question: ['What action must an Authorised Person take if a Person classified as a Professional Client requests to be re-classified as a Retail Client?']
1 passages and 1 answers.


100%|██████████| 1/1 [00:00<00:00, 34.22it/s]




Question: ['What is the responsibility of an Authorised Person if it does not offer Regulated Activities to Retail Clients?']
1 passages and 3 answers.


100%|██████████| 1/1 [00:00<00:00, 36.35it/s]




Question: ['In what situation must an Authorised Person comply with Rule 2.6.1(a)?']
1 passages and 1 answers.


100%|██████████| 1/1 [00:00<00:00, 34.06it/s]




Question: ['What right does a Professional Client have in terms of re-classification as a Retail Client after being initially classified by an Authorised Person?']
2 passages and 4 answers.


100%|██████████| 2/2 [00:00<00:00, 31.25it/s]




Question: ['What should an Authorised Person do as a matter of good practice regarding the classification of Professional Clients?']
2 passages and 3 answers.


100%|██████████| 2/2 [00:00<00:00, 35.39it/s]




Question: ['When should an Authorised Person review the circumstances of a particular Client in relation to their classification status?']
2 passages and 2 answers.


100%|██████████| 2/2 [00:00<00:00, 38.03it/s]




Question: ['If an Authorised Person becomes aware of certain circumstances, what action should they take in relation to the reclassification of a Client?']
2 passages and 4 answers.


100%|██████████| 2/2 [00:00<00:00, 37.57it/s]




Question: ['Is it permissible for an Authorised Person to refer a Retail Client to another Authorised Person who has the required Financial Services Permission?']
2 passages and 2 answers.


100%|██████████| 2/2 [00:00<00:00, 35.19it/s]




Question: ['What factors must an Authorised Person consider when determining if a Person qualifies as an "assessed" Professional Client?']
3 passages and 4 answers.


100%|██████████| 3/3 [00:00<00:00, 36.17it/s]




Question: ['How does the length of time a Person has participated in financial markets impact their classification as an assessed Professional Client?']
1 passages and 4 answers.


100%|██████████| 1/1 [00:00<00:00, 25.54it/s]




Question: ['What factors should an Authorised Person consider to have reasonable grounds to believe that a client classification made by another entity within the same Group is substantially similar to the required client classification under the Rules?']
1 passages and 9 answers.


100%|██████████| 1/1 [00:00<00:00, 22.86it/s]




Question: ['What steps must an Authorised Person take if it identifies any discrepancies between its own client classification requirements and those of another entity carrying out client classification?']
1 passages and 2 answers.


100%|██████████| 1/1 [00:00<00:00, 39.55it/s]




Question: ['What due diligence process should an Authorised Person be able to demonstrate to the Regulator when relying on a client classification made by its head office or other branch of the same legal entity or a member of its Group?']
2 passages and 4 answers.


100%|██████████| 2/2 [00:00<00:00, 29.32it/s]




Question: ['How should an Authorised Person address any identified gaps in the client classification made by its head office or other branch of the same legal entity or a member of its Group when relying on this Rule?']
2 passages and 6 answers.


100%|██████████| 2/2 [00:00<00:00, 33.68it/s]




Question: ['What are the requirements relating to outsourcing that an Authorised Person needs to meet if they wish to use client classification done by a third party other than their head office or related entities?']
2 passages and 4 answers.


100%|██████████| 2/2 [00:00<00:00, 38.25it/s]




Question: ['What must an Authorised Person that is a member of a Group ensure regarding client classification for Regulated Activities carried out for the benefit of a Client by Group members?']
1 passages and 3 answers.


100%|██████████| 1/1 [00:00<00:00, 34.69it/s]




Question: ["How might a Group structure its operations to enable different members to offer specialized advice for a Client's proposed restructure and financing needs?"]
4 passages and 4 answers.


100%|██████████| 4/4 [00:00<00:00, 37.06it/s]




Question: ['What are some common risks that may arise when Group members located in different jurisdictions carry out Regulated Activities for the benefit of the same Client?']
2 passages and 2 answers.


100%|██████████| 2/2 [00:00<00:00, 35.97it/s]



Question: ["What specific records must an Authorised Person maintain according to the rules, including evidence of the client's classification and notices sent to the client?"]





1 passages and 2 answers.


100%|██████████| 1/1 [00:00<00:00, 34.44it/s]




Question: ['How long is an Authorised Person required to keep records for after ending a business relationship with a Client?']
1 passages and 1 answers.


100%|██████████| 1/1 [00:00<00:00, 36.93it/s]




Question: ['How should an Authorised Person determine the end date of a business relationship with a Client if the exact date is uncertain?']
1 passages and 1 answers.


100%|██████████| 1/1 [00:00<00:00, 33.23it/s]




Question: ['In what scenario can the completion date of the last transaction with a Client be used as the end date of the business relationship by an Authorised Person?']
1 passages and 2 answers.


100%|██████████| 1/1 [00:00<00:00, 35.98it/s]




Question: ['Under what circumstances can an Authorised Person consider the date of the last completed transaction as the official end date of a business relationship with a Client?']
5 passages and 3 answers.


100%|██████████| 5/5 [00:00<00:00, 38.88it/s]




Question: ['How can an Authorised Person demonstrate compliance with regulatory requirements to the Regulator regarding reliance on external classifications and Group Clients?']
1 passages and 3 answers.


100%|██████████| 1/1 [00:00<00:00, 33.98it/s]




Question: ['What action must an Authorised Person take if they are no longer able to provide unrestricted access to records?']
1 passages and 2 answers.


100%|██████████| 1/1 [00:00<00:00, 37.27it/s]




Question: ['How can an Authorised Person determine if their communication about Specified Investments or Regulated Activities is fair and not misleading?']
1 passages and 5 answers.


100%|██████████| 1/1 [00:00<00:00, 34.88it/s]




Question: ['How should communications to Professional Clients differ from communications to Retail Clients, in terms of information included and presentation style?']
1 passages and 6 answers.


100%|██████████| 1/1 [00:00<00:00, 37.00it/s]




Question: ['What obligation must an Authorised Person abide by in their communications with individuals regarding their duties and liabilities under the ADGM Founding Law or FSMR?']
1 passages and 3 answers.


100%|██████████| 1/1 [00:00<00:00, 36.55it/s]




Question: ['How should an Authorised Person handle their duties and liabilities towards individuals in their communications under the ADGM Founding Law or FSMR?']
1 passages and 2 answers.


100%|██████████| 1/1 [00:00<00:00, 35.54it/s]




Question: ['In what circumstances can an Authorised Person provide information to a Person other than the Client, as per the Rule regarding information disclosure to clients?']
1 passages and 10 answers.


100%|██████████| 1/1 [00:00<00:00, 16.59it/s]




Question: ['When should a clear statement be included in Marketing Material indicating that it is intended only for Professional Clients or Market Counterparties?']
1 passages and 2 answers.


100%|██████████| 1/1 [00:00<00:00, 36.34it/s]




Question: ['What actions should an Authorised Person take if they become aware of a breach of regulatory requirements by the entity offering the Specified Investment?']
1 passages and 3 answers.


100%|██████████| 1/1 [00:00<00:00, 34.11it/s]




Question: ['How should an Authorised Person ensure that Marketing Material is only accessed by Professional Clients or Market Counterparties?']
1 passages and 4 answers.


100%|██████████| 1/1 [00:00<00:00, 33.02it/s]




Question: ['In what manner should an Authorised Person prevent unauthorized individuals from accessing Marketing Material intended for Professional Clients or Market Counterparties?']
1 passages and 4 answers.


100%|██████████| 1/1 [00:00<00:00, 35.57it/s]




Question: ['What are the requirements that an Authorised Person must adhere to when providing information or representation related to past performance to Retail Clients?']
1 passages and 6 answers.


100%|██████████| 1/1 [00:00<00:00, 30.07it/s]




Question: ['How should an Authorised Person present information related to past performance to Retail Clients in order to ensure fairness and balance?']
1 passages and 2 answers.


100%|██████████| 1/1 [00:00<00:00, 36.04it/s]




Question: ['In what manner should the source of information for past performance data be identified to Retail Clients?']
1 passages and 3 answers.


100%|██████████| 1/1 [00:00<00:00, 35.60it/s]




Question: ['According to Rule \u200e3.3.2(b), what must an Authorised Person have in place before providing a service to a Client involving a Regulated Activity?']
1 passages and 3 answers.


100%|██████████| 1/1 [00:00<00:00, 34.48it/s]




Question: ['How can an Authorised Person ensure that a Client has the necessary information to make an informed decision regarding a Regulated Activity before signing a Client Agreement?']
1 passages and 8 answers.


100%|██████████| 1/1 [00:00<00:00, 20.81it/s]




Question: ['In what instances can an Authorised Person carry on a Regulated Activity without a Client Agreement according to Rule \u200e3.3.2(b) of the regulations?']
2 passages and 4 answers.


100%|██████████| 2/2 [00:00<00:00, 38.69it/s]




Question: ['Under what circumstances is an Authorised Person allowed to carry on a Regulated Activity without complying with the requirement in Rule 3.3.2(a)?']
2 passages and 5 answers.


100%|██████████| 2/2 [00:00<00:00, 35.31it/s]




Question: ['What steps must an Authorised Person take if it is deemed impracticable to comply with a requirement in carrying on a Regulated Activity for a Client?']
2 passages and 5 answers.


100%|██████████| 2/2 [00:00<00:00, 37.78it/s]




Question: ['Under what circumstances can an Authorised Person that is a Branch rely on a Client Agreement executed by its head office or another branch of the same legal entity?']
2 passages and 4 answers.


100%|██████████| 2/2 [00:00<00:00, 37.01it/s]




Question: ['In what situations can an Authorised Person rely on a Client Agreement executed by a member of its Group?']
2 passages and 2 answers.


100%|██████████| 2/2 [00:00<00:00, 36.92it/s]




Question: ['What type of information must be included in every Client Agreement according to Chapter 12?']
1 passages and 2 answers.


100%|██████████| 1/1 [00:00<00:00, 33.31it/s]




Question: ['In what circumstances must additional disclosure be provided in addition to the standard Client Agreement, as outlined in Chapter 12?']
4 passages and 3 answers.


100%|██████████| 4/4 [00:00<00:00, 36.70it/s]




Question: ['What options do Authorised Persons have when providing a Person with the proposed Client Agreement?']
1 passages and 9 answers.


100%|██████████| 1/1 [00:00<00:00, 23.41it/s]




Question: ['In what way must an Authorised Person ensure that any changes to the Client Agreement are accurately reflected in the final signed document with the Person?']
2 passages and 3 answers.


100%|██████████| 2/2 [00:00<00:00, 36.82it/s]




Question: ['In what situation can an Authorised Person find it reasonably impracticable to provide key information to a Person who requests the execution of a Transaction on a time critical basis?']
2 passages and 3 answers.


100%|██████████| 2/2 [00:00<00:00, 35.57it/s]




Question: ['If an Authorised Person has verbally explained why they cannot enter into a Client Agreement, what records must they maintain to demonstrate this to the Regulator?']
2 passages and 2 answers.


100%|██████████| 2/2 [00:00<00:00, 38.51it/s]




Question: ['What is the minimum notice period that must be given to a Retail Client before an Authorised Person can carry on a Regulated Activity with the Client on amended terms as per the Client Agreement?']
1 passages and 2 answers.


100%|██████████| 1/1 [00:00<00:00, 34.84it/s]




Question: ["Under what conditions is it permissible for an Authorised Person to conduct a Regulated Activity with a Client on amended terms without giving the fourteen days' notice as stated in the Client Agreement?"]
1 passages and 2 answers.


100%|██████████| 1/1 [00:00<00:00, 34.26it/s]




Question: ['What are the obligations of an Authorised Person under Rule 3.4.2(b) regarding recommending Specified Investments or carrying out Regulated Activities for a Client?']
2 passages and 5 answers.


100%|██████████| 2/2 [00:00<00:00, 35.29it/s]






Average Entailment Score: 0.7024035
Average Contradiction Score: 0.1372604
Average Obligation Coverage Score: 0.34450000000000003
Average Final Composite Score: 0.6365476999999999
