# Setup Running Environment

## GPU

To optimize requirements (Req) classification performance, we used a GPU instead of a CPU. GPUs significantly accelerate inference, especially for larger models. In our experiments, we selected a high-performance A100 GPU as the running environment. Users with Google Colab Pro+ can also use T4 and L4 GPUs, though with slightly longer processing times than on the A100.

*  Make sure to change the runtime type and select "A100 GPU" as the hardware accelerator.
*  Then, to verify GPU availability and select a device, run the following code cell.


In [None]:

#Setup Running Environment (GPU)
import tensorflow as tf
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
    print("No GPU detected!")
else:
    print("GPU detected at:", device_name)

GPU detected at: /device:GPU:0


In [None]:
import torch
if torch.cuda.is_available():
    print("Using GPU:", torch.cuda.get_device_name(0))
else:
    print("GPU not available")
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

Using GPU: NVIDIA A100-SXM4-40GB
cuda


Users who have purchased one of Colab's paid plans have access to high-memory VMs when they are available.



You can see how much memory you have available at any time by running the following code cell. If the result is "Not using a high-RAM runtime", you can enable a high-RAM runtime via `Runtime > Change runtime type` in the menu: select High-RAM in the Runtime shape dropdown, then re-execute the code cell.

In [None]:
from psutil import virtual_memory
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))

if ram_gb < 20:
  print('Not using a high-RAM runtime')
else:
  print('You are using a high-RAM runtime!')

Your runtime has 89.6 gigabytes of available RAM

You are using a high-RAM runtime!


## Random Seed
To ensure reproducibility, we set a fixed random seed (typically 42) for PyTorch, NumPy, and Python's random module.

In [None]:
import numpy as np
import random

seed = 42  # typical number!

torch.manual_seed(seed)
if torch.cuda.is_available():
    torch.cuda.manual_seed_all(seed)
np.random.seed(seed)
random.seed(seed)
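
As a minimal sketch of what fixing the seed buys us, the helper below (the name `set_seed` is ours, not part of any library) seeds Python's `random` module and, only if they are installed, NumPy and PyTorch. Reseeding with the same value then reproduces the same pseudo-random draws.

```python
import random


def set_seed(seed: int = 42) -> None:
    """Seed every available random number generator (hypothetical helper)."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is optional in this sketch
    try:
        import torch
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # PyTorch is optional in this sketch


set_seed(42)
first_run = [random.random() for _ in range(3)]

set_seed(42)
second_run = [random.random() for _ in range(3)]

print(first_run == second_run)  # True: same seed, same sequence
```

Seeding alone does not guarantee bit-identical results on a GPU, since some CUDA operations are nondeterministic; treat this as a baseline rather than a full reproducibility recipe.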

## TPU (Not-used)

As an alternative, consider using a TPU v2 accelerator. TPUs are designed specifically for machine learning workloads and can outperform GPUs and CPUs on workloads that suit them. While this notebook is currently configured for GPU acceleration, switching to a TPU v2 can potentially boost inference speed and efficiency, especially for larger models or datasets.

*P.S. If you switch to a TPU, make sure to update the code accordingly by using `strategy.scope()` instead of wrapping the code in `tf.device(device_name)`.*

* Change Runtime Type and select "TPU" as the hardware accelerator. Ensure you choose the TPU v2 option if available.
* Run the following code to check for TPU availability and gather device information:


In [None]:
#Setup Running Environment (TPU)
import tensorflow as tf
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver) # Use initialize_tpu_system instead of initialize
strategy = tf.distribute.TPUStrategy(resolver)
print("Number of accelerators: ", strategy.num_replicas_in_sync)
print("All devices: ", tf.config.list_logical_devices('TPU'))

## Gated HF Models

Some LLMs used in this Colab, such as Llama and Gemma, require user authentication and valid tokens to access. These models are considered "gated" on the Hugging Face Models Hub, meaning they have restricted access to protect intellectual property or control usage. To use these models, you'll need to obtain necessary credentials and follow Hugging Face's guidelines for authentication.

* Create a user account on Hugging Face.
* Open each model's page on HF and request access. The request verification should be received within 2 hours.
* Go to the user profile page and select Access Tokens.
* Create a new token and copy it, as it will be shown only once.

In [None]:
!huggingface-cli login

In [None]:
!huggingface-cli whoami

WaadAlhoshan


In [None]:
import tensorflow as tf
with tf.device(device_name):
#with strategy.scope():
  import os
  os.environ["HF_AUTH_TOKEN"] = "PUT YOUR KEY HERE"
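
Hard-coding a token in a shared notebook risks leaking it. A minimal sketch of a safer pattern, assuming the same `HF_AUTH_TOKEN` variable name used above, is to read the token from the environment and fail early if it is missing or still the placeholder. The helper name `get_hf_token` is hypothetical.

```python
import os

PLACEHOLDER = "PUT YOUR KEY HERE"


def get_hf_token(var_name: str = "HF_AUTH_TOKEN") -> str:
    """Return the token stored in the environment, or raise if it is unusable."""
    token = os.environ.get(var_name, "").strip()
    if not token or token == PLACEHOLDER:
        raise RuntimeError(f"Set {var_name} to a valid Hugging Face access token first.")
    return token


# Example with a dummy value (not a real token)
os.environ["HF_AUTH_TOKEN"] = "hf_dummy_value_for_illustration"
print(get_hf_token())
```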

## Monitoring Performance "Explanation"

To monitor memory usage and performance within our code, we employ Python packages such as `tracemalloc` and `line_profiler`. `tracemalloc` provides insights into memory allocations, helping identify potential memory leaks and optimize resource utilization. `line_profiler`, on the other hand, reports the time spent on each line of the functions it is asked to profile, enabling us to pinpoint time-consuming code sections. By combining these tools, we can effectively analyze the efficiency of our code on the selected hardware accelerator (either GPU or TPU).

Below are some examples of how to employ these packages.

In [None]:
import tracemalloc
import time

def my_function():
  print("This is output1")

tracemalloc.start()
start_time = time.time()

# Your main logic
print("This is output2")

snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')

end_time = time.time()
elapsed_time = end_time - start_time

print(f"Total execution time: {elapsed_time:.2f} seconds")
print("Top 10 memory consumers:")
for stat in top_stats[:10]:
    print(stat)

This is output2
Total execution time: 9.25 seconds
Top 10 memory consumers:
<frozen importlib._bootstrap_external>:672: size=61.7 MiB, count=547908, average=118 B
<frozen importlib._bootstrap>:241: size=52.0 MiB, count=442403, average=123 B
/usr/lib/python3.10/dataclasses.py:432: size=11.5 MiB, count=18751, average=644 B
/usr/local/lib/python3.10/dist-packages/IPython/core/completer.py:2101: size=10.9 MiB, count=137715, average=83 B
/usr/lib/python3.10/linecache.py:137: size=10.5 MiB, count=106297, average=104 B
<ipython-input-43-6be0b9da56ff>:73: size=2889 KiB, count=92988, average=32 B
/usr/lib/python3.10/abc.py:106: size=2197 KiB, count=8533, average=264 B
/usr/lib/python3.10/inspect.py:2969: size=1709 KiB, count=22553, average=78 B
<frozen importlib._bootstrap_external>:128: size=1549 KiB, count=11656, average=136 B
/usr/lib/python3.10/inspect.py:2967: size=1402 KiB, count=23498, average=61 B
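
The top entries in the output above come from import machinery rather than from the code of interest. As a hedged sketch of a more focused measurement, the hypothetical helper `measure` below traces a single call and reports its wall-clock time together with the peak traced memory from `tracemalloc.get_traced_memory()`. The workload `build_list` is only a stand-in.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func once; return (result, elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
    finally:
        tracemalloc.stop()  # always stop tracing, even if func raises
    return result, elapsed, peak


def build_list(n):
    # Stand-in workload: allocate a list of n integers
    return list(range(n))


result, elapsed, peak = measure(build_list, 100_000)
print(f"items={len(result)}, time={elapsed:.4f}s, peak={peak / 1e6:.2f} MB")
```

Stopping the tracer after each measurement keeps earlier allocations from appearing in later reports.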


In [None]:
!pip install line_profiler



In [None]:
from line_profiler import LineProfiler

def my_function(x, y):
    # Some code here
    result = x * y
    # More code here
    return result

profiler = LineProfiler()
profiler.add_function(my_function)
profiler.enable_by_count()

my_function(10, 20)

profiler.disable_by_count()
profiler.print_stats()

Timer unit: 1e-09 s

Total time: 1.285e-06 s
File: <ipython-input-57-7b006db3f5fd>
Function: my_function at line 3

Line #      Hits         Time  Per Hit   % Time  Line Contents
     3                                           def my_function(x, y):
     4                                               # Some code here
     5         1       1003.0   1003.0     78.1      result = x * y
     6                                               # More code here
     7         1        282.0    282.0     21.9      return result



In [None]:
#!nvidia-smi #  provides valuable information about GPU utilization, memory consumption, and other performance indicators but cannot be used with TPU

# Benchmark Datasets

We used selected datasets from primary studies in requirements engineering (RE). These datasets have been used extensively in RE to experiment with requirements classification tasks.

Please note that the URLs provided for accessing these datasets might become invalid after the publication of this Colab. We strongly encourage retrieving the datasets from their original repositories for future use.

================================================================================

**Functional and Quality**

*dronology.csv*
https://drive.google.com/file/d/1yaFD5NIx4De698ok9ryv4EqIW0jPdlEr/view?usp=sharing

*leeds.csv*
https://drive.google.com/file/d/1EQVynWt1eMlZHNiyKYbjU0FoeVjcJjF7/view?usp=sharing

*promise-reclass.csv*
https://drive.google.com/file/d/1JsctHwmGrQuP1dXFG1i6XnS6aAtMXeN3/view?usp=sharing

*reqview.csv*
https://drive.google.com/file/d/1UeE_EVt3PKvMF8fyyM-LruhTHXZINdcQ/view?usp=sharing

*wasp.csv*
https://drive.google.com/file/d/1-wuKWflxqL3k0tZ0mqQtr0mUXrPPJsCB/view?usp=sharing

================================================================================

**NFR Multiclass**

*PROMISE-NFR.csv*
https://drive.google.com/file/d/1JsctHwmGrQuP1dXFG1i6XnS6aAtMXeN3/view?usp=sharing

================================================================================

**Security or Not**

*SeqReq.csv*


================================================================================

## Preps Functions

In [None]:
#General methods for datasets preps
import pandas as pd
def read_dataset_from_google(url):
  url='https://drive.google.com/uc?id=' + url.split('/')[-2]
  df = pd.read_csv(url)
  return df

def combine_datasets(dataset_list):
  combined_df = pd.concat(dataset_list, ignore_index=True)
  return combined_df
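
To make the URL rewrite in `read_dataset_from_google` concrete: a share link of the form `https://drive.google.com/file/d/<FILE_ID>/view?usp=sharing` carries the file ID as its second-to-last path segment, and the function turns it into `https://drive.google.com/uc?id=<FILE_ID>`. The sketch below isolates that step in a hypothetical helper, `drive_share_to_direct`, and uses a regular expression so that a link in a different shape raises an error instead of silently producing a wrong URL.

```python
import re


def drive_share_to_direct(share_url: str) -> str:
    """Convert a Google Drive share link into the uc?id= form used above."""
    match = re.search(r"/d/([A-Za-z0-9_-]+)", share_url)
    if match is None:
        raise ValueError(f"No Drive file ID found in: {share_url}")
    return "https://drive.google.com/uc?id=" + match.group(1)


share_url = "https://drive.google.com/file/d/1yaFD5NIx4De698ok9ryv4EqIW0jPdlEr/view?usp=sharing"
direct_url = drive_share_to_direct(share_url)
print(direct_url)
# https://drive.google.com/uc?id=1yaFD5NIx4De698ok9ryv4EqIW0jPdlEr
```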

In [None]:
# A function that changes the case of the labels: uppercase, lowercase, or capitalized

def modify_labels(labels, case):
  modified_labels = []
  for label in labels:
    if case == 'upper':
      modified_labels.append(label.upper())
    elif case == 'lower':
      modified_labels.append(label.lower())
    elif case == 'capitalize':
      modified_labels.append(label.capitalize())
    else:
      modified_labels.append(label)
  return modified_labels
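
A short usage sketch of the three casing options, with `modify_labels` restated compactly so the block runs on its own (the sample labels are chosen for illustration). Note that Python's `str.capitalize()` lowercases everything after the first character, so labels such as `Non-Security` or `Look and Feel` change more than their first letter.

```python
def modify_labels(labels, case):
    """Compact restatement of the function above: apply one casing to every label."""
    transforms = {"upper": str.upper, "lower": str.lower, "capitalize": str.capitalize}
    transform = transforms.get(case, lambda label: label)  # unknown case: leave as-is
    return [transform(label) for label in labels]


labels = ["Functional", "Non-Security", "Look and Feel"]

print(modify_labels(labels, "upper"))       # ['FUNCTIONAL', 'NON-SECURITY', 'LOOK AND FEEL']
print(modify_labels(labels, "lower"))       # ['functional', 'non-security', 'look and feel']
print(modify_labels(labels, "capitalize"))  # ['Functional', 'Non-security', 'Look and feel']
```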


In [None]:
# A function that modifies the requirement text by adding full stops or removing punctuation -- takes a list of texts

import string
def remove_punctuation(text_string):
  # Remove punctuation from the string
  translator = str.maketrans('', '', string.punctuation)
  return text_string.translate(translator)

def modify_text(text_list, modification):
  modified_texts = []
  for text in text_list:
    if modification == 'add_full_stops':
      modified_texts.append(text + '.')
    elif modification == 'remove_punctuation':
      modified_texts.append(remove_punctuation(text))
    else:
      modified_texts.append(text)
  return modified_texts
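
The sketch below shows what the two text modifications do on sample sentences (the sentences are made up for illustration). Two behaviours are worth knowing: `add_full_stops` appends a period unconditionally, so text that already ends with one gets a second; and `remove_punctuation` deletes every character in `string.punctuation`, including hyphens and apostrophes inside words.

```python
import string


def remove_punctuation(text_string):
    # Same approach as above: delete every character listed in string.punctuation
    return text_string.translate(str.maketrans("", "", string.punctuation))


samples = [
    "The system shall log out idle users.",
    "The user's session is read-only",
]

with_full_stops = [text + "." for text in samples]        # 'add_full_stops'
without_punct = [remove_punctuation(t) for t in samples]  # 'remove_punctuation'

print(with_full_stops)
# ['The system shall log out idle users..', "The user's session is read-only."]
print(without_punct)
# ['The system shall log out idle users', 'The users session is readonly']
```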


## Functional and Quality (Binary)

Two datasets are generated: the **Functional** binary dataset ```functional_df``` and the **Quality** binary dataset ```quality_df```.


In [None]:

datasets_names = ["dronology.csv", "leeds.csv", "promise-reclass.csv", "reqview.csv", "wasp.csv"]
datasets_urls = [ "ADD SHAREABLE LINK OF EACH DATASET"] #Or you can import the dataset locally here!

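def read_datasets(datasets_urls, datasets_names):
  # Pair each URL with its dataset name; zip avoids the index() lookup,
  # which returns the first match and would mislabel duplicate URLs.
  datasets = []
  for url, name in zip(datasets_urls, datasets_names):
    df = read_dataset_from_google(url)
    df["DatasetName"] = name
    datasets.append(df)
  return datasets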
datasets = read_datasets(datasets_urls, datasets_names)
df = combine_datasets(datasets)
functional_status = []
for index, row in df.iterrows():
  if row['IsFunctional'] == 1:
    functional_status.append('Functional')
  else:
    functional_status.append('Non-functional')
df['FunctionalStatus'] = functional_status


quality_status = []
for index, row in df.iterrows():
  if row['IsQuality'] == 1:
    quality_status.append('Quality')
  else:
    quality_status.append('Non-quality')
df['QualityStatus'] = quality_status

# Functional Status DataFrame
functional_df = df[['DatasetName', 'ProjectID', 'RequirementText', 'FunctionalStatus']]
functional_df = functional_df.rename(columns={'FunctionalStatus': 'Label'})

# Quality Status DataFrame
quality_df = df[['DatasetName', 'ProjectID', 'RequirementText', 'QualityStatus']]
quality_df = quality_df.rename(columns={'QualityStatus': 'Label'})


print("Functional Status DataFrame:")
#print(functional_df.head())

print("\nQuality Status DataFrame:")
#print(quality_df.head())

print("Functional Status Label Counts:")
print(functional_df['Label'].value_counts())

print("\nQuality Status Label Counts:")
print(quality_df['Label'].value_counts())

# Re-arrange columns for Functional Status DataFrame
functional_df = functional_df[['DatasetName', 'ProjectID', 'RequirementText', 'Label']]
# Re-arrange columns for Quality Status DataFrame
quality_df = quality_df[['DatasetName', 'ProjectID', 'RequirementText', 'Label']]

# Save Functional Status DataFrame to CSV
functional_df.to_csv('functional_status_binary.csv', index=False)
# Save Quality Status DataFrame to CSV
quality_df.to_csv('quality_status_binary.csv', index=False)


# ---------Dataset Variations for Functional Binary Classification
Functional_None = functional_df.copy()

Functional_Remove_Punctuation = functional_df.copy()
Functional_Remove_Punctuation['RequirementText'] = modify_text(Functional_Remove_Punctuation['RequirementText'], 'remove_punctuation')

Functional_Add_FullStops = functional_df.copy()
Functional_Add_FullStops['RequirementText'] = modify_text(Functional_Add_FullStops['RequirementText'], 'add_full_stops')

Functional_Labels_Uppercase = functional_df.copy()
Functional_Labels_Uppercase['Label'] = modify_labels(Functional_Labels_Uppercase['Label'], 'upper')

Functional_Labels_Lowercase = functional_df.copy()
Functional_Labels_Lowercase['Label'] = modify_labels(Functional_Labels_Lowercase['Label'], 'lower')

Functional_Labels_Capitalized = functional_df.copy()
Functional_Labels_Capitalized['Label'] = modify_labels(Functional_Labels_Capitalized['Label'], 'capitalize')


# ---------Dataset Variations for Quality Binary Classification
Quality_None = quality_df.copy()

Quality_Remove_Punctuation = quality_df.copy()
Quality_Remove_Punctuation['RequirementText'] = modify_text(Quality_Remove_Punctuation['RequirementText'], 'remove_punctuation')

Quality_Add_FullStops = quality_df.copy()
Quality_Add_FullStops['RequirementText'] = modify_text(Quality_Add_FullStops['RequirementText'], 'add_full_stops')

Quality_Labels_Uppercase = quality_df.copy()
Quality_Labels_Uppercase['Label'] = modify_labels(Quality_Labels_Uppercase['Label'], 'upper')

Quality_Labels_Lowercase = quality_df.copy()
Quality_Labels_Lowercase['Label'] = modify_labels(Quality_Labels_Lowercase['Label'], 'lower')

Quality_Labels_Capitalized = quality_df.copy()
Quality_Labels_Capitalized['Label'] = modify_labels(Quality_Labels_Capitalized['Label'], 'capitalize')


In [None]:
print("\nFunctional_Labels_Uppercase Label Counts:")
print(Functional_Labels_Uppercase['Label'].value_counts())
print("\nFunctional_Labels_Lowercase Label Counts:")
print(Functional_Labels_Lowercase['Label'].value_counts())
print("\nFunctional_Labels_Capitalized Label Counts:")
print(Functional_Labels_Capitalized['Label'].value_counts())

In [None]:
print("\nQuality_Labels_Uppercase Label Counts:")
print(Quality_Labels_Uppercase['Label'].value_counts())
print("\nQuality_Labels_Lowercase Label Counts:")
print(Quality_Labels_Lowercase['Label'].value_counts())
print("\nQuality_Labels_Capitalized Label Counts:")
print(Quality_Labels_Capitalized['Label'].value_counts())

In [None]:
# Add Label Definition for Functional Status DataFrame
Functional_None['Label Definition'] = Functional_None['Label'].apply(lambda x: 'Functional requirements (FRs) specify the functions of the system.' if x == 'Functional' else 'Non-functional requirements (NFRs) define how the system performs its intended functions.')
Functional_Remove_Punctuation['Label Definition'] = Functional_Remove_Punctuation['Label'].apply(lambda x: 'Functional requirements (FRs) specify the functions of the system.' if x == 'Functional' else 'Non-functional requirements (NFRs) define how the system performs its intended functions.')
Functional_Add_FullStops['Label Definition'] = Functional_Add_FullStops['Label'].apply(lambda x: 'Functional requirements (FRs) specify the functions of the system.' if x == 'Functional' else 'Non-functional requirements (NFRs) define how the system performs its intended functions.')
Functional_Labels_Uppercase['Label Definition'] = Functional_Labels_Uppercase['Label'].apply(lambda x: 'Functional requirements (FRs) specify the functions of the system.' if x == 'FUNCTIONAL' else 'Non-functional requirements (NFRs) define how the system performs its intended functions.')
Functional_Labels_Lowercase['Label Definition'] = Functional_Labels_Lowercase['Label'].apply(lambda x: 'Functional requirements (FRs) specify the functions of the system.' if x == 'functional' else 'Non-functional requirements (NFRs) define how the system performs its intended functions.')
Functional_Labels_Capitalized['Label Definition'] = Functional_Labels_Capitalized['Label'].apply(lambda x: 'Functional requirements (FRs) specify the functions of the system.' if x == 'Functional' else 'Non-functional requirements (NFRs) define how the system performs its intended functions.')


# Add Label Definition for Quality Status DataFrame
Quality_None['Label Definition'] = Quality_None['Label'].apply(lambda x: 'Quality requirements (QRs) define how well the system performs its intended functions.' if x == 'Quality' else 'Non-quality requirements are the NFRs that are not the quality requirements.')
Quality_Remove_Punctuation['Label Definition'] = Quality_Remove_Punctuation['Label'].apply(lambda x: 'Quality requirements (QRs) define how well the system performs its intended functions.' if x == 'Quality' else 'Non-quality requirements are the NFRs that are not the quality requirements.')
Quality_Add_FullStops['Label Definition'] = Quality_Add_FullStops['Label'].apply(lambda x: 'Quality requirements (QRs) define how well the system performs its intended functions.' if x == 'Quality' else 'Non-quality requirements are the NFRs that are not the quality requirements.')
Quality_Labels_Uppercase['Label Definition'] = Quality_Labels_Uppercase['Label'].apply(lambda x: 'Quality requirements (QRs) define how well the system performs its intended functions.' if x == 'QUALITY' else 'Non-quality requirements are the NFRs that are not the quality requirements.')
Quality_Labels_Lowercase['Label Definition'] = Quality_Labels_Lowercase['Label'].apply(lambda x: 'Quality requirements (QRs) define how well the system performs its intended functions.' if x == 'quality' else 'Non-quality requirements are the NFRs that are not the quality requirements.')
Quality_Labels_Capitalized['Label Definition'] = Quality_Labels_Capitalized['Label'].apply(lambda x: 'Quality requirements (QRs) define how well the system performs its intended functions.' if x == 'Quality' else 'Non-quality requirements are the NFRs that are not the quality requirements.')

print("Functional Status DataFrame with Label Definition:")
print(Functional_None.head())

print("\nQuality Status DataFrame with Label Definition:")
print(Quality_None.head())
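
The twelve assignments above repeat the same two definition strings and differ only in the casing of the label they compare against. One possible way to reduce the repetition, sketched here under the assumption that labels differ only by case, is a lookup table keyed by the lowercased label. The names `FUNCTIONAL_DEFINITIONS` and `definition_for` are hypothetical.

```python
FUNCTIONAL_DEFINITIONS = {
    "functional": "Functional requirements (FRs) specify the functions of the system.",
    "non-functional": "Non-functional requirements (NFRs) define how the system "
                      "performs its intended functions.",
}


def definition_for(label, definitions, default="Unknown"):
    """Look up a definition case-insensitively; return default for unknown labels."""
    return definitions.get(label.lower(), default)


for label in ["Functional", "FUNCTIONAL", "non-functional", "Quality"]:
    print(label, "->", definition_for(label, FUNCTIONAL_DEFINITIONS))
```

With pandas, the same lookup could be applied as `df['Label'].apply(lambda x: definition_for(x, FUNCTIONAL_DEFINITIONS))`. Unlike the `else` branch in the cell above, which assigns the non-functional definition to every label that is not an exact match, this sketch returns `Unknown` for unexpected labels so that they stand out.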


## Security (Binary)

Another binary dataset is generated: the **Security** binary dataset ```seq_df```.


In [None]:
dataset_name = "SeqReq"
seq_df = read_dataset_from_google("ADD SHAREABLE LINK OF THE DATASET")  # Or you can import the dataset locally here!
seq_df['DatasetName'] = dataset_name
seq_df['Label'] = seq_df['Label'].apply(lambda x: 'Security' if x == 'sec' else 'Non-Security')
seq_df = seq_df[['DatasetName', 'ProjectID', 'RequirementText', 'Label']]
print(seq_df.head())
print(len(seq_df.values))
# Save DataFrame to CSV
seq_df.to_csv('SeqReq.csv', index=False)
print("\nSecurity Label Counts:")
print(seq_df['Label'].value_counts())

# ---------Dataset Variations for SeqReq Binary Classification
SeqReq_None = seq_df.copy()

SeqReq_Remove_Punctuation = seq_df.copy()
SeqReq_Remove_Punctuation['RequirementText'] = modify_text(SeqReq_Remove_Punctuation['RequirementText'],'remove_punctuation')

SeqReq_Add_FullStops = seq_df.copy()
SeqReq_Add_FullStops['RequirementText'] = modify_text(SeqReq_Add_FullStops['RequirementText'],'add_full_stops')

SeqReq_Labels_Uppercase = seq_df.copy()
SeqReq_Labels_Uppercase['Label'] = modify_labels(SeqReq_Labels_Uppercase['Label'],'upper')

SeqReq_Labels_Lowercase = seq_df.copy()
SeqReq_Labels_Lowercase['Label'] = modify_labels(SeqReq_Labels_Lowercase['Label'],'lower')

SeqReq_Labels_Capitalized = seq_df.copy()
SeqReq_Labels_Capitalized['Label'] = modify_labels(SeqReq_Labels_Capitalized['Label'],'capitalize')

In [None]:
# Add Label Definition for SeqReq_None Status DataFrame
SeqReq_None['Label Definition'] = SeqReq_None['Label'].apply(lambda x: 'Security requirements (SRs) are quality requirements that ensure the integrity, confidentiality, reliability, availability, and safety of a system.' if x == 'Security' else 'Non-security requirements are any requirements that are not the security requirements.')
SeqReq_Remove_Punctuation['Label Definition'] = SeqReq_Remove_Punctuation['Label'].apply(lambda x: 'Security requirements (SRs) are quality requirements that ensure the integrity, confidentiality, reliability, availability, and safety of a system.' if x == 'Security' else 'Non-security requirements are any requirements that are not the security requirements.')
SeqReq_Add_FullStops['Label Definition'] = SeqReq_Add_FullStops['Label'].apply(lambda x: 'Security requirements (SRs) are quality requirements that ensure the integrity, confidentiality, reliability, availability, and safety of a system.' if x == 'Security' else 'Non-security requirements are any requirements that are not the security requirements.')
SeqReq_Labels_Uppercase['Label Definition'] = SeqReq_Labels_Uppercase['Label'].apply(lambda x: 'Security requirements (SRs) are quality requirements that ensure the integrity, confidentiality, reliability, availability, and safety of a system.' if x == 'SECURITY' else 'Non-security requirements are any requirements that are not the security requirements.')
SeqReq_Labels_Lowercase['Label Definition'] = SeqReq_Labels_Lowercase['Label'].apply(lambda x: 'Security requirements (SRs) are quality requirements that ensure the integrity, confidentiality, reliability, availability, and safety of a system.' if x == 'security' else 'Non-security requirements are any requirements that are not the security requirements.')
SeqReq_Labels_Capitalized['Label Definition'] = SeqReq_Labels_Capitalized['Label'].apply(lambda x: 'Security requirements (SRs) are quality requirements that ensure the integrity, confidentiality, reliability, availability, and safety of a system.' if x == 'Security' else 'Non-security requirements are any requirements that are not the security requirements.')

print("SeqReq Status DataFrame with Label Definition:")
print(SeqReq_None.head())


## Non-functional Classes-NFR (Multi)

The only multiclass dataset employed in these experiments is the **NFR** multiclass dataset ```NFR_df```.


In [None]:
dataset_name = "PROMISE-NFR-V2"
NFR_df = read_dataset_from_google("ADD SHAREABLE LINK OF THE DATASET")  # Or you can import the dataset locally here!
NFR_df['DatasetName'] = dataset_name
NFR_df.drop(columns=['IsFunctional'], inplace=True)
NFR_df.drop(columns=['IsQuality'], inplace=True)
# Replace class names with full names
NFR_df['Class'] = NFR_df['Class'].replace({
    'PE': 'Performance',
    'SE': 'Security',
    'O': 'Operational',
    'SC': 'Scalability',
    'US': 'Usability',
    'MN': 'Maintainability',
    #'RE': 'Reliability',
    'LF': 'Look and Feel',
    'A': 'Availability',
    'PO': 'Portability',
    'L' : 'Legal',
    'FT' : 'Fault Tolerance',
    'F': 'Functional'
})
NFR_df = NFR_df.rename(columns={'Class': 'Label'})
# Re-arrange columns
NFR_df = NFR_df[['DatasetName', 'ProjectID', 'RequirementText', 'Label']]
NFR_df = NFR_df[NFR_df['Label'] != 'Portability']
NFR_df = NFR_df[NFR_df['Label'] != 'Functional']
print(NFR_df.head())
# Save DataFrame to CSV
NFR_df.to_csv('PROMISE-NFR-v2.csv', index=False)
print("\nNFR Label Counts:")
print(NFR_df['Label'].value_counts())

# ---------Dataset Variations for PROMISE NFR Multi Classification
NFR_None = NFR_df.copy()

NFR_Remove_Punctuation = NFR_df.copy()
NFR_Remove_Punctuation['RequirementText'] = modify_text(NFR_Remove_Punctuation['RequirementText'],'remove_punctuation')

NFR_Add_FullStops = NFR_df.copy()
NFR_Add_FullStops['RequirementText'] = modify_text(NFR_Add_FullStops['RequirementText'],'add_full_stops')

NFR_Labels_Uppercase = NFR_df.copy()
NFR_Labels_Uppercase['Label'] = modify_labels(NFR_Labels_Uppercase['Label'],'upper')

NFR_Labels_Lowercase = NFR_df.copy()
NFR_Labels_Lowercase['Label'] = modify_labels(NFR_Labels_Lowercase['Label'],'lower')

NFR_Labels_Capitalized = NFR_df.copy()
NFR_Labels_Capitalized['Label'] = modify_labels(NFR_Labels_Capitalized['Label'],'capitalize')


      DatasetName  ProjectID  \
0  PROMISE-NFR-V2          1   
1  PROMISE-NFR-V2          1   
2  PROMISE-NFR-V2          1   
3  PROMISE-NFR-V2          1   
4  PROMISE-NFR-V2          1   

                                     RequirementText          Label  
0  'The system shall refresh the display every 60...    Performance  
1  'The application shall match the color of the ...  Look and Feel  
2  'If projected the data must be readable. On a ...      Usability  
3  'The product shall be available during normal ...   Availability  
4  'If projected the data must be understandable....      Usability  

NFR Label Counts:
Label
Usability          67
Security           66
Operational        62
Performance        54
Look and Feel      38
Availability       21
Scalability        21
Maintainability    17
Legal              13
Fault Tolerance    10
Name: count, dtype: int64


In [None]:
#print("\nNFR_Labels_None Label Counts:")
#print(NFR_None['Label'].value_counts())
print("\nNFR_Remove_Punctuation Label Counts:")
print(NFR_Remove_Punctuation['Label'].value_counts())
print("\nNFR_Add_FullStops Label Counts:")
print(NFR_Add_FullStops['Label'].value_counts())
print("\nNFR_Labels_Uppercase Label Counts:")
print(NFR_Labels_Uppercase['Label'].value_counts())
print("\nNFR_Labels_Lowercase Label Counts:")
print(NFR_Labels_Lowercase['Label'].value_counts())
print("\nNFR_Labels_Capitalized Label Counts:")
print(NFR_Labels_Capitalized['Label'].value_counts())


NFR_Remove_Punctuation Label Counts:
Label
Usability          67
Security           66
Operational        62
Performance        54
Look and Feel      38
Availability       21
Scalability        21
Maintainability    17
Legal              13
Fault Tolerance    10
Name: count, dtype: int64

NFR_Add_FullStops Label Counts:
Label
Usability          67
Security           66
Operational        62
Performance        54
Look and Feel      38
Availability       21
Scalability        21
Maintainability    17
Legal              13
Fault Tolerance    10
Name: count, dtype: int64

NFR_Labels_Uppercase Label Counts:
Label
USABILITY          67
SECURITY           66
OPERATIONAL        62
PERFORMANCE        54
LOOK AND FEEL      38
AVAILABILITY       21
SCALABILITY        21
MAINTAINABILITY    17
LEGAL              13
FAULT TOLERANCE    10
Name: count, dtype: int64

NFR_Labels_Lowercase Label Counts:
Label
usability          67
security           66
operational        62
performance        54
look

In [None]:
# Add Label Definition for NFR_None Status DataFrame

'''
NFR_None['Label Definition'] = NFR_None['Label'].apply(lambda x:
    'Performance requirements are quality requirements that specify the performance measure of the system.' if x == 'Performance' else (
    'Security requirements are quality requirements that ensure the integrity, confidentiality, reliability, availability, and safety of a system.' if x == 'Security' else (
    'Operational requirements define the specific conditions and situations under which the system will function.' if x == 'Operational' else (
    'Scalability requirements define the capability of a system to expand and adapt to the changing size and needs of the client.' if x == 'Scalability' else (
    'Usability requirements are quality requirements that define what a system has to do to support users task performance.' if x == 'Usability' else (
    'Maintainability requirements are quality requirements that define how easy it is to maintain and evolve a system over time.' if x == 'Maintainability' else (
    'Reliability requirements are quality requirements that specify how likely a system would run without a failure for a given period of time under predefined conditions.' if x == 'Reliability' else (
    'Look and feel requirements are quality requirements that consider all static and dynamic aspects of the user interface, including colors, shapes, layout, typefaces, buttons, boxes, and menus.' if x == 'Look and Feel' else (
    'Availability requirements are quality requirements that ensure the maximum operational time of a system.' if x == 'Availability' else (
    'Portability requirements define how easy it is to transport a system from its current hardware or software environment to another environment.' if x == 'Portability' else (
    'Legal requirements define a system legal conformity to data privacy, regulations and standards.' if x == 'Legal' else (
    'Fault tolerance requirements are quality requirements that ensure the system to have the ability to detect the fault and to have a backup plan.' if x == 'Fault Tolerance' else (
    'Functional requirements (FRs) specify the functions of the system.' if x == 'Functional' else 'Unknown'
    ))))))))))))
)
'''

NFR_Remove_Punctuation['Label Definition'] = NFR_Remove_Punctuation['Label'].apply(lambda x:
    'Performance requirements are quality requirements that specify the performance measure of the system.' if x == 'Performance' else (
    'Security requirements are quality requirements that ensure the integrity, confidentiality, reliability, availability, and safety of a system.' if x == 'Security' else (
    'Operational requirements define the specific conditions and situations under which the system will function.' if x == 'Operational' else (
    'Scalability requirements define the capability of a system to expand and adapt to the changing size and needs of the client.' if x == 'Scalability' else (
    'Usability requirements are quality requirements that define what a system has to do to support users task performance.' if x == 'Usability' else (
    'Maintainability requirements are quality requirements that define how easy it is to maintain and evolve a system over time.' if x == 'Maintainability' else (
    'Reliability requirements are quality requirements that specify how likely a system would run without a failure for a given period of time under predefined conditions.' if x == 'Reliability' else (
    'Look and feel requirements are quality requirements that consider all static and dynamic aspects of the user interface, including colors, shapes, layout, typefaces, buttons, boxes, and menus.' if x == 'Look and Feel' else (
    'Availability requirements are quality requirements that ensure the maximum operational time of a system.' if x == 'Availability' else (
    'Portability requirements define how easy it is to transport a system from its current hardware or software environment to another environment.' if x == 'Portability' else (
    'Legal requirements define a system legal conformity to data privacy, regulations and standards.' if x == 'Legal' else (
    'Fault tolerance requirements are quality requirements that ensure the system to have the ability to detect the fault and to have a backup plan.' if x == 'Fault Tolerance' else (
    'Functional requirements (FRs) specify the functions of the system.' if x == 'Functional' else 'Unknown'
    ))))))))))))
)

NFR_Add_FullStops['Label Definition'] = NFR_Add_FullStops['Label'].apply(lambda x:
    'Performance requirements are quality requirements that specify the performance measure of the system.' if x == 'Performance' else (
    'Security requirements are quality requirements that ensure the integrity, confidentiality, reliability, availability, and safety of a system.' if x == 'Security' else (
    'Operational requirements define the specific conditions and situations under which the system will function.' if x == 'Operational' else (
    'Scalability requirements define the capability of a system to expand and adapt to the changing size and needs of the client.' if x == 'Scalability' else (
    'Usability requirements are quality requirements that define what a system has to do to support users task performance.' if x == 'Usability' else (
    'Maintainability requirements are quality requirements that define how easy it is to maintain and evolve a system over time.' if x == 'Maintainability' else (
    'Reliability requirements are quality requirements that specify how likely a system would run without a failure for a given period of time under predefined conditions.' if x == 'Reliability' else (
    'Look and feel requirements are quality requirements that consider all static and dynamic aspects of the user interface, including colors, shapes, layout, typefaces, buttons, boxes, and menus.' if x == 'Look and Feel' else (
    'Availability requirements are quality requirements that ensure the maximum operational time of a system.' if x == 'Availability' else (
    'Portability requirements define how easy it is to transport a system from its current hardware or software environment to another environment.' if x == 'Portability' else (
    'Legal requirements define a system legal conformity to data privacy, regulations and standards.' if x == 'Legal' else (
    'Fault tolerance requirements are quality requirements that ensure the system to have the ability to detect the fault and to have a backup plan.' if x == 'Fault Tolerance' else (
    'Functional requirements (FRs) specify the functions of the system.' if x == 'Functional' else 'Unknown'
    ))))))))))))
)

NFR_Labels_Uppercase['Label Definition'] = NFR_Labels_Uppercase['Label'].apply(lambda x:'Performance requirements are quality requirements that specify the performance measure of the system.' if x == 'PERFORMANCE' else (
    'Security requirements are quality requirements that ensure the integrity, confidentiality, reliability, availability, and safety of a system.' if x == 'SECURITY' else (
    'Operational requirements define the specific conditions and situations under which the system will function.' if x == 'OPERATIONAL' else (
    'Scalability requirements define the capability of a system to expand and adapt to the changing size and needs of the client.' if x == 'SCALABILITY' else (
    'Usability requirements are quality requirements that define what a system has to do to support users task performance.' if x == 'USABILITY' else (
    'Maintainability requirements are quality requirements that define how easy it is to maintain and evolve a system over time.' if x == 'MAINTAINABILITY' else (
    'Reliability requirements are quality requirements that specify how likely a system would run without a failure for a given period of time under predefined conditions.' if x == 'RELIABILITY' else (
    'Look and feel requirements are quality requirements that consider all static and dynamic aspects of the user interface, including colors, shapes, layout, typefaces, buttons, boxes, and menus.' if x == 'LOOK AND FEEL' else (
    'Availability requirements are quality requirements that ensure the maximum operational time of a system.' if x == 'AVAILABILITY' else (
    'Portability requirements define how easy it is to transport a system from its current hardware or software environment to another environment.' if x == 'PORTABILITY' else (
    'Legal requirements define a system legal conformity to data privacy, regulations and standards.' if x == 'LEGAL' else (
    'Fault tolerance requirements are quality requirements that ensure the system to have the ability to detect the fault and to have a backup plan.' if x == 'FAULT TOLERANCE' else (
    'Functional requirements (FRs) specify the functions of the system.' if x == 'FUNCTIONAL' else 'Unknown'
    ))))))))))))
)

NFR_Labels_Lowercase['Label Definition'] = NFR_Labels_Lowercase['Label'].apply(lambda x: 'Performance requirements are quality requirements that specify the performance measure of the system.' if x == 'performance' else (
    'Security requirements are quality requirements that ensure the integrity, confidentiality, reliability, availability, and safety of a system.' if x == 'security' else (
    'Operational requirements define the specific conditions and situations under which the system will function.' if x == 'operational' else (
    'Scalability requirements define the capability of a system to expand and adapt to the changing size and needs of the client.' if x == 'scalability' else (
    'Usability requirements are quality requirements that define what a system has to do to support users task performance.' if x == 'usability' else (
    'Maintainability requirements are quality requirements that define how easy it is to maintain and evolve a system over time.' if x == 'maintainability' else (
    'Reliability requirements are quality requirements that specify how likely a system would run without a failure for a given period of time under predefined conditions.' if x == 'reliability' else (
    'Look and feel requirements are quality requirements that consider all static and dynamic aspects of the user interface, including colors, shapes, layout, typefaces, buttons, boxes, and menus.' if x == 'look and feel' else (
    'Availability requirements are quality requirements that ensure the maximum operational time of a system.' if x == 'availability' else (
    'Portability requirements define how easy it is to transport a system from its current hardware or software environment to another environment.' if x == 'portability' else (
    'Legal requirements define a system legal conformity to data privacy, regulations and standards.' if x == 'legal' else (
    'Fault tolerance requirements are quality requirements that ensure the system to have the ability to detect the fault and to have a backup plan.' if x == 'fault tolerance' else (
    'Functional requirements (FRs) specify the functions of the system.' if x == 'functional' else 'Unknown'
    ))))))))))))
)

NFR_Labels_Capitalized['Label Definition'] = NFR_Labels_Capitalized['Label'].apply(lambda x:
    'Performance requirements are quality requirements that specify the performance measure of the system.' if x == 'Performance' else (
    'Security requirements are quality requirements that ensure the integrity, confidentiality, reliability, availability, and safety of a system.' if x == 'Security' else (
    'Operational requirements define the specific conditions and situations under which the system will function.' if x == 'Operational' else (
    'Scalability requirements define the capability of a system to expand and adapt to the changing size and needs of the client.' if x == 'Scalability' else (
    'Usability requirements are quality requirements that define what a system has to do to support users task performance.' if x == 'Usability' else (
    'Maintainability requirements are quality requirements that define how easy it is to maintain and evolve a system over time.' if x == 'Maintainability' else (
    'Reliability requirements are quality requirements that specify how likely a system would run without a failure for a given period of time under predefined conditions.' if x == 'Reliability' else (
    'Look and feel requirements are quality requirements that consider all static and dynamic aspects of the user interface, including colors, shapes, layout, typefaces, buttons, boxes, and menus.' if x == 'Look and feel' else (
    'Availability requirements are quality requirements that ensure the maximum operational time of a system.' if x == 'Availability' else (
    'Portability requirements define how easy it is to transport a system from its current hardware or software environment to another environment.' if x == 'Portability' else (
    'Legal requirements define a system legal conformity to data privacy, regulations and standards.' if x == 'Legal' else (
    'Fault tolerance requirements are quality requirements that ensure the system to have the ability to detect the fault and to have a backup plan.' if x == 'Fault tolerance' else (
    'Functional requirements (FRs) specify the functions of the system.' if x == 'Functional' else 'Unknown'
    ))))))))))))
)
print("NFR Status DataFrame with Label Definition:")
print(NFR_None.head())


NFR Status DataFrame with Label Definition:
      DatasetName  ProjectID  \
0  PROMISE-NFR-V2          1   
1  PROMISE-NFR-V2          1   
2  PROMISE-NFR-V2          1   
3  PROMISE-NFR-V2          1   
4  PROMISE-NFR-V2          1   

                                     RequirementText          Label  
0  'The system shall refresh the display every 60...    Performance  
1  'The application shall match the color of the ...  Look and Feel  
2  'If projected the data must be readable. On a ...      Usability  
3  'The product shall be available during normal ...   Availability  
4  'If projected the data must be understandable....      Usability  
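
The nested conditional expressions above repeat the same definitions for every casing of the label. A minimal sketch of an alternative, assuming the definitions are keyed by the lowercased label: a single dictionary lookup that works for any casing. The names `LABEL_DEFINITIONS` and `add_label_definition` are hypothetical and not part of the original notebook, and only three of the definitions are reproduced here.

```python
import pandas as pd

# Hypothetical lookup table keyed by the lowercased label (three entries shown).
LABEL_DEFINITIONS = {
    "performance": "Performance requirements are quality requirements that specify the performance measure of the system.",
    "availability": "Availability requirements are quality requirements that ensure the maximum operational time of a system.",
    "functional": "Functional requirements (FRs) specify the functions of the system.",
}


def add_label_definition(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a 'Label Definition' column.

    The lookup ignores label casing; labels without an entry map to 'Unknown',
    mirroring the fallback used in the nested conditionals.
    """
    out = df.copy()
    keys = out["Label"].str.strip().str.lower()
    out["Label Definition"] = keys.map(LABEL_DEFINITIONS).fillna("Unknown")
    return out


demo = pd.DataFrame({"Label": ["Performance", "AVAILABILITY", "functional", "Mystery"]})
demo = add_label_definition(demo)
print(demo)
```

Because the key is normalized before the lookup, the same function would serve the uppercase, lowercase, and capitalized label variants without a separate expression for each.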


In [None]:
# Print any DataFrame that has an unknown definition and show the affected labels

# Check for unknown definitions in each DataFrame
df_names = ['NFR_Remove_Punctuation', 'NFR_Add_FullStops',
            'NFR_Labels_Uppercase', 'NFR_Labels_Lowercase', 'NFR_Labels_Capitalized']
dfs_to_check = [NFR_Remove_Punctuation, NFR_Add_FullStops,
               NFR_Labels_Uppercase, NFR_Labels_Lowercase, NFR_Labels_Capitalized]

for name, df in zip(df_names, dfs_to_check):
  unknown_definitions = df[df['Label Definition'] == 'Unknown']
  if not unknown_definitions.empty:
    print(f"Found unknown definitions in {name}:")
    print(unknown_definitions[['Label', 'Label Definition']])


In [None]:
# Check whether any NFR DataFrame has 'Unknown' as a label definition

# Check for 'Unknown' in Label Definition for all NFR DataFrames
#print("NFR_None has 'Unknown' label definitions:", NFR_None['Label Definition'].str.contains('Unknown').any())
print("NFR_Remove_Punctuation has 'Unknown' label definitions:", NFR_Remove_Punctuation['Label Definition'].str.contains('Unknown').any())
print("NFR_Add_FullStops has 'Unknown' label definitions:", NFR_Add_FullStops['Label Definition'].str.contains('Unknown').any())
print("NFR_Labels_Uppercase has 'Unknown' label definitions:", NFR_Labels_Uppercase['Label Definition'].str.contains('Unknown').any())
print("NFR_Labels_Lowercase has 'Unknown' label definitions:", NFR_Labels_Lowercase['Label Definition'].str.contains('Unknown').any())
print("NFR_Labels_Capitalized has 'Unknown' label definitions:", NFR_Labels_Capitalized['Label Definition'].str.contains('Unknown').any())


NFR_Remove_Punctuation has 'Unknown' label definitions: False
NFR_Add_FullStops has 'Unknown' label definitions: False
NFR_Labels_Uppercase has 'Unknown' label definitions: False
NFR_Labels_Lowercase has 'Unknown' label definitions: False
NFR_Labels_Capitalized has 'Unknown' label definitions: False
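
The per-DataFrame checks above can also be collected into one report. The following is a sketch only: `unknown_label_report` is a hypothetical helper, and the two small frames stand in for the notebook's DataFrames. It uses an exact comparison with `'Unknown'` rather than `str.contains`, so a definition that merely contains the word is not flagged.

```python
import pandas as pd


def unknown_label_report(frames):
    """Map each DataFrame name to the sorted labels whose definition is 'Unknown'."""
    report = {}
    for name, df in frames.items():
        mask = df["Label Definition"] == "Unknown"
        report[name] = sorted(df.loc[mask, "Label"].unique().tolist())
    return report


# Toy stand-ins for the notebook's DataFrames (assumed column names match).
frames = {
    "clean": pd.DataFrame({
        "Label": ["Functional"],
        "Label Definition": ["Functional requirements (FRs) specify the functions of the system."],
    }),
    "with_gap": pd.DataFrame({
        "Label": ["Functional", "Mystery"],
        "Label Definition": [
            "Functional requirements (FRs) specify the functions of the system.",
            "Unknown",
        ],
    }),
}

report = unknown_label_report(frames)
print(report)
```

An empty list for every name corresponds to the all-`False` output shown above.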


### Multiclass Dataset Distributions

#### Functional vs NFR classes in PROMISE (not needed; already included in Functional Binary (1))

In [None]:
'''
Functional_NFR_None = NFR_None.copy()
Functional_NFR_None['Label'] = Functional_NFR_None['Label'].apply(lambda x: x if x == 'Functional' else 'Non-Functional')
Functional_NFR_None['Label Definition'] = Functional_NFR_None['Label'].apply(lambda x: 'Functional requirements (FRs) specify the functions of the system.' if x == 'Functional' else 'Non-functional requirements (NFRs) define how the system performs its intended functions.')
'''

#### NFR Classes (excluding Functional)

In [None]:
#Copying dataframes from NFR dfs by excluding Functional

# Exclude 'Functional' label
#NFR_None_NoFunc = NFR_None[NFR_None['Label'] != 'Functional'].copy()
NFR_Remove_Punctuation_NoFunc = NFR_Remove_Punctuation[NFR_Remove_Punctuation['Label'] != 'Functional'].copy()
NFR_Add_FullStops_NoFunc = NFR_Add_FullStops[NFR_Add_FullStops['Label'] != 'Functional'].copy()
NFR_Labels_Uppercase_NoFunc = NFR_Labels_Uppercase[NFR_Labels_Uppercase['Label'] != 'FUNCTIONAL'].copy()
NFR_Labels_Lowercase_NoFunc = NFR_Labels_Lowercase[NFR_Labels_Lowercase['Label'] != 'functional'].copy()
NFR_Labels_Capitalized_NoFunc = NFR_Labels_Capitalized[NFR_Labels_Capitalized['Label'] != 'Functional'].copy()

# Verify the changes
print(NFR_Labels_Capitalized_NoFunc['Label'].value_counts())


Label
Usability          67
Security           66
Operational        62
Performance        54
Look and feel      38
Availability       21
Scalability        21
Maintainability    17
Legal              13
Fault tolerance    10
Name: count, dtype: int64


In [None]:
NFR_Labels_Uppercase_NoFunc['Label'].value_counts()

Unnamed: 0_level_0,count
Label,Unnamed: 1_level_1
USABILITY,67
SECURITY,66
OPERATIONAL,62
PERFORMANCE,54
LOOK AND FEEL,38
AVAILABILITY,21
SCALABILITY,21
MAINTAINABILITY,17
LEGAL,13
FAULT TOLERANCE,10


#### NFR class as one vs rest (OVR) distribution (excluding Functional)

In [None]:


# Create one DataFrame per class (<Class>_NFR_None_NoFunc), keeping the target label and renaming all others to 'Non-<class>'
Performance_NFR_None_NoFunc = NFR_None_NoFunc.copy()
Performance_NFR_None_NoFunc['Label'] = Performance_NFR_None_NoFunc['Label'].apply(lambda x: x if x == 'Performance' else 'Non-performance')
Security_NFR_None_NoFunc = NFR_None_NoFunc.copy()
Security_NFR_None_NoFunc['Label'] = Security_NFR_None_NoFunc['Label'].apply(lambda x: x if x == 'Security' else 'Non-security')
Operational_NFR_None_NoFunc = NFR_None_NoFunc.copy()
Operational_NFR_None_NoFunc['Label'] = Operational_NFR_None_NoFunc['Label'].apply(lambda x: x if x == 'Operational' else 'Non-operational')

Scalability_NFR_None_NoFunc = NFR_None_NoFunc.copy()
Scalability_NFR_None_NoFunc['Label'] = Scalability_NFR_None_NoFunc['Label'].apply(lambda x: x if x == 'Scalability' else 'Non-scalability')
Usability_NFR_None_NoFunc = NFR_None_NoFunc.copy()
Usability_NFR_None_NoFunc['Label'] = Usability_NFR_None_NoFunc['Label'].apply(lambda x: x if x == 'Usability' else 'Non-usability')
Maintainability_NFR_None_NoFunc = NFR_None_NoFunc.copy()
Maintainability_NFR_None_NoFunc['Label'] = Maintainability_NFR_None_NoFunc['Label'].apply(lambda x: x if x == 'Maintainability' else 'Non-maintainability')

Look_and_Feel_NFR_None_NoFunc = NFR_None_NoFunc.copy()
Look_and_Feel_NFR_None_NoFunc['Label'] = Look_and_Feel_NFR_None_NoFunc['Label'].apply(lambda x: x if x == 'Look and Feel' else 'Non-Look and Feel')
Availability_NFR_None_NoFunc = NFR_None_NoFunc.copy()
Availability_NFR_None_NoFunc['Label'] = Availability_NFR_None_NoFunc['Label'].apply(lambda x: x if x == 'Availability' else 'Non-availability')
Portability_NFR_None_NoFunc = NFR_None_NoFunc.copy()
Portability_NFR_None_NoFunc['Label'] = Portability_NFR_None_NoFunc['Label'].apply(lambda x: x if x == 'Portability' else 'Non-portability')

Fault_Tolerance_NFR_None_NoFunc = NFR_None_NoFunc.copy()
Fault_Tolerance_NFR_None_NoFunc['Label'] = Fault_Tolerance_NFR_None_NoFunc['Label'].apply(lambda x: x if x == 'Fault Tolerance' else 'Non-Fault Tolerance')
Legal_NFR_None_NoFunc = NFR_None_NoFunc.copy()
Legal_NFR_None_NoFunc['Label'] = Legal_NFR_None_NoFunc['Label'].apply(lambda x: x if x == 'Legal' else 'Non-Legal')
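
The one-vs-rest relabeling above follows one pattern per class: keep the target label and replace everything else with a `Non-...` label. A minimal sketch of that pattern as a reusable function is shown below. The helper name `one_vs_rest` and its `rest_label` parameter are assumptions for illustration; `rest_label` exists because the notebook's rest labels do not share one casing rule (for example `'Non-performance'` versus `'Non-Look and Feel'`).

```python
import pandas as pd


def one_vs_rest(df, target, rest_label=None):
    """Return a copy of df whose 'Label' is either target or a single rest label."""
    out = df.copy()
    rest = rest_label if rest_label is not None else f"Non-{target.lower()}"
    # Series.where keeps values where the condition holds and replaces the others.
    out["Label"] = out["Label"].where(out["Label"] == target, rest)
    return out


demo = pd.DataFrame({"Label": ["Performance", "Security", "Look and Feel"]})

performance_ovr = one_vs_rest(demo, "Performance")
look_and_feel_ovr = one_vs_rest(demo, "Look and Feel", rest_label="Non-Look and Feel")

print(performance_ovr["Label"].tolist())
print(look_and_feel_ovr["Label"].tolist())
```

The function works on a copy, so the source DataFrame keeps its original multiclass labels.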

In [None]:
# Add label definitions
Performance_NFR_None_NoFunc['Label Definition'] = Performance_NFR_None_NoFunc['Label'].apply(lambda x: 'Performance requirements are quality requirements that specify the performance measure of the system.' if x == 'Performance' else 'Non-performance requirements are quality requirements that do not specify any performance measures of the system.')
Security_NFR_None_NoFunc['Label Definition'] = Security_NFR_None_NoFunc['Label'].apply(lambda x: 'Security requirements (SRs) are quality requirements that ensure the integrity, confidentiality, reliability, availability, and safety of a system.' if x == 'Security' else 'Non-security requirements are any requirements that are not the security requirements.')
Operational_NFR_None_NoFunc['Label Definition'] = Operational_NFR_None_NoFunc['Label'].apply(lambda x: 'Operational requirements define the specific conditions and situations under which the system will function.' if x == 'Operational' else 'Non-operational requirements are any requirements that do not discuss operational conditions and situations.')

Scalability_NFR_None_NoFunc['Label Definition'] = Scalability_NFR_None_NoFunc['Label'].apply(lambda x: 'Scalability requirements define the capability of a system to expand and adapt to the changing size and needs of the client.' if x == 'Scalability' else 'Non-scalability requirements are any requirements that do not discuss system scalability requirements.')
Usability_NFR_None_NoFunc['Label Definition'] = Usability_NFR_None_NoFunc['Label'].apply(lambda x: 'Usability requirements are quality requirements that define what a system has to do to support users task performance.' if x == 'Usability' else 'Non-usability requirements are any requirements that do not discuss system usability requirements.')
Maintainability_NFR_None_NoFunc['Label Definition'] = Maintainability_NFR_None_NoFunc['Label'].apply(lambda x: 'Maintainability requirements are quality requirements that define how easy it is to maintain and evolve a system over time.' if x == 'Maintainability' else 'Non-maintainability requirements are any requirements that do not discuss system maintainability requirements.')

Look_and_Feel_NFR_None_NoFunc['Label Definition'] = Look_and_Feel_NFR_None_NoFunc['Label'].apply(lambda x: 'Look and feel requirements are quality requirements that consider all static and dynamic aspects of the user interface, including colors, shapes, layout, typefaces, buttons, boxes, and menus.' if x == 'Look and Feel' else 'Non-Look and feel requirements are any requirements that do not discuss system look and feel requirements.')
Availability_NFR_None_NoFunc['Label Definition'] = Availability_NFR_None_NoFunc['Label'].apply(lambda x: 'Availability requirements are quality requirements that ensure the maximum operational time of a system.' if x == 'Availability' else 'Non-availability requirements are any requirements that do not discuss system availability requirements.')
Portability_NFR_None_NoFunc['Label Definition'] = Portability_NFR_None_NoFunc['Label'].apply(lambda x: 'Portability requirements define how easy it is to transport a system from its current hardware or software environment to another environment.' if x == 'Portability' else 'Non-portability requirements are any requirements that do not discuss system portability requirements.')

Fault_Tolerance_NFR_None_NoFunc['Label Definition'] = Fault_Tolerance_NFR_None_NoFunc['Label'].apply(lambda x: 'Fault tolerance requirements are quality requirements that ensure the system to have the ability to detect the fault and to have a backup plan.' if x == 'Fault Tolerance' else 'Non-Fault tolerance requirements are any requirements that do not discuss system fault tolerance requirements.')
Legal_NFR_None_NoFunc['Label Definition'] = Legal_NFR_None_NoFunc['Label'].apply(lambda x: 'Legal requirements define a system legal conformity to data privacy, regulations and standards.' if x == 'Legal' else 'Non-Legal requirements are any requirements that do not discuss system legal requirements.')



In [None]:
##Performance
Performance_NFR_None_NoFunc_Remove_Punctuation = Performance_NFR_None_NoFunc.copy()
Performance_NFR_None_NoFunc_Remove_Punctuation['RequirementText'] = modify_text(Performance_NFR_None_NoFunc_Remove_Punctuation['RequirementText'],'remove_punctuation')

Performance_NFR_None_NoFunc_Add_FullStops = Performance_NFR_None_NoFunc.copy()
Performance_NFR_None_NoFunc_Add_FullStops['RequirementText'] = modify_text(Performance_NFR_None_NoFunc_Add_FullStops['RequirementText'],'add_full_stops')

Performance_NFR_None_NoFunc_Labels_Uppercase = Performance_NFR_None_NoFunc.copy()
Performance_NFR_None_NoFunc_Labels_Uppercase['Label'] = modify_labels(Performance_NFR_None_NoFunc_Labels_Uppercase['Label'],'upper')

Performance_NFR_None_NoFunc_Labels_Lowercase = Performance_NFR_None_NoFunc.copy()
Performance_NFR_None_NoFunc_Labels_Lowercase['Label'] = modify_labels(Performance_NFR_None_NoFunc_Labels_Lowercase['Label'],'lower')

Performance_NFR_None_NoFunc_Labels_Capitalized = Performance_NFR_None_NoFunc.copy()
Performance_NFR_None_NoFunc_Labels_Capitalized['Label'] = modify_labels(Performance_NFR_None_NoFunc_Labels_Capitalized['Label'],'capitalize')

##Security
Security_NFR_None_NoFunc_Remove_Punctuation = Security_NFR_None_NoFunc.copy()
Security_NFR_None_NoFunc_Remove_Punctuation['RequirementText'] = modify_text(Security_NFR_None_NoFunc_Remove_Punctuation['RequirementText'],'remove_punctuation')

Security_NFR_None_NoFunc_Add_FullStops = Security_NFR_None_NoFunc.copy()
Security_NFR_None_NoFunc_Add_FullStops['RequirementText'] = modify_text(Security_NFR_None_NoFunc_Add_FullStops['RequirementText'],'add_full_stops')

Security_NFR_None_NoFunc_Labels_Uppercase = Security_NFR_None_NoFunc.copy()
Security_NFR_None_NoFunc_Labels_Uppercase['Label'] = modify_labels(Security_NFR_None_NoFunc_Labels_Uppercase['Label'],'upper')

Security_NFR_None_NoFunc_Labels_Lowercase = Security_NFR_None_NoFunc.copy()
Security_NFR_None_NoFunc_Labels_Lowercase['Label'] = modify_labels(Security_NFR_None_NoFunc_Labels_Lowercase['Label'],'lower')

Security_NFR_None_NoFunc_Labels_Capitalized = Security_NFR_None_NoFunc.copy()
Security_NFR_None_NoFunc_Labels_Capitalized['Label'] = modify_labels(Security_NFR_None_NoFunc_Labels_Capitalized['Label'],'capitalize')

##Operational
Operational_NFR_None_NoFunc_Remove_Punctuation = Operational_NFR_None_NoFunc.copy()
Operational_NFR_None_NoFunc_Remove_Punctuation['RequirementText'] = modify_text(Operational_NFR_None_NoFunc_Remove_Punctuation['RequirementText'],'remove_punctuation')

Operational_NFR_None_NoFunc_Add_FullStops = Operational_NFR_None_NoFunc.copy()
Operational_NFR_None_NoFunc_Add_FullStops['RequirementText'] = modify_text(Operational_NFR_None_NoFunc_Add_FullStops['RequirementText'],'add_full_stops')

Operational_NFR_None_NoFunc_Labels_Uppercase = Operational_NFR_None_NoFunc.copy()
Operational_NFR_None_NoFunc_Labels_Uppercase['Label'] = modify_labels(Operational_NFR_None_NoFunc_Labels_Uppercase['Label'],'upper')

Operational_NFR_None_NoFunc_Labels_Lowercase = Operational_NFR_None_NoFunc.copy()
Operational_NFR_None_NoFunc_Labels_Lowercase['Label'] = modify_labels(Operational_NFR_None_NoFunc_Labels_Lowercase['Label'],'lower')

Operational_NFR_None_NoFunc_Labels_Capitalized = Operational_NFR_None_NoFunc.copy()
Operational_NFR_None_NoFunc_Labels_Capitalized['Label'] = modify_labels(Operational_NFR_None_NoFunc_Labels_Capitalized['Label'],'capitalize')


##Scalability
Scalability_NFR_None_NoFunc_Remove_Punctuation = Scalability_NFR_None_NoFunc.copy()
Scalability_NFR_None_NoFunc_Remove_Punctuation['RequirementText'] = modify_text(Scalability_NFR_None_NoFunc_Remove_Punctuation['RequirementText'],'remove_punctuation')

Scalability_NFR_None_NoFunc_Add_FullStops = Scalability_NFR_None_NoFunc.copy()
Scalability_NFR_None_NoFunc_Add_FullStops['RequirementText'] = modify_text(Scalability_NFR_None_NoFunc_Add_FullStops['RequirementText'],'add_full_stops')

Scalability_NFR_None_NoFunc_Labels_Uppercase = Scalability_NFR_None_NoFunc.copy()
Scalability_NFR_None_NoFunc_Labels_Uppercase['Label'] = modify_labels(Scalability_NFR_None_NoFunc_Labels_Uppercase['Label'],'upper')

Scalability_NFR_None_NoFunc_Labels_Lowercase = Scalability_NFR_None_NoFunc.copy()
Scalability_NFR_None_NoFunc_Labels_Lowercase['Label'] = modify_labels(Scalability_NFR_None_NoFunc_Labels_Lowercase['Label'],'lower')

Scalability_NFR_None_NoFunc_Labels_Capitalized = Scalability_NFR_None_NoFunc.copy()
Scalability_NFR_None_NoFunc_Labels_Capitalized['Label'] = modify_labels(Scalability_NFR_None_NoFunc_Labels_Capitalized['Label'],'capitalize')

##Usability
Usability_NFR_None_NoFunc_Remove_Punctuation = Usability_NFR_None_NoFunc.copy()
Usability_NFR_None_NoFunc_Remove_Punctuation['RequirementText'] = modify_text(Usability_NFR_None_NoFunc_Remove_Punctuation['RequirementText'],'remove_punctuation')

Usability_NFR_None_NoFunc_Add_FullStops = Usability_NFR_None_NoFunc.copy()
Usability_NFR_None_NoFunc_Add_FullStops['RequirementText'] = modify_text(Usability_NFR_None_NoFunc_Add_FullStops['RequirementText'],'add_full_stops')

Usability_NFR_None_NoFunc_Labels_Uppercase = Usability_NFR_None_NoFunc.copy()
Usability_NFR_None_NoFunc_Labels_Uppercase['Label'] = modify_labels(Usability_NFR_None_NoFunc_Labels_Uppercase['Label'],'upper')

Usability_NFR_None_NoFunc_Labels_Lowercase = Usability_NFR_None_NoFunc.copy()
Usability_NFR_None_NoFunc_Labels_Lowercase['Label'] = modify_labels(Usability_NFR_None_NoFunc_Labels_Lowercase['Label'],'lower')

Usability_NFR_None_NoFunc_Labels_Capitalized = Usability_NFR_None_NoFunc.copy()
Usability_NFR_None_NoFunc_Labels_Capitalized['Label'] = modify_labels(Usability_NFR_None_NoFunc_Labels_Capitalized['Label'],'capitalize')

## Maintainability_NFR_None_NoFunc
Maintainability_NFR_None_NoFunc_Remove_Punctuation = Maintainability_NFR_None_NoFunc.copy()
Maintainability_NFR_None_NoFunc_Remove_Punctuation['RequirementText'] = modify_text(Maintainability_NFR_None_NoFunc_Remove_Punctuation['RequirementText'],'remove_punctuation')

Maintainability_NFR_None_NoFunc_Add_FullStops = Maintainability_NFR_None_NoFunc.copy()
Maintainability_NFR_None_NoFunc_Add_FullStops['RequirementText'] = modify_text(Maintainability_NFR_None_NoFunc_Add_FullStops['RequirementText'],'add_full_stops')

Maintainability_NFR_None_NoFunc_Labels_Uppercase = Maintainability_NFR_None_NoFunc.copy()
Maintainability_NFR_None_NoFunc_Labels_Uppercase['Label'] = modify_labels(Maintainability_NFR_None_NoFunc_Labels_Uppercase['Label'],'upper')

Maintainability_NFR_None_NoFunc_Labels_Lowercase = Maintainability_NFR_None_NoFunc.copy()
Maintainability_NFR_None_NoFunc_Labels_Lowercase['Label'] = modify_labels(Maintainability_NFR_None_NoFunc_Labels_Lowercase['Label'],'lower')

Maintainability_NFR_None_NoFunc_Labels_Capitalized = Maintainability_NFR_None_NoFunc.copy()
Maintainability_NFR_None_NoFunc_Labels_Capitalized['Label'] = modify_labels(Maintainability_NFR_None_NoFunc_Labels_Capitalized['Label'],'capitalize')


## Look_and_Feel_NFR
Look_and_Feel_NFR_None_NoFunc_Remove_Punctuation = Look_and_Feel_NFR_None_NoFunc.copy()
Look_and_Feel_NFR_None_NoFunc_Remove_Punctuation['RequirementText'] = modify_text(Look_and_Feel_NFR_None_NoFunc_Remove_Punctuation['RequirementText'],'remove_punctuation')

Look_and_Feel_NFR_None_NoFunc_Add_FullStops = Look_and_Feel_NFR_None_NoFunc.copy()
Look_and_Feel_NFR_None_NoFunc_Add_FullStops['RequirementText'] = modify_text(Look_and_Feel_NFR_None_NoFunc_Add_FullStops['RequirementText'],'add_full_stops')

Look_and_Feel_NFR_None_NoFunc_Labels_Uppercase = Look_and_Feel_NFR_None_NoFunc.copy()
Look_and_Feel_NFR_None_NoFunc_Labels_Uppercase['Label'] = modify_labels(Look_and_Feel_NFR_None_NoFunc_Labels_Uppercase['Label'],'upper')

Look_and_Feel_NFR_None_NoFunc_Labels_Lowercase = Look_and_Feel_NFR_None_NoFunc.copy()
Look_and_Feel_NFR_None_NoFunc_Labels_Lowercase['Label'] = modify_labels(Look_and_Feel_NFR_None_NoFunc_Labels_Lowercase['Label'],'lower')

Look_and_Feel_NFR_None_NoFunc_Labels_Capitalized = Look_and_Feel_NFR_None_NoFunc.copy()
Look_and_Feel_NFR_None_NoFunc_Labels_Capitalized['Label'] = modify_labels(Look_and_Feel_NFR_None_NoFunc_Labels_Capitalized['Label'],'capitalize')

## Availability
Availability_NFR_None_NoFunc_Remove_Punctuation = Availability_NFR_None_NoFunc.copy()
Availability_NFR_None_NoFunc_Remove_Punctuation['RequirementText'] = modify_text(Availability_NFR_None_NoFunc_Remove_Punctuation['RequirementText'],'remove_punctuation')

Availability_NFR_None_NoFunc_Add_FullStops = Availability_NFR_None_NoFunc.copy()
Availability_NFR_None_NoFunc_Add_FullStops['RequirementText'] = modify_text(Availability_NFR_None_NoFunc_Add_FullStops['RequirementText'],'add_full_stops')

Availability_NFR_None_NoFunc_Labels_Uppercase = Availability_NFR_None_NoFunc.copy()
Availability_NFR_None_NoFunc_Labels_Uppercase['Label'] = modify_labels(Availability_NFR_None_NoFunc_Labels_Uppercase['Label'],'upper')

Availability_NFR_None_NoFunc_Labels_Lowercase = Availability_NFR_None_NoFunc.copy()
Availability_NFR_None_NoFunc_Labels_Lowercase['Label'] = modify_labels(Availability_NFR_None_NoFunc_Labels_Lowercase['Label'],'lower')

Availability_NFR_None_NoFunc_Labels_Capitalized = Availability_NFR_None_NoFunc.copy()
Availability_NFR_None_NoFunc_Labels_Capitalized['Label'] = modify_labels(Availability_NFR_None_NoFunc_Labels_Capitalized['Label'],'capitalize')

## Portability_NFR_None_NoFunc
Portability_NFR_None_NoFunc_Remove_Punctuation = Portability_NFR_None_NoFunc.copy()
Portability_NFR_None_NoFunc_Remove_Punctuation['RequirementText'] = modify_text(Portability_NFR_None_NoFunc_Remove_Punctuation['RequirementText'],'remove_punctuation')

Portability_NFR_None_NoFunc_Add_FullStops = Portability_NFR_None_NoFunc.copy()
Portability_NFR_None_NoFunc_Add_FullStops['RequirementText'] = modify_text(Portability_NFR_None_NoFunc_Add_FullStops['RequirementText'],'add_full_stops')

Portability_NFR_None_NoFunc_Labels_Uppercase = Portability_NFR_None_NoFunc.copy()
Portability_NFR_None_NoFunc_Labels_Uppercase['Label'] = modify_labels(Portability_NFR_None_NoFunc_Labels_Uppercase['Label'],'upper')

Portability_NFR_None_NoFunc_Labels_Lowercase = Portability_NFR_None_NoFunc.copy()
Portability_NFR_None_NoFunc_Labels_Lowercase['Label'] = modify_labels(Portability_NFR_None_NoFunc_Labels_Lowercase['Label'],'lower')

Portability_NFR_None_NoFunc_Labels_Capitalized = Portability_NFR_None_NoFunc.copy()
Portability_NFR_None_NoFunc_Labels_Capitalized['Label'] = modify_labels(Portability_NFR_None_NoFunc_Labels_Capitalized['Label'],'capitalize')

##Fault_Tolerance_NFR_None_NoFunc
Fault_Tolerance_NFR_None_NoFunc_Remove_Punctuation = Fault_Tolerance_NFR_None_NoFunc.copy()
Fault_Tolerance_NFR_None_NoFunc_Remove_Punctuation['RequirementText'] = modify_text(Fault_Tolerance_NFR_None_NoFunc_Remove_Punctuation['RequirementText'],'remove_punctuation')

Fault_Tolerance_NFR_None_NoFunc_Add_FullStops = Fault_Tolerance_NFR_None_NoFunc.copy()
Fault_Tolerance_NFR_None_NoFunc_Add_FullStops['RequirementText'] = modify_text(Fault_Tolerance_NFR_None_NoFunc_Add_FullStops['RequirementText'],'add_full_stops')

Fault_Tolerance_NFR_None_NoFunc_Labels_Uppercase = Fault_Tolerance_NFR_None_NoFunc.copy()
Fault_Tolerance_NFR_None_NoFunc_Labels_Uppercase['Label'] = modify_labels(Fault_Tolerance_NFR_None_NoFunc_Labels_Uppercase['Label'],'upper')

Fault_Tolerance_NFR_None_NoFunc_Labels_Lowercase = Fault_Tolerance_NFR_None_NoFunc.copy()
Fault_Tolerance_NFR_None_NoFunc_Labels_Lowercase['Label'] = modify_labels(Fault_Tolerance_NFR_None_NoFunc_Labels_Lowercase['Label'],'lower')

Fault_Tolerance_NFR_None_NoFunc_Labels_Capitalized = Fault_Tolerance_NFR_None_NoFunc.copy()
Fault_Tolerance_NFR_None_NoFunc_Labels_Capitalized['Label'] = modify_labels(Fault_Tolerance_NFR_None_NoFunc_Labels_Capitalized['Label'],'capitalize')

##Legal
Legal_NFR_None_NoFunc_Remove_Punctuation = Legal_NFR_None_NoFunc.copy()
Legal_NFR_None_NoFunc_Remove_Punctuation['RequirementText'] = modify_text(Legal_NFR_None_NoFunc_Remove_Punctuation['RequirementText'],'remove_punctuation')

Legal_NFR_None_NoFunc_Add_FullStops = Legal_NFR_None_NoFunc.copy()
Legal_NFR_None_NoFunc_Add_FullStops['RequirementText'] = modify_text(Legal_NFR_None_NoFunc_Add_FullStops['RequirementText'],'add_full_stops')

Legal_NFR_None_NoFunc_Labels_Uppercase = Legal_NFR_None_NoFunc.copy()
Legal_NFR_None_NoFunc_Labels_Uppercase['Label'] = modify_labels(Legal_NFR_None_NoFunc_Labels_Uppercase['Label'],'upper')

Legal_NFR_None_NoFunc_Labels_Lowercase = Legal_NFR_None_NoFunc.copy()
Legal_NFR_None_NoFunc_Labels_Lowercase['Label'] = modify_labels(Legal_NFR_None_NoFunc_Labels_Lowercase['Label'],'lower')

Legal_NFR_None_NoFunc_Labels_Capitalized = Legal_NFR_None_NoFunc.copy()
Legal_NFR_None_NoFunc_Labels_Capitalized['Label'] = modify_labels(Legal_NFR_None_NoFunc_Labels_Capitalized['Label'],'capitalize')
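
The cell above builds the same five variants for every class. As a sketch, the variants could be generated in a loop and stored in a dictionary keyed by the variant suffix. The notebook's own `modify_text` and `modify_labels` are defined elsewhere and are not reproduced here; `modify_text_stub` and `modify_labels_stub` below are hypothetical stand-ins whose behavior (strip ASCII punctuation, append a missing full stop, change label casing) is assumed, not taken from the notebook.

```python
import string

import pandas as pd


def modify_text_stub(texts, mode):
    """Assumed stand-in for the notebook's modify_text (behavior is a guess)."""
    if mode == "remove_punctuation":
        return texts.str.translate(str.maketrans("", "", string.punctuation))
    if mode == "add_full_stops":
        return texts.apply(lambda t: t if t.endswith(".") else t + ".")
    raise ValueError(f"unsupported text mode: {mode}")


def modify_labels_stub(labels, mode):
    """Assumed stand-in for the notebook's modify_labels (behavior is a guess)."""
    casing = {"upper": labels.str.upper, "lower": labels.str.lower,
              "capitalize": labels.str.capitalize}
    return casing[mode]()


TEXT_VARIANTS = {"Remove_Punctuation": "remove_punctuation", "Add_FullStops": "add_full_stops"}
LABEL_VARIANTS = {"Labels_Uppercase": "upper", "Labels_Lowercase": "lower",
                  "Labels_Capitalized": "capitalize"}


def build_variants(df):
    """Return {suffix: DataFrame} with one modified copy of df per variant."""
    variants = {}
    for suffix, mode in TEXT_VARIANTS.items():
        variant = df.copy()
        variant["RequirementText"] = modify_text_stub(variant["RequirementText"], mode)
        variants[suffix] = variant
    for suffix, mode in LABEL_VARIANTS.items():
        variant = df.copy()
        variant["Label"] = modify_labels_stub(variant["Label"], mode)
        variants[suffix] = variant
    return variants


base = pd.DataFrame({"RequirementText": ["The system shall respond, quickly"],
                     "Label": ["Look and Feel"]})
variants = build_variants(base)
print({suffix: (v["RequirementText"].iloc[0], v["Label"].iloc[0]) for suffix, v in variants.items()})
```

A dictionary keeps the variants addressable by name (for example `variants["Labels_Uppercase"]`) without declaring a separate variable for every class and variant combination.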

#### NFRs grouped by theme (similar classes)
**Theme 1: Performance and Reliability**

* Performance: The system's speed and efficiency in processing tasks.
* Availability: The system's uptime and accessibility.
* Fault Tolerance: The system's ability to continue functioning despite failures.

**Theme 2: Usability and Experience**
* Usability: The ease with which users can interact with the system.
* Look and Feel: The system's visual appeal and user interface design.

**Theme 3: Operational and Maintenance**
* Maintainability: The ease of modifying and updating the system.
* Scalability: The system's ability to handle increasing workloads.
* (excluded) Portability: The system's ability to function in different environments.
* Operational: The system's performance in terms of its operational characteristics.

**Theme 4: Legal and Security**
* Security: The system's protection against unauthorized access and data breaches.
* Legal: Compliance with relevant laws and regulations.

In [None]:
NFR_theme_1 = NFR_None_NoFunc[NFR_None_NoFunc['Label'].isin(['Performance', 'Availability', 'Fault Tolerance'])].copy()
NFR_theme_2 = NFR_None_NoFunc[NFR_None_NoFunc['Label'].isin(['Usability', 'Look and Feel'])].copy()
NFR_theme_3 = NFR_None_NoFunc[NFR_None_NoFunc['Label'].isin(['Maintainability', 'Scalability', 'Operational'])].copy()
NFR_theme_4 = NFR_None_NoFunc[NFR_None_NoFunc['Label'].isin(['Security', 'Legal'])].copy()

NFR_theme_1_Remove_Punctuation = NFR_Remove_Punctuation_NoFunc[NFR_Remove_Punctuation_NoFunc['Label'].isin(['Performance', 'Availability', 'Fault Tolerance'])].copy()
NFR_theme_2_Remove_Punctuation = NFR_Remove_Punctuation_NoFunc[NFR_Remove_Punctuation_NoFunc['Label'].isin(['Usability', 'Look and Feel'])].copy()
NFR_theme_3_Remove_Punctuation = NFR_Remove_Punctuation_NoFunc[NFR_Remove_Punctuation_NoFunc['Label'].isin(['Maintainability', 'Scalability', 'Operational'])].copy()
NFR_theme_4_Remove_Punctuation = NFR_Remove_Punctuation_NoFunc[NFR_Remove_Punctuation_NoFunc['Label'].isin(['Security', 'Legal'])].copy()

NFR_theme_1_Add_FullStops = NFR_Add_FullStops_NoFunc[NFR_Add_FullStops_NoFunc['Label'].isin(['Performance', 'Availability', 'Fault Tolerance'])].copy()
NFR_theme_2_Add_FullStops = NFR_Add_FullStops_NoFunc[NFR_Add_FullStops_NoFunc['Label'].isin(['Usability', 'Look and Feel'])].copy()
NFR_theme_3_Add_FullStops = NFR_Add_FullStops_NoFunc[NFR_Add_FullStops_NoFunc['Label'].isin(['Maintainability', 'Scalability', 'Operational'])].copy()
NFR_theme_4_Add_FullStops = NFR_Add_FullStops_NoFunc[NFR_Add_FullStops_NoFunc['Label'].isin(['Security', 'Legal'])].copy()

NFR_theme_1_Labels_Uppercase = NFR_Labels_Uppercase_NoFunc[NFR_Labels_Uppercase_NoFunc['Label'].isin(['PERFORMANCE', 'AVAILABILITY', 'FAULT TOLERANCE'])].copy()
NFR_theme_2_Labels_Uppercase = NFR_Labels_Uppercase_NoFunc[NFR_Labels_Uppercase_NoFunc['Label'].isin(['USABILITY', 'LOOK AND FEEL'])].copy()
NFR_theme_3_Labels_Uppercase = NFR_Labels_Uppercase_NoFunc[NFR_Labels_Uppercase_NoFunc['Label'].isin(['MAINTAINABILITY', 'SCALABILITY', 'OPERATIONAL'])].copy()
NFR_theme_4_Labels_Uppercase = NFR_Labels_Uppercase_NoFunc[NFR_Labels_Uppercase_NoFunc['Label'].isin(['SECURITY', 'LEGAL'])].copy()

NFR_theme_1_Labels_Lowercase = NFR_Labels_Lowercase_NoFunc[NFR_Labels_Lowercase_NoFunc['Label'].isin(['performance', 'availability', 'fault tolerance'])].copy()
NFR_theme_2_Labels_Lowercase = NFR_Labels_Lowercase_NoFunc[NFR_Labels_Lowercase_NoFunc['Label'].isin(['usability', 'look and feel'])].copy()
NFR_theme_3_Labels_Lowercase = NFR_Labels_Lowercase_NoFunc[NFR_Labels_Lowercase_NoFunc['Label'].isin(['maintainability', 'scalability', 'operational'])].copy()
NFR_theme_4_Labels_Lowercase = NFR_Labels_Lowercase_NoFunc[NFR_Labels_Lowercase_NoFunc['Label'].isin(['security', 'legal'])].copy()

NFR_theme_1_Labels_Capitalized = NFR_Labels_Capitalized_NoFunc[NFR_Labels_Capitalized_NoFunc['Label'].isin(['Performance', 'Availability', 'Fault Tolerance'])].copy()
NFR_theme_2_Labels_Capitalized = NFR_Labels_Capitalized_NoFunc[NFR_Labels_Capitalized_NoFunc['Label'].isin(['Usability', 'Look and Feel'])].copy()
NFR_theme_3_Labels_Capitalized = NFR_Labels_Capitalized_NoFunc[NFR_Labels_Capitalized_NoFunc['Label'].isin(['Maintainability', 'Scalability', 'Operational'])].copy()
NFR_theme_4_Labels_Capitalized = NFR_Labels_Capitalized_NoFunc[NFR_Labels_Capitalized_NoFunc['Label'].isin(['Security', 'Legal'])].copy()
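
The theme subsets above retype the same label lists for every dataset variant. As a minimal sketch (not the notebook's own code), the lists could be derived from a single mapping so that each filter always matches the casing of its variant. `THEMES`, `CASE_VARIANTS`, and `theme_labels` are hypothetical names introduced here. One caveat worth checking: Python's `str.capitalize` lowercases everything after the first character, so if `modify_labels(..., 'capitalize')` relies on it (an assumption, since its implementation is not shown in this section), a label such as 'Look and Feel' would become 'Look and feel' and a hard-coded title-case filter would not match it.

```python
# Hypothetical helper: derive the label filter for each theme and case variant
# from one mapping instead of retyping the lists by hand.
THEMES = {
    "Performance and Reliability": ["Performance", "Availability", "Fault Tolerance"],
    "Usability and Experience": ["Usability", "Look and Feel"],
    "Operational and Maintenance": ["Maintainability", "Scalability", "Operational"],
    "Legal and Security": ["Security", "Legal"],
}

# Assumed to mirror the label transformations; the real modify_labels may differ.
CASE_VARIANTS = {
    "none": lambda s: s,
    "upper": str.upper,
    "lower": str.lower,
    "capitalize": str.capitalize,
}

def theme_labels(theme, variant="none"):
    """Return the theme's labels transformed the same way as the dataset variant."""
    transform = CASE_VARIANTS[variant]
    return [transform(label) for label in THEMES[theme]]

print(theme_labels("Usability and Experience", "upper"))
print(theme_labels("Usability and Experience", "capitalize"))
```

The returned list can be passed straight to `isin`, for example `df[df['Label'].isin(theme_labels(theme, variant))]`.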



#### Top 4 NFR Classes ONLY

In [None]:
NFR_top = NFR_None_NoFunc[NFR_None_NoFunc['Label'].isin(['Performance', 'Usability', 'Security', 'Operational'])].copy()
NFR_top_Remove_Punctuation = NFR_Remove_Punctuation_NoFunc[NFR_Remove_Punctuation_NoFunc['Label'].isin(['Performance', 'Usability', 'Security', 'Operational'])].copy()
NFR_top_Add_FullStops = NFR_Add_FullStops_NoFunc[NFR_Add_FullStops_NoFunc['Label'].isin(['Performance', 'Usability', 'Security', 'Operational'])].copy()
NFR_top_Labels_Uppercase = NFR_Labels_Uppercase_NoFunc[NFR_Labels_Uppercase_NoFunc['Label'].isin(['PERFORMANCE', 'USABILITY', 'SECURITY', 'OPERATIONAL'])].copy()
NFR_top_Labels_Lowercase = NFR_Labels_Lowercase_NoFunc[NFR_Labels_Lowercase_NoFunc['Label'].isin(['performance', 'usability', 'security', 'operational'])].copy()
NFR_top_Labels_Capitalized = NFR_Labels_Capitalized_NoFunc[NFR_Labels_Capitalized_NoFunc['Label'].isin(['Performance', 'Usability', 'Security', 'Operational'])].copy()

#LLMs (Models and Tokenizers)

In this cell, we load the models and tokenizers from the Hugging Face Model Hub in advance, so that each model does not have to be reloaded every time we classify requirements. We store the model and tokenizer instances in a list for later retrieval.

It is noteworthy that we used `AutoModelForSequenceClassification`. This general-purpose class works with any pre-trained model in the Hugging Face Transformers library that supports sequence classification, and it automatically selects the correct model architecture from the provided model name or path. Some of the models we used also have dedicated sequence-classification classes; we observed the same performance with the general and the dedicated classes.

In [None]:
!pip install transformers



In [None]:
# llms_shortnames = ["Bloom", "T5", "Gemma", "Llama", "BART"]
# model_names = ["bigscience/bloom-560m", "facebook/tart-full-flan-t5-xl", "google/gemma-2b", "meta-llama/Meta-Llama-3-8B", "facebook/bart-large-mnli"]
# "meta-llama/Meta-Llama-3.1-8B-Instruct"

# In this notebook, we use Llama as a running example.
llms_shortnames = ["Llama"]
model_names = ["meta-llama/Meta-Llama-3-8B"]
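
The classification cells further down iterate over a list named `models_and_tokenizers`. The following is a minimal sketch of how such a list could be built once and reused. The function name `load_models_and_tokenizers` and its injectable `load_model` / `load_tokenizer` parameters are assumptions made for illustration, not the notebook's own code. By default it falls back to `AutoModelForSequenceClassification.from_pretrained` and `AutoTokenizer.from_pretrained`.

```python
def load_models_and_tokenizers(model_names, load_model=None, load_tokenizer=None):
    """Load each model and tokenizer once and return them as (model, tokenizer) pairs.

    load_model / load_tokenizer are hypothetical hooks so the loaders can be swapped;
    when omitted, the Transformers auto classes are used.
    """
    if load_model is None or load_tokenizer is None:
        # Imported lazily so the helper can be exercised without Transformers installed
        from transformers import AutoModelForSequenceClassification, AutoTokenizer
        load_model = load_model or AutoModelForSequenceClassification.from_pretrained
        load_tokenizer = load_tokenizer or AutoTokenizer.from_pretrained
    return [(load_model(name), load_tokenizer(name)) for name in model_names]

# Usage in the notebook (downloads the weights; some models may require Hub access):
# models_and_tokenizers = load_models_and_tokenizers(model_names)
```

Keeping the pairs in the same order as `model_names` means the index of each pair also identifies its short name in `llms_shortnames`.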

#Generate Definitions Using LLM (Optional)

This step assesses the model's comprehension of the predefined labels assigned to the experimental datasets. Using the `generate` method, we prompt the model to write a definition for a given label, which lets us evaluate how well it can articulate its understanding of that label.

In this step, we used two prompts:

Prompt 1: `"Define {label} as a software requirement:"`

Prompt 2: `"Define the following software requirement label: {label}"`
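
As a small illustration of how the two prompts can be filled in for every label, the sketch below formats each template with a label name. `PROMPT_TEMPLATES` and `build_prompts` are hypothetical names used only for this example; the template strings are the two prompts listed above.

```python
# The two prompt templates listed above, keyed by prompt number
PROMPT_TEMPLATES = {
    1: "Define {label} as a software requirement:",
    2: "Define the following software requirement label: {label}",
}

def build_prompts(labels, prompt_no):
    """Fill the selected template with each label, preserving the label order."""
    template = PROMPT_TEMPLATES[prompt_no]
    return [template.format(label=label) for label in labels]

for prompt in build_prompts(["Security", "Usability"], prompt_no=1):
    print(prompt)
```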

In [None]:
# llms_shortnames = ["Bloom", "T5", "Gemma", "Llama", "BART"]
# model_names = ["bigscience/bloom-560m", "facebook/tart-full-flan-t5-xl", "google/gemma-2b", "meta-llama/Meta-Llama-3-8B", "facebook/bart-large-mnli"]
llms_shortnames = ["Llama"]
model_names = ["meta-llama/Meta-Llama-3-8B"]

In [None]:
import torch
import transformers

labels = ['Functional', 'Non-functional', 'Quality', 'Non-quality', 'Security', 'Non-Security',
          'Performance', 'Security', 'Operational', 'Scalability', 'Usability', 'Maintainability',
          'Reliability', 'Look and Feel', 'Availability', 'Portability', 'Legal', 'Fault Tolerance']

def generate_text(model, tokenizer, prompt, max_new_tokens=50):
  # Tokenize the prompt and place the tensors on the same device as the model
  inputs = tokenizer(prompt, return_tensors='pt').to(model.device)

  # Generate a continuation of at most max_new_tokens tokens (no gradients needed)
  with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)

  # Decode the generated token IDs (prompt included) back into text
  return tokenizer.decode(output[0], skip_special_tokens=True)

# Load the tokenizer and the causal language model used for generation
tokenizer = transformers.AutoTokenizer.from_pretrained(model_names[0])
model_ = transformers.AutoModelForCausalLM.from_pretrained(model_names[0])

'''
for label in labels:
  prompt = f"In requirements engineering, define the nonfunctional requirements class {label}"
  generated_definition = generate_text(model_, tokenizer, prompt)
  print(generated_definition)
'''

In [None]:
# prompt: save each generated definition as label, definition, model name

import pandas as pd

# Collect one [label, definition, model name] row per label
data = []

for label in labels:
  prompt = f"In Requirements Engineering (RE), the requirements class {label} is about  "
  generated_definition = generate_text(model_, tokenizer, prompt)
  print(generated_definition)
  data.append([label, generated_definition, model_names[0]])

# Create a pandas DataFrame from the collected rows
df_2 = pd.DataFrame(data, columns=['label', 'definition', 'model_name'])

# Save the DataFrame to a CSV file
df_2.to_csv('generated_definitions_def2_llama_3.csv', index=False)


In [None]:
# Reset the rows so that this file only contains definitions generated with this prompt
data = []

for label in labels:
  prompt = f"The requirements class {label} is defined as  "
  generated_definition = generate_text(model_, tokenizer, prompt)
  print(generated_definition)
  data.append([label, generated_definition, model_names[0]])

# Create a pandas DataFrame from the collected rows
df_1 = pd.DataFrame(data, columns=['label', 'definition', 'model_name'])

# Save the DataFrame to a CSV file
df_1.to_csv('generated_definitions_def1_llama_3.csv', index=False)

In [None]:
# prompt: download all files except sample_data

!zip -r /content/downloaded_defintions_files_Friday.zip * -x "sample_data/*"


In [None]:
import transformers
labels = ['Functional', 'Non-functional', 'Quality', 'Non-quality', 'Security', 'Non-Security',
          'Performance', 'Security', 'Operational', 'Scalability', 'Usability', 'Maintainability',
          'Reliability', 'Look and Feel', 'Availability', 'Portability', 'Legal', 'Fault Tolerance', 'Functionality']

def generate_definitions(labels, model, tokenizer):
  for label in labels:
    prompt = f"Define the nonfunctional requirements class {label}"
    # Tokenize the prompt and keep the tensors on the same device as the model
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=50)
    definition = tokenizer.decode(outputs[0], skip_special_tokens=True)
    print(f"Definition of '{label}': {definition}")

tokenizer = transformers.AutoTokenizer.from_pretrained(model_names[0])
model_ = transformers.AutoModelForCausalLM.from_pretrained(model_names[0])
generate_definitions(labels, model_, tokenizer)


#ZSL Classification
For ZSL classification, we followed the inference-based (also called entailment-based) approach to Zero-Shot Learning (ZSL), which determines whether a given piece of text logically implies or supports a specific class or concept.
For example, if the class is "animal" and the text is "the cat chased the mouse", the model should determine that the text entails the class "animal" because a cat is an animal.
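
The entailment idea can be sketched in a few lines: each candidate class is turned into a hypothesis, every (text, hypothesis) pair is scored, and the class with the highest normalized score wins. This is only an illustration under stated assumptions: `zero_shot_classify`, the hypothesis template, and `toy_entail_score` are hypothetical, and the toy scorer is a hard-coded stand-in for the entailment score that a real model would produce.

```python
import math

def softmax(scores):
    """Normalize raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def zero_shot_classify(text, labels, entail_score, template="This text is about {label}."):
    """Score one hypothesis per label and return the best label with all probabilities."""
    hypotheses = [template.format(label=label) for label in labels]
    scores = [entail_score(text, hypothesis) for hypothesis in hypotheses]
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], dict(zip(labels, probs))

def toy_entail_score(premise, hypothesis):
    # Stand-in scorer (assumption): a real setup would return a model's entailment logit
    return 2.0 if "cat" in premise and "animal" in hypothesis else 0.0

label, probs = zero_shot_classify("the cat chased the mouse", ["animal", "vehicle"], toy_entail_score)
print(label, probs)
```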

We implemented ZSL in two ways. The first uses the torch library: a basic entailment process in PyTorch in which we compute the model outputs and extract the logits for subsequent classification. The second uses the HF pipeline (classifier). We use the latter as a 'sanity check', that is, to verify that the results from our own code are similar to those obtained from the pipeline. Some blogs and forum posts report different accuracies and prediction scores; see this [link](https://discuss.huggingface.co/t/new-pipeline-for-zero-shot-text-classification/681/104).
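
One simple way to quantify such a sanity check is to compare the two sets of predicted labels item by item. The sketch below is an assumption about how this comparison could be done, not the notebook's procedure; `agreement_rate` and `disagreements` are hypothetical helpers, and the two label lists are made-up placeholders rather than experimental results.

```python
def agreement_rate(preds_a, preds_b):
    """Fraction of items for which both methods predicted the same label."""
    if len(preds_a) != len(preds_b):
        raise ValueError("Both prediction lists must cover the same requirements")
    if not preds_a:
        return 0.0
    matches = sum(a == b for a, b in zip(preds_a, preds_b))
    return matches / len(preds_a)

def disagreements(preds_a, preds_b):
    """Indices of the items on which the two methods disagree."""
    return [i for i, (a, b) in enumerate(zip(preds_a, preds_b)) if a != b]

# Placeholder predictions for illustration only
torch_preds = ["Security", "Usability", "Legal", "Performance"]
pipeline_preds = ["Security", "Usability", "Security", "Performance"]
print(agreement_rate(torch_preds, pipeline_preds), disagreements(torch_preds, pipeline_preds))
```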



> We previously experimented with one of the fundamental approaches to ZSL, the "embedding-based" approach. *Reference: Alhoshan, W., Ferrari, A., & Zhao, L. (2023). Zero-shot learning for requirements classification: An exploratory study. Information and Software Technology, 159, 107202.*


## ZSL (Inference-based) - Torch
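
The evaluation in this section calls scikit-learn's `precision_recall_fscore_support` with `average='weighted'`. To make explicit what that average means, here is a plain-Python sketch of the same idea: per-class precision, recall, and F1 are computed first and then averaged with each class weighted by its number of true instances. `weighted_prf` is a hypothetical name, and undefined ratios are treated as 0 here (an assumption of this sketch; scikit-learn exposes a `zero_division` parameter for this case, which is also what its warning refers to when a class is never predicted).

```python
from collections import Counter

def weighted_prf(true_labels, predicted_labels):
    """Support-weighted precision, recall, and F1; undefined ratios count as 0."""
    support = Counter(true_labels)
    total = len(true_labels)
    precision = recall = f1 = 0.0
    for label, n_true in support.items():
        tp = sum(t == label and p == label for t, p in zip(true_labels, predicted_labels))
        n_pred = sum(p == label for p in predicted_labels)
        p_c = tp / n_pred if n_pred else 0.0  # class never predicted -> precision 0
        r_c = tp / n_true
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = n_true / total  # weight each class by its share of true instances
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return precision, recall, f1

print(weighted_prf(["A", "A", "B", "B"], ["A", "B", "B", "B"]))
```

Because frequent classes carry more weight, a model can score well on this average while still performing poorly on rare classes, so per-class scores remain worth inspecting.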

In [None]:
# Delete all files from colab -- for new experiments

!rm -rf *

In [None]:
from line_profiler import LineProfiler
import time
import tensorflow as tf

with tf.device(device_name):
  from sklearn.metrics import precision_recall_fscore_support
  import csv
  def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
    # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    model.eval()
    model.to(device)
    # Add a padding token to the tokenizer
    tokenizer.pad_token = tokenizer.eos_token
    # Add padding token to the model config
    model.config.pad_token_id = tokenizer.pad_token_id
    torch.cuda.empty_cache()
    candidate_labels = df['Label'].unique().tolist()
    print(candidate_labels)
    definitions = df['Label Definition'].unique().tolist()
    print(definitions)
    results = []
    true_labels = []
    predicted_labels = []
    for _, row in df.iterrows():
        req = row['RequirementText']
        true_label = row['Label']
        # Create input pairs (template or prompt)
        match prompt_no:
          case 1:
            input_texts = [f"This requirement: '{req}' is about '{label}' requirement." for label in candidate_labels]
          case 2:
            input_texts = [f"This requirement: '{req}' belongs to '{label}' category." for label in candidate_labels]
          case 3:
            input_texts = [f"Is this requirement: '{req}' about {label} requirement?" for label in candidate_labels]
          case 4:
            input_texts = [f"Does this requirement: '{req}' belong to {label} category?" for label in candidate_labels]
          case 5:
            input_texts = [f"{definition}. Hence this requirement: '{req}' is about  {label} requirement." for label,definition in zip(candidate_labels,definitions)]
          case 6:
            input_texts = [f"{definition}. Hence this requirement: '{req}' belongs to  {label} category." for label,definition in zip(candidate_labels,definitions)]

        # Tokenize input pairs
        encoded_input = tokenizer(input_texts, padding=True, truncation=True, return_tensors='pt').to(device)
        #encoded_input = tokenizer(input_texts, padding=True, truncation=True, return_tensors='pt')
        # Method 1
        # Run through the model
        with torch.no_grad():
          output = model(**encoded_input)
          logits = output.logits
        # Convert the logits into probabilities over the model's output classes
        if len(candidate_labels) > 2:
          print("Multi-classification with softmax")
          all_probs = torch.softmax(logits, dim=1)
        else:
          print("Binary-classification with sigmoid")
          all_probs = torch.sigmoid(logits)
        probs = all_probs.tolist()
        # Find the candidate whose input has the highest probability at output index 0
        max_index = max(range(len(probs)), key=lambda i: probs[i][0])
        # Probability of the winning candidate (first element of its row)
        result = probs[max_index][0]
        predicted_label = candidate_labels[max_index]
        # Append results to the list
        results.append({
            "LLM": llm_name,
            "Dataset_Type_Variation": dataset_name,
            "Requirement": req,
            "Labels": candidate_labels,
            "True Label": true_label,
            "Predicted Label": predicted_label,
            #"Scores": entailment_probs.tolist() if entailment_probs is not None else None
            "Scores":  all_probs.tolist()
        })
        true_labels.append(true_label)
        predicted_labels.append(predicted_label)

    precision, recall, f1, _ = precision_recall_fscore_support(true_labels, predicted_labels, average='weighted')
    return results, precision, recall, f1

  profiler = LineProfiler()
  profiler.add_function(classify_and_evaluate_torch)
  profiler.enable_by_count()

  dataset_title = "NFR-multi-classification"
  datasets = [NFR_Remove_Punctuation_NoFunc, NFR_Add_FullStops_NoFunc, NFR_Labels_Capitalized_NoFunc, NFR_Labels_Lowercase_NoFunc, NFR_Labels_Uppercase_NoFunc]
  dataset_names = ['NFR-Multi-Remove_Punctuation', 'NFR-Multi-Add_FullStops', 'NFR-Multi-Labels_Capitalized', 'NFR-Multi-Labels_Lowercase', 'NFR-Multi-Labels_Uppercase']
  '''
  dataset_title = "NFR-Thematic-fixing-theme-3"
  datasets = [NFR_theme_3, NFR_theme_3_Remove_Punctuation, NFR_theme_3_Add_FullStops, NFR_theme_3_Labels_Uppercase, NFR_theme_3_Labels_Lowercase, NFR_theme_3_Labels_Capitalized]
  dataset_names = ['NFR-Operational-Maintenance', 'NFR-Operational-Maintenance-Remove_Punctuation', 'NFR-Operational-Maintenance-Add_FullStops', 'NFR-Operational-Maintenance-Labels_Uppercase', 'NFR-Operational-Maintenance-Labels_Lowercase', 'NFR-Operational-Maintenance-Labels_Capitalized']


  dataset_title = "NFR-Top-Four"
  datasets = [NFR_top, NFR_top_Remove_Punctuation, NFR_top_Add_FullStops, NFR_top_Labels_Uppercase, NFR_top_Labels_Lowercase, NFR_top_Labels_Capitalized]
  dataset_names = ["NFR-Top-None", "NFR-TOP_Remove_Punctuation", "NFR-TOP_Add_FullStops", "NFR-TOP_Labels_Uppercase", "NFR-TOP_Labels_Lowercase", "NFR-TOP_Labels_Capitalized"]


  dataset_title = "NFR-Thematic"
  datasets = [NFR_theme_1, NFR_theme_1_Remove_Punctuation, NFR_theme_1_Add_FullStops, NFR_theme_1_Labels_Uppercase, NFR_theme_1_Labels_Lowercase, NFR_theme_1_Labels_Capitalized,
              NFR_theme_2, NFR_theme_2_Remove_Punctuation, NFR_theme_2_Add_FullStops, NFR_theme_2_Labels_Uppercase, NFR_theme_2_Labels_Lowercase, NFR_theme_2_Labels_Capitalized,
              NFR_theme_3, NFR_theme_3_Remove_Punctuation, NFR_theme_3_Add_FullStops, NFR_theme_3_Labels_Uppercase, NFR_theme_3_Labels_Lowercase, NFR_theme_3_Labels_Capitalized,
              NFR_theme_4, NFR_theme_4_Remove_Punctuation, NFR_theme_4_Add_FullStops, NFR_theme_4_Labels_Uppercase, NFR_theme_4_Labels_Lowercase, NFR_theme_4_Labels_Capitalized]

  dataset_names = ['NFR-Performance-Reliability', 'NFR-Performance-Reliability-Remove_Punctuation', 'NFR-Performance-Reliability-Add_FullStops', 'NFR-Performance-Reliability-Labels_Uppercase', 'NFR-Performance-Reliability-Labels_Lowercase', 'NFR-Performance-Reliability-Labels_Capitalized',
              'NFR-Usability-Experience', 'NFR-Usability-Experience-Remove_Punctuation', 'NFR-Usability-Experience-Add_FullStops', 'NFR-Usability-Experience-Labels_Uppercase', 'NFR-Usability-Experience-Labels_Lowercase', 'NFR-Usability-Experience-Labels_Capitalized',
              'NFR-Operational-Maintenance', 'NFR-Operational-Maintenance-Remove_Punctuation', 'NFR-Operational-Maintenance-Add_FullStops', 'NFR-Operational-Maintenance-Labels_Uppercase', 'NFR-Operational-Maintenance-Labels_Lowercase', 'NFR-Operational-Maintenance-Labels_Capitalized',
              'NFR-Legal-Security', 'NFR-Legal-Security-Remove_Punctuation', 'NFR-Legal-Security-Add_FullStops', 'NFR-Legal-Security-Labels_Uppercase', 'NFR-Legal-Security-Labels_Lowercase', 'NFR-Legal-Security-Labels_Capitalized']
  '''

  llm_name =  llms_shortnames[0]
  for j, (model, tokenizer) in enumerate(models_and_tokenizers):
      for i, dataset in enumerate(datasets):
          # Classify and evaluate for each dataset and model
          all_results = []
          evaluation_metrics = []
          dataset_name = dataset_names[i]
          print(f"====================>>Classifying and evaluating using Model: {llms_shortnames[j]} and Dataset: {dataset_names[i]}<<====================")
          llm_name = llms_shortnames[j]
          for prompt_no in range(1,7):
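            start_time = time.time()
            # Re-enable the profiler for this call; the disable below would otherwise leave it off after the first prompt
            profiler.enable_by_count()
            results, precision, recall, f1 = classify_and_evaluate_torch(dataset, model, tokenizer, llm_name, dataset_name, prompt_no)
            profiler.disable_by_count()
            profiler.print_stats()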
            print(f"Precision: {precision}, Recall: {recall}, F1-Score: {f1}")
            end_time = time.time()
            elapsed_time = end_time - start_time
            # Format elapsed time as HH:MM:SS
            hours = int(elapsed_time // 3600)
            minutes = int((elapsed_time % 3600) // 60)
            seconds = int(elapsed_time % 60)
            time_string = f"Total execution time: {hours:02d}:{minutes:02d}:{seconds:02d}"
            print(time_string)
            match prompt_no:
              case 1:
                prompt_title = "Assertion-'is about'"
              case 2:
                prompt_title = "Assertion-'belongs to'"
              case 3:
                prompt_title = "Question-'is about'"
              case 4:
                prompt_title = "Question-'belongs to'"
              case 5:
                prompt_title = "Definition-'is about'"
              case 6:
                prompt_title = "Definition-'belongs to'"
            evaluation_metrics.append({
              "Prompt": prompt_title,
              "LLM": llm_name,
              "Dataset_Type_Variation": dataset_name,
              "Precision": precision,
              "Recall": recall,
              "F1-Score": f1,
              "Execution Time": time_string,
            })
            all_results.extend(results)
            # Save classification results to CSV
          with open(dataset_title+'_'+dataset_name+'_'+ llm_name+'_ZSL_Torch_classification_results_Template.csv', 'w', newline='') as csvfile:
            fieldnames = ["LLM", "Dataset_Type_Variation", "Requirement", "Labels", "True Label", "Predicted Label", "Scores"]
            writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(all_results)
            # Save evaluation metrics to CSV
          with open(dataset_title+'_'+dataset_name+'_'+llm_name+'_ZSL_Torch_evaluation_metrics_Template.csv', 'w', newline='') as csvfile:
            fieldnames = ["Prompt", "LLM", "Dataset_Type_Variation", "Precision", "Recall", "F1-Score", "Execution Time"]
            writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(evaluation_metrics)



Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.


['Performance', 'Look and Feel', 'Usability', 'Availability', 'Security', 'Fault Tolerance', 'Scalability', 'Operational', 'Legal', 'Maintainability']
['Performance requirements are quality requirements that specify the performance measure of the system.', 'Look and feel requirements are quality requirements that consider all static and dynamic aspects of the user interface, including colors, shapes, layout, typefaces, buttons, boxes, and menus.', 'Usability requirements are quality requirements that define what a system has to do to support users task performance.', 'Availability requirements are quality requirements that ensure the maximum operational time of a system.', 'Security requirements are quality requirements that ensure the integrity, confidentiality, reliability, availability, and safety of a system.', 'Fault tolerance requirements are quality requirements that ensure the system to have the ability to detect the fault and to have a backup plan.', 'Scalability requirements 

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0


  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


Timer unit: 1e-09 s

Total time: 136.098 s
File: <ipython-input-73-6be0b9da56ff>
Function: classify_and_evaluate_torch at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
     8                                             def classify_and_evaluate_torch(df, model, tokenizer, llm_name, dataset_name, prompt_no = 1):
     9                                               # Assuming 'device' is defined elsewhere as 'cuda' or 'cpu'
    10         1   12697646.0    1e+07      0.0      model.eval()
    11         1 8303579579.0    8e+09      6.1      model.to(device)
    12                                               # Add a padding token to the tokenizer
    13         1      95819.0  95819.0      0.0      tokenizer.pad_token = tokenizer.eos_token
    14                                               # Add padding token to the model config
    15         1      51850.0  51850.0      0.0      model.config.pad_token_id = tokenizer.pad_token_id
    16         1    6323145.0

In [None]:
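# Imported here so the cell also runs on its own after a runtime restart
from psutil import virtual_memory

# virtual_memory().total is reported in bytes; convert to gigabytes
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))

if ram_gb < 20:
  print('Not using a high-RAM runtime')
else:
  print('You are using a high-RAM runtime!')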

##ZSL (Inference-based) - HF Pipeline (Not-used)


reference: https://github.com/huggingface/transformers/blob/main/src/transformers/pipelines/text_classification.py
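
The cell in this section scores predictions with `precision_recall_fscore_support(..., average='weighted')`. As a rough illustration of what that weighted average computes, here is a dependency-free sketch; the function name `weighted_prf` and the toy labels are hypothetical, and the sketch assumes that an undefined ratio (a label that is never predicted) is counted as 0.0, which is the situation scikit-learn warns about.

```python
from collections import Counter


def weighted_prf(true_labels, predicted_labels):
    """Support-weighted precision, recall and F1; undefined ratios count as 0.0."""
    support = Counter(true_labels)          # how often each label truly occurs
    predicted = Counter(predicted_labels)   # how often each label is predicted
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    total = len(true_labels)

    precision = recall = f1 = 0.0
    for label, n_true in support.items():
        # Precision is undefined when the label is never predicted; use 0.0.
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / n_true
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = n_true / total             # weight each label by its support
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return precision, recall, f1


# Toy example: label "C" is never predicted, so its precision falls back to 0.0.
y_true = ["A", "A", "B", "C"]
y_pred = ["A", "B", "B", "B"]
precision, recall, f1 = weighted_prf(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because each label is weighted by how often it occurs in the true labels, a rare label that the model never predicts lowers the scores only in proportion to its support.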

In [None]:
with tf.device(device_name):
  from transformers import pipeline
  import numpy as np
  import torch
  import time
  from sklearn.metrics import precision_recall_fscore_support
  import csv
  def classify_and_evaluate(df, model, tokenizer, llm_name, dataset_name):
      torch.cuda.empty_cache()
      model.to(device)
      # Create a zero-shot classification pipeline
      # Add a padding token to the tokenizer
      tokenizer.pad_token = tokenizer.eos_token
      # Add padding token to the model config
      model.config.pad_token_id = tokenizer.pad_token_id
      classifier = pipeline("zero-shot-classification", model=model,  device = device, tokenizer=tokenizer, batch_size=2) #fix it to be running on GPU
      candidate_labels = df['Label'].unique().tolist()
      results = []
      true_labels = []
      predicted_labels = []

      for _, row in df.iterrows():
          req = row['RequirementText']
          true_label = row['Label']
          result = classifier(req, candidate_labels)
          predicted_label = result['labels'][0]
          results.append({
              "LLM": llm_name,
              "Dataset_Type_Variation": dataset_name,
              "Requirement": req,
              "True Label": true_label,
              "Predicted Label": predicted_label,
              "Scores": result['scores']
          })
          true_labels.append(true_label)
          predicted_labels.append(predicted_label)

      precision, recall, f1, _ = precision_recall_fscore_support(true_labels, predicted_labels, average='weighted')
      return results, precision, recall, f1

  # Classify and evaluate for each dataset and model
  all_results = []
  evaluation_metrics = []
  dataset_title = "NFR-Thematic"
  datasets = [NFR_theme_1, NFR_theme_1_Remove_Punctuation, NFR_theme_1_Add_FullStops, NFR_theme_1_Labels_Uppercase, NFR_theme_1_Labels_Lowercase, NFR_theme_1_Labels_Capitalized,
              NFR_theme_2, NFR_theme_2_Remove_Punctuation, NFR_theme_2_Add_FullStops, NFR_theme_2_Labels_Uppercase, NFR_theme_2_Labels_Lowercase, NFR_theme_2_Labels_Capitalized,
              NFR_theme_3, NFR_theme_3_Remove_Punctuation, NFR_theme_3_Add_FullStops, NFR_theme_3_Labels_Uppercase, NFR_theme_3_Labels_Lowercase, NFR_theme_3_Labels_Capitalized,
              NFR_theme_4, NFR_theme_4_Remove_Punctuation, NFR_theme_4_Add_FullStops, NFR_theme_4_Labels_Uppercase, NFR_theme_4_Labels_Lowercase, NFR_theme_4_Labels_Capitalized]
  dataset_names = ['NFR-Performance-Reliability', 'NFR-Performance-Reliability-Remove_Punctuation', 'NFR-Performance-Reliability-Add_FullStops', 'NFR-Performance-Reliability-Labels_Uppercase', 'NFR-Performance-Reliability-Labels_Lowercase', 'NFR-Performance-Reliability-Labels_Capitalized',
              'NFR-Usability-Experience', 'NFR-Usability-Experience-Remove_Punctuation', 'NFR-Usability-Experience-Add_FullStops', 'NFR-Usability-Experience-Labels_Uppercase', 'NFR-Usability-Experience-Labels_Lowercase', 'NFR-Usability-Experience-Labels_Capitalized',
              'NFR-Operational-Maintenance', 'NFR-Operational-Maintenance-Remove_Punctuation', 'NFR-Operational-Maintenance-Add_FullStops', 'NFR-Operational-Maintenance-Labels_Uppercase', 'NFR-Operational-Maintenance-Labels_Lowercase', 'NFR-Operational-Maintenance-Labels_Capitalized',
              'NFR-Legal-Security', 'NFR-Legal-Security-Remove_Punctuation', 'NFR-Legal-Security-Add_FullStops', 'NFR-Legal-Security-Labels_Uppercase', 'NFR-Legal-Security-Labels_Lowercase', 'NFR-Legal-Security-Labels_Capitalized']

  '''
  ################# Binary Classification Datasets #################

  ##Functional Datasets
  dataset_title = 'Functional'
  datasets = [Functional_None, Functional_Add_FullStops, Functional_Remove_Punctuation, Functional_Labels_Uppercase, Functional_Labels_Lowercase, Functional_Labels_Capitalized]
  dataset_names = ['Functional-Binary-None', 'Functional-Binary-Add_FullStops', 'Functional-Binary-Remove_Punctuation', 'Functional-Binary-Labels_Uppercase',
                   'Functional-Binary-Labels_Lowercase', 'Functional-Binary-Labels_Capitalized']

  ## Quality Datasets
  dataset_title = 'Quality'
  datasets = [Quality_None, Quality_Add_FullStops, Quality_Remove_Punctuation, Quality_Labels_Uppercase, Quality_Labels_Lowercase, Quality_Labels_Capitalized]
  dataset_names = ['Quality-Binary-None', 'Quality-Binary-Add_FullStops', 'Quality-Binary-Remove_Punctuation', 'Quality-Binary-Labels_Uppercase',
                   'Quality-Binary-Labels_Lowercase', 'Quality-Binary-Labels_Capitalized']


  ## SeqReq Datasets
  dataset_title = 'SeqReq'
  datasets = [SeqReq_None, SeqReq_Add_FullStops, SeqReq_Remove_Punctuation, SeqReq_Labels_Uppercase, SeqReq_Labels_Lowercase, SeqReq_Labels_Capitalized]
  dataset_names = ['SeqReq-Binary-None', 'SeqReq-Binary-Add_FullStops', 'SeqReq-Binary-Remove_Punctuation', 'SeqReq-Binary-Labels_Uppercase',
                   'SeqReq-Binary-Labels_Lowercase', 'SeqReq-Binary-Labels_Capitalized']

  ################# Multi-Classification Datasets #################

  ## NFR Datasets
  dataset_title = 'NFR'
  datasets  = [NFR_None, NFR_Add_FullStops, NFR_Remove_Punctuation, NFR_Labels_Uppercase, NFR_Labels_Lowercase, NFR_Labels_Capitalized]
  dataset_names = ['NFR-Multi-None', 'NFR-Multi-Add_FullStops', 'NFR-Multi-Remove_Punctuation', 'NFR-Multi-Labels_Uppercase',
                   'NFR-Multi-Labels_Lowercase', 'NFR-Multi-Labels_Capitalized']

  dataset_title = "NFR-NoFunc"
  datasets = [NFR_None_NoFunc, NFR_Add_FullStops_NoFunc, NFR_Remove_Punctuation_NoFunc, NFR_Labels_Uppercase_NoFunc, NFR_Labels_Lowercase_NoFunc, NFR_Labels_Capitalized_NoFunc]
  dataset_names = ['NFR-Multi-None', 'NFR-Multi-Add_FullStops', 'NFR-Multi-Remove_Punctuation', 'NFR-Multi-Labels_Uppercase', 'NFR-Multi-Labels_Lowercase', 'NFR-Multi-Labels_Capitalized']

  dataset_title = "NFR-Thematic"
  datasets = [NFR_theme_1, NFR_theme_1_Remove_Punctuation, NFR_theme_1_Add_FullStops, NFR_theme_1_Labels_Uppercase, NFR_theme_1_Labels_Lowercase, NFR_theme_1_Labels_Capitalized,
              NFR_theme_2, NFR_theme_2_Remove_Punctuation, NFR_theme_2_Add_FullStops, NFR_theme_2_Labels_Uppercase, NFR_theme_2_Labels_Lowercase, NFR_theme_2_Labels_Capitalized,
              NFR_theme_3, NFR_theme_3_Remove_Punctuation, NFR_theme_3_Add_FullStops, NFR_theme_3_Labels_Uppercase, NFR_theme_3_Labels_Lowercase, NFR_theme_3_Labels_Capitalized,
              NFR_theme_4, NFR_theme_4_Remove_Punctuation, NFR_theme_4_Add_FullStops, NFR_theme_4_Labels_Uppercase, NFR_theme_4_Labels_Lowercase, NFR_theme_4_Labels_Capitalized]
  dataset_names = ['NFR-Performance-Reliability', 'NFR-Performance-Reliability-Remove_Punctuation', 'NFR-Performance-Reliability-Add_FullStops', 'NFR-Performance-Reliability-Labels_Uppercase', 'NFR-Performance-Reliability-Labels_Lowercase', 'NFR-Performance-Reliability-Labels_Capitalized',
              'NFR-Usability-Experience', 'NFR-Usability-Experience-Remove_Punctuation', 'NFR-Usability-Experience-Add_FullStops', 'NFR-Usability-Experience-Labels_Uppercase', 'NFR-Usability-Experience-Labels_Lowercase', 'NFR-Usability-Experience-Labels_Capitalized',
              'NFR-Operational-Maintenance', 'NFR-Operational-Maintenance-Remove_Punctuation', 'NFR-Operational-Maintenance-Add_FullStops', 'NFR-Operational-Maintenance-Labels_Uppercase', 'NFR-Operational-Maintenance-Labels_Lowercase', 'NFR-Operational-Maintenance-Labels_Capitalized',
              'NFR-Legal-Security', 'NFR-Legal-Security-Remove_Punctuation', 'NFR-Legal-Security-Add_FullStops', 'NFR-Legal-Security-Labels_Uppercase', 'NFR-Legal-Security-Labels_Lowercase', 'NFR-Legal-Security-Labels_Capitalized']


  ################# OVR-Classification Datasets (NOT USED) #################
  ## OVR-Classification NFR dataset except Functional

  dataset_title = "NFR-NoFunc-OVR"
  datasets = [Performance_NFR_None_NoFunc, Performance_NFR_None_NoFunc_Remove_Punctuation, Performance_NFR_None_NoFunc_Add_FullStops, Performance_NFR_None_NoFunc_Labels_Uppercase, Performance_NFR_None_NoFunc_Labels_Lowercase, Performance_NFR_None_NoFunc_Labels_Capitalized,
  Availability_NFR_None_NoFunc, Availability_NFR_None_NoFunc_Remove_Punctuation,  Availability_NFR_None_NoFunc_Add_FullStops, Availability_NFR_None_NoFunc_Labels_Uppercase, Availability_NFR_None_NoFunc_Labels_Lowercase, Availability_NFR_None_NoFunc_Labels_Capitalized,
  Operational_NFR_None_NoFunc, Operational_NFR_None_NoFunc_Remove_Punctuation, Operational_NFR_None_NoFunc_Add_FullStops, Operational_NFR_None_NoFunc_Labels_Uppercase, Operational_NFR_None_NoFunc_Labels_Lowercase, Operational_NFR_None_NoFunc_Labels_Capitalized,
  Usability_NFR_None_NoFunc, Usability_NFR_None_NoFunc_Remove_Punctuation, Usability_NFR_None_NoFunc_Add_FullStops, Usability_NFR_None_NoFunc_Labels_Uppercase, Usability_NFR_None_NoFunc_Labels_Lowercase, Usability_NFR_None_NoFunc_Labels_Capitalized,
  Maintainability_NFR_None_NoFunc, Maintainability_NFR_None_NoFunc_Remove_Punctuation, Maintainability_NFR_None_NoFunc_Add_FullStops, Maintainability_NFR_None_NoFunc_Labels_Uppercase, Maintainability_NFR_None_NoFunc_Labels_Lowercase, Maintainability_NFR_None_NoFunc_Labels_Capitalized,
  Scalability_NFR_None_NoFunc, Scalability_NFR_None_NoFunc_Remove_Punctuation, Scalability_NFR_None_NoFunc_Add_FullStops, Scalability_NFR_None_NoFunc_Labels_Uppercase, Scalability_NFR_None_NoFunc_Labels_Lowercase, Scalability_NFR_None_NoFunc_Labels_Capitalized,
  Portability_NFR_None_NoFunc, Portability_NFR_None_NoFunc_Remove_Punctuation, Portability_NFR_None_NoFunc_Add_FullStops, Portability_NFR_None_NoFunc_Labels_Uppercase, Portability_NFR_None_NoFunc_Labels_Lowercase, Portability_NFR_None_NoFunc_Labels_Capitalized,
  Security_NFR_None_NoFunc, Security_NFR_None_NoFunc_Remove_Punctuation, Security_NFR_None_NoFunc_Add_FullStops, Security_NFR_None_NoFunc_Labels_Uppercase, Security_NFR_None_NoFunc_Labels_Lowercase, Security_NFR_None_NoFunc_Labels_Capitalized,
  Legal_NFR_None_NoFunc, Legal_NFR_None_NoFunc_Remove_Punctuation,  Legal_NFR_None_NoFunc_Add_FullStops, Legal_NFR_None_NoFunc_Labels_Uppercase, Legal_NFR_None_NoFunc_Labels_Lowercase, Legal_NFR_None_NoFunc_Labels_Capitalized,
  Fault_Tolerance_NFR_None_NoFunc, Fault_Tolerance_NFR_None_NoFunc_Remove_Punctuation, Fault_Tolerance_NFR_None_NoFunc_Add_FullStops, Fault_Tolerance_NFR_None_NoFunc_Labels_Uppercase, Fault_Tolerance_NFR_None_NoFunc_Labels_Lowercase, Fault_Tolerance_NFR_None_NoFunc_Labels_Capitalized]

  dataset_names = ['Performance-NFR-None', 'Performance-NFR-Remove_Punctuation', 'Performance-NFR-Add_FullStops', 'Performance-NFR-Labels_Uppercase', 'Performance-NFR-Labels_Lowercase', 'Performance-NFR-Labels_Capitalized',
              'Availability-NFR-None', 'Availability-NFR-Remove_Punctuation', 'Availability-NFR-Add_FullStops', 'Availability-NFR-Labels_Uppercase', 'Availability-NFR-Labels_Lowercase', 'Availability-NFR-Labels_Capitalized',
              'Operational-NFR-None', 'Operational-NFR-Remove_Punctuation', 'Operational-NFR-Add_FullStops', 'Operational-NFR-Labels_Uppercase', 'Operational-NFR-Labels_Lowercase', 'Operational-NFR-Labels_Capitalized',
              'Usability-NFR-None', 'Usability-NFR-Remove_Punctuation', 'Usability-NFR-Add_FullStops', 'Usability-NFR-Labels_Uppercase', 'Usability-NFR-Labels_Lowercase', 'Usability-NFR-Labels_Capitalized',
              'Maintainability-NFR-None', 'Maintainability-NFR-Remove_Punctuation', 'Maintainability-NFR-Add_FullStops', 'Maintainability-NFR-Labels_Uppercase', 'Maintainability-NFR-Labels_Lowercase', 'Maintainability-NFR-Labels_Capitalized',
              'Scalability-NFR-None', 'Scalability-NFR-Remove_Punctuation', 'Scalability-NFR-Add_FullStops', 'Scalability-NFR-Labels_Uppercase', 'Scalability-NFR-Labels_Lowercase', 'Scalability-NFR-Labels_Capitalized',
              'Portability-NFR-None', 'Portability-NFR-Remove_Punctuation', 'Portability-NFR-Add_FullStops', 'Portability-NFR-Labels_Uppercase', 'Portability-NFR-Labels_Lowercase', 'Portability-NFR-Labels_Capitalized',
              'Security-NFR-None', 'Security-NFR-Remove_Punctuation', 'Security-NFR-Add_FullStops', 'Security-NFR-Labels_Uppercase', 'Security-NFR-Labels_Lowercase', 'Security-NFR-Labels_Capitalized',
              'Legal-NFR-None', 'Legal-NFR-Remove_Punctuation', 'Legal-NFR-Add_FullStops', 'Legal-NFR-Labels_Uppercase', 'Legal-NFR-Labels_Lowercase', 'Legal-NFR-Labels_Capitalized',
              'Fault_Tolerance_NFR-None', 'Fault_Tolerance_NFR-Remove_Punctuation', 'Fault_Tolerance_NFR-Add_FullStops', 'Fault_Tolerance_NFR-Labels_Uppercase', 'Fault_Tolerance_NFR-Labels_Lowercase', 'Fault_Tolerance_NFR-Labels_Capitalized']

  ################# Combined Dataset (NOT COMPLETED) #################
  ## Combined Datasets
  dataset_title = 'Combined'
  datasets = [Functional_None, Functional_Add_FullStops, Functional_Remove_Punctuation, Functional_Labels_Uppercase, Functional_Labels_Lowercase, Functional_Labels_Capitalized,
              Quality_None, Quality_Add_FullStops, Quality_Remove_Punctuation, Quality_Labels_Uppercase, Quality_Labels_Lowercase, Quality_Labels_Capitalized,
              SeqReq_None, SeqReq_Add_FullStops, SeqReq_Remove_Punctuation, SeqReq_Labels_Uppercase, SeqReq_Labels_Lowercase, SeqReq_Labels_Capitalized,
              NFR_None, NFR_Add_FullStops, NFR_Remove_Punctuation, NFR_Labels_Uppercase, NFR_Labels_Lowercase, NFR_Labels_Capitalized]
  dataset_names = ['Functional-Binary-None', 'Functional-Binary-Add_FullStops', 'Functional-Binary-Remove_Punctuation', 'Functional-Binary-Labels_Uppercase', 'Functional-Binary-Labels_Lowercase', 'Functional-Binary-Labels_Capitalized',
                   'Quality-Binary-None', 'Quality-Binary-Add_FullStops', 'Quality-Binary-Remove_Punctuation', 'Quality-Binary-Labels_Uppercase', 'Quality-Binary-Labels_Lowercase', 'Quality-Binary-Labels_Capitalized',
                   'SeqReq-Binary-None', 'SeqReq-Binary-Add_FullStops', 'SeqReq-Binary-Remove_Punctuation', 'SeqReq-Binary-Labels_Uppercase', 'SeqReq-Binary-Labels_Lowercase', 'SeqReq-Binary-Labels_Capitalized',
                   'NFR-Multi-None', 'NFR-Multi-Add_FullStops', 'NFR-Multi-Remove_Punctuation', 'NFR-Multi-Labels_Uppercase', 'NFR-Multi-Labels_Lowercase', 'NFR-Multi-Labels_Capitalized']

  '''


  for i, dataset in enumerate(datasets):
      dataset_name = dataset_names[i]
      print(f"Classifying and evaluating for {dataset_name} dataset...")
      for j, (model, tokenizer) in enumerate(models_and_tokenizers):
          llm_name = llms_shortnames[j]
          print(f"Classifying and evaluating for {llm_name} model...")
          # Time and profile each (dataset, model) run on its own, so a later
          # model's figures do not include the runs that came before it.
          start_time = time.time()
          profiler = LineProfiler()
          profiler.add_function(classify_and_evaluate)
          profiler.enable_by_count()
          results, precision, recall, f1 = classify_and_evaluate(dataset, model, tokenizer, llm_name, dataset_name)
          profiler.disable_by_count()
          profiler.print_stats()
          all_results.extend(results)
          print(f"Precision: {precision}, Recall: {recall}, F1-Score: {f1}")
          elapsed_time = time.time() - start_time
          # Format elapsed time as HH:MM:SS
          hours = int(elapsed_time // 3600)
          minutes = int((elapsed_time % 3600) // 60)
          seconds = int(elapsed_time % 60)
          time_string = f"Total execution time: {hours:02d}:{minutes:02d}:{seconds:02d}"
          print(time_string)
          # One metrics row per (model, dataset variation) pair
          evaluation_metrics.append({
              "LLM": llm_name,
              "Dataset_Type_Variation": dataset_name,
              "Precision": precision,
              "Recall": recall,
              "F1-Score": f1,
              "Execution Time": time_string
          })

  # Save classification results to CSV
  with open(dataset_title+'_'+ llm_name+'_ZSL_HFpipeline_classification_results_Template_None.csv', 'w', newline='') as csvfile:
      fieldnames = ["LLM", "Dataset_Type_Variation", "Requirement", "True Label", "Predicted Label", "Scores"]
      writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
      writer.writeheader()
      writer.writerows(all_results)

  # Save evaluation metrics to CSV
  with open(dataset_title+'_'+ llm_name+'_ZSL_HFpipeline_evaluation_metrics_Template_None.csv', 'w', newline='') as csvfile:
      fieldnames = ["LLM", "Dataset_Type_Variation", "Precision", "Recall", "F1-Score", "Execution Time"]
      writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
      writer.writeheader()
      writer.writerows(evaluation_metrics)
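The loop above builds its `HH:MM:SS` string inline. A minimal sketch of the same arithmetic as a reusable helper is given here; the name `format_elapsed` is hypothetical and does not appear elsewhere in this notebook.

```python
def format_elapsed(elapsed_seconds: float) -> str:
    """Format a duration in seconds as zero-padded HH:MM:SS.

    Fractional seconds are truncated, matching the int() casts in the loop.
    """
    total_seconds = int(elapsed_seconds)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# Hypothetical usage: 1 hour, 2 minutes and 5.9 seconds
print("Total execution time:", format_elapsed(3725.9))
```

Hours are not capped at two digits, so a run longer than 99 hours would simply print a wider field.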


In [None]:
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))

if ram_gb < 20:
  print('Not using a high-RAM runtime')
else:
  print('You are using a high-RAM runtime!')

#Saved Results

In [None]:
# Zip every file in this Colab workspace except sample_data, then download the archive
from google.colab import files

zip_name = 'NFR-multi-classification-' + llm_name + '_all.zip'
!zip -r {zip_name} . -x "sample_data/*"
files.download(zip_name)


  adding: .config/ (stored 0%)
  adding: .config/configurations/ (stored 0%)
  adding: .config/configurations/config_default (deflated 15%)
  adding: .config/gce (stored 0%)
  adding: .config/.last_opt_in_prompt.yaml (stored 0%)
  adding: .config/hidden_gcloud_config_universe_descriptor_data_cache_configs.db (deflated 97%)
  adding: .config/.last_update_check.json (deflated 22%)
  adding: .config/logs/ (stored 0%)
  adding: .config/logs/2024.11.21/ (stored 0%)
  adding: .config/logs/2024.11.21/14.25.30.368557.log (deflated 87%)
  adding: .config/logs/2024.11.21/14.25.43.887178.log (deflated 57%)
  adding: .config/logs/2024.11.21/14.25.31.482956.log (deflated 58%)
  adding: .config/logs/2024.11.21/14.25.44.623490.log (deflated 56%)
  adding: .config/logs/2024.11.21/14.24.55.851126.log (deflated 92%)
  adding: .config/logs/2024.11.21/14.25.17.504089.log (deflated 58%)
  adding: .config/config_sentinel (stored 0%)
  adding: .config/.last_survey_prompt.yaml (stored 0%)
  adding: .config/ac
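
The `zip` log above shows that Colab's `.config` directory ends up in the archive alongside the result files. As a hedged alternative, the sketch below builds the archive with Python's standard `zipfile` module and skips a configurable set of top-level directories. The function name `zip_workspace` and its parameters are assumptions made for illustration; they are not part of this notebook.

```python
import os
import zipfile


def zip_workspace(root, archive_path, excluded_dirs=("sample_data",)):
    """Zip every file under `root`, skipping the top-level directories
    named in `excluded_dirs`. Returns `archive_path`."""
    root_abs = os.path.abspath(root)
    archive_abs = os.path.abspath(archive_path)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for current_dir, dir_names, file_names in os.walk(root):
            if os.path.abspath(current_dir) == root_abs:
                # Prune excluded directories in place so os.walk never enters them
                dir_names[:] = [d for d in dir_names if d not in excluded_dirs]
            for file_name in file_names:
                file_path = os.path.join(current_dir, file_name)
                if os.path.abspath(file_path) == archive_abs:
                    continue  # do not add the archive to itself
                # Store paths relative to root, as `zip -r ... .` does
                archive.write(file_path, os.path.relpath(file_path, root))
    return archive_path
```

In Colab, a call such as `files.download(zip_workspace('.', zip_path, excluded_dirs=('sample_data', '.config')))`, with `zip_path` being whatever archive name is wanted, should then download only the result files.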
