# Automated Requirements-Test Case Linker
This notebook uses Natural Language Processing (NLP) to suggest links between requirements and test cases based on the semantic similarity of their descriptions. It leverages the sentence-transformers library for embedding text and calculating cosine similarity.

## 1. Setup and Data Loading
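First, we import the libraries and load the requirements and test case data from CSV files. Make sure requirements.csv and testcases.csv are in the same directory as the notebook, or pass the full file paths to pd.read_csv. The later cells read a Requirement_ID and a Description column from requirements.csv, and a TestCase_ID and a Description column from testcases.csv.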

In [11]:
# Import necessary libraries
import pandas as pd
from sentence_transformers import SentenceTransformer, util

# Load data
try:
    reqs = pd.read_csv('requirements.csv')
    tests = pd.read_csv('testcases.csv')
    print("✅ Data loaded successfully.")
except FileNotFoundError:
    print("❌ Error: Make sure 'requirements.csv' and 'testcases.csv' are in the same directory.")
    # Re-raise so later cells do not fail with a confusing NameError on reqs/tests
    raise

✅ Data loaded successfully.
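
A quick check right after loading can catch a CSV that lacks the columns the later cells read. The sketch below is a minimal, assumed helper: missing_columns is not part of pandas or of this notebook, and the Priority and Title headers are made-up stand-ins used only for illustration.

```python
def missing_columns(columns, required):
    """Return the required column names that are absent from `columns`, in order."""
    present = set(columns)
    return [name for name in required if name not in present]

# Column names taken from how the later cells use the two files.
REQ_COLUMNS = ["Requirement_ID", "Description"]
TEST_COLUMNS = ["TestCase_ID", "Description"]

# Hypothetical header lists; in the notebook you would pass
# reqs.columns and tests.columns instead.
example_req_header = ["Requirement_ID", "Description", "Priority"]
example_test_header = ["TestCase_ID", "Title"]

print(missing_columns(example_req_header, REQ_COLUMNS))    # nothing missing
print(missing_columns(example_test_header, TEST_COLUMNS))  # Description is missing
```

In the notebook, a non-empty result would be a reasonable point to stop and fix the CSV before spending time on encoding.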


## 2. Load NLP Model
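Next, we load a pre-trained model from sentence-transformers. The 'all-MiniLM-L6-v2' model is a reasonable default for general-purpose sentence embeddings: it is small and fast enough to run on a CPU, and it maps each text to a 384-dimensional vector. The first call downloads the model weights, so it needs network access.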

In [12]:
# Load NLP model
print("Loading NLP model 'all-MiniLM-L6-v2'...")
model = SentenceTransformer('all-MiniLM-L6-v2')
print("✅ Model loaded.")

Loading NLP model 'all-MiniLM-L6-v2'...
✅ Model loaded.


## 3. Encode Descriptions
We'll convert the textual descriptions of your requirements and test cases into numerical vectors (embeddings) using the loaded NLP model. These embeddings capture the semantic meaning of the text, allowing for similarity comparisons.

In [None]:
# Encode descriptions
# Empty cells are read as NaN by pandas; replace them with '' so encode() only receives strings
print("Encoding requirement descriptions...")
req_embeddings = model.encode(reqs['Description'].fillna('').astype(str).tolist(), convert_to_tensor=True)
print("Encoding test case descriptions...")
test_embeddings = model.encode(tests['Description'].fillna('').astype(str).tolist(), convert_to_tensor=True)
print("✅ Descriptions encoded.")

## 4. Calculate Similarity and Suggest Matches
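Finally, we compute the cosine similarity between every requirement embedding and every test case embedding, which gives a score matrix with one row per requirement and one column per test case. For each requirement, the test case with the highest score is suggested as a potential match. A match is always reported, even when the best score is low, so treat the output as candidates for review rather than confirmed links.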
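
To make the scoring step concrete, here is a minimal pure-Python sketch of cosine similarity and of picking the best-scoring candidate. It is an illustration only: the 3-dimensional vectors are toy values standing in for real embeddings, and the helper name cosine_similarity is an assumption, not the library function used in the cell below.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors (assumed values), one "requirement" and three "test cases".
req_vec = [1.0, 0.0, 1.0]
test_vecs = [
    [1.0, 0.0, 1.0],  # same direction as the requirement
    [0.0, 1.0, 0.0],  # orthogonal to the requirement
    [1.0, 1.0, 0.0],  # partially overlapping
]

# One score per test case, then the position of the highest score.
scores = [cosine_similarity(req_vec, t) for t in test_vecs]
best_idx = max(range(len(scores)), key=scores.__getitem__)

print([round(s, 4) for s in scores])
print("best match index:", best_idx)
```

The identical vector scores close to 1, the orthogonal one scores 0, and the partial overlap lands in between, which is the ordering the matching loop relies on.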

In [None]:
# Compare using cosine similarity
print("Calculating cosine similarity between requirements and test cases...")
cosine_scores = util.cos_sim(req_embeddings, test_embeddings)  # pytorch_cos_sim is the older alias of cos_sim
print("✅ Similarity calculated.")

# Match requirements with the most similar test case
print("\n--- Suggested Matches ---")
for i, req_text in enumerate(reqs['Description']):
    # Position of the best-scoring test case for this requirement, as a plain int
    best_match_idx = cosine_scores[i].argmax().item()
    best_score = cosine_scores[i][best_match_idx].item()
    # Use .iloc (positional) so lookups still work if the DataFrame index is not 0..n-1
    matched_test = tests['Description'].iloc[best_match_idx]

    print(f"**Requirement {reqs['Requirement_ID'].iloc[i]}:**")
    print(f"  ↳ *{req_text}*")
    print(f"**Suggested Match:** {tests['TestCase_ID'].iloc[best_match_idx]} - {matched_test}")
    print(f"**Similarity Score:** {best_score:.4f}")
    print("-" * 80)
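
Because the loop above reports the top-scoring test case no matter how low its score is, one possible refinement is to hide matches below a cutoff. The sketch below is an assumption, not part of the original notebook: suggest_matches is a hypothetical helper, the 0.5 threshold is an arbitrary placeholder that would need tuning on real data, and the score rows are toy values.

```python
def suggest_matches(score_rows, threshold=0.5):
    """For each row of scores, return (best_index, best_score),
    or None when the best score falls below `threshold`."""
    suggestions = []
    for row in score_rows:
        best_idx = max(range(len(row)), key=row.__getitem__)
        best_score = row[best_idx]
        if best_score >= threshold:
            suggestions.append((best_idx, best_score))
        else:
            suggestions.append(None)  # no candidate is convincing enough
    return suggestions

# Toy score matrix (assumed values): 2 requirements x 3 test cases.
toy_scores = [
    [0.82, 0.10, 0.33],  # clear best match at position 0
    [0.21, 0.18, 0.25],  # best score is below the threshold
]

result = suggest_matches(toy_scores)
print(result)
```

In the notebook, the nested list could come from cosine_scores.tolist(). Requirements that map to None are then candidates for a missing test case rather than a weak link.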