# LLM Prompting for Requirement Extraction in Digital Ecosystems: Iteration 1
This Jupyter notebook was developed by Isabel Rutten (Leiden University) as part of her Master's thesis.
In this notebook, we explore which prompting techniques and LLMs extract requirements most effectively in digital ecosystems.
A specific example of a digital ecosystem is the Digital Product Passport. We use the established set of requirements for the Digital Product Passport as the ground truth. Against that ground truth, we can determine how complete the LLM output is and compare prompting techniques and LLMs.
The notebook is structured as follows:
1. LLMs and setup
1. Prompting techniques
1. Experiments
1. Results 
1. Conclusion

In [2]:
# import sys
# !pip install tabulate
!pip install pandas


## 1. LLMs and setup
A set of popular LLMs has been established in the literature. In short, they are as follows:

a. **TODO**

b. **TODO** 

We will set up access to these LLMs in the following code block. 

In [3]:
# TODO: setup server to access LLMs


# TODO: access selected LLMs
# TODO: determine relevant LLMs, this is just a draft
llm1 = "LLM1"

llm2 = "LLM2"

LLM_names = [llm1, llm2]

## 2. Prompting techniques
A specific set of LLM prompting techniques has been established in the literature. In short, they are as follows:

a. **Fine-Tuning**: update the model's weights by training on an existing dataset (a training step rather than a prompt)

b. **Few-Shot**: give the task together with several examples of a problem and its solution

c. **One-Shot**: give the task together with exactly one example

d. **Zero-Shot**: describe only the task, without any examples

e. TODO
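
The difference between the zero-, one-, and few-shot techniques above can be sketched in a few lines. This is a minimal illustration, not the prompts used in the experiments: the helper `build_prompt`, the `Text:`/`Requirement:` layout, and the task and example strings are all assumptions made up for this sketch. Fine-tuning is left out because it changes the model rather than the prompt.

```python
# Hypothetical helper, not part of the original notebook.
def build_prompt(task, examples=()):
    """Build a prompt from a task description and (text, requirement) example pairs."""
    parts = [task]
    for source_text, requirement in examples:
        parts.append(f"Text: {source_text}\nRequirement: {requirement}")
    return "\n\n".join(parts)


# Placeholder task and examples, invented purely for illustration.
task = "Extract the requirements from the text below."
examples = [
    ("<example text 1>", "<example requirement 1>"),
    ("<example text 2>", "<example requirement 2>"),
]

zero_shot = build_prompt(task)               # task only
one_shot = build_prompt(task, examples[:1])  # task plus one example
few_shot = build_prompt(task, examples)      # task plus several examples
```

With this layout, the three techniques differ only in how many example pairs are passed in, which keeps the prompts comparable across techniques.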



In [4]:
# TODO: determine relevant prompting techniques, this is just a draft
# TODO: determine whether to put input (i.e. prompts, weights) in this section instead of the experiments section
prompt_techniques = ["Fine-Tuning", "Few-Shot", "One-Shot", "Zero-Shot"]

## 3. Experiments
Here, we run every combination of LLM and prompting technique to obtain a requirement list for each.
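
As a rough sketch of how the prompt-based combinations could be run in one loop, the snippet below builds a nested list with one inner list per technique and one entry per LLM. The function `query_llm` is a hypothetical stub standing in for a real call to an LLM server, and the `sketch_*` names are used only to avoid clashing with the notebook's own variables. Fine-tuning is omitted because it requires a training step rather than a prompt.

```python
# Hypothetical stub: stands in for a real call to an LLM server.
def query_llm(model_name, prompt):
    return f"[{model_name}] response to: {prompt}"


sketch_llms = ["LLM1", "LLM2"]
# Placeholder prompts, mirroring the "todo" values used in this draft.
sketch_prompts = {
    "Few-Shot": "todo",
    "One-Shot": "todo",
    "Zero-Shot": "todo",
}

# One inner list per technique and one entry per LLM, so that
# sketch_output[i][j] is the output of technique i for LLM j.
sketch_output = [
    [query_llm(llm, prompt) for llm in sketch_llms]
    for prompt in sketch_prompts.values()
]
```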

In [5]:
# output collects the results of all experiments (one list per technique, one entry per LLM) for the comparison in section 4
output = []

### a. Fine-Tuning

In [6]:
# TODO: determine how to update the model weights with an existing dataset


# TODO: use fine-tuning on the selected LLMs, to get the requirements as output
ft_llm1 = "todo"
ft_llm2 = "todo"

# save output from fine-tuning and the selected LLMs
ft_output = [ft_llm1, ft_llm2]
output.append(ft_output)

### b. Few-Shot

In [7]:
# TODO: few-shot prompt contains several examples
fs_prompt = "todo"

# TODO: use few-shot prompt as input for selected LLMs, to get the requirements as output
fs_llm1 = "todo"
fs_llm2 = "todo"

# save output from few-shot and the selected LLMs
fs_output = [fs_llm1, fs_llm2]
output.append(fs_output)

### c. One-Shot

In [8]:
# TODO: one-shot prompt contains one example
os_prompt = "todo"

# TODO: use one-shot prompt as input for selected LLMs, to get the requirements as output
os_llm1 = "todo"
os_llm2 = "todo"

# save output from one-shot and the selected LLMs
os_output = [os_llm1, os_llm2]
output.append(os_output)

### d. Zero-Shot

In [9]:
# TODO: zero-shot prompt contains only the task
zs_prompt = "todo"

# TODO: use zero-shot prompt as input for selected LLMs, to get the requirements as output
zs_llm1 = "todo"
zs_llm2 = "todo"

# save output from zero-shot and the selected LLMs
zs_output = [zs_llm1, zs_llm2]
output.append(zs_output)

### e. ?

## 4. Results
Here, we compare the LLM output from the experiments to the ground truth: the complete list of requirements as established by experts.
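
One possible way to score completeness is the share of ground-truth requirements that have a close textual match in the LLM output. The sketch below assumes both sides are lists of strings and uses fuzzy string matching from the standard library (`difflib.SequenceMatcher`). The function names `normalize` and `coverage`, the threshold of 0.8, and the example requirements are assumptions for illustration only. A semantic measure may turn out to be more suitable, since string matching misses paraphrases.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lower-case the text and collapse whitespace before comparing."""
    return " ".join(text.lower().split())


def coverage(ground_truth, extracted, threshold=0.8):
    """Percentage of ground-truth requirements with a close match in extracted."""
    if not ground_truth:
        return 0.0
    matched = 0
    for expected in ground_truth:
        # Best similarity between this requirement and any extracted one.
        best = max(
            (
                SequenceMatcher(None, normalize(expected), normalize(candidate)).ratio()
                for candidate in extracted
            ),
            default=0.0,
        )
        if best >= threshold:
            matched += 1
    return 100 * matched / len(ground_truth)


# Invented example requirements, not taken from the real ground truth.
example_truth = [
    "The system shall store product data.",
    "Data must be exportable as JSON.",
]
example_extracted = [
    "the system shall store product data",
    "Users can reset their password.",
]
example_score = coverage(example_truth, example_extracted)  # 50.0
```

A score on a 0 to 100 scale like this one could replace the random placeholder score in the comparison loop without changing the shape of the results table.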

In [10]:
# from tabulate import tabulate
import random
import pandas as pd

# compare groundtruth to output
# TODO: determine ground truth
ground_truth = "todo"

# for each prompting technique, we compare the output of each LLM with the ground truth
result = []
for index_i, technique in enumerate(prompt_techniques):
    compare = []
    # use a distinct loop variable so the inner loop does not shadow the outer one
    for index_j, llm_name in enumerate(LLM_names):
        # TODO: figure out how to compare two lists of requirements (ground_truth AND output[index_i][index_j]) and return its similarity
        # placeholder: a random score until a real similarity measure is implemented
        compare.append(random.randint(1, 100))
    result.append(compare)

# fill a dataframe with the results: one column per technique, one row per LLM
data = dict(zip(prompt_techniques, result))
df = pd.DataFrame(data, index=LLM_names)

# print the dataframe
print(df)

      Fine-Tuning  Few-Shot  One-Shot  Zero-Shot
LLM1           78        16        35          4
LLM2           50        36        97         48


In [11]:
# Alternative code solutions, kept in a separate cell to keep the main cells clean

# ALT: trying not to hard-code filling the dataframe, but DataFrame.append was removed in pandas 2.0
# df = pd.DataFrame()
# for index, item in enumerate(prompt_techniques): 
#     # res = result[index]
#     # df = df.append("item": res)
#     # print(index)
#     # print(item)
#     # print(prompt_techniques[index])
#     # print(result[index])
#     # df = pd.concat([df, pd.DataFrame({prompt_techniques[index]: result[index]})])
#     # print(df)
#     df = df._append({prompt_techniques[index]: result[index]}, ignore_index=True)
# print(df)

# ALT: using tabulate to display the table
# # show comparison in a table
# scores = [
#     [prompt_techniques[0], result[0][0], result[0][1]], 
#     [prompt_techniques[1], result[1][0], result[1][1]], 
#     [prompt_techniques[2], result[2][0], result[2][1]], 
#     [prompt_techniques[3], result[3][0], result[3][1]], 
# ]
# headers = ["Prompt technique"]
# headers.extend(LLM_names)
# print(tabulate(scores, headers = headers, tablefmt = "fancy_grid"))


## 5. Conclusion
TODO: draw conclusions from dataframe

