# Exercise: User Story generation

User stories are brief, simple descriptions of a feature or requirement from the perspective of an end user. They are used in Agile software development to capture the user’s needs and goals in a way that prioritizes functionality and value. A typical user story follows the format:

"As a [type of user], I want [a specific goal] so that [reason or benefit]."

This format ensures the focus stays on delivering solutions that meet the user’s expectations. User stories guide development by defining clear, actionable tasks for teams while remaining flexible and open to adaptation as the project evolves.

Our objective is to create an agent architecture like the following one:

1. **Problem Specification (Domain Specification):**
The process begins with the definition of a problem description, which outlines the domain or context of the system. This serves as the input to the LLM and sets the foundation for generating user stories. Our problem specification is provided as the input to the process.

2. **Role(s) Extraction ("Who" Aspect):**
A prompt is used to extract roles from the problem description.
Roles represent the different types of users or stakeholders involved in the system This step answers "who" the user stories are about.

3. **Function(s) Extraction ("What" Aspect):**
Another prompt is used to extract functions associated with each role.
Functions describe the specific actions or goals the roles aim to perform within the system. This step answers "what" the users want to achieve.

4. **Purpose(s) Extraction ("Why" Aspect):**
A subsequent prompt is used to determine the purposes behind the functions identified. Purposes explain the reasons or motivations behind the desired actions of the roles. This step answers "why" the users need the functions.

5. **User Stories Generation:**
Combining the roles (who), functions (what), and purposes (why), a final prompt generates complete user stories. These user stories take the standard Agile format: "As a \<who\>, I want \<what\>, so that \<why\>."

**Feedback Loop and Prompt Chain**:
The process is iterative, as each step informs the next and ensures that the roles, functions, and purposes are fully aligned. The LLM is guided by a structured "prompt chain," where prompts are carefully designed to extract specific information at each stage.

**Output**:
The final output consists of user stories that capture the needs and goals of the system's users in a concise, actionable format. These stories guide development teams in implementing the desired features and functionalities.





In [1]:

from transformers import AutoTokenizer, AutoModelForCausalLM
from huggingface_hub import login
import torch



#----------------Google Colab only---------------------
from google.colab import userdata
login(userdata.get('HF_TOKEN'))
#----------------Google Colab only---------------------

#detect the device available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model_id = "meta-llama/Llama-3.2-3B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to(device)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# We define a method to ask any prompt to llama
def make_a_query(prompt: str, max_new_tokens:int = 200):
    """
    Send a prompt to the Llama model and get a response.

    Args:
    - prompt (str): The input question or statement to the model.
    - max_new_tokens (int): The maximum length of the response.

    Returns:
    - str: The model's generated response.
    """

    # Both the tokenizer and the model are global variables

    # Set pad_token_id if missing
    if tokenizer.pad_token_id is None:
        tokenizer.pad_token_id = tokenizer.eos_token_id


    # Tokenize the input with padding and truncation
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    inputs = tokenizer(prompt, return_tensors="pt", padding=True, truncation=True).to(device)

    # Compute the lenght of the input prompt to be able to extract the model's response later
    input_ids = inputs["input_ids"]
    prompt_length = input_ids.shape[1]

    # Generate a response
    output = model.generate(
        inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
        max_new_tokens=max_new_tokens,  # Limit the number of new tokens generated (e.g., a single word)
        #temperature=0.3,  # Reduce randomness, use with do_sample = True
        repetition_penalty=2.0,  # Penalize repetition
        no_repeat_ngram_size=3,  # Avoid repeating bigrams
        do_sample= False,  # Set to False to use Greedy or Beam search
        num_beams=3, # Use with do_sample = False
        eos_token_id=tokenizer.eos_token_id,  # End generation at EOS token
        pad_token_id=tokenizer.pad_token_id,  # Avoid padding tokens
        early_stopping=True,
    )

    generated_tokens = output[0, prompt_length:]

    # Decode the response into human-readable text
    response = tokenizer.decode(generated_tokens, skip_special_tokens=True).strip()

    return response


#reference requirements_text

requirements_text = "The proposed platform is designed to enhance the hiking experience for various user groups, including visitors, local guides, platform managers, and hut workers. The platform provides a centralized repository of hiking routes, hut information, and parking facilities. It also enables interactive features such as real-time hike tracking, personalized recommendations, and group hike planning. By combining these capabilities, the platform seeks to foster safe, informed, and collaborative hiking experiences.\
The platform will be deployed as a cloud-based web and mobile application accessible to all stakeholders. The distribution strategy includes an app available on major mobile operating systems, such as iOS and Android, alongside a responsive web interface. It will require an internet connection for features like real-time tracking, notifications, and user authentication, though some offline capabilities, such as pre-downloaded hike information, will also be available.\
User authentication will be role-based, ensuring that only authorized users, such as verified hut workers and platform managers, can access sensitive or administrative features.\
Visitors are the primary users of the platform. They can browse a comprehensive list of hiking trails, filter them based on specific criteria such as difficulty, length, or starting point, and view detailed descriptions. To access advanced features like personalized recommendations, visitors can create user accounts by registering on the platform. Registered users can record their fitness parameters, enabling the system to suggest trails tailored to their capabilities.\
During a hike, visitors can record their progress by marking reference points and sharing their live location through a broadcasting URL. They can also initiate group activities by planning hikes, adding group members, and confirming group participation. The platform allows visitors to start, terminate, and track their hikes, with notifications for unfinished hikes or late group members to ensure safety and accountability.\
Local guides enrich the platform by contributing essential information. They can add detailed descriptions of hikes, parking facilities, and huts, ensuring hikers have accurate and comprehensive data. Local guides also link parking lots and huts to specific trails as starting or arrival points, enhancing the planning process.\
To aid in the visual representation and accessibility of information, local guides can upload pictures of huts and connect these locations directly to hikes. This integration simplifies route planning and helps visitors visualize their journey.\
Platform managers oversee the operational integrity and safety of the platform. They verify new hut worker registrations, ensuring that only authorized personnel can update hut-related data. Managers can also broadcast weather alerts for specific areas, notifying all hikers in those regions through push notifications. This ensures that users stay informed about potentially hazardous conditions.\
The platform manager's role includes maintaining an organized and secure user system while facilitating collaboration between local guides, hut workers, and visitors.\
Hut workers are critical to the maintenance of up-to-date trail and accommodation information. After registering and being verified, hut workers can log into the platform to add or update information about their assigned huts, including uploading pictures and describing the facilities available. They can also monitor and report on the condition of nearby trails, ensuring hikers receive current information.\
Hut workers play a vital role in providing situational updates for hikers. For instance, if a nearby trail is impacted by severe weather or physical obstructions, they can communicate these conditions through the platform. This enhances the safety and preparedness of all hikers relying on the platform."

config.json:   0%|          | 0.00/878 [00:00<?, ?B/s]

`torch_dtype` is deprecated! Use `dtype` instead!


model.safetensors.index.json:   0%|          | 0.00/20.9k [00:00<?, ?B/s]

Fetching 2 files:   0%|          | 0/2 [00:00<?, ?it/s]

model-00001-of-00002.safetensors:   0%|          | 0.00/4.97G [00:00<?, ?B/s]

model-00002-of-00002.safetensors:   0%|          | 0.00/1.46G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/189 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/54.5k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/9.09M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/296 [00:00<?, ?B/s]

In [None]:


# Example usage - without chat template
prompt = """<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are an expert on world capitals.
Respond with only the capital city of the given country. Do not repeat the question.<|eot_id|>

Query: What is the capital of France?
Answer:
"""

response = make_a_query(prompt)

print(f"Prompt: {prompt}\nResponse: {response}")


# Example usage - with chat template
messages = [
    {"role": "system", "content": "You are an expert on world capitals. Respond with only the capital city of the given country. Do not repeat the question."},
    {"role": "user", "content": "What is the capital of France?"}
]

chat = tokenizer.apply_chat_template(messages,
                                     tokenize=False, #only the special tokens are added, we'll apply tokenisation later
                                     add_generation_prompt=True) # adds the placeholder for the assistant's response


response = make_a_query(chat)

print(f"Prompt: {prompt}\nResponse: {response}")






config.json:   0%|          | 0.00/878 [00:00<?, ?B/s]

`torch_dtype` is deprecated! Use `dtype` instead!


model.safetensors.index.json:   0%|          | 0.00/20.9k [00:00<?, ?B/s]

Fetching 2 files:   0%|          | 0/2 [00:00<?, ?it/s]

model-00001-of-00002.safetensors:   0%|          | 0.00/4.97G [00:00<?, ?B/s]

model-00002-of-00002.safetensors:   0%|          | 0.00/1.46G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/189 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/54.5k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/9.09M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/296 [00:00<?, ?B/s]

The following generation flags are not valid and may be ignored: ['temperature', 'top_p']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


Prompt: <|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are an expert on world capitals.
Respond with only the capital city of the given country. Do not repeat the question.<|eot_id|>

Query: What is the capital of France?
Answer:

Response: Paris
Prompt: <|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are an expert on world capitals.
Respond with only the capital city of the given country. Do not repeat the question.<|eot_id|>

Query: What is the capital of France?
Answer:

Response: Paris


## Extracting the Actors

To automate the extraction of roles ("Who" aspect) from a problem description, we utilize a language model (LLM) to identify key actors or stakeholders involved in the system. Roles, such as "visitors," "local guides," "platform managers," and "hut workers," represent the primary users or contributors to the system and are essential for generating accurate and actionable user stories. This step focuses on using the LLM to extract these roles and then validating the quality of the extraction by comparing the results against a reference dataset.

The LLM is provided with a detailed problem description and prompted to extract the roles (actors) associated with the system. The prompt is carefully designed to ensure the LLM captures all relevant **roles** based on the context of the problem. The output is a list of roles that the LLM determines as key stakeholders in the system.







In [5]:
#define three different prompts for extracting users, using the zero-shot, one-shot and few-shot approach. then evaluate the difference in terms of capability to format the output and actual results

# IMPORTANT: wheter you use the chat template or not, remember to add the shots into the correct structure composed by user's query and assistant's response (see previous lab)

query = "Now, extract the roles from the following requirements text:"

shots = [ # This is a list of lists, useful to separate each shot from each other
    [
        {'role': 'user',
          'content': """Requirements Text:
"The application is designed to streamline food delivery operations. Customers can browse menus and place orders. Delivery personnel use the app to view delivery assignments and update order statuses. Restaurant managers can manage their menus and track incoming orders."""},
        {'role': 'assistant'},
        {'content': """Extracted Roles:
1. Customers
2. Delivery Personnel
3. Restaurant Managers"""},
    ],

    [
      {'role': 'user',
      'content': """Requirements Text:
The platform supports the education sector by providing tools for students, teachers, and administrators. Students can access course materials, submit assignments, and track their progress. Teachers can create content, grade submissions, and communicate with students. Administrators oversee user registrations and system performance."""},
      {'role': 'assistant',
      'content': """Extracted Roles:
  1. Students
  2. Teachers
  3. Administrators"""}
    ],

    [
        {'role': 'user',
         'content': """Requirements Text:
The system provides real-time updates for commuters, transit operators, and city planners. Commuters can check schedules and delays. Transit operators monitor vehicle statuses and manage timetables. City planners analyze system data to improve routes and infrastructure."""},
        {'role': 'assistant',
         'content': """Extracted Roles:
1. Commuters
2. Transit Operators
3. City Planners"""}
    ]
]


system_prompt = """You are an expert requirements analyst specialized in extracting roles for user stories from system requirements. \
Your task is to identify all distinct user roles mentioned in the given requirements text. Roles refer to the specific types of users or stakeholders who interact with the system. \
These roles should align with the personas or actors described in the requirements.

Output the list of roles in a clear and concise manner. Do not include duplicates or irrelevant terms. Generate only the list and nothing else."""

# Build the message list for each promting technique

messages_zero_shot = [
    {'role': 'system', 'content': system_prompt},
    {'role': 'user', 'content': f"{query}\n\nRequirements Text:\n{requirements_text}"}
]
prompt_zero_shot = tokenizer.apply_chat_template(messages_zero_shot,
                                                 tokenize=False,
                                                 add_generation_prompt=True)

messages_one_shot = [
    {'role': 'system', 'content': system_prompt},
    shots[0][0],
    shots[0][1],
    {'role': 'user', 'content': f"{query}\n\nRequirements Text:\n{requirements_text}"}
]
prompt_one_shot = tokenizer.apply_chat_template(messages_one_shot,
                                                tokenize=False,
                                                add_generation_prompt=True)

messages_few_shots = [
    {'role': 'system', 'content': system_prompt},
    shots[0][0],
    shots[0][1],
    shots[1][0],
    shots[1][1],
    shots[2][0],
    shots[2][1],
    {'role': 'user', 'content': f"{query}\n\nRequirements Text:\n{requirements_text}"}
]
prompt_few_shots = tokenizer.apply_chat_template(messages_few_shots,
                                                 tokenize=False,
                                                 add_generation_prompt=True)



# Call the LLM function for the three prompts and print the results
roles_extracted_zero_shot = make_a_query(prompt_zero_shot, max_new_tokens=2000)

print("\n\nExtracted Roles - zero shot: \n", roles_extracted_zero_shot)




roles_extracted_one_shot = make_a_query(prompt_one_shot, max_new_tokens=2000)

print("\n\nExtracted Roles - one shot: \n", roles_extracted_one_shot)



roles_extracted_few_shots = make_a_query(prompt_few_shots, max_new_tokens=2000)

print("\n\nExtracted Roles - few shots: \n", roles_extracted_few_shots)












Extracted Roles - zero shot: 
 1. Visitors
2. Local Guides
3. Platform Managers
4. Hut Workers


Extracted Roles - one shot: 
 1. Visitors
2. Local Guides
3. Platform Managers
4. Hut Workers


Extracted Roles - few shots: 
 1. Visitors 
2. Local Guides 
3. Platform Managers 
4. Hut Workers


While no external dataset is used to refine or influence the role extraction process, we validate the extracted roles against a reference dataset contained in a CSV file named stories.csv. This file includes a specific column listing all the predefined roles associated with the system. These roles serve as the ground truth for validation purposes.


To assess the accuracy and completeness of the LLM’s role extraction, we compute the precision and recall metrics by comparing the LLM's output to the ground truth roles:

- **Precision**: Measures how many of the roles extracted by the LLM are correct (i.e., exist in the reference dataset).

  $P = \frac{True\ Positives}{True\ Positives + False\ Positives}$


- **Recall**: Measures how many of the ground truth roles were successfully identified by the LLM.

  $R = \frac{True\ Positives}{True\ Positives + False\ Negatives}$


After extracting the roles using the LLM, the results are compared to the roles in the reference column from the CSV file. For each role identified by the LLM:

- If the role exists in the reference dataset, it is considered a true positive.
- If the role does not exist in the reference dataset, it is marked as a false positive.
- Any role in the reference dataset that the LLM fails to identify is considered a false negative.

The final output of this process includes:
- A list of roles extracted by the LLM.
- Precision and recall metrics, which provide a quantitative assessment of the LLM’s performance in role extraction.

In [6]:
import csv
import re

#define a function to extract unique roles from the .csv files with user stories

def extract_unique_roles(file_path):
    roles = set()  # Use a set to ensure roles are unique
    role_pattern = r"As a(n)? ([^I]+)"  # Regex to match the role in the user story

    with open(file_path, 'r', encoding='utf-8') as csvfile:
        reader = csv.reader(csvfile)
        next(reader)  # Skip the header row
        for row in reader:
            if len(row) > 4:  # Ensure the row has enough fields
                description = row[4]
                match = re.search(role_pattern, description)
                if match:
                    role = match.group(2).strip()
                    roles.add(role)

    return list(roles)


#this function takes a list of roles, makes all the text lower case, removes spaces, and removes the s at the end

def normalize_roles(role_list):
    return {role.lower().strip().rstrip('s') for role in role_list}


# find the unique roles and put them into a list
file_path = 'stories.csv'
expected_roles = extract_unique_roles(file_path)
print("EXPECTED ROLES: ", expected_roles)       #EXPECTED RESULT FOR ROLES


#now extract unique roles from the result provided by the LLM
#for simplicity, we only consider the few shots prompt for this purpose

actual_roles = [role.strip() for role in re.split(r'\d+\.\s', roles_extracted_few_shots) if role.strip()]


print("ACTUAL ROLES: ", actual_roles)     #ACTUAL RESULT FOR ROLES



#Compute precision and recall
#use the normalized roles, i.e., don't consider spaces and 's' at the end of the roles

normalized_actual = normalize_roles(actual_roles)
normalized_expected = normalize_roles(expected_roles)

# Set operators are used
true_positives = normalized_actual & normalized_expected
false_positives = normalized_actual - normalized_expected
false_negatives = normalized_expected - normalized_actual

precision = len(true_positives) / len(normalized_actual) if normalized_actual else 0
recall = len(true_positives) / len(normalized_expected) if normalized_expected else 0


# Print results
print("True Positives:", true_positives)
print("False Positives:", false_positives)
print("False Negatives:", false_negatives)
print("Precision:", precision)
print("Recall:", recall)




EXPECTED ROLES:  ['hut worker', 'friend (of a hiker)', 'local guide', 'visitor', 'platform manager', 'hiker']
ACTUAL ROLES:  ['Visitors', 'Local Guides', 'Platform Managers', 'Hut Workers']
True Positives: {'local guide', 'platform manager', 'visitor', 'hut worker'}
False Positives: set()
False Negatives: {'hiker', 'friend (of a hiker)'}
Precision: 1.0
Recall: 0.6666666666666666


## Extracting the Functions

In this step, the primary goal is to automate the identification of functions required by the roles identified in the previous step (Role Extraction). The function refers to the tasks or actions that the roles need to perform within the system. To achieve this, a language model (LLM) is used to analyze the problem description, and based on the roles identified, it is prompted to extract the functions those roles desire to perform.

For example, after identifying the role of "visitors" or "local guides," the LLM is prompted to determine the specific actions these roles want to carry out within the system, such as "view hikes," "add descriptions," or "manage registrations." This extraction process is critical for understanding the functionality needed in the system to meet the requirements of its stakeholders.

By leveraging the context and the roles passed from the previous step, the LLM extracts a list of desired functions. This list helps define the operational scope of the system and lays the foundation for generating user stories in subsequent steps. The extracted functions are validated to ensure completeness and alignment with the roles' needs.


In [14]:


current_roles = actual_roles


def extract_functions(text):

    # Clean the extracted text by splitting it into lines
    function_lines = text.strip().split('\n')

    # Initialize an empty list to store the functions
    functions = []

    # Iterate over the lines and remove any numbers or dots at the beginning
    for line in function_lines[:15]:  # Limit to first 15 functions
        # Use regex to remove any unwanted numbers, dots, and extra spaces
        function = re.sub(r'[^a-zA-Z\s]', '', line).strip()
        if function:  # Only add non-empty lines
            functions.append(function)

    return functions



def create_user_stories(role, functions):
    # Generate user stories in the desired format
    user_stories = [(role, f"As a {role}, I want to {function}") for function in functions]
    return user_stories


user_stories = list()

for role in current_roles:

    print("extracting stories for role", role)

    n = 0
    while True:
        #1. Loop over the list of roles and extract the piece of text from requirements related to that role

        n = n + 1
        print("tentative", n, "for", role)

        system_prompt = f""""
        You are an expert requirements analyst. Your task is to extract only the relevant **functions** for the given role from the full system requirements provided below. Focus only on the sections and tasks that are related to the **{role}** user. Do not include information related to other roles. Just generate the list of functions without introduction."""

        example1 =  [
                {
                    'role': 'user',
                    'content': """**Requirements Text**:
                        The application is designed to streamline food delivery operations. Customers can browse menus and place orders. Delivery personnel use the app to view delivery assignments and update order statuses. Restaurant managers can manage their menus and track incoming orders."""
                },

                {
                    'role': 'assistant',
                    'content': """**Functions for Delivery personnel**:
                                        "1. View delivery assignments.
                                        2. Update order status."""
                }
        ]

        example2 =  [
                {
                    'role': 'user',
                    'content': """**Requirements Text**:"
                                The platform supports the education sector by providing tools for students, teachers, and administrators. Students can access course materials, submit assignments, and track their progress. Teachers can create content, grade submissions, and communicate with students. Administrators oversee user registrations and system performance."""
                },

                {
                    'role': 'assistant',
                    'content': """**Functions for Teachers**:
                                  1. Create content.
                                  2. Grade submissions.
                                  3. Communicate with students."""
                }
        ]


        example3 =  [
                {
                    'role': 'user',
                    'content': """**Requirements Text**:
                                The system provides real-time updates for commuters, transit operators, and city planners. Commuters can check schedules and delays. Transit operators monitor vehicle statuses and manage timetables. City planners analyze system data to improve routes and infrastructure."""
                },

                {
                    'role': 'assistant',
                    'content': """**Functions for City Planners**:
                                      1. Analyze system data."""
                }
        ]

        query = f"""Now, please review the full system requirements text below.

        Please extract only the relevant sections and tasks that describe the features, actions, and functionalities specifically for the **{role}** role. The output should be **concise** and focused solely on the **{role}** user. You should list **at most 15 functions**. Do not include duplicates or functions for other roles other than {role}. Just output the list, no more words needed.

        **Requirements Text**:
        {requirements_text}"""

        messages = [
            {'role': 'system', 'content': system_prompt},
            example1[0],
            example1[1],
            example2[0],
            example2[1],
            example3[0],
            example3[1],
            {'role': 'user', 'content': query}

        ]

        prompt = tokenizer.apply_chat_template(messages,
                                                tokenize=False,
                                                add_generation_prompt=True)

        response = make_a_query(prompt, max_new_tokens=2500)



        functions = extract_functions(response)

        print(f"{len(functions)} functions have been extracted")

        tmp_user_stories = create_user_stories(role, functions)

        print("corresponding user stories: ", tmp_user_stories)


        if len(functions) > 0 or n > 10:
            break



    user_stories = user_stories + tmp_user_stories



print(user_stories)


extracting stories for role Visitors
tentative 1 for Visitors
15 functions have been extracted
corresponding user stories:  [('Visitors', 'As a Visitors, I want to Browse hiking trails'), ('Visitors', 'As a Visitors, I want to Filter hiking trails by criteria'), ('Visitors', 'As a Visitors, I want to View detailed trail descriptions'), ('Visitors', 'As a Visitors, I want to Create user account'), ('Visitors', 'As a Visitors, I want to Record fitness parameters'), ('Visitors', 'As a Visitors, I want to Receive personalized trail recommendations'), ('Visitors', 'As a Visitors, I want to Record progress during a hike'), ('Visitors', 'As a Visitors, I want to Share live location'), ('Visitors', 'As a Visitors, I want to Initiate group activities'), ('Visitors', 'As a Visitors, I want to Plan hikes'), ('Visitors', 'As a Visitors, I want to Add group members'), ('Visitors', 'As a Visitors, I want to Confirm group participation'), ('Visitors', 'As a Visitors, I want to Track hikes'), ('Visito

##Extracting the purposes

In this step, the focus is on extracting the purposes behind the functions identified in the previous step. The purpose refers to the underlying reasons or motivations that drive a role to perform a particular function within the system. The language model (LLM) is prompted to identify these reasons by analyzing the problem description and the functions extracted in the previous step.

After the functions (what the roles want to do) are identified, the LLM is tasked with understanding and extracting the why behind each function. For example, a role like "hiker" may want to "register for a hike" (function), but the purpose behind this action might be "so that they can track their progress" or "to ensure they can participate in the hike." Similarly, a "local guide" may want to "add a hike description," and the purpose could be "to provide helpful information to other users."

The extracted purposes help provide a deeper understanding of why each function is important to the roles. These purposes are essential for refining the system’s objectives and ensuring that the design aligns with the actual needs and motivations of the stakeholders. The LLM identifies these purposes by analyzing both the context of the roles and the corresponding functions, ensuring that all relevant purposes are captured to provide clarity for the user stories that will follow in the next step.

In [18]:
#we take the pairs of role - function obtained in the previous step
#for simplicity, we only consider two user stories at this point
#to complete the task, we would require to perform an iteration over all the stories defined - this would be very computationally expensive and is out of the scope of the current exercise

user_story1 = user_stories[0]
user_story2 = user_stories[1]

print(user_story1)

role=user_story1[0]
function=user_story1[1]

print(role)

system_prompt = """
You are an expert requirements analyst. Your task is to complete user stories by adding the purpose (the "why") behind a given function based on the full system requirements provided below. The user story format should be: "As a <role>, I want to <function>, so that <purpose>."""

example1 =  [
                {
                    'role': 'user',
                    'content': """**User Story (without purpose)**: "As a Customer, I want to browse the menu.
                    **Requirements Text**: "The application allows customers to browse menus and place orders from a variety of restaurants."""
                },

                {
                    'role': 'assistant',
                    'content': """**Completed User Story**: "As a Customer, I want to browse the menu, so that I can explore and select food items to place an order."""
                }
        ]

example2 =  [
        {
            'role': 'user',
            'content': """**User Story (without purpose)**: "As a Teacher, I want to grade submissions."
**Requirements Text**: "Teachers can grade student submissions through the platform's grading tools."""
        },

        {
            'role': 'assistant',
            'content': """**Completed User Story**: "As a Teacher, I want to grade submissions, so that I can evaluate student performance and provide feedback."""
        }
]


example3 =  [
        {
            'role': 'user',
            'content': """**User Story (without purpose)**: "As a Transit Operator, I want to manage timetables."
            **Requirements Text**: "Transit operators can view and manage vehicle schedules, ensuring that vehicles adhere to the set timetables."""
        },

        {
            'role': 'assistant',
            'content': """**Completed User Story**: "As a Transit Operator, I want to manage timetables, so that I can ensure the vehicles run on time and provide reliable service.."""
        }
]

query = f"""Now, please complete the following user story by adding the purpose ("so that <purpose>") based on the provided requirements text:

**User Story (without purpose)**: "As a {role}, I want to {function}."
**Requirements Text**: {requirements_text}"""


messages = [
    {'role': 'system', 'content': system_prompt},
    example1[0],
    example1[1],
    example2[0],
    example2[1],
    example3[0],
    example3[1],
    {'role': 'user', 'content': query}

]

prompt = tokenizer.apply_chat_template(messages,
                                        tokenize=False,
                                        add_generation_prompt=True)

response = make_a_query(prompt, max_new_tokens=2500)
full_story = response.split("**Completed User Story**:")[-1]
print(full_story)







('Visitors', 'As a Visitors, I want to Browse hiking trails')
Visitors
Here is the completed user story:

**As a Visitor, 
I want toBrowse hiking trails,
so that I may plan and prepare for a safe and enjoyable hiking experience.**

This user story captures the primary purpose of the feature, which is to enable visitors to browse and plan their hiking trails effectively.
