In [1]:
%pip install google-generativeai langchain langchain_google_genai python-dotenv

Note: you may need to restart the kernel to use updated packages.





In [2]:
from dotenv import load_dotenv

# Load API key from .env file
load_dotenv()

True

In [3]:
import os
import json
import re
import google.generativeai as genai

# Set up the Gemini API
genai.configure(api_key=os.environ["GEMINI_API_KEY"])



In [4]:
problem_statement = """
Metro station wants to establish a TicketDistributor machine that issues tickets for
passengers travelling on metro rails. Travellers have options of selecting a ticket for a single
trip, round trips or multiple trips. They can also issue a metro pass for regular passengers or
a time card for a day, a week or a month according to their requirements. The discounts on
tickets will be provided to frequent travelling passengers. The machine is also supposed to
read the metro pass and time cards issued by the metro counters or machine. The ticket rates
differ based on whether the traveller is a child or an adult. The machine is also required to
recognize original as well as fake currency notes. The typical transaction consists of a user
using the display interface to select the type and quantity of tickets and then choosing a
payment method of either cash, credit/debit card or smartcard. The tickets are printed and
dispensed to the user. Also, the messaging facilities after every transaction are required on
the registered number. The system can also be operated comfortably by a touch-screen. A
large number of heavy components are to be used. We do not want our system to slow down,
and also the usability of the machine.
The TicketDistributor must be able to handle several exceptions, such as aborting the
transaction for incomplete transactions, the insufficient amount given by the travellers to the
machine, money return in case of an aborted transaction, change return after a successful
transaction, showing insufficient balance in the card, updated information printed on the
tickets e.g. departure time, date, time, price, valid from, valid till, validity duration, ticket
issued from and destination station. In case of exceptions, an error message is to be displayed.
We do not want user feedback after every development stage but after every two stages to
save time. The machine is required to work in a heavy load environment such that in the
morning and evening time on weekdays, and weekends performance and efficiency would
not be affected."""

### Step 1 - Identify Initial and Final States

In [5]:
def identify_initial_final_states(problem_statement):

    prompt = f"""
    You are an expert software analyst. Your task is to identify the initial state and the final state from the given problem statement in order to generate an activity diagram.

    Output Format (Strict JSON):
    Return only valid JSON in the following example format:
    {{
        "initial": ["Start transaction"],
        "final": ["End transaction"]
    }}

    Problem Statement:
    {problem_statement}

    Now extract the initial and final states.
    """

    model = genai.GenerativeModel("gemini-1.5-flash") 
    response = model.generate_content(prompt, generation_config={"temperature": 0.0, "top_p": 1, "top_k": 1})
    
    # Debugging: Print raw response
    response_text = response.text.strip()
    print("Raw Response from Gemini:\n", response_text)

    # Remove triple backticks and 'json' keyword
    cleaned_text = re.sub(r"```json|```", "", response_text).strip()

    # Additional Debugging: Print cleaned text before parsing
    print("Cleaned JSON:\n", repr(cleaned_text))  # Use repr() to detect hidden characters

    # Check if cleaned_text is empty
    if not cleaned_text:
        print("Error: Cleaned JSON is empty. Cannot parse.")
        return []

    # Parse JSON safely
    try:
        output = json.loads(cleaned_text)  # Convert string to JSON
        # Both keys hold lists; default to [] so the concatenation cannot fail
        return output.get("initial", []) + output.get("final", [])
    except json.JSONDecodeError as e:
        print("JSON parsing error:", str(e))
        return []

identify_initial_final_states = identify_initial_final_states(problem_statement)
print(identify_initial_final_states)

Raw Response from Gemini:
 ```json
{
  "initial": ["Start transaction"],
  "final": ["End transaction"]
}
```
Cleaned JSON:
 '{\n  "initial": ["Start transaction"],\n  "final": ["End transaction"]\n}'
['Start transaction', 'End transaction']
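
The fence-stripping and JSON-parsing steps above are repeated in every later cell. As a minimal sketch, they could be factored into one helper; `parse_json_response` and its parameters are hypothetical names introduced here for illustration, not part of the original notebook.

```python
import json
import re


def parse_json_response(response_text, key, default):
    """Strip Markdown code fences from a model reply and return payload[key].

    Returns `default` when the reply is empty, is not valid JSON,
    or is not a JSON object.
    """
    # Remove opening fences (with or without the 'json' tag) and closing fences
    cleaned = re.sub(r"```(?:json)?", "", response_text).strip()
    if not cleaned:
        return default
    try:
        payload = json.loads(cleaned)
    except json.JSONDecodeError as exc:
        print("JSON parsing error:", exc)
        return default
    if not isinstance(payload, dict):
        return default
    return payload.get(key, default)


# Sample reply shaped like the raw response printed above
raw = '```json\n{"initial": ["Start transaction"], "final": ["End transaction"]}\n```'
states = parse_json_response(raw, "initial", []) + parse_json_response(raw, "final", [])
print(states)
```

Passing the default explicitly keeps the return type consistent per step: a list for states, activities and flows, a dict for the actor-to-actions mapping.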


### Identify Actors & Actions

In [6]:
def identified_actions(problem_statement):

    prompt = f"""
    You are an expert software analyst. Your task is to identify potential actors in the given problem statement and the actions each of them performs.
    Follow this detailed step-by-step process carefully to ensure accurate results:

    ### **Step-by-Step Approach:**
    1. **Identify Actions for Each Actor:**
       - Carefully analyze the problem statement for any **verbs** or **actions** linked to each actor.
       - If a verb is associated with an object or role, treat it as a candidate action for that actor.
       - Ensure that actions reflect the **core behavior** or responsibility of the actor.
       - Ignore vague or irrelevant actions.

    2. **Ensure Coherence Between Actors and Actions:**
       - Make sure that the Actions are relevant to the actor.

    3. **Ignore Unrelated or Redundant Terms:**
       - Ignore adjectives, adverbs, and irrelevant terms unless they provide meaningful context.
       - Focus on meaningful, domain-relevant terms only.

    Output Format (Strict JSON):
    Return only valid JSON in the following example format:
    {{
        "actions": {{
            "Actor1": ["action1", "action2"],
            "Actor2": ["action1", "action2"],
            ...
        }}
    }}

    **Problem Statement:**
    {problem_statement}

    Now extract the actions for each actor.
    """

    model = genai.GenerativeModel("gemini-1.5-flash") 
    response = model.generate_content(prompt, generation_config={"temperature": 0.0, "top_p": 1, "top_k": 1})
    
    # Debugging: Print raw response
    response_text = response.text.strip()
    print("Raw Response from Gemini:\n", response_text)

    # Remove triple backticks and 'json' keyword
    cleaned_text = re.sub(r"```json|```", "", response_text).strip()

    # Additional Debugging: Print cleaned text before parsing
    print("Cleaned JSON:\n", repr(cleaned_text))  # Use repr() to detect hidden characters

    # Check if cleaned_text is empty
    if not cleaned_text:
        print("Error: Cleaned JSON is empty. Cannot parse.")
        return {}

    # Parse JSON safely
    try:
        output = json.loads(cleaned_text)  # Convert string to JSON
        return output.get("actions", {})  
    except json.JSONDecodeError as e:
        print("JSON parsing error:", str(e))
        return {}

identified_actions = identified_actions(problem_statement)
print("\nIdentified Actions", identified_actions)

Raw Response from Gemini:
 ```json
{
  "actions": {
    "Traveller": [
      "select ticket type",
      "select ticket quantity",
      "choose payment method",
      "insert cash",
      "insert credit/debit card",
      "use smartcard"
    ],
    "TicketDistributor": [
      "issue tickets",
      "issue metro pass",
      "issue time card",
      "read metro pass",
      "read time card",
      "calculate fare",
      "apply discounts",
      "dispense tickets",
      "return change",
      "return money",
      "display error message",
      "print tickets",
      "send message",
      "recognize currency",
      "abort transaction",
      "handle insufficient funds",
      "check card balance"
    ],
    "System": [
      "handle heavy load",
      "maintain performance",
      "maintain efficiency"
    ]
  }
}
```
Cleaned JSON:
 '{\n  "actions": {\n    "Traveller": [\n      "select ticket type",\n      "select ticket quantity",\n      "choose payment method",\n      "insert cash
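
The control-flow step later in this notebook labels its sources and targets as `Actor: action` (for example `Traveller: select ticket type`). A small sketch of how the nested mapping returned here could be flattened into that form, and used to spot flow endpoints that do not match any identified action, might look like this. `flatten_actions` and `unknown_endpoints` are hypothetical helpers, and the last sample flow is made up to show a mismatch.

```python
def flatten_actions(actions):
    """Turn {"Actor": ["action", ...]} into ["Actor: action", ...]."""
    return [f"{actor}: {action}"
            for actor, actor_actions in actions.items()
            for action in actor_actions]


def unknown_endpoints(flows, actions):
    """Return flow sources/targets that are not known "Actor: action" labels."""
    known = set(flatten_actions(actions))
    missing = []
    for flow in flows:
        for end in ("source", "target"):
            label = flow.get(end)
            if label not in known:
                missing.append(label)
    return missing


# Small excerpt of the actions shown above
sample_actions = {
    "Traveller": ["select ticket type"],
    "TicketDistributor": ["calculate fare"],
}
sample_flows = [
    {"source": "Traveller: select ticket type",
     "target": "TicketDistributor: calculate fare",
     "type": "control", "condition": None},
    # Hypothetical flow whose target is not in sample_actions
    {"source": "TicketDistributor: calculate fare",
     "target": "TicketDistributor: validate ticket",
     "type": "control", "condition": None},
]

print(flatten_actions(sample_actions))
print(unknown_endpoints(sample_flows, sample_actions))
```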

### Identify intermediate activities

In [7]:
def identified_activities(problem_statement, actions):

    prompt = f"""
    You are an expert software analyst. Your task is to identify the activities in the given problem statement, using the actions already identified for each actor.

    Output Format (Strict JSON):
    Return only valid JSON in the following example format:
    {{
        "activities": ["activity1", "activity2"]
    }}

    **Actions Identified:**
    {actions}

    **Problem Statement:**
    {problem_statement}

    Now extract the activities.
    """

    model = genai.GenerativeModel("gemini-1.5-flash") 
    response = model.generate_content(prompt, generation_config={"temperature": 0.0, "top_p": 1, "top_k": 1})
    
    # Debugging: Print raw response
    response_text = response.text.strip()
    print("Raw Response from Gemini:\n", response_text)

    # Remove triple backticks and 'json' keyword
    cleaned_text = re.sub(r"```json|```", "", response_text).strip()

    # Additional Debugging: Print cleaned text before parsing
    print("Cleaned JSON:\n", repr(cleaned_text))  # Use repr() to detect hidden characters

    # Check if cleaned_text is empty
    if not cleaned_text:
        print("Error: Cleaned JSON is empty. Cannot parse.")
        return []

    # Parse JSON safely
    try:
        output = json.loads(cleaned_text)  # Convert string to JSON
        return output.get("activities", [])  # Activities are a list, so default to []
    except json.JSONDecodeError as e:
        print("JSON parsing error:", str(e))
        return []

# Get the activities from the problem statement and the identified actions
identified_activities = identified_activities(problem_statement, identified_actions)
print("\nIdentified Activities", identified_activities)

Raw Response from Gemini:
 ```json
{
  "activities": ["Ticket Purchase", "Metro Pass Issuance", "Time Card Issuance", "Metro Pass/Time Card Reading", "Fare Calculation", "Discount Application", "Payment Processing", "Ticket Dispensing", "Change Return", "Error Handling", "Transaction Management", "System Maintenance"]
}
```
Cleaned JSON:
 '{\n  "activities": ["Ticket Purchase", "Metro Pass Issuance", "Time Card Issuance", "Metro Pass/Time Card Reading", "Fare Calculation", "Discount Application", "Payment Processing", "Ticket Dispensing", "Change Return", "Error Handling", "Transaction Management", "System Maintenance"]\n}'

Identified Activities ['Ticket Purchase', 'Metro Pass Issuance', 'Time Card Issuance', 'Metro Pass/Time Card Reading', 'Fare Calculation', 'Discount Application', 'Payment Processing', 'Ticket Dispensing', 'Change Return', 'Error Handling', 'Transaction Management', 'System Maintenance']


### Identify Control Flow

In [8]:
def identify_controlflow(problem_statement, actions):

    prompt = f"""
    You are an expert software analyst. Your task is to identify **all types of control flows** between the given actions and actors based on the problem statement.
    Control flows are the logical transitions between actions, i.e. the order of execution in an activity diagram.

    Each flow should specify:
      Source : Starting activity
      Target : Next activity
      Condition : Decision criteria (if applicable)
      Type : Type of flow (control, decision, fork, join)
    ✅ Ensure the output follows **strict JSON format** as shown below:  
      {{
          "flows": [
                {{
                  "source": "<Source Activity>",
                  "target": "<Target Activity>",
                  "type": "<Type of Flow>",
                  "condition": "<Condition if any>"
                }}
              ]
      }}

    **Actions Identified:**
    {actions}

    **Problem Statement:**
    {problem_statement}

    Now extract **all types of control flows** between the actions.
    """

    model = genai.GenerativeModel("gemini-1.5-flash") 
    response = model.generate_content(prompt, generation_config={"temperature": 0, "top_p": 1, "top_k": 1})

    # Debugging: Print raw response
    response_text = response.text.strip()
    print("Raw Response from Gemini:\n", response_text)

    # Remove triple backticks and 'json' keyword
    cleaned_text = re.sub(r"```json|```", "", response_text).strip()

    # Additional Debugging: Print cleaned text before parsing
    print("Cleaned JSON:\n", repr(cleaned_text))  # Use repr() to detect hidden characters

    # Check if cleaned_text is empty
    if not cleaned_text:
        print("Error: Cleaned JSON is empty. Cannot parse.")
        return []

    # Parse JSON safely
    try:
        output = json.loads(cleaned_text)  # Convert string to JSON
        return output.get("flows", [])  # Flows are a list of dicts, so default to []
    except json.JSONDecodeError as e:
        print("JSON parsing error:", str(e))
        return []

identify_controlflow = identify_controlflow(problem_statement, identified_actions)

print("ControlFlows Identified:", identify_controlflow)

Raw Response from Gemini:
 ```json
{
  "flows": [
    {
      "source": "Traveller: select ticket type",
      "target": "TicketDistributor: calculate fare",
      "type": "control",
      "condition": null
    },
    {
      "source": "Traveller: select ticket quantity",
      "target": "TicketDistributor: calculate fare",
      "type": "control",
      "condition": null
    },
    {
      "source": "TicketDistributor: calculate fare",
      "target": "TicketDistributor: apply discounts",
      "type": "control",
      "condition": "Frequent traveller"
    },
    {
      "source": "TicketDistributor: calculate fare",
      "target": "Traveller: choose payment method",
      "type": "control",
      "condition": null
    },
    {
      "source": "Traveller: choose payment method",
      "target": "Traveller: insert cash",
      "type": "decision",
      "condition": "Payment method = Cash"
    },
    {
      "source": "Traveller: choose payment method",
      "target": "Traveller: inse

### Generate PlantUML Script
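
The generation code for this step is not shown in the notebook. As a rough sketch only, the flow dictionaries from the previous step could be turned into PlantUML text like this. It assumes PlantUML's legacy activity-diagram syntax (`(*)` for start/end, `-->[label]` for guarded arrows), which may not be what the author used. `flows_to_plantuml` and its parameters are hypothetical names, and the output has not been rendered here.

```python
def flows_to_plantuml(flows, first_activity=None, last_activity=None):
    """Build a PlantUML script (legacy activity syntax) from flow dicts.

    Each flow needs "source" and "target"; "condition" is optional and,
    when present, becomes the arrow label.
    """
    def quote(name):
        # Double quotes delimit activity names, so replace any inside the name
        return '"' + str(name).replace('"', "'") + '"'

    lines = ["@startuml"]
    if first_activity:
        lines.append(f"(*) --> {quote(first_activity)}")
    for flow in flows:
        condition = flow.get("condition")
        guard = f"[{condition}]" if condition else ""
        lines.append(f"{quote(flow['source'])} -->{guard} {quote(flow['target'])}")
    if last_activity:
        lines.append(f"{quote(last_activity)} --> (*)")
    lines.append("@enduml")
    return "\n".join(lines)


# Two flows copied from the control-flow output above
sample_flows = [
    {"source": "Traveller: select ticket type",
     "target": "TicketDistributor: calculate fare",
     "type": "control", "condition": None},
    {"source": "Traveller: choose payment method",
     "target": "Traveller: insert cash",
     "type": "decision", "condition": "Payment method = Cash"},
]

script = flows_to_plantuml(sample_flows, first_activity="Traveller: select ticket type")
print(script)
```

This sketch ignores the `type` field, so fork and join flows are drawn as plain arrows; handling them would need PlantUML's synchronization bars or the newer activity syntax.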

list

#### Debugging