In [16]:
%pip install google-generativeai langchain langchain_google_genai

Note: you may need to restart the kernel to use updated packages.





In [17]:
from dotenv import load_dotenv

# Load API key from .env file
load_dotenv()

True

In [18]:
import os
import json
import re
import google.generativeai as genai

# Configure the Gemini API client (expects GEMINI_API_KEY in the environment or the .env file)
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
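
If `GEMINI_API_KEY` is missing, the `os.environ[...]` lookup above fails with a bare `KeyError`. A minimal sketch of a friendlier check follows; `require_env` is a hypothetical helper introduced here for illustration, not part of the original notebook or of any library.

```python
import os


def require_env(name, environ=os.environ):
    """Return the value of an environment variable or raise a descriptive error.

    Hypothetical helper (an assumption of this sketch, not a library function).
    """
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it "
            "before running the notebook."
        )
    return value


# Possible usage, assuming load_dotenv() has already run:
# genai.configure(api_key=require_env("GEMINI_API_KEY"))
```

Passing `environ` as a parameter keeps the helper easy to exercise with a plain dictionary.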

In [19]:
problem_statement = """
Metro station wants to establish a TicketDistributor machine that issues tickets for
passengers travelling on metro rails. Travellers have options of selecting a ticket for a single
trip, round trips or multiple trips. They can also issue a metro pass for regular passengers or
a time card for a day, a week or a month according to their requirements. The discounts on
tickets will be provided to frequent travelling passengers. The machine is also supposed to
read the metro pass and time cards issued by the metro counters or machine. The ticket rates
differ based on whether the traveller is a child or an adult. The machine is also required to
recognize original as well as fake currency notes. The typical transaction consists of a user
using the display interface to select the type and quantity of tickets and then choosing a
payment method of either cash, credit/debit card or smartcard. The tickets are printed and
dispensed to the user. Also, the messaging facilities after every transaction are required on
the registered number. The system can also be operated comfortably by a touch-screen. A
large number of heavy components are to be used. We do not want our system to slow down,
and also the usability of the machine.
The TicketDistributor must be able to handle several exceptions, such as aborting the
transaction for incomplete transactions, the insufficient amount given by the travellers to the
machine, money return in case of an aborted transaction, change return after a successful
transaction, showing insufficient balance in the card, updated information printed on the
tickets e.g. departure time, date, time, price, valid from, valid till, validity duration, ticket
issued from and destination station. In case of exceptions, an error message is to be displayed.
We do not want user feedback after every development stage but after every two stages to
save time. The machine is required to work in a heavy load environment such that in the
morning and evening time on weekdays, and weekends performance and efficiency would
not be affected."""

### Step 1 - Identify Actions

In [20]:
def identify_actions(problem_statement):
    """
    Identifies potential actions from a given problem statement using Gemini API with CoT prompting.
    """
    prompt = f"""
    You are an expert software analyst. Your task is to identify **key actions** from the given problem statement. 
    Follow this detailed step-by-step process carefully to ensure accurate results:

    ### **Step-by-Step Approach:**
    1. **Identify Core Actions:**
       - Carefully analyze the problem statement for any **verbs** or **actions** that represent key behaviors of the system.
       - Focus on verbs or phrases that reflect meaningful state changes or user interactions.
       - Ignore vague, ambiguous, or irrelevant actions unless they provide useful context.

    2. **Group and Format Actions:**
       - Convert identified actions into **camelCase**.
       - Ensure that the actions are logically grouped and contextually accurate.
       - Remove any redundant or overlapping actions.

    3. **Ensure Completeness and Relevance:**
       - Ensure that the identified actions reflect the actual flow of the system.
       - Remove any vague or incomplete terms.

    ### **Output Format (Strict JSON):**
    Return only valid JSON in the following format:
    {{
        "actions": ["action1", "action2", "action3"]
    }}

    **Problem Statement:**
    {problem_statement}

    Now extract the key actions based on the above rules.
    """

    try:
        # Generate response using Gemini model
        model = genai.GenerativeModel("gemini-1.5-flash")
        response = model.generate_content(prompt, generation_config={"temperature": 0.0, "top_p": 1, "top_k": 1})
        
        # Debugging: Print raw response
        response_text = response.text.strip()
        print("Raw Response from Gemini:\n", response_text)

        # Clean up response (remove markdown formatting)
        cleaned_text = re.sub(r"```json|```", "", response_text).strip()

        # Additional Debugging: Print cleaned text before parsing
        print("Cleaned JSON:\n", repr(cleaned_text))  # Use repr() to detect hidden characters

        # Check if cleaned_text is empty
        # (every failure path returns an empty list so callers always receive a list)
        if not cleaned_text:
            print("Error: Cleaned JSON is empty. Cannot parse.")
            return []

        # Parse JSON safely
        output = json.loads(cleaned_text)
        return output.get("actions", [])

    except json.JSONDecodeError as e:
        print("JSON parsing error:", str(e))
        return []

    except Exception as e:
        print("Error:", str(e))
        return []

# Identify actions using Gemini
identified_actions = identify_actions(problem_statement)

# Display Result
print("\nIdentified Actions:", identified_actions)


Raw Response from Gemini:
 ```json
{
  "actions": ["selectTicketType", "selectTicketQuantity", "choosePaymentMethod", "processCashPayment", "processCardPayment", "processSmartcardPayment", "printTicket", "dispenseTicket", "sendTransactionMessage", "readMetroPass", "readTimeCard", "returnChange", "returnMoney", "displayErrorMessage", "abortTransaction", "checkSufficientBalance", "verifyCurrency", "updateTicketInformation"]
}
```
Cleaned JSON:
 '{\n  "actions": ["selectTicketType", "selectTicketQuantity", "choosePaymentMethod", "processCashPayment", "processCardPayment", "processSmartcardPayment", "printTicket", "dispenseTicket", "sendTransactionMessage", "readMetroPass", "readTimeCard", "returnChange", "returnMoney", "displayErrorMessage", "abortTransaction", "checkSufficientBalance", "verifyCurrency", "updateTicketInformation"]\n}'

Identified Actions: ['selectTicketType', 'selectTicketQuantity', 'choosePaymentMethod', 'processCashPayment', 'processCardPayment', 'processSmartcardPaymen
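
The raw response above arrives wrapped in a Markdown code fence, which the regex strips before parsing. A slightly more defensive sketch is to parse only the span between the first `{` and the last `}`, which also tolerates stray prose around the object. `extract_json_object` is a hypothetical helper, and the sample response is made up for illustration.

```python
import json


def extract_json_object(text):
    """Parse the span from the first '{' to the last '}' in `text` as JSON.

    Hypothetical helper: returns the parsed dict, or None when no valid
    JSON object can be recovered.
    """
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        parsed = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


fence = "`" * 3  # built at runtime so no literal fence appears in this example
raw = f'{fence}json\n{{"actions": ["selectTicketType", "printTicket"]}}\n{fence}'
result = extract_json_object(raw)
print(result["actions"])
```

Returning `None` instead of raising lets the caller decide whether to retry the request or fall back to an empty list.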

### Step 2 - Define Activity Nodes

In [21]:
def define_activity_nodes(actions):
    """
    Identifies activity nodes from given actions using Gemini API with CoT prompting.
    """
    prompt = f"""
    You are an expert software analyst. Your task is to define **activity nodes** for an activity diagram 
    based on the identified actions. Follow this structured step-by-step process carefully:

    ### **Step-by-Step Approach:**
    1. **Map Actions to Nodes:**
       - For each identified action, create a corresponding node.
       - Ensure each node reflects the system’s behavior accurately.

    2. **Classify Nodes:**
       - If it represents the starting point → classify it as an **Initial Node**.
       - If it represents a state change or process → classify it as an **Action Node**.
       - If it represents a branching point → classify it as a **Decision Node**.
       - If it merges control flow → classify it as a **Merge Node**.
       - If it splits control into parallel flows → classify it as a **Fork Node**.
       - If it synchronizes control flow → classify it as a **Join Node**.
       - If it represents the termination point → classify it as a **Final Node**.

    3. **Ensure Completeness and Logical Flow:**
       - Start from the **Initial Node**.
       - Include all major actions.
       - Ensure proper transitions between nodes.
       - Conclude with a **Final Node**.

    4. **Output Format (Strict JSON):**
    Return only valid JSON in the following format:
    {{
        "nodes": [
            {{"type": "Initial Node", "name": "start"}},
            {{"type": "Action Node", "name": "issueTicket"}},
            {{"type": "Decision Node", "name": "isPaymentValid"}},
            {{"type": "Merge Node", "name": "mergeValidation"}},
            {{"type": "Fork Node", "name": "splitPaymentMethods"}},
            {{"type": "Join Node", "name": "syncCompletion"}},
            {{"type": "Final Node", "name": "end"}}
        ]
    }}

    **Identified Actions:**
    {actions}

    Now define the activity nodes based on the above rules.
    """

    try:
        # Generate response using Gemini model
        model = genai.GenerativeModel("gemini-1.5-flash")
        response = model.generate_content(prompt, generation_config={"temperature": 0.0, "top_p": 1, "top_k": 1})
        
        # Debugging: Print raw response
        response_text = response.text.strip()
        print("Raw Response from Gemini:\n", response_text)

        # Clean up response (remove markdown formatting)
        cleaned_text = re.sub(r"```json|```", "", response_text).strip()

        # Additional Debugging: Print cleaned text before parsing
        print("Cleaned JSON:\n", repr(cleaned_text))

        # Check if cleaned_text is empty
        # (every failure path returns an empty list so callers always receive a list)
        if not cleaned_text:
            print("Error: Cleaned JSON is empty. Cannot parse.")
            return []

        # Parse JSON safely
        output = json.loads(cleaned_text)
        return output.get("nodes", [])

    except json.JSONDecodeError as e:
        print("JSON parsing error:", str(e))
        return []

    except Exception as e:
        print("Error:", str(e))
        return []

# Generate Activity Nodes using Gemini
identified_nodes = define_activity_nodes(identified_actions)

# Display Result
print("\nIdentified Activity Nodes:", identified_nodes)


Raw Response from Gemini:
 ```json
{
  "nodes": [
    {
      "type": "Initial Node",
      "name": "start"
    },
    {
      "type": "Action Node",
      "name": "selectTicketType"
    },
    {
      "type": "Action Node",
      "name": "selectTicketQuantity"
    },
    {
      "type": "Action Node",
      "name": "updateTicketInformation"
    },
    {
      "type": "Action Node",
      "name": "choosePaymentMethod"
    },
    {
      "type": "Decision Node",
      "name": "isPaymentValid"
    },
    {
      "type": "Fork Node",
      "name": "splitPaymentMethods"
    },
    {
      "type": "Action Node",
      "name": "processCashPayment"
    },
    {
      "type": "Action Node",
      "name": "returnChange"
    },
    {
      "type": "Action Node",
      "name": "processCardPayment"
    },
    {
      "type": "Action Node",
      "name": "processSmartcardPayment"
    },
    {
      "type": "Action Node",
      "name": "readMetroPass"
    },
    {
      "type": "Action Node",
      

### Step 3 - Establish Control Flow

In [22]:
def establish_control_flow(nodes):
    """
    Establishes control flow between activity nodes using Gemini API with CoT prompting.
    """
    prompt = f"""
    You are an expert software analyst. Your task is to establish **control flow** between the defined activity nodes 
    for an activity diagram. Follow this structured step-by-step process carefully:

    ### **Step-by-Step Approach:**
    1. **Establish Transition Rules:**
       - Ensure a clear transition from the **Initial Node** to the first **Action Node**.
       - Maintain logical progression between nodes.
       - Ensure a proper transition to the **Final Node** at the end.

    2. **Map Conditions and Decision Flow:**
       - If a node represents a decision point → create a **Decision Node**.
       - Define the possible outcomes for the decision and link them to corresponding nodes.
       - If the flow merges after a decision → create a **Merge Node**.

    3. **Handle Parallel and Synchronous Flow:**
       - If multiple flows can occur simultaneously → create a **Fork Node**.
       - Ensure synchronization → use a **Join Node** after all parallel flows complete.

    4. **Ensure Logical Consistency:**
       - All nodes should be linked correctly without any dead ends.
       - Ensure there are no unconnected nodes.

    5. **Remove Redundancies:**
       - Remove repeated or overlapping transitions.

    ### **Output Format (Strict JSON):**
    Return only valid JSON in the following format:
    {{
        "control_flow": [
            {{"source": "start", "target": "issueTicket", "type": "direct"}},
            {{"source": "issueTicket", "target": "isPaymentValid", "type": "decision"}},
            {{"source": "isPaymentValid", "target": "completeTransaction", "type": "merge"}},
            {{"source": "completeTransaction", "target": "end", "type": "direct"}}
        ]
    }}

    **Defined Nodes:**
    {nodes}

    Now establish the control flow between these nodes.
    """

    try:
        # Generate response using Gemini model
        model = genai.GenerativeModel("gemini-1.5-flash")
        response = model.generate_content(prompt, generation_config={"temperature": 0.0, "top_p": 1, "top_k": 1})
        
        # Debugging: Print raw response
        response_text = response.text.strip()
        print("Raw Response from Gemini:\n", response_text)

        # Clean up response (remove markdown formatting)
        cleaned_text = re.sub(r"```json|```", "", response_text).strip()

        # Additional Debugging: Print cleaned text before parsing
        print("Cleaned JSON:\n", repr(cleaned_text))

        # Check if cleaned_text is empty
        # (every failure path returns an empty list so callers always receive a list)
        if not cleaned_text:
            print("Error: Cleaned JSON is empty. Cannot parse.")
            return []

        # Parse JSON safely
        output = json.loads(cleaned_text)
        return output.get("control_flow", [])

    except json.JSONDecodeError as e:
        print("JSON parsing error:", str(e))
        return []

    except Exception as e:
        print("Error:", str(e))
        return []

# Generate Control Flow using Gemini
control_flow = establish_control_flow(identified_nodes)

# Display Result
print("\nGenerated Control Flow:", control_flow)


Raw Response from Gemini:
 ```json
{
  "control_flow": [
    {"source": "start", "target": "selectTicketType", "type": "direct"},
    {"source": "selectTicketType", "target": "selectTicketQuantity", "type": "direct"},
    {"source": "selectTicketQuantity", "target": "updateTicketInformation", "type": "direct"},
    {"source": "updateTicketInformation", "target": "choosePaymentMethod", "type": "direct"},
    {"source": "choosePaymentMethod", "target": "isPaymentValid", "type": "direct"},
    {"source": "isPaymentValid", "target": "mergeValidation", "type": "true"},
    {"source": "isPaymentValid", "target": "displayErrorMessage", "type": "false"},
    {"source": "displayErrorMessage", "target": "abortTransaction", "type": "direct"},
    {"source": "abortTransaction", "target": "end", "type": "direct"},
    {"source": "mergeValidation", "target": "splitPaymentMethods", "type": "direct"},
    {"source": "splitPaymentMethods", "target": "processCashPayment", "type": "fork"},
    {"source":
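
The Step 3 prompt asks the model to avoid dead ends and unconnected nodes, but nothing in the notebook verifies that it did. The sketch below checks those rules locally. It assumes the node and edge shapes used in this notebook (`type`/`name` and `source`/`target`/`type`); `check_control_flow` and the sample data are hypothetical.

```python
from collections import deque


def check_control_flow(nodes, control_flow):
    """Return a list of problems; an empty list means none were found.

    Hypothetical helper that checks the rules stated in the Step 3 prompt.
    """
    problems = []
    names = {n["name"] for n in nodes}
    initial = [n["name"] for n in nodes if n["type"] == "Initial Node"]
    finals = {n["name"] for n in nodes if n["type"] == "Final Node"}

    if len(initial) != 1:
        problems.append(f"expected exactly one Initial Node, found {len(initial)}")

    # Build an adjacency map while flagging edges that mention unknown nodes.
    successors = {name: [] for name in names}
    for edge in control_flow:
        for key in ("source", "target"):
            if edge[key] not in names:
                problems.append(f"edge refers to undefined node '{edge[key]}'")
        if edge["source"] in names and edge["target"] in names:
            successors[edge["source"]].append(edge["target"])

    # Dead ends: non-final nodes without an outgoing edge.
    for name in sorted(names - finals):
        if not successors[name]:
            problems.append(f"'{name}' has no outgoing edge")

    # Unreachable nodes: breadth-first search from the initial node.
    if len(initial) == 1:
        seen, queue = {initial[0]}, deque(initial)
        while queue:
            for nxt in successors[queue.popleft()]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        for name in sorted(names - seen):
            problems.append(f"'{name}' is unreachable from the Initial Node")

    return problems


sample_nodes = [
    {"type": "Initial Node", "name": "start"},
    {"type": "Action Node", "name": "selectTicketType"},
    {"type": "Action Node", "name": "printTicket"},
    {"type": "Final Node", "name": "end"},
]
sample_flow = [
    {"source": "start", "target": "selectTicketType", "type": "direct"},
    {"source": "selectTicketType", "target": "end", "type": "direct"},
]
for problem in check_control_flow(sample_nodes, sample_flow):
    print(problem)
```

In the sample, `printTicket` is defined but never wired in, so it is reported both as a dead end and as unreachable.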

### Step 4 - Generate PlantUML Script

In [23]:
def generate_plantuml(nodes, control_flow, output_file="activity_diagram.puml"):
    """
    Generates a PlantUML activity diagram (legacy arrow syntax) from identified nodes and control flow.
    """
    # Initial and Final Nodes both map to PlantUML's (*) pseudo-node:
    # (*) as a source is the start point, (*) as a target is the end point.
    terminals = {n["name"] for n in nodes if n["type"] in ("Initial Node", "Final Node")}

    def ref(name):
        return "(*)" if name in terminals else f'"{name}"'

    lines = ["@startuml", ""]
    for flow in control_flow:
        flow_type = flow.get("type", "direct")
        # Label non-direct edges (e.g. "true", "false", "fork") so branch outcomes stay visible.
        label = "" if flow_type == "direct" else f"[{flow_type}]"
        lines.append(f'{ref(flow["source"])} -->{label} {ref(flow["target"])}')
    lines += ["", "@enduml"]
    plantuml_code = "\n".join(lines)

    # Save to file
    with open(output_file, "w") as file:
        file.write(plantuml_code)

    print(f"PlantUML file '{output_file}' generated successfully.")

# Generate UML
generate_plantuml(identified_nodes, control_flow)

# ✅ To render, use:
# `plantuml activity_diagram.puml`


PlantUML file 'activity_diagram.puml' generated successfully.
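
The render command mentioned in the cell above can also be run from Python. This sketch assumes a `plantuml` launcher is available on `PATH`. If only the JAR is installed, the command would differ. The helper names are hypothetical, and `-tpng` / `-tsvg` select the output format.

```python
import shutil
import subprocess


def build_render_command(puml_path, fmt="png"):
    """Build the argument list for the PlantUML command-line launcher."""
    if fmt not in ("png", "svg"):
        raise ValueError(f"unsupported format: {fmt}")
    return ["plantuml", f"-t{fmt}", puml_path]


def render(puml_path, fmt="png"):
    """Render the diagram; return False when the launcher is not installed."""
    if shutil.which("plantuml") is None:
        print("plantuml not found on PATH; install it or render the file another way.")
        return False
    subprocess.run(build_render_command(puml_path, fmt), check=True)
    return True


print(build_render_command("activity_diagram.puml", "svg"))
```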


### Debugging - Generate PlantUML Script with Gemini

In [30]:
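# Note: this redefines generate_plantuml from Step 4; here Gemini writes the script.
def generate_plantuml(nodes, control_flow, output_file="debug.puml"):
    """
    Asks Gemini to write a PlantUML activity diagram script for the given nodes and
    control flow, then saves the cleaned response to output_file.
    """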
    
    prompt = f"""
    You are an expert UML Diagram Developer. Your task is to generate a PlantUML script for an activity diagram from the defined activity nodes and control flow.

    **Defined Nodes:**
    {nodes}

    **Defined Control Flow:**
    {control_flow}

    Now generate the PlantUML activity diagram. 
    **Return only the PlantUML code.**  
    """
    

    try:
        # Generate response using Gemini model
        model = genai.GenerativeModel("gemini-1.5-flash")
        response = model.generate_content(prompt, generation_config={"temperature": 0.7, "top_p": 1, "top_k": 1})
        
        # Debugging: Print raw response
        response_text = response.text.strip()
        print("Raw Response from Gemini:\n", response_text)

        # Clean up response (remove markdown formatting)
        cleaned_text = re.sub(r"```plantuml|```", "", response_text).strip()

        # Additional Debugging: Print cleaned text before saving
        print("Cleaned Code:\n", repr(cleaned_text))

        # ✅ Save to file
        with open(output_file, "w") as file:
            file.write(cleaned_text)

        print(f"✅ PlantUML file '{output_file}' generated successfully.")

        return cleaned_text

    except Exception as e:
        print("Error:", str(e))
        return None

# Generate UML
plantuml_code = generate_plantuml(identified_nodes, control_flow)

# ✅ To render, use:
# `plantuml debug.puml`

Raw Response from Gemini:
 ```plantuml
@startuml
start

:selectTicketType;
:selectTicketQuantity;
:updateTicketInformation;
:choosePaymentMethod;

if (isPaymentValid) then (yes)
    fork
        :processCashPayment;
        :returnChange;
    fork again
        :processCardPayment;
    fork again
        :processSmartcardPayment;
        :readMetroPass;
        :readTimeCard;
        :checkSufficientBalance;
    end fork
    :verifyCurrency;
    :returnMoney;
    :syncCompletion;
else (no)
    :displayErrorMessage;
    :abortTransaction;
endif

:sendTransactionMessage;
:printTicket;
:dispenseTicket;

if (isTransactionSuccessful) then (yes)
    :mergeValidation;
else (no)
    :displayErrorMessage;
    :abortTransaction;
    :mergeValidation;
endif

stop

@enduml
```
Cleaned Code:
 '@startuml\nstart\n\n:selectTicketType;\n:selectTicketQuantity;\n:updateTicketInformation;\n:choosePaymentMethod;\n\nif (isPaymentValid) then (yes)\n    fork\n        :processCashPayment;\n        :returnChang
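
Because the script above is written by the model rather than built deterministically, a cheap structural check before rendering can catch truncated or unbalanced output. This is a text-level sketch, not a PlantUML parser. `plantuml_sanity_issues` is a hypothetical helper, and it only knows the `if`/`endif` and `fork`/`end fork` pairs that appear in the response above.

```python
def plantuml_sanity_issues(code):
    """Return a list of structural issues found in PlantUML activity code.

    Hypothetical helper: checks the start/end markers and block balance only.
    """
    lines = [line.strip() for line in code.strip().splitlines() if line.strip()]
    issues = []
    if not lines or lines[0] != "@startuml":
        issues.append("missing @startuml on the first line")
    if not lines or lines[-1] != "@enduml":
        issues.append("missing @enduml on the last line")

    # Count block openers and closers of the new-style activity syntax.
    if_open = sum(line.startswith(("if ", "if(")) for line in lines)
    if_close = sum(line == "endif" for line in lines)
    fork_open = sum(line == "fork" for line in lines)
    fork_close = sum(line in ("end fork", "endfork") for line in lines)
    if if_open != if_close:
        issues.append(f"{if_open} 'if' vs {if_close} 'endif'")
    if fork_open != fork_close:
        issues.append(f"{fork_open} 'fork' vs {fork_close} 'end fork'")
    return issues


sample = """@startuml
start
if (isPaymentValid) then (yes)
    fork
        :processCashPayment;
    fork again
        :processCardPayment;
    end fork
else (no)
    :displayErrorMessage;
endif
stop
@enduml"""

truncated = "\n".join(sample.splitlines()[:5])
print(plantuml_sanity_issues(sample))
print(plantuml_sanity_issues(truncated))
```

An empty list means the script passed these checks. It does not guarantee that PlantUML will accept the script or that the diagram matches the intended flow.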