In [1]:
%pip install google-generativeai langchain langchain_google_genai

Note: you may need to restart the kernel to use updated packages.





In [2]:
from dotenv import load_dotenv

# Load API key from .env file
load_dotenv()

True

In [3]:
import os
import json
import re
import google.generativeai as genai

# Step 1: Setup Gemini API
genai.configure(api_key=os.environ["GEMINI_API_KEY"])



In [16]:
problem_statement = """A mail-order company wants to automate its order
processing. The initial version of the order processing system should be accessible to customers
via the web. Customers can also call the company by phone and interact with the system via a
customer representative. It is highly likely that the company will enhance this system in upcoming
years with new features.
The system allows customers to place orders, check the status of their orders, cancel an existing
order, and request a catalog. Customers may also return a product, but this is only possible through
the phone, not available on the web. When placing an order, the customer identifies himself either
by means of a customer number (for existing registered customers) or by providing his name and
address. He then selects a number of products by giving the product number or by selecting
products from the online catalog. For each product, information such as price, a description, and
a picture (only on demand as they are usually high-resolution images of large size) is presented to
the customer. Also, the availability of the product is obtained from the inventory. The customer
indicates whether he wants to buy the product and in what quantity. When all desired products
have been selected, the customer provides a shipping address and a credit card number and a
billing address (if different from the shipping address). Then an overview of the ordered products
and the total cost are presented. If the customer approves, the order is submitted. Credit card
number, billing address, and a specification of the cost of the order are used on the invoice, which
is forwarded to the accounting system (an existing software module). Orders are forwarded to the
shipping company, where they are filled and shipped.
Customers who spent over a certain amount within the past year are promoted to be gold customers.
Gold customers have additional rights such as being able to return products in an extended
time period as well as earning more bonus points with each purchase. In addition, in cases where
a product is on back order, gold customers have the option to sign up for an email notification
for when the particular product becomes available."""

### Step 1 - Identify Actions

In [17]:
def identify_actions(problem_statement):
    """
    Identifies potential actions from a given problem statement using Gemini API with CoT prompting.
    """
    prompt = f"""
    You are an expert software analyst. Your task is to identify **key actions** from the given problem statement. 
    Follow this detailed step-by-step process carefully to ensure accurate results:

    ### **Step-by-Step Approach:**
    1. **Identify Core Actions:**
       - Carefully analyze the problem statement for any **verbs** or **actions** that represent key behaviors of the system.
       - Focus on verbs or phrases that reflect meaningful state changes or user interactions.
       - Ignore vague, ambiguous, or irrelevant actions unless they provide useful context.

    2. **Group and Format Actions:**
       - Convert identified actions into **camelCase**.
       - Ensure that the actions are logically grouped and contextually accurate.
       - Remove any redundant or overlapping actions.

    3. **Ensure Completeness and Relevance:**
       - Ensure that the identified actions reflect the actual flow of the system.
       - Remove any vague or incomplete terms.

    ### **Output Format (Strict JSON):**
    Return only valid JSON in the following format:
    {{
        "actions": ["action1", "action2", "action3"]
    }}

    **Problem Statement:**
    {problem_statement}

    Now extract the key actions based on the above rules.
    """

    try:
        # Generate response using Gemini model
        model = genai.GenerativeModel("gemini-1.5-flash")
        response = model.generate_content(prompt, generation_config={"temperature": 0.0, "top_p": 1, "top_k": 1})
        
        # Debugging: Print raw response
        response_text = response.text.strip()
        print("Raw Response from Gemini:\n", response_text)

        # Clean up response (remove markdown formatting)
        cleaned_text = re.sub(r"```json|```", "", response_text).strip()


        # Additional Debugging: Print cleaned text before parsing
        print("Cleaned JSON:\n", repr(cleaned_text))  # Use repr() to detect hidden characters

        # Check if cleaned_text is empty
        # (failure paths return an empty list so callers always get the same type)
        if not cleaned_text:
            print("Error: Cleaned JSON is empty. Cannot parse.")
            return []

        # Parse JSON safely
        output = json.loads(cleaned_text)
        return output.get("actions", [])

    except json.JSONDecodeError as e:
        print("JSON parsing error:", str(e))
        return []

    except Exception as e:
        print("Error:", str(e))
        return []

# Identify actions using Gemini
identified_actions = identify_actions(problem_statement)

# Display Result
print("\nIdentified Actions:", identified_actions)

Raw Response from Gemini:
 ```json
{
  "actions": ["placeOrder", "checkOrderStatus", "cancelOrder", "requestCatalog", "returnProductByPhone", "identifyCustomer", "selectProduct", "getProductInformation", "checkProductAvailability", "provideShippingAddress", "provideCreditCardNumber", "provideBillingAddress", "submitOrder", "forwardOrderToShipping", "promoteToGoldCustomer", "returnProductExtended", "earnBonusPoints", "signUpForEmailNotification"]
}
```
Cleaned JSON:
 '{\n  "actions": ["placeOrder", "checkOrderStatus", "cancelOrder", "requestCatalog", "returnProductByPhone", "identifyCustomer", "selectProduct", "getProductInformation", "checkProductAvailability", "provideShippingAddress", "provideCreditCardNumber", "provideBillingAddress", "submitOrder", "forwardOrderToShipping", "promoteToGoldCustomer", "returnProductExtended", "earnBonusPoints", "signUpForEmailNotification"]\n}'

Identified Actions: ['placeOrder', 'checkOrderStatus', 'cancelOrder', 'requestCatalog', 'returnProductByPho
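The cleanup step above strips the Markdown fence with a regular expression, which only works when the reply contains nothing but the fenced JSON. A minimal sketch of a more tolerant extractor follows, assuming the reply may also carry prose before or after the fence. `extract_json_object` and the sample reply are hypothetical and not part of the notebook.

```python
import json
import re

# Hypothetical helper, not part of the notebook above.
FENCE = "`" * 3  # built at runtime so this snippet contains no literal fence
FENCED_BLOCK = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE), re.DOTALL
)


def extract_json_object(reply_text):
    """Return the JSON object embedded in reply_text, or None if none parses."""
    match = FENCED_BLOCK.search(reply_text)
    # Prefer the fenced body; fall back to the whole reply when there is no fence.
    candidate = match.group(1) if match else reply_text
    start, end = candidate.find("{"), candidate.rfind("}")
    if start == -1 or end < start:
        return None
    try:
        return json.loads(candidate[start:end + 1])
    except json.JSONDecodeError:
        return None


# Made-up reply with prose around the fenced JSON.
reply = (
    "Here is the result:\n"
    + FENCE + 'json\n{"actions": ["placeOrder", "cancelOrder"]}\n' + FENCE
    + "\nDone."
)
parsed = extract_json_object(reply)
print(parsed["actions"] if parsed else [])
```

Returning `None` on failure keeps "no JSON found" distinct from "JSON found but the list is empty", which the caller can then map to whatever default it prefers.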

### Step 2 - Define Activity Nodes

In [18]:
def define_activity_nodes(actions):
    """
    Identifies activity nodes from given actions using Gemini API with CoT prompting.
    """
    prompt = f"""
    You are an expert software analyst. Your task is to define **activity nodes** for an activity diagram 
    based on the identified actions. Follow this structured step-by-step process carefully:

    ### **Step-by-Step Approach:**
    1. **Map Actions to Nodes:**
       - For each identified action, create a corresponding node.
       - Ensure each node reflects the system’s behavior accurately.

    2. **Classify Nodes:**
       - If it represents the starting point → classify it as an **Initial Node**.
       - If it represents a state change or process → classify it as an **Action Node**.
       - If it represents a branching point → classify it as a **Decision Node**.
       - If it merges control flow → classify it as a **Merge Node**.
       - If it splits control into parallel flows → classify it as a **Fork Node**.
       - If it synchronizes control flow → classify it as a **Join Node**.
       - If it represents the termination point → classify it as a **Final Node**.

    3. **Ensure Completeness and Logical Flow:**
       - Start from the **Initial Node**.
       - Include all major actions.
       - Ensure proper transitions between nodes.
       - Conclude with a **Final Node**.

    4. **Output Format (Strict JSON):**
    Return only valid JSON in the following format:
    {{
        "nodes": [
            {{"type": "Initial Node", "name": "start"}},
            {{"type": "Action Node", "name": "issueTicket"}},
            {{"type": "Decision Node", "name": "isPaymentValid"}},
            {{"type": "Merge Node", "name": "mergeValidation"}},
            {{"type": "Fork Node", "name": "splitPaymentMethods"}},
            {{"type": "Join Node", "name": "syncCompletion"}},
            {{"type": "Final Node", "name": "end"}}
        ]
    }}

    **Identified Actions:**
    {actions}

    Now define the activity nodes based on the above rules.
    """

    try:
        # Generate response using Gemini model
        model = genai.GenerativeModel("gemini-1.5-flash")
        response = model.generate_content(prompt, generation_config={"temperature": 0.0, "top_p": 1, "top_k": 1})
        
        # Debugging: Print raw response
        response_text = response.text.strip()
        print("Raw Response from Gemini:\n", response_text)

        # Clean up response (remove markdown formatting)
        cleaned_text = re.sub(r"```json|```", "", response_text).strip()

        # Additional Debugging: Print cleaned text before parsing
        print("Cleaned JSON:\n", repr(cleaned_text))

        # Check if cleaned_text is empty
        # (failure paths return an empty list so callers always get the same type)
        if not cleaned_text:
            print("Error: Cleaned JSON is empty. Cannot parse.")
            return []

        # Parse JSON safely
        output = json.loads(cleaned_text)
        return output.get("nodes", [])

    except json.JSONDecodeError as e:
        print("JSON parsing error:", str(e))
        return []

    except Exception as e:
        print("Error:", str(e))
        return []

# Generate Activity Nodes using Gemini
identified_nodes = define_activity_nodes(identified_actions)

# Display Result
print("\nIdentified Activity Nodes:", identified_nodes)

Raw Response from Gemini:
 ```json
{
  "nodes": [
    {
      "type": "Initial Node",
      "name": "start"
    },
    {
      "type": "Action Node",
      "name": "identifyCustomer"
    },
    {
      "type": "Action Node",
      "name": "requestCatalog"
    },
    {
      "type": "Action Node",
      "name": "selectProduct"
    },
    {
      "type": "Action Node",
      "name": "getProductInformation"
    },
    {
      "type": "Action Node",
      "name": "checkProductAvailability"
    },
    {
      "type": "Decision Node",
      "name": "isProductAvailable"
    },
    {
      "type": "Action Node",
      "name": "placeOrder"
    },
    {
      "type": "Action Node",
      "name": "provideShippingAddress"
    },
    {
      "type": "Action Node",
      "name": "provideBillingAddress"
    },
    {
      "type": "Action Node",
      "name": "provideCreditCardNumber"
    },
    {
      "type": "Action Node",
      "name": "submitOrder"
    },
    {
      "type": "Decision Node",
    

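The prompt in this step asks for seven node types, a flow that starts from an Initial Node, and a Final Node at the end, but nothing checks that the reply honors those rules. The sketch below is one way to verify the parsed list before passing it on. `validate_nodes` is a hypothetical helper, and the "exactly one Initial Node" and "unique names" rules are assumptions made here, not requirements stated in the notebook.

```python
# Node types copied from the prompt in this step.
ALLOWED_NODE_TYPES = {
    "Initial Node", "Action Node", "Decision Node", "Merge Node",
    "Fork Node", "Join Node", "Final Node",
}


def validate_nodes(nodes):
    """Return a list of problems found in nodes; an empty list means none found."""
    if not isinstance(nodes, list):
        return ["expected a list of nodes, got " + type(nodes).__name__]
    problems = []
    seen_names = set()
    for index, node in enumerate(nodes):
        if not isinstance(node, dict):
            problems.append(f"node {index}: not an object")
            continue
        kind, name = node.get("type"), node.get("name")
        if kind not in ALLOWED_NODE_TYPES:
            problems.append(f"node {index}: unknown type {kind!r}")
        if not name:
            problems.append(f"node {index}: missing name")
        elif name in seen_names:
            # Assumption: names double as identifiers in the control flow.
            problems.append(f"node {index}: duplicate name {name!r}")
        else:
            seen_names.add(name)
    kinds = [n.get("type") for n in nodes if isinstance(n, dict)]
    if kinds.count("Initial Node") != 1:
        problems.append(
            f"expected exactly one Initial Node, found {kinds.count('Initial Node')}"
        )
    if "Final Node" not in kinds:
        problems.append("expected at least one Final Node")
    return problems


# Made-up sample in the shape the prompt requests.
sample_nodes = [
    {"type": "Initial Node", "name": "start"},
    {"type": "Action Node", "name": "placeOrder"},
    {"type": "Final Node", "name": "end"},
]
print(validate_nodes(sample_nodes))
```

Calling `validate_nodes(identified_nodes)` right after Step 2 would surface a malformed reply before it reaches the control-flow prompt.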
### Step 3 - Establish Control Flow

In [19]:
def establish_control_flow(nodes):
    """
    Establishes control flow between activity nodes using Gemini API with CoT prompting.
    """
    prompt = f"""
    You are an expert software analyst. Your task is to establish **control flow** between the defined activity nodes 
    for an activity diagram. Follow this structured step-by-step process carefully:

    ### **Step-by-Step Approach:**
    1. **Establish Transition Rules:**
       - Ensure a clear transition from the **Initial Node** to the first **Action Node**.
       - Maintain logical progression between nodes.
       - Ensure a proper transition to the **Final Node** at the end.

    2. **Map Conditions and Decision Flow:**
       - If a node represents a decision point → create a **Decision Node**.
       - Define the possible outcomes for the decision and link them to corresponding nodes.
       - If the flow merges after a decision → create a **Merge Node**.

    3. **Handle Parallel and Synchronous Flow:**
       - If multiple flows can occur simultaneously → create a **Fork Node**.
       - Ensure synchronization → use a **Join Node** after all parallel flows complete.

    4. **Ensure Logical Consistency:**
       - All nodes should be linked correctly without any dead ends.
       - Ensure there are no unconnected nodes.

    5. **Remove Redundancies:**
       - Remove repeated or overlapping transitions.

    ### **Output Format (Strict JSON):**
    Return only valid JSON in the following format:
    {{
        "control_flow": [
            {{"source": "start", "target": "issueTicket", "type": "direct"}},
            {{"source": "issueTicket", "target": "isPaymentValid", "type": "decision"}},
            {{"source": "isPaymentValid", "target": "completeTransaction", "type": "merge"}},
            {{"source": "completeTransaction", "target": "end", "type": "direct"}}
        ]
    }}

    **Defined Nodes:**
    {nodes}

    Now establish the control flow between these nodes.
    """

    try:
        # Generate response using Gemini model
        model = genai.GenerativeModel("gemini-1.5-flash")
        response = model.generate_content(prompt, generation_config={"temperature": 0.7, "top_p": 1, "top_k": 1})
        
        # Debugging: Print raw response
        response_text = response.text.strip()
        print("Raw Response from Gemini:\n", response_text)

        # Clean up response (remove markdown formatting)
        cleaned_text = re.sub(r"```json|```", "", response_text).strip()

        # Additional Debugging: Print cleaned text before parsing
        print("Cleaned JSON:\n", repr(cleaned_text))

        # Check if cleaned_text is empty
        # (failure paths return an empty list so callers always get the same type)
        if not cleaned_text:
            print("Error: Cleaned JSON is empty. Cannot parse.")
            return []

        # Parse JSON safely
        output = json.loads(cleaned_text)
        return output.get("control_flow", [])

    except json.JSONDecodeError as e:
        print("JSON parsing error:", str(e))
        return []

    except Exception as e:
        print("Error:", str(e))
        return []

# Generate Control Flow using Gemini
control_flow = establish_control_flow(identified_nodes)

# Display Result
print("\nGenerated Control Flow:", control_flow)

Raw Response from Gemini:
 ```json
{
  "control_flow": [
    {"source": "start", "target": "identifyCustomer", "type": "direct"},
    {"source": "identifyCustomer", "target": "requestCatalog", "type": "direct"},
    {"source": "requestCatalog", "target": "selectProduct", "type": "direct"},
    {"source": "selectProduct", "target": "getProductInformation", "type": "direct"},
    {"source": "getProductInformation", "target": "checkProductAvailability", "type": "direct"},
    {"source": "checkProductAvailability", "target": "isProductAvailable", "type": "direct"},
    {"source": "isProductAvailable", "target": "placeOrder", "type": "true"},
    {"source": "isProductAvailable", "target": "requestCatalog", "type": "false"},
    {"source": "placeOrder", "target": "provideShippingAddress", "type": "direct"},
    {"source": "provideShippingAddress", "target": "provideBillingAddress", "type": "direct"},
    {"source": "provideBillingAddress", "target": "provideCreditCardNumber", "type": "direct
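The prompt asks the model to avoid dead ends and unconnected nodes, and that can be checked mechanically once the reply is parsed. Below is a minimal sketch using a breadth-first search from the initial node. `check_control_flow` is a hypothetical helper, and it assumes every node and edge is a dict with the keys shown in the prompts and that the initial node is named `start`.

```python
from collections import deque


def check_control_flow(nodes, control_flow, start="start"):
    """Report edge endpoints that are not nodes, unreachable nodes, and dead ends."""
    names = {node["name"] for node in nodes}
    final_names = {node["name"] for node in nodes if node["type"] == "Final Node"}

    # Adjacency list: source name -> list of target names.
    adjacency = {}
    for edge in control_flow:
        adjacency.setdefault(edge["source"], []).append(edge["target"])

    unknown = sorted(
        {end for edge in control_flow
         for end in (edge["source"], edge["target"]) if end not in names}
    )

    # Breadth-first search from the start node.
    reached = set()
    if start in names:
        reached.add(start)
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for target in adjacency.get(current, []):
                if target not in reached:
                    reached.add(target)
                    queue.append(target)

    return {
        "unknown": unknown,
        "unreachable": sorted(names - reached),
        # A dead end is a non-final node with no outgoing edge.
        "dead_ends": sorted(n for n in names - final_names if n not in adjacency),
    }


# Made-up sample: cancelOrder is never connected and ghost is not a declared node.
sample_nodes = [
    {"type": "Initial Node", "name": "start"},
    {"type": "Action Node", "name": "identifyCustomer"},
    {"type": "Action Node", "name": "cancelOrder"},
    {"type": "Final Node", "name": "end"},
]
sample_flow = [
    {"source": "start", "target": "identifyCustomer", "type": "direct"},
    {"source": "identifyCustomer", "target": "end", "type": "direct"},
    {"source": "identifyCustomer", "target": "ghost", "type": "direct"},
]
report = check_control_flow(sample_nodes, sample_flow)
print(report)
```

An empty list under all three keys is a reasonable precondition for generating the diagram; anything else is worth inspecting or sending back to the model.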

### Step 4 - Generate PlantUML Script

In [21]:
import os

def generate_plantuml(nodes, control_flow, output_file="approach1.puml"):
    """
    Generates a PlantUML activity diagram script from identified nodes and control flow.
    """
    
    prompt = f"""
    You are an expert UML Diagram Developer. Your task is to generate plantUML script for defined activity nodes and control flows for an activity diagram.

    **Defined Nodes:**
    {nodes}

    **Defined Control_flow**
    {control_flow}

    Now generate the PlantUML activity diagram. 
    **Return only the PlantUML code.**  
    """
    

    try:
        # Generate response using Gemini model
        model = genai.GenerativeModel("gemini-1.5-flash")
        response = model.generate_content(prompt, generation_config={"temperature": 0.0, "top_p": 1, "top_k": 1})
        
        # Debugging: Print raw response
        response_text = response.text.strip()
        print("Raw Response from Gemini:\n", response_text)

        # Clean up response (remove markdown formatting)
        cleaned_text = re.sub(r"```plantuml|```", "", response_text).strip()

        # Additional Debugging: Print cleaned text before saving
        print("Cleaned Code:\n", repr(cleaned_text))

        # ✅ Save to file
        with open(output_file, "w", encoding="utf-8") as file:
            file.write(cleaned_text)

        print(f"✅ PlantUML file '{output_file}' generated successfully.")

        return cleaned_text

    except Exception as e:
        print("Error:", str(e))
        return None

# Generate UML
plantuml_code = generate_plantuml(identified_nodes, control_flow)

# ✅ To render, use:
# `plantuml approach1.puml`

Raw Response from Gemini:
 ```plantuml
@startuml
start

:identifyCustomer;
:requestCatalog;
:selectProduct;
:getProductInformation;
:checkProductAvailability;

if (isProductAvailable) then (true)
  :placeOrder;
  :provideShippingAddress;
  :provideBillingAddress;
  :provideCreditCardNumber;
  :submitOrder;
  if (isPaymentValid) then (true)
    :mergePaymentValidation;
    :forwardOrderToShipping;
    :earnBonusPoints;
    :promoteToGoldCustomer;
    :signUpForEmailNotification;
    :end;
  else (false)
    :cancelOrder;
    :checkOrderStatus;
    if (isOrderCancelled) then (true)
      fork
        :returnProductByPhone;
      fork again
        :returnProductExtended;
      end fork
      :joinReturnMethods;
      :end;
    else (false)
      :end;
    endif
  endif
else (false)
  :requestCatalog;
endif

@enduml
```
Cleaned Code:
 '@startuml\nstart\n\n:identifyCustomer;\n:requestCatalog;\n:selectProduct;\n:getProductInformation;\n:checkProductAvailability;\n\nif (isProductAvailable) t
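The generated script above closes its branches with `:end;`, which PlantUML draws as an ordinary action box labeled "end"; the flow is terminated by the `stop` or `end` keyword instead. A rough lint pass can catch this and a few other structural slips before rendering. The sketch below is a hypothetical helper (`check_plantuml`) that only looks at line-level keywords and is not a PlantUML parser.

```python
def check_plantuml(script):
    """Return a list of structural issues found in a PlantUML activity script."""
    # Line numbers below count non-blank lines only.
    lines = [line.strip() for line in script.splitlines() if line.strip()]
    issues = []
    if not lines or lines[0] != "@startuml":
        issues.append("first line is not @startuml")
    if not lines or lines[-1] != "@enduml":
        issues.append("last line is not @enduml")

    open_ifs = 0
    open_forks = 0
    for number, line in enumerate(lines, start=1):
        if line.startswith("if ") or line.startswith("if("):
            open_ifs += 1
        elif line == "endif":
            open_ifs -= 1
        elif line == "fork":
            open_forks += 1
        elif line in ("end fork", "end merge"):
            open_forks -= 1
        elif line == ":end;":
            issues.append(
                f"non-blank line {number}: ':end;' is an action box, not a terminator"
            )
    if open_ifs != 0:
        issues.append("if/endif are not balanced")
    if open_forks != 0:
        issues.append("fork/end fork are not balanced")
    return issues


# Abbreviated, made-up script in the style of the output above.
sample_script = """
@startuml
start
if (isPaymentValid) then (true)
  :forwardOrderToShipping;
  :end;
else (false)
  :cancelOrder;
endif
@enduml
"""
for issue in check_plantuml(sample_script):
    print(issue)
```

Running this on `plantuml_code` before writing the file gives a cheap signal that the script needs another pass, without invoking the renderer.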

### Debugging...

In [9]:
import os

def generate_plantuml(nodes, control_flow, output_file="debug.puml"):
    """
    Generates a PlantUML activity diagram script from identified nodes and control flow.
    """
    
    prompt = f"""
    You are an expert UML Diagram Developer. Your task is to generate plantUML script for defined activity nodes and control flows for an activity diagram.

    **Defined Nodes:**
    {nodes}

    **Defined Control_flow**
    {control_flow}

    Now generate the PlantUML activity diagram. 
    **Return only the PlantUML code.**  
    """
    

    try:
        # Generate response using Gemini model
        model = genai.GenerativeModel("gemini-1.5-flash")
        response = model.generate_content(prompt, generation_config={"temperature": 0.7, "top_p": 1, "top_k": 1})
        
        # Debugging: Print raw response
        response_text = response.text.strip()
        print("Raw Response from Gemini:\n", response_text)

        # Clean up response (remove markdown formatting)
        cleaned_text = re.sub(r"```plantuml|```", "", response_text).strip()

        # Additional Debugging: Print cleaned text before saving
        print("Cleaned Code:\n", repr(cleaned_text))

        # ✅ Save to file
        with open(output_file, "w") as file:
            file.write(cleaned_text)

        print(f"✅ PlantUML file '{output_file}' generated successfully.")

        return cleaned_text

    except Exception as e:
        print("Error:", str(e))
        return None

# Generate UML
plantuml_code = generate_plantuml(identified_nodes, control_flow)

# ✅ To render, use:
# `plantuml debug.puml`

Raw Response from Gemini:
 ```plantuml
@startuml
start

:Start Process;

if (Condition A?) then (Yes)
  :Action 1;
  :Action 2;
else (No)
  :Action 3;
  :Action 4;
endif

:End Process;

stop
@enduml
```
Cleaned Code:
 '@startuml\nstart\n\n:Start Process;\n\nif (Condition A?) then (Yes)\n  :Action 1;\n  :Action 2;\nelse (No)\n  :Action 3;\n  :Action 4;\nendif\n\n:End Process;\n\nstop\n@enduml'
✅ PlantUML file 'debug.puml' generated successfully.
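The placeholder diagram in this debug run ("Action 1", "Condition A?") is the kind of output one might expect when `nodes` and `control_flow` are empty at the time the prompt is built, for example after an earlier step failed; the notebook does not confirm that this was the cause. A small guard, sketched below with the hypothetical name `require_diagram_inputs`, would make that situation fail loudly instead of producing a generic diagram.

```python
def require_diagram_inputs(nodes, control_flow):
    """Raise ValueError when either input is empty or is not a list."""
    for label, value in (("nodes", nodes), ("control_flow", control_flow)):
        if not isinstance(value, list) or not value:
            raise ValueError(f"{label} is empty or not a list: {value!r}")


# Made-up inputs: an empty dict stands in for the result of a failed step.
try:
    require_diagram_inputs({}, [{"source": "start", "target": "end", "type": "direct"}])
    outcome = "inputs accepted"
except ValueError as error:
    outcome = f"rejected: {error}"
print(outcome)
```

Calling the guard as the first statement of `generate_plantuml` would stop the run before any request is sent to the model.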
