<a href="https://colab.research.google.com/github/beloveddie/smart-contract-audit-loop/blob/main/main.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Human-in-the-Loop Smart Contract Auditing Workflow

Built with the LlamaIndex Workflow API.

_Modeled after the [story-crafting notebook](https://github.com/beloveddie/AI-Craft/blob/main/docs/docs/examples/workflow/human_in_the_loop_story_crafting.ipynb)._

## 🧱 STEP 1: Define the Data Model (AuditSegment)

We’ll start by creating a structured output model that the LLM will use to:

* Summarize the function

* Identify potential risks

* Suggest improvements

In [1]:
# Install the necessary libraries
!pip install llama-index


In [2]:
from typing import List
from llama_index.core.bridge.pydantic import BaseModel, Field

class AuditSegment(BaseModel):
    """Structured audit result for a single function."""
    summary: str = Field(description="A concise explanation of what the function does.")
    risks: List[str] = Field(description="A list of potential security vulnerabilities or logical flaws.")
    suggestions: List[str] = Field(description="Recommended improvements or fixes.")
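
To make the role of the schema concrete, here is a minimal stand-in written with only the standard library. It is not how LlamaIndex or Pydantic validate output; `AuditRecord` and `parse_audit` are hypothetical names, used only to illustrate the kind of shape checks a structured model applies to raw LLM JSON.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class AuditRecord:
    """Hypothetical stand-in mirroring the three AuditSegment fields."""
    summary: str
    risks: List[str]
    suggestions: List[str]


def parse_audit(raw_json: str) -> AuditRecord:
    """Parse raw JSON and reject payloads that do not match the expected shape."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("audit must be a JSON object")
    if not isinstance(data.get("summary"), str):
        raise ValueError("summary must be a string")
    for key in ("risks", "suggestions"):
        items = data.get(key)
        if not isinstance(items, list) or not all(isinstance(i, str) for i in items):
            raise ValueError(f"{key} must be a list of strings")
    return AuditRecord(data["summary"], data["risks"], data["suggestions"])


good = parse_audit('{"summary": "Sends ether.", "risks": ["reentrancy"], "suggestions": []}')
print(good.risks)
```

A payload whose `risks` field is a plain string rather than a list would raise `ValueError` here, which is the failure mode the structured model is meant to catch early.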

## 🧠 STEP 2: Create the Prompt Template

We’ll design the prompt to guide the LLM to generate structured output matching `AuditSegment`.
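
Placeholders such as `{function_code}` are filled in when the prompt is rendered. The sketch below uses Python's built-in `str.format` as a stand-in for `PromptTemplate` (an assumption for illustration; the library's own rendering may differ in details). It also shows one thing to watch for: under format-style substitution, literal braces, such as a JSON example embedded in a prompt, have to be doubled.

```python
# Stand-in template; the doubled braces render as single literal braces.
TEMPLATE = """Analyze the following Solidity function:

{function_code}

Reply as JSON shaped like {{"summary": "...", "risks": [], "suggestions": []}}
"""

# The substituted value may contain braces itself; it is not re-parsed.
solidity_fn = "function ping() public pure returns (uint256) { return 1; }"
prompt = TEMPLATE.format(function_code=solidity_fn)
print(prompt)
```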

In [3]:
from llama_index.core.prompts import PromptTemplate

AUDIT_TEMPLATE = """
You are a smart contract auditor. Analyze the following Solidity function:

{function_code}

Return your findings in structured format:
1. Summary of what the function does,
2. List of potential vulnerabilities,
3. Suggestions for fixing issues or improving security.

Output must match the AuditSegment schema:
- summary: string
- risks: List of strings
- suggestions: List of strings
"""

## 🧭 STEP 3: Define the Auditing Workflow Class
We’ll build a class `SmartContractAuditWorkflow`, similar to the [story-crafting one](https://github.com/beloveddie/AI-Craft/blob/main/docs/docs/examples/workflow/human_in_the_loop_story_crafting.ipynb), with two main steps:

**⚙️ Workflow Steps:**

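1. `create_audit_segment`: the LLM analyzes the next function and outputs an `AuditSegment`; once every function has been audited, the workflow stops and returns all segments.

2. `prompt_human_review`: a human sees the AI's audit and either accepts it or gives feedback, which triggers a revised audit.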
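
The two steps hand control back and forth through events. As a rough mental model only (this is plain Python, not the Workflow API, and `fake_audit`, `AuditDone`, `ReviewDone`, and the scripted reviewer are hypothetical stubs), the loop looks like this:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AuditDone:
    """Plays the role of the event emitted after the LLM audit."""
    code: str
    summary: str


@dataclass
class ReviewDone:
    """Plays the role of the event emitted after human review."""
    accepted_summary: str


def fake_audit(code: str) -> str:
    """Stub for the LLM call; returns a canned summary."""
    return f"audit of {code!r}"


def run_loop(functions: List[str], reviewer: Callable[[AuditDone], ReviewDone]) -> List[str]:
    accepted: List[str] = []
    while len(accepted) < len(functions):
        code = functions[len(accepted)]            # step 1: audit the next function
        event = AuditDone(code, fake_audit(code))
        review = reviewer(event)                   # step 2: hand the audit to the reviewer
        accepted.append(review.accepted_summary)
    return accepted                                # analogous to the workflow's final result


results = run_loop(["withdraw()", "mint()"], lambda ev: ReviewDone(ev.summary.upper()))
print(results)
```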

In [4]:
# set OPENAI_API_KEY from Colab Secrets
from google.colab import userdata
import os

os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")
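
The cell above only works inside Colab. A possible fallback for other environments is sketched below with the standard library; `resolve_api_key` is a hypothetical helper, and the environment-variable name is the one used above.

```python
import os
from getpass import getpass
from typing import Callable, Mapping, Optional


def resolve_api_key(
    env: Optional[Mapping[str, str]] = None,
    ask: Callable[[str], str] = getpass,
) -> str:
    """Return the key from the environment, prompting only if it is missing."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        key = ask("OPENAI_API_KEY: ").strip()
    if not key:
        raise RuntimeError("No OpenAI API key provided")
    return key


# Demo with injected values so nothing is read from the real environment.
print(resolve_api_key({"OPENAI_API_KEY": "sk-demo"}) == "sk-demo")
```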

In [36]:
from llama_index.llms.openai import OpenAI
from llama_index.core.workflow import (
    Context,
    Event,
    StartEvent,
    StopEvent,
    Workflow,
    step,
)
from llama_index.core.prompts import PromptTemplate
from llama_index.core.bridge.pydantic import BaseModel, Field
from typing import List

# -- AuditSegment model (repeated from Step 1 so this cell runs on its own) --
class AuditSegment(BaseModel):
    summary: str = Field(description="What the function does.")
    risks: List[str] = Field(description="Security or logic risks.")
    suggestions: List[str] = Field(description="Fixes or improvements.")

# -- Event classes --
class NewAuditEvent(Event):
    segment: AuditSegment
    function_code: str

class HumanReviewEvent(Event):
    function_id: str

# -- Templates --
AUDIT_TEMPLATE = """
You are a Solidity smart contract auditor. Analyze the following function:

{function_code}

Return the following:
- A brief summary of what it does
- A list of potential risks or vulnerabilities
- Suggestions to improve the function’s security or design

Return as a structured object with:
- summary: string
- risks: list of strings
- suggestions: list of strings
"""

FEEDBACK_PROMPT = """
You previously audited this Solidity function:

{function_code}

Original Audit Output:
SUMMARY:
{old_summary}

RISKS:
{old_risks}

SUGGESTIONS:
{old_suggestions}

A human reviewer provided the following feedback:
"{human_feedback}"

Using the feedback, revise your audit and return a complete structured object with:
- summary: string
- risks: list of strings
- suggestions: list of strings
"""

# -- Workflow Definition --
class SmartContractAuditWorkflow(Workflow):
    def __init__(self, function_codes: List[str], **kwargs):
        super().__init__(**kwargs)
        self.llm = OpenAI("gpt-4o-mini")
        self.function_codes = function_codes

    @step
    async def create_audit_segment(
        self, ctx: Context, ev: StartEvent | HumanReviewEvent
    ) -> NewAuditEvent | StopEvent:
        segments = await ctx.get("audit_segments", [])
        index = len(segments)

        if index < len(self.function_codes):
            code = self.function_codes[index]
            # Use the async variant so the LLM call does not block the event loop.
            audit = await self.llm.astructured_predict(
                AuditSegment,
                PromptTemplate(AUDIT_TEMPLATE),
                function_code=code,
            )
            segments.append(audit)
            await ctx.set("audit_segments", segments)
            return NewAuditEvent(segment=audit, function_code=code)
        else:
            return StopEvent(result=segments)

    @step
    async def prompt_human_review(
        self, ctx: Context, ev: NewAuditEvent
    ) -> HumanReviewEvent:
        segment = ev.segment

        print("\n📄 Function Code:\n", ev.function_code)
        print("\n🧠 Summary:\n", segment.summary)
        # Prefix the joined string with "- " so the first item is bulleted too.
        print("\n⚠️ Risks:\n- " + "\n- ".join(segment.risks))
        print("\n🔧 Suggestions:\n- " + "\n- ".join(segment.suggestions))

        feedback = input("\n💬 Feedback on this audit? Leave blank to accept, or type your comments: ").strip()

        # Keep revising until the reviewer submits an empty line.
        while feedback:
            print("\n🔁 Reprocessing audit with your feedback...\n")

            updated_segment = await self.llm.astructured_predict(
                AuditSegment,
                PromptTemplate(FEEDBACK_PROMPT),
                function_code=ev.function_code,
                old_summary=segment.summary,
                old_risks="\n".join(segment.risks),
                old_suggestions="\n".join(segment.suggestions),
                human_feedback=feedback,
            )

            print("✅ Updated Audit Segment:\n")
            print("🧠 Summary:", updated_segment.summary)
            print("⚠️ Risks:\n- " + "\n- ".join(updated_segment.risks))
            print("🔧 Suggestions:\n- " + "\n- ".join(updated_segment.suggestions))

            segment = updated_segment

            feedback = input("\n💬 More feedback? Press Enter to confirm and continue, or type more comments: ").strip()

        # Update the segment in context
        segments = await ctx.get("audit_segments")
        segments[-1] = segment
        await ctx.set("audit_segments", segments)

        return HumanReviewEvent(function_id="FUNC_" + str(len(segments)))
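
Because `prompt_human_review` calls `input()` directly, it is awkward to exercise without a person at the keyboard. One option, sketched here with hypothetical names (`refine`, `revise`, `ask`), is to pull the revision loop into a function whose feedback source and revision call are passed in:

```python
from typing import Callable, Tuple


def refine(
    draft: str,
    revise: Callable[[str, str], str],
    ask: Callable[[], str],
) -> Tuple[str, int]:
    """Revise `draft` until `ask` returns an empty string; return (final, rounds)."""
    rounds = 0
    feedback = ask().strip()
    while feedback:
        draft = revise(draft, feedback)  # in the notebook this would be the LLM call
        rounds += 1
        feedback = ask().strip()
    return draft, rounds


# Scripted reviewer: two comments, then an empty line to accept.
script = iter(["mention reentrancy", "shorter", ""])
final, rounds = refine(
    "v0",
    revise=lambda d, fb: f"{d} + [{fb}]",
    ask=lambda: next(script),
)
print(final, rounds)
```

With this shape, a test can replace both the reviewer and the model with deterministic stubs, while the notebook passes `input` and the real LLM call.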

## 🧪 STEP 4: Running the Workflow
Here we will:

1. Define sample **Solidity functions** to audit,

2. Instantiate the workflow with those functions,

3. Run it and observe AI + Human interactions.

In [40]:
import nest_asyncio

# Required in notebooks to allow nested async loops
nest_asyncio.apply()

# 🧾 Example Solidity functions for testing
function_list = [
    """
    function withdraw() public {
    require(balances[msg.sender] > 0, "No funds to withdraw");
    payable(msg.sender).transfer(balances[msg.sender]);
    balances[msg.sender] = 0;
    }
    """,
    # """
    # function withdraw() public {
    #     require(balances[msg.sender] > 0);
    #     payable(msg.sender).transfer(balances[msg.sender]);
    #     balances[msg.sender] = 0;
    # }
    # """,
    # """
    # function mint(address to, uint256 amount) external onlyOwner {
    #     _balances[to] += amount;
    #     _totalSupply += amount;
    # }
    # """
]

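# 🚀 Instantiate the workflow
# timeout=None disables the workflow's default run timeout, which a human
# reviewer typing feedback could otherwise exceed.
audit_workflow = SmartContractAuditWorkflow(function_codes=function_list, timeout=None)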

# ✅ Run the workflow
result = await audit_workflow.run()


📄 Function Code:
 
    function withdraw() public {
    require(balances[msg.sender] > 0, "No funds to withdraw");
    payable(msg.sender).transfer(balances[msg.sender]);
    balances[msg.sender] = 0;
    }
    

🧠 Summary:
 The withdraw function allows users to withdraw their funds from the contract. It checks if the user has a positive balance, transfers the balance to the user's address, and then resets the user's balance to zero.

⚠️ Risks:
 Reentrancy attack: If the recipient is a contract, it could call back into the withdraw function before the balance is set to zero, leading to potential multiple withdrawals.
- Gas limit issues: If the transfer fails due to gas limits or other reasons, the user's balance will not be reset, potentially causing issues in future withdrawals.

🔧 Suggestions:
 Implement a checks-effects-interactions pattern to mitigate reentrancy attacks by updating the balance before transferring funds.
- Consider using a pull-over-push model where users can claim
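
The reentrancy risk and the checks-effects-interactions suggestion in the output above can be illustrated with a toy Python model. This is a deliberately simplified sketch, not EVM semantics: `ToyVault`, `attack`, and `scenario` are hypothetical names, the "external call" is just a Python callback, and gas limits and reverts are ignored.

```python
class ToyVault:
    """Toy in-memory model of the audited withdraw(); NOT real EVM semantics."""

    def __init__(self, effects_first: bool) -> None:
        self.effects_first = effects_first
        self.balances = {}
        self.funds = 0

    def deposit(self, who: str, amount: int) -> None:
        self.balances[who] = self.balances.get(who, 0) + amount
        self.funds += amount

    def withdraw(self, who: str, on_receive) -> None:
        amount = self.balances.get(who, 0)
        if amount <= 0 or self.funds < amount:   # check
            return
        if self.effects_first:
            self.balances[who] = 0               # effect before interaction
        self.funds -= amount
        on_receive(amount)                       # interaction: may call back in
        if not self.effects_first:
            self.balances[who] = 0               # too late: the callback already ran


def attack(vault: ToyVault) -> int:
    """Re-enter withdraw() from the receive callback; return the total taken."""
    taken = []

    def on_receive(amount: int) -> None:
        taken.append(amount)
        vault.withdraw("attacker", on_receive)

    vault.withdraw("attacker", on_receive)
    return sum(taken)


def scenario(effects_first: bool) -> int:
    vault = ToyVault(effects_first)
    vault.deposit("victim", 30)
    vault.deposit("attacker", 10)
    return attack(vault)


print(scenario(effects_first=False))  # original ordering
print(scenario(effects_first=True))   # checks-effects-interactions ordering
```

In this toy model, the original ordering lets the attacker, who deposited 10, take all 40 units held by the vault, while zeroing the balance before the callback limits the withdrawal to the attacker's own 10.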

## ✅ STEP 5: Generate Final Audit Report

In [41]:
print("\n📒 FINAL AUDIT REPORT")
for idx, segment in enumerate(result):
    print(f"\n=== AUDIT: FUNCTION {idx + 1} ===")
    print("🧠 Summary:\n", segment.summary)
    print("⚠️ Risks:")
    for r in segment.risks:
        print("- " + r)
    print("🔧 Suggestions:")
    for s in segment.suggestions:
        print("- " + s)


📒 FINAL AUDIT REPORT

=== AUDIT: FUNCTION 1 ===
🧠 Summary:
 The withdraw function enables users to retrieve their funds from the contract. It verifies that the user has a positive balance, transfers the corresponding amount to the user's address, and subsequently sets the user's balance to zero.
⚠️ Risks:
- Reentrancy attack: If the recipient is a contract, it could invoke the withdraw function again before the user's balance is reset, allowing for multiple withdrawals.
- Gas limit issues: If the transfer fails due to gas limits or other reasons, the user's balance will remain unchanged, potentially causing problems for future withdrawals.
🔧 Suggestions:
- Implement the checks-effects-interactions pattern to mitigate reentrancy attacks by updating the user's balance before transferring funds.
- Consider adopting a pull-over-push model, allowing users to claim their funds instead of transferring them directly, which reduces the risk of reentrancy.
- Add a mechanism to handle failed tra
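
The report cell prints straight to the console. If the report needs to be saved or shared, one option is to build it as a Markdown string first. The sketch below assumes only that each segment exposes `summary`, `risks`, and `suggestions`; `render_report` is a hypothetical helper and the demo uses `SimpleNamespace` objects in place of real `AuditSegment` instances.

```python
from types import SimpleNamespace
from typing import Iterable


def render_report(segments: Iterable) -> str:
    """Build a Markdown report from objects exposing summary, risks, suggestions."""
    lines = ["# Final Audit Report"]
    for idx, seg in enumerate(segments, start=1):
        lines.append(f"\n## Function {idx}")
        lines.append(seg.summary)
        lines.append("\n### Risks")
        lines.extend(f"- {r}" for r in seg.risks)
        lines.append("\n### Suggestions")
        lines.extend(f"- {s}" for s in seg.suggestions)
    return "\n".join(lines) + "\n"


demo = [
    SimpleNamespace(
        summary="Sends ether.",
        risks=["reentrancy"],
        suggestions=["update state first"],
    )
]
report = render_report(demo)
print(report)
```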