FiniteMonkey is an intelligent vulnerability mining engine based on large language models that requires no pre-trained knowledge base or fine-tuning. Its core approach is task-driven prompt engineering: carefully designed prompts guide the model through vulnerability analysis.
- Task-driven rather than problem-driven
- Prompt-driven rather than code-driven
- Focus on prompt design rather than model design
- Leverage "deception" and hallucination as key mechanisms
As of May 2024, this tool has helped earn over $60,000 in bug bounties.
2024.11.19: Released version 1.0 - Validated LLM-based auditing and productization feasibility
Earlier Updates:
- 2024.08.02: Project renamed to finite-monkey-engine
- 2024.08.01: Added Func, Tact language support
- 2024.07.23: Added Cairo, Move language support
- 2024.07.01: Updated license
- 2024.06.01: Added Python language support
- 2024.05.18: Improved false positive rate (~20%)
- 2024.05.16: Added cross-contract vulnerability confirmation
- 2024.04.29: Added basic Rust language support
- PostgreSQL database
- OpenAI API access
- Python environment
- Place the project in the `src/dataset/agent-v1-c4` directory
- Configure the project in `datasets.json`:
{
  "StEverVault2": {
    "path": "StEverVault",
    "files": [],
    "functions": []
  }
}
- Create the database using `src/db.sql`
- Configure `.env`:
# Database connection
DATABASE_URL=postgresql://user:password@localhost:5432/dbname
# API settings
OPENAI_API_BASE="api.example.com"
OPENAI_API_KEY=sk-your-api-key-here
# Model settings
VUL_MODEL_ID=gpt-4-turbo
CLAUDE_MODEL=claude-3-5-sonnet-20240620
# Azure configuration
AZURE_API_KEY="your-azure-api-key"
AZURE_API_BASE="https://your-resource.openai.azure.com/"
AZURE_API_VERSION="2024-02-15-preview"
AZURE_DEPLOYMENT_NAME="your-deployment"
# API selection
AZURE_OR_OPENAI="OPENAI" # Options: OPENAI, AZURE, CLAUDE
# Scan parameters
BUSINESS_FLOW_COUNT=4
SWITCH_FUNCTION_CODE=False
SWITCH_BUSINESS_CODE=True
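
Before starting a long scan it can help to sanity-check the configuration above. The following is a minimal sketch, not part of the engine: the helper names (`parse_env`, `missing_settings`, `project_path`) are hypothetical, and the grouping of variables per backend is an assumption inferred from the variable names. It also assumes that the key in `datasets.json` is the project ID and that `path` is relative to `src/dataset/agent-v1-c4`.

```python
import json

# Assumed grouping of .env variables per backend (inferred from names only).
REQUIRED_BY_BACKEND = {
    "OPENAI": ["OPENAI_API_BASE", "OPENAI_API_KEY", "VUL_MODEL_ID"],
    "AZURE": ["AZURE_API_KEY", "AZURE_API_BASE",
              "AZURE_API_VERSION", "AZURE_DEPLOYMENT_NAME"],
    "CLAUDE": ["CLAUDE_MODEL"],
}


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if value[:1] in ('"', "'"):
            # Quoted value: keep what sits between the matching quotes.
            quote = value[0]
            end = value.find(quote, 1)
            value = value[1:end] if end != -1 else value[1:]
        else:
            # Unquoted value: drop any trailing inline comment.
            value = value.split(" #", 1)[0].strip()
        settings[key.strip()] = value
    return settings


def missing_settings(settings):
    """Return the names of required settings that are absent or empty."""
    backend = settings.get("AZURE_OR_OPENAI", "")
    if backend not in REQUIRED_BY_BACKEND:
        raise ValueError(
            f"AZURE_OR_OPENAI must be one of {sorted(REQUIRED_BY_BACKEND)}"
        )
    required = ["DATABASE_URL"] + REQUIRED_BY_BACKEND[backend]
    return [name for name in required if not settings.get(name)]


def project_path(datasets_json, project_id, base_dir="src/dataset/agent-v1-c4"):
    """Resolve a datasets.json entry to its assumed location on disk."""
    entry = json.loads(datasets_json)[project_id]
    return f"{base_dir}/{entry['path']}"
```

For example, `missing_settings(parse_env(open(".env").read()))` returns an empty list when every variable assumed for the selected backend is set.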
- Solidity (.sol)
- Rust (.rs)
- Python (.py)
- Move (.move)
- Cairo (.cairo)
- Tact (.tact)
- Func (.fc)
- Java (.java)
- Pseudo-Solidity (.fr) - For scanning Solidity pseudocode
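
To estimate how much of a project falls under the extensions listed above, a small helper like the following can be used. This is an illustrative sketch only: `SUPPORTED_EXTENSIONS` and `count_supported` are hypothetical names, and the engine's own file discovery may work differently.

```python
from collections import Counter
from pathlib import Path

# Extension-to-language table copied from the list above.
SUPPORTED_EXTENSIONS = {
    ".sol": "Solidity",
    ".rs": "Rust",
    ".py": "Python",
    ".move": "Move",
    ".cairo": "Cairo",
    ".tact": "Tact",
    ".fc": "Func",
    ".java": "Java",
    ".fr": "Pseudo-Solidity",
}


def count_supported(paths):
    """Count files per supported language; unsupported files are ignored."""
    counts = Counter()
    for path in paths:
        language = SUPPORTED_EXTENSIONS.get(Path(path).suffix.lower())
        if language:
            counts[language] += 1
    return dict(counts)
```

Passing the output of `Path("src/dataset/agent-v1-c4").rglob("*")` gives a per-language file count for everything placed in the dataset directory.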
- If interrupted due to network/API issues, resume scanning using the same project_id in main.py
- Results include detailed annotations:
  - Focus on entries marked "yes" in the `result` column
  - Filter "dont need In-project other contract" in the `category` column
  - Check the specific code in the `business_flow_code` column
  - Find the code location in the `name` column
- Best suited for logic vulnerability mining in real projects
- Not recommended for academic vulnerability testing
- GPT-4-turbo recommended for best results
- Average scan time for medium-sized projects: 2-3 hours
- Estimated cost for 10 iterations on medium projects: $20-30
- Current false positive rate: 30-65% (depends on project size)
- Claude 3.5 Sonnet provides better scanning results at an acceptable time cost; GPT-3 has not been fully tested
- Deceptive prompt theory adaptable to any language with minor modifications
- ANTLR AST parsing recommended for better code slicing results
- Currently supports Solidity, plans to expand language support
- DeepSeek-R1 is recommended for better confirmation results
- Excels at code understanding and logic vulnerability detection
- Weaker at control flow vulnerability detection
- Designed for real projects, not academic test cases
- Progress automatically saved per scan
- Claude 3.5 Sonnet offers the best scanning performance compared to other models
- DeepSeek-R1 offers the best confirmation performance compared to other models
- 10 iterations for medium projects take about 4 hours
- Results include detailed categorization
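
The result-triage notes above (the `result`, `category`, `business_flow_code`, and `name` columns) can be sketched as a small filter. This assumes the results have already been exported as a list of dictionaries keyed by column name; the function name `triage` is hypothetical. It also assumes that a "yes" appearing anywhere in the `result` value counts as a hit, and that rows whose `category` contains the given text are the ones to keep. Invert that check if you read the filter note the other way.

```python
def triage(rows, category_filter=None):
    """Keep rows marked "yes", optionally restricted by category text."""
    selected = []
    for row in rows:
        # Assumption: "yes" anywhere in the result value marks a finding.
        if "yes" not in (row.get("result") or "").lower():
            continue
        # Assumption: keep only rows whose category contains the filter text.
        if category_filter is not None and category_filter not in (
            row.get("category") or ""
        ):
            continue
        selected.append(
            {
                "name": row.get("name"),  # where the code lives
                "business_flow_code": row.get("business_flow_code"),
            }
        )
    return selected
```

Calling `triage(rows, "dont need In-project other contract")` then leaves only the entries to review by hand, each with its location and the code that was analyzed.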
Apache License 2.0
Pull Requests welcome!
Note: The project name is inspired by the Large Language Monkeys paper.