RevEngSecure: LLM-Augmented Reverse Engineering for Design-Level Software Defect and Security Analysis
This repository presents RevEngSecure (Code-to-Design) — an AI-driven reverse-engineering framework that unifies static analysis, semantic reasoning, and security compliance evaluation. The system extracts structural and contextual knowledge from C/C++ software by constructing a Code Property Graph (CPG), computing sixteen analytical metrics (M1–M16), and employing Large Language Models (LLMs) to reconstruct UML Use Case Diagrams and detailed behavioral metadata. In its final stage, RevEngSecure performs design-level security analysis aligned with international standards such as OWASP ASVS, CWE, STRIDE, NIST SP 800-53/63, and ISO/IEC 27001/27034, generating a traceable security architecture report that links detected design flaws directly to their corresponding code-level artifacts.
The Code-to-Design (RevEngSecure) pipeline performs an end-to-end transformation from raw source code to design-level insight and secure architecture evaluation.
Through a sequence of automated stages — static analysis, CPG generation, metric extraction, LLM-driven design recovery, and standards-based security verification — the framework reconstructs accurate high-level documentation for legacy or undocumented systems.
This enables developers, security analysts, and researchers to visualize software architecture, assess design integrity, and identify latent security weaknesses embedded within code structure and logic.
- Static Code Analysis: Uses Joern CPG to extract structural and semantic information from C/C++ projects
- Comprehensive Metrics Extraction: Implements 16 distinct metrics (M1–M16) covering functions, control flow, I/O operations, security patterns, and domain semantics
- AI-Powered Use Case Generation: Employs GPT-5 to infer use cases, actors, and relationships from code metadata
- Security Architecture Analysis: Generates detailed security reports mapping design flaws to OWASP, STRIDE, CWE, and NIST standards
- PlantUML Diagram Generation: Automatically produces renderable UML diagrams from extracted use case specifications
The system follows a modular pipeline architecture:
- CPG Generation: Parses C/C++ source code into a Code Property Graph using Joern
- Metrics Extraction: Computes 16 metrics covering structural, semantic, and security aspects
- LLM Analysis: Processes metrics JSON through GPT-5 to generate use case specifications
- Diagram Rendering: Converts PlantUML specifications to visual diagrams
- Security Analysis: Produces comprehensive security architecture reports
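The stage ordering above can be pictured as a chain of functions that pass a shared context along. The following is only a minimal sketch of that idea: the stage function names, the context keys, and the placeholder return values are all hypothetical and are not taken from `project.py`.

```python
from typing import Callable, Dict, List, Tuple

# A stage receives the shared context and returns an updated copy.
Stage = Tuple[str, Callable[[Dict], Dict]]

# Hypothetical placeholders: each one stands in for a real pipeline step.
def generate_cpg(ctx: Dict) -> Dict:
    return {**ctx, "cpg": f"{ctx['source_dir']}/cpg.bin"}

def extract_metrics(ctx: Dict) -> Dict:
    return {**ctx, "metrics": {"M1": 0}}

def llm_analysis(ctx: Dict) -> Dict:
    return {**ctx, "use_cases": []}

def render_diagram(ctx: Dict) -> Dict:
    return {**ctx, "diagram": "@startuml\n@enduml"}

def security_analysis(ctx: Dict) -> Dict:
    return {**ctx, "security_report": {"findings": []}}

STAGES: List[Stage] = [
    ("cpg_generation", generate_cpg),
    ("metrics_extraction", extract_metrics),
    ("llm_analysis", llm_analysis),
    ("diagram_rendering", render_diagram),
    ("security_analysis", security_analysis),
]

def run_pipeline(source_dir: str, stages: List[Stage] = STAGES) -> Dict:
    """Run every stage in order and record which ones completed."""
    ctx: Dict = {"source_dir": source_dir, "completed": []}
    for name, stage in stages:
        ctx = stage(ctx)
        ctx["completed"] = ctx["completed"] + [name]
    return ctx

if __name__ == "__main__":
    print(run_pipeline("/path/to/your/project/code")["completed"])
```

Keeping each stage as a plain function over a dictionary makes it easy to rerun or swap a single step, which is the main benefit of a modular layout.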
The system implements a comprehensive metrics suite (M1–M16) designed to capture evidence for use case extraction:
- M2: Entry Points (main, WinMain, DllMain, public methods)
- M7: CLI Arguments Usage (argv/argc occurrences)
- M8: I/O Operations (console, file, network, environment)
- M9: Name/Text Cues (semantic hints from identifiers and literals)
- M10: TF-IDF Domain Terms (bag-of-words and term frequency analysis)
- M11: Comments (documentation evidence)
- M14: Security Patterns (unsafe calls, high coupling, unvalidated input)
- M16: Similarity to UCD Terms (domain vocabulary matching)
- M3: Call Graph Edges (function call relationships)
- M4/M6: Control Flow Structures (branches, loops, switches)
- M12: Complexity Metrics (cyclomatic complexity, recursion, workflow depth)
- M13: Cross-Module Calls (inter-module dependencies)
- M15: Relations (inheritance, shared globals)
- M1: Function Count (structural size)
- M5: CFG Node Sum (control flow graph complexity)
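To make a few of the graph-based metrics more concrete, here is a small sketch that derives M3-style call edges, M13-style cross-module calls, direct recursion (part of M12), and a per-function coupling degree from a toy edge list. The function name `call_graph_metrics`, the input shapes, and the sample data are assumptions for illustration. The project computes these metrics with Joern queries, not with this code.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (caller, callee)

def call_graph_metrics(edges: List[Edge], file_of: Dict[str, str]) -> Dict:
    """Summarise a call graph given edges and a function-to-file map."""
    unique = sorted(set(edges))  # duplicate call sites collapse to one edge
    # Cross-module edge: both ends are known and live in different files.
    cross = [
        (a, b) for a, b in unique
        if a in file_of and b in file_of and file_of[a] != file_of[b]
    ]
    # Direct recursion: a function that calls itself.
    recursive = sorted({a for a, b in unique if a == b})
    # Coupling degree: number of distinct neighbours in either direction.
    neighbours: Dict[str, Set[str]] = defaultdict(set)
    for a, b in unique:
        neighbours[a].add(b)
        neighbours[b].add(a)
    degree = {fn: len(ns) for fn, ns in neighbours.items()}
    return {
        "call_edges": len(unique),
        "cross_module_edges": len(cross),
        "recursive": recursive,
        "degree": degree,
    }

if __name__ == "__main__":
    toy_edges = [
        ("main", "login"), ("main", "menu"), ("menu", "transfer"),
        ("transfer", "transfer"), ("main", "login"),
    ]
    toy_files = {"main": "main.cpp", "login": "auth.cpp",
                 "menu": "main.cpp", "transfer": "bank.cpp"}
    print(call_graph_metrics(toy_edges, toy_files))
```

Ignoring callees with no known file is a design choice of this sketch: it keeps calls into external libraries from being counted as cross-module dependencies.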
This project computes design-relevant metrics (M1–M16) directly from the Code Property Graph (CPG) using Joern.
Below are example queries showing how each metric is derived. You can run these in the Joern shell (./joern) or inside a Joern script.
Note: in the code below, cpg is the loaded code property graph and cleanMethods is a filtered list of real (non-synthetic) methods.
Count distinct method full names (excluding empty/synthetic ones):
val cleanMethods = cpg.method.l.filter(_.fullName.nonEmpty)
val m1Pairs = cleanMethods.map { m =>
val fn = m.fullName.trim
val sg = Option(m.signature).map(_.trim).getOrElse("")
(fn, sg)
}.distinct
val M1 = m1Pairs.size
println(s"M1 = $M1")
Detect classic entry functions (main, WinMain, DllMain) plus public methods:
val entryPointNames = List("main","WinMain","DllMain")
val entryMethods = cpg.method.nameExact(entryPointNames:_*).l
val publicMethods = cpg.method.where(_.modifier.modifierType("PUBLIC")).l
val entryPointFullNames =
(entryMethods.map(_.fullName) ++ publicMethods.map(_.fullName))
.map(_.trim)
.filter(fn => fn.nonEmpty)
.distinct
val M2 = entryPointFullNames.size
println(s"M2 = $M2")
Build caller–callee edges from call sites:
case class Edge(caller:String, callee:String, file:String)
val callEdges = cpg.call.l.map { call =>
val caller = call.method.fullName
val callee = Option(call.methodFullName).getOrElse(call.name)
val file = call.file.name.headOption.getOrElse("")
Edge(caller, callee, file)
}.distinct
val M3_callEdgeCount = callEdges.size
println(s"M3_callEdges = $M3_callEdgeCount")
Count if, switch, and loop constructs from control structures:
def lower(s:String) = s.toLowerCase
def isIfCode(s:String) = lower(s).startsWith("if")
def isSwitchCode(s:String) = lower(s).startsWith("switch")
def isLoopCode(s:String) = {
val t = lower(s)
t.startsWith("for") || t.startsWith("while") || t.startsWith("do")
}
val m4All = cleanMethods.flatMap { m =>
m.controlStructure.l.map(_.code.trim)
}
val M4_if = m4All.count(isIfCode)
val M4_switch = m4All.count(isSwitchCode)
val M4_loop = m4All.count(isLoopCode)
val M4_total = m4All.size
println(s"M4_if = $M4_if, M4_switch = $M4_switch, M4_loop = $M4_loop, M4_total = $M4_total")
Traverse cfgNext / cfgPrev to measure graph size and depth:
case class CfgStats(nodes:Int, edges:Int, longest:Int)
def cfgStats(m:Method): CfgStats = {
val nodes = m.cfgNode.l
val nodeCount = nodes.size
val edgeCount = nodes.map(_.cfgNext.size).sum
val orders = nodes.map(_.order)
val longest = if (orders.isEmpty) 1 else (orders.max - orders.min + 1)
CfgStats(nodeCount, edgeCount, longest)
}
val cfgByFunc = cleanMethods.map(m => m.fullName -> cfgStats(m)).toMap
val M5_nodesSum = cfgByFunc.values.map(_.nodes).sum
val M5_edgesSum = cfgByFunc.values.map(_.edges).sum
val M5_longestMax = cfgByFunc.values.map(_.longest).foldLeft(0)(math.max)
println(s"M5_cfg_nodes_sum = $M5_nodesSum")
println(s"M5_cfg_edges_sum = $M5_edgesSum")
println(s"M5_cfg_longest_max = $M5_longestMax")
Count all control structures across the whole program:
val branchCount = cpg.controlStructure.code.l.count(isIfCode)
val loopCount = cpg.controlStructure.code.l.count(isLoopCode)
val switchCount = cpg.controlStructure.code.l.count(isSwitchCode)
println(s"M6_branch_loop_switch = ($branchCount, $loopCount, $switchCount)")
Detect CLI arguments and common I/O APIs:
// CLI
val cliArgsUse = cpg.identifier.nameExact("argv","argc").size
println(s"M7_cliArgs = $cliArgsUse")
// I/O
val ioFile = Set("fopen","fclose","fread","fwrite","open","close","read","write")
val ioNet = Set("connect","accept","send","recv","socket","listen")
val ioEnv = Set("getenv","setenv","putenv")
val fileIOCount = cpg.call.name.l.count(ioFile.contains)
val netIOCount = cpg.call.name.l.count(ioNet.contains)
val envIOCount = cpg.call.name.l.count(ioEnv.contains)
println(s"M8_file_net_env = ($fileIOCount, $netIOCount, $envIOCount)")
Match function names and string literals against domain verbs:
val nameClues = List("login","logout","auth","register","create","delete","update","search")
val nameCueMap =
cleanMethods.map(_.fullName).distinct.map { fn =>
val cues = nameClues.filter(k => fn.toLowerCase.contains(k)).toSet
fn -> cues
}.toMap
println(s"M9_nameCluesExamples = ${nameCueMap.size}")
Builds a bag-of-words per function using identifiers, parameters, and literals, then computes TF–IDF weights to capture domain-relevant tokens.
import scala.collection.mutable
def splitTokens(s:String): Seq[String] =
s.toLowerCase.replaceAll("[^a-z0-9_]+"," ").split("\\s+").filter(_.nonEmpty).toSeq
def bagOfWords(m:Method): Seq[String] = {
val parts = mutable.ArrayBuffer[String]()
parts += m.name
parts ++= m.parameter.name.l
parts ++= m.ast.isIdentifier.name.l
parts ++= m.ast.isLiteral.code.l
splitTokens(parts.mkString(" "))
}
val funcBags = cleanMethods.map(m => m.fullName -> bagOfWords(m)).toMap
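// Document frequency: number of functions whose bag contains the term
// (deduplicate per function, not globally, so df can exceed 1)
val termDocFreq = funcBags.values.toSeq.flatMap(_.distinct).groupBy(identity).view.mapValues(_.size).toMap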
val Ndocs = funcBags.size.max(1)
def tfidf(fn:String, term:String): Double = {
val terms = funcBags.getOrElse(fn, Seq.empty)
val tf = terms.count(_ == term).toDouble / terms.size.max(1)
val df = termDocFreq.getOrElse(term, 1).toDouble
val idf = Math.log((Ndocs + 1.0) / df)
tf * idf
}
def topTerms(fn:String, k:Int=10): Seq[(String,Double)] =
funcBags.getOrElse(fn, Seq.empty).distinct.map(t => t -> tfidf(fn,t)).sortBy(-_._2).take(k)
println(s"M10_tfidfTopTermsExamples = ${funcBags.size}")
Counts comment nodes per function to estimate documentation coverage.
val commentsPerFunc = cleanMethods.map(m => m.fullName -> m.comment.l.size).toMap
val totalComments = commentsPerFunc.values.sum
println(s"M11_commentsTotal = $totalComments")
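The TF-IDF weighting used for M10 can also be illustrated outside Joern. The sketch below follows the same formula as the query (term frequency within a function's bag, multiplied by `log((N + 1) / df)`), where `df` is the number of functions whose bag contains the term. The function names and the toy bags are assumptions made for this example only.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

def document_frequency(bags: Dict[str, List[str]]) -> Counter:
    """Count, for every term, how many function bags contain it."""
    df: Counter = Counter()
    for terms in bags.values():
        df.update(set(terms))  # each function contributes at most once
    return df

def tfidf(bags: Dict[str, List[str]], fn: str, term: str) -> float:
    """TF-IDF weight of `term` within the bag of function `fn`."""
    terms = bags.get(fn, [])
    tf = terms.count(term) / max(len(terms), 1)
    df = document_frequency(bags).get(term, 1)
    n_docs = max(len(bags), 1)
    return tf * math.log((n_docs + 1.0) / df)

def top_terms(bags: Dict[str, List[str]], fn: str, k: int = 3) -> List[Tuple[str, float]]:
    """Highest-weighted distinct terms of `fn`; ties are broken alphabetically."""
    scored = [(t, tfidf(bags, fn, t)) for t in set(bags.get(fn, []))]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))[:k]

if __name__ == "__main__":
    toy_bags = {
        "login": ["login", "user", "password", "user"],
        "transfer": ["transfer", "user", "amount"],
        "logout": ["logout", "user"],
    }
    for term, weight in top_terms(toy_bags, "login"):
        print(f"{term}: {weight:.3f}")
```

In the toy data, `user` appears in every bag, so it is weighted lower than `login` or `password` even though it occurs twice in the `login` bag. This is the effect the metric relies on to surface domain-specific tokens.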
Approximates cyclomatic complexity and detects recursive functions.
def cyclomaticApprox(m:Method): Int = {
val branches = m.controlStructure.l
1 + branches.count(_.code.startsWith("if")) +
branches.count(_.code.startsWith("switch")) +
branches.count(_.code.startsWith("for")) +
branches.count(_.code.startsWith("while"))
}
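def isRecursive(m:Method): Boolean = {
// Direct recursion only: look for a call inside this method that resolves to the method itself
val fn = m.fullName
m.ast.isCall.methodFullNameExact(fn).nonEmpty
}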
case class FlowInfo(cyclo:Int, recursive:Boolean)
val perFuncComplexity = cleanMethods.map(m => m.fullName -> FlowInfo(cyclomaticApprox(m), isRecursive(m))).toMap
val cycloSum = perFuncComplexity.values.map(_.cyclo).sum
val recCount = perFuncComplexity.count(_._2.recursive)
println(s"M12_cyclomaticSum = $cycloSum, recursiveFuncs = $recCount")
Detects inter-file calls and public API exposure to measure modularity:
case class Edge(caller:String, callee:String, file:String)
val callEdges = cpg.call.l.map { c =>
Edge(c.method.fullName, Option(c.methodFullName).getOrElse(c.name), c.file.name.headOption.getOrElse(""))
}.distinct
def fileOf(fn:String): String = cpg.method.fullNameExact(fn).file.name.headOption.getOrElse("")
val crossModuleEdges = callEdges.filter(e => fileOf(e.caller) != fileOf(e.callee))
val publicApiList = cpg.method.where(_.modifier.modifierType("PUBLIC")).fullName.l.distinct
println(s"M13_crossModuleEdges = ${crossModuleEdges.size}, publicApiCount = ${publicApiList.size}")
Flags unsafe API usage, high-coupling functions, and unvalidated inputs.
val knownUnsafe = Set("gets","strcpy","strcat","sprintf","scanf","memcpy","memmove")
val unsafeCalls = cpg.call.name.l.filter(knownUnsafe.contains).distinct
val degreeByFn = callEdges.flatMap(e => List(e.caller -> e.callee, e.callee -> e.caller))
.groupBy(_._1).view.mapValues(_.map(_._2).toSet.size).toMap
val degVals = degreeByFn.values.toSeq.sorted
def percIdx(vs: Seq[Int], p: Double): Int =
if (vs.isEmpty) 0 else vs((p * (vs.size - 1)).round.toInt.min(vs.size - 1))
val highCouplingThreshold = percIdx(degVals, 0.9)
val highCouplingFuncs = degreeByFn.filter(_._2 >= highCouplingThreshold).keys.toSeq
def unvalidatedInputHeuristic(m:Method): Boolean = {
val ids = m.ast.isIdentifier.name.l.map(_.toLowerCase)
val lits = m.ast.isLiteral.code.l.map(_.toLowerCase)
val hasInput = ids.exists(Set("argv","input","buf","line").contains)
val hasValidate = (ids ++ lits).exists(s => s.contains("validate") || s.contains("sanitize") || s.contains("check"))
hasInput && !hasValidate
}
val unvalidatedCount = cleanMethods.count(unvalidatedInputHeuristic)
println(s"M14_unsafeCalls = ${unsafeCalls.size}, highCoupling = ${highCouplingFuncs.size}, unvalidatedInput = $unvalidatedCount")
Detects inheritance edges and shared global variables to build class relationships.
case class InheritEdge(child:String, parent:String)
val inheritanceEdges =
cpg.typeDecl.l.flatMap(td => td.inheritsFromTypeFullName.l.map(p => InheritEdge(td.fullName, p)))
val allIdentifiers = cpg.identifier.name.l
val globals = allIdentifiers.groupBy(identity).filter(_._2.size >= 2).keys.toList
println(s"M15_inheritanceEdges = ${inheritanceEdges.size}, sharedGlobals = ${globals.size}")
Matches tokenized code artifacts against provided UCD_TERMS to detect use-case relevance.
val ucdTerms: Seq[String] =
sys.env.get("UCD_TERMS")
.map(_.split(",").toSeq.map(_.trim.toLowerCase))
.getOrElse(Seq.empty)
val ucdMatches = if (ucdTerms.nonEmpty) {
funcBags.flatMap { case (fn, terms) =>
val hits = ucdTerms.filter(t => terms.exists(_.contains(t)))
if (hits.nonEmpty) Some(fn -> hits) else None
}
} else Map.empty[String,Seq[String]]
println(s"M16_enabled = ${ucdTerms.nonEmpty}, matchedFunctions = ${ucdMatches.size}")
- Python 3.8+
- Docker (for Joern CPG extraction)
- OpenAI API Key (for GPT-5 access)
- PlantUML (optional, for local diagram rendering)
- Clone the Repository
  git clone https://github.com/ppakshad/CodeToDesign.git
  cd CodeToDesign
- Install Python Dependencies (pathlib ships with Python 3.8+ and does not need to be installed)
  pip install openai plantuml
- Configure Docker: Ensure Docker is running and can pull the Joern image:
  docker pull ghcr.io/joernio/joern:nightly
- Set OpenAI API Key: Create implementation/openai_api_key.txt or set the OPENAI_API_KEY environment variable:
  export OPENAI_API_KEY="your-api-key-here"
- Prepare Your C/C++ Project: Organize your source code in a directory structure:
  your_project/
  ├── code/
  │   ├── main.cpp
  │   ├── module1.cpp
  │   └── ...
  └── graphs/          (auto-generated)
- Run the Analysis Pipeline
  cd implementation
  python project.py
- Provide Project Path: When prompted, enter the path to your project's code directory:
  PROJECT path: /path/to/your/project/code
- Review Outputs: The pipeline generates:
  - graphs/usecase_table_metrics_v2.json: Extracted code metrics
  - graphs/usecase_usecases.puml: PlantUML use case diagram
  - graphs/usecase_usecases.png: Rendered diagram (if PlantUML server available)
  - implementation/testout.txt: Use case bindings with code references
  - implementation/securityReport.json: Security architecture analysis
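For the API key step above, one way to support both the environment variable and the key file is sketched below. The function name `resolve_api_key` is hypothetical, and the precedence (environment variable first, then the file) is an assumption of this sketch. The source does not state which one `project.py` prefers.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

def resolve_api_key(env: Mapping[str, str], key_file: Path) -> Optional[str]:
    """Return the API key from the environment, else from the key file, else None.

    Assumption: the environment variable wins when both are present.
    """
    key = env.get("OPENAI_API_KEY", "").strip()
    if key:
        return key
    if key_file.is_file():
        text = key_file.read_text(encoding="utf-8").strip()
        return text or None  # an empty file counts as "no key"
    return None

if __name__ == "__main__":
    found = resolve_api_key(os.environ, Path("implementation/openai_api_key.txt"))
    print("API key configured" if found else "No API key found")
```

Returning `None` instead of raising lets the caller decide whether a missing key is fatal, for example when only the metrics stage is being run.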
Set the UCD_TERMS environment variable to provide domain-specific vocabulary:
export UCD_TERMS="login,logout,register,search,upload,download"
Override the default GPT-5 model:
export OPENAI_MODEL="gpt-5-chat-latest"
project/
├── implementation/
│ ├── project.py # Main pipeline implementation
│ ├── docker.sh # Docker utility script
│ ├── usecaseprompt.txt # LLM prompt for use case extraction
│ ├── securityprompt.txt # LLM prompt for security analysis
│ ├── testout.txt # Generated use case bindings
│ └── securityReport.json # Generated security report
├── usecase/ # Analyzed projects (use case diagrams)
│ ├── BankingManagementSystem/
│ ├── LearningManagmentSystem/
│ └── ...
├── spearman/
└── README.md
The system generates PlantUML diagrams with:
- Actors: Human roles and external systems
- Use Cases: User-visible goals derived from code
- Relationships: Associations, <<include>>, <<extend>>, generalizations
- System Boundary: Inferred from project structure
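The diagram elements listed above map onto PlantUML text fairly directly. The sketch below shows one possible way to emit such text from a simple in-memory specification. The function `to_plantuml`, its parameters, and the sample relationships are assumptions for illustration and do not reflect the schema the pipeline actually uses.

```python
from typing import Dict, Iterable, Tuple

Pair = Tuple[str, str]

def to_plantuml(system: str,
                actors: Dict[str, str],
                use_cases: Dict[str, str],
                associations: Iterable[Pair],
                includes: Iterable[Pair] = (),
                extends: Iterable[Pair] = ()) -> str:
    """Render a use case diagram; dict keys are aliases, values are labels."""
    lines = ["@startuml", "left to right direction"]
    for alias, label in actors.items():
        lines.append(f'actor "{label}" as {alias}')
    lines.append(f'rectangle "{system}" {{')  # system boundary
    for alias, label in use_cases.items():
        lines.append(f'  usecase "{label}" as {alias}')
    lines.append("}")
    for actor, use_case in associations:
        lines.append(f"{actor} --> {use_case}")
    for base, included in includes:
        lines.append(f"{base} ..> {included} : <<include>>")
    for extension, base in extends:
        lines.append(f"{extension} ..> {base} : <<extend>>")
    lines.append("@enduml")
    return "\n".join(lines)

if __name__ == "__main__":
    # Illustrative data only: the include relationship is made up for the demo.
    print(to_plantuml(
        system="BankingManagementSystem",
        actors={"A1": "Customer"},
        use_cases={"UC1": "Login to System", "UC4": "Transfer Funds"},
        associations=[("A1", "UC1"), ("A1", "UC4")],
        includes=[("UC4", "UC1")],
    ))
```

The resulting text can be saved as a `.puml` file and rendered with a local PlantUML installation or a PlantUML server.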
Comprehensive security analysis including:
- Design Findings: Architecture-level security flaws
- Use Case Bindings: Code-to-use-case mappings
- Threat Modeling: STRIDE-based threat identification
- Compliance Mapping: OWASP, CWE, NIST references
- Remediation Roadmap: Prioritized action items
This framework supports research in:
- Reverse Engineering: Automated documentation of legacy systems
- Software Architecture: Extraction of high-level design from implementation
- Security Analysis: Systematic identification of design-level vulnerabilities
- AI-Assisted Software Engineering: LLM integration for code understanding
- Metrics-Driven Development: Evidence-based use case extraction
The repository includes a curated dataset of 12 C/C++ projects:
| Project | LOC | Files | Functions | Avg Complexity |
|---|---|---|---|---|
| LearningManagementSystem | 1,427 | 19 | 139 | 1.9 |
| RemoteDesktop | 992 | 11 | 87 | 2.3 |
| ProductManagementTool | 1,015 | 8 | 68 | 2.5 |
| BankingManagementSystem | 3,100 | 1 | 136 | 2.8 |
| HPSocketDev | 29,014 | 53 | 4,027 | 2.2 |
| ... | ... | ... | ... | ... |
Total: 38,966 LOC across 121 files, 4,823 functions (weighted avg complexity: 2.22)
The following table presents the security analysis results obtained from the BankingManagementSystem (C++) project. Each finding is automatically traced to its corresponding Use Case(s) and Bound Function(s) within the recovered design model.
| ID | Project | Defect / Weakness | Short Description | Severity | Referenced Standards | Use Case(s) | Bound Function(s) |
|---|---|---|---|---|---|---|---|
| DF1 | BankingManagementSystem (C++) | Weak Authentication | Console password prompt uses static check without hashing or lockout, exposing system to brute-force and spoofing. | High | OWASP A07; CWE-798; NIST IA-5; STRIDE: Spoofing | UC1 – Login to System | login:int() |
| DF2 | BankingManagementSystem (C++) | Undefined Authorization Model | No enforcement of role-based or contextual access; any authenticated user can perform financial operations. | Medium | OWASP A01; CWE-284; NIST AC-3, AC-6; STRIDE: EoP | UC4 – Transfer Funds | menu:void(), user_input:void(int) |
| DF3 | BankingManagementSystem (C++) | Missing Input Validation | User-provided fields (account number, SSN, amount) lack validation or canonicalization before processing. | Medium | OWASP A03; CWE-20; NIST SI-10; STRIDE: Tampering | UC5 – Submit Transaction | user_input:void(int) |
| DF4 | BankingManagementSystem (C++) | Lack of Data Classification / PII Lifecycle | Customer data and transaction records are processed without data classification or retention policy. | Medium | OWASP A02; CWE-200; ISO/IEC 27034; NIST DM-2 | UC4 – Transfer Funds | user_input:void(int), validate_account:void(int) |
| DF5 | BankingManagementSystem (C++) | Insufficient Auditing and Monitoring | Missing logging for login attempts, transfers, and account updates; no audit trail for repudiation analysis. | Medium | OWASP A09; CWE-778; NIST AU-2, AU-12; STRIDE: Repudiation | UC1 – Login to System | login:int(), validate_account:void(int) |
| DF6 | BankingManagementSystem (C++) | Insecure Secrets Management | Potential hardcoded password or plaintext comparison within memory; no secret rotation or KDF. | High | OWASP A07; CWE-522; CWE-798; NIST SC-12; STRIDE: Spoofing | UC1 – Login to System | login:int() |
Note: All findings were generated as part of the RevEngSecure analysis pipeline, which reconstructs design-level artifacts and detects potential security flaws from source code automatically.
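The traceability shown in the table (finding to use case to bound function) can be explored programmatically. The sketch below uses a hypothetical `Finding` record populated with four rows from the table and builds a reverse index from function to findings. The record layout is an assumption and is not the actual schema of `securityReport.json`.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass(frozen=True)
class Finding:
    """Hypothetical record mirroring a subset of the table's columns."""
    id: str
    weakness: str
    severity: str
    use_cases: Tuple[str, ...]
    functions: Tuple[str, ...]

# Values copied from the BankingManagementSystem table above.
FINDINGS = [
    Finding("DF1", "Weak Authentication", "High", ("UC1",), ("login:int()",)),
    Finding("DF2", "Undefined Authorization Model", "Medium", ("UC4",),
            ("menu:void()", "user_input:void(int)")),
    Finding("DF3", "Missing Input Validation", "Medium", ("UC5",),
            ("user_input:void(int)",)),
    Finding("DF6", "Insecure Secrets Management", "High", ("UC1",), ("login:int()",)),
]

def findings_by_function(findings: List[Finding]) -> Dict[str, List[str]]:
    """Map each bound function to the IDs of the findings that reference it."""
    index: Dict[str, List[str]] = defaultdict(list)
    for finding in findings:
        for fn in finding.functions:
            index[fn].append(finding.id)
    return dict(index)

def high_severity(findings: List[Finding]) -> List[str]:
    """IDs of findings rated High, in input order."""
    return [f.id for f in findings if f.severity == "High"]

if __name__ == "__main__":
    for fn, ids in findings_by_function(FINDINGS).items():
        print(f"{fn}: {', '.join(ids)}")
```

A reverse index like this makes it easy to see which functions concentrate the most design-level findings and are therefore good candidates for early remediation.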
- Language Support: Currently limited to C/C++ (via Joern CPG)
- LLM Dependency: Requires GPT-5 API access (may incur costs)
- Static Analysis: Only captures compile-time information (no runtime behavior)
- Heuristic-Based: Metrics rely on naming conventions and structural patterns
- Manual Validation: Generated diagrams may require domain expert review
Contributions are welcome! Areas for improvement:
- Support for additional programming languages
- Enhanced metrics and heuristics
- Improved LLM prompt engineering
- Integration with other static analysis tools
- Performance optimizations for large codebases
If you use this framework in your research, please cite:
@software{Pakshad2025CodeToDesign,
title = {Code-to-Design: LLM-Augmented Reverse Engineering for Design-Level Security},
author = {Pakshad, Puya},
year = {2025},
url = {https://github.com/ppakshad/CodeToDesign}
}
- Joern: Code Property Graph framework for C/C++
- OpenAI: GPT-5 API for semantic analysis
- PlantUML: Diagram rendering and visualization
For questions, issues, or collaboration inquiries, please open an issue on GitHub or contact ppakshad@hawk.illinoistech.edu.
Note: This project is under active development. API interfaces and file structures may change between versions.