SAFE is a system for contextual security auditing of research artifacts. It takes outputs from static analysis tools (e.g., Semgrep, Trivy) and uses a repository-aware LLM agent to determine whether a finding is truly exploitable in real-world usage.
The system uses a ReAct-style agent powered by LangGraph, equipped with filesystem tools to inspect code, trace dependencies, and reason about execution context.
SAFE/
│── main.py # Entry point
│── config.yaml # Configuration file
│── requirements.txt # Dependencies
│── SAFE_TOOL_OUTPUT.csv # Tool (Semgrep and Trivy) output for SAFE (its own codebase)
│
├── core/ # Core reasoning & pipeline logic
│ ├── auditor.py # Main auditing pipeline
│ ├── validator.py # Validation logic
│ ├── schemas.py # Data schemas
│ └── logger.py # Logging utilities
│
├── tools/ # Static & structural analysis tools
│ ├── repo_parser.py # Repository structure parsing
│ ├── ast_parser.py # Code-level AST analysis
│ ├── dependency_analyzer.py # Analyzes dependencies between modules and files
│ ├── file_reader.py # Handles file loading and content extraction
│ ├── code_search.py # Enables keyword and pattern-based code search
│ └── artifact_resolver.py # Resolves artifacts and links related components
│
├── llm/ # LLM interaction layer
│ ├── provider.py # LLM abstraction
│ └── cost_tracker.py # Token & cost tracking
│
├── outputs/ # Generated outputs from SAFE runs
│ ├── final_results.csv # Final classified findings/results
│ ├── costs/ # Cost tracking outputs
│ │ └── *.json # Token usage and cost logs
│ └── logs/ # Execution logs (generated per run)
│ └── *.log # Detailed execution traces
├── ARTIFACT
│ └── SAFE/ # Self-contained sample artifact used for testing SAFE on its own codebase
- Setup Repository

  ```bash
  git clone https://github.com/nanda-rani/SAFE.git
  cd SAFE
  ```

- Install Virtual Environment

  ```bash
  python -m venv .venv
  source .venv/bin/activate
  pip install -r requirements.txt
  ```

- Configure API Keys
  You must export the relevant API key for the model you intend to use.

  ```bash
  export OPENAI_API_KEY="sk-..."
  ```
- Data Placement
  - Modify `config.yaml` to point to the CSV file containing the findings/flags obtained from the static analysis tools (`SAFE_TOOL_OUTPUT.csv` for the demo). `SAFE_TOOL_OUTPUT.csv` is the collection of flags produced by Semgrep and Trivy.
  - Place the repositories to analyze in the `ARTIFACT/` folder. The structure maps each `artifact_id` to a directory (e.g. `ARTIFACT/<artifact_id>/`). For the demo, the `ARTIFACT/` folder contains a copy of the SAFE codebase.

- Configuration Settings
  Edit `config.yaml` to select the model string, paths, and the maximum retry count for schema validation.
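For reference, a minimal `config.yaml` might look like the sketch below. `input_csv` and `artifact_root` match the demo settings used elsewhere in this README; the remaining key names are illustrative and may differ from SAFE's actual schema:

```yaml
# Path to the static-analysis findings CSV
input_csv: SAFE_TOOL_OUTPUT.csv
# Root folder containing one directory per artifact_id
artifact_root: ARTIFACT/
# Model string passed to the LLM provider (key name illustrative)
model: gpt-4o-mini
# Maximum retries when schema validation fails (key name illustrative)
max_validation_retries: 3
```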
- Run the auditor after completing setup

  ```bash
  python main.py
  ```

- Check logs and outputs

  ```bash
  cat outputs/logs/system.log
  cat outputs/logs/<finding_uid>.json
  cat outputs/logs/repo_<artifact_id>.log
  ```
This repository already includes a ready-to-run demo setup so you can test SAFE without preparing any external data.

- `SAFE_TOOL_OUTPUT.csv` → findings generated from SAFE's own codebase (via Semgrep + Trivy)
- `ARTIFACT/SAFE/` → a copy of the SAFE repository used as the target artifact
- `artifact_id` in the CSV (`SAFE`) resolves to `ARTIFACT/SAFE/`
Run the demo with:

```bash
python main.py
```

To run SAFE on your own data, place the target repository under `ARTIFACT/`:

```
ARTIFACT/
└── my_project/
```

Generate findings with the static analysis tools:

```bash
semgrep scan --config auto --json > semgrep.json
trivy fs --format json --output trivy.json .
```

Example entry in the findings CSV (semicolon-separated):

```
artifact_id;tool;finding_id;category;severity_raw;file;line;message;package;version;cwe;cvss;scanner_applicable
my_project;semgrep;python.lang.security.eval;code;HIGH;app.py;10;Use of eval;;;"CWE-95";;true
```
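The findings CSV is semicolon-delimited and can be read with the standard `csv` module. The `load_findings` helper below is an illustrative sketch, not SAFE's actual loader:

```python
import csv
import io

# Findings CSVs use ';' as the delimiter; empty fields (package, version, ...) are allowed.
SAMPLE = """artifact_id;tool;finding_id;category;severity_raw;file;line;message;package;version;cwe;cvss;scanner_applicable
my_project;semgrep;python.lang.security.eval;code;HIGH;app.py;10;Use of eval;;;"CWE-95";;true
"""

def load_findings(text):
    """Parse semicolon-separated findings into dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))

findings = load_findings(SAMPLE)
print(findings[0]["artifact_id"], findings[0]["severity_raw"], findings[0]["cwe"])
```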
Edit `config.yaml`:

```yaml
input_csv: my_project_findings.csv
artifact_root: ARTIFACT/
```

Then run:

```bash
python main.py
```

Outputs are written to:

- `outputs/final_results.csv`
- `outputs/logs/`
- `outputs/costs/`

Each `artifact_id` in the CSV must map to a directory `ARTIFACT/<artifact_id>/`.
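The `artifact_id` lookup amounts to a directory check under the artifact root. `resolve_artifact` is a hypothetical sketch of what `tools/artifact_resolver.py` does; the real implementation may differ:

```python
import tempfile
from pathlib import Path

def resolve_artifact(artifact_root, artifact_id):
    """Map a CSV artifact_id to its repository directory under the root, or None if absent."""
    candidate = Path(artifact_root) / artifact_id
    return candidate if candidate.is_dir() else None

# Demo with a throwaway ARTIFACT/ layout containing one artifact.
root = Path(tempfile.mkdtemp())
(root / "SAFE").mkdir()
found = resolve_artifact(root, "SAFE")
missing = resolve_artifact(root, "not_there")  # triggers the "missing artifact" skip in SAFE
```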
- Initialization: The script reads findings from `SAFE_TOOL_OUTPUT.csv`.
- Artifact Resolution: For each row, the script resolves the `artifact_id` against the `ARTIFACT/` root to locate the local Git repository matching the finding.
- Agent Delegation: The finding metadata (file, message, severity, line) is passed to the LangGraph ReAct agent.
- Tool Use: The agent dynamically executes local Python functions exposed to it (`get_repo_tree`, `read_snippet`, `search_package_usage`, etc.), inspecting the filesystem for evidence.
- JSON Emission: The agent concludes its analysis loop by emitting a final, strict JSON evaluation that maps the finding into the standard taxonomy (e.g. `CONTEXTUAL_RISK` vs `FALSE_POSITIVE`).
- Validation: Pydantic logic strictly enforces that the final payload complies with the schema. If it does not, the LangGraph loop repeats, asking the LLM to fix the formatting.
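The validate-and-retry loop can be approximated as follows. SAFE uses Pydantic for the actual schema check; this sketch uses only the standard library, and the verdict set and function names are illustrative:

```python
import json

# Illustrative subset of the taxonomy labels; SAFE's full taxonomy may contain more.
ALLOWED_VERDICTS = {"CONTEXTUAL_RISK", "FALSE_POSITIVE"}

def validate_payload(raw):
    """Return (payload, None) if the agent's JSON is valid, else (None, error message)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc}"
    if data.get("verdict") not in ALLOWED_VERDICTS:
        return None, f"verdict must be one of {sorted(ALLOWED_VERDICTS)}"
    return data, None

def audit_with_retries(call_llm, max_retries=3):
    """Re-prompt the model with the validation error until the payload passes or retries run out."""
    feedback = None
    for _ in range(max_retries):
        payload, feedback = validate_payload(call_llm(feedback))
        if payload is not None:
            return payload
    return None

# Fake model: emits an out-of-taxonomy verdict first, then a valid one after feedback.
replies = iter(['{"verdict": "MAYBE"}', '{"verdict": "FALSE_POSITIVE"}'])
result = audit_with_retries(lambda feedback: next(replies))
```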
All outputs from the framework are routed into the `outputs/` directory; logs and cost records are created dynamically at both the global and per-finding level, with concurrent file locks guarding shared files.

- `outputs/logs/system.log`: Standard operational log covering tool run initialization, finding traversal, schema retries, etc.
- `outputs/logs/error.log`: Isolates stack traces and validation aborts.
- `outputs/logs/<finding_uid>.log`: Detailed debug log tracing the exact LangGraph inputs, raw node chains, precise tool parameters, and tool return data for a single finding.
- `outputs/logs/<finding_uid>.json`: The final validated Pydantic payload returned upon successfully analyzing a finding.
- `outputs/costs/global_costs.json`: An append-only aggregate of the combined total cost in dollars, calculated for the configured LLM.
- `outputs/costs/<finding_uid>_cost.json`: The isolated USD cost, measured from the prompt/completion token usage for that single finding.
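A per-finding cost is derived from token counts and per-token pricing. The sketch below uses hypothetical per-million-token prices; real rates depend on the provider and the model selected in `config.yaml`:

```python
# Hypothetical (prompt, completion) USD prices per 1M tokens; check your provider's rate card.
PRICES_PER_M = {"gpt-4o-mini": (0.15, 0.60)}

def finding_cost(model, prompt_tokens, completion_tokens):
    """USD cost of investigating one finding, from its token usage."""
    p_in, p_out = PRICES_PER_M[model]
    return (prompt_tokens * p_in + completion_tokens * p_out) / 1_000_000

# Example: a long repo-inspection loop with a short final verdict.
cost = finding_cost("gpt-4o-mini", prompt_tokens=120_000, completion_tokens=4_000)
```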
- Missing Artifact: If the log states "Skipping due to missing artifact path", ensure the `artifact_id` column in your CSV matches a folder name directly inside `ARTIFACT/`.
- API Timeout/Limits: Switch to a smaller model (e.g. `gpt-4o-mini`) via `config.yaml` if you are hitting aggressive rate limits while the agent queries the model concurrently.
- Missing API Keys: The system fails immediately at startup. Set your keys via `export` before invoking.