Skip to content

Automated code scanning tool using local LLMs to find bugs, security vulnerabilities, and code quality issues

License

Notifications You must be signed in to change notification settings

ajokela/llm-code-scanner

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LLM Code Scanner

An automated code scanning tool that uses local LLMs to find bugs, security vulnerabilities, and code quality issues in your codebase.

Features

  • Nightly Scans: Fast scanning using vLLM with OpenAI-compatible API
  • Weekly Deep Scans: Thorough analysis using Ollama with larger models
  • Finding Deduplication: SQLite-based tracking prevents duplicate issues
  • JIRA Integration: Automatically create tickets for findings
  • Email Reports: Daily and weekly summaries via SendGrid
  • Multi-Language Support: Python, Rust, TypeScript, Kotlin, Swift, Go, and more

Requirements

  • Python 3.10+
  • vLLM server for nightly scans
  • Ollama for weekly deep scans (optional)
  • SendGrid account for email reports (optional)
  • JIRA account for ticket creation (optional)

Installation

# Clone the repository
git clone https://github.com/ajokela/llm-code-scanner.git
cd llm-code-scanner

# Install dependencies
pip install -r requirements.txt

# Copy and configure environment
cp .env.example .env
# Edit .env with your settings

Configuration

Scanner Configuration

Edit config/scanner_config.yaml for nightly scans:

vllm:
  host: "localhost"
  port: 8000
  model: "Qwen/Qwen2.5-Coder-7B-Instruct"

repositories:
  - name: "my-project"
    path: "my-project"
    scan_patterns:
      - "**/*.py"
    exclude_patterns:
      - "__pycache__/**"

Environment Variables

# vLLM connection
VLLM_HOST=localhost
VLLM_PORT=8000

# Ollama connection (for weekly scans)
OLLAMA_HOST=localhost
OLLAMA_PORT=11434

# Projects root directory
PROJECTS_ROOT=/path/to/projects

# JIRA (optional)
JIRA_DOMAIN=your-domain.atlassian.net
JIRA_EMAIL=your-email@example.com
JIRA_API_KEY=your-api-key

# Email (optional)
SENDGRID_API_KEY=your-sendgrid-key
EMAIL_FROM=scanner@example.com
EMAIL_TO=team@example.com

Usage

Run Nightly Scanner

# Set up vLLM server first
python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen2.5-Coder-7B-Instruct \
  --host 0.0.0.0 --port 8000

# Run scanner
python -m agent.scanner

Run Weekly Deep Scanner

# Ensure Ollama is running with your model
ollama run llama3:70b

# Run weekly scanner
python -m agent.weekly_scanner

Docker

# Build
docker build -t llm-code-scanner .

# Run (mount your projects)
docker run -v /path/to/projects:/projects llm-code-scanner

Project Structure

llm-code-scanner/
├── agent/
│   ├── __init__.py
│   ├── scanner.py         # Nightly scanner using vLLM
│   ├── weekly_scanner.py  # Weekly deep scanner using Ollama
│   ├── schemas.py         # Pydantic models
│   ├── tools.py           # File reading and search tools
│   ├── dedup.py           # Finding deduplication
│   └── email_reporter.py  # SendGrid email reports
├── config/
│   ├── scanner_config.yaml       # Nightly scan config
│   └── weekly_deep_config.yaml   # Weekly scan config
├── prompts/
│   ├── system.txt         # System prompt
│   ├── analysis.txt       # Analysis prompt
│   └── weekly/            # Deep scan prompts
├── data/                  # SQLite database (gitignored)
└── results/               # Scan results (gitignored)

How It Works

  1. File Discovery: Scans configured repositories for files matching patterns
  2. LLM Analysis: Sends code to LLM with analysis prompts
  3. Finding Extraction: Parses structured JSON findings from LLM response
  4. Deduplication: Checks against database to avoid duplicate tickets
  5. Ticket Creation: Creates JIRA tickets for new findings
  6. Reporting: Sends email summaries

Customization

Custom Prompts

Edit files in prompts/ to customize what the LLM looks for:

  • system.txt: Overall instructions and focus areas
  • analysis.txt: Specific analysis checklist
  • weekly/: Deep analysis prompts for weekly scans

Adding Repositories

Add entries to the repositories section in your config:

repositories:
  - name: "my-rust-project"
    path: "my-rust-project"
    languages: ["rust"]
    scan_patterns:
      - "src/**/*.rs"
    exclude_patterns:
      - "target/**"
    priority: 1  # Lower = higher priority

License

BSD 3-Clause License. See LICENSE for details.

About

Automated code scanning tool using local LLMs to find bugs, security vulnerabilities, and code quality issues

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages