
Enhance security and add vLLM model support with deployment improvements #41

Open
ace-xc wants to merge 4 commits into premAI-io:main from ace-xc:security-hardening

Conversation


@ace-xc ace-xc commented Apr 14, 2026

Summary

This PR addresses security vulnerabilities (#40), fixes dependency conflicts (#37), and adds new features for better model support and deployment experience.


Dependency Fixes (Issue #37)

Fixed version conflicts reported during installation:

| Conflict | Cause | Resolution |
| --- | --- | --- |
| `httpx<0.29` vs `>=0.27` required by ollama/browser-use | fastapi 0.112 pinned old httpx | Update fastapi to `>=0.115.0` |
| `starlette>=0.41.3` vs `0.38.6` | fastapi 0.112 pinned old starlette | Update fastapi to `>=0.115.0` |
| `python==3.12` not compatible | Version range `^3.10` | Extend to `>=3.10,<3.13` |

Changes in pyproject.toml:

# Before
python = "^3.10"
fastapi = "^0.112.0"

# After
python = ">=3.10,<3.13"  # Support 3.11, 3.12
fastapi = ">=0.115.0"    # Brings httpx>=0.27, starlette>=0.41
httpx = ">=0.27.0"       # Explicit for clarity
starlette = ">=0.41.0"   # Explicit for clarity

Security Fixes (Issue #40)

Critical Severity

  • RCE via eval() → Replace with ast.literal_eval()
  • SSRF → normalize_base_url() restricts base URLs to loopback addresses
  • Missing Authentication → Token-based auth via PREMSQL_API_TOKEN
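The eval() replacement can be illustrated with a minimal sketch (the function name here is hypothetical; the PR's actual parsing helper may differ):

```python
import ast

def parse_model_output(raw: str):
    """Parse a Python-literal payload from model output without executing code.

    ast.literal_eval() accepts only literals (strings, numbers, tuples,
    lists, dicts, sets, booleans, None), so a payload such as
    "__import__('os').system(...)" raises an error instead of running.
    """
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return None
```

With eval(), the second kind of input would have executed an arbitrary shell command; with literal_eval it is simply rejected.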

High Severity

  • SQL Write Operations → enforce_read_only_sql() blocks INSERT/UPDATE/DELETE/DROP
  • Path Traversal → Whitelist validation: [A-Za-z0-9_-]{1,64}
  • Information Disclosure → Removed sensitive fields from API responses
  • Error Message Leakage → safe_error_message() whitelist mechanism
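A sketch of the read-only enforcement and the session-name whitelist described above (the helper names follow the PR's description, but the exact implementations may differ):

```python
import re

# Whitelist from the PR description: [A-Za-z0-9_-]{1,64}
_SESSION_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")

def enforce_read_only_sql(sql: str) -> str:
    """Allow only a single SELECT/WITH statement; reject everything else."""
    stripped = sql.strip().rstrip(";").strip()
    if ";" in stripped:
        raise ValueError("multiple SQL statements are not allowed")
    if not stripped.upper().startswith(("SELECT", "WITH")):
        raise ValueError("only read-only queries are permitted")
    return stripped

def validate_session_name(name: str) -> str:
    """Reject any name outside the whitelist (blocks '../', quotes, etc.)."""
    if not _SESSION_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid session name: {name!r}")
    return name
```

The whitelist approach ("control input") validates against the set of known-good characters rather than trying to enumerate dangerous ones, so path-traversal sequences like `../` fail by construction.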

Medium Severity

  • Pickle RCE → Add weights_only=True to torch.load()
  • Swagger Exposure → Restrict access based on DEBUG mode
  • Process Management → PID file-based precise process tracking
  • SQL Injection → Parameterized queries with ? placeholders
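The parameterized-query fix can be demonstrated with sqlite3 (an illustrative snippet, not code from the PR):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (name TEXT)")
conn.execute("INSERT INTO sessions VALUES ('demo')")

# String formatting would let the quote in user_input break out of the
# literal; a ? placeholder binds the whole value as data instead.
user_input = "demo' OR '1'='1"
rows = conn.execute(
    "SELECT name FROM sessions WHERE name = ?", (user_input,)
).fetchall()
# The malicious string matches no row, so the injection attempt is inert.
```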

New Features

1. LLM Provider Support

Added new generators for self-hosted and custom LLM deployments:

| Generator | Use Case |
| --- | --- |
| `Text2SQLGeneratorVLLM` | vLLM-deployed models (automatic Qwen3 thinking-mode handling) |
| `Text2SQLGeneratorOpenAICompatible` | Any OpenAI-compatible API (LM Studio, LocalAI, etc.) |

Usage:

from premsql.generators import Text2SQLGeneratorVLLM

generator = Text2SQLGeneratorVLLM(
    model_name="/models/qwen",
    base_url="http://localhost:8000/v1",
    experiment_name="test", type="test"
)

2. Easy Deployment Script

Added start_agent.py for one-command AgentServer startup:

# Configure in .env, then:
python start_agent.py

Auto-detects configured LLM provider from environment variables.
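The auto-detection could look roughly like this (a sketch built from the environment variables this PR documents; the actual priority order in start_agent.py may differ):

```python
import os

def detect_provider(env=None):
    """Pick an LLM provider from environment variables; first match wins."""
    env = os.environ if env is None else env
    if env.get("VLLM_BASE_URL"):
        return "vllm"
    if env.get("CUSTOM_BASE_URL"):
        return "openai-compatible"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    raise RuntimeError("no LLM provider configured in .env")
```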

3. Bug Fixes

| Fix | Description |
| --- | --- |
| Plot image generation | `plot_image=False` was hardcoded in server_mode; now generates base64 images |
| Matplotlib backend | Added the `Agg` backend for non-interactive server mode |
| Session duplicate error | Auto-delete the existing session when creating a new one with the same name |
| Session deletion memory | Recreate the memory DB table after deletion to prevent `OperationalError` |
| API token propagation | Django backend now passes the API token to AgentServer |
| Environment loading | Added dotenv loading in Django `manage.py` and Streamlit `main.py` |

4. UI Improvements

  • Session list now shows delete button for each session (no need to type session name)
  • Delete operation auto-refreshes the page
  • Simplified session creation form with placeholder hints

Configuration

Environment Variables (.env)

All configuration is optional for local development - tokens are auto-generated if not set.

# Security (auto-generated for local dev)
#PREMSQL_API_TOKEN=your-token-here
#PREMSQL_DJANGO_SECRET_KEY=your-secret-here

# LLM Provider (choose one)
VLLM_BASE_URL=http://localhost:8000/v1
VLLM_MODEL_NAME=/models/your-model

# Or custom OpenAI-compatible service
CUSTOM_BASE_URL=http://localhost:1234/v1
CUSTOM_MODEL_NAME=local-model

# Or official OpenAI
OPENAI_API_KEY=sk-your-key
OPENAI_MODEL_NAME=gpt-4o-mini

Quick Start

# 1. Copy and configure .env
cp .env.example .env
# Edit .env to set your LLM provider

# 2. Start services
python start_agent.py          # AgentServer on port 8100
premsql launch all             # Django + Streamlit

# 3. Open browser
http://localhost:8501          # PremSQL Playground

Security Configuration

Development Mode (Recommended for Local Testing)

Set PREMSQL_DJANGO_DEBUG=true to enable development mode:

# .env for development
PREMSQL_DJANGO_DEBUG=true
# No need to set PREMSQL_API_TOKEN - auto-generated

In development mode:

  • API token is auto-generated (printed in startup logs)
  • Authentication is skipped for all services
  • Suitable for local testing only

Production Mode (Required for Deployment)

IMPORTANT: Production mode requires explicit token configuration:

# .env for production
PREMSQL_DJANGO_DEBUG=false
PREMSQL_API_TOKEN=<your-strong-random-token>  # REQUIRED!
PREMSQL_DJANGO_SECRET_KEY=<your-strong-random-secret>

How to generate secure tokens:

# Using Python (recommended)
python -c "import secrets; print(secrets.token_hex(32))"

# Using OpenSSL
openssl rand -hex 32

# Using UUID
python -c "import uuid; print(uuid.uuid4().hex)"

Example output: a1b2c3d4e5f6... (a 64-character hex string)

Security behavior summary:

| Mode | PREMSQL_API_TOKEN | Behavior |
| --- | --- | --- |
| Development | Not set | Auto-generated, auth skipped |
| Development | Set | Use configured token |
| Production | Not set | Service fails to start |
| Production | Set | Enforce authentication |
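The summary above reduces to a small decision function (a sketch of the policy, not the PR's actual startup code):

```python
import os
import secrets

def resolve_api_token(env=None):
    """Apply the token policy: auto-generate in development,
    require an explicit PREMSQL_API_TOKEN in production."""
    env = os.environ if env is None else env
    debug = env.get("PREMSQL_DJANGO_DEBUG", "false").lower() == "true"
    token = env.get("PREMSQL_API_TOKEN")
    if token:
        return token
    if debug:
        # Development: generate a throwaway token (the PR prints it to logs).
        return secrets.token_hex(32)
    raise RuntimeError("PREMSQL_API_TOKEN is required when DEBUG is false")
```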

Resolves #37 #40

ace-xc and others added 4 commits April 13, 2026 17:42
This PR addresses multiple security vulnerabilities:

## Critical Fixes
- Replace eval() with ast.literal_eval() for model output parsing (RCE prevention)
- Add SSRF protection: restrict base_url to loopback addresses only
- Implement API authentication via PREMSQL_API_TOKEN environment variable
- Enforce read-only SQL execution (SELECT/WITH only)

## High Priority Fixes
- Add session_name validation with whitelist regex [A-Za-z0-9_-]{1,64}
- Implement path traversal protection with resolve_path_within_root()
- Add safe_error_message() whitelist mechanism for error handling
- Remove sensitive fields from API responses (db_connection_uri, session_db_path)

## Medium Priority Fixes
- Add weights_only=True to torch.load() calls (pickle RCE prevention)
- Restrict Swagger API access based on DEBUG mode
- Replace pkill with PID file-based process management
- Add resource limits for upload endpoints

## Database Changes
- db_connection_uri: URLField -> CharField (sqlite paths are not URLs)
- session_db_path: add blank=True, default=""
- Completions: add agent_output JSONField

All fixes follow the "control input" whitelist approach.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Security Fixes:
- RCE via eval() → ast.literal_eval()
- SSRF → loopback-only URLs
- Missing auth → token-based authentication
- SQL write ops → read-only enforcement
- Path traversal → whitelist validation
- Error message leakage → safe_error_message()
- Pickle RCE → weights_only=True
- SQL injection → parameterized queries

New Features:
- Text2SQLGeneratorVLLM for vLLM deployments (auto Qwen3 thinking mode)
- Text2SQLGeneratorOpenAICompatible for any OpenAI-compatible API
- start_agent.py for easy deployment
- UI improvements: delete buttons for sessions

Bug Fixes:
- Plot image generation in server mode
- Matplotlib Agg backend for non-interactive mode
- Session duplicate/conflict handling
- API token propagation between services
- Dotenv loading in Django and Streamlit

Security Configuration:
- DEBUG=true: auto-generated tokens, auth skipped (development)
- DEBUG=false: PREMSQL_API_TOKEN required (production)
- Update Python version range: >=3.10,<3.13 (support 3.11, 3.12)
- Update fastapi: >=0.115.0 (brings httpx>=0.27, starlette>=0.41)
- Add explicit httpx: >=0.27.0 (compatible with ollama, browser-use)
- Add explicit starlette: >=0.41.0 (compatible with sse-starlette)
- Relax uvicorn: >=0.32.0

Resolves premAI-io#37


Development

Successfully merging this pull request may close these issues.

[check] python==3.12 not compatible
