melon1010/text2sql

Text2SQL SDK

An open-source Java 17 Text2SQL SDK that includes the core reasoning chain, a Spring Boot starter, and runnable examples. It relies on Qdrant (vector search), Elasticsearch (keyword search and few-shot recall), and pluggable LLMs.

Features

  • Hybrid search: Qdrant column vector search combined with Elasticsearch BM25 keyword search, with metadata filtering and reranking.
  • NL2SQL pipeline: rewrite → reasoning → SQL generation, with a fallback template generator.
  • Multi-provider support: 5 LLM providers and 4+ embedding providers (listed under Supported Providers).
  • Caching & safety: local TTL cache, SQL injection detection, and automatic LIMIT protection.
  • Extensible: schema extraction, vectorization, search, and SQL generation are separate modules, so each can be replaced independently.
  • Spring ecosystem: starter provides configuration binding, auto-configuration, and health checks.
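
The "automatic LIMIT protection" and injection-detection features above can be pictured with a small guard that runs on generated SQL before execution. The following is a minimal sketch only: `SqlGuard`, `isSingleSelect`, and `enforceLimit` are hypothetical names, not the SDK's API, and the checks are deliberately conservative (for example, a semicolon inside a string literal is also rejected).

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of the SDK's documented API. */
final class SqlGuard {
    // Matches a trailing "LIMIT n", "LIMIT m, n", or "LIMIT n OFFSET m" clause.
    private static final Pattern TRAILING_LIMIT = Pattern.compile(
            "\\blimit\\s+\\d+(\\s*,\\s*\\d+|\\s+offset\\s+\\d+)?\\s*$",
            Pattern.CASE_INSENSITIVE);

    private SqlGuard() {}

    /** Accepts only a single SELECT statement without comment markers. */
    static boolean isSingleSelect(String sql) {
        String s = stripTrailingSemicolons(sql);
        return s.toLowerCase(Locale.ROOT).startsWith("select")
                && !s.contains(";")    // no stacked statements
                && !s.contains("--")   // no line comments
                && !s.contains("/*");  // no block comments
    }

    /** Appends "LIMIT maxRows" when the statement has no trailing LIMIT clause. */
    static String enforceLimit(String sql, int maxRows) {
        String s = stripTrailingSemicolons(sql);
        if (TRAILING_LIMIT.matcher(s).find()) {
            return s;
        }
        return s + " LIMIT " + maxRows;
    }

    private static String stripTrailingSemicolons(String sql) {
        String s = sql.strip();
        while (s.endsWith(";")) {
            s = s.substring(0, s.length() - 1).strip();
        }
        return s;
    }
}
```

A caller would typically reject the statement when `isSingleSelect` returns false and otherwise execute the result of `enforceLimit`. A production guard would more likely parse the SQL than rely on string checks.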

Supported Providers

LLM Providers

Provider    | Models                                           | Notes
------------|--------------------------------------------------|------------------------------------------
OpenAI      | gpt-4, gpt-4-turbo, gpt-3.5-turbo                | Standard OpenAI API
Deepseek    | deepseek-chat, deepseek-coder, deepseek-reasoner | OpenAI-compatible
Ollama      | llama3, qwen2.5, mistral, codellama              | Local deployment
SiliconFlow | Qwen/Qwen2.5-72B-Instruct, DeepSeek-V3           | SiliconFlow (硅基流动), OpenAI-compatible
ZhipuAI     | glm-4, glm-4-flash, glm-3-turbo                  | Zhipu AI (智谱AI)

Embedding Providers

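Provider    | Models                                            | Dimensions (per model, in order)
------------|---------------------------------------------------|---------------------------------
OpenAI      | text-embedding-3-small, text-embedding-3-large    | 1536 / 3072
Ollama      | bge-large-zh, nomic-embed-text, mxbai-embed-large | 1024 / 768 / 1024
SiliconFlow | BAAI/bge-large-zh-v1.5, BAAI/bge-m3               | 1024 / 1024
ZhipuAI     | embedding-2, embedding-3                          | 1024 / 2048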
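
The dimension matters because a Qdrant collection is created with a fixed vector size, so the configured `dimension` has to match what the embedding model actually produces. The sketch below shows one way to catch a mismatch early. `EmbeddingDimensions` and `requireMatch` are hypothetical names, and the per-model values are a reading of the table above (for the Ollama row: bge-large-zh and mxbai-embed-large at 1024, nomic-embed-text at 768).

```java
import java.util.Map;

/** Hypothetical lookup for illustration; not part of the SDK's documented API. */
final class EmbeddingDimensions {
    // Model name -> output vector size, taken from the Embedding Providers table.
    static final Map<String, Integer> KNOWN = Map.of(
            "text-embedding-3-small", 1536,
            "text-embedding-3-large", 3072,
            "bge-large-zh", 1024,
            "nomic-embed-text", 768,
            "mxbai-embed-large", 1024,
            "BAAI/bge-large-zh-v1.5", 1024,
            "BAAI/bge-m3", 1024,
            "embedding-2", 1024,
            "embedding-3", 2048);

    private EmbeddingDimensions() {}

    /** Throws if a known model is paired with a different configured dimension. */
    static void requireMatch(String model, int configuredDimension) {
        Integer expected = KNOWN.get(model);
        if (expected == null) {
            return; // unknown model: nothing to compare against
        }
        if (expected != configuredDimension) {
            throw new IllegalArgumentException(
                    "Model " + model + " produces " + expected
                            + "-dimensional vectors, but dimension is set to "
                            + configuredDimension);
        }
    }
}
```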

Modules

  • text2sql-core: core logic and APIs (schema management, search, NL2SQL, cache, security).
  • text2sql-spring-boot-starter: auto-configuration and health checks.
  • text2sql-examples: basic usage, Spring Boot integration, and advanced samples.

Requirements

  • JDK 17, Maven 3.9+
  • Elasticsearch (default index text2sql_schemas)
  • Qdrant (default collection text2sql_schemas)
  • One of the supported LLM providers
  • One of the supported Embedding providers

Quickstart

1. Build

mvn -pl text2sql-core -am package -DskipTests

2. Configure LLM Provider

// Pick ONE provider; the assignments below are alternatives, not a sequence.

// Deepseek
LlmConfig llmConfig = LlmConfig.deepseek("sk-your-api-key");

// OpenAI
LlmConfig llmConfig = LlmConfig.openai("sk-your-api-key", "gpt-4");

// Ollama (local)
LlmConfig llmConfig = LlmConfig.ollama("qwen2.5:14b");

// SiliconFlow
LlmConfig llmConfig = LlmConfig.siliconflow("sf-your-api-key");

// ZhipuAI
LlmConfig llmConfig = LlmConfig.zhipu("your-api-key");
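
Hard-coded keys like the placeholders above are best replaced by environment variables in real use. The sketch below is one way to fail fast when a key is missing. `ApiKeys` and its methods are hypothetical helpers, not part of the SDK; the lookup function is a parameter only so the logic can be exercised without touching the real environment.

```java
import java.util.function.UnaryOperator;

/** Hypothetical helper for illustration; not part of the SDK's documented API. */
final class ApiKeys {
    private ApiKeys() {}

    /** Reads the key from the process environment. */
    static String fromEnv(String variableName) {
        return require(variableName, System::getenv);
    }

    /** Resolves the key through the given lookup and rejects missing or blank values. */
    static String require(String variableName, UnaryOperator<String> lookup) {
        String value = lookup.apply(variableName);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException(
                    "Environment variable " + variableName + " is not set");
        }
        return value.strip();
    }
}
```

With this in place, a configuration line could read `LlmConfig.deepseek(ApiKeys.fromEnv("DEEPSEEK_API_KEY"))`, so a missing key surfaces as a clear error at startup rather than as a failed API call later.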

3. Use the Client

Text2SqlClient client = Text2SqlClient.builder()
    .datasource(DatasourceConfig.builder()
        .name("demo")
        .url("jdbc:mysql://localhost:3306/demo")
        .username("user")
        .password("pass")
        .build())
    .qdrant(QdrantConfig.builder()
        .host("http://localhost")
        .port(6333)
        .embeddingUrl("http://localhost:11434/api/embeddings")
        .embeddingModel("bge-large-zh")
        .dimension(1024)
        .build())
    .elasticsearch(EsConfig.builder().host("http://localhost").port(9200).build())
    .llm(LlmConfig.deepseek(System.getenv("DEEPSEEK_API_KEY")))
    .buildAndInitialize();

String sql = client.query("Total order amount in the last 7 days");
QueryResponse result = client.queryAndExecute("Top-selling products", QueryOptions.defaults());
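
The Features section describes the NL2SQL pipeline as rewrite → reasoning → SQL generation with a fallback template generator. How the SDK wires these stages internally is not shown in this README, so the following is only a conceptual sketch under that description. `Nl2SqlPipeline` and its four stage parameters are hypothetical; each stage is a plain function so the control flow is visible.

```java
import java.util.Optional;
import java.util.function.Function;
import java.util.function.UnaryOperator;

/** Conceptual sketch of a staged NL2SQL flow; all names here are hypothetical. */
final class Nl2SqlPipeline {
    private final UnaryOperator<String> rewriter;                 // normalizes the question
    private final UnaryOperator<String> reasoner;                 // turns it into a plan
    private final Function<String, Optional<String>> generator;   // plan -> SQL, may fail
    private final UnaryOperator<String> fallbackTemplate;         // question -> template SQL

    Nl2SqlPipeline(UnaryOperator<String> rewriter,
                   UnaryOperator<String> reasoner,
                   Function<String, Optional<String>> generator,
                   UnaryOperator<String> fallbackTemplate) {
        this.rewriter = rewriter;
        this.reasoner = reasoner;
        this.generator = generator;
        this.fallbackTemplate = fallbackTemplate;
    }

    /** Runs the stages in order and uses the template when generation yields nothing. */
    String toSql(String question) {
        String rewritten = rewriter.apply(question);
        String plan = reasoner.apply(rewritten);
        try {
            return generator.apply(plan)
                    .filter(sql -> !sql.isBlank())
                    .orElseGet(() -> fallbackTemplate.apply(rewritten));
        } catch (RuntimeException e) {
            // Generation failed (for example, an LLM call error): use the template.
            return fallbackTemplate.apply(rewritten);
        }
    }
}
```

In this sketch the fallback receives the rewritten question rather than the plan, on the assumption that a template generator works from the user's intent instead of the LLM's intermediate reasoning.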

Spring Boot Configuration

text2sql:
  llm:
    provider: DEEPSEEK  # OPENAI, DEEPSEEK, OLLAMA, SILICONFLOW, ZHIPU
    api-key: ${DEEPSEEK_API_KEY}
    model-name: deepseek-chat
  qdrant:
    embeddingUrl: http://localhost:11434/api/embeddings
    embeddingModel: bge-large-zh
    dimension: 1024

Development

  • Java 17 + Lombok; run mvn -pl text2sql-core -am package -DskipTests before submitting a PR.
  • Core module keeps pluggable interfaces (cache, LLM, vector/keyword stores) lightweight.
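
To illustrate what a lightweight pluggable cache interface could look like, here is a minimal sketch of a local TTL cache behind a two-method interface. `SqlCache` and `TtlSqlCache` are hypothetical names and do not reflect the SDK's actual interfaces. The clock is injected as a `LongSupplier` so expiry can be tested deterministically.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical cache contract; any implementation (local, Redis, ...) could plug in. */
interface SqlCache {
    Optional<String> get(String question);

    void put(String question, String sql);
}

/** In-memory implementation that expires entries after a fixed time-to-live. */
final class TtlSqlCache implements SqlCache {
    private record Entry(String sql, long expiresAtMillis) {}

    private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clockMillis;

    TtlSqlCache(Duration ttl, LongSupplier clockMillis) {
        this.ttlMillis = ttl.toMillis();
        this.clockMillis = clockMillis;
    }

    @Override
    public Optional<String> get(String question) {
        Entry entry = entries.get(question);
        if (entry == null) {
            return Optional.empty();
        }
        if (clockMillis.getAsLong() >= entry.expiresAtMillis()) {
            // Remove only if the entry was not replaced in the meantime.
            entries.remove(question, entry);
            return Optional.empty();
        }
        return Optional.of(entry.sql());
    }

    @Override
    public void put(String question, String sql) {
        entries.put(question, new Entry(sql, clockMillis.getAsLong() + ttlMillis));
    }
}
```

In production code the clock would usually be `System::currentTimeMillis`. Expired entries here are evicted lazily on read, which keeps the sketch short but means unread entries stay in memory until they are looked up again.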

Contributing

Issues and PRs are welcome! See CONTRIBUTING.md for guidelines.

License

Apache License 2.0, see LICENSE.
