Add full-text search (FTS) tools to MCP query server by rahim-kanji · Pull Request #21 · ProxySQL/proxysql-vec

rahim-kanji · 2026-01-20T09:22:57Z

Summary

Add full-text search (FTS) capability with BM25 ranking for MySQL/MariaDB tables
Implement external SQLite-based FTS index that doesn't require native MySQL FTS support
Add 6 new FTS tools for index management and searching

New Tools Added

Tool	Description
`fts_index_table`	Create and populate FTS index for a table
`fts_search`	Search indexed data with BM25 ranking
`fts_reindex`	Refresh index with fresh data
`fts_delete_index`	Remove an FTS index
`fts_list_indexes`	List all FTS indexes
`fts_rebuild_all`	Rebuild all FTS indexes

Key Features

Thread-safe FTS operations with mutex protection
Optional WHERE clause support for partial table indexing
Primary key mapping for source data retrieval
Configurable FTS database path (runtime resettable)
Comprehensive test coverage with 2 test scripts

Technical Details

Uses SQLite FTS5 extension for inverted index
BM25 algorithm for relevance ranking
Stores indexed columns + metadata externally
Maintains 1:1 mapping with MySQL primary keys
Separate from MySQL - no performance impact on writes

Test Plan

✅ scripts/mcp/test_mcp_fts.sh - Basic FTS functionality
✅ scripts/mcp/test_mcp_fts_detailed.sh - Comprehensive FTS testing

Implement BM25-ranked full-text search capability for MySQL/MariaDB tables using SQLite-based external FTS index. Changes: - Add MySQL_FTS class for managing SQLite FTS indexes - Add FTS tools: fts_index_table, fts_search, fts_reindex, fts_delete_index, fts_list_indexes, fts_rebuild_all - Add thread-safe FTS lifecycle management with fts_lock mutex - Add reset_fts_path() for runtime FTS database path configuration - Add comprehensive FTS test scripts (test_mcp_fts.sh, test_mcp_fts_detailed.sh)

coderabbitai · 2026-01-20T09:23:40Z

📝 Walkthrough

Summary by CodeRabbit

New Features
- Added Full Text Search (FTS): index, search, list, delete, reindex, rebuild operations and runtime-configurable FTS database path.
Tests
- Added extensive automated and detailed FTS test suites covering index lifecycle, searches (phrase, pagination, BM25), snippets, special cases, and config reloads.
Documentation
- Added comprehensive FTS user guide with usage, configuration, examples, and troubleshooting.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

Walkthrough

Adds SQLite-backed Full Text Search (FTS): new mcp_fts_path config, MySQL_Tool_Handler FTS wiring and APIs, MCP variable get/set support, Query tools registration, build integration, MySQL_FTS implementation, tests, and user documentation.

Changes

Cohort / File(s)	Summary
Configuration Headers `include/MCP_Thread.h`, `include/MySQL_Tool_Handler.h`	Added `mcp_fts_path` config to MCP_Threads_Handler; extended `MySQL_Tool_Handler` API and members (FTS pointer, `fts_lock`), constructor now accepts `fts_path`, `reset_fts_path`, `reinit_fts`, multiple `fts_*` public methods; `execute_query` moved to public.
MCP Thread Logic `lib/MCP_Thread.cpp`	Registered `fts_path` variable, defaulted to `mcp_fts.db`, cleaned up on destroy, added `get_variable("fts_path")` and `set_variable("fts_path")` invoking `mysql_tool_handler->reset_fts_path(...)` at runtime.
MySQL Tool Implementation `lib/MySQL_Tool_Handler.cpp`	Implemented FTS lifecycle: constructor/destructor handling, conditional FTS creation, `init()` integration, `reset_fts_path`, `reinit_fts`, and thread-safe wrappers (`fts_index_table`, `fts_search`, `fts_list_indexes`, `fts_delete_index`, `fts_reindex`, `fts_rebuild_all`) with filesystem checks and locking.
FTS Core `include/MySQL_FTS.h`, `lib/MySQL_FTS.cpp`	New `MySQL_FTS` class and implementation: SQLite-backed index catalog, index/create/delete/reindex/rebuild/search/list APIs, schema/table helpers, JSON results, integration points accepting `MySQL_Tool_Handler` for data extraction.
Tool Registration `lib/Query_Tool_Handler.cpp`	Registered and implemented MCP tools: `fts_index_table`, `fts_search`, `fts_list_indexes`, `fts_delete_index`, `fts_reindex`, `fts_rebuild_all`; parameter extraction and dispatch to `MySQL_Tool_Handler`.
Build & Integration `lib/Makefile`, `lib/ProxySQL_MCP_Server.cpp`	Added `MySQL_FTS.oo` to lib objects; updated `MySQL_Tool_Handler` constructor call sites to pass `mcp_fts_path`.
Tests `scripts/mcp/test_mcp_fts.sh`, `scripts/mcp/test_mcp_fts_detailed.sh`	Added extensive integration and detailed test suites covering index lifecycle, search variants (BM25, snippets, pagination, special chars), reindex/delete/rebuild, config reloads, custom DB paths, and setup/teardown helpers.
Documentation `doc/MCP/FTS_USER_GUIDE.md`	New comprehensive user guide for MCP FTS: architecture, configuration, APIs, examples, troubleshooting, and testing instructions.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client/MCP
    participant QTH as Query_Tool_Handler
    participant MTH as MySQL_Tool_Handler
    participant FTS as MySQL_FTS
    participant DB as SQLite

    rect rgba(200,200,255,0.5)
    Client->>QTH: fts_index_table(schema, table, columns)
    QTH->>MTH: fts_index_table(...) [acquire fts_lock]
    MTH->>FTS: index_table(...)
    FTS->>DB: write index metadata / store index
    DB-->>FTS: OK
    FTS-->>MTH: JSON result
    MTH-->>QTH: return result
    QTH-->>Client: JSON response
    end

    rect rgba(200,255,200,0.5)
    Client->>QTH: fts_search(query, schema, table)
    QTH->>MTH: fts_search(...) [acquire fts_lock]
    MTH->>FTS: search(...)
    FTS->>DB: SELECT using FTS (MATCH/snippet)
    DB-->>FTS: rows + snippets
    FTS-->>MTH: JSON result
    MTH-->>QTH: return result
    QTH-->>Client: JSON response
    end

sequenceDiagram
    participant Admin as Admin/Config
    participant MCP as MCP_Threads_Handler
    participant MTH as MySQL_Tool_Handler
    participant OldFTS as MySQL_FTS (old)
    participant NewFTS as MySQL_FTS (new)

    rect rgba(255,230,200,0.5)
    Admin->>MCP: set_variable("fts_path", new_path)
    MCP->>MTH: reset_fts_path(new_path)
    MTH->>MTH: validate directory
    MTH->>NewFTS: create & initialize
    NewFTS-->>MTH: initialized
    MTH->>MTH: acquire fts_lock
    MTH->>OldFTS: delete old instance (if any)
    MTH->>MTH: swap fts pointer to NewFTS
    MTH->>MTH: release fts_lock
    MTH-->>MCP: success
    MCP-->>Admin: variable updated
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Possibly related PRs

MCP (Model Context Protocol) Multi-Endpoint Architecture #9: Modifies MCP_Threads_Handler and adds FTS-related wiring similar to this change (overlapping classes and variables).

Poem

🐰 I hopped through code, found FTS ground,
I planted indexes, snippets abound,
Locks snugly held, swaps done with care,
Queries return highlights everywhere,
A tiny rabbit cheered — search now shared!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 51.32% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title accurately summarizes the main change: adding full-text search (FTS) tools to the MCP query server, which aligns with the comprehensive FTS implementation across multiple files.
Description check	✅ Passed	The PR description is well-related to the changeset, providing a clear summary of FTS capabilities, new tools, features, and test coverage that matches the actual implementation changes.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 8

🤖 Fix all issues with AI agents

In `@lib/MCP_Thread.cpp`:
- Around line 333-347: Currently the code updates variables.mcp_fts_path (via
strdup/free) before calling mysql_tool_handler->reset_fts_path(value), which can
leave the stored config out of sync if reset_fts_path fails; change the flow so
you attempt the runtime update first (call
mysql_tool_handler->reset_fts_path(value) while leaving variables.mcp_fts_path
unchanged), and only on success free() the old variables.mcp_fts_path and assign
strdup(value), returning 0; if reset_fts_path fails, do not modify
variables.mcp_fts_path and return -1. Reference symbols: variables.mcp_fts_path,
mysql_tool_handler, reset_fts_path, strdup, free.

In `@lib/MySQL_Tool_Handler.cpp`:
- Around line 117-137: reset_fts_path deletes the existing FTS before validating
the new path; instead allocate a temporary MySQL_FTS* (e.g., new_fts) with the
provided path, call new_fts->init(), and if init() fails call proxy_error,
delete new_fts and return false while leaving the original fts intact; only on
successful init swap under the existing pthread_mutex_lock (&fts_lock) by
deleting the old fts, assigning fts = new_fts, and then unlock and return true;
also preserve current behavior when path is empty (delete existing fts) and
ensure no double-free by only deleting the old instance after new_fts
successfully initialized.
- Around line 1142-1178: reinit_fts currently mutates the shared
pointer/variable fts without acquiring the FTS mutex; to fix, perform the
expensive new MySQL_FTS(fts_path) and new_fts->init() before taking the lock,
then acquire fts_lock (the same mutex used by other FTS calls) and inside the
critical section delete the old fts and assign fts = new_fts, then release the
lock; use the same locking primitive as the rest of the code (e.g.,
std::lock_guard or equivalent on fts_lock) to ensure no race with in-flight
operations while avoiding holding the lock during initialization.

In `@scripts/mcp/test_mcp_fts_detailed.sh`:
- Around line 71-74: The mysql_exec function exposes MYSQL_PASSWORD on the
process command line via the -p option; change it to avoid inline password usage
by using a safer method such as setting MYSQL_PWD in the environment before
invoking mysql or by using a MySQL option file, e.g., export
MYSQL_PWD="${MYSQL_PASSWORD}" && mysql -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u
"${MYSQL_USER}" -e "${sql}" (and unset MYSQL_PWD afterward); update the
mysql_exec function and apply the same change to the similar invocation in
test_mcp_fts.sh to remove -p"${MYSQL_PASSWORD}" and prevent password leakage to
ps listings.
- Around line 251-260: The cleanup block currently unconditionally attempts to
delete fts_test.orders although it was already deleted earlier; update the
cleanup logic around the calls to tool_call "fts_delete_index" (and the
subsequent extract_tool_result) to either first check existence (e.g., call a
check/existence tool or API) and only call fts_delete_index for existing
indexes, or relax the jq assertion to accept either a success result or the
expected "index not found" error response (e.g., accept .success == true OR
.error indicates non-existent index) so the second delete does not cause the
test to fail.

In `@scripts/mcp/test_mcp_fts.sh`:
- Around line 992-993: The mysql invocations in test_fts_custom_database_path
(e.g., the command populating current_path and other mysql calls) expose the
password on the command line; replace direct use of -p"${MYSQL_PASSWORD}" with a
secure approach such as exporting MYSQL_PWD="${MYSQL_PASSWORD}" for the command
(e.g., MYSQL_PWD="${MYSQL_PASSWORD}" mysql -h "${MYSQL_HOST}" ...) or implement
a small helper function (e.g., safe_mysql()) that injects MYSQL_PWD into the
environment and runs mysql, then update all occurrences (current_path assignment
and the other mysql calls referenced) to use that helper to avoid password
exposure. Ensure the helper preserves -s -N and 2>/dev/null flags used in
existing calls.
- Around line 279-283: The mysql_exec function currently exposes the password on
the command line via -p"${MYSQL_PASSWORD}"; update mysql_exec to avoid that by
passing the password through the MYSQL_PWD environment variable instead (e.g.,
invoke mysql with MYSQL_PWD="${MYSQL_PASSWORD}" mysql -h "${MYSQL_HOST}" -P
"${MYSQL_PORT}" -u "${MYSQL_USER}" -e "${sql}" 2>/dev/null) so the password is
not visible in process listings; keep the function name mysql_exec and the same
parameters but remove the inline -p argument.
- Around line 341-343: The INSERT builds SQL with unescaped variables (title,
doc, category, priority) which allows quotes to break the statement; before
calling mysql_exec in the loop, sanitize each variable by replacing every
single-quote (') with two single-quotes ('') so the SQL string is safe (e.g.,
transform ${doc} -> escaped_doc), then use those escaped variables in the
INSERT; update the code around the mysql_exec call and reference the mysql_exec
invocation and the TEST_SCHEMA/TEST_TABLE variables when making the change.

🧹 Nitpick comments (7)

lib/Query_Tool_Handler.cpp (2)
278-300: Remove debug fprintf statements before merging.

These debug statements will pollute stderr in production. Convert to proxy_debug() calls for consistency with the rest of the codebase, or remove entirely.
♻️ Suggested fix
 static std::string get_json_string(const json& j, const std::string& key, const std::string& default_val = "") {
-	fprintf(stderr, "DEBUG: get_json_string key=%s, default='%s'\n", key.c_str(), default_val.c_str());
 	if (j.contains(key)) {
 		const json& val = j[key];
-		fprintf(stderr, "DEBUG: key exists, is_null=%d, is_string=%d\n", val.is_null(), val.is_string());
 		if (!val.is_null()) {
 			if (val.is_string()) {
 				std::string result = val.get<std::string>();
-				fprintf(stderr, "DEBUG: returning string: '%s'\n", result.c_str());
 				return result;
 			} else {
-				fprintf(stderr, "DEBUG: value is not a string, trying dump\n");
 				std::string result = val.dump();
-				fprintf(stderr, "DEBUG: returning dumped: '%s'\n", result.c_str());
 				return result;
 			}
 		}
 	}
-	fprintf(stderr, "DEBUG: returning default: '%s'\n", default_val.c_str());
 	return default_val;
 }
310-312: Debug fprintf in execute_tool should also be removed.

Same issue as the helper function - this debug statement will pollute stderr.
♻️ Suggested fix
 json Query_Tool_Handler::execute_tool(const std::string& tool_name, const json& arguments) {
-	fprintf(stderr, "DEBUG: execute_tool tool_name=%s, arguments=%s\n", tool_name.c_str(), arguments.dump().c_str());
-
 	if (!mysql_handler) {
 		return create_error_response("MySQL handler not initialized");
 	}
scripts/mcp/test_mcp_fts_detailed.sh (2)
101-106: Quote CURL_OPTS to prevent word splitting issues.

When CURL_OPTS is empty, the unquoted expansion can cause unexpected behavior. Use an array or conditional inclusion.
♻️ Suggested fix
 mcp_request() {
   local payload="$1"
-  curl ${CURL_OPTS} -s -X POST "${MCP_ENDPOINT}" \
+  curl ${CURL_OPTS:+"${CURL_OPTS}"} -s -X POST "${MCP_ENDPOINT}" \
     -H "Content-Type: application/json" \
     -d "${payload}"
 }
Or use an array approach for cleaner handling:
if [ "${USE_SSL}" = "true" ]; then
  CURL_OPTS=(-k)
else
  CURL_OPTS=()
fi

# Then use:
curl "${CURL_OPTS[@]}" -s -X POST ...
162-162: Columns argument passed as JSON string literal may cause issues.

The columns argument is passed as a raw JSON array string '["name","email","created_at"]' which is then interpolated into another JSON string. This works but is fragile. Consider validating the JSON structure.
scripts/mcp/test_mcp_fts.sh (3)
58-58: Unused variable FTS_INDEX_NAME.

This variable is defined but never used. Either remove it or use it where ${TEST_SCHEMA}.${TEST_TABLE} is repeated.
-FTS_INDEX_NAME="${TEST_SCHEMA}.${TEST_TABLE}"
74-81: Unused associative array SEARCH_QUERIES.

The SEARCH_QUERIES array is defined but the tests use hardcoded query strings instead. Remove or utilize this array for consistency.

356-359: Unused variable delete_response - consider checking for errors.

The response from fts_delete_index is captured but not validated. While cleanup failures may be acceptable, logging the error could help debugging.
♻️ Suggested improvement
     # Drop FTS index if exists
     local delete_response
     delete_response=$(fts_tool_call "fts_delete_index" "{\"schema\": \"${TEST_SCHEMA}\", \"table\": \"${TEST_TABLE}\"}")
+    log_verbose "FTS index delete response: ${delete_response}"

coderabbitai · 2026-01-20T09:31:20Z

lib/MCP_Thread.cpp

+	if (!strcmp(name, "fts_path")) {
+		if (variables.mcp_fts_path)
+			free(variables.mcp_fts_path);
+		variables.mcp_fts_path = strdup(value);
+		// Apply at runtime by resetting FTS in the existing handler
+		if (mysql_tool_handler) {
+			proxy_info("MCP: Applying new fts_path at runtime: %s\n", value);
+			if (!mysql_tool_handler->reset_fts_path(value)) {
+				proxy_error("Failed to reset FTS path at runtime\n");
+				return -1;
+			}
+		}
+
+		return 0;
+	}


⚠️ Potential issue | 🟠 Major

Avoid updating mcp_fts_path when runtime reset fails.

The variable is updated before reset_fts_path succeeds, which can leave config out of sync with the live FTS handler. Swap the stored value only after a successful reset (or roll back on failure).

🔧 Suggested fix (update after successful reset)

if (!strcmp(name, "fts_path")) { - if (variables.mcp_fts_path) - free(variables.mcp_fts_path); - variables.mcp_fts_path = strdup(value); - // Apply at runtime by resetting FTS in the existing handler - if (mysql_tool_handler) { - proxy_info("MCP: Applying new fts_path at runtime: %s\n", value); - if (!mysql_tool_handler->reset_fts_path(value)) { - proxy_error("Failed to reset FTS path at runtime\n"); - return -1; - } - } - - return 0; + char* new_path = strdup(value); + if (!new_path) + return -1; + // Apply at runtime by resetting FTS in the existing handler + if (mysql_tool_handler) { + proxy_info("MCP: Applying new fts_path at runtime: %s\n", value); + if (!mysql_tool_handler->reset_fts_path(value)) { + proxy_error("Failed to reset FTS path at runtime\n"); + free(new_path); + return -1; + } + } + if (variables.mcp_fts_path) + free(variables.mcp_fts_path); + variables.mcp_fts_path = new_path; + return 0; }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if (!strcmp(name, "fts_path")) {

if (variables.mcp_fts_path)

free(variables.mcp_fts_path);

variables.mcp_fts_path = strdup(value);

// Apply at runtime by resetting FTS in the existing handler

if (mysql_tool_handler) {

proxy_info("MCP: Applying new fts_path at runtime: %s\n", value);

if (!mysql_tool_handler->reset_fts_path(value)) {

proxy_error("Failed to reset FTS path at runtime\n");

return -1;

}

}

return 0;

}

if (!strcmp(name, "fts_path")) {

char* new_path = strdup(value);

if (!new_path)

return -1;

// Apply at runtime by resetting FTS in the existing handler

if (mysql_tool_handler) {

proxy_info("MCP: Applying new fts_path at runtime: %s\n", value);

if (!mysql_tool_handler->reset_fts_path(value)) {

proxy_error("Failed to reset FTS path at runtime\n");

free(new_path);

return -1;

}

}

if (variables.mcp_fts_path)

free(variables.mcp_fts_path);

variables.mcp_fts_path = new_path;

return 0;

}

🤖 Prompt for AI Agents

In `@lib/MCP_Thread.cpp` around lines 333 - 347, Currently the code updates variables.mcp_fts_path (via strdup/free) before calling mysql_tool_handler->reset_fts_path(value), which can leave the stored config out of sync if reset_fts_path fails; change the flow so you attempt the runtime update first (call mysql_tool_handler->reset_fts_path(value) while leaving variables.mcp_fts_path unchanged), and only on success free() the old variables.mcp_fts_path and assign strdup(value), returning 0; if reset_fts_path fails, do not modify variables.mcp_fts_path and return -1. Reference symbols: variables.mcp_fts_path, mysql_tool_handler, reset_fts_path, strdup, free.

coderabbitai · 2026-01-20T09:31:20Z

lib/MySQL_Tool_Handler.cpp

+bool MySQL_Tool_Handler::reset_fts_path(const std::string& path) {
+	pthread_mutex_lock(&fts_lock);
+
+	if (fts) {
+		delete fts;
+		fts = NULL;
+	}
+
+	if (!path.empty()) {
+		fts = new MySQL_FTS(path);
+		if (fts->init()) {
+			proxy_error("Failed to initialize FTS with new path: %s\n", path.c_str());
+			delete fts;
+			fts = NULL;
+			pthread_mutex_unlock(&fts_lock);
+			return false;
+		}
+	}
+
+	pthread_mutex_unlock(&fts_lock);
+	return true;


⚠️ Potential issue | 🟠 Major

Preserve existing FTS if the new path fails to init.

Currently the old FTS instance is deleted before the new one is validated, which can disable FTS on a bad path. Swap only after a successful init.

🔧 Suggested fix (safe swap)

bool MySQL_Tool_Handler::reset_fts_path(const std::string& path) { pthread_mutex_lock(&fts_lock); - if (fts) { - delete fts; - fts = NULL; - } - - if (!path.empty()) { - fts = new MySQL_FTS(path); - if (fts->init()) { - proxy_error("Failed to initialize FTS with new path: %s\n", path.c_str()); - delete fts; - fts = NULL; - pthread_mutex_unlock(&fts_lock); - return false; - } - } + MySQL_FTS* new_fts = NULL; + if (!path.empty()) { + new_fts = new MySQL_FTS(path); + if (!new_fts || new_fts->init()) { + proxy_error("Failed to initialize FTS with new path: %s\n", path.c_str()); + delete new_fts; + pthread_mutex_unlock(&fts_lock); + return false; + } + } + if (fts) { + delete fts; + } + fts = new_fts; pthread_mutex_unlock(&fts_lock); return true; }

🤖 Prompt for AI Agents

In `@lib/MySQL_Tool_Handler.cpp` around lines 117 - 137, reset_fts_path deletes the existing FTS before validating the new path; instead allocate a temporary MySQL_FTS* (e.g., new_fts) with the provided path, call new_fts->init(), and if init() fails call proxy_error, delete new_fts and return false while leaving the original fts intact; only on successful init swap under the existing pthread_mutex_lock (&fts_lock) by deleting the old fts, assigning fts = new_fts, and then unlock and return true; also preserve current behavior when path is empty (delete existing fts) and ensure no double-free by only deleting the old instance after new_fts successfully initialized.

coderabbitai · 2026-01-20T09:31:20Z

lib/MySQL_Tool_Handler.cpp

+int MySQL_Tool_Handler::reinit_fts(const std::string& fts_path) {
+	proxy_info("MySQL_Tool_Handler: Reinitializing FTS with path: %s\n", fts_path.c_str());
+
+	// Check if directory exists (SQLite can't create directories)
+	std::string::size_type last_slash = fts_path.find_last_of("/");
+	if (last_slash != std::string::npos && last_slash > 0) {
+		std::string dir = fts_path.substr(0, last_slash);
+		struct stat st;
+		if (stat(dir.c_str(), &st) != 0 || !S_ISDIR(st.st_mode)) {
+			proxy_error("MySQL_Tool_Handler: Directory does not exist for path '%s' (directory: '%s')\n",
+				fts_path.c_str(), dir.c_str());
+			return -1;
+		}
+	}
+
+	// First, test if we can open the new database
+	MySQL_FTS* new_fts = new MySQL_FTS(fts_path);
+	if (!new_fts) {
+		proxy_error("MySQL_Tool_Handler: Failed to create new FTS handler\n");
+		return -1;
+	}
+
+	if (new_fts->init() != 0) {
+		proxy_error("MySQL_Tool_Handler: Failed to initialize FTS at %s\n", fts_path.c_str());
+		delete new_fts;
+		return -1;  // Return error WITHOUT closing old FTS
+	}
+
+	// Success! Now close old and replace with new
+	if (fts) {
+		delete fts;
+	}
+	fts = new_fts;
+
+	proxy_info("MySQL_Tool_Handler: FTS reinitialized successfully at %s\n", fts_path.c_str());
+	return 0;
+}


⚠️ Potential issue | 🟠 Major

Guard reinit_fts with fts_lock to prevent races.

This method mutates fts without acquiring the same mutex used by FTS calls, which can race with in-flight operations. Lock it consistently.

🔧 Suggested fix (lock around mutation)

int MySQL_Tool_Handler::reinit_fts(const std::string& fts_path) { + pthread_mutex_lock(&fts_lock); proxy_info("MySQL_Tool_Handler: Reinitializing FTS with path: %s\n", fts_path.c_str()); // Check if directory exists (SQLite can't create directories) std::string::size_type last_slash = fts_path.find_last_of("/"); if (last_slash != std::string::npos && last_slash > 0) { std::string dir = fts_path.substr(0, last_slash); struct stat st; if (stat(dir.c_str(), &st) != 0 || !S_ISDIR(st.st_mode)) { proxy_error("MySQL_Tool_Handler: Directory does not exist for path '%s' (directory: '%s')\n", fts_path.c_str(), dir.c_str()); + pthread_mutex_unlock(&fts_lock); return -1; } } // First, test if we can open the new database MySQL_FTS* new_fts = new MySQL_FTS(fts_path); if (!new_fts) { proxy_error("MySQL_Tool_Handler: Failed to create new FTS handler\n"); + pthread_mutex_unlock(&fts_lock); return -1; } if (new_fts->init() != 0) { proxy_error("MySQL_Tool_Handler: Failed to initialize FTS at %s\n", fts_path.c_str()); delete new_fts; + pthread_mutex_unlock(&fts_lock); return -1; // Return error WITHOUT closing old FTS } // Success! Now close old and replace with new if (fts) { delete fts; } fts = new_fts; proxy_info("MySQL_Tool_Handler: FTS reinitialized successfully at %s\n", fts_path.c_str()); + pthread_mutex_unlock(&fts_lock); return 0; }

🤖 Prompt for AI Agents

In `@lib/MySQL_Tool_Handler.cpp` around lines 1142 - 1178, reinit_fts currently mutates the shared pointer/variable fts without acquiring the FTS mutex; to fix, perform the expensive new MySQL_FTS(fts_path) and new_fts->init() before taking the lock, then acquire fts_lock (the same mutex used by other FTS calls) and inside the critical section delete the old fts and assign fts = new_fts, then release the lock; use the same locking primitive as the rest of the code (e.g., std::lock_guard or equivalent on fts_lock) to ensure no race with in-flight operations while avoiding holding the lock during initialization.

coderabbitai · 2026-01-20T09:31:21Z

scripts/mcp/test_mcp_fts_detailed.sh

+mysql_exec() {
+  local sql="$1"
+  mysql -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u "${MYSQL_USER}" -p"${MYSQL_PASSWORD}" -e "${sql}"
+}


⚠️ Potential issue | 🟡 Minor

Password exposed in command line is visible in process list.

Using -p"${MYSQL_PASSWORD}" exposes the password in ps output. Consider using MYSQL_PWD environment variable or a MySQL option file instead.

🔒 Suggested fix using MYSQL_PWD

mysql_exec() { local sql="$1" - mysql -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u "${MYSQL_USER}" -p"${MYSQL_PASSWORD}" -e "${sql}" + MYSQL_PWD="${MYSQL_PASSWORD}" mysql -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u "${MYSQL_USER}" -e "${sql}" }

Note: The same pattern appears in test_mcp_fts.sh at line 279-282 and should be updated there as well.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

mysql_exec() {

local sql="$1"

mysql -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u "${MYSQL_USER}" -p"${MYSQL_PASSWORD}" -e "${sql}"

}

mysql_exec() {

local sql="$1"

MYSQL_PWD="${MYSQL_PASSWORD}" mysql -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u "${MYSQL_USER}" -e "${sql}"

}

🤖 Prompt for AI Agents

In `@scripts/mcp/test_mcp_fts_detailed.sh` around lines 71 - 74, The mysql_exec function exposes MYSQL_PASSWORD on the process command line via the -p option; change it to avoid inline password usage by using a safer method such as setting MYSQL_PWD in the environment before invoking mysql or by using a MySQL option file, e.g., export MYSQL_PWD="${MYSQL_PASSWORD}" && mysql -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u "${MYSQL_USER}" -e "${sql}" (and unset MYSQL_PWD afterward); update the mysql_exec function and apply the same change to the similar invocation in test_mcp_fts.sh to remove -p"${MYSQL_PASSWORD}" and prevent password leakage to ps listings.

coderabbitai · 2026-01-20T09:31:21Z

scripts/mcp/test_mcp_fts_detailed.sh

+if [ "${CLEANUP}" = "true" ]; then
+  log "Cleanup: deleting fts_test.customers and fts_test.orders indexes"
+  delete_resp=$(tool_call "fts_delete_index" '{"schema":"fts_test","table":"customers"}')
+  delete_resp=$(extract_tool_result "${delete_resp}")
+  echo "${delete_resp}" | jq -e '.success == true' >/dev/null
+
+  delete_resp=$(tool_call "fts_delete_index" '{"schema":"fts_test","table":"orders"}')
+  delete_resp=$(extract_tool_result "${delete_resp}")
+  echo "${delete_resp}" | jq -e '.success == true' >/dev/null
+fi


⚠️ Potential issue | 🟡 Minor

Cleanup deletes fts_test.orders index that was already deleted at line 237.

The test at line 237 deletes fts_test.orders. When --cleanup is specified, line 257-259 attempts to delete it again. While the test verifies .success == true, the backend should return an error for non-existent index (as tested in test_mcp_fts.sh test 18). This assertion may fail.

🐛 Suggested fix - skip already-deleted index or expect failure

if [ "${CLEANUP}" = "true" ]; then log "Cleanup: deleting fts_test.customers and fts_test.orders indexes" delete_resp=$(tool_call "fts_delete_index" '{"schema":"fts_test","table":"customers"}') delete_resp=$(extract_tool_result "${delete_resp}") echo "${delete_resp}" | jq -e '.success == true' >/dev/null - delete_resp=$(tool_call "fts_delete_index" '{"schema":"fts_test","table":"orders"}') - delete_resp=$(extract_tool_result "${delete_resp}") - echo "${delete_resp}" | jq -e '.success == true' >/dev/null + # Note: fts_test.orders was already deleted earlier in the test fi

🤖 Prompt for AI Agents

In `@scripts/mcp/test_mcp_fts_detailed.sh` around lines 251 - 260, The cleanup block currently unconditionally attempts to delete fts_test.orders although it was already deleted earlier; update the cleanup logic around the calls to tool_call "fts_delete_index" (and the subsequent extract_tool_result) to either first check existence (e.g., call a check/existence tool or API) and only call fts_delete_index for existing indexes, or relax the jq assertion to accept either a success result or the expected "index not found" error response (e.g., accept .success == true OR .error indicates non-existent index) so the second delete does not cause the test to fail.

coderabbitai · 2026-01-20T09:31:21Z

scripts/mcp/test_mcp_fts.sh

+mysql_exec() {
+    local sql="$1"
+    mysql -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u "${MYSQL_USER}" -p"${MYSQL_PASSWORD}" \
+        -e "${sql}" 2>/dev/null
+}


⚠️ Potential issue | 🟡 Minor

Same password exposure issue as the detailed test script.

🔒 Suggested fix using MYSQL_PWD

mysql_exec() { local sql="$1" - mysql -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u "${MYSQL_USER}" -p"${MYSQL_PASSWORD}" \ - -e "${sql}" 2>/dev/null + MYSQL_PWD="${MYSQL_PASSWORD}" mysql -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u "${MYSQL_USER}" \ + -e "${sql}" 2>/dev/null }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

mysql_exec() {

local sql="$1"

mysql -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u "${MYSQL_USER}" -p"${MYSQL_PASSWORD}" \

-e "${sql}" 2>/dev/null

}

mysql_exec() {

local sql="$1"

MYSQL_PWD="${MYSQL_PASSWORD}" mysql -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u "${MYSQL_USER}" \

-e "${sql}" 2>/dev/null

}

🤖 Prompt for AI Agents

In `@scripts/mcp/test_mcp_fts.sh` around lines 279 - 283, The mysql_exec function currently exposes the password on the command line via -p"${MYSQL_PASSWORD}"; update mysql_exec to avoid that by passing the password through the MYSQL_PWD environment variable instead (e.g., invoke mysql with MYSQL_PWD="${MYSQL_PASSWORD}" mysql -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u "${MYSQL_USER}" -e "${sql}" 2>/dev/null) so the password is not visible in process listings; keep the function name mysql_exec and the same parameters but remove the inline -p argument.

coderabbitai · 2026-01-20T09:31:21Z

scripts/mcp/test_mcp_fts.sh

+        mysql_exec "INSERT INTO ${TEST_SCHEMA}.${TEST_TABLE} (title, content, category, priority) \
+            VALUES ('${title}', '${doc}', '${category}', '${priority}');" 2>/dev/null || true
+    done


⚠️ Potential issue | 🟠 Major

SQL injection vulnerability in INSERT statement.

The ${doc} variable contains user-defined test data with apostrophes (e.g., "Customer John Smith...") which will break the SQL or could enable injection. Use proper escaping or parameterized queries.

🐛 Suggested fix - escape single quotes

for doc_id in "${!TEST_DOCUMENTS[@]}"; do local doc="${TEST_DOCUMENTS[$doc_id]}" + # Escape single quotes for SQL + doc="${doc//\'/\'\'}" local title="Document ${doc_id}" ...

🤖 Prompt for AI Agents

In `@scripts/mcp/test_mcp_fts.sh` around lines 341 - 343, The INSERT builds SQL with unescaped variables (title, doc, category, priority) which allows quotes to break the statement; before calling mysql_exec in the loop, sanitize each variable by replacing every single-quote (') with two single-quotes ('') so the SQL string is safe (e.g., transform ${doc} -> escaped_doc), then use those escaped variables in the INSERT; update the code around the mysql_exec call and reference the mysql_exec invocation and the TEST_SCHEMA/TEST_TABLE variables when making the change.

coderabbitai · 2026-01-20T09:31:21Z

scripts/mcp/test_mcp_fts.sh

+    current_path=$(mysql -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u "${MYSQL_USER}" -p"${MYSQL_PASSWORD}" \
+        -e "SELECT @@mcp-fts_path" -s -N 2>/dev/null | tr -d '\r')


⚠️ Potential issue | 🟡 Minor

Password exposure in multiple mysql invocations in test_fts_custom_database_path.

Several direct mysql calls in this test function also expose the password via command line. Apply the same MYSQL_PWD fix or create a helper function.

Also applies to: 1005-1006, 1018-1019, 1029-1030, 1089-1092

🤖 Prompt for AI Agents

In `@scripts/mcp/test_mcp_fts.sh` around lines 992 - 993, The mysql invocations in test_fts_custom_database_path (e.g., the command populating current_path and other mysql calls) expose the password on the command line; replace direct use of -p"${MYSQL_PASSWORD}" with a secure approach such as exporting MYSQL_PWD="${MYSQL_PASSWORD}" for the command (e.g., MYSQL_PWD="${MYSQL_PASSWORD}" mysql -h "${MYSQL_HOST}" ...) or implement a small helper function (e.g., safe_mysql()) that injects MYSQL_PWD into the environment and runs mysql, then update all occurrences (current_path assignment and the other mysql calls referenced) to use that helper to avoid password exposure. Ensure the helper preserves -s -N and 2>/dev/null flags used in existing calls.

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@doc/MCP/FTS_USER_GUIDE.md`:
- Line 853: Replace the bare URL "https://github.com/ProxySQL/proxysql-vec" with
a labeled Markdown link to satisfy lint and improve rendering; for example
change the line to use a descriptive label like [ProxySQL proxysql-vec GitHub
repository](https://github.com/ProxySQL/proxysql-vec) so the link is not bare.

🧹 Nitpick comments (2)

doc/MCP/FTS_USER_GUIDE.md (2)
32-38: Add language identifiers to fenced code blocks. This is flagged by markdownlint and improves readability/syntax highlighting.
Proposed fix
-```
+```text
 Traditional Query Flow:
 LLM Agent → Full Table Scan → Millions of Rows → Slow Response
@@
-```
+```text
@@
-```
+```text
@@
-```
+```text
@@
-```
+```text
@@
-```
+```text
Also applies to: 46-87, 92-99, 575-577, 773-775, 779-781

1-1: Hyphenate compound adjective “full-text” where it modifies a noun. Keeps phrasing consistent and matches common style guides.

Also applies to: 19-19

coderabbitai · 2026-01-20T09:34:11Z

doc/MCP/FTS_USER_GUIDE.md

+## Support
+
+For issues, questions, or contributions:
+- GitHub: https://github.com/ProxySQL/proxysql-vec


⚠️ Potential issue | 🟡 Minor

Avoid bare URLs in Markdown. Convert to a labeled link to satisfy lint and improve rendering.

Proposed fix

-- GitHub: https://github.com/ProxySQL/proxysql-vec +- GitHub: [ProxySQL/proxysql-vec](https://github.com/ProxySQL/proxysql-vec)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

- GitHub: https://github.com/ProxySQL/proxysql-vec

- GitHub: [ProxySQL/proxysql-vec](https://github.com/ProxySQL/proxysql-vec)

🧰 Tools

🪛 markdownlint-cli2 (0.18.1)

853-853: Bare URL used

(MD034, no-bare-urls)

🤖 Prompt for AI Agents

In `@doc/MCP/FTS_USER_GUIDE.md` at line 853, Replace the bare URL "https://github.com/ProxySQL/proxysql-vec" with a labeled Markdown link to satisfy lint and improve rendering; for example change the line to use a descriptive label like [ProxySQL proxysql-vec GitHub repository](https://github.com/ProxySQL/proxysql-vec) so the link is not bare.

coderabbitai

Actionable comments posted: 7

🤖 Fix all issues with AI agents

In `@lib/MySQL_FTS.cpp`:
- Around line 303-316: The query builder in MySQL_FTS.cpp currently appends
identifiers and the where_clause raw (symbols: mysql_query, selected_cols,
schema, table, where_clause), which risks SQL injection; implement an
identifier-escaping helper (e.g., escape_identifier) that doubles any backticks
and use it when writing selected_cols, schema and table into mysql_query, and
stop appending where_clause verbatim—either require the caller to supply a
pre-validated/parameterized filter, validate the clause against an allowed
pattern, or convert to a parameterized WHERE using bound parameters; update the
mysql_query construction to use escape_identifier(schema),
escape_identifier(table), and escaped selected_cols instead of raw values.
- Around line 582-585: The code sets the C-string pointer error and returns
without freeing it, causing a memory leak; in the block where result["error"] is
assigned (the same place where error is checked), call the SQLite free stub
exactly as used in search(): (*proxy_sqlite3_free)(error) before returning
result.dump(); locate the error pointer usage in MySQL_FTS.cpp (the error
variable and the block that builds result["error"]) and add the free call
immediately after constructing the error message (ensuring you do not free error
twice elsewhere).
- Around line 748-756: The code reads reindex_json["error"].get<std::string>()
which can throw if the "error" key is missing; update the failure branch (where
reindex_json is inspected and failed_item is built) to safely obtain the error
string — either check reindex_json.contains("error") before calling get or use
reindex_json.value("error", "unknown error") (or similar default) to populate
failed_item["error"]; keep the rest of the logic (failed.push_back(failed_item),
rebuilt_count handling) unchanged and reference reindex_json, failed_item, and
failed in your change.
- Around line 90-99: sanitize_name currently only replaces '.', '-', and space,
leaving other dangerous characters (backticks, quotes, semicolons, nulls) that
can enable SQL injection when used in CREATE/ALTER statements; change it to an
allowlist-based sanitizer that only permits ASCII letters, digits and
underscore, enforces the first character is a letter, enforces a reasonable max
length, rejects or canonicalizes empty/invalid results (e.g., return error/throw
or prepend a fixed safe prefix), and explicitly strip or reject any NUL bytes or
non-ASCII characters; update the sanitize_name implementation to iterate over
bytes of the input, build a new string containing only [A-Za-z0-9_], ensure the
first character is a letter (or prefix with "t_" if not), truncate to the max
length, and make callers treat an empty/invalid sanitized result as an error
rather than embedding it into SQL.
- Around line 30-33: The init path leaks the allocated SQLite3DB when db->open()
fails; in MySQL_FTS.cpp where you construct SQLite3DB* db and call db->open(),
ensure you free the allocation before returning -1 (either delete db on error or
switch the variable to a smart pointer like std::unique_ptr<SQLite3DB>), and
include the same error logging via proxy_error; update the failure branch around
db->open() to release/delete db (or use unique_ptr so it auto-frees) before
returning.
- Around line 24-37: MySQL_FTS::init currently creates a VLA for db_path and
calls new SQLite3DB() without handling allocation failure; replace the VLA with
a heap buffer using std::vector<char> (copy db_path into it and push a
terminating '\0') and pass path_buf.data() to db->open, and change the
allocation to use new (std::nothrow) SQLite3DB() (or wrap in try/catch) and
check the returned pointer (db) for nullptr before calling db->open; on failure
call proxy_error and return -1; keep the rest of the flow (init_schema)
unchanged and reference the SQLite3DB, db_path, db->open, proxy_error, and
init_schema symbols when making the edits.
- Around line 488-497: The MATCH clause currently interpolates user input via
escape_sql(query) into the constructed search_sql (see fts_table, sanitized_data
and the WHERE ... MATCH usage), which allows FTS5 operator injection; fix by
producing an FTS5-literal-safe value: double any embedded double-quote
characters in the user query and wrap the entire query in double quotes before
inserting into MATCH (i.e., transform query -> "\"" + replace(query, "\"",
"\"\"") + "\""), or better yet switch to a parameterized/bound MATCH parameter
instead of string interpolation; update the code that builds search_sql (the
snippet constructing search_sql and the use of escape_sql) to use the
quoted/doubled-quote form or bind parameters so MATCH receives a literal phrase.

🧹 Nitpick comments (5)

include/MySQL_FTS.h (1)

7-7: Unused include.

<memory> is included but no smart pointers (std::unique_ptr, std::shared_ptr) are used in this class. Consider removing it or switching to smart pointers for SQLite3DB* db member.
lib/MySQL_FTS.cpp (4)
46-61: PRAGMA and index creation failures silently ignored.

Lines 48-50 execute PRAGMA statements without checking return values. While these are typically optional optimizations, failures could indicate underlying issues. Similarly, lines 84-85 don't check if index creation succeeded.

332-392: Batch insert without transaction may be slow and non-atomic.

Rows are inserted individually without an explicit transaction. SQLite auto-commits each statement by default (unless already in a transaction). For large tables, this will be significantly slower and won't be atomic.
Proposed fix - wrap in transaction
 	db->wrlock();
+	db->execute("BEGIN TRANSACTION");

 	try {
 		// ... existing insert loop ...

 		// Update metadata
 		// ...
 		db->execute(metadata_sql.str().c_str());

+		db->execute("COMMIT");
 		db->wrunlock();
 		// ...
 	} catch (const std::exception& e) {
+		db->execute("ROLLBACK");
 		db->wrunlock();
 		// ...
 	}
544-554: In-memory pagination is inefficient for large datasets.

All matching results are fetched into all_results before pagination is applied. For queries matching many rows across multiple indexes, this could consume significant memory. Consider pushing LIMIT and OFFSET to individual index queries or implementing cursor-based pagination.

633-656: Delete operations not wrapped in transaction.

If an error occurs between dropping triggers/tables and deleting metadata, the database could be left in an inconsistent state. Consider wrapping in a transaction.

coderabbitai · 2026-01-20T14:05:56Z

lib/MySQL_FTS.cpp

+int MySQL_FTS::init() {
+	// Initialize database connection
+	db = new SQLite3DB();
+	char path_buf[db_path.size() + 1];
+	strcpy(path_buf, db_path.c_str());
+	int rc = db->open(path_buf, SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE);
+	if (rc != SQLITE_OK) {
+		proxy_error("Failed to open FTS database at %s: %d\n", db_path.c_str(), rc);
+		return -1;
+	}
+
+	// Initialize schema
+	return init_schema();
+}


⚠️ Potential issue | 🟡 Minor

VLA and potential null pointer dereference.

Line 27 uses a Variable Length Array (VLA) which is a non-standard C++ extension. If db_path is very long, this could cause stack overflow.

Line 29 uses db->open() without checking if new SQLite3DB() returned a valid pointer (though in practice new throws on failure in C++, unless nothrow is used).

Suggested fix using std::vector or c_str() directly

int MySQL_FTS::init() { // Initialize database connection db = new SQLite3DB(); - char path_buf[db_path.size() + 1]; - strcpy(path_buf, db_path.c_str()); - int rc = db->open(path_buf, SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE); + int rc = db->open(const_cast<char*>(db_path.c_str()), SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE); if (rc != SQLITE_OK) { proxy_error("Failed to open FTS database at %s: %d\n", db_path.c_str(), rc); + delete db; + db = NULL; return -1; }

If db->open() requires a mutable buffer, use std::vector<char>:

std::vector<char> path_buf(db_path.begin(), db_path.end()); path_buf.push_back('\0'); int rc = db->open(path_buf.data(), SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE);

🧰 Tools

🪛 Cppcheck (2.19.0)

[warning] 30-30: If memory allocation fails, then there is a possible null pointer dereference

(nullPointerOutOfMemory)

🤖 Prompt for AI Agents

In `@lib/MySQL_FTS.cpp` around lines 24 - 37, MySQL_FTS::init currently creates a VLA for db_path and calls new SQLite3DB() without handling allocation failure; replace the VLA with a heap buffer using std::vector<char> (copy db_path into it and push a terminating '\0') and pass path_buf.data() to db->open, and change the allocation to use new (std::nothrow) SQLite3DB() (or wrap in try/catch) and check the returned pointer (db) for nullptr before calling db->open; on failure call proxy_error and return -1; keep the rest of the flow (init_schema) unchanged and reference the SQLite3DB, db_path, db->open, proxy_error, and init_schema symbols when making the edits.

coderabbitai · 2026-01-20T14:05:56Z

lib/MySQL_FTS.cpp

+	if (rc != SQLITE_OK) {
+		proxy_error("Failed to open FTS database at %s: %d\n", db_path.c_str(), rc);
+		return -1;
+	}


⚠️ Potential issue | 🟠 Major

Memory leak on init failure.

If db->open() fails, the allocated SQLite3DB object is not deleted before returning -1, causing a memory leak.

Proposed fix

int rc = db->open(path_buf, SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE); if (rc != SQLITE_OK) { proxy_error("Failed to open FTS database at %s: %d\n", db_path.c_str(), rc); + delete db; + db = NULL; return -1; }

🧰 Tools

🪛 Cppcheck (2.19.0)

[warning] 30-30: If memory allocation fails, then there is a possible null pointer dereference

(nullPointerOutOfMemory)

🤖 Prompt for AI Agents

In `@lib/MySQL_FTS.cpp` around lines 30 - 33, The init path leaks the allocated SQLite3DB when db->open() fails; in MySQL_FTS.cpp where you construct SQLite3DB* db and call db->open(), ensure you free the allocation before returning -1 (either delete db on error or switch the variable to a smart pointer like std::unique_ptr<SQLite3DB>), and include the same error logging via proxy_error; update the failure branch around db->open() to release/delete db (or use unique_ptr so it auto-frees) before returning.

coderabbitai · 2026-01-20T14:05:56Z

lib/MySQL_FTS.cpp

+std::string MySQL_FTS::sanitize_name(const std::string& name) {
+	std::string sanitized = name;
+	// Replace dots and special characters with underscores
+	for (size_t i = 0; i < sanitized.length(); i++) {
+		if (sanitized[i] == '.' || sanitized[i] == '-' || sanitized[i] == ' ') {
+			sanitized[i] = '_';
+		}
+	}
+	return sanitized;
+}


⚠️ Potential issue | 🔴 Critical

Incomplete sanitization allows potential SQL injection in table names.

sanitize_name() only replaces ., -, and space characters. It does not handle SQL metacharacters like backticks, quotes, semicolons, or null bytes. Since the result is used directly in SQL statements (e.g., CREATE TABLE ... << sanitized_data), malicious schema/table names could inject SQL.

Proposed fix - use allowlist approach

std::string MySQL_FTS::sanitize_name(const std::string& name) { - std::string sanitized = name; - // Replace dots and special characters with underscores - for (size_t i = 0; i < sanitized.length(); i++) { - if (sanitized[i] == '.' || sanitized[i] == '-' || sanitized[i] == ' ') { - sanitized[i] = '_'; + std::string sanitized; + sanitized.reserve(name.length()); + for (size_t i = 0; i < name.length(); i++) { + char c = name[i]; + // Allow only alphanumeric and underscore + if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || + (c >= '0' && c <= '9') || c == '_') { + sanitized += c; + } else { + sanitized += '_'; } } return sanitized; }

🤖 Prompt for AI Agents

In `@lib/MySQL_FTS.cpp` around lines 90 - 99, sanitize_name currently only replaces '.', '-', and space, leaving other dangerous characters (backticks, quotes, semicolons, nulls) that can enable SQL injection when used in CREATE/ALTER statements; change it to an allowlist-based sanitizer that only permits ASCII letters, digits and underscore, enforces the first character is a letter, enforces a reasonable max length, rejects or canonicalizes empty/invalid results (e.g., return error/throw or prepend a fixed safe prefix), and explicitly strip or reject any NUL bytes or non-ASCII characters; update the sanitize_name implementation to iterate over bytes of the input, build a new string containing only [A-Za-z0-9_], ensure the first character is a letter (or prefix with "t_" if not), truncate to the max length, and make callers treat an empty/invalid sanitized result as an error rather than embedding it into SQL.

coderabbitai · 2026-01-20T14:05:56Z

lib/MySQL_FTS.cpp

+	// Build MySQL query to fetch data
+	std::ostringstream mysql_query;
+	mysql_query << "SELECT ";
+	for (size_t i = 0; i < selected_cols.size(); i++) {
+		if (i > 0) mysql_query << ", ";
+		mysql_query << "`" << selected_cols[i] << "`";
+	}
+
+	mysql_query << " FROM `" << schema << "`.`" << table << "`";
+
+	if (!where_clause.empty()) {
+		mysql_query << " WHERE " << where_clause;
+	}
+


⚠️ Potential issue | 🔴 Critical

SQL injection risk in MySQL query construction.

Line 311: Schema and table names are wrapped in backticks but backticks within the names are not escaped. A schema/table containing a backtick could break out.

Line 314: where_clause is appended directly without any validation or sanitization, allowing arbitrary SQL injection.

Proposed fix

For backtick escaping in identifiers:

std::string escape_identifier(const std::string& name) { std::string escaped; for (char c : name) { if (c == '`') escaped += "``"; else escaped += c; } return escaped; }

For where_clause, consider:

Validating it contains only allowed SQL constructs, or

Documenting that this is a privileged operation and the caller must ensure safety, or

Using parameterized queries where possible

🤖 Prompt for AI Agents

In `@lib/MySQL_FTS.cpp` around lines 303 - 316, The query builder in MySQL_FTS.cpp currently appends identifiers and the where_clause raw (symbols: mysql_query, selected_cols, schema, table, where_clause), which risks SQL injection; implement an identifier-escaping helper (e.g., escape_identifier) that doubles any backticks and use it when writing selected_cols, schema and table into mysql_query, and stop appending where_clause verbatim—either require the caller to supply a pre-validated/parameterized filter, validate the clause against an allowed pattern, or convert to a parameterized WHERE using bound parameters; update the mysql_query construction to use escape_identifier(schema), escape_identifier(table), and escaped selected_cols instead of raw values.

coderabbitai · 2026-01-20T14:05:57Z

lib/MySQL_FTS.cpp

+		// Search query for this index (use table name for MATCH/bm25)
+		std::ostringstream search_sql;
+		search_sql << "SELECT d.schema_name, d.table_name, d.primary_key_value, "
+		    << "snippet(" << fts_table << ", 0, '<mark>', '</mark>', '...', 30) AS snippet, "
+		    << "d.metadata "
+		    << "FROM " << fts_table << " "
+		    << "JOIN " << sanitized_data << " d ON " << fts_table << ".rowid = d.rowid "
+		    << "WHERE " << fts_table << " MATCH '" << escape_sql(query) << "' "
+		    << "ORDER BY bm25(" << fts_table << ") ASC "
+		    << "LIMIT " << limit;


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's locate and examine the MySQL_FTS.cpp file find . -name "MySQL_FTS.cpp" -type f

Repository: ProxySQL/proxysql-vec

Length of output: 84

🏁 Script executed:

# Search for the escape_sql function to understand how it works rg "escape_sql" -A 5 -B 2 | head -100

Repository: ProxySQL/proxysql-vec

Length of output: 5391

🌐 Web query:

SQLite FTS5 MATCH query escaping requirements and security

💡 Result:

Short summary + practical guidance:

Key points

The right-hand side of MATCH is an FTS5 query string that is parsed (supports boolean operators, NEAR, prefixes, column filters, phrases, etc.). User input can therefore be interpreted as query operators, not just a literal term. [1][2]

FTS5 allows unquoted barewords (letters, digits, underscore, non-ASCII) and requires quoting (double quotes) for strings containing other characters (spaces, punctuation). Inside a quoted string double quotes are escaped by doubling them. [1]

Because the MATCH string is parsed by FTS5, you must treat it like any other language input: untrusted input can change the meaning of the query (i.e., “injection” of operators). [1][3]

Escaping / literal-search rules

To make an input treated as a single literal phrase, enclose it in double quotes and double any embedded double-quote characters. Example: input He said "hi" -> MATCH '"He said ""hi"""'. [1]

If you need to treat each word literally (not as a phrase), quote each word separately: MATCH '"word1" "word2"'. [1]

Bare punctuation/special characters cannot appear unquoted in a bareword; if they’re present you must quote the string. Also bear in mind future FTS5 versions might change bareword allowances, so quoting is safer for untrusted content. [1]

Security / best practices

Prefer parameterized queries / prepared statements rather than concatenating SQL. You can bind the MATCH RHS as a parameter (e.g., WHERE ft MATCH ? or use the table-valued form ft(?)). Binding prevents SQL-level injection and is the recommended first defense. [1][2]

Because MATCH has its own query language, binding alone does not change how FTS5 parses the bound string — it still can contain FTS operators. If you need to prevent the user from supplying operators (i.e., force a literal token/phrase search), sanitize by quoting the whole input as an FTS5 string (wrap in double quotes, escape internal double-quotes) before binding. Example: bind the value '"user input with ""quotes"""'. [1][3]

For strict literal substring matching consider using the trigram tokenizer and LIKE/GLOB on that column (FTS5 trigram supports indexed LIKE/GLOB), or maintain a separate (escaped) column for exact matches. These approaches avoid FTS5 query parsing semantics. [1]

Validate inputs and limit features exposed to untrusted callers (e.g., disallow OR/NEAR/column filters) by sanitizing or rejecting characters/patterns you don’t want allowed.

Minimal recommended workflow

Decide whether callers may supply FTS operators. If yes, bind their string directly (prepared statement). If no:

Escape by wrapping the user input in double quotes and doubling any existing double quotes, then bind that value to MATCH ? (or use ft(?)). This forces a phrase-literal interpretation. [1][3]

For substring/complex needs, consider trigram tokenizer + LIKE/GLOB or explicit tokenization on input. [1]

References

SQLite FTS5 documentation (query syntax, strings, quoting, table-valued form, trigram tokenizer). [1][2]

Community note / examples on escaping MATCH terms (practical quoting advice). [3]

References used
[1] SQLite FTS5 documentation — Query syntax & strings/phrases/quoting.
[2] SQLite FTS5 documentation — table-valued function, MATCH parameter examples (bm25/rank examples).
[3] StackOverflow discussion showing practical quoting examples for MATCH.

🏁 Script executed:

# Read the search() function around lines 488-497 to get full context sed -n '470,520p' lib/MySQL_FTS.cpp | cat -n

Repository: ProxySQL/proxysql-vec

Length of output: 2292

🏁 Script executed:

# Check if there's any validation on the query parameter in the function signature or before line 488 sed -n '426,497p' lib/MySQL_FTS.cpp | cat -n

Repository: ProxySQL/proxysql-vec

Length of output: 2858

FTS5 query syntax injection risk in MATCH clause.

Line 495 uses escape_sql(query) which only escapes single quotes, but FTS5 MATCH parses special operators and syntax (*, ", AND, OR, NOT, NEAR, etc.). Without FTS5-compliant escaping, untrusted user input can inject these operators to alter query behavior or trigger unintended search patterns.

For example:

Input test OR malicious would execute as a boolean OR instead of a literal phrase search

Input test* would perform a prefix wildcard search

Use FTS5-compliant escaping: wrap the entire query in double quotes and double any embedded double-quote characters before the MATCH clause (e.g., "user input with ""quotes""" to force literal phrase matching). Alternatively, use parameterized queries with bound parameters to prevent the injection vector.

🤖 Prompt for AI Agents

In `@lib/MySQL_FTS.cpp` around lines 488 - 497, The MATCH clause currently interpolates user input via escape_sql(query) into the constructed search_sql (see fts_table, sanitized_data and the WHERE ... MATCH usage), which allows FTS5 operator injection; fix by producing an FTS5-literal-safe value: double any embedded double-quote characters in the user query and wrap the entire query in double quotes before inserting into MATCH (i.e., transform query -> "\"" + replace(query, "\"", "\"\"") + "\""), or better yet switch to a parameterized/bound MATCH parameter instead of string interpolation; update the code that builds search_sql (the snippet constructing search_sql and the use of escape_sql) to use the quoted/doubled-quote form or bind parameters so MATCH receives a literal phrase.

coderabbitai · 2026-01-20T14:05:57Z

lib/MySQL_FTS.cpp

+	if (error) {
+		result["error"] = "Failed to list indexes: " + std::string(error);
+		return result.dump();
+	}


⚠️ Potential issue | 🟡 Minor

Memory leak: error pointer not freed.

When error is set (line 582-583), it should be freed with (*proxy_sqlite3_free)(error) as done in the search() method (line 508).

Proposed fix

if (error) { result["error"] = "Failed to list indexes: " + std::string(error); + (*proxy_sqlite3_free)(error); return result.dump(); }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if (error) {

result["error"] = "Failed to list indexes: " + std::string(error);

return result.dump();

}

if (error) {

result["error"] = "Failed to list indexes: " + std::string(error);

(*proxy_sqlite3_free)(error);

return result.dump();

}

🤖 Prompt for AI Agents

In `@lib/MySQL_FTS.cpp` around lines 582 - 585, The code sets the C-string pointer error and returns without freeing it, causing a memory leak; in the block where result["error"] is assigned (the same place where error is checked), call the SQLite free stub exactly as used in search(): (*proxy_sqlite3_free)(error) before returning result.dump(); locate the error pointer usage in MySQL_FTS.cpp (the error variable and the block that builds result["error"]) and add the free call immediately after constructing the error message (ensuring you do not free error twice elsewhere).

coderabbitai · 2026-01-20T14:05:57Z

lib/MySQL_FTS.cpp

+		if (reindex_json["success"].get<bool>()) {
+			rebuilt_count++;
+		} else {
+			json failed_item;
+			failed_item["schema"] = schema;
+			failed_item["table"] = table;
+			failed_item["error"] = reindex_json["error"].get<std::string>();
+			failed.push_back(failed_item);
+		}


⚠️ Potential issue | 🟡 Minor

Potential exception if "error" key is missing.

Line 754 calls reindex_json["error"].get<std::string>() but the index_table method may return without setting an "error" key in some edge cases. Use .value() with a default or check for key existence.

Proposed fix

if (reindex_json["success"].get<bool>()) { rebuilt_count++; } else { json failed_item; failed_item["schema"] = schema; failed_item["table"] = table; - failed_item["error"] = reindex_json["error"].get<std::string>(); + failed_item["error"] = reindex_json.value("error", "Unknown error"); failed.push_back(failed_item); }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if (reindex_json["success"].get<bool>()) {

rebuilt_count++;

} else {

json failed_item;

failed_item["schema"] = schema;

failed_item["table"] = table;

failed_item["error"] = reindex_json["error"].get<std::string>();

failed.push_back(failed_item);

}

if (reindex_json["success"].get<bool>()) {

rebuilt_count++;

} else {

json failed_item;

failed_item["schema"] = schema;

failed_item["table"] = table;

failed_item["error"] = reindex_json.value("error", "Unknown error");

failed.push_back(failed_item);

}

🤖 Prompt for AI Agents

In `@lib/MySQL_FTS.cpp` around lines 748 - 756, The code reads reindex_json["error"].get<std::string>() which can throw if the "error" key is missing; update the failure branch (where reindex_json is inspected and failed_item is built) to safely obtain the error string — either check reindex_json.contains("error") before calling get or use reindex_json.value("error", "unknown error") (or similar default) to populate failed_item["error"]; keep the rest of the logic (failed.push_back(failed_item), rebuilt_count handling) unchanged and reference reindex_json, failed_item, and failed in your change.

renecannao · 2026-01-21T13:22:20Z

@gemini-code-assist review

gemini-code-assist

Code Review

This pull request introduces a significant new feature: full-text search (FTS) capabilities for the MCP query server, using SQLite's FTS5 extension. The changes include new C++ classes for managing FTS, integration into the existing tool handling framework, configuration options, extensive documentation, and new test scripts. Overall, this is a well-structured and comprehensive addition. My review focuses on improving security, correctness, and performance in the core implementation, as well as enhancing the clarity of the new user documentation.

gemini-code-assist · 2026-01-21T13:25:19Z

lib/MySQL_FTS.cpp

+			if (row.contains(primary_key_lower) && !row[primary_key_lower].is_null()) {
+				pk_value = row[primary_key_lower].get<std::string>();
+			} else {
+				pk_value = std::to_string(row_count);
+			}


When a primary key value is null or not found for a row during indexing, a fallback key is generated using std::to_string(row_count). This is problematic because this generated key does not correspond to any real key in the source MySQL table, making it impossible to retrieve the original row using the FTS result. This can lead to incorrect data retrieval and confusion. It would be safer to skip indexing rows that have a null or missing primary key and log a warning.

Suggested change

if (row.contains(primary_key_lower) && !row[primary_key_lower].is_null()) {

pk_value = row[primary_key_lower].get<std::string>();

} else {

pk_value = std::to_string(row_count);

}

if (row.contains(primary_key_lower) && !row[primary_key_lower].is_null()) {

pk_value = row[primary_key_lower].get<std::string>();

} else {

proxy_warning("FTS: Skipping row in %s.%s with null or missing primary key '%s'.", schema.c_str(), table.c_str(), primary_key.c_str());

continue; // Skip this row

}

gemini-code-assist · 2026-01-21T13:25:19Z

lib/MySQL_FTS.cpp

+			std::ostringstream insert_sql;
+			insert_sql << "INSERT INTO " << sanitized_data
+				<< " (schema_name, table_name, primary_key_value, content, metadata) "
+				<< "VALUES ('" << escape_sql(schema) << "', '"
+				<< escape_sql(table) << "', '"
+				<< escape_sql(pk_value) << "', '"
+				<< escape_sql(content.str()) << "', '"
+				<< escape_sql(metadata.dump()) << "');";
+
+			if (!db->execute(insert_sql.str().c_str())) {
+				proxy_error("Failed to insert row into FTS: %s\n", insert_sql.str().c_str());
+			}


The SQL INSERT statement is constructed using string concatenation with manual escaping via escape_sql. This approach is vulnerable to SQL injection if the escaping function is not exhaustive. To ensure security and robustness, it is strongly recommended to use prepared statements with parameter binding (e.g., sqlite3_prepare_v2, sqlite3_bind_text) instead of manual string construction.

gemini-code-assist · 2026-01-21T13:25:19Z

lib/MySQL_FTS.cpp

+		    << "d.metadata "
+		    << "FROM " << fts_table << " "
+		    << "JOIN " << sanitized_data << " d ON " << fts_table << ".rowid = d.rowid "
+		    << "WHERE " << fts_table << " MATCH '" << escape_sql(query) << "' "


The FTS5 MATCH clause is constructed by concatenating a user-provided query string after only escaping single quotes. The FTS5 query syntax is complex, and this simple escaping may not be sufficient to prevent all forms of injection or syntax errors, especially with untrusted input. While MATCH does not support bound parameters for the query string, the input should be more rigorously validated or sanitized to prevent abuse of the FTS5 query syntax (e.g., injection of column filters or function calls).

gemini-code-assist · 2026-01-21T13:25:19Z

doc/MCP/FTS_USER_GUIDE.md

+    {
+      "schema": "sales",
+      "table": "orders",
+      "columns": ["order_id","customer_name","notes"],


The example response for fts_list_indexes has an incorrectly formatted columns field. It is shown as a single string ["order_id","customer_name","notes"] instead of a proper JSON array of strings. This could be misleading for users trying to parse the API response.

Suggested change

"columns": ["order_id","customer_name","notes"],

"columns": ["order_id", "customer_name", "notes"],

gemini-code-assist · 2026-01-21T13:25:19Z

lib/MySQL_FTS.cpp

+	// Collect all results from each index
+	json all_results = json::array();
+	int total_matches = 0;
+
+	for (std::vector<SQLite3_row*>::iterator it = indexes_result->rows.begin();
+	     it != indexes_result->rows.end(); ++it) {
+		SQLite3_row* row = *it;
+		const char* idx_schema = row->fields[0];
+		const char* idx_table = row->fields[1];
+
+		if (!idx_schema || !idx_table) continue;
+
+		std::string data_table = get_data_table_name(idx_schema, idx_table);
+		std::string fts_table = get_fts_table_name(idx_schema, idx_table);
+		std::string sanitized_data = data_table;
+
+		// Search query for this index (use table name for MATCH/bm25)
+		std::ostringstream search_sql;
+		search_sql << "SELECT d.schema_name, d.table_name, d.primary_key_value, "
+		    << "snippet(" << fts_table << ", 0, '<mark>', '</mark>', '...', 30) AS snippet, "
+		    << "d.metadata "
+		    << "FROM " << fts_table << " "
+		    << "JOIN " << sanitized_data << " d ON " << fts_table << ".rowid = d.rowid "
+		    << "WHERE " << fts_table << " MATCH '" << escape_sql(query) << "' "
+		    << "ORDER BY bm25(" << fts_table << ") ASC "
+		    << "LIMIT " << limit;
+
+		SQLite3_result* idx_resultset = NULL;
+		error = NULL;
+		cols = 0;
+		affected = 0;
+
+		idx_resultset = db->execute_statement(search_sql.str().c_str(), &error, &cols, &affected);
+
+		if (error) {
+			proxy_error("FTS search error on %s.%s: %s\n", idx_schema, idx_table, error);
+			(*proxy_sqlite3_free)(error);
+		}
+
+		if (idx_resultset) {
+			for (std::vector<SQLite3_row*>::iterator row_it = idx_resultset->rows.begin();
+			     row_it != idx_resultset->rows.end(); ++row_it) {
+				SQLite3_row* res_row = *row_it;
+
+				json match;
+				match["schema"] = res_row->fields[0] ? res_row->fields[0] : "";
+				match["table"] = res_row->fields[1] ? res_row->fields[1] : "";
+				match["primary_key_value"] = res_row->fields[2] ? res_row->fields[2] : "";
+
+				match["snippet"] = res_row->fields[3] ? res_row->fields[3] : "";
+
+				// Parse metadata JSON
+				try {
+					if (res_row->fields[4]) {
+						match["metadata"] = json::parse(res_row->fields[4]);
+					} else {
+						match["metadata"] = json::object();
+					}
+				} catch (const json::exception& e) {
+					match["metadata"] = res_row->fields[4] ? res_row->fields[4] : "";
+				}
+
+				all_results.push_back(match);
+				total_matches++;
+			}
+			delete idx_resultset;
+		}
+	}
+
+	delete indexes_result;
+	db->rdunlock();


The search function fetches results from all matching indexes into an in-memory all_results vector and then applies pagination. This can lead to excessive memory consumption if there are many matches across numerous indexes, as the initial LIMIT is applied per index. A more scalable approach would be to perform pagination at the database level, for instance by combining results with UNION ALL and applying a single ORDER BY and LIMIT/OFFSET to the final result set.

gemini-code-assist · 2026-01-21T13:25:19Z

lib/MySQL_Tool_Handler.cpp

+bool MySQL_Tool_Handler::reset_fts_path(const std::string& path) {
+	pthread_mutex_lock(&fts_lock);
+
+	if (fts) {
+		delete fts;
+		fts = NULL;
+	}
+
+	if (!path.empty()) {
+		fts = new MySQL_FTS(path);
+		if (fts->init()) {
+			proxy_error("Failed to initialize FTS with new path: %s\n", path.c_str());
+			delete fts;
+			fts = NULL;
+			pthread_mutex_unlock(&fts_lock);
+			return false;
+		}
+	}
+
+	pthread_mutex_unlock(&fts_lock);
+	return true;
+}


There are two functions for updating the FTS path: reset_fts_path and reinit_fts (lines 1142-1178). reset_fts_path is called by the configuration handler but lacks a check for the directory's existence. reinit_fts performs this check but appears to be unused. To improve robustness and eliminate redundant code, the directory existence check from reinit_fts should be integrated into reset_fts_path, and the unused reinit_fts function should be removed.

Copilot

Pull request overview

This pull request adds full-text search (FTS) capabilities to the MCP query server using SQLite's FTS5 extension with BM25 ranking. The implementation allows AI agents to quickly search MySQL table data through an external SQLite index without requiring native MySQL FTS support.

Changes:

Added 6 new FTS tools: fts_index_table, fts_search, fts_list_indexes, fts_delete_index, fts_reindex, and fts_rebuild_all
Implemented MySQL_FTS class for managing SQLite FTS5 indexes with BM25 ranking
Added thread-safe FTS operations with mutex protection and runtime path configuration
Included comprehensive test scripts and detailed user documentation

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 13 comments.

Show a summary per file

File	Description
scripts/mcp/test_mcp_fts_detailed.sh	Detailed FTS validation script testing all tools, runtime config changes, and edge cases
scripts/mcp/test_mcp_fts.sh	Comprehensive test suite with 21 test cases covering all FTS functionality
lib/Query_Tool_Handler.cpp	Registers 6 FTS tools in MCP protocol and routes calls to MySQL_Tool_Handler
lib/ProxySQL_MCP_Server.cpp	Passes FTS path from configuration to MySQL_Tool_Handler constructor
lib/MySQL_Tool_Handler.cpp	Wraps MySQL_FTS class with thread-safe mutex protection and runtime path reset
lib/MySQL_FTS.cpp	Core FTS implementation using SQLite FTS5 with indexing, search, and management
lib/Makefile	Adds MySQL_FTS.oo to build targets
lib/MCP_Thread.cpp	Handles mcp_fts_path configuration variable with runtime reload support
include/MySQL_Tool_Handler.h	Declares FTS integration methods and FTS pointer member
include/MySQL_FTS.h	Defines MySQL_FTS class interface for full-text search operations
include/MCP_Thread.h	Adds mcp_fts_path to configuration variables structure
doc/MCP/FTS_USER_GUIDE.md	Comprehensive 854-line user guide with architecture, API reference, and examples

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-01-21T13:30:48Z

lib/MySQL_FTS.cpp

+std::string MySQL_FTS::escape_sql(const std::string& str) {
+	std::string escaped;
+	for (size_t i = 0; i < str.length(); i++) {
+		if (str[i] == '\'') {
+			escaped += "''";
+		} else {
+			escaped += str[i];
+		}
+	}
+	return escaped;
+}


Potential SQL injection vulnerability: The escape_sql function only escapes single quotes, but table/column names are being directly interpolated into SQL statements without proper quoting or validation. While sanitize_name is used for some identifiers, the FTS search query at line 495 uses escape_sql(query) within a MATCH clause which could allow FTS query syntax injection. Consider using parameterized queries where possible, or implementing more robust escaping/validation for user-controlled inputs.

Copilot · 2026-01-21T13:30:49Z

lib/MySQL_Tool_Handler.cpp

+	// Check if directory exists (SQLite can't create directories)
+	std::string::size_type last_slash = fts_path.find_last_of("/");
+	if (last_slash != std::string::npos && last_slash > 0) {
+		std::string dir = fts_path.substr(0, last_slash);
+		struct stat st;
+		if (stat(dir.c_str(), &st) != 0 || !S_ISDIR(st.st_mode)) {
+			proxy_error("MySQL_Tool_Handler: Directory does not exist for path '%s' (directory: '%s')\n",
+				fts_path.c_str(), dir.c_str());
+			return -1;
+		}


Missing error handling: If stat() fails, the code returns -1, but if the directory exists but is not writable, this check won't catch it. Consider adding a write permission check or attempting to create a test file in the directory to verify write access before attempting to initialize the FTS database.

Copilot · 2026-01-21T13:30:49Z

lib/MySQL_Tool_Handler.cpp

+
+	if (new_fts->init() != 0) {
+		proxy_error("MySQL_Tool_Handler: Failed to initialize FTS at %s\n", fts_path.c_str());
+		delete new_fts;


Potential resource leak: If new_fts->init() fails at line 1164, the function deletes new_fts and returns -1 without keeping the old FTS handler. However, there's a comment "Return error WITHOUT closing old FTS" which is correct. But if the old fts is NULL (first initialization), the system will remain in a state where FTS is not initialized. Consider documenting this behavior more clearly or handling the null case explicitly.

Suggested change

delete new_fts;

delete new_fts;

// If this was the first initialization attempt (no previous FTS handler),

// the system remains without an initialized FTS instance.

if (!fts) {

proxy_error("MySQL_Tool_Handler: No previous FTS handler exists; FTS remains uninitialized after failure\n");

}

Copilot · 2026-01-21T13:30:49Z

lib/MySQL_FTS.cpp

+		json cols_json = json::parse(columns);
+		if (!cols_json.is_array()) {
+			result["error"] = "columns must be a JSON array";
+			return result.dump();
+		}
+	} catch (const json::exception& e) {
+		result["error"] = std::string("Invalid JSON in columns: ") + e.what();
+		return result.dump();
+	}
+
+	// Check if index already exists
+	if (index_exists(schema, table)) {
+		result["error"] = "Index already exists for " + schema + "." + table + ". Use fts_reindex to update.";
+		return result.dump();
+	}
+
+	// Create index tables
+	if (create_index_tables(schema, table) != 0) {
+		result["error"] = "Failed to create index tables";
+		return result.dump();
+	}
+
+	// Parse columns and build query (ensure primary key is selected)
+	std::vector<std::string> indexed_cols;
+	std::vector<std::string> selected_cols;
+	std::unordered_set<std::string> seen;
+
+	try {
+		json cols_json = json::parse(columns);
+		if (!cols_json.is_array()) {
+			result["error"] = "columns must be a JSON array";
+			return result.dump();
+		}
+		for (const auto& col : cols_json) {
+			std::string col_name = col.get<std::string>();
+			std::string col_lower = col_name;
+			std::transform(col_lower.begin(), col_lower.end(), col_lower.begin(), ::tolower);
+			indexed_cols.push_back(col_lower);
+			if (seen.insert(col_lower).second) {
+				selected_cols.push_back(col_name);
+			}
+		}
+	} catch (const json::exception& e) {
+		result["error"] = std::string("Failed to parse columns: ") + e.what();
+		return result.dump();
+	}


The columns parameter is parsed twice in the index_table method: once at lines 251-259 for validation, and again at lines 278-296 for actual processing. This is inefficient and could lead to inconsistencies if the JSON parsing behaves differently between the two calls. Consider parsing once and reusing the parsed result.

Copilot · 2026-01-21T13:30:49Z

scripts/mcp/test_mcp_fts.sh

+test_fts_search_special_chars() {
+    # Create a document with special characters
+    mysql_exec "INSERT INTO ${TEST_SCHEMA}.${TEST_TABLE} (title, content, category, priority) \
+        VALUES ('Special Chars', 'Test with @ # $ % ^ & * ( ) - _ = + [ ] { } | \\ : ; \" \" < > ? / ~', 'test', 'normal');" 2>/dev/null || true


In the test at line 921, special characters including backslashes are inserted into the database. The backslash should be escaped properly in the bash script string. The current approach might not correctly insert all special characters as intended. Consider using a here-document or different quoting mechanism for the SQL statement.

Suggested change

VALUES ('Special Chars', 'Test with @ # $ % ^ & * ( ) - _ = + [ ] { } | \\ : ; \" \" < > ? / ~', 'test', 'normal');" 2>/dev/null || true

VALUES ('Special Chars', 'Test with @ # $ % ^ & * ( ) - _ = + [ ] { } | \\\\ : ; \" \" < > ? / ~', 'test', 'normal');" 2>/dev/null || true

Copilot · 2026-01-21T13:30:50Z

lib/MySQL_Tool_Handler.cpp

+bool MySQL_Tool_Handler::reset_fts_path(const std::string& path) {
+	pthread_mutex_lock(&fts_lock);
+
+	if (fts) {
+		delete fts;
+		fts = NULL;
+	}
+
+	if (!path.empty()) {
+		fts = new MySQL_FTS(path);
+		if (fts->init()) {
+			proxy_error("Failed to initialize FTS with new path: %s\n", path.c_str());
+			delete fts;
+			fts = NULL;
+			pthread_mutex_unlock(&fts_lock);
+			return false;
+		}
+	}
+
+	pthread_mutex_unlock(&fts_lock);
+	return true;
+}


The reset_fts_path method acquires fts_lock, but reinit_fts does not. If both methods can be called from different contexts, there's a potential for inconsistent locking patterns. Consider either making reinit_fts acquire the lock, or documenting that it must only be called when the caller already holds the lock, or consolidating the logic to avoid duplication.

Copilot · 2026-01-21T13:30:51Z

lib/MySQL_FTS.cpp

+	json all_results = json::array();
+	int total_matches = 0;
+
+	for (std::vector<SQLite3_row*>::iterator it = indexes_result->rows.begin();
+	     it != indexes_result->rows.end(); ++it) {
+		SQLite3_row* row = *it;
+		const char* idx_schema = row->fields[0];
+		const char* idx_table = row->fields[1];
+
+		if (!idx_schema || !idx_table) continue;
+
+		std::string data_table = get_data_table_name(idx_schema, idx_table);
+		std::string fts_table = get_fts_table_name(idx_schema, idx_table);
+		std::string sanitized_data = data_table;
+
+		// Search query for this index (use table name for MATCH/bm25)
+		std::ostringstream search_sql;
+		search_sql << "SELECT d.schema_name, d.table_name, d.primary_key_value, "
+		    << "snippet(" << fts_table << ", 0, '<mark>', '</mark>', '...', 30) AS snippet, "
+		    << "d.metadata "
+		    << "FROM " << fts_table << " "
+		    << "JOIN " << sanitized_data << " d ON " << fts_table << ".rowid = d.rowid "
+		    << "WHERE " << fts_table << " MATCH '" << escape_sql(query) << "' "
+		    << "ORDER BY bm25(" << fts_table << ") ASC "
+		    << "LIMIT " << limit;
+
+		SQLite3_result* idx_resultset = NULL;
+		error = NULL;
+		cols = 0;
+		affected = 0;
+
+		idx_resultset = db->execute_statement(search_sql.str().c_str(), &error, &cols, &affected);
+
+		if (error) {
+			proxy_error("FTS search error on %s.%s: %s\n", idx_schema, idx_table, error);
+			(*proxy_sqlite3_free)(error);
+		}
+
+		if (idx_resultset) {
+			for (std::vector<SQLite3_row*>::iterator row_it = idx_resultset->rows.begin();
+			     row_it != idx_resultset->rows.end(); ++row_it) {
+				SQLite3_row* res_row = *row_it;
+
+				json match;
+				match["schema"] = res_row->fields[0] ? res_row->fields[0] : "";
+				match["table"] = res_row->fields[1] ? res_row->fields[1] : "";
+				match["primary_key_value"] = res_row->fields[2] ? res_row->fields[2] : "";
+
+				match["snippet"] = res_row->fields[3] ? res_row->fields[3] : "";
+
+				// Parse metadata JSON
+				try {
+					if (res_row->fields[4]) {
+						match["metadata"] = json::parse(res_row->fields[4]);
+					} else {
+						match["metadata"] = json::object();
+					}
+				} catch (const json::exception& e) {
+					match["metadata"] = res_row->fields[4] ? res_row->fields[4] : "";
+				}
+
+				all_results.push_back(match);
+				total_matches++;
+			}
+			delete idx_resultset;
+		}
+	}
+
+	delete indexes_result;
+	db->rdunlock();
+
+	// Apply pagination to collected results
+	int total_size = (int)all_results.size();
+	int start_idx = offset;
+	if (start_idx >= total_size) start_idx = total_size;
+	int end_idx = start_idx + limit;
+	if (end_idx > total_size) end_idx = total_size;
+
+	json paginated_results = json::array();
+	for (int i = start_idx; i < end_idx; i++) {
+		paginated_results.push_back(all_results[i]);
+	}


The search method performs iteration over potentially large result sets without limiting memory usage. If a search matches millions of rows across multiple indexes, the all_results array at line 473 could grow very large before pagination is applied at line 544. Consider applying the limit during collection rather than after, or implementing streaming/chunked result processing.

Copilot · 2026-01-21T13:30:51Z

lib/MySQL_FTS.cpp

+std::string MySQL_FTS::search(
+	const std::string& query,
+	const std::string& schema,
+	const std::string& table,
+	int limit,
+	int offset
+) {


Missing validation: The limit parameter in the search methods could potentially be negative or zero, which might cause unexpected behavior. Consider adding validation to ensure limit is positive (e.g., if (limit <= 0) limit = 100;) and similarly for offset to ensure it's non-negative.

Copilot · 2026-01-21T13:30:51Z

scripts/mcp/test_mcp_fts.sh

+mysql_exec() {
+    local sql="$1"
+    mysql -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u "${MYSQL_USER}" -p"${MYSQL_PASSWORD}" \
+        -e "${sql}" 2>/dev/null


The test script uses set -e which causes the script to exit on any error. However, the mysql_exec function at line 279 redirects stderr to /dev/null, which might hide important error messages. Consider allowing errors to be visible during test execution, or at least logging them for debugging purposes.

Suggested change

-e "${sql}" 2>/dev/null

-e "${sql}"

Copilot · 2026-01-21T13:30:51Z

lib/MySQL_FTS.cpp

+	char path_buf[db_path.size() + 1];
+	strcpy(path_buf, db_path.c_str());


The use of variable-length array (VLA) char path_buf[db_path.size() + 1] is not standard in C++ and is a C99 feature that may not be supported by all compilers. This can lead to portability issues. Consider using std::string directly with the SQLite3DB API, or dynamically allocate the buffer with new[] and delete[], or use std::vector<char>.

…leanup Security fixes: - Add escape_identifier() helper for proper SQLite identifier escaping - Replace sanitize_name() with allowlist validation (ASCII letters, digits, underscore only) - Fix MATCH clause FTS5 operator injection by wrapping query in double quotes - Apply escape_identifier() to all DDL statements (CREATE, DROP, triggers) Memory safety fixes: - Replace VLA with std::vector in MySQL_FTS::init(), add delete on error path - Fix memory leak: free error string before return in list_indexes() - Fix reindex_json["error"] potential exception using .value() with default Thread safety fixes: - reinit_fts(): Add mutex lock around pointer swap - reset_fts_path(): Move blocking init() outside lock, only swap pointer under lock Code cleanup: - Remove 7 debug fprintf statements from Query_Tool_Handler.cpp - Remove unused #include <memory> from MySQL_FTS.h Test script security fixes: - Use MYSQL_PWD environment variable instead of -p"..." for password - Add escape_sql() function and apply to INSERT statement - Fix CURL_OPTS quoting: ${CURL_OPTS:+"${CURL_OPTS}"} - Remove unused FTS_INDEX_NAME and SEARCH_QUERIES variables Documentation fixes: - Fix bare URL to markdown link format - Add code block language identifiers (text, bash)

Fix PR #21 Review: Security, Memory Safety, Thread Safety + Additional Fixes

Add FTS_User_Guide.md

8e2230c

coderabbitai bot reviewed Jan 20, 2026

View reviewed changes

Merge remote-tracking branch 'v3.1-vec' into v3.1_fts-support

18dd249

coderabbitai bot reviewed Jan 20, 2026

View reviewed changes

renecannao requested a review from Copilot January 21, 2026 13:19

Copilot started reviewing on behalf of renecannao January 21, 2026 13:19 View session

gemini-code-assist bot reviewed Jan 21, 2026

View reviewed changes

renecannao changed the base branch from v3.1-vec to v3.1-vec_21 January 21, 2026 13:25

Copilot AI reviewed Jan 21, 2026

View reviewed changes

renecannao merged commit a831670 into ProxySQL:v3.1-vec_21 Jan 21, 2026
7 checks passed

renecannao mentioned this pull request Jan 21, 2026

Fix PR #21 Review: Security, Memory Safety, Thread Safety + Additional Fixes #25

Merged

renecannao added a commit that referenced this pull request Jan 21, 2026

Merge pull request #25 from ProxySQL/v3.1-vec_21

f257c6a

Fix PR #21 Review: Security, Memory Safety, Thread Safety + Additional Fixes

		current_path=$(mysql -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u "${MYSQL_USER}" -p"${MYSQL_PASSWORD}" \
		-e "SELECT @@mcp-fts_path" -s -N 2>/dev/null \| tr -d '\r')

	- GitHub: https://github.com/ProxySQL/proxysql-vec
	- GitHub: [ProxySQL/proxysql-vec](https://github.com/ProxySQL/proxysql-vec)

	"columns": ["order_id","customer_name","notes"],
	"columns": ["order_id", "customer_name", "notes"],

-		delete new_fts;
+		delete new_fts;
+		// If this was the first initialization attempt (no previous FTS handler),
+		// the system remains without an initialized FTS instance.
+		if (!fts) {
+			proxy_error("MySQL_Tool_Handler: No previous FTS handler exists; FTS remains uninitialized after failure\n");
+		}

	VALUES ('Special Chars', 'Test with @ # $ % ^ & * ( ) - _ = + [ ] { } \| \\ : ; \" \" < > ? / ~', 'test', 'normal');" 2>/dev/null \|\| true
	VALUES ('Special Chars', 'Test with @ # $ % ^ & * ( ) - _ = + [ ] { } \| \\\\ : ; \" \" < > ? / ~', 'test', 'normal');" 2>/dev/null \|\| true

		char path_buf[db_path.size() + 1];
		strcpy(path_buf, db_path.c_str());

Conversation

rahim-kanji commented Jan 20, 2026

Summary

New Tools Added

Key Features

Technical Details

Test Plan

Uh oh!

coderabbitai bot commented Jan 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

renecannao commented Jan 21, 2026

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

coderabbitai bot commented Jan 20, 2026 •

edited

Loading