This project provides a powerful Command Line Interface (CLI) for interacting with MindsDB, with a special focus on its Knowledge Base features and AI Agent integration. It also includes a suite of scripts for performance benchmarking, stress testing, and evaluating MindsDB's reranking capabilities.
- CLI (`main.py`):
  - Manage MindsDB datasources (e.g., setup HackerNews).
  - Create, index, and query Knowledge Bases.
  - Ingest data into Knowledge Bases from sources like HackerNews.
  - Create and query AI Agents linked to Knowledge Bases (e.g., using Google Gemini).
  - Automate ingestion using MindsDB Jobs.
  - Create and query general AI models/tables (e.g., using Google Gemini for classification).
- Reporting Scripts (`src/report_scripts/`):
  - Performance Benchmarking: Measure ingestion times and query latencies.
  - Stress Testing: Test system stability under heavy load.
  - Reranking Evaluation: Compare search results with and without reranking.
- Docker Support: Includes a `Dockerfile` to build and run the CLI tool in a containerized environment.
Prerequisites:

- Python 3.8+
- Access to a running MindsDB instance (local or cloud).
- Ensure Ollama is running and accessible if you plan to use the default Ollama models for embeddings/reranking (e.g., `ollama pull nomic-embed-text`, `ollama pull llama3`).
- Google Gemini API Key if using Google Gemini models for AI Agents or general AI tables.
- (Optional) Docker for containerized execution.
Installation:

- Clone the repository (if you haven't already):

  ```bash
  git clone https://github.com/yashksaini-coder/Kleos
  cd Kleos
  ```
- Create a virtual environment (recommended):

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

  Or, using uv:

  ```bash
  uv venv
  source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  ```
- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

  Or, using uv:

  ```bash
  uv pip install -r requirements.txt
  ```
The application requires configuration for connecting to MindsDB and for API keys.
- Copy the example configuration file:

  ```bash
  cp config/config.py.example config/config.py
  ```

  Alternatively, set environment variables directly. The application prioritizes environment variables.
- Edit `config/config.py` or set environment variables:
  - MindsDB Connection: `MINDSDB_HOST`, `MINDSDB_PORT`, `MINDSDB_USER` (optional), `MINDSDB_PASSWORD` (optional).
  - Google Gemini API Key (required for Google AI features): `GOOGLE_GEMINI_API_KEY`.
  - Ollama Settings (if using default models): `OLLAMA_BASE_URL`, `OLLAMA_EMBEDDING_MODEL`, `OLLAMA_RERANKING_MODEL`.
- Using a `.env` file (recommended for local development): create a `.env` file in the project root (add this file to `.gitignore`):

  ```
  MINDSDB_HOST="http://127.0.0.1"
  MINDSDB_PORT=47334
  GOOGLE_GEMINI_API_KEY="your_gemini_key"
  OLLAMA_BASE_URL="http://127.0.0.1:11434"
  # Add other variables as needed
  ```
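To make the environment-first precedence concrete, here is a minimal sketch of a lookup helper (the import path and helper name are illustrative, not the project's actual loader):

```python
# Illustrative config lookup: environment variables win over config/config.py values.
import os

try:
    from config import config as file_config  # hypothetical import of config/config.py
except ImportError:
    file_config = None

def get_setting(name, default=None):
    """Return the env var if set, else the config.py attribute, else a default."""
    value = os.getenv(name)
    if value is not None:
        return value
    return getattr(file_config, name, default) if file_config else default

MINDSDB_HOST = get_setting("MINDSDB_HOST", "http://127.0.0.1")
MINDSDB_PORT = int(get_setting("MINDSDB_PORT", 47334))
```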
MindsDB Setup:

- Install MindsDB (if not already installed). You can install it via pip:

  ```bash
  pip install mindsdb
  ```

  Or, if you prefer Docker, run the official MindsDB image:

  ```bash
  docker run -p 47334:47334 mindsdb/mindsdb
  ```

- Verify MindsDB is running. You can check this by opening the MindsDB UI in your web browser at `http://127.0.0.1:47334/`.

- (Optional) Install Ollama models. If you plan to use Ollama for embeddings or reranking, ensure the required models are installed:

  ```bash
  ollama pull nomic-embed-text
  ollama pull llama3
  ```
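If you prefer to check connectivity from a script rather than the browser, here is a minimal sketch using the `mindsdb_sdk` package (`pip install mindsdb_sdk`; a generic check under the default local settings, not part of the CLI):

```python
# Quick connectivity check against a local MindsDB instance.
import mindsdb_sdk

server = mindsdb_sdk.connect("http://127.0.0.1:47334")
# Listing datasources is a cheap way to confirm the HTTP API is reachable.
print([db.name for db in server.list_databases()])
```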
The main CLI application is run using `python main.py`.

General Help:

```bash
python main.py --help
```

This lists the available command groups: `ai`, `job`, `kb`, `setup`.
Important Note on JSON Parameters:
When passing complex parameters as JSON strings via options like `--metadata-map`, `--metadata-filter`, or `--agent-params`, ensure they are correctly quoted and escaped for your specific command-line shell. Incorrect quoting is a common source of errors.

- For Windows Command Prompt (`cmd.exe`): enclose the entire JSON string in double quotes (`"`) and escape all inner double quotes with a backslash (`\"`). Example: `--metadata-map "{\"key\":\"value\"}"`
- For Linux/macOS (bash, zsh) or PowerShell: enclosing the JSON string in single quotes (`'`) is usually sufficient. Example: `--metadata-map '{"key":"value"}'`
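Once the string survives the shell, it still has to be valid JSON; a small sketch of the kind of validation a CLI can apply before using such an option (the helper name is illustrative, not the project's actual code):

```python
# Validate a JSON string passed on the command line and fail with a readable
# message instead of a raw traceback.
import json
import sys

def parse_json_option(raw, option_name):
    try:
        value = json.loads(raw)
    except json.JSONDecodeError as exc:
        sys.exit(f"Invalid JSON for {option_name}: {exc}. Check your shell's quoting rules.")
    if not isinstance(value, dict):
        sys.exit(f'{option_name} must be a JSON object, e.g. {{"key":"value"}}')
    return value

# Example:
# parse_json_option('{"author":"some_user"}', "--metadata-filter")
```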
Knowledge Base Improvements: The Knowledge Base commands have been enhanced and thoroughly tested with:
- Auto-detection of sensible content and metadata columns for HackerNews tables (stories, comments)
- Improved error handling with clear, actionable error messages for common issues
- Robust JSON parsing with platform-specific examples for Windows CMD, PowerShell, and Unix shells
- Automatic datasource creation when the HackerNews datasource is missing
- Enhanced column mapping flexibility for custom use cases and data sources
- Validated SQL generation ensuring correct `INSERT INTO ... SELECT` syntax for MindsDB (see the sketch below)
- Windows compatibility with proper JSON escaping examples and testing
All command examples have been tested on Windows Command Prompt and include proper JSON escaping patterns.
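For reference, the ingestion SQL follows MindsDB's `INSERT INTO <kb> SELECT ...` pattern; a hedged sketch of issuing it through the `mindsdb_sdk` package (datasource and column names are illustrative, and knowledge-base details can vary between MindsDB versions):

```python
# Sketch: ingest HackerNews stories into a knowledge base by asking MindsDB
# to run an INSERT INTO ... SELECT against the datasource.
import mindsdb_sdk

server = mindsdb_sdk.connect("http://127.0.0.1:47334")
server.query(
    """
    INSERT INTO my_documents_kb
    SELECT id, title, text, `by`, score
    FROM hackernews_db.stories
    LIMIT 100;
    """
).fetch()
```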
Detailed Command Reference: For detailed information on all commands, options, and comprehensive examples for different platforms, please consult the COMMANDS_REFERENCE.md file.
Quick Examples:
- Setup HackerNews Datasource:

  ```bash
  python main.py setup hackernews --name hackernews_db
  ```
- List Available Databases:

  ```bash
  python main.py kb list-databases
  ```
- Create a Knowledge Base with Custom Columns:

  ```bash
  # Basic KB creation
  python main.py kb create my_documents_kb --embedding-model nomic-embed-text --reranking-model llama3

  # KB creation with custom content and metadata columns
  python main.py kb create hn_stories_kb --embedding-model nomic-embed-text --reranking-model llama3 --content-columns "title,text" --metadata-columns "id,by,score,time" --id-column id
  ```
- Ingest Data into a Knowledge Base:

  Windows Command Prompt (`cmd.exe`):

  ```bat
  REM Simple ingestion with auto-detected columns (for HackerNews tables)
  python main.py kb ingest my_documents_kb --from-hackernews stories --limit 50

  REM Custom content and metadata mapping
  python main.py kb ingest my_documents_kb --from-hackernews stories --content-columns title --metadata-map "{\"story_id\":\"id\", \"author\":\"by\", \"score\":\"score\"}" --limit 100
  ```

  PowerShell/Bash/Zsh:

  ```bash
  # Simple ingestion with auto-detected columns (for HackerNews tables)
  python main.py kb ingest my_documents_kb --from-hackernews stories --limit 50

  # Custom content and metadata mapping
  python main.py kb ingest my_documents_kb --from-hackernews stories --content-columns title --metadata-map '{"story_id":"id", "author":"by", "score":"score"}' --limit 100
  ```
- Query a Knowledge Base:

  ```bash
  # Basic query
  python main.py kb query my_documents_kb "latest trends in AI"

  # Query with metadata filter (PowerShell/Bash/Zsh)
  python main.py kb query my_documents_kb "search specific topic" --metadata-filter '{"author":"some_user"}'
  ```

  Windows Command Prompt with metadata filter:

  ```bat
  python main.py kb query my_documents_kb "search specific topic" --metadata-filter "{\"author\":\"some_user\"}"
  ```
- Create an AI Agent for a Knowledge Base:

  PowerShell/Bash/Zsh:

  ```bash
  python main.py kb create-agent my_kb_agent my_documents_kb --model-name gemini-pro --agent-params '{"temperature":0.2, "prompt_template":"Answer questions based on the KB."}'
  ```

  Windows Command Prompt:

  ```bat
  python main.py kb create-agent my_kb_agent my_documents_kb --model-name gemini-pro --agent-params "{\"temperature\":0.2, \"prompt_template\":\"Answer questions based on the KB.\"}"
  ```
- Query the AI Agent:

  ```bash
  python main.py kb query-agent my_kb_agent "Summarize articles about Python."
  ```
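If you want to script the same kind of lookup without the CLI, the query can be issued directly to MindsDB; a minimal sketch using the `mindsdb_sdk` package (the `WHERE content = ...` semantic-search form follows MindsDB's knowledge-base query pattern, but exact syntax can vary by version):

```python
# Sketch: semantic query against a knowledge base, bypassing the CLI.
import mindsdb_sdk

server = mindsdb_sdk.connect("http://127.0.0.1:47334")
results = server.query(
    "SELECT * FROM my_documents_kb WHERE content = 'latest trends in AI' LIMIT 5;"
).fetch()
print(results)
```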
The reporting scripts in `src/report_scripts/` are standalone; each performs a specific evaluation and generates a JSON report.
- Run them from the project root directory. Use the `--help` flag on any script to see its specific options:

  ```bash
  python src/report_scripts/benchmark_report.py --help
  ```
- Performance Benchmark Report:

  ```bash
  python src/report_scripts/benchmark_report.py --kb-name benchmark_test --ingestion-sizes 100 500
  ```
- Stress Test Report:

  ```bash
  python src/report_scripts/stress_test_report.py --kb-name stress_test_run --initial-load 1000
  ```
- Reranking Evaluation Report:

  ```bash
  python src/report_scripts/reranking_eval_report.py --kb-no-reranker kb_baseline --kb-with-reranker kb_reranked
  ```

  Remember to manually fill in the `"qualitative_analysis_notes"` field in the generated JSON report for this script.
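One quick way to add those notes without hand-editing the file (the report filename below is illustrative; use whatever path the script writes):

```python
# Fill in the qualitative notes of a generated reranking report.
import json
from pathlib import Path

report_path = Path("reranking_eval_report.json")  # illustrative filename
report = json.loads(report_path.read_text())
report["qualitative_analysis_notes"] = "Reranked results surfaced more on-topic stories for niche queries."
report_path.write_text(json.dumps(report, indent=2))
```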
A `Dockerfile` is provided to build and run the CLI application.

- Build the Docker image:

  ```bash
  docker build -t mindsdb-cli-app .
  ```

- Run the CLI tool using Docker. Use `--env-file` to pass your `.env` file for configuration:

  ```bash
  docker run -it --rm --env-file .env mindsdb-cli-app kb query my_docker_kb "search via docker"
  ```

  Or pass environment variables individually:

  ```bash
  docker run -it --rm -e MINDSDB_HOST="http://host.docker.internal:47334" mindsdb-cli-app --help
  ```

  (Note: `host.docker.internal` can be used to access services running on your host machine from Docker Desktop.)
Detailed Documentation:
- For a comprehensive guide to all CLI commands, options, and examples, please see COMMANDS_REFERENCE.md.
- For an in-depth article covering project architecture, workflow, and code explanations, refer to ARTICLE.md.
The core logic for MindsDB interactions is encapsulated in `src/core/mindsdb_handler.py`. CLI commands are defined in modules within `src/commands/`.
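For orientation only, here is a purely illustrative skeleton of how such a handler and a command module typically fit together (names and signatures are hypothetical, not the project's actual source):

```python
# Hypothetical layering sketch: a handler wraps MindsDB calls, and CLI command
# modules delegate to it. Not the project's actual code.
import mindsdb_sdk

class MindsDBHandler:
    def __init__(self, host="http://127.0.0.1:47334"):
        self.server = mindsdb_sdk.connect(host)

    def query_kb(self, kb_name, question, limit=5):
        # Naive string interpolation for brevity; real code should sanitize inputs.
        sql = f"SELECT * FROM {kb_name} WHERE content = '{question}' LIMIT {limit};"
        return self.server.query(sql).fetch()

# A command module in src/commands/ would parse CLI arguments, call
# MindsDBHandler().query_kb(...), and print the results.
```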
Contributions are welcome! Please follow standard fork-and-pull-request workflows. Ensure documentation is updated for any new features or changes.
Author: 👋 Hi there! I'm Yash K. Saini, a self-taught software developer and a computer science student from India.