A polyglot full-stack application that translates natural language into executable shell commands using offline Vector Embeddings.
CmdGen is a tool designed to bridge the gap between human intent and system execution. Unlike simple API wrappers, CmdGen utilizes a local semantic search engine powered by Sentence-Transformers (BERT) to map natural language queries (e.g., "Find all files larger than 100MB") to precise Linux commands.
This project demonstrates a Microservices Architecture where a Java Spring Boot gateway orchestrates communication between a high-performance React UI and a Python Inference Service.
The system consists of three decoupled microservices:
- Frontend (React + Vite): A reactive UI for user interaction and real-time command visualization.
- Backend (Java Spring Boot): Acts as the API Gateway and Business Logic layer, handling validation and routing.
- Inference Engine (Python FastAPI): A dedicated AI service that uses Vector Embeddings (Cosine Similarity) to match queries against a dataset of 500+ optimized shell commands.
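The request flow between the gateway and the inference service can be sketched in plain Python. This is an illustrative stand-in, not the project's actual code: the endpoint, field names (`query`, `command`, `score`), and the canned result inside `infer` are assumptions made for the sketch.

```python
# Hypothetical sketch of the gateway -> inference contract.
# Names and payload fields are illustrative, not the real API.

def infer(payload: dict) -> dict:
    """Stand-in for the FastAPI inference endpoint: accepts a natural
    language query and returns the best-scoring shell command.
    The real service embeds the query and runs cosine similarity;
    here the result is canned for illustration."""
    query = payload["query"]
    return {"command": "find . -type f -size +100M", "score": 0.91}

def gateway_generate(query: str) -> dict:
    """The Spring Boot gateway's role, sketched in Python:
    validate the request, then route it to the inference service."""
    if not query or not query.strip():
        raise ValueError("query must be non-empty")
    return infer({"query": query})
```

The point of the split is that the gateway owns validation and routing while the inference service owns the (comparatively heavy) embedding work, so either side can be scaled or swapped independently.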
| Component | Technology |
|---|---|
| Frontend | React.js, Vite, Tailwind CSS (optional) |
| Backend | Java 17, Spring Boot 3, Maven |
| AI / ML | Python, FastAPI, Pandas, Scikit-Learn, Sentence-Transformers (HuggingFace) |
| Data | CSV (Flat file), Vector Embeddings (In-memory) |
Follow these steps to run the complete system locally.
- Python 3.9+
- Java JDK 17+
- Node.js & npm
The AI service must be running first to handle inference requests.
cd model_service
# Create and activate virtual environment
python -m venv .venv
# Windows:
.venv\Scripts\activate
# Mac/Linux:
source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Run the service (Port 8000)
python ai_service.py
Note: On the first run, it will automatically download the all-MiniLM-L6-v2 model.
Open a new terminal.
cd backend
# Run using Maven Wrapper (Port 8080)
# Windows:
mvnw.cmd spring-boot:run
# Mac/Linux:
./mvnw spring-boot:run
Open a new terminal.
cd ui/cmdgen-frontend
# Install dependencies
npm install
# Run the development server
npm run dev
Instead of relying on slow and expensive external APIs (like OpenAI), CmdGen uses Semantic Search:
- Initialization: On startup, the Python service loads linux_commands.csv and converts all descriptions into high-dimensional vectors using all-MiniLM-L6-v2.
- Querying: When a user types a request, the input is converted into a vector.
- Matching: The system calculates the Cosine Similarity between the input vector and the dataset vectors.
- Result: The command with the highest similarity score is returned with <200ms latency.
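The matching step above can be sketched in plain Python. The real service encodes text with all-MiniLM-L6-v2 via Sentence-Transformers; in this sketch, hand-picked toy vectors stand in for those embeddings, and the function names are illustrative rather than the actual ai_service.py API.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy stand-ins for the embeddings the service would build
# from the descriptions in linux_commands.csv.
COMMAND_VECTORS = {
    "find . -type f -size +100M": [0.9, 0.1, 0.2],
    "df -h": [0.1, 0.8, 0.3],
    "grep -r 'pattern' .": [0.2, 0.3, 0.9],
}

def best_match(query_vector):
    """Return (command, score) for the embedding closest to the query."""
    scores = {
        cmd: cosine_similarity(query_vector, vec)
        for cmd, vec in COMMAND_VECTORS.items()
    }
    command = max(scores, key=scores.get)
    return command, scores[command]
```

Because the dataset embeddings are computed once at startup and held in memory, each query costs only one encoding pass plus a similarity scan over ~500 vectors, which is what keeps latency low.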
CmdGen/
├── backend/ # Spring Boot Application
├── model_service/ # Python FastAPI AI Service
│ ├── ai_service.py # Inference Logic
│ └── requirements.txt # Python dependencies
├── ui/cmdgen-frontend/ # React Application
├── linux_commands.csv # Dataset for command mapping
└── README.md # Documentation
Pranay Kelotra