MedParsWell is a modular FastAPI-based API and service layer for wrapping CLI-based LLM backends such as ik_llama.cpp. It is designed to expose model metadata and options via schemas compatible with downstream tools like Gradio and future UI layers.
- 🧠 LLM inference shell wrapper (supports `ik_llama.cpp`)
- 🛠️ Schema-driven API metadata exposure (Pydantic v2, with `json_schema_extra` for Gradio compatibility; see the sketch after this list)
- 🪵 Configurable logging with console and file output
- 🧾 Environment-variable-based configuration via `.env`
- 🔁 Designed for compatibility with Gradio dynamic forms
- 📡 Remote execution support via subprocess shell calls
- 📦 Structured, extensible architecture with FastAPI
- 🧪 Testable CLI inference layer with mocking support
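As an illustration of the schema-driven metadata exposure, the sketch below shows how a Pydantic v2 model can attach UI hints through `json_schema_extra`. The field names and the `x_gradio` hint namespace are illustrative assumptions, not the project's actual schema.

```python
# Minimal sketch (illustrative, not MedParsWell's actual schema): a Pydantic v2
# options model whose JSON schema carries extra hints that a Gradio form
# builder could use to pick components.
import json

from pydantic import BaseModel, Field


class InferenceOptions(BaseModel):
    """Hypothetical options for a CLI-backed LLM call."""

    prompt: str = Field(..., description="Prompt text sent to the model")
    temperature: float = Field(
        0.8,
        ge=0.0,
        le=2.0,
        description="Sampling temperature",
        json_schema_extra={"x_gradio": {"component": "Slider", "step": 0.05}},
    )
    n_predict: int = Field(
        256,
        ge=1,
        description="Maximum number of tokens to generate",
        json_schema_extra={"x_gradio": {"component": "Number"}},
    )


if __name__ == "__main__":
    # The exported JSON schema (including the x_gradio hints) is what a
    # downstream UI layer would consume.
    print(json.dumps(InferenceOptions.model_json_schema(), indent=2))
```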
To get started:

```bash
git clone git@github.com:Yummyfudge/medparswell.git
cd medparswell

# Create the conda environment
conda env create -f environment.yml
conda activate medparswell

# Set environment variables (edit to match your system)
cp .env.sample .env

# Launch the dev server
uvicorn app.main:app --reload
```

A sample `.env.sample` file is provided to show the required and optional environment variables used by the app. Copy this file to `.env` and adjust the paths as needed:

```bash
cp .env.sample .env
```
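As a rough illustration of how `.env`-driven configuration can be loaded, a pydantic-settings based loader might look like the sketch below. The variable names (`LLAMA_CLI_PATH`, `MODEL_PATH`, `LOG_LEVEL`) and defaults are hypothetical, and MedParsWell's actual settings module may be organized differently.

```python
# Illustrative sketch only: loading .env-based configuration with
# pydantic-settings. The variable names below are hypothetical and are not
# taken from MedParsWell's real .env.sample.
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    # Read values from the process environment, falling back to .env.
    model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8")

    llama_cli_path: str = "/usr/local/bin/llama-cli"  # hypothetical default
    model_path: str = "/models/model.gguf"            # hypothetical default
    log_level: str = "INFO"


# e.g. LLAMA_CLI_PATH=/opt/ik_llama/llama-cli overrides the default above.
settings = Settings()
```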
The repository is organized as follows:

```
medparswell/
├── .env.sample
├── app/
│ ├── config/
│ │ └── logging_config.py
│ ├── routes/
│ ├── schemas/
│ └── services/
├── tests/
│ ├── integration/
│ ├── routes/
│ ├── schemas/
│ ├── services/
│ └── conftest.py
├── notes/
├── docs/
├── to_linux.sh
├── environment.yml
└── README.md
```

Run the test suite with:

```bash
pytest -v
```

The tests support full mocking and isolated CLI testing. One integration test is skipped unless the FastAPI server is running.
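As a rough sketch of how the CLI inference layer can be tested in isolation, a subprocess call can be mocked as shown below. The `run_inference` helper and its arguments are hypothetical stand-ins for a service-layer function that shells out to `ik_llama.cpp`, not MedParsWell's actual API.

```python
# Illustrative sketch only: unit-testing a CLI-backed inference helper by
# mocking subprocess.run, so no real model or binary is needed.
import subprocess
from unittest import mock


def run_inference(cli_path: str, model_path: str, prompt: str) -> str:
    """Hypothetical service helper: invoke the CLI backend and return stdout."""
    result = subprocess.run(
        [cli_path, "-m", model_path, "-p", prompt],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def test_run_inference_is_mockable():
    fake = subprocess.CompletedProcess(args=[], returncode=0, stdout="Hello!", stderr="")
    with mock.patch("subprocess.run", return_value=fake) as mocked:
        out = run_inference("/fake/llama-cli", "/fake/model.gguf", "Say hi")
    assert out == "Hello!"
    mocked.assert_called_once()
```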
Completed so far:

- Route scaffolding & health endpoints (see the sketch after these lists)
- API schema metadata exposure
- CLI integration with `ik_llama.cpp`

Planned:

- Add Gradio UI compatibility
- Support full CLI argument mapping
- Implement request batching & streaming
- Local/remote inference split
- Add persistent slot caching
- CLI embedding-only mode support
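For orientation, a minimal FastAPI router along the lines of the health and schema-metadata endpoints might look like the following. The route paths, model, and names are illustrative assumptions, not MedParsWell's actual routes.

```python
# Illustrative sketch only: a FastAPI router with a health endpoint and a
# schema-metadata endpoint. The paths and the InferenceOptions model are
# hypothetical and do not mirror MedParsWell's actual modules.
from fastapi import APIRouter, FastAPI
from pydantic import BaseModel, Field


class InferenceOptions(BaseModel):
    prompt: str = Field(..., description="Prompt text sent to the model")
    temperature: float = Field(0.8, ge=0.0, le=2.0)


router = APIRouter()


@router.get("/health")
def health() -> dict:
    # Lightweight liveness check for the service layer.
    return {"status": "ok"}


@router.get("/metadata/options")
def options_metadata() -> dict:
    # Expose the JSON schema (including any json_schema_extra hints) so a
    # Gradio front end can build its form dynamically.
    return InferenceOptions.model_json_schema()


app = FastAPI(title="medparswell (sketch)")
app.include_router(router)
```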
This project is licensed under the MIT License.
This project interfaces with the following external tools:
- `ik_llama.cpp`: a high-performance fork of `llama.cpp` for running GGUF models locally, used by MedParsWell as the backend inference engine via CLI subprocess calls. See its repository for licensing and attribution details.

Thanks also to the Level1Techs community and Wendell for the inspiration, ideas, and support that helped shape the goals of this project.