CogStack · alhendrickson · May 27, 2026
diff --git a/medcat-service/README.md b/medcat-service/README.md
diff --git a/medcat-service/docs/assets/demo-anoncat.png b/medcat-service/docs/assets/demo-anoncat.png
diff --git a/medcat-service/docs/assets/demo-api.webm b/medcat-service/docs/assets/demo-api.webm
diff --git a/medcat-service/docs/assets/demo-medcat.png b/medcat-service/docs/assets/demo-medcat.png
diff --git a/medcat-service/docs/index.md b/medcat-service/docs/index.md
@@ -0,0 +1,40 @@
+# MedCAT Service documentation
+
+MedCAT Service is a REST API for serving [MedCAT](https://github.com/CogStack/cogstack-nlp/blob/main/medcat-v2/) models, allowing you to perform named entity recognition and linking (NER+L) and de-identification of medical text over HTTP.
+
+Feel free to ask questions on the GitHub issue tracker or on our [Discourse forum](https://discourse.cogstack.org), which is frequently used by our development team!
+
+## Demo
+<video loop muted playsinline controls style="max-width: 100%; height: auto;">
+  <source src="assets/demo-api.webm" type="video/webm">
+</video>
+
+## Features
+
+### Clinical NLP over HTTP
+
+- **Single-document processing** — `POST /api/process` accepts free-text clinical notes and returns entities with CUIs, types, spans, and meta-annotations (for example negation and subject).
+- **Bulk processing** — `POST /api/process_bulk` processes many documents in one request, with parallel annotation via configurable worker threads (`APP_BULK_NPROC`).
+- **Meta-annotation filtering** — Optional `meta_anns_filters` on `/api/process` let you return only entities that match selected meta-annotation values (for example affirmed presence or patient subject).
+- **AnonCAT De-identification** - Detect and redact identifiable information in clinical text by loading an AnonCAT model. See [API examples](user-guide/api-example-use.md).
+
+### Demo UI
+
+- **Interactive model trial** — A user-friendly UI to try out MedCAT models and see the results. See [Demo UI](user-guide/demo-ui.md).
+
+### Operations and observability
+
+- **Health endpoints** — Kubernetes-friendly liveness (`/api/health/live`) and readiness (`/api/health/ready`) checks that verify the model is loaded.
+- **Service metadata** — `GET /api/info` returns application name, language, version, and loaded model details.
+- **Metrics** — Optional Prometheus metrics exposed on `/metrics` when `APP_ENABLE_METRICS=True`.
+- **Tracing** — Distributed tracing export for production deployments. See [Configuration](setup/configuration.md#telemetry).
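+
+As a rough sketch of how the two health endpoints could be wired into a Kubernetes pod spec, assuming the container listens on the default port `5000`. The container name, image tag, and timing values are illustrative placeholders, not settings taken from the Helm chart:
+
+```yaml
+# Hypothetical container spec fragment; only the two probe paths come from this page.
+containers:
+  - name: medcat-service
+    image: cogstacksystems/medcat-service:latest
+    ports:
+      - containerPort: 5000
+    livenessProbe:
+      httpGet:
+        path: /api/health/live
+        port: 5000
+      # assumed interval
+      periodSeconds: 10
+    readinessProbe:
+      httpGet:
+        path: /api/health/ready
+        port: 5000
+      # assumed values; the readiness check only passes once the model is loaded
+      initialDelaySeconds: 30
+      periodSeconds: 10
+```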
+
+
+## Get started
+
+| Topic | Guide |
+|-------|-------|
+| Install with Helm or Docker Compose | [Installation](setup/installation.md) |
+| Environment variables and tuning | [Configuration](setup/configuration.md) |
+| `curl` examples for process and bulk APIs | [API example use](user-guide/api-example-use.md) |
+| Try models in the browser | [Demo UI](user-guide/demo-ui.md) |
diff --git a/medcat-service/docs/pyproject.toml b/medcat-service/docs/pyproject.toml
@@ -0,0 +1,19 @@
+[project]
+name = "medcat-service-documentation"
+version = "0.1.0"
+description = "MedCAT Service Documentation"
+readme = "README.md"
+requires-python = ">=3.10"
+dependencies = [
+    "mkdocs-material>=9.7.0",
+    "termynal>=0.13.1",
+]
+
+[project.optional-dependencies]
+dev = [
+    "ruff>=0.12.11",
+]
+
+[tool.ruff]
+line-length = 120
+
diff --git a/medcat-service/docs/reference/extensions.md b/medcat-service/docs/reference/extensions.md
@@ -0,0 +1,5 @@
+# Extending MedCAT Service
+
+## spaCy models
+
+When using MedCAT for a language other than English, you may need a different spaCy model. A spaCy model can be bundled in the MedCAT model pack; if your model pack does not include one, you can install the required models in the Docker image by setting a build-time variable. See the `SPACY_MODELS` variable in the [Dockerfile](https://github.com/CogStack/cogstack-nlp/blob/e5827a806c100abafb7c5a70f917d560fdfc374c/medcat-service/Dockerfile) for the default value and usage.
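+
+A minimal sketch of passing the build-time variable through Docker Compose, assuming the build context is the directory containing the Dockerfile and that `SPACY_MODELS` accepts a spaCy model name such as `nl_core_news_sm`. The exact value format is defined in the Dockerfile, so check it before relying on this:
+
+```yaml
+services:
+  medcat-service:
+    build:
+      # assumed: the directory that contains the Dockerfile
+      context: .
+      args:
+        # example model name, not the default value
+        SPACY_MODELS: nl_core_news_sm
+```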
diff --git a/medcat-service/docs/setup/configuration.md b/medcat-service/docs/setup/configuration.md
@@ -0,0 +1,75 @@
+# Configuration
+
+MedCAT Service is configured through environment variables read at startup.
+
+## Service environment variables
+
+The following environment variables are available for tailoring the MedCAT Service `gunicorn` server:
+
+- `SERVER_HOST` - the host address (default: `0.0.0.0`).
+- `SERVER_PORT` - the port number (default: `5000`).
+- `SERVER_WORKERS` - the number of worker processes serving the application in parallel (default: `1`; only used with the production server).
+- `SERVER_WORKER_TIMEOUT` - the maximum time (in seconds) to wait for a response from a worker (default: `300`; only used with the production server).
+- `SERVER_GUNICORN_MAX_REQUESTS` - the maximum number of requests a worker processes before restarting (default: `1000`).
+- `SERVER_GUNICORN_MAX_REQUESTS_JITTER` - random jitter added to `SERVER_GUNICORN_MAX_REQUESTS` so that workers do not all restart at the same time (default: `50`).
+- `SERVER_GUNICORN_EXTRA_ARGS` - any additional Gunicorn CLI arguments to pass (default: none). Example: `SERVER_GUNICORN_EXTRA_ARGS=--backlog 20`.
+
+The following environment variables are available for tailoring the MedCAT Service wrapper:
+
+- `APP_MODEL_NAME` - an informative name for the model used by MedCAT (optional).
+- `APP_MODEL_CDB_PATH` - the path to the model's concept database.
+- `APP_MODEL_VOCAB_PATH` - the path to the model's vocabulary.
+- `APP_MODEL_META_PATH_LIST` - the paths to meta-annotation models, separated by `:` (optional).
+- `APP_BULK_NPROC` - the number of threads used in bulk processing (default: `8`).
+- `APP_MEDCAT_MODEL_PACK` - the path to a MedCAT model pack. If set, the model pack takes precedence over the CDB, vocabulary, and MetaCAT paths declared above.
+- `APP_ENABLE_METRICS` - enables Prometheus metrics collection, served on `/metrics`.
+- `APP_ENABLE_DEMO_UI` - enables the demo user interface for trying out models (default: `False`).
+- `APP_DEMO_UI_PATH` - customises the path of the demo UI (default: `/`).
+
+### Shared Memory (`DOCKER_SHM_SIZE`)
+
+The MedCAT service uses PyTorch multiprocessing and memory-mapped models, which rely on Linux shared memory (`/dev/shm`).  
+By default, Docker limits this to **64 MB**, which is insufficient for NLP models.
+
+Use the environment variable `DOCKER_SHM_SIZE` to control the size of shared memory inside the container. 
+You can set this variable in the `env/general.env` file.
+
+- **Recommended**: `8g` for bulk inference (`APP_BULK_NPROC > 1`)  
+- **Minimum**: `1g` for single-process inference (`APP_BULK_NPROC=1`)  
+
+Example:
+
+```env
+DOCKER_SHM_SIZE=8g
+```
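+
+For setups that do not use the provided env files, a comparable effect can be sketched directly in a Compose file with the standard `shm_size` key. How `DOCKER_SHM_SIZE` is wired into the project's own compose files is not shown on this page, so the interpolation below is an assumption:
+
+```yaml
+services:
+  medcat-service:
+    image: cogstacksystems/medcat-service:latest
+    # Falls back to 1g when DOCKER_SHM_SIZE is unset (assumed fallback value)
+    shm_size: ${DOCKER_SHM_SIZE:-1g}
+```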
+
+### Telemetry
+MedCAT Service supports exporting traces using OpenTelemetry.
+To enable distributed tracing, set the environment variables below. They can be configured in your environment files or exported in your startup scripts (see `start_service_debug.sh` and related files):
+
+| Environment Variable                       | Description                                                                                       | Example Value                         |
+|--------------------------------------------|---------------------------------------------------------------------------------------------------|---------------------------------------|
+| `APP_ENABLE_TRACING`                       | Enable OpenTelemetry tracing in the application.                                                  | `True`                                |
+| `OTEL_TRACES_EXPORTER`                     | Exporter to use for traces (commonly `otlp`).                                                     | `otlp`                                |
+| `OTEL_SERVICE_NAME`                        | Logical service name for your traces.                                                             | `medcat-service`                      |
+| `OTEL_EXPORTER_OTLP_ENDPOINT`              | URL for your OpenTelemetry collector.                                                             | `http://localhost:4317`               |
+| `OTEL_EXPORTER_OTLP_PROTOCOL`              | Protocol to use for OTLP exporter.                                                                | `grpc`                                |
+| `OTEL_METRICS_EXPORTER`                    | Set to `none` to disable metrics export, or another value if metrics are enabled.                 | `none`                                |
+| `OTEL_PYTHON_FASTAPI_EXCLUDED_URLS`        | Comma-separated list of URLs to exclude from tracing and metrics (e.g., health/metrics endpoints).| `/api/health,/metrics`                |
+| `OTEL_EXPERIMENTAL_RESOURCE_DETECTORS`     | Additional resource detectors to use (comma-separated).                                           | `containerid,os`                      |
+
+
+See the [OpenTelemetry Python SDK documentation](https://opentelemetry-python.readthedocs.io/en/latest/sdk/environment_variables.html) for the full list of OpenTelemetry environment variables.
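+
+Put together, the example values from the table could look like this in a Compose `environment` block. This is a sketch only; the collector address is a placeholder and would normally point at a collector reachable from the container:
+
+```yaml
+services:
+  medcat-service:
+    environment:
+      - APP_ENABLE_TRACING=True
+      - OTEL_TRACES_EXPORTER=otlp
+      - OTEL_SERVICE_NAME=medcat-service
+      # placeholder endpoint; replace with your collector's address
+      - OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
+      - OTEL_EXPORTER_OTLP_PROTOCOL=grpc
+      - OTEL_METRICS_EXPORTER=none
+      - OTEL_PYTHON_FASTAPI_EXCLUDED_URLS=/api/health,/metrics
+      - OTEL_EXPERIMENTAL_RESOURCE_DETECTORS=containerid,os
+```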
+
+## Performance Tuning
+
+A range of factors affects the performance of this service, the most obvious being the size of the processed documents (amount of text per document) and the resources of the machine on which the service runs.
+The main settings for improving performance when processing large numbers of documents are `SERVER_WORKERS` (the number of web workers that can handle requests in parallel) and `APP_BULK_NPROC` (the number of threads used for annotation processing).
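+
+As an illustration, both settings could be raised together in a Compose `environment` block. The numbers are arbitrary starting points rather than recommendations; Gunicorn workers are separate processes, so memory use may grow as `SERVER_WORKERS` increases:
+
+```yaml
+services:
+  medcat-service:
+    environment:
+      # illustrative values only; tune against your own workload
+      - SERVER_WORKERS=2
+      - APP_BULK_NPROC=4
+```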
+
+## MedCAT library
+
+MedCAT parameters are defined in the selected `envs/medcat*` file.
+
+For details on available MedCAT parameters please refer to [the official GitHub repository](https://github.com/CogStack/cogstack-nlp/blob/main/medcat-v2/).
diff --git a/medcat-service/docs/setup/installation.md b/medcat-service/docs/setup/installation.md
@@ -0,0 +1,40 @@
+# Installation
+
+MedCAT Service can be run using Helm or Docker Compose.
+
+The recommended approach is Helm.
+
+## Helm installation
+
+Bring up MedCAT Service with a single Helm command:
+
+```
+helm install medcat-service-helm oci://registry-1.docker.io/cogstacksystems/medcat-service-helm
+```
+
+See [medcat-service-helm](https://docs.cogstack.org/en/latest/platform/deployment/helm/charts/medcat-service-helm/) for the full documentation of how to use this chart.
+
+## Docker compose
+
+You can also bring up MedCAT Service using Docker Compose, for example:
+
+
+```yaml
+name: cogstack-medcat-service
+services:
+  medcat-service:
+    image: cogstacksystems/medcat-service:${IMAGE_TAG-latest}
+    restart: unless-stopped
+    environment:
+      # Uses a preloaded model pack example inside the image
+      - APP_MEDCAT_MODEL_PACK=/cat/models/examples/example-medcat-v2-model-pack.zip
+      - APP_ENABLE_METRICS=True
+      - APP_ENABLE_DEMO_UI=True
+    ports:
+      - "5555:5000"
+```
+
+You can now access MedCAT Service at `localhost:5555`.
+
+See the other examples and scenarios for running MedCAT Service in the [cogstack-nlp GitHub repository](https://github.com/CogStack/cogstack-nlp/tree/main/medcat-service/docker).
+