320 changes: 4 additions & 316 deletions medcat-service/README.md


Binary file added medcat-service/docs/assets/demo-anoncat.png
Binary file added medcat-service/docs/assets/demo-api.webm
Binary file not shown.
Binary file added medcat-service/docs/assets/demo-medcat.png
40 changes: 40 additions & 0 deletions medcat-service/docs/index.md
@@ -0,0 +1,40 @@
# MedCAT Service documentation

MedCAT Service is a REST API for serving [MedCAT](https://github.com/CogStack/cogstack-nlp/blob/main/medcat-v2/) models, allowing you to perform named entity recognition and linking (NER+L) and de-identification of medical text over an API.

Feel free to ask questions on the GitHub issue tracker or on our [Discourse forum](https://discourse.cogstack.org), which our development team uses frequently.

## Demo
<video loop muted playsinline controls style="max-width: 100%; height: auto;">
<source src="assets/demo-api.webm" type="video/webm">
</video>

## Features

### Clinical NLP over HTTP

- **Single-document processing** — `POST /api/process` accepts free-text clinical notes and returns entities with CUIs, types, spans, and meta-annotations (for example negation and subject).
- **Bulk processing** — `POST /api/process_bulk` processes many documents in one request, with parallel annotation via configurable worker threads (`APP_BULK_NPROC`).
- **Meta-annotation filtering** — Optional `meta_anns_filters` on `/api/process` let you return only entities that match selected meta-annotation values (for example affirmed presence or patient subject).
- **AnonCAT De-identification** - Detect and redact identifiable information in clinical text by loading an AnonCAT model. See [API examples](user-guide/api-example-use.md).
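
As a rough illustration of single-document processing, the Python sketch below builds a `POST /api/process` request using only the standard library. The base URL and the body shape (`{"content": {"text": ...}}`) are assumptions made for this example and are not stated on this page; check the API examples guide for the exact request format.

```python
import json
import urllib.request

# Assumption: the service is reachable on localhost at the default SERVER_PORT.
BASE_URL = "http://localhost:5000"


def build_process_request(text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /api/process request for one document."""
    # Assumption: the body wraps the note as {"content": {"text": ...}}.
    body = json.dumps({"content": {"text": text}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/process",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_process_request("The patient was diagnosed with leukemia.")
# To send it against a running service:
#     with urllib.request.urlopen(request) as response:
#         result = json.load(response)
```

Building the request separately from sending it keeps the example usable without a running service.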

### Demo UI

- **Interactive model trial** - A user-friendly UI to try out MedCAT models and see the results. See [Demo UI](user-guide/demo-ui.md).

### Operations and observability

- **Health endpoints** — Kubernetes-friendly liveness (`/api/health/live`) and readiness (`/api/health/ready`) checks that verify the model is loaded.
- **Service metadata** — `GET /api/info` returns application name, language, version, and loaded model details.
- **Metrics** - Optional Prometheus metrics, served on `/metrics` when `APP_ENABLE_METRICS=True`.
- **Tracing** — Distributed tracing export for production deployments. See [Configuration](setup/configuration.md#telemetry).
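
A deployment script might wait for the readiness endpoint before sending traffic. The sketch below polls `/api/health/ready`; it assumes the endpoint answers HTTP 200 once the model is loaded and something else (or nothing) before that. The helper names `http_status` and `wait_until_ready` are hypothetical and not part of the service.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str) -> int:
    """Return the HTTP status code for a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
    except OSError:
        # Covers connection errors and timeouts.
        return 0


def wait_until_ready(
    base_url: str,
    attempts: int = 30,
    delay_seconds: float = 2.0,
    get_status: Callable[[str], int] = http_status,
) -> bool:
    """Poll the readiness endpoint until it answers 200 or attempts run out."""
    for attempt in range(attempts):
        # Assumption: 200 means the model is loaded and requests can be served.
        if get_status(f"{base_url}/api/health/ready") == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False
```

For example, `wait_until_ready("http://localhost:5000")` returns `True` as soon as the readiness check passes, and `False` after 30 failed attempts.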


## Get started

| Topic | Guide |
|-------|-------|
| Install with Helm or Docker Compose | [Installation](setup/installation.md) |
| Environment variables and tuning | [Configuration](setup/configuration.md) |
| `curl` examples for process and bulk APIs | [API example use](user-guide/api-example-use.md) |
| Try models in the browser | [Demo UI](user-guide/demo-ui.md) |
19 changes: 19 additions & 0 deletions medcat-service/docs/pyproject.toml
@@ -0,0 +1,19 @@
[project]
name = "medcat-trainer-documentation"
version = "0.1.0"
description = "MedCAT Service Documentation"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
"mkdocs-material>=9.7.0",
"termynal>=0.13.1",
]

[project.optional-dependencies]
dev = [
"ruff>=0.12.11",
]

[tool.ruff]
line-length = 120

5 changes: 5 additions & 0 deletions medcat-service/docs/reference/extensions.md
@@ -0,0 +1,5 @@
# Extending MedCAT Service

## spaCy models

When using MedCAT for a language other than English, it can be useful to use a different spaCy model. A spaCy model can be included in the MedCAT model pack; if you do not use that option, you can instead install the models in the Docker image by setting a build-time variable. See the `SPACY_MODELS` variable in the [Dockerfile](https://github.com/CogStack/cogstack-nlp/blob/e5827a806c100abafb7c5a70f917d560fdfc374c/medcat-service/Dockerfile) for its default value and usage.
75 changes: 75 additions & 0 deletions medcat-service/docs/setup/configuration.md
@@ -0,0 +1,75 @@
# Configuration

MedCAT Service is configured through environment variables that are read at startup.

## Service environment variables



The following environment variables are available for tailoring the MedCAT Service `gunicorn` server:

- `SERVER_HOST` - the host address to bind to (default: `0.0.0.0`).
- `SERVER_PORT` - the port number used (default: `5000`).
- `SERVER_WORKERS` - the number of workers serving the Flask app in parallel (default: `1`; only used with the production server).
- `SERVER_WORKER_TIMEOUT` - the maximum time (in seconds) to wait for a response from a worker (default: `300`; only used with the production server).
- `SERVER_GUNICORN_MAX_REQUESTS` - the maximum number of requests a worker processes before restarting (default: `1000`).
- `SERVER_GUNICORN_MAX_REQUESTS_JITTER` - adds randomness to `SERVER_GUNICORN_MAX_REQUESTS` so that workers do not all restart at the same time (default: `50`).
- `SERVER_GUNICORN_EXTRA_ARGS` - any additional Gunicorn CLI arguments you want to pass (default: none). Example: `SERVER_GUNICORN_EXTRA_ARGS=--backlog 20`.
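
To make the relationship between these variables and Gunicorn more concrete, here is a minimal sketch of how they could be translated into Gunicorn command-line arguments. It is an illustration only, not the service's actual startup script: the function name `gunicorn_args` is hypothetical, and the mapping of each variable to a flag is an assumption based on the descriptions above.

```python
import shlex
from typing import Mapping


def gunicorn_args(env: Mapping[str, str]) -> list[str]:
    """Translate the SERVER_* variables into Gunicorn CLI arguments."""
    host = env.get("SERVER_HOST", "0.0.0.0")
    port = env.get("SERVER_PORT", "5000")
    args = [
        "--bind", f"{host}:{port}",
        "--workers", env.get("SERVER_WORKERS", "1"),
        "--timeout", env.get("SERVER_WORKER_TIMEOUT", "300"),
        "--max-requests", env.get("SERVER_GUNICORN_MAX_REQUESTS", "1000"),
        "--max-requests-jitter", env.get("SERVER_GUNICORN_MAX_REQUESTS_JITTER", "50"),
    ]
    # Extra arguments are split the way a shell would split them,
    # so "--backlog 20" becomes ["--backlog", "20"].
    args += shlex.split(env.get("SERVER_GUNICORN_EXTRA_ARGS", ""))
    return args


print(gunicorn_args({"SERVER_WORKERS": "4", "SERVER_GUNICORN_EXTRA_ARGS": "--backlog 20"}))
```

Passing `os.environ` instead of a literal dictionary would read the values from the current process environment.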

The following environment variables are available for tailoring the MedCAT Service wrapper:

- `APP_MODEL_NAME` - an informative name of the model used by MedCAT (optional).
- `APP_MODEL_CDB_PATH` - the path to the model's concept database.
- `APP_MODEL_VOCAB_PATH` - the path to the model's vocabulary.
- `APP_MODEL_META_PATH_LIST` - the list of paths to meta-annotation models, separated by the `:` character (optional).
- `APP_BULK_NPROC` - the number of threads used in bulk processing (default: `8`).
- `APP_MEDCAT_MODEL_PACK` - the path to a MedCAT model pack. If this variable is set, the model pack is loaded in preference to everything else declared above (CDB, vocab, MetaCATs, etc.).
- `APP_ENABLE_METRICS` - enable Prometheus metrics collection, served on the path `/metrics`.
- `APP_ENABLE_DEMO_UI` - enable the demo user interface to try models (default: `False`).
- `APP_DEMO_UI_PATH` - customise the path of the demo UI (default: `/`).
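
The precedence rule and the `:`-separated list can be sketched in a few lines of Python. This is not the service's own loading code: `resolve_model_source` is a hypothetical helper and the file paths in the usage example are made up.

```python
from typing import Mapping


def resolve_model_source(env: Mapping[str, str]) -> dict:
    """Decide which model files would be loaded for a given environment."""
    model_pack = env.get("APP_MEDCAT_MODEL_PACK", "").strip()
    if model_pack:
        # A model pack wins over the individual CDB, vocab and MetaCAT paths.
        return {"model_pack": model_pack}

    meta_list = env.get("APP_MODEL_META_PATH_LIST", "")
    return {
        "cdb": env.get("APP_MODEL_CDB_PATH"),
        "vocab": env.get("APP_MODEL_VOCAB_PATH"),
        # Meta-annotation model paths are separated by ':'; empty entries are dropped.
        "meta_models": [path for path in meta_list.split(":") if path],
    }


# Hypothetical paths, for illustration only.
source = resolve_model_source({
    "APP_MODEL_CDB_PATH": "/models/cdb.dat",
    "APP_MODEL_VOCAB_PATH": "/models/vocab.dat",
    "APP_MODEL_META_PATH_LIST": "/models/meta_status:/models/meta_subject",
})
```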

### Shared Memory (`DOCKER_SHM_SIZE`)

The MedCAT service uses PyTorch multiprocessing and memory-mapped models, which rely on Linux shared memory (`/dev/shm`).
By default, Docker limits this to **64 MB**, which is insufficient for NLP models.

Use the environment variable `DOCKER_SHM_SIZE` to control the size of shared memory inside the container.
You can set this variable in the `env/general.env` file.

- **Recommended**: `8g` for bulk inference (`APP_BULK_NPROC > 1`)
- **Minimum**: `1g` for single-process inference (`APP_BULK_NPROC=1`)

Example:

```env
DOCKER_SHM_SIZE=8g
```

### Telemetry
MedCAT Service supports exporting traces using OpenTelemetry. To enable distributed tracing and telemetry, set the following environment variables, either in your environment files or exported in your startup scripts (see `start_service_debug.sh` and related files):

| Environment Variable | Description | Example Value |
|--------------------------------------------|---------------------------------------------------------------------------------------------------|---------------------------------------|
| `APP_ENABLE_TRACING` | Enable OpenTelemetry tracing in the application. | `True` |
| `OTEL_TRACES_EXPORTER` | Exporter to use for traces (commonly `otlp`). | `otlp` |
| `OTEL_SERVICE_NAME` | Logical service name for your traces. | `medcat-service` |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | URL for your OpenTelemetry collector. | `http://localhost:4317` |
| `OTEL_EXPORTER_OTLP_PROTOCOL` | Protocol to use for OTLP exporter. | `grpc` |
| `OTEL_METRICS_EXPORTER` | Set to `none` to disable metrics export, or another value if metrics are enabled. | `none` |
| `OTEL_PYTHON_FASTAPI_EXCLUDED_URLS` | Comma-separated list of URLs to exclude from tracing and metrics (e.g., health/metrics endpoints).| `/api/health,/metrics` |
| `OTEL_EXPERIMENTAL_RESOURCE_DETECTORS` | Additional resource detectors to use (comma-separated). | `containerid,os` |


See the [OpenTelemetry Python SDK documentation](https://opentelemetry-python.readthedocs.io/en/latest/sdk/environment_variables.html) for the full list of OpenTelemetry environment variables.
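
If you generate environment files from a script, the table above can be captured as a small helper. The sketch below reuses the example values from the table; the function names `tracing_env` and `to_env_file` are hypothetical, and the collector URL is whatever your deployment uses.

```python
def tracing_env(collector_url: str, service_name: str = "medcat-service") -> dict[str, str]:
    """Return tracing-related variables, using the example values from the table."""
    return {
        "APP_ENABLE_TRACING": "True",
        "OTEL_TRACES_EXPORTER": "otlp",
        "OTEL_SERVICE_NAME": service_name,
        "OTEL_EXPORTER_OTLP_ENDPOINT": collector_url,
        "OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",
        # Metrics export over OTLP is switched off in this example.
        "OTEL_METRICS_EXPORTER": "none",
        # Keep health checks and metric scrapes out of the traces.
        "OTEL_PYTHON_FASTAPI_EXCLUDED_URLS": "/api/health,/metrics",
    }


def to_env_file(variables: dict[str, str]) -> str:
    """Render variables as KEY=value lines suitable for an env file."""
    return "".join(f"{key}={value}\n" for key, value in variables.items())


env_text = to_env_file(tracing_env("http://localhost:4317"))
```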

## Performance Tuning

A range of factors can affect the performance of this service, the most obvious being the size of the processed documents (amount of text per document) and the resources of the machine on which the service runs.
The main settings for improving performance when processing large numbers of documents are `SERVER_WORKERS` (the number of Flask web workers that can handle requests in parallel) and `APP_BULK_NPROC` (the number of threads used for annotation processing).
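
On the client side, large workloads are usually split into batches so that each bulk request stays a manageable size. The sketch below shows one way to do that. The batch size of 50 is an arbitrary starting point rather than a recommendation from this documentation, and the body shape (`{"content": [{"text": ...}, ...]}`) is an assumption to verify against the API examples.

```python
import json
from typing import Iterable, Iterator


def batched(texts: Iterable[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive lists of at most batch_size texts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: list[str] = []
    for text in texts:
        batch.append(text)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch


def bulk_payloads(texts: Iterable[str], batch_size: int = 50) -> Iterator[bytes]:
    """Yield one JSON request body per batch for POST /api/process_bulk."""
    for batch in batched(texts, batch_size):
        # Assumption: the bulk body is {"content": [{"text": ...}, ...]}.
        yield json.dumps({"content": [{"text": text} for text in batch]}).encode("utf-8")


payloads = list(bulk_payloads(["note one", "note two", "note three"], batch_size=2))
```

Each payload can then be sent as its own request, with several requests in flight when `SERVER_WORKERS` is greater than one.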

## MedCAT library

MedCAT parameters are defined in the selected `envs/medcat*` file.

For details on available MedCAT parameters please refer to [the official GitHub repository](https://github.com/CogStack/cogstack-nlp/blob/main/medcat-v2/).
40 changes: 40 additions & 0 deletions medcat-service/docs/setup/installation.md
@@ -0,0 +1,40 @@
# Installation

MedCAT Service can be run using Helm or Docker Compose.

The recommended approach is to use Helm.

## Helm installation

Bring up MedCAT Service with a single Helm command:

```
helm install medcat-service-helm oci://registry-1.docker.io/cogstacksystems/medcat-service-helm
```

See [medcat-service-helm](https://docs.cogstack.org/en/latest/platform/deployment/helm/charts/medcat-service-helm/) for the full documentation of how to use this chart.

## Docker compose

You can also bring up MedCAT Service using Docker Compose, for example:


```yaml
name: cogstack-medcat-service
services:
  medcat-service:
    image: cogstacksystems/medcat-service:${IMAGE_TAG-latest}
    restart: unless-stopped
    environment:
      # Uses a preloaded model pack example inside the image
      - APP_MEDCAT_MODEL_PACK=/cat/models/examples/example-medcat-v2-model-pack.zip
      - APP_ENABLE_METRICS=True
      - APP_ENABLE_DEMO_UI=True
    ports:
      - "5555:5000"
```

You can now access MedCAT Service on `localhost:5555`.

See the other examples and scenarios for running MedCAT Service in the [cogstack-nlp GitHub repository](https://github.com/CogStack/cogstack-nlp/tree/main/medcat-service/docker).
