llm-zip v0.2.1

finktech-dev released this 09 Jun 04:58
· 5 commits to main since this release

Health probes, structured logging, system info endpoint, and dependency fixes.

What's New

Health probes

Added Kubernetes-compatible liveness and readiness endpoints at /health/live and /health/ready.

  • /health/live reports that the HTTP server is running.
  • /health/ready stays unavailable until the inference models are fully loaded into memory, which covers the typical 2–5 minute cold start.
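
For scripts that run outside Kubernetes, the readiness contract above can be polled by hand. The following is a minimal sketch, not part of llm-zip: it assumes /health/ready answers with a non-2xx status (or refuses the connection) while models are loading, and the function names, the 600-second budget, and the 5-second interval are illustrative choices.

```python
import time
import urllib.error
import urllib.request


def is_ready(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET /health/ready answers with a 2xx status."""
    try:
        with urllib.request.urlopen(f"{base_url}/health/ready", timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and non-2xx responses,
        # which urllib raises as HTTPError (a URLError subclass).
        return False


def wait_until_ready(probe, max_wait_s: float = 600.0, interval_s: float = 5.0,
                     sleep=time.sleep, clock=time.monotonic) -> bool:
    """Call probe() until it returns True or max_wait_s has elapsed."""
    deadline = clock() + max_wait_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A caller might use it as wait_until_ready(lambda: is_ready("http://localhost:8000")), where the host and port are placeholders for wherever the API is deployed. The default budget is set above the 5-minute upper end of the cold start so that a slow load is not reported as a failure.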

Structured logging

Added rotating JSON file logging to logs/llmzip.log alongside colored console output.

Logs now include structured fields such as:

  • tokens_in
  • tokens_out
  • ratio
  • elapsed_ms

Because these values are emitted as named fields, monitoring platforms such as Datadog and Loki can ingest and query them more easily.
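
The structured fields can also be consumed directly. The sketch below totals them across a log file; it assumes one JSON object per line with the field names listed above, which these notes do not spell out, and the summarize helper is hypothetical rather than part of llm-zip. The ratio field is left untouched because its exact definition is not given here.

```python
import json
from typing import Iterable


def summarize(lines: Iterable[str]) -> dict:
    """Total tokens_in, tokens_out, and elapsed_ms across JSON log lines."""
    totals = {"requests": 0, "tokens_in": 0, "tokens_out": 0, "elapsed_ms": 0.0}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or non-JSON lines
        if not isinstance(entry, dict):
            continue
        if "tokens_in" not in entry or "tokens_out" not in entry:
            continue  # not a compression entry (assumed shape)
        totals["requests"] += 1
        totals["tokens_in"] += entry["tokens_in"]
        totals["tokens_out"] += entry["tokens_out"]
        totals["elapsed_ms"] += entry.get("elapsed_ms", 0)
    return totals
```

Usage would look like: with open("logs/llmzip.log") as f: print(summarize(f)). Rotated files would need to be read separately.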

Info endpoint

Added GET /v1/info.

Returns:

  • Current system configuration
  • Loaded models
  • Enabled features
  • Active hardware limits (e.g. max_tokens, max_file_size_mb)
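
A client can read this endpoint with the standard library alone. This is a hedged sketch: it assumes the response body is JSON, the base URL is a placeholder, and fetch_info is an illustrative helper, not an llm-zip API. The response key names are not documented in these notes, so the sketch does not rely on any of them.

```python
import json
import urllib.request


def fetch_info(base_url: str, opener=urllib.request.urlopen, timeout: float = 5.0) -> dict:
    """GET {base_url}/v1/info and decode the body as JSON."""
    url = f"{base_url.rstrip('/')}/v1/info"
    with opener(url, timeout=timeout) as resp:
        return json.load(resp)
```

For example, sorted(fetch_info("http://localhost:8000")) would list the top-level keys the server actually returns. The opener parameter exists only so the function can be exercised without a running server.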

File size limits

Enforced MAX_FILE_SIZE_MB (default: 50 MB) on the /v1/compress/file endpoint to prevent memory exhaustion when processing large documents.
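
Clients can avoid a rejected upload by checking the file size before sending it. The following sketch mirrors the limit on the client side; it assumes 1 MB means 1024 * 1024 bytes, which may differ from how the server counts, and exceeds_limit is a hypothetical helper.

```python
import os

DEFAULT_MAX_FILE_SIZE_MB = 50  # default stated in these release notes


def exceeds_limit(path: str, max_mb: float = DEFAULT_MAX_FILE_SIZE_MB) -> bool:
    """Return True if the file at path is larger than max_mb megabytes."""
    # Assumption: 1 MB = 1024 * 1024 bytes; the server may count differently.
    return os.path.getsize(path) > max_mb * 1024 * 1024
```

If the server's limit has been changed through MAX_FILE_SIZE_MB, pass the same value as max_mb so the client and server agree.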

CLI commands

Added the llmzip version command to quickly verify the installed package version.

Documentation

Added:

  • DOCKER.md with detailed guidance for monolith and split deployments, including Kubernetes examples.
  • KNOWN_LIMITATIONS.md documenting current architectural constraints and expected behavior.

Fixed

Docker dependencies

Resolved a ModuleNotFoundError affecting split-mode deployments by ensuring sentence-transformers is installed in the stateless API container when semantic scoring is enabled.

Dependency scope

Moved heavy machine learning dependencies (llmlingua, markitdown) into the optional [inference] dependency group in pyproject.toml.
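
Since these packages are now optional, a deployment can check which of them are importable before enabling inference features. This is a minimal sketch under the assumption that the import names match the package names given above (llmlingua, markitdown); missing_modules is an illustrative helper, not part of llm-zip.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


# Import names for the packages moved into the [inference] group.
INFERENCE_MODULES = ["llmlingua", "markitdown"]
```

Calling missing_modules(INFERENCE_MODULES) at startup returns an empty list when the optional group is installed. The same check could cover sentence_transformers in containers where semantic scoring is enabled.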

API reliability

Fixed a NameError involving _get_warning that could trigger HTTP 500 responses during single-file and batch compression requests.

Upgrading from 0.2.0

No breaking changes.

The logs/ directory will be created automatically on startup.

To override the default 50 MB upload limit, copy the MAX_FILE_SIZE_MB setting from .llmzip.config.example into your existing configuration file and set the value you need.