Prebuilt Windows binaries of the VisualText NLP Engine, packaged together with the data/ knowledge bases and a Python wrapper so the engine can be downloaded and used out of the box — no compilation required.
The NLP Engine is the runtime for NLP++, a domain-specific language for natural-language analyzers. This repository tracks upstream releases automatically and republishes them as a Windows-ready bundle.
The NLP Engine is distributed per platform. Pick the one that matches your OS:
| Platform | Repository |
|---|---|
| Windows | VisualText/nlp-engine-windows (this repo) |
| Linux | VisualText/nlp-engine-linux |
| macOS | VisualText/nlp-engine-mac |
| Source | VisualText/nlp-engine |
For production use from Python, prefer the NLPPlus Python package instead of the simple wrapper shipped here.
| Path | Description |
|---|---|
| nlp.exe | The NLP Engine command-line executable (renamed from upstream nlpw.exe). |
| icudt*.dll, icuuc*.dll, icuin*.dll | ICU runtime DLLs that nlp.exe links against. The version suffix (74, 78, …) tracks whichever ICU upstream is currently using. |
| data/ | NLP Engine data directory containing the rfb (rules-from-builder) knowledge base. nlp.exe is invoked with this directory as -WORK. |
| compile-libs/ | Headers (include/Api/, include/cs/) and engine static libraries (lib/{prim,kbm,consh,words,lite}.lib) used to link a compiled analyzer/KB into a .dll. Populated by the workflow from upstream's nlpengine-compile-libs.zip. |
| scripts/compile-analyzer.{ps1,bat} | Compile the analyzer (run+kb) into <analyzer>\bin\run.dll + <analyzer>\bin\kb.dll. |
| python/ | Git submodule pointing at VisualText/python, a thin Python wrapper (NLPEngine class) that shells out to nlp.exe. |
| .version-flag | Records the upstream release tag currently vendored (e.g. v3.0.2). Used by the update workflow to detect stale builds. |
| .github/workflows/nlp-engine-build.yml | Automation that pulls the latest upstream release and commits/tags it here. |

Grab the latest tagged release from this repository's Releases page. The tag mirrors the upstream VisualText/nlp-engine version (e.g. v3.1.9).
git clone --recurse-submodules https://github.com/VisualText/nlp-engine-windows.git
cd nlp-engine-windows

The --recurse-submodules flag pulls in the python/ submodule. If you forgot it, run:

git submodule update --init --recursive

nlp.exe is invoked with an analyzer folder, a working directory containing the data/ tree, and an input text file:

.\nlp.exe -ANA <path-to-analyzer-folder> -WORK <path-to-this-repo> <path-to-input-text-file>

Add -DEV to enable developer logging output.
nlp.exe and the ICU DLLs must remain in the same folder — Windows resolves the DLLs from the executable's directory at load time.
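
For scripted use without the python/ wrapper, the invocation and the same-folder requirement above can be sketched in Python. This is a minimal sketch, not the wrapper's implementation: the helper names missing_runtime_files and build_nlp_command are hypothetical, and placing -DEV before -ANA is an assumption. Only the -ANA, -WORK and -DEV flags and the file names come from this README.

```python
from pathlib import Path

# Files that must sit side by side, as glob patterns (names taken from this README).
REQUIRED_PATTERNS = ("nlp.exe", "icudt*.dll", "icuuc*.dll", "icuin*.dll")


def missing_runtime_files(engine_dir):
    """Return the required patterns that have no matching file in engine_dir."""
    root = Path(engine_dir)
    return [pattern for pattern in REQUIRED_PATTERNS if not any(root.glob(pattern))]


def build_nlp_command(engine_dir, analyzer_dir, input_file, dev=False):
    """Build the nlp.exe argument list; the result can be passed to subprocess.run()."""
    cmd = [str(Path(engine_dir) / "nlp.exe")]
    if dev:
        # Assumption: the position of -DEV relative to the other flags does not matter.
        cmd.append("-DEV")
    cmd += ["-ANA", str(analyzer_dir), "-WORK", str(engine_dir), str(input_file)]
    return cmd
```

Checking missing_runtime_files before launching gives a clearer error than a failed DLL load when the executable has been copied somewhere on its own.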
The bundled python/ submodule contains an NLPEngine class that wraps the executable for scripting use. See python/README.md for details. Minimal example:
from python.nlpengine import NLPEngine

engine = NLPEngine(
    engineDir=r"C:\path\to\nlp-engine-windows",
    analyzersDir=r"C:\path\to\analyzers",
)
engine.analyzeFile("my-analyzer", "sample.txt")

For production workloads, prefer the NLPPlus Python package, which links against the engine directly instead of shelling out.
By default, nlp.exe runs analyzers fully interpreted from the .nlp source. In the engine's -COMPILED mode, both the analyzer body (the rule passes) and the knowledge base are compiled to native DLLs that the engine loads with LoadLibrary at runtime. The analyzer then runs entirely from compiled code, so edits to .nlp files have no effect on the output until you recompile.
| Script | What it does | Output |
|---|---|---|
| scripts/compile-analyzer.ps1 | Runs nlp.exe -COMPILE (emits the analyzer C++ trees under <analyzer>\run and <analyzer>\kb), then links everything into a single DLL against compile-libs/. The DLL exports both run_analyzer(Parse*) and kb_setup(void*) (engine codegen emits both). | <analyzer>\bin\run.dll, <analyzer>\bin\runu.dll, <analyzer>\bin\kb.dll, <analyzer>\bin\kbu.dll |

The same DLL is staged under all four filenames so the engine's load paths find it whether it's looking for the ANSI or UNICODE build flavour (lite/nlp.cpp:1242 / cs/libconsh/cg.cpp:168).
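
The staging step described above amounts to copying one file to four names. The following is an illustrative Python sketch only; the real script is PowerShell, and the function name stage_dll and its arguments are hypothetical.

```python
import shutil
from pathlib import Path

# The four file names listed above (ANSI and UNICODE flavours of run and kb).
STAGED_NAMES = ("run.dll", "runu.dll", "kb.dll", "kbu.dll")


def stage_dll(built_dll, bin_dir):
    """Copy a single linked DLL to every name the engine may try to load."""
    source = Path(built_dll)
    bin_path = Path(bin_dir)
    bin_path.mkdir(parents=True, exist_ok=True)
    staged = []
    for name in STAGED_NAMES:
        target = bin_path / name
        # Skip the copy when the linker already wrote the DLL under this name.
        if target.resolve() != source.resolve():
            shutil.copyfile(source, target)
        staged.append(target)
    return staged
```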
A .bat shim of the same name ships alongside the .ps1, so the
script can be run either directly from PowerShell or from cmd.exe.
- Visual Studio 2022 (or Build Tools) with the Desktop development with C++ workload, which provides cl.exe, lib.exe, dumpbin.exe, and VsDevCmd.bat. The script locates these automatically via vswhere.exe.
- CMake ≥ 3.16 (on PATH).

# Default: full-analyzer compile (run + kb):
.\scripts\compile-analyzer.ps1 data\rfb data\rfb\input\text.txt
# Or from cmd.exe via the .bat shim:
scripts\compile-analyzer.bat data\rfb data\rfb\input\text.txt
# Legacy: KB-only compile (matches the pre-NLP-ENGINE-WINDOWS-007 behaviour):
.\scripts\compile-analyzer.ps1 -KbOnly data\rfb data\rfb\input\text.txt
# Run with the compiled artifacts:
.\nlp.exe -COMPILED -ANA data\rfb -WORK . data\rfb\input\text.txt

What you should see in the -COMPILED output for a successful round-trip:
[CG: Trying to load compiled KB.]
[Loading compiled kb: data\rfb\bin\kb.dll]
[Loaded compiled kb library]
[Loading compiled analyzer data\rfb\bin\run.dll]
[Loaded compiled analyzer]
... parse output ...
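
When the round-trip is checked from a script, the two "Loaded" lines above are the useful signals. Here is a small Python sketch that scans captured output for them. The marker strings are copied from the log above and may differ between engine versions; the function name missing_compiled_markers is hypothetical.

```python
# Lines that indicate both compiled artifacts were actually loaded.
EXPECTED_MARKERS = (
    "[Loaded compiled kb library]",
    "[Loaded compiled analyzer]",
)


def missing_compiled_markers(output):
    """Return the expected markers that do not appear in the captured output."""
    return [marker for marker in EXPECTED_MARKERS if marker not in output]
```

An empty list means both the compiled KB and the compiled analyzer were reported as loaded.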
Upstream only ships ICU as runtime DLLs (icudt78.dll, icuin78.dll, icuuc78.dll) — no .lib import libraries. On its first run, compile-analyzer.ps1 generates the import libs from the DLLs:
- dumpbin /exports icu*.dll lists every exported symbol.
- The exports are written to a .def file under compile-libs\lib\.
- lib.exe /def:... /machine:X64 /out:icu*.lib produces the import library.

Subsequent runs reuse the generated .lib files. The ICU version digits (78) match the bundled DLLs and will move in lock-step whenever upstream bumps ICU.
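
The middle step, turning a dumpbin export listing into a module-definition file, can be sketched in Python as follows. This is not the script's actual code (the script is PowerShell). It assumes each export row has the form "ordinal hint RVA name", and the function name exports_to_def is hypothetical.

```python
import re

# Assumed export row layout: ordinal, hint, 8-digit hex RVA, symbol name.
EXPORT_ROW = re.compile(r"^\s+\d+\s+[0-9A-Fa-f]+\s+[0-9A-Fa-f]{8}\s+(\S+)")


def exports_to_def(dumpbin_text, library_name):
    """Build the text of a .def file from the output of dumpbin /exports."""
    names = []
    for line in dumpbin_text.splitlines():
        match = EXPORT_ROW.match(line)
        if match:
            names.append(match.group(1))
    # A .def file names the library and then lists one exported symbol per line.
    lines = [f"LIBRARY {library_name}", "EXPORTS"]
    lines += [f"    {name}" for name in names]
    return "\n".join(lines) + "\n"
```

The resulting text is what would be handed to lib.exe through its /def: option.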
The nlp-engine-build.yml workflow keeps this repo in sync with upstream:
- Queries VisualText/nlp-engine for the latest GitHub release.
- Downloads the Windows assets: nlpengine.zip (the data/ tree), nlpw.exe, the three ICU DLLs (icudt*.dll, icuuc*.dll, icuin*.dll), and nlpengine-compile-libs.zip (headers + engine static libraries used by the compile scripts; optional, skipped if absent on a given release). Asset matching is version-agnostic, so ICU bumps don't require workflow edits.
- Removes the previously committed binaries in a dedicated cleanup commit (so git stores a clean diff rather than two layered binary blobs).
- Unzips nlpengine.zip, renames nlpw.exe → nlp.exe, extracts nlpengine-compile-libs.zip to compile-libs/, and commits the new files.
- Tags the commit with the upstream version and publishes a GitHub release.

The workflow runs on two triggers:

- repository_dispatch with event type nlp-engine-release: fired by the upstream repo on every new release.
- workflow_dispatch: manual trigger from the Actions tab. Forces an update even if the tag already exists locally.

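
Taken together with the .version-flag file, the update decision can be pictured as the small Python sketch below. How the workflow really compares versions is not documented here, so plain string inequality is an assumption, and the function name needs_update is hypothetical.

```python
def needs_update(vendored_tag, latest_tag, force=False):
    """Decide whether the bundle should be rebuilt.

    vendored_tag: contents of .version-flag (for example "v3.0.2").
    latest_tag:   tag of the latest upstream release.
    force:        True for a manual run, which updates regardless of the tag.
    """
    if force:
        return True
    # Assumption: tags are compared as plain strings after trimming whitespace.
    return vendored_tag.strip() != latest_tag.strip()
```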
If actions/github-script throws Could not find <asset> in release …, an upstream asset has been renamed. The error message includes the full list of available assets — update the matcher in nlp-engine-build.yml (the findAsset calls) to reflect the new name.
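
To illustrate what version-agnostic matching means, here is a Python sketch of the idea. The real findAsset calls run as JavaScript inside actions/github-script and may work differently. The wildcard approach, the function name find_asset and the error wording below are all assumptions.

```python
from fnmatch import fnmatchcase


def find_asset(asset_names, pattern):
    """Return the first asset whose name matches a wildcard pattern.

    Matching "icudt*.dll" rather than "icudt78.dll" keeps working when the
    ICU version digits change.
    """
    for name in asset_names:
        if fnmatchcase(name, pattern):
            return name
    # Listing what is available makes a renamed asset easy to spot.
    available = ", ".join(asset_names)
    raise LookupError(f"no asset matching {pattern!r}; available: {available}")
```

When an upstream asset is renamed, only the pattern passed to the matcher needs to change.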
- VisualText/nlp-engine — upstream source for the engine itself.
- VisualText/nlp-engine-linux — Linux binary bundle.
- VisualText/nlp-engine-mac — macOS binary bundle.
- VisualText/py-package-nlpengine — production-grade Python bindings.
- VisualText NLP++ VSCode Extension — IDE integration that drives
nlp.exe.
MIT — matches the upstream NLP Engine license.