Add Python (pybind11) bindings + hello_search.py example #9
Merged
evilmucedin merged 1 commit on May 28, 2026
A new optional `searchplusplus` CPython extension module that exposes
Schema, Document, IndexWriter, IndexReader, and Searcher to Python.
Methods are snake_case; Status / Expected<T> failures raise normal
Python exceptions (ValueError / KeyError / IndexError / RuntimeError
depending on the StatusCode).
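
The failure translation described above can be modeled in plain Python. This is only a sketch: the `StatusCode` member names and the code-to-exception pairing below are assumptions for illustration (the commit message names the four exception types but not which code maps to which), and the real translation happens inside the C++ binding rather than in Python.

```python
from enum import Enum, auto


class StatusCode(Enum):
    # Hypothetical member names; the real StatusCode enum lives in the C++ core.
    OK = auto()
    INVALID_ARGUMENT = auto()
    NOT_FOUND = auto()
    OUT_OF_RANGE = auto()
    INTERNAL = auto()


# Assumed pairing of codes to the exception types named in the commit message.
_EXCEPTION_FOR_CODE = {
    StatusCode.INVALID_ARGUMENT: ValueError,
    StatusCode.NOT_FOUND: KeyError,
    StatusCode.OUT_OF_RANGE: IndexError,
}


def raise_for_status(code: StatusCode, message: str = "") -> None:
    """Return silently on OK; otherwise raise the mapped exception."""
    if code is StatusCode.OK:
        return
    # Any code without an explicit entry falls back to RuntimeError.
    raise _EXCEPTION_FOR_CODE.get(code, RuntimeError)(message)
```

With a layer like this, Python callers wrap calls in ordinary `try` / `except` blocks instead of inspecting a status object after every call.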
- python/CMakeLists.txt + python/src/searchplusplus.cpp: the binding
itself. Off by default (the SPP_BUILD_PYTHON option defaults to OFF),
so the regular C++ build doesn't grow a Python dev-header requirement.
- examples/python/hello_search.py: a line-for-line mirror of
examples/hello_search/main.cpp. Same corpus, same queries, same
printed output (verified locally on macOS):
    inverted index total=1
    b score=1.9253
    bm25 total=1
    c score=1.3863
    title:tokenizer total=1
    e score=1.3863
- vcpkg.json: pybind11 lives under a `python` feature so vcpkg only
pulls it in when bindings are actually requested.
- DESIGN.md: pybind11 added to the build-time / dev-only dependency
list with the same justification depth as the existing entries
(header-only, ~200 KB, single-purpose, alternative is hand-rolling
every method against the CPython C API — clears the principle-2
dependency bar).
- python/README.md + examples/python/README.md: build instructions
and a one-time gotcha (the Python module must be built without
AddressSanitizer — the default preset wires ASan into spp_core,
and ASan can't be loaded into a non-instrumented Python via
dlopen).
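
One way to notice a badly built module early is to probe the import in a child interpreter, so that a hard failure at load time cannot take down the calling process. The helper below is a sketch under that assumption; `probe_import` is a hypothetical name and not part of the bindings, and it makes no claim about the exact form an ASan load failure takes.

```python
import subprocess
import sys


def probe_import(module_name: str = "searchplusplus", timeout: float = 30.0):
    """Try `import module_name` in a child interpreter.

    Returns (True, "") on success, or (False, stderr_text) on any failure,
    including a crash of the child process while loading the module.
    """
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode == 0, result.stderr.strip()
```

If the probe fails right after a build, the preset used for the build is the first thing to check, per the README gotcha.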
The binding covers only the v0.1 surface (schema, writer, searcher);
v0.2 LTR ranker registration and per-token-weight features are not
wired up here — straightforward extension when there's a caller.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DenisRaskovalov added a commit to DenisRaskovalov/SearchPlusPlus that referenced this pull request on May 28, 2026
evilmucedin#9 merged before the clang-format follow-up commit on that PR landed, so main still carries the un-formatted python/src/searchplusplus.cpp. That makes the clang-format lane red on every subsequent PR, including this README-only one, even though no PR after evilmucedin#9 has touched the file.

Apply the same in-tree .clang-format edits the follow-up commit on evilmucedin#9 made (no behavior change): IncludeBlocks: Regroup moves stdlib above third-party, and a few one-line lambdas / class_ instantiations fit on a single line under the 100-column limit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Optional CPython extension module that exposes the SearchPlusPlus core to Python via pybind11, plus a Python example that mirrors `examples/hello_search/main.cpp` line-for-line.

- Binding (`python/src/searchplusplus.cpp`) covers the v0.1 surface: `Schema`, `Document`, `IndexWriter`, `IndexReader`, `Searcher`, `SearchResult`, `Hit`. Python-side methods are `snake_case`. `Status` / `Expected<T>` failures raise `ValueError` / `KeyError` / `IndexError` / `RuntimeError` based on the underlying `StatusCode`, so callers get normal `raise` semantics instead of having to check `.ok()`.
- Example (`examples/python/hello_search.py`) reproduces the C++ example's output exactly.
- Off by default: `SPP_BUILD_PYTHON=OFF`. The regular build stays self-contained; no Python dev-header dependency unless you opt in.
- pybind11 added under a vcpkg `python` feature so `vcpkg install` only pulls it in when the bindings are actually requested. Justified in `DESIGN.md` against principle 2: header-only, ~200 KB, single-purpose, alternative is hand-rolling every method against the raw CPython C API.

Build
The README documents the one gotcha: don't build the bindings under the `default` (ASan) preset, because ASan can't initialize when the `.so` is `dlopen`'d into a non-instrumented Python interpreter.

Test plan
- `release` preset (macOS, Apple Clang 21.0.0)
- `hello_search.py` runs end-to-end and produces output byte-identical to the C++ example

🤖 Generated with Claude Code
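
The byte-identical check in the test plan could be scripted along the following lines. This is a sketch, not the procedure the author used: the helper names are hypothetical, and the path to the built C++ example binary depends on the local build tree, so it is left to the caller.

```python
import subprocess
import sys


def python_cmd(script_path: str) -> list:
    """Command that runs a Python script under the current interpreter."""
    return [sys.executable, script_path]


def stdout_bytes(cmd) -> bytes:
    """Run cmd and return its raw stdout; raises CalledProcessError on a nonzero exit."""
    return subprocess.run(cmd, check=True, capture_output=True).stdout


def outputs_identical(cmd_a, cmd_b) -> bool:
    """True when both commands print exactly the same bytes to stdout."""
    return stdout_bytes(cmd_a) == stdout_bytes(cmd_b)
```

A call would pass the C++ example's executable as one command and `python_cmd("examples/python/hello_search.py")` as the other. Comparing raw bytes rather than decoded text keeps differences in line endings or trailing whitespace visible.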