Merged
2 changes: 2 additions & 0 deletions CMakeLists.txt
@@ -20,6 +20,7 @@ option(PARAKEET_BUILD_CLI "Build the parakeet CLI binary" ON)
option(PARAKEET_BUILD_TESTS "Build parakeet tests" ON)
option(PARAKEET_BUILD_BENCHMARKS "Build parakeet benchmarks" OFF)
option(PARAKEET_BUILD_EXAMPLES "Build parakeet examples" ON)
option(PARAKEET_BUILD_SERVER_EXAMPLE "Build the Unix socket server example" OFF)
option(BUILD_SHARED_LIBS "Build shared library instead of static" OFF)

# ── Axiom ────────────────────────────────────────────────────────────────────
@@ -188,6 +189,7 @@ message(STATUS " Build type: ${CMAKE_BUILD_TYPE}")
message(STATUS " CLI: ${PARAKEET_BUILD_CLI}")
message(STATUS " Tests: ${PARAKEET_BUILD_TESTS}")
message(STATUS " Examples: ${PARAKEET_BUILD_EXAMPLES}")
message(STATUS " Server example: ${PARAKEET_BUILD_SERVER_EXAMPLE}")
message(STATUS " Benchmarks: ${PARAKEET_BUILD_BENCHMARKS}")
message(STATUS " Shared libs: ${BUILD_SHARED_LIBS}")
message(STATUS " Install prefix: ${CMAKE_INSTALL_PREFIX}")
5 changes: 5 additions & 0 deletions Makefile
@@ -6,6 +6,11 @@ ifdef CLI
CMAKE_FLAGS += -DPARAKEET_BUILD_CLI=$(CLI)
endif

# Optional: make build SERVER=ON
ifdef SERVER
CMAKE_FLAGS += -DPARAKEET_BUILD_SERVER_EXAMPLE=$(SERVER)
endif

# Use Ninja if available, otherwise Unix Makefiles
ifneq ($(shell which ninja 2>/dev/null),)
GENERATOR := Ninja
8 changes: 8 additions & 0 deletions examples/CMakeLists.txt
@@ -13,3 +13,11 @@ add_subdirectory(nemotron)
add_subdirectory(diarize)
add_subdirectory(diarized-transcription)
add_subdirectory(c-api)

if(PARAKEET_BUILD_SERVER_EXAMPLE)
if(UNIX)
add_subdirectory(server)
else()
message(WARNING "PARAKEET_BUILD_SERVER_EXAMPLE is ON, but the server example currently requires Unix domain sockets")
endif()
endif()
1 change: 1 addition & 0 deletions examples/README.md
@@ -30,6 +30,7 @@ Disable with `-DPARAKEET_BUILD_EXAMPLES=OFF`.
| [batch](batch/) | Batch transcription of multiple files |
| [vad](vad/) | Voice activity detection (standalone + ASR preprocessing) |
| [gpu](gpu/) | Metal GPU acceleration and FP16 with timing comparison |
| [server](server/) | Warm transcriber daemon over a Unix domain socket (opt-in build) |

## Streaming

2 changes: 2 additions & 0 deletions examples/server/CMakeLists.txt
@@ -0,0 +1,2 @@
add_executable(example-server main.cpp)
target_link_libraries(example-server PRIVATE parakeet_lib)
103 changes: 103 additions & 0 deletions examples/server/README.md
@@ -0,0 +1,103 @@
# Server

`example-server` keeps a Parakeet transcriber warm inside one process and
accepts newline-delimited JSON requests over a Unix domain socket. It is an
example for consumers who want persistent-process reuse without changing
parakeet.cpp core code.

## Build

This example is opt-in:

```bash
make build SERVER=ON
# Binary: ./build/examples/server/example-server
```

You can also configure CMake directly:

```bash
cmake -B build -DPARAKEET_BUILD_SERVER_EXAMPLE=ON
cmake --build build
```

## Usage

```bash
./build/examples/server/example-server /tmp/parakeet.sock \
model.safetensors vocab.txt [options]
```

Server startup options:

- `--model TYPE` — `tdt-ctc-110m` (default) or `tdt-600m`
- `--gpu` — move the model to Metal GPU once at startup
- `--fp16` — cast to fp16 before `--gpu`
- `--vad PATH` — load Silero VAD weights once at startup so requests can opt in

This example keeps one loaded model instance warm per process and supports:

- `tdt-ctc-110m` through the high-level `parakeet::Transcriber` API
- `tdt-600m` through the same reusable loaded-state pattern used by the CLI's
explicit TDT path

## Request protocol

Each request is one JSON object per line:

```json
{"request_id":"1","audio_path":"samples/mm1.wav","decoder":"tdt","timestamps":true}
```

Supported request keys:

- `request_id` — optional string echoed back in the response
- `audio_path` — required path to an audio file readable by `read_audio`
- `decoder` — optional: `tdt`, `ctc`, `tdt-beam`, `ctc-beam`
- `timestamps` — optional boolean
- `use_vad` — optional boolean (requires `--vad PATH` at server startup)
- `beam_width` — optional integer
- `lm_path` — optional ARPA LM path for beam decoders
- `lm_weight` — optional float
- `boost_score` — optional float
- `boost_phrases` — optional array of strings
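
A client might assemble request lines with a small helper like the one below. This is a minimal Python sketch, not part of the example itself: `build_request` and `ALLOWED_DECODERS` are hypothetical names, and the client-side checks only mirror the key list above; how the server validates requests is not shown here.

```python
import json

# Decoder names taken from the list of supported request keys.
ALLOWED_DECODERS = {"tdt", "ctc", "tdt-beam", "ctc-beam"}


def build_request(audio_path, request_id=None, decoder=None, **options):
    """Return one newline-terminated JSON request line (hypothetical helper)."""
    if not audio_path:
        raise ValueError("audio_path is required")
    if decoder is not None and decoder not in ALLOWED_DECODERS:
        raise ValueError(f"unsupported decoder: {decoder}")
    request = {"audio_path": audio_path}
    if request_id is not None:
        # request_id is documented as a string, so coerce it here.
        request["request_id"] = str(request_id)
    if decoder is not None:
        request["decoder"] = decoder
    # Other optional keys (timestamps, use_vad, beam_width, ...) pass through.
    request.update({k: v for k, v in options.items() if v is not None})
    # json.dumps without indent never emits a raw newline, so the object
    # stays on one line and the trailing "\n" is the only frame delimiter.
    return json.dumps(request, separators=(",", ":")) + "\n"
```

For example, `build_request("samples/mm1.wav", request_id="1", decoder="tdt", timestamps=True)` yields a single line carrying the same keys as the request shown earlier.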

Responses are also newline-delimited JSON:

```json
{"ok":true,"request_id":"1","text":"...","elapsed_ms":812}
```

With timestamps enabled, the response also includes `word_timestamps`:

```json
{"ok":true,"request_id":"1","text":"...","elapsed_ms":812,"word_timestamps":[{"word":"hello","start":0.0,"end":0.4,"confidence":0.98}]}
```

Errors are returned on the socket as JSON, while operational logs go to stderr:

```json
{"ok":false,"request_id":"1","error":"audio_path is required"}
```
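
On the client side, the `ok` flag is enough to separate the two response shapes. The sketch below is a minimal Python illustration; `parse_response` and `TranscriptionError` are hypothetical names, and it assumes only the fields shown in the sample responses above.

```python
import json


class TranscriptionError(Exception):
    """Raised when a response carries "ok": false (hypothetical name)."""


def parse_response(line):
    """Decode one response line into (text, word_timestamps, elapsed_ms)."""
    response = json.loads(line)
    if not response.get("ok"):
        raise TranscriptionError(response.get("error", "unknown error"))
    # word_timestamps only appears when the request enabled timestamps,
    # so fall back to an empty list.
    return (
        response["text"],
        response.get("word_timestamps", []),
        response.get("elapsed_ms"),
    )
```

Callers that pipeline several requests can match replies to requests through the echoed `request_id` before handing each line to a parser like this.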

## Example session

Start the server:

```bash
./build/examples/server/example-server /tmp/parakeet.sock model.safetensors vocab.txt --model tdt-600m
```

Send a request:

```bash
printf '%s\n' '{"request_id":"demo","audio_path":"samples/mm1.wav","timestamps":true}' \
| nc -U /tmp/parakeet.sock
```

The server is intentionally example-grade:

- one warm model per server process
- line-delimited request/response framing
- no auth, TLS, or concurrency guarantees
- meant to be wrapped or adapted by downstream applications
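
A downstream application could replace the `nc` call with a few lines of its own. The following Python sketch is one possible shape, not a supported client: `transcribe` is a hypothetical function, the 60-second timeout is an arbitrary choice, and it assumes the server writes one newline-terminated reply per request on the same connection.

```python
import json
import socket


def transcribe(socket_path, request, timeout=60.0):
    """Send one request dict over the Unix socket and return the reply dict."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(socket_path)
        # One JSON object per line: the trailing newline ends the request.
        sock.sendall((json.dumps(request) + "\n").encode("utf-8"))
        # Read up to the newline that terminates the response.
        with sock.makefile("r", encoding="utf-8") as reader:
            line = reader.readline()
    if not line:
        raise ConnectionError("server closed the socket without replying")
    return json.loads(line)
```

Usage against the session above would look like `transcribe("/tmp/parakeet.sock", {"request_id": "demo", "audio_path": "samples/mm1.wav", "timestamps": True})`. Because the example makes no concurrency guarantees, a wrapper like this is probably best called from one thread at a time.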