ggml-org · dnhkng · Nov 19, 2023 · Nov 19, 2023
diff --git a/README.md b/README.md
@@ -751,9 +751,7 @@ in [models](models).
 - [X] .NET: | [#422](https://github.com/ggerganov/whisper.cpp/discussions/422)
   - [sandrohanea/whisper.net](https://github.com/sandrohanea/whisper.net)
   - [NickDarvey/whisper](https://github.com/NickDarvey/whisper)
-- [X] Python: | [#9](https://github.com/ggerganov/whisper.cpp/issues/9)
-  - [stlukey/whispercpp.py](https://github.com/stlukey/whispercpp.py) (Cython)
-  - [aarnphm/whispercpp](https://github.com/aarnphm/whispercpp) (Pybind11)
+- [X] Python: [bindings/python](bindings/python) | [#9](https://github.com/ggerganov/whisper.cpp/issues/9)
 - [X] R: [bnosac/audio.whisper](https://github.com/bnosac/audio.whisper)
 - [X] Unity: [macoron/whisper.unity](https://github.com/Macoron/whisper.unity)
 

diff --git a/bindings/python/README.md b/bindings/python/README.md
@@ -0,0 +1,39 @@
+# Python bindings for Whisper
+
+This is a guide on Python bindings for whisper.cpp. It has been tested on:
+
+  * Darwin (OS X) 14.0 on arm64 - not working, library won't load!
+  * Ubuntu x86_64 - works, also with CUDA acceleration and the distil model!
+
+
+## Usage
+It can be used like this:
+
+  * move the compiled `libwhisper.so` into the same directory, or add its location to the library search path.
+  * rebuild the low-level wrapper if something breaks after changes to whisper.cpp (see below).
+
+```python
+from scipy.io import wavfile
+from whisper_cpp import WhisperCpp
+
+# prepare audio data
+samplerate, audio = wavfile.read("samples/jfk.wav")
+audio = audio.astype("float32") / 32768.0
+
+# run the inference
+model = WhisperCpp(model="./models/ggml-medium.en.bin")
+transcription = model.transcribe(audio)
+
+print(transcription)
+```
+
+## Rebuilding
+
+The low-level bindings are autogenerated and can be regenerated as follows:
+
+```bash
+# from the root whisper.cpp directory
+ctypesgen whisper.h -l whisper -o whisper_cpp_wrapper.py
+```
+
+The interface file will probably need to be rewritten manually after big changes to whisper.cpp, but this is relatively easy (compared to manual wrapping!).
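Review note on the audio preparation step in the README example: dividing by 32768.0 assumes 16-bit mono input, and whisper.cpp expects 16 kHz samples. A minimal sketch of a more defensive helper follows; `prepare_audio` and `WHISPER_SAMPLE_RATE` are hypothetical names used for illustration and are not part of this PR.

```python
import numpy as np

# Assumption: whisper.cpp expects 16 kHz input; this constant is defined here
# for illustration and is not imported from the bindings.
WHISPER_SAMPLE_RATE = 16000


def prepare_audio(audio: np.ndarray, samplerate: int) -> np.ndarray:
    """Return mono, contiguous float32 samples scaled to roughly [-1, 1)."""
    if samplerate != WHISPER_SAMPLE_RATE:
        raise ValueError(
            f"expected {WHISPER_SAMPLE_RATE} Hz audio, got {samplerate} Hz"
        )

    # Scale signed integer PCM by its full range (32768 for int16).
    if np.issubdtype(audio.dtype, np.signedinteger):
        scale = float(np.iinfo(audio.dtype).max) + 1.0
        audio = audio.astype(np.float32) / scale

    # scipy.io.wavfile.read returns (n_samples, n_channels) for multi-channel
    # files, so average over the channel axis to downmix to mono.
    if audio.ndim == 2:
        audio = audio.mean(axis=1)

    return np.ascontiguousarray(audio, dtype=np.float32)
```

The result can be passed to `model.transcribe(...)` in place of the manual `astype`/divide step. Resampling is deliberately left out; the helper only rejects input at the wrong rate.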
diff --git a/bindings/python/whisper_cpp.py b/bindings/python/whisper_cpp.py
@@ -0,0 +1,50 @@
+import ctypes
+
+import numpy as np
+import whisper_cpp_wrapper
+
+
+class WhisperCpp:
+    """Wrapper around whisper.cpp, which is a C++ implementation of the Whisper
+    speech recognition model.
+    """
+
+    def __init__(self, model: str, params=None) -> None:
+        self.ctx = whisper_cpp_wrapper.whisper_init_from_file(model.encode("utf-8"))
+
+    def transcribe(self, audio: np.ndarray, params=None) -> str:
+        """Transcribe audio using the given parameters.
+
+        `audio` is 16 kHz mono float32 PCM; `params` is a whisper_full_params
+        struct from whisper_cpp_wrapper (greedy defaults are used if omitted).
+        """
+        # Set the default parameters if none are given
+        if params is None:
+            self.params = whisper_cpp_wrapper.whisper_full_default_params(
+                whisper_cpp_wrapper.WHISPER_SAMPLING_GREEDY  # 0, faster
+            )
+        else:
+            self.params = params
+
+        audio = np.ascontiguousarray(audio, dtype=np.float32)  # layout whisper_full expects
+        whisper_cpp_audio = audio.ctypes.data_as(ctypes.POINTER(ctypes.c_float))
+        result = whisper_cpp_wrapper.whisper_full(
+            self.ctx, self.params, whisper_cpp_audio, len(audio)
+        )
+        if result != 0:
+            raise RuntimeError(f"Error from whisper.cpp: {result}")
+
+        # Get the text
+        n_segments = whisper_cpp_wrapper.whisper_full_n_segments(self.ctx)
+        text = [
+            whisper_cpp_wrapper.whisper_full_get_segment_text(self.ctx, i)
+            for i in range(n_segments)
+        ]
+
+        return "".join(segment.decode("utf-8") for segment in text)
+
+    def __del__(self):
+        """Free the C++ object when this Python object is garbage collected."""
+        ctx = getattr(self, "ctx", None)  # missing if __init__ raised early
+        if ctx:
+            whisper_cpp_wrapper.whisper_free(ctx)
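
Review note on the segment loop in `transcribe`: it can be exercised without a compiled `libwhisper.so` by passing in a stand-in for `whisper_cpp_wrapper`. The sketch below assumes the wrapper returns each segment as UTF-8 `bytes`, as the `.decode("utf-8")` call in the diff implies. `FakeWrapper` and `collect_text` are hypothetical names used for illustration and are not part of this PR.

```python
class FakeWrapper:
    """Hypothetical stand-in for the two segment accessors transcribe() calls."""

    def __init__(self, segments):
        # Store segments as bytes, mirroring what the real wrapper is assumed to return.
        self._segments = [s.encode("utf-8") for s in segments]

    def whisper_full_n_segments(self, ctx):
        return len(self._segments)

    def whisper_full_get_segment_text(self, ctx, i):
        return self._segments[i]


def collect_text(wrapper, ctx) -> str:
    """Decode every segment and concatenate them in order."""
    n_segments = wrapper.whisper_full_n_segments(ctx)
    parts = [
        wrapper.whisper_full_get_segment_text(ctx, i).decode("utf-8")
        for i in range(n_segments)
    ]
    # An empty list joins to "", so audio with no recognised speech is handled.
    return "".join(parts)


if __name__ == "__main__":
    fake = FakeWrapper([" And so,", " my fellow Americans."])
    print(collect_text(fake, ctx=None))
```

Concatenating all segments covers multi-segment audio and the zero-segment case, where indexing the first element would fail.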