Whispercpp v1.8.4 api #168

Merged
absadiki merged 18 commits into absadiki:main from scottmonster:whispercpp-v1.8.4-api
May 26, 2026

Conversation

@scottmonster
Contributor

So... I think I messed up during that last PR for typing. When I added **params to typing, I pulled just to make sure there were no conflicts from the other PRs. However, I think I pulled origin (my fork) instead of upstream BEFORE my origin/main had the updated v1.8.4 whisper.cpp. So when you merged that typing branch, it also reverted to the original whisper.cpp version. No big deal, because this PR includes it. I just wanted to give you a heads up.

CHANGES:

  • fix prompt_tokens binding: Binding was just a pointer but constants.py indicated tuple.
  • change Model.__call_new_segment_callback from a static method to an instance method so that self._new_segment_callback resolves the instance's callback rather than a global one (i.e. you can have multiple model instances with different new_segment_callbacks)
  • change Model.transcribe so that new_segment_callback is not persistent: when new_segment_callback is None, the callback is always cleared
  • add abort_callback to Model.transcribe; it has the same behavior
  • update whisper.cpp to 9386f239401074690479731c1e41683fbbeac557 whisper.cpp (v1.8.4) ... again
  • main.cpp bindings better integrate with whisper v1.8.4 (specifically context params)
  • bindings for all callbacks now have same shape... mostly
  • Add a compatibility alias that maps suppress_non_speech_tokens to suppress_nst internally.
  • remove all WHISPER_DEPRECATED functions as defined by whisper.h
  • update README.md via AI... I looked it over and didn't see any issues.
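
For illustration only, the callback and alias changes listed above can be sketched in plain Python. This is a toy stand-in, not the pywhispercpp code: `ToyModel`, `_on_new_segment` and `apply_param_aliases` are hypothetical names, and the real bindings call into whisper.cpp rather than looping over a list.

```python
from typing import Callable, Dict, List, Optional


class ToyModel:
    """Toy stand-in for Model (hypothetical, not the pywhispercpp class)."""

    def __init__(self) -> None:
        self._new_segment_callback: Optional[Callable[[str], None]] = None

    def _on_new_segment(self, segment: str) -> None:
        # Instance method: looks up this instance's callback, not a shared global.
        if self._new_segment_callback is not None:
            self._new_segment_callback(segment)

    def transcribe(
        self,
        segments: List[str],
        new_segment_callback: Optional[Callable[[str], None]] = None,
    ) -> List[str]:
        # Reassigned on every call, so a callback from an earlier call never lingers.
        self._new_segment_callback = new_segment_callback
        for segment in segments:
            self._on_new_segment(segment)
        return list(segments)


def apply_param_aliases(params: Dict[str, object]) -> Dict[str, object]:
    """Hypothetical helper: rename the legacy key to suppress_nst."""
    resolved = dict(params)
    if "suppress_non_speech_tokens" in resolved:
        legacy_value = resolved.pop("suppress_non_speech_tokens")
        # An explicitly passed suppress_nst wins over the legacy alias.
        resolved.setdefault("suppress_nst", legacy_value)
    return resolved


seen_a: List[str] = []
seen_b: List[str] = []
model_a, model_b = ToyModel(), ToyModel()
model_a.transcribe(["hello"], new_segment_callback=seen_a.append)
model_b.transcribe(["world"], new_segment_callback=seen_b.append)
model_a.transcribe(["again"])  # no callback passed, so the earlier one is cleared
```

After this runs, each list holds only the segment from its own model, and the third call reaches no callback at all. Letting an explicit `suppress_nst` take precedence over the alias is a design assumption of the sketch, not something stated in the PR.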

OTHER ISSUES/TODO:

  • misc. type issues (model.system_info() -> None: should probably return str)
  • bindings for callbacks could probably use a template
  • move all of the typing to a single place so it doesn't have to be maintained in multiple places
  • possibly add default new_segment_callback and abort callback to the model class

If you care to see some of the documents and scripts I used, you can look at my working branch, specifically the coverage and tests directories. A lot of that is AI generated and didn't feel like it substantially improved anything, so I left it out. You might be interested in having tests/compat and tests/run-compat.py, though.

Owner

@absadiki absadiki left a comment

No worries at all, thanks @scottmonster for the heads up and for the PR.

This looks like a really solid cleanup/update overall.

One thing I noticed is that the default language was changed to "en". I think it should remain an empty str so whisper.cpp can continue auto-detecting the language by default. What do you think?

Also I noticed the AI README changes seem to introduce quite a bit of formatting noise (extra empty lines / spacing changes) without much actual content improvement. I think it would probably be better to keep the README diff smaller and focused on meaningful changes only.

Also, feel free to include the additional test files/scripts you mentioned if you think they improve coverage or compatibility testing. More tests are always welcome when they provide useful coverage.

Comment thread pywhispercpp/constants.py
'type': str,
'description': 'for auto-detection, set to None, "" or "auto"',
'options': None,
'default': ""
Owner

I think this should remain "" for auto-detecting the language by default.
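
As a rough illustration of the convention in the description string above (None, "" or "auto" all mean auto-detection), a check could look like the following. `wants_auto_detect` is a hypothetical helper, not a pywhispercpp function.

```python
from typing import Optional


def wants_auto_detect(language: Optional[str]) -> bool:
    """Hypothetical helper: True when the value requests language auto-detection."""
    # None, the empty string, and "auto" are the three spellings named in constants.py.
    return language is None or language in ("", "auto")


default_language = ""  # the default this review asks to keep
```

With the default left at "", `wants_auto_detect(default_language)` is true, whereas a default of "en" would pin transcription to English.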

Contributor Author

Ah yes, you're right. Originally I was trying to match the default whisper-cli args, which default to en. Once I realized I was going about it the wrong way, I had to change a bunch of stuff. I'll change it back to an empty string and provide a cleaner README shortly.

Owner

Ahh, gotcha .. no worries.

revert language back to "" instead of "en" so that whisper will
auto-detect language
@scottmonster
Contributor Author

So I literally just copied and pasted your README back in and changed line 295 to:

ctx = pwcpp.whisper_init_from_file_with_params(
  'path/to/ggml/model',
  pwcpp.whisper_context_default_params(),
)

@absadiki
Owner

I think everything looks good so far .. we can merge this for now.
Thanks again @scottmonster for the work on this PR.

@absadiki absadiki merged commit c668bd6 into absadiki:main May 26, 2026
24 checks passed