FR: Allow use of local whisper instance #2

Open
brimwats opened this issue Apr 8, 2023 · 29 comments
Labels
enhancement New feature or request

Comments

@brimwats

brimwats commented Apr 8, 2023

I am using the local version of whisper: https://github.com/ahmetoner/whisper-asr-webservice/

Is there, or could there be, a setting to use the local endpoint? I might have missed it!

@FeralFlora

Or perhaps using this port of OpenAI's Whisper model in C/C++: https://github.com/ggerganov/whisper.cpp

@tmfelwu
Contributor

tmfelwu commented Apr 26, 2023

@brimwats how is the performance of the local Whisper model? Which model do you use, and what are your hardware specs?

@dahifi

dahifi commented May 19, 2023

@tmfelwu it's pretty easy to test.

I was going to use https://github.com/matusstas/openai-whisper-microservice, but the one brimwats posted might be a better choice. As long as these services support the OpenAI Whisper API, switching the URL out should be no problem; otherwise, adapters will be needed.
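
To make the "switching the URL out" part concrete, this is roughly the request shape an OpenAI-compatible service has to accept (a hypothetical sketch, not the plugin's actual code; "transcribe" is my own name for it, and it assumes an environment with fetch/FormData, e.g. Node 18+ or a browser):

// Hypothetical helper: only the base URL should need to change between
// api.openai.com and a self-hosted, OpenAI-compatible Whisper service.
async function transcribe(baseUrl: string, apiKey: string, audio: Blob): Promise<string> {
  const form = new FormData();
  form.append("file", audio, "recording.webm");
  form.append("model", "whisper-1"); // most self-hosted servers ignore this
  const res = await fetch(`${baseUrl}/v1/audio/transcriptions`, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` }, // local servers typically accept any value
    body: form,
  });
  if (!res.ok) throw new Error(`transcription failed: ${res.status}`);
  return (await res.json()).text;
}

Anything that doesn't accept this shape is where an adapter would come in.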

@dahifi

dahifi commented May 19, 2023

It doesn't seem like too big of a lift; I'm forking and tracking here: https://github.com/dahifi/whisper-asr-webservice-obsidian-plugin/issues/1

@nikdanilov
Owner

Not many people know how to start the Whisper service locally. So I think it's better to keep it connected to the API for now, until we get more traction from the community.

@FeralFlora

Not many people know how to start the Whisper service locally. So I think it's better to keep it connected to the API for now, until we get more traction from the community.

Conversely, you might get more traction if you could run Whisper locally.

@SunnyOd

SunnyOd commented Jun 21, 2023

@nikdanilov Firstly, thanks for making the plugin. I think it's a great idea and one that's sorely needed! I have some feedback, hope you don't mind.

I think anyone who's using Obsidian over Notion or any of the other PKM systems does so by choice. Notion may be easier to use, but it's closed source, they lock in your data and make it hard to export, and the service is proprietary. Using an API to do STT is the sort of thing Notion would advocate, not Obsidian.

IMO, Obsidian users:

  • probably don't want to share their data with a corporation
  • are tech-savvy: setting up and enabling something as simple as a template using Templater is pretty complex when you compare it with the couple of clicks needed in Notion

@MSBack has hit the nail on the head. I think the missing link in most of the community AI plugins for Obsidian is that they're all using OpenAI API calls to do all the heavy lifting. There are no local options, and that makes no sense given the ethos/principles behind Obsidian.md.

@tmfelwu
Contributor

tmfelwu commented Jun 21, 2023

Why not have both options? The OpenAI APIs are already implemented; let's have a switch to change from the remote to a local instance.

@levmckinney

levmckinney commented Jun 25, 2023

Currently, OpenAI says that they don't train their models on API calls and have a 30-day data retention policy (source). But that could change. My Obsidian notes are about the last thing I want going into training data. A self-hosted option, even if it's hard to set up, would be deeply appreciated.

@gavrilov
Contributor

To run a local Whisper model and connect it with the plugin:
https://gist.github.com/gavrilov/4537a569b7fa8e20e64a199e924d458a

@Hunanbean-Collective

Hunanbean-Collective commented Aug 19, 2023

Conversely, you might get more traction if you could run Whisper locally.

Exactly. The only reason I am not using this plugin is that it does not function with a local Whisper installation.
I'm looking into gavrilov's solution, but I really think it would make more sense to have it built in, for all the reasons listed above, and then some.
I thank the author for the effort and the share, regardless.

@Hunanbean-Collective

gavrilov's solution above worked beautifully. It also simplified the install process for me. I used a Conda environment instead of venv, but was still able to follow the instructions easily. Thank you both, gavrilov and nikdanilov. Also, Whisper on a 12 GB 3060 is almost instant (not real-time; I mean in transcribing the recording) and the accuracy is amazing. I am quite impressed.

@djmango

djmango commented Aug 28, 2023

I am using the local version of whisper: https://github.com/ahmetoner/whisper-asr-webservice/

Is there, or could there be, a setting to use the local endpoint? I might have missed it!

I built a plugin around this a while ago; instructions for setting up locally with Docker are linked:

https://github.com/djmango/obsidian-transcription

@dahifi

dahifi commented Sep 13, 2023

I am using the local version of whisper: https://github.com/ahmetoner/whisper-asr-webservice/
Is there, or could there be, a setting to use the local endpoint? I might have missed it!

I built a plugin around this a while ago; instructions for setting up locally with Docker are linked:

https://github.com/djmango/obsidian-transcription

Dude! @djmango

This worked flawlessly. I ran the Docker container on my MacBook and it worked, no problem. I'll have to play around with the models; the output isn't up to par with whatever AI Commander's transcription command produces, but thank you so much for this.

@djmango

djmango commented Sep 13, 2023

I am using the local version of whisper: https://github.com/ahmetoner/whisper-asr-webservice/
Is there, or could there be, a setting to use the local endpoint? I might have missed it!

I built a plugin around this a while ago; instructions for setting up locally with Docker are linked:
https://github.com/djmango/obsidian-transcription

Dude! @djmango

This worked flawlessly. I ran the Docker container on my MacBook and it worked, no problem. I'll have to play around with the models; the output isn't up to par with whatever AI Commander's transcription command produces, but thank you so much for this.

That's great to hear! I poked around the source of AI Commander (I was pleasantly surprised to see my Stack Overflow solution linked as a source 😆); it seems to be using the OpenAI Whisper API, which I believe uses the openai/whisper-large-v2 model.

Given sufficient memory (16 GB, if I recall correctly), you should be able to run that same model on the Whisper ASR service, at the cost of speed, but I found the quality of output to be worth it. I ended up rolling my own webservice so that I can run it on my mobile devices too, but I know use cases differ significantly in the notetaking/knowledge-work space.

What other tools/apps do you suggest I build transcription plugins for? I'm looking to expand the functionality of Obsidian Transcription significantly over the next two weeks, and from there build a wider library of plugins so more people can use the tool.

@dahifi

dahifi commented Oct 6, 2023

@djmango I'm building pipelines for Discord using Steamship right now. If you're interested in touching base I can send you a link to my Discord server.

@djmango

djmango commented Oct 10, 2023

@djmango I'm building pipelines for Discord using Steamship right now. If you're interested in touching base I can send you a link to my Discord server.

Please do!

@dahifi

dahifi commented Oct 11, 2023

@djmango I sent a forwarded GH notification email to your swiftlink account with the invite, from my handle at gmail.

@djmango

djmango commented Oct 13, 2023 via email

@dahifi

dahifi commented Oct 16, 2023

Here's the Steamship Discord. You'll see me in there and we can connect: https://discord.gg/GaEeyGG9

@didmar

didmar commented Dec 25, 2023

I've made a Docker image to run it as a local server (based on the project that @gavrilov is referring to in their gist).
Install Docker and run:

docker run -d -p 127.0.0.1:8000:8000 -v ./.cache:/root/.cache:rw didmar/whisper-api-server:latest

Then change the API URL to http://localhost:8000/v1/audio/transcriptions in the plugin settings, and that's it!

@philosowaffle

+1 for the Docker image, @didmar. Working great for me!

@eatgrass

eatgrass commented Jan 1, 2024

I think a better solution would be using Hugging Face Transformers in the browser directly. No installation, no dependencies, totally local:

https://huggingface.co/spaces/Xenova/whisper-web
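
For reference, here's a rough sketch of what that looks like with transformers.js, the library behind that demo (the model name and exact API are taken from its docs, so worth double-checking):

import { pipeline } from "@xenova/transformers";

// The first call downloads the model; after that, everything runs locally in the browser.
const transcriber = await pipeline("automatic-speech-recognition", "Xenova/whisper-tiny.en");
const { text } = await transcriber("recording.wav"); // a URL or raw audio data
console.log(text);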

@nikdanilov
Owner

Folks 👋
I appreciate your feedback and understand the importance of using a local Whisper model. I'm currently exploring options to add this to the plugin. Thank you for your patience and support! 😊

@jnrsloth

jnrsloth commented May 3, 2024

I've managed to get this working using https://github.com/ggerganov/whisper.cpp.
I couldn't find whether anyone else had got it working, so I thought I'd leave a short guide for how I did it.
I'm running the server in an incus container on a Linux (Pop!_OS) host. I used a container because I have an AMD 6800 XT and ROCm does not play well on Pop!_OS. I used the following tutorial to set up my container (https://discuss.linuxcontainers.org/t/ai-tutorial-llama-cpp-and-ollama-servers-plugins-for-vs-code-vs-codium-and-intellij/19744); you just need to swap out the llama.cpp stuff for whisper.cpp (extra make commands are needed to compile for a GPU).
whisper.cpp has a server function which provides the endpoint that we can point to with the API URL.
I start the whisper.cpp server with ./server -m /home/ubuntu/whisper.cpp/models/ggml-medium.en.bin --host 0.0.0.0 --convert
As for the plugin settings in Obsidian:

  • API key: <does not matter, as long as it's something>
  • API URL: for me it is http://<container IP>:8080/inference
  • Model: blank
  • Prompt: blank
  • Language: blank

I hope this helps others!
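
If you want to sanity-check the endpoint outside Obsidian first, here's a rough sketch of the request the plugin ends up making (my own snippet, Node 18+ run as an ES module; the field names follow the whisper.cpp server example, so verify them against your build, and sample.wav is just a placeholder):

import { readFile } from "node:fs/promises";

// whisper.cpp's server expects the audio under a multipart "file" field.
const form = new FormData();
form.append("file", new Blob([await readFile("sample.wav")]), "sample.wav");
form.append("response_format", "json");

const res = await fetch("http://<container IP>:8080/inference", { method: "POST", body: form });
console.log(await res.json()); // expect something like { "text": "..." }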

@Hunanbean-Collective

I couldn't find whether anyone else had got it working

#2 (comment)

Well, all you had to do was check the messages above yours, as a working solution was shared and verified quite some time back. A guide, if accurate, is always welcome, of course.

@jnrsloth

jnrsloth commented May 5, 2024

Ahh, I assumed that because the issue was left open, it wasn't fully solved.
I did lean on this thread for guidance, including gavrilov's excellent example; however, his solution uses the full-fat Whisper models and (from my skimming of his code) stands up its own server to handle requests. I think that for less experienced users, a pre-established project like whisper.cpp offers a lot of extra troubleshooting support, and having something prepackaged that also runs relatively quickly on a CPU opens up usability to a lot more people.
Cheers!

@gavrilov
Contributor

gavrilov commented Jun 8, 2024

Updated the instructions on how to use the plugin with a local whisper.cpp model:
https://gist.github.com/gavrilov/4537a569b7fa8e20e64a199e924d458a

@lzy-lad

lzy-lad commented Jun 15, 2024

The API URL had to be "http://127.0.0.1:8000/inference" instead of just "127.0.0.1:8000/inference" for me to get this running.
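
My guess at the cause (not verified against the plugin source): the setting is probably handed to the standard URL parser or fetch, which rejects scheme-less strings:

new URL("http://127.0.0.1:8000/inference"); // ok
new URL("127.0.0.1:8000/inference");        // throws TypeError: Invalid URL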
