Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Voice functionality added with Whisper #52

Merged
merged 7 commits into from
Dec 5, 2023

Conversation

younesbram
Copy link
Contributor

Added voice functionality with whisper. Implemented a --voice command-line flag using argparse.

@joshbickett
Copy link
Contributor

@younesbram I'm excited for this PR, but I ran into an issue installing whisper-mic. It sounds like it may be resolvable with brew install portaudio, but my concern is about adding another install statement for the overall project for it to use voice which may not be used by all users.

I created a ChatGPT thread with the context here: https://chat.openai.com/share/0446a868-42f8-469e-953d-394bf21e9ad0

Do you know a better solution that doesn't require brew install portaudio?

@joshbickett
Copy link
Contributor

If there's some way to only require brew install portaudio if the user is doing voice, that could be good, but right now I think it will break for all users on pip install -r requirements.txt if they do not have portaudio

@younesbram
Copy link
Contributor Author

I don't have the issue when installing, I think its a mac thing. Wish my macbook wasn't broken else Id test on that too.

I tried it with wsl and gitbash and normal cmd. Will try to debug right now and investigate for any solutions to seamlessly not break installation for mac users who are using voice.

@younesbram
Copy link
Contributor Author

I found this in the troubleshooting section :
image
Please let me know

@younesbram
Copy link
Contributor Author

image
Many options to choose from!
I personally like adding the troubleshooting in the documentation. First place people go to after a bug is the same github repo they got it from.

@joshbickett
Copy link
Contributor

I think I like 1 conditional requirement most. My concern with 2 documentation is it sounds like all mac users will see a red fail on pip install -r requirements.txt. If they ignore and don't use mic then it sounds like it will work fine. Most will probably feel like they need to troubleshoot it even if they technically don't. Let me know if I am understanding this correctly.

I like 1 conditional requirement because it sounds like it wouldn't add any installation steps for the dev on the readme.md. That's really my goal, not to add another step. If we could add a conditional install that'd be great, but I'm not aware if that's possible with brew install portaudio. As far as I'm aware I've never used pip to start a brew install command, but honestly I don't know what's possible here. If you could figure it out and resolve, I'll look at the PR again and see if it works for me, Thanks!

@younesbram
Copy link
Contributor Author

Thing is.. if its breaking for users in the pip install -r requirements.txt part (step 5), the only valid solution is to add another step to install for voice users only. This will also be a chance to say run operate --voice as a final step. I tried doing the markers thing but you were right, the package cannot be installed with pip needs (apt or brew)

@joshbickett
Copy link
Contributor

Ohh, I think I found something. See Optional Dependencies in updated chat: https://chat.openai.com/share/0446a868-42f8-469e-953d-394bf21e9ad0

We could add separate pip install -r requirements-audio.txt for audio users. What do you think of this? If you're interested in implementing this then add a separate section to the quick start in the README.md to include "quick start with audio" or something like that. Then users using with-audio can have a separate installation guide entirely. I think this will work

@joshbickett
Copy link
Contributor

joshbickett commented Dec 2, 2023

@younesbram Oh I just realized the ChatGPT share link is not updating with new information for some reason. Here's the update for context.

__
@joshbickett message: Well that's a duesy. We really want to add accessibility to this project but we can not ask all users to run brew install portaudio. Only the users that want to use the terminal flag with-audio should have to run this. I guess "technically" users could just ignore ERROR: Could not build wheels for pyaudio which use PEP 517 and if they don't use with-audio then it will work fine, but that's a big problem because everyone running pip install -r requirements will see it and that'd be quite confusing.

__

Assistant response: You're correct in identifying the challenge of requiring all users to install additional dependencies, especially when it's only needed for a specific feature. However, there are a few strategies you could use to handle this situation more gracefully:

Optional Dependencies: Modify your requirements.txt to only include essential packages. Then, provide a separate requirements-audio.txt for the with-audio feature, which includes PyAudio and its dependencies. In your documentation, instruct users who want to use the with-audio feature to run an additional command, like:

pip install -r requirements-audio.txt

@younesbram
Copy link
Contributor Author

Updated and tested, resolved all merge conflicts. Let me know @joshbickett and cheers for the communication on X.com/didntdrinkwater

Let me know of any future help with UX testing or whatever you may need. I will be testing this project many times with people who don't have tech knowledge / carpal syndrome / amputees.

@joshbickett joshbickett merged commit fdfc6b9 into OthersideAI:main Dec 5, 2023
@joshbickett
Copy link
Contributor

Ok, just finished looking at it. It is merged. Thanks for all your help with this. If you want to make updates to the voice and accessibility features overtime that'd be great!

@younesbram
Copy link
Contributor Author

younesbram commented Dec 5, 2023 via email

@joshbickett
Copy link
Contributor

Actually there may have been an issue for users not using --voice. I reverted the commit because when I did a fresh install I was getting an import issue

@younesbram
Copy link
Contributor Author

younesbram commented Dec 5, 2023 via email

@joshbickett
Copy link
Contributor

Ok, trying to fix it now

@joshbickett
Copy link
Contributor

Having some weird git issues, but I think I was able to fix it. Will share more in a little, need to run some tests

@joshbickett
Copy link
Contributor

@younesbram so I'm not that good at git reverts. I reverted and made the following fix: b011fa8

The problem is I then went to push back to main but was having an issue and somehow lost your recent commits. Anyway, voice is merged back in main and appears to me working without causing issued of devs that don't use it! :)

@younesbram
Copy link
Contributor Author

younesbram commented Dec 5, 2023 via email

@Siddhant-Goswami
Copy link

Siddhant-Goswami commented Dec 7, 2023

Screenshot 2023-12-08 at 2 51 22 AM

Facing this issue while running operate --voice.

I have already ran brew install portaudio and pip3 install -r requirements-audio.txt.

OS: macOS Ventura 13.5.2

@joshbickett
Copy link
Contributor

@Siddhant-Goswami oh. Thanks for letting us know. Could you submit an issue ticket for this if you haven't already?

@younesbram
Copy link
Contributor Author

@Siddhant-Goswami hello friend. seems like an environment issue, try reloading with a new env. i will check on my mac VM with your version of ventura soon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants