Voice functionality added with Whisper #52

younesbram · 2023-12-01T21:33:58Z

Added voice functionality with whisper. Implemented a --voice command-line flag using argparse.
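
For readers following along, here is a minimal sketch of what a `--voice` switch built with argparse could look like. Only the flag name comes from this PR; the `build_parser` function, the description, and the help text are illustrative assumptions, not the code that was merged.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: only the --voice flag name is taken from the PR.
    parser = argparse.ArgumentParser(description="Illustrative CLI with a voice switch.")
    parser.add_argument(
        "--voice",
        action="store_true",  # False unless the flag is passed
        help="Take the objective from the microphone instead of typed input.",
    )
    return parser


# Parse explicit argument lists so the behavior is easy to see.
default_args = build_parser().parse_args([])
voice_args = build_parser().parse_args(["--voice"])
print(default_args.voice, voice_args.voice)
```

Using `action="store_true"` keeps the flag optional, so existing invocations without `--voice` behave as before.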

joshbickett · 2023-12-02T14:19:51Z

@younesbram I'm excited for this PR, but I ran into an issue installing whisper-mic. It sounds like it may be resolvable with brew install portaudio, but my concern is that this adds another install step to the overall project for a voice feature that not all users will use.

I created a ChatGPT thread with the context here: https://chat.openai.com/share/0446a868-42f8-469e-953d-394bf21e9ad0

Do you know a better solution that doesn't require brew install portaudio?

joshbickett · 2023-12-02T14:22:05Z

If there's some way to only require brew install portaudio if the user is doing voice, that could be good, but right now I think it will break for all users on pip install -r requirements.txt if they do not have portaudio

younesbram · 2023-12-02T14:38:38Z

I don't have the issue when installing. I think it's a Mac thing. Wish my MacBook wasn't broken, or I'd test on that too.

I tried it with wsl and gitbash and normal cmd. Will try to debug right now and investigate for any solutions to seamlessly not break installation for mac users who are using voice.

younesbram · 2023-12-02T14:45:44Z

I found this in the troubleshooting section :

Please let me know

younesbram · 2023-12-02T14:46:58Z

Many options to choose from!
I personally like adding the troubleshooting in the documentation. First place people go to after a bug is the same github repo they got it from.

joshbickett · 2023-12-02T14:54:33Z

I think I like 1 conditional requirement most. My concern with 2 documentation is it sounds like all mac users will see a red fail on pip install -r requirements.txt. If they ignore and don't use mic then it sounds like it will work fine. Most will probably feel like they need to troubleshoot it even if they technically don't. Let me know if I am understanding this correctly.

I like 1 conditional requirement because it sounds like it wouldn't add any installation steps for the dev on the readme.md. That's really my goal, not to add another step. If we could add a conditional install that'd be great, but I'm not aware if that's possible with brew install portaudio. As far as I'm aware I've never used pip to start a brew install command, but honestly I don't know what's possible here. If you could figure it out and resolve, I'll look at the PR again and see if it works for me, Thanks!

younesbram · 2023-12-02T15:13:16Z

Thing is, if it's breaking for users in the pip install -r requirements.txt part (step 5), the only valid solution is to add another install step for voice users only. This will also be a chance to say run operate --voice as a final step. I tried doing the markers thing, but you were right: the package cannot be installed with pip; it needs apt or brew.

joshbickett · 2023-12-02T15:28:31Z

Ohh, I think I found something. See Optional Dependencies in updated chat: https://chat.openai.com/share/0446a868-42f8-469e-953d-394bf21e9ad0

We could add separate pip install -r requirements-audio.txt for audio users. What do you think of this? If you're interested in implementing this then add a separate section to the quick start in the README.md to include "quick start with audio" or something like that. Then users using with-audio can have a separate installation guide entirely. I think this will work

joshbickett · 2023-12-02T15:32:31Z

@younesbram Oh I just realized the ChatGPT share link is not updating with new information for some reason. Here's the update for context.

__
@joshbickett message: Well that's a doozy. We really want to add accessibility to this project, but we cannot ask all users to run brew install portaudio. Only the users that want to use the terminal flag with-audio should have to run this. I guess "technically" users could just ignore ERROR: Could not build wheels for pyaudio which use PEP 517, and if they don't use with-audio then it will work fine, but that's a big problem because everyone running pip install -r requirements will see it and that'd be quite confusing.

__

Assistant response: You're correct in identifying the challenge of requiring all users to install additional dependencies, especially when it's only needed for a specific feature. However, there are a few strategies you could use to handle this situation more gracefully:

Optional Dependencies: Modify your requirements.txt to only include essential packages. Then, provide a separate requirements-audio.txt for the with-audio feature, which includes PyAudio and its dependencies. In your documentation, instruct users who want to use the with-audio feature to run an additional command, like:

pip install -r requirements-audio.txt
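
One way to make that split visible to users, sketched under assumptions: before starting voice mode, check whether the distributions listed in requirements-audio.txt are installed and print the extra install command if they are not. The function name and the distribution name passed to it are hypothetical; the actual contents of requirements-audio.txt are not shown in this thread.

```python
from importlib import metadata


def missing_distributions(names):
    """Return the distribution names from `names` that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Hypothetical list; substitute the real entries from requirements-audio.txt.
audio_requirements = ["whisper-mic"]
not_installed = missing_distributions(audio_requirements)
if not_installed:
    print("Voice mode needs extra packages: " + ", ".join(not_installed))
    print("Install them with: pip install -r requirements-audio.txt")
```

This only reports what is missing; system libraries such as portaudio still have to be installed outside pip.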

younesbram · 2023-12-03T17:28:39Z

Updated and tested, resolved all merge conflicts. Let me know @joshbickett and cheers for the communication on X.com/didntdrinkwater

Let me know if you need any future help with UX testing or anything else. I will be testing this project many times with people who don't have tech knowledge, people with carpal tunnel syndrome, and amputees.

joshbickett · 2023-12-05T14:16:16Z

Ok, just finished looking at it. It is merged. Thanks for all your help with this. If you want to make updates to the voice and accessibility features over time, that'd be great!

younesbram · 2023-12-05T14:22:58Z

For sure Josh! Thank you for your help as well. I am sure this project will keep growing, and I will help as much as I can in my free time when I'm not looking for jobs. Cheers!

joshbickett · 2023-12-05T14:28:30Z

Actually there may have been an issue for users not using --voice. I reverted the commit because when I did a fresh install I was getting an import issue
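
An import error like this typically appears when an optional package is imported at module top level, so every user hits it even without `--voice`. The thread does not state that this was the cause here, and the following is not necessarily the fix that was applied. It is only a sketch of the general pattern of deferring the import until the flag is set. The module name `whisper_mic` and the function names are assumptions.

```python
import importlib


def load_voice_backend(module_name: str = "whisper_mic"):
    """Import the optional voice module only when voice mode is requested."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Exit with a readable hint instead of a raw traceback.
        raise SystemExit(
            "Voice mode needs the optional audio dependencies. "
            "Try: pip install -r requirements-audio.txt"
        ) from exc


def run(voice: bool) -> str:
    # The optional import happens only on the voice path.
    if voice:
        load_voice_backend()
        return "voice"
    return "text"


print(run(False))
```

With this layout a fresh install without the audio packages can still start in text mode, because nothing audio-related is imported until `run(True)` is called.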

younesbram · 2023-12-05T14:34:42Z

Will check.

joshbickett · 2023-12-05T14:52:30Z

Ok, trying to fix it now

joshbickett · 2023-12-05T15:17:47Z

Having some weird git issues, but I think I was able to fix it. Will share more in a little, need to run some tests

joshbickett · 2023-12-05T15:33:19Z

@younesbram so I'm not that good at git reverts. I reverted and made the following fix: b011fa8

The problem is that I then went to push back to main but was having an issue and somehow lost your recent commits. Anyway, voice is merged back into main and appears to be working without causing issues for devs that don't use it! :)

younesbram · 2023-12-05T15:47:05Z

No worries! I usually use git fetch --all to fetch the upstream repositories, then git rebase, and I resolve merge conflicts if there are any. Much better than git merge. That's just me though. ChatGPT helps a lot too! Thanks for your assistance, I'll check it out tonight.

Siddhant-Goswami · 2023-12-07T21:18:48Z

Facing this issue while running operate --voice.

I have already run brew install portaudio and pip3 install -r requirements-audio.txt.

OS: macOS Ventura 13.5.2

joshbickett · 2023-12-08T15:12:39Z

@Siddhant-Goswami oh. Thanks for letting us know. Could you submit an issue ticket for this if you haven't already?

younesbram · 2023-12-08T15:22:20Z

@Siddhant-Goswami hello friend. seems like an environment issue, try reloading with a new env. i will check on my mac VM with your version of ventura soon

michaelhhogue added the enhancement New feature or request label Dec 2, 2023

michaelhhogue mentioned this pull request Dec 2, 2023

Add support for speech recognition with a voice button #38

Closed

younesbram added 2 commits December 3, 2023 12:24

updating usability of voice feauture and installation

f2a8147

added better instructions for voice users

1077615

younesbram force-pushed the voice branch from 8f6b492 to 1077615 Compare December 3, 2023 17:26

joshbickett added 5 commits December 5, 2023 05:56

Remove duplicate voice_mode

9efbd9b

Merge branch 'main' into younesbram-voice

421907e

Create new ### Using operate Modes section

3576650

Update headers

725103c

Update more headers in README.md

ef5921f

joshbickett merged commit fdfc6b9 into OthersideAI:main Dec 5, 2023

joshbickett mentioned this pull request Dec 5, 2023

Revert "Voice functionality added with Whisper" #81

Merged

joshbickett mentioned this pull request Dec 6, 2023

Add speech to narrate actions. #24

Open
