Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DODO-GPT #343

Open
wants to merge 9 commits into
base: master
Choose a base branch
from
Open

Add DODO-GPT #343

wants to merge 9 commits into from

Conversation

mjsutcliffe99
Copy link
Contributor

Add various new features relating to Chat-GPT and Whisper APIs; namely "Dodo Hint", "Dodo Query", "Import image as ZX-diagram", and the text-to-speech and speech-to-text.

Add various new features relating to Chat-GPT and Whisper APIs; namely "Dodo Hint", "Dodo Query", "Import image as ZX-diagram", and the text-to-speech and speech-to-text.
@mjsutcliffe99
Copy link
Contributor Author

To run the DODO-GPT features, you need a paid OpenAI account (although you can add the minimum of $5 and each query costs a tiny fraction of a penny). Simply paste your API key into the users/key.txt file.

@jvdwetering
Copy link
Collaborator

The test fails because it doesn't recognise 'openai' as an installed module. You should add this to requirements.txt and the conf.

Copy link
Collaborator

@jvdwetering jvdwetering left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is cool, but just checking: do these additional imports add a lot to the loading time or to the memory requirements? Or are these openai, pydub, sounddevice and ffmpeg-python modules all relatively low weight?

zxlive/dodo.py Outdated
def get_image_prompt():
"""Returns the prompt that encourages DODO-GPT to describe the given image like a ZX-diagram."""

return """
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe put this large string at the top of the file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!


def get_local_api_key():
"""Get the API key from key.txt file"""
f = open(GET_MODULE_PATH()+"/user/key.txt", "r")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be better to have the API key be settable in the settings of ZXLive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my plan! This is just a quick temporary solution so that I can push the changes. I guess we could also save API keys locally if we hash them first, so that way you won't have to re-enter it in every new instance of the program?


self.dodo_hint = QToolButton(self)
self.dodo_hint.setIcon(QIcon(get_data("icons/dodo.png")))
self.dodo_hint.setToolTip("Dodo Hint")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this is funny, maybe make it more descriptive? "Ask DodoGPT for suggestions on how to rewrite your diagram"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!


self.dodo_query = QToolButton(self)
self.dodo_query.setIcon(QIcon(get_data("icons/mic.svg")))
self.dodo_query.setToolTip("Dodo Query")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps make it clearer that this will use input from your mic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I can add a little note there

@@ -182,6 +186,8 @@ def __init__(self) -> None:
self.effects = {e: load_sfx(e) for e in SFXEnum}

QShortcut(QKeySequence("Ctrl+B"), self).activated.connect(self._toggle_sfx)

get_local_api_key() # load the local user's API key
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will currently give a fatal error if the key file does not exist. So then the file should be added to the manifest. Better yet would be to load the key from the ZXLive Qt Settings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there are a few places I need to add error handling yet

@jvdwetering jvdwetering marked this pull request as draft July 26, 2024 12:34
@mjsutcliffe99
Copy link
Contributor Author

I think this is cool, but just checking: do these additional imports add a lot to the loading time or to the memory requirements? Or are these openai, pydub, sounddevice and ffmpeg-python modules all relatively low weight?

I think they're all relatively light, just for adding audio handling and the gpt API. Certainly I don't notice any lag or slowing down of the boot time or runtime, nor the memory

@mjsutcliffe99
Copy link
Contributor Author

The test fails because it doesn't recognise 'openai' as an installed module. You should add this to requirements.txt and the conf.

It's added to the requirements.txt - I'll try adding it to the conf as well!

@mjsutcliffe99
Copy link
Contributor Author

I've updated the minor changes you suggested @jvdwetering.

The relevant libraries are included in the requirements.txt, but I'm not sure what exactly needs to be added to the conf.py file to get it to pass the checks - if anyone has any ideas here please help out and try to get this to work.

And lastly, regarding the API key, I have some ideas of how we can handle this a bit better, but perhaps that could be a separate PR as I want to ensure first that all the core functionality works on machines (and API keys) other than my own.

@jvdwetering
Copy link
Collaborator

Ah sorry, I mentioned conf, but I meant pyproject.toml, that's the one that stores dependencies.

@jvdwetering
Copy link
Collaborator

The test suite fails because

import sounddevice as sd
/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/sounddevice.py:71: in <module>
    raise OSError('PortAudio library not found')
E   OSError: PortAudio library not found

Perhaps add a check that there is indeed a sound device? Just a Try-Except around the import should be sufficient. If you also include some checks in the function that calls it that would also protect against it.

@RazinShaikh
Copy link
Collaborator

Is there any way to try this if I don't have paid chatgpt?

@mjsutcliffe99
Copy link
Contributor Author

Is there any way to try this if I don't have paid chatgpt?

Unfortunately not, but as long as you put in the minimum amount ($5) your account will count as a "paid" version and your API key should work - and then every query is very very cheap (a tiny fraction of a penny - so that $5 should last for ages).

A temporary solution until the sound system for DODO-GPT is switched for the same that the SFX uses
Again, a temporary solution to catch a failed sounddevice import
@mjsutcliffe99
Copy link
Contributor Author

Jubilation! It finally passed the tests. I'm keen to see if this all works on machines (and OpenAI APIs) other than my own so would be great to hear back from anyone testing out the DODO-GPT features! I still want to polish some things in the code but that can wait for another PR.

@mjsutcliffe99 mjsutcliffe99 marked this pull request as ready for review August 3, 2024 01:36
@RazinShaikh
Copy link
Collaborator

@mjsutcliffe99 I can't get it to work. I get this error:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\Razin Shaikh\\OneDrive\\Documents\\code\\zxlive\\zxlive/temp/Dodo_Dave_latest.mp3'

@mjsutcliffe99
Copy link
Contributor Author

@mjsutcliffe99 I can't get it to work. I get this error: FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\Razin Shaikh\\OneDrive\\Documents\\code\\zxlive\\zxlive/temp/Dodo_Dave_latest.mp3'

Did you provide an API key in user/key.txt? (If you don't have one I can probably DM you mine to test with, just let me know)

(Nevertheless, I should probably add more error handling for cases like this)

@RazinShaikh
Copy link
Collaborator

I provided api key and have created a paid openai account, so I suppose it's not related to that. I also can't get the import from image to work. It can't find import_image_dialog

@mjsutcliffe99
Copy link
Contributor Author

I provided api key and have created a paid openai account, so I suppose it's not related to that. I also can't get the import from image to work. It can't find import_image_dialog

Hopefully just a trivial error - I'll investigate soon

@RazinShaikh
Copy link
Collaborator

any update on this?

@mjsutcliffe99
Copy link
Contributor Author

any update on this?

I haven't had the chance to look yet, I've been super busy! But I'll get round to it as soon as I can!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants