Feature/audio and images action #24
Merged
Make ActionText an abstract base class containing the core text-based methods. Refactor ActionBase to inherit from ActionText.
- Add `ActionAudio` abstract base class with `audio_speech` method
- Implement `audio_speech` in `OpenaiAction` to support text-to-speech
- Add `TextToSpeechRequest` model in `languru/types/audio.py`
- Update `ActionBase` to inherit from both `ActionText` and `ActionAudio`
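A minimal sketch of the class layout these two commits describe, using only the standard library. The class names `ActionText`, `ActionAudio`, and `ActionBase` and the method name `audio_speech` come from the commit messages; the `chat` method, every signature, the `bytes` return type, and the `EchoAction` backend are assumptions made for illustration, not the code in this pull request.

```python
from abc import ABC, abstractmethod


class ActionText(ABC):
    """Text interface. The method name `chat` is hypothetical."""

    @abstractmethod
    def chat(self, prompt: str) -> str:
        ...


class ActionAudio(ABC):
    """Audio interface. `audio_speech` is named in the commit; its signature is assumed."""

    @abstractmethod
    def audio_speech(self, text: str) -> bytes:
        ...


class ActionBase(ActionText, ActionAudio):
    """Combines both interfaces; stays abstract until every method is implemented."""


class EchoAction(ActionBase):
    """Hypothetical backend showing that a complete subclass can be instantiated."""

    def chat(self, prompt: str) -> str:
        return prompt

    def audio_speech(self, text: str) -> bytes:
        # Pretend the UTF-8 encoded text is the audio payload.
        return text.encode("utf-8")
```

With this layout, calling `ActionBase()` directly raises `TypeError` because its abstract methods are unimplemented, while a backend that implements every method, such as the hypothetical `EchoAction`, can be instantiated.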
- Add `audio_transcriptions` method to `OpenaiAction` for creating audio transcriptions
- Introduce `TranscriptionCreateRequest` model for transcription API requests
- Update type annotations and field descriptions for consistency

- Add audio_transcriptions method to ActionAudio abstract base class
- Update type hints to include FileTypes and Transcription from openai types

- Add `audio_translations` method to `ActionAudio` and `OpenaiAction` classes
- Introduce `AudioTranslationRequest` model in `languru/types/audio.py`
- Update `AudioTranscriptionRequest` model with more detailed field descriptions
- Rename `TextToSpeechRequest` to `AudioSpeechRequest` for consistency
- Rename `TranscriptionCreateRequest` to `AudioTranscriptionRequest`

- Update `ActionAudio.audio_speech()` to return an iterator of bytes
- Implement `OpenaiAction.audio_speech()` using streaming response
- Update `AudioSpeechRequest` model to accept text for voice field
- Add test for `OpenaiAction.audio_speech()` to save streamed audio
- Add `session_id_fixture` for generating unique test session IDs
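To illustrate why an iterator of bytes is convenient for the "save streamed audio" test, here is a rough sketch that writes chunks to disk as they arrive. Both `fake_audio_speech` and `save_stream` are hypothetical names; the generator stands in for a real streaming text-to-speech call and yields plain encoded text rather than audio.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Iterator


def fake_audio_speech(text: str, chunk_size: int = 4) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS call: yields the payload in chunks."""
    payload = text.encode("utf-8")
    for start in range(0, len(payload), chunk_size):
        yield payload[start:start + chunk_size]


def save_stream(chunks: Iterable[bytes], path: Path) -> int:
    """Write each chunk as it arrives and return the total number of bytes written."""
    written = 0
    with path.open("wb") as f:
        for chunk in chunks:
            written += f.write(chunk)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "speech.bin"
        total = save_stream(fake_audio_speech("hello streaming world"), target)
        print(total, target.stat().st_size)
```

Because the consumer only ever holds one chunk, the whole audio payload never has to sit in memory at once.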
- Add test_asr_model_name and test_sentence variables
- Add test_openai_action_audio_transcriptions test function
- Load audio file generated in test_openai_action_audio_speech
- Call action.audio_transcriptions with test_asr_model_name
- Assert transcription result matches test_sentence

- Add remove_punctuation function in languru/utils/common.py to remove punctuation from the input string
- Add test cases for remove_punctuation in tests/utils/test_common.py
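One possible shape for such a helper, sketched here under the assumption that "punctuation" means any character in a Unicode punctuation category. This is not the implementation in `languru/utils/common.py`, which is not shown in this conversation; it only reuses the function name from the commit message.

```python
import unicodedata


def remove_punctuation(text: str) -> str:
    """Drop every character whose Unicode general category starts with 'P'."""
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )


if __name__ == "__main__":
    print(remove_punctuation("Hello, world!"))
    print(remove_punctuation("你好，世界！"))
```

Checking the Unicode category rather than `string.punctuation` also covers full-width marks such as `，` and `！`, which matters for non-English test sentences. Symbols such as `$` or `+` fall under symbol categories and are left in place by this sketch.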
- Add test for audio transcriptions with additional parameters
- Use a test sentence in French for TTS and ASR tests
- Compare transcription result with test sentence after removing punctuation

- Change test_sentence from French to Chinese
- Update test_tts_language to "zh" for Chinese
- Remove prompt parameter in test_openai_action_audio_transcriptions
- Add new test case test_openai_action_audio_translations

- Add ActionImage abstract base class with images_generations method
- Update ActionBase to inherit from ActionImage
- Implement images_generations method in OpenaiAction
- Add ImagesGenerationsRequest model in languru/types/images.py

- Add images_edits abstract method to ActionImage class
- Implement images_edits method in OpenaiAction class
- Add ImagesEditRequest model in images.py for handling image edit requests
- Update imports in images.py to include FileTypes from openai._types

- Add images_variations abstract method to ActionImage
- Implement images_variations in OpenaiAction
- Add ImagesVariationsRequest model in languru/types/images.py
- Update size field in image request models to accept Text type

- Add test for OpenAI image generation using DALL-E 2 model
- Download generated image and save to file for verification
dockhardman
commented
Apr 26, 2024
ok
- Add additional valid translation sentences for the test_sentence "你好"
- Update the assertion to check if the translated text matches any of the valid translation sentences after removing punctuation and whitespace
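The relaxed assertion described in this commit could look roughly like the sketch below. The helper names `normalize` and `matches_any` and the candidate sentences are hypothetical; the actual list of valid translations used in the test is not shown here.

```python
import unicodedata
from typing import Iterable


def normalize(text: str) -> str:
    """Drop punctuation and all whitespace so formatting differences do not matter."""
    return "".join(
        ch
        for ch in text
        if not ch.isspace() and not unicodedata.category(ch).startswith("P")
    )


def matches_any(result: str, candidates: Iterable[str]) -> bool:
    """Return True if the normalized result equals any normalized candidate."""
    wanted = {normalize(candidate) for candidate in candidates}
    return normalize(result) in wanted


if __name__ == "__main__":
    valid = ["Hello", "Hi"]  # hypothetical candidates for "你好"
    assert matches_any(" Hello! ", valid)
    assert not matches_any("Good morning", valid)
```

Accepting a set of candidates keeps the test stable when the model returns different but equally valid translations. This sketch is case-sensitive; whether the real test also ignores case is not stated in the commit message.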
Add Audio and Image APIs support
- `ActionAudio` and `ActionImage` in `languru/action/base.py` to define interfaces for audio and image related APIs
- `OpenaiAction` class in `languru/action/openai.py`, including:
  - `audio_speech()` for text-to-speech
  - `audio_transcriptions()` for speech recognition
  - `audio_translations()` for speech translation
  - `images_generations()` for image generation
  - `images_edits()` for image editing
  - `images_variations()` for creating image variations
- `languru/types/audio.py` and `languru/types/images.py` to define request types for audio and image APIs
- `remove_punctuation()` utility function in `languru/utils/common.py`
- `tests/action/test_openai.py`
- `session_id_fixture` in `tests/conftest.py` to generate unique session IDs for tests
- `remove_punctuation()` in `tests/utils/test_common.py`