Feature/audio and images action #24

dockhardman · 2024-04-26T08:40:23Z

Add Audio and Image APIs support

- Add new abstract base classes ActionAudio and ActionImage in languru/action/base.py to define interfaces for audio and image related APIs
- Implement audio and image related APIs in OpenaiAction class in languru/action/openai.py, including:
  - audio_speech() for text-to-speech
  - audio_transcriptions() for speech recognition
  - audio_translations() for speech translation
  - images_generations() for image generation
  - images_edits() for image editing
  - images_variations() for creating image variations
- Add new type definition files languru/types/audio.py and languru/types/images.py to define request types for audio and image APIs
- Add remove_punctuation() utility function in languru/utils/common.py
- Add tests for new audio and image APIs in tests/action/test_openai.py
- Add session_id_fixture in tests/conftest.py to generate unique session IDs for tests
- Add test for remove_punctuation() in tests/utils/test_common.py

Make ActionText an abstract base class containing the core text-based methods. Refactor ActionBase to inherit from ActionText.

- Add `ActionAudio` abstract base class with `audio_speech` method
- Implement `audio_speech` in `OpenaiAction` to support text-to-speech
- Add `TextToSpeechRequest` model in `languru/types/audio.py`
- Update `ActionBase` to inherit from both `ActionText` and `ActionAudio`
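
The split into per-modality abstract base classes can be sketched roughly as follows. This is a minimal illustration, not the code in `languru/action/base.py`: the method signatures are simplified assumptions, and `EchoAction` is a hypothetical subclass that exists only to show that a concrete action must implement every abstract method.

```python
from abc import ABC, abstractmethod


class ActionText(ABC):
    """Text-side interface (only one method shown; signature is an assumption)."""

    @abstractmethod
    def chat(self, prompt: str) -> str:
        ...


class ActionAudio(ABC):
    """Audio-side interface (signature simplified for illustration)."""

    @abstractmethod
    def audio_speech(self, text: str) -> bytes:
        ...


class ActionBase(ActionText, ActionAudio):
    """Combines both interfaces; still abstract until all methods are implemented."""


class EchoAction(ActionBase):
    """Hypothetical concrete action used only for this sketch."""

    def chat(self, prompt: str) -> str:
        return prompt

    def audio_speech(self, text: str) -> bytes:
        # Stand-in for real synthesized audio.
        return text.encode("utf-8")
```

With this layout, instantiating `ActionBase` directly raises `TypeError`, while a subclass that implements both `chat` and `audio_speech` can be created normally.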

- Add `audio_transcriptions` method to `OpenaiAction` for creating audio transcriptions
- Introduce `TranscriptionCreateRequest` model for transcription API requests
- Update type annotations and field descriptions for consistency

- Add audio_transcriptions method to ActionAudio abstract base class
- Update type hints to include FileTypes and Transcription from openai types

- Add `audio_translations` method to `ActionAudio` and `OpenaiAction` classes
- Introduce `AudioTranslationRequest` model in `languru/types/audio.py`
- Update `AudioTranscriptionRequest` model with more detailed field descriptions
- Rename `TextToSpeechRequest` to `AudioSpeechRequest` for consistency
- Rename `TranscriptionCreateRequest` to `AudioTranscriptionRequest`

- Update `ActionAudio.audio_speech()` to return an iterator of bytes
- Implement `OpenaiAction.audio_speech()` using streaming response
- Update `AudioSpeechRequest` model to accept text for voice field
- Add test for `OpenaiAction.audio_speech()` to save streamed audio
- Add `session_id_fixture` for generating unique test session IDs
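
Returning an iterator of bytes lets a caller write audio to disk chunk by chunk instead of holding the whole response in memory. The sketch below illustrates that consumption pattern only. `fake_audio_speech` is a hypothetical stand-in for the streaming call (it makes no request to OpenAI), and `write_audio_stream` is an illustrative helper that is not part of this PR.

```python
import io
from typing import BinaryIO, Iterable, Iterator


def fake_audio_speech(payload: bytes, chunk_size: int = 4) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS call: yields payload in chunks."""
    for start in range(0, len(payload), chunk_size):
        yield payload[start : start + chunk_size]


def write_audio_stream(chunks: Iterable[bytes], fileobj: BinaryIO) -> int:
    """Write each chunk as it arrives and return the total number of bytes written."""
    written = 0
    for chunk in chunks:
        fileobj.write(chunk)
        written += len(chunk)
    return written


# An in-memory buffer stands in for a file opened with open(path, "wb").
buffer = io.BytesIO()
total = write_audio_stream(fake_audio_speech(b"0123456789"), buffer)
```

A test that saves streamed audio would do the same with a real file handle and then check that the file is non-empty.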

- Add test_asr_model_name and test_sentence variables
- Add test_openai_action_audio_transcriptions test function
- Load audio file generated in test_openai_action_audio_speech
- Call action.audio_transcriptions with test_asr_model_name
- Assert transcription result matches test_sentence

- Add remove_punctuation function in languru/utils/common.py to remove punctuation from the input string
- Add test cases for remove_punctuation in tests/utils/test_common.py
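
One plausible way to write such a helper is shown below. This is a sketch under an assumption, not the implementation in `languru/utils/common.py`: it treats any character whose Unicode general category starts with `P` as punctuation, so it also handles full-width marks that appear in Chinese text.

```python
import unicodedata


def remove_punctuation(text: str) -> str:
    """Drop every character whose Unicode category is a punctuation class (P*)."""
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )
```

Under this definition, symbols such as `$` or `+` are kept because Unicode classifies them as symbols rather than punctuation; an implementation based on `string.punctuation` would behave differently there.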

- Add test for audio transcriptions with additional parameters
- Use a test sentence in French for TTS and ASR tests
- Compare transcription result with test sentence after removing punctuation

- Change test_sentence from French to Chinese
- Update test_tts_language to "zh" for Chinese
- Remove prompt parameter in test_openai_action_audio_transcriptions
- Add new test case test_openai_action_audio_translations

- Add ActionImage abstract base class with images_generations method
- Update ActionBase to inherit from ActionImage
- Implement images_generations method in OpenaiAction
- Add ImagesGenerationsRequest model in languru/types/images.py

- Add images_edits abstract method to ActionImage class
- Implement images_edits method in OpenaiAction class
- Add ImagesEditRequest model in images.py for handling image edit requests
- Update imports in images.py to include FileTypes from openai._types

- Add images_variations abstract method to ActionImage
- Implement images_variations in OpenaiAction
- Add ImagesVariationsRequest model in languru/types/images.py
- Update size field in image request models to accept Text type

- Add test for OpenAI image generation using DALL-E 2 model
- Download generated image and save to file for verification

dockhardman

ok

- Add additional valid translation sentences for the test_sentence "你好"
- Update the assertion to check if the translated text matches any of the valid translation sentences after removing punctuation and whitespace
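
Because a translation model may return "Hello", "Hello.", or "Hi!" for the same input, comparing against a set of accepted answers after normalization makes the test less brittle. The sketch below shows one way to express that check. The helper names and the candidate sentences are hypothetical, not taken from `tests/action/test_openai.py`.

```python
import unicodedata
from typing import Iterable


def normalize(text: str) -> str:
    """Remove punctuation (Unicode category P*) and all whitespace characters."""
    return "".join(
        ch
        for ch in text
        if not unicodedata.category(ch).startswith("P") and not ch.isspace()
    )


def matches_any(translated: str, valid_sentences: Iterable[str]) -> bool:
    """Return True if the normalized translation equals any normalized candidate."""
    target = normalize(translated)
    return any(target == normalize(candidate) for candidate in valid_sentences)


# Hypothetical accepted translations for the test sentence "你好".
valid_translations = ["Hello", "Hi", "How are you"]
```

The comparison here is case-sensitive; whether the real test also folds case is not stated in the commit message.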

dockhardman added 14 commits April 24, 2024 18:16

ABC ActionText with chat/text_completion/embeddings/moderations methods

5cd34a8

Make ActionText an abstract base class containing the core text-based methods. Refactor ActionBase to inherit from ActionText.

feat(audio): add audio transcription support

feaafb7

- Add `audio_transcriptions` method to `OpenaiAction` for creating audio transcriptions
- Introduce `TranscriptionCreateRequest` model for transcription API requests
- Update type annotations and field descriptions for consistency

ENH: Add audio_transcriptions method to ActionAudio

67fda6b

- Add audio_transcriptions method to ActionAudio abstract base class
- Update type hints to include FileTypes and Transcription from openai types

Add remove_punctuation utility function

be0181d

- Add remove_punctuation function in languru/utils/common.py to remove punctuation from the input string
- Add test cases for remove_punctuation in tests/utils/test_common.py

ENH: Enhance OpenAI action tests

b757e60

- Add test for audio transcriptions with additional parameters
- Use a test sentence in French for TTS and ASR tests
- Compare transcription result with test sentence after removing punctuation

DOC: Update test cases in test_openai.py

502b247

- Change test_sentence from French to Chinese
- Update test_tts_language to "zh" for Chinese
- Remove prompt parameter in test_openai_action_audio_transcriptions
- Add new test case test_openai_action_audio_translations

ENH: Add support for OpenAI Images API

93f6b5b

- Add ActionImage abstract base class with images_generations method
- Update ActionBase to inherit from ActionImage
- Implement images_generations method in OpenaiAction
- Add ImagesGenerationsRequest model in languru/types/images.py

ENH: Add OpenAI image generation support

33a4111

- Add test for OpenAI image generation using DALL-E 2 model
- Download generated image and save to file for verification

dockhardman commented Apr 26, 2024


ENH: Improve OpenAI action audio translations test

5b695dc

- Add additional valid translation sentences for the test_sentence "你好"
- Update the assertion to check if the translated text matches any of the valid translation sentences after removing punctuation and whitespace

dockhardman merged commit 1da51fa into master Apr 26, 2024
1 check passed
