Commit 9b81f05 (parent 1436bdc): Add readme for audio and cache plugins (#247)

Squashed commits:

* Add audio and cache plugins readme
* Refine audio plugin code
* Fix typo
* Fix pylint issues
* Fix neuralchat requirements name
* Avoid build from source when pylint

Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Wenxin Zhang <wenxin.zhang@intel.com>
Co-authored-by: Wenxin Zhang <wenxin.zhang@intel.com>

Showing 12 changed files with 404 additions and 297 deletions.
121 changes: 121 additions & 0 deletions in intel_extension_for_transformers/neural_chat/pipeline/plugins/audio/README.md
The Audio Processing and Text-to-Speech (TTS) Plugin is a software component designed to enhance audio-related functionality in Neural Chat, especially for TalkingBot. This plugin offers a range of capabilities, primarily focused on processing audio data and converting text into spoken language. Here is a general overview of its key features:

- **Audio Processing**: This component includes a suite of tools and algorithms for manipulating audio data. It can perform tasks such as video cutting, audio splitting, video-to-audio conversion, noise reduction, equalization, pitch shifting, and audio synthesis, enabling developers to improve audio quality and add various audio effects to their applications.

- **Text-to-Speech (TTS) Conversion**: The TTS plugin can convert written text into natural-sounding speech by synthesizing human-like voices. Users can customize the voice, tone, and speed of the generated speech to suit their specific requirements.

- **Speech Recognition**: The ASR plugin supports speech recognition, transcribing spoken words into text. This can be used for applications like voice commands, transcription services, and voice-controlled interfaces. It supports both English and Chinese.

- **Multi-Language Support**: The plugin supports multiple languages and accents, making it versatile for global applications and catering to diverse user bases. English and Chinese are currently supported.

- **Integration**: Developers can easily integrate this plugin into their applications or systems using APIs.

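The audio splitting mentioned above can be illustrated with a minimal, library-agnostic sketch. The snippet below uses only Python's standard `wave` module (not this plugin's API) to split a WAV file into fixed-length chunks; the generated test tone is just a stand-in for real input:

```python
import math
import struct
import wave


def split_wav(path: str, chunk_seconds: float) -> list:
    """Split a WAV file into raw-frame chunks of roughly chunk_seconds each."""
    with wave.open(path, "rb") as wav:
        frames_per_chunk = int(wav.getframerate() * chunk_seconds)
        chunks = []
        while True:
            frames = wav.readframes(frames_per_chunk)
            if not frames:
                break
            chunks.append(frames)
    return chunks


# Write a 2-second 440 Hz mono test tone at 16 kHz, then split it into 0.5 s chunks.
with wave.open("tone.wav", "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(16000)
    samples = [int(20000 * math.sin(2 * math.pi * 440 * i / 16000)) for i in range(2 * 16000)]
    wav.writeframes(struct.pack("<%dh" % len(samples), *samples))

chunks = split_wav("tone.wav", 0.5)
```

Real pipelines would typically delegate this to ffmpeg or pydub, but the chunking logic is the same.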
# Install System Dependency

Ubuntu command:

```bash
sudo apt-get install ffmpeg
wget http://nz2.archive.ubuntu.com/ubuntu/pool/main/o/openssl/libssl1.1_1.1.1f-1ubuntu2.19_amd64.deb
sudo dpkg -i libssl1.1_1.1.1f-1ubuntu2.19_amd64.deb
```

For other operating systems such as CentOS, you will need to make slight adjustments.

# English Automatic Speech Recognition (ASR)

## Dependencies Installation

To use the English ASR module, you need to install the necessary dependencies. You can do this by running the following command:

```bash
pip install transformers datasets pydub
```

## Usage

The AudioSpeechRecognition class provides functionality for converting English audio to text. Here's how to use it:

```python
import os

from intel_extension_for_transformers.neural_chat.pipeline.plugins.audio import AudioSpeechRecognition

asr = AudioSpeechRecognition()
# Replace with the path to your English audio file (MP3 and WAV are supported);
# expanduser resolves "~", which most audio libraries do not expand on their own.
audio_path = os.path.expanduser("~/audio.wav")
result = asr.audio2text(audio_path)
print("ASR Result:", result)
```

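ASR models commonly expect 16 kHz mono input, so it can be worth inspecting a WAV file's format before transcription. The helper below is generic standard-library Python, not part of this plugin, and the 16 kHz expectation is a common convention rather than a documented requirement of AudioSpeechRecognition:

```python
import wave


def check_wav(path: str, expected_rate: int = 16000) -> dict:
    # Report channel count, sample rate, and duration, and flag files
    # that likely need resampling before being fed to an ASR model.
    with wave.open(path, "rb") as wav:
        info = {
            "channels": wav.getnchannels(),
            "sample_rate": wav.getframerate(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }
    info["needs_resample"] = info["sample_rate"] != expected_rate or info["channels"] != 1
    return info


# Create a 1-second 8 kHz stereo placeholder file to inspect.
with wave.open("sample.wav", "wb") as wav:
    wav.setnchannels(2)
    wav.setsampwidth(2)
    wav.setframerate(8000)
    wav.writeframes(b"\x00\x00" * 2 * 8000)

info = check_wav("sample.wav")
```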
# Chinese Automatic Speech Recognition (ASR)

## Dependencies Installation

To use the Chinese ASR module, you need to install the necessary dependencies. You can do this by running the following command:

```bash
pip install paddlespeech paddlepaddle
```

## Usage

The ChineseAudioSpeechRecognition class provides functionality for converting Chinese audio to text. Here's how to use it:

```python
import os

from intel_extension_for_transformers.neural_chat.pipeline.plugins.audio import ChineseAudioSpeechRecognition

asr = ChineseAudioSpeechRecognition()
# Replace with the path to your audio file; expanduser resolves "~".
audio_path = os.path.expanduser("~/audio.wav")
result = asr.audio2text(audio_path)
print("ASR Result:", result)
```

# English Text-to-Speech (TTS)

## Dependencies Installation

To use the English TTS module, you need to install the required dependencies. Run the following command:

```bash
pip install transformers soundfile speechbrain
```

## Usage

The TextToSpeech class provides the capability to convert English text to speech. Here's how to use it:

```python
from intel_extension_for_transformers.neural_chat.pipeline.plugins.audio import TextToSpeech

tts = TextToSpeech()
text_to_speak = "Hello, this is a sample text."  # Replace with your text
output_audio_path = "./output.wav"  # Replace with the desired output audio path
voice = "default"  # You can choose between "default", "pat", or a custom voice
tts.text2speech(text_to_speak, output_audio_path, voice)
```

# Chinese Text-to-Speech (TTS)

## Dependencies Installation

To use the Chinese TTS module, you need to install the required dependencies. Run the following command:

```bash
pip install paddlespeech paddlepaddle
```

## Usage

The ChineseTextToSpeech class provides functionality for Chinese TTS. Here's how to use it:

```python
from intel_extension_for_transformers.neural_chat.pipeline.plugins.audio import ChineseTextToSpeech

# Initialize the TTS module
tts = ChineseTextToSpeech()
# Define the text you want to convert to speech
text_to_speak = "你好，这是一个示例文本。"  # Replace with your Chinese text
# Perform text-to-speech conversion; the audio is written to the module's output path
tts.text2speech(text_to_speak)

# If you want to stream the generation of audio from a text generator (e.g., a language model),
# you can use the following method:
# audio_generator = your_text_generator_function()  # Replace with your text generator
# tts.stream_text2speech(audio_generator)
```
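The streaming path hinted at in the comments above consumes text incrementally from a generator so synthesis can begin before the full response is ready. Its shape can be sketched in plain Python; the generator and the `synthesize` stand-in below are illustrative, not the plugin's API:

```python
def token_stream():
    # Stand-in for a language model that yields text incrementally.
    for token in ["你好。", "这是", "流式", "合成", "示例。"]:
        yield token


def stream_synthesize(text_gen, synthesize):
    # Buffer tokens until a sentence boundary, then synthesize sentence by
    # sentence, so playback can start before the full text is generated.
    buffer = ""
    for token in text_gen:
        buffer += token
        if buffer.endswith(("。", "！", "？", ".", "!", "?")):
            yield synthesize(buffer)
            buffer = ""
    if buffer:
        yield synthesize(buffer)


# A fake synthesizer that just tags the text it would speak.
audio_chunks = list(stream_synthesize(token_stream(), lambda s: "<audio:%s>" % s))
```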
9 changes: 9 additions & 0 deletions in intel_extension_for_transformers/neural_chat/pipeline/plugins/audio/requirements.txt
paddlepaddle
paddlespeech
transformers
soundfile
datasets
pydub
python-multipart
speechbrain
librosa
File renamed without changes.
59 changes: 59 additions & 0 deletions in intel_extension_for_transformers/neural_chat/pipeline/plugins/caching/README.md
# 🚀 What is the caching plugin?

When an LLM service encounters high traffic, the expenses related to LLM API calls can become substantial, and the service may exhibit slow response times. Hence, we leverage GPTCache to build a semantic caching plugin for storing LLM responses. This README provides an overview of the caching plugin's functionality, how to use it, and some example code snippets.

# 😎 What can this help with?

The caching plugin offers the following primary benefits:

- **Decreased expenses**: The caching plugin effectively minimizes expenses by caching query results, which in turn reduces the number of requests and tokens sent to the LLM service.
- **Enhanced performance**: The caching plugin can also provide superior query throughput compared to standard LLM services.
- **Improved scalability and availability**: The caching plugin can easily scale to accommodate an increasing volume of queries, ensuring consistent performance as your application's user base expands.

# 🤔 How does it work?

Online services often exhibit data locality, with users frequently accessing popular or trending content. Cache systems take advantage of this behavior by storing commonly accessed data, which in turn reduces data retrieval time, improves response times, and eases the burden on backend servers. Traditional cache systems typically rely on an exact match between a new query and a cached query to determine whether the requested content is available in the cache before fetching the data.

However, an exact-match approach is less effective for LLM caches due to the complexity and variability of LLM queries, resulting in a low cache hit rate. To address this issue, GPTCache adopts an alternative strategy: semantic caching. Semantic caching identifies and stores similar or related queries, thereby increasing the cache hit probability and enhancing overall caching efficiency. GPTCache employs embedding algorithms to convert queries into embeddings and uses a vector store for similarity search on these embeddings, allowing it to identify and retrieve similar or related queries from the cache storage.

<a target="_blank" href="https://github.com/zilliztech/GPTCache/blob/main/docs/GPTCacheStructure.png">
<p align="center">
  <img src="https://github.com/zilliztech/GPTCache/blob/main/docs/GPTCacheStructure.png" alt="Cache Structure" width=600 height=200>
</p>
</a>

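The mechanism can be sketched in plain Python. In the toy version below, bag-of-words counts stand in for GPTCache's real embedding models and a linear scan stands in for its vector store; none of this is GPTCache's actual API:

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy embedding: bag-of-words counts (real systems use neural embeddings).
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class SemanticCache:
    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def put(self, query: str, response: str) -> None:
        self.entries.append((embed(query), response))

    def get(self, query: str):
        # Return the cached response whose query embedding is most similar,
        # but only if the similarity clears the threshold.
        q = embed(query)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best and cosine(q, best[0]) >= self.threshold:
            return best[1]
        return None  # cache miss: fall through to the LLM


cache = SemanticCache()
cache.put("tell me about intel xeon processors", "Xeon is Intel's server CPU line.")
hit = cache.get("tell me about intel xeon processors please")  # similar, not identical
miss = cache.get("how do I bake bread")
```

An exact-match cache would miss the reworded query; the similarity threshold is what trades precision against hit rate.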
# Installation

To use the caching plugin, first install its dependencies (including the `gptcache` library) using pip:

```bash
pip install -r requirements.txt
```

# Usage

## Initializing

Before using the caching plugin, you need to initialize it with the desired configuration. The following code demonstrates how:

```python
from intel_extension_for_transformers.neural_chat.pipeline.plugins.cache import CachePlugin

cache_plugin = CachePlugin()
cache_plugin.init_similar_cache_from_config()
```

## Caching Data

Once the cache plugin is initialized, you can start caching data using the `put` function. Here's an example of how to cache data:

```python
prompt = "Tell me about Intel Xeon Scalable Processors."
response = chatbot.predict(prompt)
cache_plugin.put(prompt, response)
```

## Retrieving Cached Data

To retrieve cached data, use the `get` function. Provide the same prompt/question text used for caching, and it will return the cached answer. Here's an example:

```python
answer = cache_plugin.get("Tell me about Intel Xeon Scalable Processors.")
```
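In practice, `get` and `put` are usually combined in a cache-aside pattern: check the cache first and only call the model on a miss. Here is a sketch with stand-ins (a plain dict and a stub chatbot rather than the plugin's real classes):

```python
def cached_predict(cache: dict, chatbot, prompt: str) -> str:
    # Cache-aside: return the cached answer on a hit; otherwise call the
    # model and store its response for next time.
    if prompt in cache:
        return cache[prompt]
    response = chatbot.predict(prompt)
    cache[prompt] = response
    return response


class StubChatbot:
    """Stand-in for a real chatbot; counts how often the model is invoked."""

    def __init__(self):
        self.calls = 0

    def predict(self, prompt: str) -> str:
        self.calls += 1
        return "answer to: " + prompt


bot = StubChatbot()
cache = {}
first = cached_predict(cache, bot, "Tell me about Intel Xeon Scalable Processors.")
second = cached_predict(cache, bot, "Tell me about Intel Xeon Scalable Processors.")
# The repeated prompt is served from the cache; the model ran only once.
```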