# **Text-to-Speech Conversion using Sarvam AI API**

This notebook demonstrates how to convert text into speech using the Sarvam AI Text-to-Speech API. The resulting audio is saved as `.wav` files.

## **Prerequisites**

Before running this notebook, ensure you have the following installed:

- Python 3.7 or higher
- Required Python package: `requests` (`base64` and `wave` are part of the Python standard library and need no installation)

You can install the required package using pip:

In [None]:
!pip install requests




## **Import Required Libraries**

First, let's import all the necessary libraries.

In [None]:
import requests
import base64
import wave


### **Set Up the API Key**

To use the Sarvam AI Text-to-Speech API, you need an API subscription key. Follow these steps to set up your API key:

1. **Obtain your API key**: If you don’t have an API key, sign up on the [Sarvam AI Dashboard](https://dashboard.sarvam.ai/) to get one.
2. **Replace the placeholder key**: In the code below, replace `"YOUR_SARVAM_AI_API_KEY"` with your actual API key.

In [None]:
SARVAM_AI_API="YOUR_SARVAM_AI_API_KEY"
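A key pasted directly into a notebook is easy to leak when the notebook is shared. As an alternative, here is a minimal sketch that reads the key from an environment variable. The variable name `SARVAM_API_KEY` and the helper `load_api_key` are assumptions made for this example, not names defined by Sarvam AI.

```python
import os


def load_api_key(env_var: str = "SARVAM_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The default variable name is a hypothetical choice for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Sarvam AI API key."
        )
    return key
```

With the variable set in your shell, `SARVAM_AI_API = load_api_key()` can replace the hard-coded assignment above.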


### **Set Up the API Endpoint and Headers**

This section defines the API endpoint and the headers for the text-to-speech request. The headers carry the API key you set above. The payload itself is built later, once per chunk of text.

In [None]:
# API endpoint and headers
url = "https://api.sarvam.ai/text-to-speech"
headers = {
    "Content-Type": "application/json",
    "api-subscription-key": SARVAM_AI_API  # API key set in the previous cell
}





### **Text to be converted into speech**

In [None]:
text = """
Netaji Subhash Marg से Dayanand Road की तरफ, south की तरफ़ जाने से शुरू करें। Dayanand Road पर पहुँचने के बाद, बाएँ मुड़ जाएँ। 350 meters तक सीधा चलते रहें।आपको बायें तरफ़, United Bank of India ATM दिखेगा। Dayanand School के दाएँ तरफ़ से गुजरने के बाद, बाएँ मुड़ें।
120 meters के बाद, Ghata Masjid Road पर, right turn करें।
280 meters तक चलते रहें।
Mahatma Gandhi Marg पे रहें और, 2.9 kilometers तक Old Delhi की तरफ जाएँ।
फिर, HC Sen Marg पर continue करें, और Paranthe Wali Gali तक drive करें।
"""

## **Split Text into Chunks**

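The Sarvam AI API may limit the number of characters per request. To stay within such a limit, we split the text into chunks of at most 500 characters. Note that this fixed-width slicing can cut a word or sentence in half at a chunk boundary, which may be audible in the generated speech.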

In [None]:
# Split the text into chunks of 500 characters or less
chunk_size = 500
chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

# Print the number of chunks
print(f"Total chunks: {len(chunks)}")


Total chunks: 1
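Fixed-width slicing can break a word in the middle when the text is longer than one chunk. The following is a minimal sketch of a sentence-aware alternative; the helper name `split_text` is hypothetical, and the choice of sentence delimiters (the Devanagari danda `।`, `.`, `!`, `?`, and newlines) is an assumption suited to the sample text rather than anything required by the API.

```python
import re
from typing import List


def split_text(text: str, max_chars: int = 500) -> List[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    # Split on whitespace that follows sentence-ending punctuation, or on newlines.
    pieces = re.split(r"(?<=[।.!?])\s+|\n+", text)
    sentences = [p.strip() for p in pieces if p and p.strip()]

    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-split as a last resort.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]

        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate  # The sentence still fits in the current chunk
        else:
            chunks.append(current)  # Close the current chunk and start a new one
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Calling `chunks = split_text(text)` would be a drop-in replacement for the list comprehension above. A decimal such as `2.9` is not split, because the period is not followed by whitespace.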


## **Process Each Chunk**

Iterate over each chunk, send it to the Sarvam AI API, and save the resulting audio as a `.wav` file.

In [None]:
# Iterate over each chunk and make the API call
for i, chunk in enumerate(chunks):
    # Prepare the payload for the API request
    payload = {
        "inputs": [chunk],
        "target_language_code": "hi-IN",  # Target language code (Hindi, matching the input text)
        "speaker": "neel",  # Speaker voice
        "model": "bulbul:v1",  # Model to use
        "pitch": 0,  # Pitch adjustment
        "pace": 1.0,  # Speed of speech
        "loudness": 1.0,  # Volume adjustment
        "enable_preprocessing": True,  # Enable text preprocessing
    }

    # Make the API request
    response = requests.post(url, json=payload, headers=headers, timeout=60)  # Fail instead of hanging indefinitely

    # Check if the request was successful
    if response.status_code == 200:
        # Decode the base64-encoded audio data
        audio = response.json()["audios"][0]
        audio = base64.b64decode(audio)

        # Save the audio as a .wav file
        with wave.open(f"output{i}.wav", "wb") as wav_file:
            # Set the parameters for the .wav file
            wav_file.setnchannels(1)  # Mono audio
            wav_file.setsampwidth(2)  # 2 bytes per sample
            wav_file.setframerate(22050)  # Sample rate of 22050 Hz

            # Write the audio data to the file
            wav_file.writeframes(audio)

        print(f"Audio file {i} saved successfully as 'output{i}.wav'!")
    else:
        # Handle errors
        print(f"Error for chunk {i}: {response.status_code}")
        print(response.text)  # Raw body; an error response may not be valid JSON


Audio file 0 saved successfully as 'output0.wav'!
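To sanity-check a saved file without playing it, you can read its header back with the standard `wave` module. This is a minimal sketch; the helper name `wav_duration_seconds` is hypothetical.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()
```

For example, `wav_duration_seconds("output0.wav")` should return a value that is plausible for the length of the text. A duration near zero suggests that the request failed or returned no audio.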


## **Output**

After running the notebook, you will have one `.wav` file per chunk of text, numbered from zero (`output0.wav`, `output1.wav`, and so on). The sample text above fits in a single chunk, so only `output0.wav` is produced.
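When a longer text yields several files, you may want a single audio file instead. The sketch below concatenates WAV files with the standard `wave` module, assuming all inputs share the same channel count, sample width, and sample rate (as they do when written by the loop above). The helper name `merge_wav_files` and the output file name are hypothetical.

```python
import wave
from typing import List


def merge_wav_files(input_paths: List[str], output_path: str) -> None:
    """Concatenate WAV files with identical audio parameters into one file."""
    if not input_paths:
        raise ValueError("input_paths must contain at least one file")

    # Use the first file's parameters as the reference format.
    with wave.open(input_paths[0], "rb") as first:
        ref = first.getparams()
    expected = (ref.nchannels, ref.sampwidth, ref.framerate)

    with wave.open(output_path, "wb") as out:
        out.setnchannels(ref.nchannels)
        out.setsampwidth(ref.sampwidth)
        out.setframerate(ref.framerate)
        for path in input_paths:
            with wave.open(path, "rb") as part:
                actual = (part.getnchannels(), part.getsampwidth(), part.getframerate())
                if actual != expected:
                    raise ValueError(f"{path} has different audio parameters: {actual}")
                # Append this file's audio frames to the combined output.
                out.writeframes(part.readframes(part.getnframes()))
```

For example, `merge_wav_files([f"output{i}.wav" for i in range(len(chunks))], "combined.wav")` would join the files in chunk order.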

## **Conclusion**
This notebook provides a step-by-step guide to converting text into speech using the Sarvam AI API. You can modify the text, language, and other parameters to suit your specific needs.


### **Additional Resources**

For more details, refer to our official documentation. We are also available to help on our Discord server:

- **Documentation**: [docs.sarvam.ai](https://docs.sarvam.ai)  
- **Community**: [Join the Discord Community](https://discord.gg/hTuVuPNF)

---

### **Final Notes**

- Keep your API key secure.
- Use clean, well-punctuated input text for best results.

**Keep Building!** 🚀