# **📖 Transliteration API: A Hands-on Guide**  

### **🔗 Overview**  
This notebook demonstrates how to use the **Transliteration API** to convert text from one script to another while preserving pronunciation. It supports multiple Indic languages and offers customizable numeral formatting.  


## **Table of Contents**  

<p>1. <a href="#installation">Setup &amp; Installation</a></p>  
<p>2. <a href="#authentication">Authentication</a></p>  
<p>3. <a href="#parameters">Understanding the Parameters</a></p>  
<p>4. <a href="#basic-usage">Basic Usage</a></p>  
<p>5. <a href="#modes">Experimenting with Different Options</a></p>  
<p>6. <a href="#advanced-features">Advanced Features</a></p>  
   <p>&nbsp;&nbsp;&nbsp;&nbsp;6.1 <a href="#numeral-format">Numerals Format</a></p>  
   <p>&nbsp;&nbsp;&nbsp;&nbsp;6.2 <a href="#spoken-form-numerals-language">Spoken Form Numerals Language</a></p>  
<p>7. <a href="#additional-resources">Additional Resources</a></p>  
<p>8. <a href="#final-notes">Final Notes</a></p>  

## **1️⃣ Setup & Installation**  

Before you begin, make sure the required Python package is installed. Run the following command to install it:

In [1]:
!pip install sarvamai
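To confirm that the install succeeded before moving on, a quick check along these lines can help. It is a minimal sketch that uses only the standard library; `package_version` is a hypothetical helper name, not part of the SDK.

```python
from importlib import metadata


def package_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


version = package_version("sarvamai")
if version:
    print(f"sarvamai version: {version}")
else:
    print("sarvamai is not installed; re-run the pip command above.")
```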



## **Import Required Libraries**

First, let's import all the necessary libraries.

In [2]:
from sarvamai import SarvamAI

## **2️⃣ Authentication**


To use the API, you need an API subscription key. Follow these steps to set up your API key:

1. **Obtain your API key**: If you don’t have an API key, sign up on the [Sarvam AI Dashboard](https://dashboard.sarvam.ai/) to get one.
2. **Replace the placeholder key**: In the code below, replace "YOUR_SARVAM_AI_API_KEY" with your actual API key.

In [3]:
# Replace the placeholder with your own key; never commit a real key to a shared notebook
SARVAM_AI_API_KEY = "YOUR_SARVAM_AI_API_KEY"
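An alternative to pasting the key into the notebook is to read it from an environment variable. The sketch below assumes a variable named `SARVAM_API_KEY`; that name and the `load_api_key` helper are illustrative choices, not something the SDK requires.

```python
import os

PLACEHOLDER = "YOUR_SARVAM_AI_API_KEY"


def load_api_key(env_var="SARVAM_API_KEY", fallback=PLACEHOLDER):
    """Read the key from the environment, falling back to the placeholder."""
    return os.environ.get(env_var, fallback)


SARVAM_AI_API_KEY = load_api_key()
if SARVAM_AI_API_KEY == PLACEHOLDER:
    print("No key found in the environment; set SARVAM_API_KEY or edit the placeholder.")
```

Keeping the key out of the notebook means it does not end up in saved outputs or version control.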


## **3️⃣ Understanding the Parameters**  
🔹 The API takes several key parameters:  
✔ **`input`** – The text to be transliterated.  
✔ **`source_language_code`** – Language of the input text.  
✔ **`target_language_code`** – Desired transliteration output language.  
✔ **`numerals_format`** – Choose between **international (0-9)** or **native (०-९)** numbers.  
✔ **`spoken_form`** – Whether to convert text into a natural spoken format.  
✔ **`spoken_form_numerals_language`** – Choose whether numbers should be spoken in **English** or **native** language.  


🚫 Note: Transliteration between Indic languages (e.g., Hindi → Bengali) is not supported. 
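The rules above can be checked locally before a request is sent. The following is a minimal sketch, not part of the SDK: `build_request` is a hypothetical helper, and the `bn-IN` code for Bengali used in the usage note is an assumption.

```python
ALLOWED_NUMERALS_FORMATS = {"international", "native"}
ALLOWED_SPOKEN_NUMERALS = {"english", "native"}


def build_request(text, source, target, spoken_form=False,
                  numerals_format="international",
                  spoken_form_numerals_language=None):
    """Validate the arguments locally and return them as keyword arguments."""
    if numerals_format not in ALLOWED_NUMERALS_FORMATS:
        raise ValueError(f"Unknown numerals_format: {numerals_format!r}")
    # Transliteration between two different Indic languages is not supported
    if source != target and "en-IN" not in (source, target):
        raise ValueError(f"Unsupported language pair: {source} -> {target}")
    params = {
        "input": text,
        "source_language_code": source,
        "target_language_code": target,
        "spoken_form": spoken_form,
        "numerals_format": numerals_format,
    }
    if spoken_form_numerals_language is not None:
        if spoken_form_numerals_language not in ALLOWED_SPOKEN_NUMERALS:
            raise ValueError(
                f"Unknown spoken_form_numerals_language: {spoken_form_numerals_language!r}")
        # This option only has an effect when spoken_form is enabled
        if not spoken_form:
            raise ValueError("spoken_form_numerals_language requires spoken_form=True")
        params["spoken_form_numerals_language"] = spoken_form_numerals_language
    return params


print(build_request("मैं ऑफिस जा रहा हूँ", "hi-IN", "en-IN"))
```

The returned dictionary can be unpacked into the call, for example `client.text.transliterate(**build_request(...))`, while a pair such as `hi-IN` to `bn-IN` raises a `ValueError` before any network traffic happens.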


## **4️⃣ Basic Usage**  



### **Step 4.1: Read the Document**

We have two sample documents under the `data` folder:  
- `sample1.txt` contains an essay on *The Impact of Artificial Intelligence on Society* in English.  
- `sample2.txt` contains an essay on *The Impact of Artificial Intelligence on Society* in Hindi.  

In [4]:
def read_file(file_path, lang_name):
    try:
        with open(file_path, "r", encoding="utf-8") as file:
            # Read up to the first 5 lines (files shorter than 5 lines are fine)
            lines = [line for _, line in zip(range(5), file)]
            print(f"=== {lang_name} Text (First Few Lines) ===")
            print("".join(lines))  # Print first few lines

            # Read the remaining content
            remaining_text = file.read()

            # Combine all text
            full_doc = "".join(lines) + remaining_text

            # Count total characters
            total_chars = len(full_doc)
            print(f"\nTotal number of characters in {lang_name} file:", total_chars)

            return full_doc
    except FileNotFoundError:
        print(f"Error: {file_path} not found.")
        return None
    except Exception as e:
        print(f"An error occurred while reading {file_path}: {e}")
        return None

In [5]:
# Read English and Hindi documents
english_doc = read_file("data/sample1.txt", "English")
hindi_doc = read_file("data/sample2.txt", "Hindi")

=== English Text (First Few Lines) ===
The Impact of Artificial Intelligence on Society

Artificial Intelligence (AI) has emerged as one of the most transformative technologies of the 21st century, revolutionizing various aspects of human life. From healthcare to finance, transportation to education, AI is reshaping industries, automating processes, and augmenting human capabilities. While AI presents numerous benefits, it also raises ethical, economic, and societal concerns that must be carefully navigated to ensure its responsible integration into society.

One of the most profound impacts of AI is in the field of healthcare. AI-powered diagnostic tools enable early disease detection, improving patient outcomes and reducing healthcare costs. Machine learning algorithms analyze medical data to identify patterns that humans may overlook, aiding in personalized treatment plans. Robotic surgeries and AI-driven drug discoveries have further enhanced the efficiency of medical procedures. H

### **Step 4.2: Split the text into chunks of at most 1000 characters** 

Because the API accepts at most 1000 characters per request, we split the text into chunks before sending it.

In [6]:
def chunk_text(text, max_length=1000):
    """Splits text into chunks of at most max_length characters while preserving word boundaries."""
    chunks = []
    
    while len(text) > max_length:
        split_index = text.rfind(" ", 0, max_length)  # Find the last space within limit
        if split_index <= 0:
            split_index = max_length  # No usable space found, force split at max_length
        
        chunks.append(text[:split_index].strip())  # Trim spaces before adding
        text = text[split_index:].lstrip()  # Remove leading spaces for the next chunk
    
    if text:
        chunks.append(text.strip())  # Add the last chunk
    
    return chunks
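Splitting at the last space keeps words intact but can still end a chunk in the middle of a sentence. A variant that packs whole sentences into each chunk is sketched below. It is an alternative technique, not the notebook's method; `chunk_by_sentence` is a hypothetical name, and the sentence pattern (`.`, `!`, `?` or the Devanagari danda `।` followed by whitespace) is a simplifying assumption.

```python
import re


def chunk_by_sentence(text, max_length=1000):
    """Greedily pack whole sentences into chunks of at most max_length characters."""
    # Split after a sentence terminator that is followed by whitespace
    sentences = re.split(r"(?<=[.!?।])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut into hard slices
        while len(sentence) > max_length:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_length])
            sentence = sentence[max_length:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_length:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


for piece in chunk_by_sentence("One. Two. Three.", max_length=10):
    print(repr(piece))
```

Because sentences are re-joined with a single space, line breaks in the original text are not preserved; that is usually acceptable for transliteration but worth keeping in mind.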

In [7]:
# Split the text
english_text_chunks = chunk_text(english_doc)

# Display chunk info
print(f"Total Chunks: {len(english_text_chunks)}")
for i, chunk in enumerate(english_text_chunks[:3], 1):  # Show only first 3 chunks for preview
    print(f"\n=== Chunk {i} (Length: {len(chunk)}) ===\n{chunk}")

Total Chunks: 2

=== Chunk 1 (Length: 993) ===
The Impact of Artificial Intelligence on Society

Artificial Intelligence (AI) has emerged as one of the most transformative technologies of the 21st century, revolutionizing various aspects of human life. From healthcare to finance, transportation to education, AI is reshaping industries, automating processes, and augmenting human capabilities. While AI presents numerous benefits, it also raises ethical, economic, and societal concerns that must be carefully navigated to ensure its responsible integration into society.

One of the most profound impacts of AI is in the field of healthcare. AI-powered diagnostic tools enable early disease detection, improving patient outcomes and reducing healthcare costs. Machine learning algorithms analyze medical data to identify patterns that humans may overlook, aiding in personalized treatment plans. Robotic surgeries and AI-driven drug discoveries have further enhanced the efficiency of medical proce

In [8]:
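# Split the text. This cell chunks english_doc, so the preview below shows the
# English essay; pass hindi_doc instead to chunk the Hindi essay.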
hindi_text_chunks = chunk_text(english_doc)

# Display chunk info
print(f"Total Chunks: {len(hindi_text_chunks)}")
for i, chunk in enumerate(hindi_text_chunks[:3], 1):  # Show only first 3 chunks for preview
    print(f"\n=== Chunk {i} (Length: {len(chunk)}) ===\n{chunk}")

Total Chunks: 2

=== Chunk 1 (Length: 993) ===
The Impact of Artificial Intelligence on Society

Artificial Intelligence (AI) has emerged as one of the most transformative technologies of the 21st century, revolutionizing various aspects of human life. From healthcare to finance, transportation to education, AI is reshaping industries, automating processes, and augmenting human capabilities. While AI presents numerous benefits, it also raises ethical, economic, and societal concerns that must be carefully navigated to ensure its responsible integration into society.

One of the most profound impacts of AI is in the field of healthcare. AI-powered diagnostic tools enable early disease detection, improving patient outcomes and reducing healthcare costs. Machine learning algorithms analyze medical data to identify patterns that humans may overlook, aiding in personalized treatment plans. Robotic surgeries and AI-driven drug discoveries have further enhanced the efficiency of medical proce

### **Step 4.3: Create the API Client and Transliterate Each Chunk** 



In [9]:
client = SarvamAI(api_subscription_key=SARVAM_AI_API_KEY)

In [10]:
# Send one request per chunk
transliterated_texts = []
for chunk in hindi_text_chunks:
    response = client.text.transliterate(input=chunk,
                                         source_language_code="hi-IN",
                                         target_language_code="hi-IN",
                                         spoken_form=True,
                                         numerals_format="international")

    transliterated_texts.append(response.transliterated_text)

# Combine all transliterated chunks
final_transliteration = "\n".join(transliterated_texts)
print("\n=== Final Transliterated Text ===")
print(final_transliteration)


=== Final Transliterated Text ===
द इम्पैक्ट ऑफ़ आर्टिफिशियल इंटेलिजेंस ऑन सोसाइटी आर्टिफिशियल इंटेलिजेंस (ए आई) हैज़ इमर्ज़्ड एज़ वन ऑफ़ द मोस्ट ट्रांसफ़र्मेटिव टेक्नोलॉजीज़ ऑफ़ द ट्वेंटी फर्स्ट सेंचुरी, रेवोल्यूशनाइज़िंग वेरियस एस्पेक्ट्स ऑफ़ ह्यूमन लाइफ। फ्रॉम हेल्थकेयर टू फाइनेंस, ट्रांसपोर्टेशन टू एजुकेशन, एआई इज़ रीशेपिंग इंडस्ट्रीज़, ऑटोमेटिंग प्रोसेसेज़, एंड ऑगमेंटिंग ह्यूमन केपेबिलिटीज़। व्हाइल ए आई प्रेजेंट्स न्यूमरस बेनिफिट्स, इट आल्सो रेज़ एथिकल, इकोनॉमिक, एंड सोसाइअटल कंसर्न्स दैट मस्ट बी केयरफुली नेविगेटेड टू एन्शुर इट्स रिस्पॉन्ससिबल इंटीग्रेशन इनटू सोसाइटी। वन ऑफ़ द मोस्ट प्रोफाउंड इम्पैक्ट्स ऑफ़ ए आई ज़ इन द फ़ील्ड ऑफ़ हेल्थकेयर। एआई-पावर्ड डायग्नोस्टिक टूल्स इनेबल अर्ली डिसीज डिटेक्शन, इम्प्रूविंग पेशेंट आउटकम्स एंड रिड्यूसिंग हेल्थकेयर कॉस्ट्स। मशीन लर्निंग एल्गोरिथम्स एनालाइज़ मेडिकल डेटा टू आइडेंटिफाई पैटर्न्स दैट ह्यूमन्स मे ओवरलुक, एडिंग इन पर्सनलाइज़्ड ट्रीटमेंट प्लान्स। रोबोटिक सर्जरीज़ एंड ए आई-ड्रिवन ड्रग डिस्कवरीज़ हैव फर्दर एन्हांस्ड द एफिशिएंसी ऑफ़ मेडिकल प्रोसीजर्स। हाउएवर, द इंटी
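Each chunk is a separate network request, so a single dropped connection can interrupt the whole loop. A small retry wrapper with exponential backoff is one way to make this more robust. The sketch below is generic: `call_with_retries` is a hypothetical helper, and catching the broad `Exception` is an assumption made because the SDK's specific error types are not covered in this notebook.

```python
import time


def call_with_retries(fn, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**attempt seconds, then retry."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception as exc:  # narrow this to the SDK's error types if known
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)
            print(f"Attempt {attempt + 1} failed ({exc}); retrying in {delay:.1f}s")
            sleep(delay)


# Usage with the client from this notebook (requires a valid API key):
# response = call_with_retries(lambda: client.text.transliterate(
#     input=chunk,
#     source_language_code="hi-IN",
#     target_language_code="hi-IN",
#     spoken_form=True,
#     numerals_format="international"))
```

Passing the request as a zero-argument callable keeps the helper independent of the SDK, so the same wrapper works for every call in this guide.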


## **5️⃣ Experimenting with Different Options**  


The API currently supports **three transliteration modes**:  




### **1️⃣ Romanization (Indic → Latin Script)**  
- Converts Indic scripts to Roman script (English alphabet).  
- Example: `मैं ऑफिस जा रहा हूँ` → `main office ja raha hun`  
- Parameters:  
  - `source_language_code = "hi-IN"`  
  - `target_language_code = "en-IN"`  


In [11]:
response = client.text.transliterate(input="मैं ऑफिस जा रहा हूँ",
                         source_language_code="hi-IN",
                         target_language_code="en-IN",
                         spoken_form=True)

transliterated_text = response.transliterated_text
print(f"Romanized Text: {transliterated_text}")



### **2️⃣ Conversion to Indic Scripts**  
- Converts text into an Indic script from various sources:  

  - **Code-mixed text**  
    - Example: `मैं office जा रहा हूँ` → `मैं ऑफिस जा रहा हूँ`  
    - Parameters:  
      - `source_language_code = "hi-IN"`  
      - `target_language_code = "hi-IN"`  

  - **Romanized text**  
    - Example: `main office ja raha hun` → `मैं ऑफिस जा रहा हूँ`  
    - Parameters:  
      - `source_language_code = "hi-IN"`  
      - `target_language_code = "hi-IN"`  

  - **English text**  
    - Example: `I am going to office` → `आइ ऍम गोइंग टू ऑफिस`  
    - Parameters:  
      - `source_language_code = "en-IN"`  
      - `target_language_code = "hi-IN"`  


In [None]:
response = client.text.transliterate(input="main office ja raha hun",
                         source_language_code="hi-IN",
                         target_language_code="hi-IN",
                         spoken_form=True)

transliterated_text = response.transliterated_text
print(f"Transliterated Text: {transliterated_text}")
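To see which of the input types above a given string falls into, the scripts it contains can be inspected locally. The sketch below is an illustration only: `classify_script` is a hypothetical helper, it only knows about the Devanagari block (U+0900 to U+097F) and ASCII letters, and it cannot tell romanized Hindi from plain English.

```python
def classify_script(text):
    """Classify text as 'devanagari', 'latin', 'code-mixed', or 'unknown'."""
    has_devanagari = any("\u0900" <= ch <= "\u097F" for ch in text)
    has_latin = any(ch.isascii() and ch.isalpha() for ch in text)
    if has_devanagari and has_latin:
        return "code-mixed"
    if has_devanagari:
        return "devanagari"
    if has_latin:
        return "latin"
    return "unknown"


for sample in ["मैं office जा रहा हूँ", "main office ja raha hun", "मैं ऑफिस जा रहा हूँ"]:
    print(f"{sample!r}: {classify_script(sample)}")
```

Other Indic scripts would need their own Unicode ranges added to the check.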


### **3️⃣ Spoken Indic Form**  
- Converts written text into a more natural spoken form.  
- Example: `मुझे कल 9:30am को appointment है` → `मुझे कल सुबह साढ़े नौ बजे अपॉइंटमेंट है`  

In [12]:
response = client.text.transliterate(input="मुझे कल 9:30am को appointment है",
                         source_language_code="hi-IN",
                         target_language_code="hi-IN",
                         spoken_form=True)

transliterated_text = response.transliterated_text
print(f"Spoken Text: {transliterated_text}")

Spoken Text: मुझे कल सुबह साढ़े नौ बजे अपॉइंटमेंट है


## **6️⃣ Advanced Features**  

- **`numerals_format`** – Choose between **international (0-9)** or **native (०-९)** numbers.  
- **`spoken_form_numerals_language`** – Choose whether numbers should be spoken in **English** or the **native language**.  




### **Numerals Format**  
`numerals_format` is an optional parameter with two options:  

- **`international`** (default): Uses regular numerals (0-9).  
- **`native`**: Uses language-specific native numerals.  

#### **Example:**  
- If `international` format is selected → `मेरा phone number है: 9840950950`.  
- If `native` format is selected → `मेरा phone number है: ९८४०९५०९५०`.  
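The relationship between the two formats for Hindi is a one-to-one digit mapping, which can be reproduced locally to make the difference concrete. This sketch only illustrates the Devanagari case with `str.translate`; it is not how the API implements the option, and the helper names are hypothetical.

```python
# Illustration only: map ASCII digits to Devanagari digits and back
INTERNATIONAL_DIGITS = "0123456789"
DEVANAGARI_DIGITS = "०१२३४५६७८९"

TO_NATIVE = str.maketrans(INTERNATIONAL_DIGITS, DEVANAGARI_DIGITS)
TO_INTERNATIONAL = str.maketrans(DEVANAGARI_DIGITS, INTERNATIONAL_DIGITS)


def to_native_digits(text):
    """Replace 0-9 with the corresponding Devanagari digits."""
    return text.translate(TO_NATIVE)


def to_international_digits(text):
    """Replace Devanagari digits with 0-9."""
    return text.translate(TO_INTERNATIONAL)


print(to_native_digits("मेरा phone number है: 9840950950"))
```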



In [13]:
response = client.text.transliterate(input="मुझे कल 9:30am को appointment है",
                         source_language_code="hi-IN",
                         target_language_code="hi-IN",
                         spoken_form=True,
                         numerals_format="native")

transliterated_text = response.transliterated_text
print(f"Native Numerals Text: {transliterated_text}")

Native Numerals Text: मुझे कल सुबह साढ़े नौ बजे अपॉइंटमेंट है


### **Spoken Form Numerals Language**  
`spoken_form_numerals_language` is an optional parameter with two options and only works when `spoken_form` is **true**:  

- **`english`**: Numbers in the text will be spoken in **English**.  
- **`native`** (default): Numbers in the text will be spoken in the **native language**.  

#### **Example:**  
**Input:** `"मेरे पास ₹200 है"`  
- If `english` format is selected → `"मेरे पास टू हन्डर्ड रूपीस है"`.  
- If `native` format is selected → `"मेरे पास दो सौ रुपये है"`.  

In [14]:
response = client.text.transliterate(input="मुझे कल 9:30am को appointment है",
                         source_language_code="hi-IN",
                         target_language_code="hi-IN",
                         spoken_form=True,
                         spoken_form_numerals_language="english")

transliterated_text = response.transliterated_text
print(f"Spoken Form Numerals Language Text: {transliterated_text}")


Spoken Form Numerals Language Text: मुझे कल नाइन थर्टी ए एम को अपॉइंटमेंट है
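To compare both options side by side, the same input can be sent once per setting. The sketch below wraps that in a hypothetical helper, `compare_spoken_numerals`, which takes the client as an argument and uses only the `client.text.transliterate` call and response field shown in this notebook.

```python
def compare_spoken_numerals(client, text, language_code="hi-IN"):
    """Transliterate text once per spoken_form_numerals_language option."""
    results = {}
    for option in ("english", "native"):
        response = client.text.transliterate(
            input=text,
            source_language_code=language_code,
            target_language_code=language_code,
            spoken_form=True,  # required for the numerals language to apply
            spoken_form_numerals_language=option,
        )
        results[option] = response.transliterated_text
    return results


# Usage with the client created earlier (requires a valid API key):
# for option, output in compare_spoken_numerals(client, "मेरे पास ₹200 है").items():
#     print(f"{option}: {output}")
```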


## **7️⃣ Additional Resources**

For more details, refer to the official documentation. For help and support, join our Discord community:

- **Documentation**: [docs.sarvam.ai](https://docs.sarvam.ai)  
- **Community**: [Join the Discord Community](https://discord.gg/hTuVuPNF)

---

## **8️⃣ Final Notes**

- Keep your API key secure: do not commit it to version control or leave it in a shared notebook.

**Keep Building!** 🚀