# **Sarvam Parse PDF API Tutorial**

This notebook demonstrates how to use the **Sarvam Parse PDF API** to parse a page of a PDF file and retrieve its content as HTML, which you can then save and analyze.



## **1. Installation**

Before you begin, make sure the required Python libraries are installed. Run the following command to install them:


In [11]:
!pip install requests ipywidgets



## **2. Setting Up the API Key**

To use the Sarvam PDF Parser API, you need an API subscription key. Follow these steps to set up your API key:

1. **Obtain your API key**: If you don’t have an API key, sign up on the [Sarvam AI platform](https://api.sarvam.ai) to get one.
2. **Enter your API key**: In the UI widget below, enter your API key in the "API Key" field.
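
If you would rather not type the key into a widget every session, one option is to read it from an environment variable and fall back to a hidden prompt. The sketch below assumes a variable named `SARVAM_API_KEY` and a helper called `load_api_key`; both names are hypothetical and chosen only for this example, not something the API requires.

```python
import os
from getpass import getpass


def load_api_key(env_var="SARVAM_API_KEY", prompt=getpass):
    """Return the API key from the environment, or ask for it without echoing."""
    # env_var is a hypothetical name chosen for this sketch
    key = os.environ.get(env_var, "").strip()
    if not key:
        # getpass hides the typed characters, similar to the Password widget
        key = prompt("Sarvam API key: ").strip()
    if not key:
        raise ValueError("No API key provided.")
    return key
```

The returned string can be passed wherever the notebook expects the value of the "API Key" field.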


## **3. Uploading PDF Files**

To parse a PDF file, you need to upload it to the notebook. Follow these steps:

1. **Prepare your PDF file**: Ensure your file is in `.pdf` format.
2. **Upload the file**: Use the file upload widget below to upload your PDF file.
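
A `.pdf` extension alone does not guarantee that the file is a PDF. As a rough check before sending anything to the API, you could look for the `%PDF-` marker that PDF files carry near the start. The helper name `looks_like_pdf` is hypothetical, and this is only a sanity check, not full validation.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Return True if the bytes contain the PDF marker near the start."""
    # PDF files begin with "%PDF-" followed by a version number; some files
    # have a few leading bytes, so search a short prefix rather than offset 0
    return b"%PDF-" in data[:1024]
```

For example, `looks_like_pdf(open("report.pdf", "rb").read())` would return `False` for an HTML page that was saved with a `.pdf` extension (`report.pdf` is a placeholder filename).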


## **4. Using the Sarvam PDF Parser API**
This section demonstrates how to use the Sarvam PDF Parser API to extract structured data from a PDF file.

### **4.1. Defining the `parse_pdf` Function**

The `parse_pdf` function sends the PDF file to the Sarvam API and retrieves the parsed content in HTML format.

In [12]:
import requests
import base64
import tempfile
import os


def parse_pdf(pdf_file_path, page_number, sarvam_mode, api_key):
    """
    Sends one page of a PDF to the Sarvam API and returns the parsed HTML.

    On failure, returns a human-readable error message instead of raising.
    """
    url = "https://api.sarvam.ai/parse/parsepdf"
    headers = {"api-subscription-key": api_key}

    try:
        with open(pdf_file_path, "rb") as pdf_file:
            # Multipart form: the PDF itself plus two plain form fields
            files = {
                "pdf": ("file.pdf", pdf_file, "application/pdf"),
                "page_number": (None, str(page_number)),
                "sarvam_mode": (None, sarvam_mode),
            }

            # The timeout (an arbitrary 120 s) keeps the notebook from
            # hanging indefinitely on a stalled request
            response = requests.post(url, headers=headers, files=files, timeout=120)

        if response.status_code == 200:
            # The HTML arrives base64-encoded in the "output" field
            output = response.json().get("output", "")
            if output:
                return base64.b64decode(output).decode("utf-8")
            return "Parsing failed. No data returned."
        return f"❌ API Error ({response.status_code}): {response.text}"

    except Exception as e:
        return f"⚠️ An error occurred: {e}"
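
Because `parse_pdf` returns both parsed HTML and error messages as plain strings, callers cannot easily tell the two apart. One possible alternative, sketched here with a hypothetical helper named `decode_output`, is to decode the base64 `output` field in a separate step and raise when it is missing. The sample payload is built locally for illustration and is not a real API response.

```python
import base64


def decode_output(payload: dict) -> str:
    """Decode the base64 "output" field of a response body into HTML text."""
    output = payload.get("output", "")
    if not output:
        # Raising keeps error messages out of the HTML return value
        raise ValueError("Parsing failed. No data returned.")
    return base64.b64decode(output).decode("utf-8")


# Simulated response body, built locally for illustration only
sample = {"output": base64.b64encode("<p>Hello</p>".encode("utf-8")).decode("ascii")}
html_text = decode_output(sample)
```

With this split, the code that saves the file only runs when decoding succeeded, and failures surface as exceptions.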

### **4.2. Creating the UI Widgets**

The following code creates interactive UI widgets for uploading the PDF file, selecting the page number, choosing the parsing mode, and entering the API key.

In [13]:
import ipywidgets as widgets
from IPython.display import display, HTML
from google.colab import files  # Colab-only; provides files.download()

# UI Widgets
upload_widget = widgets.FileUpload(accept=".pdf", multiple=False)
page_number_widget = widgets.IntText(value=1, description="Page No:")
sarvam_mode_widget = widgets.Dropdown(
    options=["large", "small"], value="small", description="Mode:"
)
api_key_widget = widgets.Password(description="API Key:")
parse_button = widgets.Button(description="Parse PDF", button_style="primary")


### **4.3. Handling PDF Parsing**

The `handle_parse` function saves the uploaded PDF to a temporary file, calls the Sarvam API, writes the parsed HTML to `parsed_output.html`, and triggers a browser download of that file.



In [14]:
from html import escape


def handle_parse(_):
    """
    Handles the PDF parsing process when the "Parse PDF" button is clicked.
    """
    if not upload_widget.value:
        display(HTML("<p>⚠️ Please upload a PDF file first.</p>"))
        return

    # ipywidgets 7.x exposes uploads as a dict keyed by filename
    uploaded_filename = list(upload_widget.value.keys())[0]
    file_content = upload_widget.value[uploaded_filename]["content"]

    # Save the uploaded file temporarily
    temp_pdf_path = os.path.join(tempfile.gettempdir(), uploaded_filename)
    with open(temp_pdf_path, "wb") as temp_file:
        temp_file.write(file_content)

    # Get user inputs
    page_number = page_number_widget.value
    sarvam_mode = sarvam_mode_widget.value
    api_key = api_key_widget.value

    display(
        HTML(f"<p>📄 Processing <b>{escape(uploaded_filename)}</b>... Please wait.</p>")
    )

    # Parse PDF
    parsed_html = parse_pdf(temp_pdf_path, page_number, sarvam_mode, api_key)

    # parse_pdf reports failures as plain-text messages; show them instead of
    # saving them as if they were parsed HTML
    if parsed_html.startswith(("❌", "⚠", "Parsing failed")):
        display(HTML(f"<p>{escape(parsed_html)}</p>"))
        return

    # Save the parsed HTML
    output_html_path = "parsed_output.html"
    with open(output_html_path, "w", encoding="utf-8") as html_file:
        html_file.write(parsed_html)

    display(
        HTML("<p>✅ <b>Parsing complete!</b> Your download should start shortly.</p>")
    )

    # Trigger the browser download (Colab only)
    files.download(output_html_path)


# Bind button click event
parse_button.on_click(handle_parse)
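
The shape of `FileUpload.value` differs between ipywidgets major versions: 7.x exposes a dict keyed by filename, which is what `handle_parse` relies on, while 8.x exposes a tuple of dicts with `name` and `content` entries. If the notebook may run under either version, a small adapter such as the hypothetical `first_upload` below is one way to read the first uploaded file in both cases.

```python
def first_upload(value):
    """Return (filename, content_bytes) for the first uploaded file, or None."""
    if not value:
        return None
    if isinstance(value, dict):
        # ipywidgets 7.x: {filename: {"metadata": {...}, "content": bytes}}
        name, info = next(iter(value.items()))
        return name, bytes(info["content"])
    # ipywidgets 8.x: tuple of dicts; "content" is a memoryview
    item = value[0]
    return item["name"], bytes(item["content"])
```

Calling `first_upload(upload_widget.value)` would then replace the two lines that index into the dict directly.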

### **4.4. Displaying the UI**

This section displays the UI widgets for uploading the PDF file, entering the page number, selecting the parsing mode, and providing the API key.

In [15]:
# Display UI
display(
    HTML(
        "<h3>📄 Sarvam PDF Parser</h3><p>Upload a PDF file, enter details, and click 'Parse PDF' to extract structured data.</p>"
    )
)
display(
    upload_widget, page_number_widget, sarvam_mode_widget, api_key_widget, parse_button
)


## **5. Conclusion**

This notebook demonstrated how to use the **Sarvam PDF Parser API** to extract structured data from PDF files. By following the steps, you can:

1. Upload a PDF file.
2. Parse the file using the Sarvam API.
3. Download the parsed HTML content for further analysis.
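
As a starting point for the "further analysis" step, the downloaded HTML can be reduced to plain text using only the Python standard library. This is a minimal sketch: `TextExtractor` and `html_to_text` are hypothetical names, and the sample markup is invented for illustration rather than taken from real API output.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the non-empty text fragments of an HTML document."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for every run of text between tags (including any text
        # inside <script> or <style>, which this sketch does not filter out)
        text = data.strip()
        if text:
            self.parts.append(text)


def html_to_text(html_source: str) -> str:
    """Return the text fragments of html_source, one per line."""
    parser = TextExtractor()
    parser.feed(html_source)
    parser.close()
    return "\n".join(parser.parts)


# Invented sample markup, for illustration only
sample_html = "<h1>Invoice</h1><p>Total: <b>42</b></p>"
text = html_to_text(sample_html)
```

To apply it to the notebook's output, read `parsed_output.html` with `encoding="utf-8"` and pass the contents to `html_to_text`.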

---

## **6. Additional Resources**

For more details, refer to the official **Sarvam API documentation** and join the community for support:

- **Documentation**: [docs.sarvam.ai](https://docs.sarvam.ai)  
- **Community**: [Join the Discord Community](https://discord.gg/hTuVuPNF)

---

## **7. Final Notes**

- Keep your API key secure.
- Explore advanced features like multi-page parsing and custom output formats.
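
The `parse_pdf` function in this notebook sends one page number per request. Whether the API offers a native multi-page option is not shown here, so the sketch below simply loops over page numbers and collects each result. `parse_pages` is a hypothetical helper, and it takes the per-page parser as an argument so it can be tried without network access.

```python
def parse_pages(parse_page, page_numbers):
    """Call parse_page(page_number) for each page and collect results by page."""
    results = {}
    for page in page_numbers:
        results[page] = parse_page(page)
    return results


# Possible use with this notebook's function (not executed here;
# "report.pdf" and api_key are placeholders):
# pages = parse_pages(
#     lambda p: parse_pdf("report.pdf", p, "small", api_key), range(1, 4)
# )
```

Each request is independent, so one failed page does not prevent the others from being parsed; the error message for that page just ends up in its slot of the result.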

Happy parsing! 🚀
