# DocToMarkdown Example Notebook

This notebook demonstrates how to use the DocToMarkdown library to convert PDF documents to Markdown with several API clients. It covers extraction both with and without LLM (Large Language Model) support.

## Supported API Clients in this Notebook

- **Groq API Client**: Use an LLM hosted on Groq for PDF-to-Markdown conversion.
- **OpenAI API Client**: Use OpenAI models (e.g., GPT-4o) for document conversion.
- **Gemini API Client**: Use Google's Gemini vision-capable models for advanced extraction, including PDFs that are not text-searchable.
- **Azure OpenAI Client**: Use Azure-hosted OpenAI models (e.g., GPT-4o) for document conversion.
- **Ollama API Client**: Use local or remote Ollama models via an OpenAI-compatible API.
- **No LLM (Standard Extraction)**: Extract text and images using only local libraries (PyMuPDF, imported as `fitz`) without any LLM.

Each section below provides example code for initializing and using these clients with DocToMarkdown.

In [1]:
# Shared imports; load_dotenv() reads API keys from a local .env file
from groq import Groq
import os
from doctomarkdown import DocToMarkdown
from dotenv import load_dotenv
load_dotenv()

True
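
`load_dotenv()` returning `True` only indicates that a `.env` file was found and loaded, not that every key the later cells need is present. The sketch below is one way to check up front. `missing_env_vars` is a hypothetical helper, not part of DocToMarkdown, and the variable names are the ones read by the cells in this notebook.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Environment variables read by the cells in this notebook.
REQUIRED = [
    "GROQ_API_KEY",
    "OPENAI_API_KEY",
    "GOOGLE_API_KEY",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_VERSION",
]

missing = missing_env_vars(REQUIRED)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All expected environment variables are set.")
```

Only the keys for the clients you actually run need to be set, so a non-empty list is a hint rather than an error.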

## Groq API Client

In [2]:

client_groq = Groq(
    api_key=os.environ.get("GROQ_API_KEY"),
)

app = DocToMarkdown(llm_client=client_groq, 
                    llm_model='meta-llama/llama-4-scout-17b-16e-instruct')

result = app.convert_pdf_to_markdown(
    filepath="sample_docs/sample_ppt_2.pdf",
    extract_images=True,
    extract_tables=True,
    output_path="markdown_output",
    output_type="text"
)

for page in result.pages:
    print(f"Page Number: {page.page_number} | Page Content: {page.page_content}")

Page Number: 1 | Page Content: 
## Image to Markdown
Extraction of Complex PPT Contents (Images) into Markdown

### Resume Screening Application Design

#### Icons
* **$365**
* **Amazon S3**
* **SQL**

#### Resumes Directory
* **No-SQL DB**

#### Batch Processing
* **PDFTextExtractor Agent**

#### HR Input Parameters
* **Priority Score: 0-1**
* **List**
* **Job Description**
* **Years of Experience**
* **Education Qualification**
* **Skill Set**

#### Prompt Context
* **Information**
	+ "Education Qualification"
	+ "Projects"
	+ "YE"
	+ "Skillset"
	+ "Experience"
	+ "Company Names"
	+ "Quantitatively Impacts"

#### VectorDB Embedding Store
* **OBJECT**
	+ **Key** : **Value**
	+ **Key** : **Value**

#### Retrieval Agent
* **Top 20 Matching records**
* **LLM**
* **Re-ranking Algorithm Mechanism**
	+ *Resume Score Generation Based on factors*
	+ *Resume*

#### Top Resumes Selection

#### Future Scope
* **Profile Summary Extraction Agent**

## Flowchart Description

The flowchart illustrat
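
The loop above prints each page separately. If a single Markdown string is more convenient, the pages can be joined in page order. This is a minimal sketch that relies only on the `page_number` and `page_content` attributes used above; the `Page` dataclass is a stand-in for illustration and `pages_to_markdown` is a hypothetical helper, not part of DocToMarkdown.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Stand-in for the page objects in result.pages (illustration only)."""
    page_number: int
    page_content: str


def pages_to_markdown(pages):
    """Join pages in page order, marking each page with an HTML comment."""
    parts = []
    for page in sorted(pages, key=lambda p: p.page_number):
        parts.append(f"<!-- page {page.page_number} -->\n\n{page.page_content.strip()}")
    return "\n\n".join(parts) + "\n"


demo_pages = [Page(2, "## Second page"), Page(1, "# First page\n")]
print(pages_to_markdown(demo_pages))
```

With a real conversion result, `pages_to_markdown(result.pages)` should work the same way, assuming the page objects expose those two attributes as in the loop above.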

## Extraction without LLM

In [3]:
import os
from doctomarkdown import DocToMarkdown
import pytesseract
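# Point pytesseract at the local Tesseract executable (this path is machine-specific; adjust it for your system)
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\sayantghosh\AppData\Local\Programs\Tesseract-OCR\tesseract.exe'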

app = DocToMarkdown()

result = app.convert_pdf_to_markdown(
    filepath="sample_docs/sample-1.pdf",
    extract_images=True,
    extract_tables=True,
    output_path="markdown_output",
    output_type="text"
)

for page in result.pages:
    print(f"Page Number: {page.page_number} | Page Content: {page.page_content}")

Page Number: 1 | Page Content: Why should organisations consider early 
adoption and avoid being late movers? 
Late movers
Early adopters
Market 
position
Set industry benchmarks  
and gain ﬁrst-mover market advantage. 
Struggle to catch up and miss out on 
creating competitive advantage.
Innovation
Leverage AI to innovate business 
processes, deploy the AI solutions 
eﬀectively and create diﬀerentiation.
Slow to innovate business processes and 
take full advantage of AI solutions to create 
diﬀerentiation. 
Customer 
relationships
Build deeper customer relationships 
through personalised and newer 
experiences.
Play catch-up to match the personalised 
services of early adopters.
Operational 
eﬃciency
Streamline operations and reduce 
operational cost early on.
Higher lost opportunity cost due to late entry 
and adoptions.
Learning curve
Beneﬁt from the initial learning curve and 
shape industry standards.
Miss out on early learning opportunities and 
industry inﬂuence.
Market share
In
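
The Tesseract path set in the cell above is specific to one Windows machine. A more portable option, sketched here, is to look the executable up on `PATH` with the standard library's `shutil.which`. `find_tesseract` is a hypothetical helper, and it assumes the executable is named `tesseract`.

```python
import shutil


def find_tesseract(fallback=None):
    """Return the Tesseract executable found on PATH, or `fallback` if none is found."""
    return shutil.which("tesseract") or fallback


tesseract_path = find_tesseract()
if tesseract_path is None:
    print("Tesseract not found on PATH; set tesseract_cmd manually.")
else:
    print("Tesseract found at:", tesseract_path)
    # With pytesseract installed, the path can then be applied as in the cell above:
    # import pytesseract
    # pytesseract.pytesseract.tesseract_cmd = tesseract_path
```

Passing the hard-coded path as `fallback` keeps the original behaviour on machines where Tesseract is installed but not on `PATH`.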

## OpenAI API Client

In [4]:
from openai import OpenAI
from dotenv import load_dotenv
load_dotenv()

client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
)

app = DocToMarkdown(llm_client=client, 
                    llm_model='gpt-4o')

result = app.convert_pdf_to_markdown(
    filepath="sample_docs/sample-1.pdf",
    extract_images=True,
    extract_tables=True,
    output_path="markdown_output",
    output_type="text"
)

for page in result.pages:
    print(f"Page Number: {page.page_number} | Page Content: {page.page_content}")

Page Number: 1 | Page Content: 
# Why should organisations consider early adoption and avoid being late movers?

|                             | **Early adopters**                                                                 | **Late movers**                                                                 |
|-----------------------------|------------------------------------------------------------------------------------|---------------------------------------------------------------------------------|
| **Market position**         | Set industry benchmarks and gain first-mover market advantage.                     | Struggle to catch up and miss out on creating competitive advantage.            |
| **Innovation**              | Leverage AI to innovate business processes, deploy the AI solutions effectively and create differentiation. | Slow to innovate business processes and take full advantage of AI solutions to create differentiation. |
| **Customer relationships**  | Build deepe

## Gemini API Client

In [5]:
import os
from dotenv import load_dotenv
load_dotenv()

import google.generativeai as genai
from doctomarkdown import DocToMarkdown

# Set up the Gemini API
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))

# Use a Gemini model with vision support; substitute the model you prefer
vision_model = genai.GenerativeModel("gemini-1.5-flash")

# Initialize DocToMarkdown with Gemini client
app = DocToMarkdown(
    llm_client=vision_model
)

result = app.convert_pdf_to_markdown(
    filepath="sample_docs/Non-text-searchable.pdf",
    extract_images=True,
    extract_tables=True,
    output_path="markdown_output",
    output_type="text"
)

for page in result.pages:
    print(f"Page Number: {page.page_number} | Page Content: {page.page_content}")


Page Number: 1 | Page Content: 
IDRH
Non-text-searchable PDF

This is an example of a non-text-searchable PDF. Because it was created from
an image rather than a text document, it cannot be rendered as plain text by the
PDF reader. Thus, attempting to select the text on the page as though it were a
text document or website will not work, regardless of how neatly it is organized.

There are no flowcharts or diagrams in the provided image.


## Azure OpenAI Client

In [8]:
from openai import AzureOpenAI
from doctomarkdown import DocToMarkdown
import os
from dotenv import load_dotenv
load_dotenv()

client = AzureOpenAI(
    api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
    azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    api_version=os.environ.get("AZURE_OPENAI_API_VERSION"),
)

app = DocToMarkdown(llm_client=client,
                    llm_model='gpt-4o')


result = app.convert_pdf_to_markdown(
    filepath="sample_docs/sample-1.pdf",
    extract_images=True,
    extract_tables=True,
    output_path="markdown_output"
)

for page in result.pages:
    print(f"Page Number: {page.page_number} | Page Content: {page.page_content}")

Page Number: 1 | Page Content: 
# Why should organisations consider early adoption and avoid being late movers?

|                        | **Early adopters**                                                                 | **Late movers**                                                                 |
|------------------------|------------------------------------------------------------------------------------|---------------------------------------------------------------------------------|
| **Market position**    | Set industry benchmarks and gain first-mover market advantage.                     | Struggle to catch up and miss out on creating competitive advantage.            |
| **Innovation**         | Leverage AI to innovate business processes, deploy the AI solutions effectively and create differentiation. | Slow to innovate business processes and take full advantage of AI solutions to create differentiation. |
| **Customer relationships** | Build deeper customer relationsh

## Ollama API Client

In [3]:
from openai import OpenAI

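# Ollama exposes an OpenAI-compatible endpoint under /v1
ollama_client = OpenAI(
    base_url='http://localhost:11434/v1',
    api_key='ollama',  # required by the OpenAI client, but not used by Ollama
)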

app = DocToMarkdown(llm_client=ollama_client, 
                    llm_model='gemma3:4b')
result = app.convert_pdf_to_markdown(
    filepath="sample_docs/Non-text-searchable.pdf",
    extract_images=True,
    extract_tables=True,
    output_path="markdown_output"
)

for page in result.pages:
    print(f"Page Number: {page.page_number} | Page Content: {page.page_content}")

Page Number: 1 | Page Content: 
# IDRH

Non-text-searchable PDF

This is an example of a non-text-searchable PDF. Because it was created from an image rather than a text document, it cannot be rendered as plain text by the PDF reader. Thus, attempting to select the text on the page as though it were a text document or website will not work, regardless of how neatly it is organized.
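
The Ollama example only works while an Ollama server is listening at the configured address. Before starting a long conversion, it can help to confirm that the server answers at all. The sketch below uses only the standard library; `server_responds` is a hypothetical helper, and it assumes the server replies to a plain HTTP GET at its base address (the `base_url` above without the `/v1` suffix).

```python
import urllib.error
import urllib.request


def server_responds(url, timeout=2.0):
    """Return True if any HTTP response comes back from `url`, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status code.
        return True
    except (OSError, ValueError):
        # Connection refused, timeout, DNS failure, or a malformed URL.
        return False


if server_responds("http://localhost:11434"):
    print("Ollama server is reachable.")
else:
    print("No response from http://localhost:11434; is Ollama running?")
```

A reachable server does not guarantee that the requested model (here `gemma3:4b`) has been pulled, so a conversion can still fail for that reason.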
