# Example without LLM

In [1]:
from doctomarkdown import DocToMarkdown
from dotenv import load_dotenv
load_dotenv()

app = DocToMarkdown()

result = app.convert_docx_to_markdown(
    filepath="sample_docs/sample_document.docx",
    extract_images=True,
    extract_tables=True,
    output_path="markdown_output",
    output_type="text"
)

for page in result.pages:
    print(f"Page Number: {page.page_number} | Page Content: {page.page_content}")

Page Number: 1 | Page Content: Demonstration of DOCX support in 
calibre 
This document demonstrates the ability of the calibre DOCX Input plugin to 
convert the various typographic features in a Microsoft Word (2007 and newer) 
document. Convert this document to a modern ebook format, such as AZW3 for 
Kindles or EPUB for other ebook readers, to see it in action. 
There is support for images, tables, lists, footnotes, endnotes, links, dropcaps and 
various types of text and paragraph level formatting. 
To see the DOCX conversion in action, simply add this file to calibre using the 
“Add Books” button and then click “Convert”.  Set the output format in the top right 
corner of the conversion dialog to EPUB or AZW3 and click “OK”.
Page Number: 2 | Page Content: Text Formatting 
Inline formatting 
Here, we demonstrate various types of inline text formatting and the use of 
embedded fonts. 
Here is some bold, italic, bold-italic, underlined and struck out  text. Then, we 
have a superscri
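
The loop above prints one line per page, reading `page_number` and `page_content` from each item in `result.pages`. If a single Markdown string is more convenient, the pages can be stitched together. The following is a minimal sketch that assumes only those two attributes: `Page` is a hypothetical stand-in so the snippet runs without the library, and `join_pages` is an illustrative helper, not part of `doctomarkdown`.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Page:
    """Hypothetical stand-in for the objects found in result.pages."""
    page_number: int
    page_content: str


def join_pages(pages: Iterable[Page], separator: str = "\n\n---\n\n") -> str:
    """Concatenate page contents in page-number order.

    Each page's content is stripped of surrounding whitespace, and the
    pages are separated by a Markdown horizontal rule by default.
    """
    ordered = sorted(pages, key=lambda p: p.page_number)
    return separator.join(p.page_content.strip() for p in ordered)


# Usage with stand-in data; with the real library, pass result.pages instead.
demo_pages = [
    Page(2, "Text Formatting"),
    Page(1, "Demonstration of DOCX support in calibre"),
]
print(join_pages(demo_pages))
```

Sorting by `page_number` keeps the output stable even if the pages arrive out of order.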

# Example using LLM

In [None]:
from openai import AzureOpenAI
from doctomarkdown import DocToMarkdown
from dotenv import load_dotenv
import os
load_dotenv()

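# Build the Azure OpenAI client from settings loaded out of the .env file.
client = AzureOpenAI(
    api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
    azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    api_version=os.environ.get("AZURE_OPENAI_API_VERSION"),
)

# Pass the client and model name so the conversion goes through the LLM.
app = DocToMarkdown(llm_client=client, llm_model="gpt-4o")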

result = app.convert_docx_to_markdown(
    filepath="sample_docs/sample_document.docx",
    extract_images=True,
    extract_tables=True,
    output_path="markdown_output",
    output_type="text"
)

for page in result.pages:
    print(f"Page Number: {page.page_number} | Page Content: {page.page_content}")

Page Number: 1 | Page Content: 
# Demonstration of DOCX support in calibre

This document demonstrates the ability of the calibre DOCX Input plugin to convert the various typographic features in a Microsoft Word (2007 and newer) document. Convert this document to a modern ebook format, such as AZW3 for Kindles or EPUB for other ebook readers, to see it in action.

There is support for images, tables, lists, footnotes, endnotes, links, dropcaps and various types of text and paragraph level formatting.

To see the DOCX conversion in action, simply add this file to calibre using the **“Add Books”** button and then click **“Convert”**. Set the output format in the top right corner of the conversion dialog to EPUB or AZW3 and click **“OK”**.
Page Number: 2 | Page Content: 
# Text Formatting

## Inline formatting

Here, we demonstrate various types of inline text formatting and the use of embedded fonts.

Here is some **bold**, *italic*, ***bold-italic***, 

underlined

 and ~~struck out~~ t
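
The LLM example reads three Azure settings with `os.environ.get`, which silently returns `None` when a variable is missing. Checking them up front gives a clearer failure than an error from deep inside the client. This is a minimal sketch using only the standard library; `missing_env_vars` and `REQUIRED_VARS` are illustrative names, not part of `doctomarkdown` or `openai`.

```python
import os
from typing import Iterable, List, Mapping

# The three variables read in the LLM example above.
REQUIRED_VARS = (
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_VERSION",
)


def missing_env_vars(
    names: Iterable[str], env: Mapping[str, str] = os.environ
) -> List[str]:
    """Return the names that are unset or empty in env, in input order."""
    return [name for name in names if not env.get(name)]


# Usage: run after load_dotenv() and before constructing the client.
missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```

Accepting `env` as a parameter keeps the check easy to exercise with a plain dictionary instead of the real process environment.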