## MarkItDown

In [46]:
from markitdown import MarkItDown
from openai import OpenAI
import pdf2docx
import base64
import os
import re
import io

In [47]:
llm = OpenAI()

md = MarkItDown(llm_client=llm, llm_model='gpt-5-mini')

In [65]:
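def remove_base_64_images(markdown):
    """Replace Markdown images embedded as data URIs with a placeholder."""
    # Negated character classes keep each match inside a single image tag, so
    # an ordinary image link that precedes an embedded image is left intact.
    pattern = r"!\[[^\]]*\]\(data:image[^)]*\)"
    return re.sub(pattern, "<REMOVED>", markdown)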
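
A quick way to check this kind of clean-up is to run it on a small mixed sample. The following is a minimal sketch, not part of the original notebook: `strip_data_uri_images` and the sample string are hypothetical names introduced for illustration, and the pattern uses negated character classes so that a match cannot run from an ordinary image link into a later embedded image.

```python
import re

# Hypothetical stand-alone variant of the clean-up helper (illustration only).
# [^\]]* and [^)]* stop at the end of the alt text and the end of the target,
# so each match covers exactly one Markdown image tag.
DATA_URI_IMAGE = re.compile(r"!\[[^\]]*\]\(data:image[^)]*\)")

def strip_data_uri_images(markdown, placeholder="<REMOVED>"):
    """Replace images embedded as data URIs; leave file-based images alone."""
    return DATA_URI_IMAGE.sub(placeholder, markdown)

# Made-up sample: one ordinary image link followed by one embedded image.
sample = (
    "![chart](figures/chart.png)\n"
    "Some text\n"
    "![logo](data:image/png;base64,iVBORw0KGgo=)\n"
)

cleaned = strip_data_uri_images(sample)
print(cleaned)
```

Only the embedded image is replaced; the link to `figures/chart.png` and the text between the two images pass through unchanged.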

### Document Conversion

#### Word

In [66]:
doc = md.convert('test_files/test.docx', keep_data_uris=False)

print(remove_base_64_images(doc.markdown))

AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation

Qingyun Wu , Gagan Bansal , Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Awadallah, Ryen W. White, Doug Burger, Chi Wang

# Abstract

AutoGen is an open-source framework that allows developers to build LLM applications via multiple agents that can converse with each other to accomplish tasks. AutoGen agents are customizable, conversable, and can operate in various modes that employ combinations of LLMs, human inputs, and tools. Using AutoGen, developers can also flexibly define agent interaction behaviors. Both natural language and computer code can be used to program flexible conversation patterns for different applications. AutoGen serves as a generic framework for building diverse applications of various complexities and LLM capacities. Empirical studies demonstrate the effectiveness of the framework in many example applications, with domains ranging fr

#### PowerPoint

In [None]:
doc = md.convert('test_files/test.pptx')

print(doc.markdown)

[INFO] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[INFO] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


<!-- Slide number: 1 -->
# AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
Qingyun Wu , Gagan Bansal , Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Awadallah, Ryen W. White, Doug Burger, Chi Wang

<!-- Slide number: 2 -->
# 2cdda5c8-e50e-4db4-b5f0-9722a649f455

![Front page of a research preprint titled "AutoGen: Enabling Next‑Gen LLM Applications via Multi‑Agent Conversation" (arXiv preprint, Oct 3, 2023). The layout shows the paper title and institution affiliations at the top, followed by a wide illustrative figure and the paper abstract beneath it. The central illustration is divided into three conceptual panels: (left) "Agent Customization" — icons and small diagrams indicating that conversational agents can be created, customized and combined (LLMs, tools, humans shown as distinct components); (middle) "Multi‑Agent Conversations / Flexible Conversation Patterns" — schematic workflows showing agents

#### PDF

In [68]:
doc = md.convert('test_files/AmazonWebServices.pdf', keep_data_uris=False)

print(doc.markdown)

Account number:
296664039561

Bill to Address:
ATTN: iViveLabs Limited
93B Sai Yu Chung
Yuen Long, N.T., 0000, HK

Amazon Web Services Invoice
Email or talk to us about your AWS account or bill, visit aws.amazon.com/contact-us/

Invoice Summary

Invoice Number:
Invoice Date:

TOTAL AMOUNT DUE ON August 3 , 2014

42183017
August 3 , 2014

$4.11

This invoice is for the billing period July 1 - July 31 , 2014

Greetings from Amazon Web Services, we're writing to provide you with an electronic invoice for your use of AWS services. Additional information
regarding your bill, individual service charge details, and your account history are available on the Account Activity Page.

Summary

AWS Service Charges

Charges

Credits

Tax *

Total for this invoice

Detail

AWS Data Transfer

Charges

VAT **

Amazon Elastic Compute Cloud

Charges

VAT **

Amazon Glacier

Charges

VAT **

Amazon Simple Storage Service

Charges

VAT **

$4.11

$4.11

$0.00

$0.00

$4.11

$0.01

$0.01

$0.00

$1.87

$1.8

##### pdf2docx ==> MarkItDown

In [69]:
pdf2docx.Converter('test_files/AmazonWebServices.pdf').convert('test_files/AmazonWebServices.docx')

[INFO] Start to convert test_files/AmazonWebServices.pdf
[INFO] [1/4] Opening document...
[INFO] [2/4] Analyzing document...
[INFO] [3/4] Parsing pages...
[INFO] (1/1) Page 1
[INFO] [4/4] Creating pages...
[INFO] (1/1) Page 1
[INFO] Terminated in 0.26s.


In [76]:
doc = md.convert('test_files/AmazonWebServices.docx', keep_data_uris=False)

print(remove_base_64_images(doc.markdown))

<REMOVED>

Amazon Web Services Invoice
Email or talk to us about your AWS account or bill, visit aws.amazon.com/contact-us/

|  |  |  |
| --- | --- | --- |
| Account number: | Invoice Summary |  |
| 296664039561 |
|  | Invoice Number: | 42183017 |
|  | Invoice Date: | August 3 , 2014 |
| Bill to Address: | TOTAL AMOUNT DUE ON August 3 , 2014 | $4.11 |
| ATTN: iViveLabs Limited |

93B Sai Yu Chung
Yuen Long, N.T., 0000, HK

This invoice is for the billing period July 1 - July 31 , 2014

Greetings from Amazon Web Services, we're writing to provide you with an electronic invoice for your use of AWS services. Additional information regarding your bill, individual service charge details, and your account history are available on the Account Activity Page.

|  |
| --- |
| Summary |
| AWS Service Charges $4.11 |
| Charges $4.11 |
| Credits $0.00 |
| Tax \* $0.00 |
| Total for this invoice $4.11 |

|  |
| --- |
| Detail |
| AWS Data Transfer $0.01 |
| Charges $0.01 |
| VAT \*\* $0.00 |
| Amazo
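
The two cells above (PDF to DOCX with pdf2docx, then DOCX to Markdown with MarkItDown) can be wrapped into one helper. This is a sketch under stated assumptions rather than a definitive implementation: `docx_path_for` and `pdf_to_markdown_via_docx` are hypothetical names, the DOCX is written next to the source PDF, and `md_converter` is assumed to be a `MarkItDown` instance like the `md` object created earlier. pdf2docx is imported inside the function so the block can be loaded even where the package is not installed.

```python
from pathlib import Path

def docx_path_for(pdf_path):
    """Return the sibling .docx path for a given PDF path (hypothetical helper)."""
    return Path(pdf_path).with_suffix(".docx")

def pdf_to_markdown_via_docx(pdf_path, md_converter):
    """Convert PDF -> DOCX with pdf2docx, then DOCX -> Markdown.

    md_converter is assumed to expose convert(path, keep_data_uris=...) and
    return an object with a .markdown attribute, as in the cells above.
    """
    from pdf2docx import Converter  # third-party package, imported lazily

    docx_path = docx_path_for(pdf_path)
    converter = Converter(str(pdf_path))
    try:
        converter.convert(str(docx_path))
    finally:
        # Release the underlying PDF file handle even if conversion fails.
        converter.close()

    result = md_converter.convert(str(docx_path), keep_data_uris=False)
    return result.markdown

# Usage (requires pdf2docx, markitdown, and the test file):
# markdown = pdf_to_markdown_via_docx('test_files/AmazonWebServices.pdf', md)
```

Routing through DOCX is what lets the invoice's layout come back as Markdown tables here, whereas the direct PDF conversion above lists labels and amounts in separate runs.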

### Image Description

![Image](test_files/diagram.png)

In [75]:
doc = md.convert('test_files/diagram.png')

print(doc.markdown)

[INFO] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



# Description:
A hierarchical taxonomy (labels in Portuguese) of UML diagram types, with a single root node “Diagrama” branching into two main categories: “Diagrama de Estruturas” (structure diagrams) and “Diagrama de Comportamento” (behavior diagrams). The structure branch lists typical structural diagram types — for example, Diagrama de Classes (Class Diagram: models system classes, attributes, operations and relationships), Diagrama de Componentes (Component Diagram: shows software components and their interfaces), Diagrama de Estruturas Compostas (Composite Structure Diagram: internal structure of classifiers), Diagrama de Objetos (Object Diagram: instance-level snapshots), Diagrama de Pacotes (Package Diagram: organizes model elements into namespaces), Diagrama de Implantação (Deployment Diagram: hardware nodes and artifact placement), and Diagrama de Perfis (Profile Diagram: defines domain-specific stereotypes and tagged values). The behavior branch lists behavioral and interact

![Image](test_files/paper.jpg)

In [73]:
doc = md.convert('test_files/paper.jpg')

print(doc.markdown)

[INFO] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



# Description:
Front page of a research paper titled "AutoGen: Enabling Next‑Gen LLM Applications via Multi‑Agent Conversation." The layout shows the paper title at top, a block of author and affiliation lines (names not listed here), and a central illustrative figure (Figure 1) that visualizes the AutoGen concept. The figure is divided into three labeled panels: (left) "Conversable agent" – schematic icons indicating agents that can be based on LLMs, tools, or humans and are customizable; (center) "Multi‑Agent Conversations" – diagrams of agents interacting in different patterns (joint chat and hierarchical chat) to collaborate on tasks; (right) "Example Agent Chat" – a sample chat flow with system and agent messages plus a plotted output (a stock‑price style line chart) showing how agents request actions, handle errors (e.g., missing packages), install or call tools, and produce a final result. Below the figure is a brief abstract describing AutoGen as an open‑source framework that 