MarkItDown is a Python tool for converting files and office documents to Markdown.

Install MarkItDown, then restart the kernel (Cmd+Shift+P, then select "Jupyter: Restart Kernel").

See the [MarkItDown repository](https://github.com/microsoft/markitdown) for more information.

In [12]:
!uv add "markitdown[all]" --quiet
!uv sync

Resolved 149 packages in 3ms
Audited 142 packages in 2ms


In [1]:
from markitdown import MarkItDown
from pathlib import Path
from typing import Union

def convert_and_save(source_path: Union[str, Path]) -> Path:
    """
    Converts a file to Markdown and saves the result.
    
    Args:
        source_path: Path to the source file
        
    Returns:
        Path: Path to the created Markdown file
        
    Raises:
        FileNotFoundError: If the source file does not exist
        ValueError: If the source path is not a regular file
        PermissionError: If write permissions are lacking
    """
    md = MarkItDown()
    source = Path(source_path).expanduser().resolve()
    
    # Validations
    if not source.exists():
        raise FileNotFoundError(f"File {source} does not exist")
    
    if not source.is_file():
        raise ValueError(f"{source} is not a file")
    
    # Conversion
    result = md.convert(str(source))
    
    # Save
    output_path = source.with_suffix('.md')
    output_path.write_text(result.text_content, encoding='utf-8')
    
    return output_path
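
The output location in `convert_and_save` comes entirely from `Path.with_suffix('.md')`, so it helps to see how that rule treats a few kinds of names. The sketch below isolates it in a helper called `markdown_path_for`, which is a hypothetical name used only for illustration and is not part of MarkItDown. The file names are made up as well.

```python
from pathlib import Path


def markdown_path_for(source: Path) -> Path:
    """Hypothetical helper: same naming rule as convert_and_save."""
    # with_suffix replaces only the last suffix, or appends one if there is none.
    return source.with_suffix(".md")


# A single suffix is swapped for .md.
print(markdown_path_for(Path("slides.pdf")))       # slides.md

# Only the final suffix of a multi-suffix name is replaced.
print(markdown_path_for(Path("archive.tar.gz")))   # archive.tar.md

# A name without a suffix simply gains .md.
print(markdown_path_for(Path("notes")))            # notes.md

# A Markdown source maps onto itself, so writing the result would overwrite it.
readme = Path("README.md")
print(markdown_path_for(readme) == readme)         # True
```

One consequence is that any existing `.md` file with the same stem in the same folder is silently replaced. Checking `output_path.exists()` before writing is one possible safeguard.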


In [3]:
home = Path.home()
downloads = home / "Downloads"
source = downloads / "StructuredRagPyBay25.pdf"

output_file = convert_and_save(source)
print(f"Markdown file created: {output_file}")

Cannot set gray non-stroke color because /'P35' is an invalid float value
Cannot set gray non-stroke color because /'P43' is an invalid float value
Cannot set gray non-stroke color because /'P49' is an invalid float value
Cannot set gray non-stroke color because /'P55' is an invalid float value
Cannot set gray non-stroke color because /'P61' is an invalid float value
Cannot set gray non-stroke color because /'P67' is an invalid float value


Markdown file created: /Users/alain/Downloads/StructuredRagPyBay25.md
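
The "Cannot set gray non-stroke color" lines above are warnings, not errors, and the conversion still completes. If they clutter the output, one option is to raise the log level of the PDF parser before converting. This is a minimal sketch under the assumption that the messages come from the pdfminer.six library through a standard logger named `pdfminer`. That logger name is an assumption and is not confirmed by the output shown here.

```python
import logging

# Assumption: the warnings are logged under the "pdfminer" logger hierarchy.
# Raising the level to ERROR hides WARNING messages from it and its children
# (unless a child logger has its own explicit level set).
logging.getLogger("pdfminer").setLevel(logging.ERROR)
```

Run this cell before calling `convert_and_save`. If the warnings still appear, they are probably emitted through a different logger name or written directly to stderr.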
