Support TSV (Tab-Separated Values) files #2022

@anandmt

Description

The CsvConverter currently only handles .csv files with comma delimiters. TSV (Tab-Separated Values) files are extremely common in data science, bioinformatics, spreadsheet exports, and LLM pipelines, but are not supported.

Expected behavior

MarkItDown should convert .tsv files to Markdown tables, just like it does for .csv files.

Proposed solution

Extend the existing CsvConverter to:

  1. Accept .tsv files and the text/tab-separated-values MIME type
  2. Auto-detect the delimiter using Python's built-in csv.Sniffer
  3. If detection fails, fall back to tab for .tsv files and comma for .csv files
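
The detection and fallback steps above could be sketched roughly as follows. This is a standalone illustration, not the actual CsvConverter code: the names `detect_delimiter`, `to_markdown_table`, and `FALLBACK_DELIMITERS` are hypothetical, and the sketch assumes the file content has already been decoded to a string. Only `csv.Sniffer` and `csv.reader` come from the standard library.

```python
import csv
import io

# Hypothetical fallback table keyed by file extension (not part of MarkItDown).
FALLBACK_DELIMITERS = {".tsv": "\t", ".csv": ","}


def detect_delimiter(text: str, extension: str) -> str:
    """Guess the delimiter with csv.Sniffer; fall back on the file extension."""
    try:
        # Restrict candidates to comma and tab so ordinary characters
        # are never mistaken for a delimiter.
        dialect = csv.Sniffer().sniff(text[:4096], delimiters=",\t")
        return dialect.delimiter
    except csv.Error:
        # Sniffer could not decide (e.g. a single-column file).
        return FALLBACK_DELIMITERS.get(extension.lower(), ",")


def to_markdown_table(text: str, extension: str) -> str:
    """Render delimited text as a Markdown table, using the first row as header."""
    delimiter = detect_delimiter(text, extension)
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    width = len(header)
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join(["---"] * width) + " |",
    ]
    for row in body:
        # Pad short rows and truncate long ones to match the header width.
        cells = (row + [""] * width)[:width]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_markdown_table("name\tage\nalice\t30\nbob\t25\n", ".tsv"))
```

Limiting the sniffer to comma and tab keeps detection predictable for these two formats; the extension-based fallback only applies when `csv.Sniffer` raises `csv.Error`. The sketch does not escape pipe characters inside cells, which a real converter would likely need to handle.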

PR

#2021
