diff --git a/api-reference/supported-file-types.mdx b/api-reference/supported-file-types.mdx
index 740aba45..e3fd06d3 100644
--- a/api-reference/supported-file-types.mdx
+++ b/api-reference/supported-file-types.mdx
@@ -2,6 +2,6 @@
 title: Supported file types
 ---
 
-import SupportedFileTypes from '/snippets/general-shared-text/supported-file-types.mdx';
+import SupportedFileTypes from '/snippets/general-shared-text/supported-file-types-platform.mdx';
 
 <SupportedFileTypes />
\ No newline at end of file
diff --git a/snippets/general-shared-text/supported-file-types-platform.mdx b/snippets/general-shared-text/supported-file-types-platform.mdx
index 7c7ed136..860c377e 100644
--- a/snippets/general-shared-text/supported-file-types-platform.mdx
+++ b/snippets/general-shared-text/supported-file-types-platform.mdx
@@ -1,4 +1,4 @@
-Unstructured supports processing of the following file types:
+The Unstructured user interface (UI) and Unstructured API support processing of the following file types:
 
 By file extension:
 
@@ -8,19 +8,14 @@ By file extension:
 | `.bmp` |
 | `.csv` |
 | `.cwk` |
-| `.dbf` |
-| `.dif` |
+| `.dif`[*](#notes) |
 | `.doc` |
-| `.docm` |
 | `.docx` |
 | `.dot` |
-| `.dotm` |
 | `.eml` |
 | `.epub` |
 | `.et` |
 | `.eth` |
-| `.fods` |
-| `.gif` |
 | `.heic` |
 | `.htm` |
 | `.html` |
@@ -29,16 +24,14 @@ By file extension:
 | `.jpg` |
 | `.md` |
 | `.mcw` |
+| `.msg` |
 | `.mw` |
-| `.odt` |
 | `.org` |
 | `.p7s` |
-| `.pages` |
 | `.pbd` |
 | `.pdf` |
 | `.png` |
 | `.pot` |
-| `.potm` |
 | `.ppt` |
 | `.pptm` |
 | `.pptx` |
@@ -46,23 +39,14 @@ By file extension:
 | `.rst` |
 | `.rtf` |
 | `.sdp` |
-| `.sgl` |
 | `.svg` |
 | `.sxg` |
 | `.tiff` |
 | `.txt` |
 | `.tsv` |
-| `.uof` |
-| `.uos1` |
-| `.uos2` |
-| `.web` |
-| `.webp` |
-| `.wk2` |
 | `.xls` |
-| `.xlsb` |
 | `.xlsm` |
 | `.xlsx` |
-| `.xlw` |
 | `.xml` |
 | `.zabw` |
 
@@ -70,25 +54,27 @@ By file type:
 
 | Category | File types |
 | --- | --- |
-| Apple | `.cwk`, `.mcw`, `.pages`
+| Apple | `.cwk`, `.mcw`
 | CSV | `.csv` |
-| Data interchange | `.dif` |
-| dBase | `.dbf` |
-| E-mail | `.eml`, `.p7s` |
+| E-mail | `.eml`, `.msg`, `.p7s` |
 | EPUB | `.epub` |
 | HTML | `.htm`, `.html` |
-| Image | `.bmp`, `.gif`, `.heic`, `.jpeg`, `.jpg`, `.png`, `.prn`, `.svg`, `.tiff`, `.webp` |
+| Image | `.bmp`, `.heic`, `.jpeg`, `.jpg`, `.png`, `.prn`, `.svg`, `.tiff` |
 | Markdown | `.md` |
 | Org Mode | `.org` |
-| Open Office | `.odt`, `.sgl` |
-| Other | `.eth`, `.mw`, `.pbd`, `.sdp`, `.uof`, `.web` |
+| Other | `.dif`[*](#notes), `.eth`, `.mw`, `.pbd`, `.sdp` |
 | PDF | `.pdf` |
 | Plain text | `.txt` |
-| PowerPoint | `.pot`, `.potm`, `.ppt`, `.pptm`, `.pptx` |
+| PowerPoint | `.pot`, `.ppt`, `.pptm`, `.pptx` |
 | reStructured Text | `.rst` |
 | Rich Text | `.rtf` |
-| Spreadsheet | `.et`, `.fods`, `.uos1`, `.uos2`, `.wk2`, `.xls`, `.xlsb`, `.xlsm`, `.xlsx`, `.xlw` |
+| Spreadsheet | `.et`, `.xls`, `.xlsm`, `.xlsx` |
 | StarOffice | `.sxg` |
 | TSV | `.tsv` |
-| Word processing | `.abw`, `.doc`, `.docm`, `.docx`, `.dot`, `.dotm`, `.hwp`, `.zabw` |
+| Word processing | `.abw`, `.doc`, `.docx`, `.dot`, `.hwp`, `.zabw` |
 | XML | `.xml` |
+
+## Notes
+
+* For `.dif`, `\n` characters in `.dif` files are supported, but `\r\n` characters will raise the error
+  `UnsupportedFileFormatError: Partitioning is not supported for the FileType.UNK file type`.
\ No newline at end of file
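
Reviewer note on the new `.dif` entry: the note says `\n` line endings are accepted and `\r\n` line endings raise `UnsupportedFileFormatError`. One possible workaround, not part of this change and not confirmed by the docs, is to rewrite the line endings before submitting the file. The sketch below uses only the Python standard library; the helper name `normalize_dif_line_endings` is hypothetical, and it assumes that converting every `\r\n` to `\n` leaves the `.dif` content otherwise valid.

```python
from pathlib import Path


def normalize_dif_line_endings(src: Path, dst: Path) -> int:
    """Copy src to dst, rewriting CRLF line endings as LF.

    Hypothetical helper, not an Unstructured API. Returns the number
    of CRLF sequences that were replaced.
    """
    # Work on raw bytes so no text decoding or newline translation interferes.
    data = src.read_bytes()
    crlf_count = data.count(b"\r\n")
    dst.write_bytes(data.replace(b"\r\n", b"\n"))
    return crlf_count
```

A return value of `0` means the file already used `\n` line endings and was copied unchanged. Writing to a separate `dst` path keeps the original file intact in case the conversion turns out not to help.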