Unstructured.IO: ETL for LLMs

Welcome to Unstructured.IO! We're here on a mission to make all of your documents available for LLM applications, from PDFs and Word Docs to emails and markdown. To get started, check out our open source offerings.

Tried the open source library and ready for more power? Check out our products page to learn more about our paid API and the Unstructured Platform, an ETL tool built around our core file transformation capabilities.

Learn more

Section | Description
Company Website | Unstructured.io product and company info
Documentation | Full unstructured documentation

Popular repositories

  1. unstructured (Public)

    Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

    HTML 6.4k 480

  2. unstructured-api (Public)

    Python 347 76

  3. pipeline-sec-filings (Public)

    Preprocessing pipeline notebooks and API supporting text extraction from SEC documents

    Jupyter Notebook 130 24

  4. unstructured-inference (Public)

    Python 107 29

  5. unstructured-python-client (Public)

    A Python client for the Unstructured hosted API

    Python 42 8

  6. unstructured-api-tools (Public archive)

    Python 28 9

Repositories

Showing 10 of 27 repositories
