@DS4SD

IBM Deep Search

Developer tools for IBM Deep Search

Welcome to IBM Deep Search

Deep Search extracts and structures data from documents in four steps: Parse, Interpret, Index, and Integrate. You can try the first steps on our public system, which hosts a live PDF-to-JSON inspector. The inspector shows how your (programmatic) PDF documents are converted into JSON.

Deep Search also provides programmatic access to the service, for easy integration with other tools or for bulk conversion. Our Python toolkit offers this functionality both as a client and as a library. Our examples repository is a good place to get started.
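As a rough illustration of what bulk conversion can look like on the calling side, the sketch below walks a folder, hands each PDF to a converter, and writes one JSON file per document. It uses only the Python standard library. The names `find_pdfs`, `bulk_convert`, `fake_convert`, and `demo` are hypothetical and are not part of the toolkit's API; `convert_one` stands in for whichever real client or library call you end up using.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def find_pdfs(root: Path) -> List[Path]:
    """Return all PDF files under root, sorted for a stable order."""
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix.lower() == ".pdf"
    )


def bulk_convert(
    root: Path, out_dir: Path, convert_one: Callable[[Path], Dict]
) -> List[Path]:
    """Convert every PDF under root and write one JSON file per document.

    convert_one is a hypothetical placeholder: swap in the real call that
    turns a PDF path into a JSON-serializable dict.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf in find_pdfs(root):
        result = convert_one(pdf)
        # Output is named after the file stem, so same-named PDFs in
        # different subfolders would overwrite each other in this sketch.
        target = out_dir / (pdf.stem + ".json")
        target.write_text(json.dumps(result, indent=2), encoding="utf-8")
        written.append(target)
    return written


def fake_convert(pdf: Path) -> Dict:
    """Stand-in converter, used only to make this sketch runnable."""
    return {"file": pdf.name, "size_bytes": pdf.stat().st_size}


def demo() -> List[str]:
    """Run the sketch on dummy files in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "in"
        root.mkdir()
        (root / "a.pdf").write_bytes(b"%PDF-1.4 dummy")
        (root / "b.PDF").write_bytes(b"%PDF-1.4 dummy")
        (root / "notes.txt").write_text("not a pdf")
        outputs = bulk_convert(root, Path(tmp) / "out", fake_convert)
        return [p.name for p in outputs]


if __name__ == "__main__":
    print(demo())
```

Passing the converter in as a callable keeps the file handling separate from the conversion itself, so the same loop works whether the conversion runs against the hosted service or a local library.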


Publications

See our extensive list of publications.

Gallery

  • Image extraction
  • Table understanding
  • List resolution
  • Math formula
  • Complex layout
  • Colored layout

Pinned

  1. docling Public

    Get your documents ready for gen AI

    Python; 23.5k stars; 1.4k forks

  2. deepsearch-toolkit Public

    Interact with the Deep Search platform for new knowledge explorations and discoveries

    Python; 173 stars; 24 forks

  3. deepsearch-examples Public

    Examples using the Deep Search functionalities

    Python; 67 stars; 21 forks

  4. DocLayNet Public

    DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis

    323 stars; 17 forks

Repositories

Showing 10 of 29 repositories
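  • docling-eval Public
    Python; 8 stars; MIT license; 2 forks; 0 issues; 4 pull requests; updated Mar 8, 2025
  • docling-core Public

    A Python library to define and validate data types in Docling.

    Python; 77 stars; MIT license; 32 forks; 13 issues; 6 pull requests; updated Mar 7, 2025
  • docling-serve Public

    Running Docling as an API service

    Python; 130 stars; MIT license; 28 forks; 21 issues; 6 pull requests; updated Mar 7, 2025
  • docling Public

    Get your documents ready for gen AI

    Python; 23,522 stars; MIT license; 1,368 forks; 176 issues (9 issues need help); 15 pull requests; updated Mar 7, 2025
  • Python; 1 star; MIT license; 0 forks; 1 issue; 0 pull requests; updated Feb 28, 2025
  • Python; 79 stars; MIT license; 12 forks; 11 issues; 2 pull requests; updated Feb 28, 2025
  • PatCID Public
    Python; 46 stars; MIT license; 2 forks; 3 issues; 0 pull requests; updated Feb 26, 2025
  • MolGrapher Public

    MolGrapher: Graph-based Visual Recognition of Chemical Structures

    Python; 64 stars; MIT license; 4 forks; 3 issues; 0 pull requests; updated Feb 22, 2025
  • docling-ts Public

    Use Docling output in TypeScript and JavaScript

    TypeScript; 2 stars; MIT license; 0 forks; 1 issue; 0 pull requests; updated Feb 18, 2025
  • docling-parse Public

    Simple package to extract text with coordinates from programmatic PDFs

    C++; 75 stars; MIT license; 15 forks; 8 issues; 2 pull requests; updated Feb 18, 2025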
