
About

Nlmatics extracts data from large document sets using retrieval-augmented generation (RAG). It can also be used for RAG search on knowledge bases. It comes with an extensive UI for search, data extraction and PDF viewing. It ingests documents using the llmsherpa/nlm-ingestor backend and indexes them in Elasticsearch, from which they are retrieved using a hybrid search approach.
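
As a rough illustration of what a hybrid search approach can look like, the sketch below merges a keyword (BM25) ranking and a vector ranking into a single result list. The description above does not say how Nlmatics combines the two result sets, so reciprocal rank fusion is used here purely as an assumed stand-in; the function name, the document ids and the constant `k=60` are all hypothetical and not taken from the Nlmatics codebase.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Assumption: this is a generic fusion technique, not the Nlmatics method.
    Each document earns 1 / (k + rank) from every list it appears in.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists, best match first.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while documents found by only one retriever still appear further down the fused list.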

Credits

Nlmatics was founded by Ambika Sukla and Bulent Yener.

Nlmatics developed an early RAG-like question answering, semantic search and data extraction pipeline using layout-aware chunking, vector + BM25 indexing and language models. The open source codebase was developed from 2020 to 2023 by Yi Zhang, Ambika Sukla, Kiran Panicker, Niranjan Borawake, Suhail Kandanur, Wonjun Kang, Reshav Abraham, Nima Sheikholeslami, Lora Johns, Jasmin Omanovic, Karen Reeves, Sonia Joseph, Evan Li, Batya Stein, Cheyenne Zhang, Ashlan Ahmed, Nicholas Greenspan, Connie Xu, Shivangi Jha and others, with product management support from Pooja Reddy, Ambika Sukla and Jan Choy.

Nlmatics is thankful to have worked with prominent early adopters in financial services, legal services and life sciences who recognized and leveraged our technology well before the current wave of generative AI.

Nlmatics raised seed funding from Felix Anthony, Silvertech Ventures, World Trade Ventures and ERS Ventures.

Popular repositories

  1. llmsherpa Public

    Developer APIs to Accelerate LLM Projects

    Jupyter Notebook 1.1k 106

  2. nlm-ingestor Public

    This repo provides the server side code for llmsherpa API to connect. It includes parsers for various file formats.

    Python 887 96

  3. nlm-tika Public

    Java 13 11

  4. nlm-app Public

    Frontend code of nlmatics search and data extraction application

    JavaScript 4 2

  5. nlm-utils Public

    Common utilities used by all nlm-* libraries.

    Python 2 7

  6. nlmatics.github.io Public

    1 4

Repositories

Showing 10 of 17 repositories
  • nlm-ingestor Public

    This repo provides the server side code for llmsherpa API to connect. It includes parsers for various file formats.

    Python 887 Apache-2.0 96 47 2 Updated Jun 17, 2024
  • llmsherpa Public

    Developer APIs to Accelerate LLM Projects

    Jupyter Notebook 1,109 MIT 106 49 1 Updated Jun 13, 2024
  • nlm-tika Public
    Java 13 Apache-2.0 11 2 1 Updated Jun 12, 2024
  • nlm-discovery-engine Public

    Code to run the nlmatics retrieval pipeline

    Python 1 Apache-2.0 1 0 7 Updated Jun 3, 2024
  • nlm-services Public

    API and Server side code to run nlmatics app

    Python 0 Apache-2.0 2 0 0 Updated May 27, 2024
  • nlm-app Public

    Frontend code of nlmatics search and data extraction application

    JavaScript 4 Apache-2.0 2 0 0 Updated Apr 4, 2024
  • .github Public
    0 0 0 0 Updated Mar 29, 2024
  • nlm-model-service
    Python 1 Apache-2.0 1 0 0 Updated Mar 26, 2024
  • nlm-utils Public

    Common utilities used by all nlm-* libraries.

    Python 2 Apache-2.0 7 1 0 Updated Jan 19, 2024
  • w2n Public
    Python 0 MIT 0 0 0 Updated Feb 28, 2023
