Indexify

Indexify is a reactive structured extraction and indexing engine for unstructured data.

Applications that use LLMs for autonomous planning or for answering queries need their indexes updated promptly when the underlying data changes or when new extraction methods become available. Indexify enables both by applying feature extractors to data in real time and updating one or more indexes.

Why use Indexify

Makes unstructured data queryable with SQL and semantic search.
Real-time extraction engine that keeps indexes automatically updated as new data is ingested.
Extraction graphs describe data transformations and the extraction of embeddings and structured data.
Incremental extraction and selective deletion when content is deleted or updated.
The Extractor SDK lets you add new extraction capabilities, and many extractors are readily available for PDF, image and video indexing and extraction.
Works with any LLM framework, including Langchain, DSPy, etc.
Runs on your laptop during prototyping and also scales to 1000s of machines in the cloud.
Works with many blob stores, vector stores and structured databases.
We have even open sourced the automation for deploying to Kubernetes in production.

Detailed Getting Started

To get started, follow our documentation.

Quick Start

Download Indexify

curl https://tensorlake.ai | sh

Start the server

./indexify server -d

Install the Indexify Extractor and Client SDKs

pip install indexify indexify-extractors

Start an embedding extractor

indexify-extractor download hub://embedding/minilm-l6
indexify-extractor join-server minilm-l6.minilm_l6:MiniLML6Extractor

Upload some texts

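from indexify import IndexifyClient
# Create a client for the Indexify server started above
client = IndexifyClient()
# Bind the MiniLM-L6 embedding extractor to ingested content
client.add_extraction_policy(extractor="tensorlake/minilm-l6", name="minilml6")
# List the indexes available after adding the policy
client.indexes()
# Ingest two short documents
client.add_documents(["Adam Silver is the NBA Commissioner", "Roger Goodell is the NFL commissioner"])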

Search the Index

client.search_index(name="minilml6.embedding", query="NBA commissioner", top_k=1)
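
To illustrate what a top_k semantic search computes, here is a minimal, self-contained sketch of nearest-neighbor search by cosine similarity. It is not Indexify's implementation: the tiny hand-written vectors stand in for real embeddings, and the `cosine` and `top_k_search` helpers are hypothetical names used only for this example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_search(index, query_vec, top_k=1):
    """Return the top_k (text, score) pairs, most similar first."""
    scored = [(text, cosine(vec, query_vec)) for text, vec in index]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy 3-dimensional "embeddings" (assumed values); a real embedding
# extractor produces much longer vectors.
index = [
    ("Adam Silver is the NBA Commissioner", [0.9, 0.1, 0.2]),
    ("Roger Goodell is the NFL commissioner", [0.1, 0.9, 0.2]),
]
query_vec = [0.8, 0.2, 0.1]  # pretend embedding of "NBA commissioner"

results = top_k_search(index, query_vec, top_k=1)
print(results[0][0])
```

With real embeddings the ranking works the same way; only the vectors and the size of the index change.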

Use Extracted Data in Applications

You can now use the extracted data in your application. As Indexify ingests data, it automatically updates your indexes. We have an example of a Langchain application here.
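
As a rough sketch of how an application might consume search results, the snippet below turns retrieved passages into a prompt for an LLM. It assumes each result is a dictionary with a "text" field; that shape, the `build_prompt` helper and the sample data are assumptions made for illustration, so check the actual return value of `client.search_index` before relying on it.

```python
def build_prompt(question, results, text_key="text"):
    """Join retrieved passages into a context block placed ahead of the question."""
    context = "\n".join(f"- {r[text_key]}" for r in results)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Hypothetical results; in practice these would come from client.search_index(...)
results = [{"text": "Adam Silver is the NBA Commissioner"}]

prompt = build_prompt("Who is the NBA commissioner?", results)
print(prompt)
```

The resulting string can be passed to whichever LLM framework the application already uses.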

Try out Video, Audio or PDF Extractors

We have extractors for Video, Audio and PDF as well. You can list all the available extractors:

indexify-extractor list

Structured Data

Structured data that extractors produce from content, such as bounding boxes and object types or the line items of invoices, is stored in a structured store. You can query the extracted structured data using Indexify's SQL interface.
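
To give a feel for the kind of query this enables, here is a small stand-in that uses Python's built-in sqlite3 module rather than Indexify's SQL interface. The `invoice_items` table, its columns and the rows are all made up for illustration; the actual table and column names depend on the extractor and on how Indexify exposes its output.

```python
import sqlite3

# In-memory database standing in for a structured store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice_items (content_id TEXT, description TEXT, amount REAL)"
)

# Hypothetical line items that an invoice extractor might produce.
rows = [
    ("invoice-1", "Widget", 19.99),
    ("invoice-1", "Shipping", 5.00),
    ("invoice-2", "Gadget", 42.50),
]
conn.executemany("INSERT INTO invoice_items VALUES (?, ?, ?)", rows)

# Total amount per invoice, computed with plain SQL.
totals = conn.execute(
    "SELECT content_id, ROUND(SUM(amount), 2) FROM invoice_items "
    "GROUP BY content_id ORDER BY content_id"
).fetchall()
print(totals)
```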

We have an example here

Contributions

Please open an issue to discuss new features, or join our Discord group. Contributions are welcome, and there are a bunch of open tasks we could use help with!

If you want to contribute to the Rust codebase, please read the developer readme.

