Data transformation framework for AI. Ultra performant, with incremental processing.
Updated Jul 25, 2025 - Rust
Big Data and Machine Intelligence Course in Autumn 2019.
Patient intake form extraction using an LLM.
🌲 Improved Interval B+ tree implementation, in TS 🌲
This repository contains an application that recommends the scientific papers most similar to a given input paragraph. The application uses the llama and weaviate libraries to do this.
A zero-dependency library of classes that make filtering, sorting and observing changes to arrays easier and more efficient.
System for managing the data generated by the SEAGrid Science Gateway.
A store designed to hold and retrieve high-dimensional data, such as embeddings, efficiently. It enables fast similarity searches.
Datafast Runtime is a high-performance subgraph processing runtime, written from scratch and designed to handle subgraphs with high speed and storage efficiency.
BORDS is an open-access reaction search engine that leverages Google's Open Reaction Database to provide ultra-fast, comprehensive access to millions of chemical reactions. Built with a modern cloud stack, it streamlines reaction data extraction, transformation, and indexing for researchers in chemistry and related fields.
Time series analysis covering trend, seasonality, and periodicity decomposition, plus forecasting with Facebook Prophet. The analysis makes extensive use of data indexing tools and of the Pandas and Datetime libraries.
Python implementation of a TF-IDF/cosine-based search engine.