- Anytime.AI
- Santa Clara, CA
- http://academic.hugochan.net
- @chenyu_hugo
Stars
Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.
This repository contains various advanced techniques for Retrieval-Augmented Generation (RAG) systems.
Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
This repo provides the server side code for llmsherpa API to connect. It includes parsers for various file formats.
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
A Unified Toolkit for Deep Learning Based Document Image Analysis
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Build resilient language agents as graphs.
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
Finetune Llama 3.3, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
Knowledge Agents and Management in the Cloud
Implementation of Nougat Neural Optical Understanding for Academic Documents
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS ev…
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
Modular Python framework for AI agents and workflows with chain-of-thought reasoning, tools, and memory.
🐢 Open-Source Evaluation & Testing for AI & LLM systems
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
Salesforce open-source LLMs with 8k sequence length.
Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of …
Chinese Legal LLaMA (LLaMA for the Chinese legal domain)