Stars
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
Build resilient language agents as graphs.
Fully local web research and report writing assistant
Official Dockerfile for Delta Lake
Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data
A minimal Docker Compose setup for experimenting with cloud-agnostic Lakehouse architectures: Apache Spark with Hive Metastore + Delta Lake + MinIO
This Python class will have you up and running with Delta Lake in no time. It includes methods, aimed particularly at beginners, that make it simple to insert, update, and delete data in Delta Lake.
universal-datalakehouse-postgres-ingestion-deltastreamer
Enables synchronizing metadata changes (Create/Drop table/partition) from Hive Metastore to AWS Glue Data Catalog
Jan is an open source alternative to ChatGPT that runs 100% offline on your computer
Nuclia RAG-as-a-Service automatically indexes files and documents from internal and external sources to fuel diverse company use cases with LLMs.
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.
Connecting Azure Databricks securely with PowerApps/Flow. The same approach can be used for Logic Apps.
MinIO is a high-performance, S3-compatible object store, open-sourced under the GNU AGPLv3 license.
DAOS Storage Stack (client libraries, storage engine, control plane)
Source code for Lotus: Scalable Multi-Partition Transactions on Single-Threaded Partitioned Databases
Tools for distributed alignment of massive images
Open Control Plane for Tables in Data Lakehouse
Open source alternative to AWS. Elastic compute, block storage (non-replicated), firewall and load balancer, managed Postgres, and IAM services in public beta.
Core Fiberplane data models and methods for transforming them (templates, providers, markdown conversion)
Amazon Redshift Utils contains utilities, scripts, and views that are useful in a Redshift environment