1a1a11a

Organizations

@cacheMon

Starred repositories

A tiny yet powerful LLM inference system tailored for research purposes. vLLM-equivalent performance with only 2k lines of code (2% of vLLM).

Python 154 13 Updated Jul 5, 2024

Composable building blocks to build Llama Apps

Python 7,589 956 Updated Mar 29, 2025

New file format for storage of large columnar datasets.

C++ 497 39 Updated Mar 20, 2025

Ollama Python library

Python 7,099 633 Updated Mar 20, 2025

A C implementation of the SIEVE cache eviction algorithm, based on the research paper (https://junchengyang.com/publication/nsdi24-SIEVE.pdf)

Makefile 2 Updated Jan 22, 2025

A distributed computation platform for running Python and Bash computation tasks on multiple nodes

Python 9 2 Updated Mar 19, 2025

[NeurIPS 2024] FM-Delta: Lossless Compression for Storing Massive Fine-tuned Foundation Models

C++ 3 Updated Nov 17, 2024

PyTorch per step fault tolerance (actively under development)

Python 272 25 Updated Mar 27, 2025

A practical introduction to Rust

Rust 23 5 Updated Nov 20, 2024

VideoSys: An easy and efficient system for video generation

Python 1,949 129 Updated Mar 9, 2025

AllenAI's post-training codebase

Python 2,843 368 Updated Mar 28, 2025

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,487 243 Updated Feb 20, 2025

A cheatsheet of modern C++ language and library features.

20,240 2,144 Updated Oct 15, 2024

[IMC 2020 (Best Paper Finalist)] Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions

Python 305 74 Updated Nov 3, 2023

DCPerf benchmark suite for hyperscale cloud applications

Python 161 21 Updated Mar 28, 2025

[CVPR 2023] DepGraph: Towards Any Structural Pruning

Python 2,946 347 Updated Mar 24, 2025

Tools for profiling the Linux network stack.

Python 148 18 Updated Oct 21, 2022

Minimalistic large language model 3D-parallelism training

Python 1,733 167 Updated Mar 28, 2025

Inference code for Llama models

Python 57,958 9,720 Updated Jan 26, 2025

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 14,713 1,235 Updated May 23, 2024

NVIDIA GPUDirect Storage Driver

C 231 36 Updated Dec 11, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 14,758 1,597 Updated Dec 25, 2024

grep for words with similar meaning to the query

Go 1,151 27 Updated Aug 19, 2024

A simple, high-throughput file client for mounting an Amazon S3 bucket as a local file system.

Rust 5,020 193 Updated Mar 28, 2025

Implementation of a new caching algo called SIEVE. Link to paper included in README

Python 1 Updated Jul 1, 2024

SIEVE Cache for Crystal lang

Crystal 2 Updated May 25, 2024

Cache implementation in ABAP

ABAP 4 1 Updated Jul 18, 2024

Retrieval and Retrieval-augmented LLMs

Python 9,145 657 Updated Mar 20, 2025

A web app for ranking computer science departments according to their research output in selective venues, and for finding active faculty across a wide range of areas.

Python 2,817 3,434 Updated Mar 8, 2025