Everything about the SmolLM2 and SmolVLM family of models

Python 2,060 117 Updated Mar 27, 2025

Doge family of small language models

Python 121 11 Updated Mar 26, 2025

[ICLR 2025] SDTT: a simple and effective distillation method for discrete diffusion models

Python 21 2 Updated Jan 22, 2025

Fast inference from large language models via speculative decoding

Python 695 68 Updated Aug 22, 2024

The attention map viewer for LLaMA models.

Python 31 3 Updated Dec 16, 2023

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 37,642 4,319 Updated Mar 27, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 42,875 6,505 Updated Mar 27, 2025

Official PyTorch implementation for ICLR2025 paper "Scaling up Masked Diffusion Models on Text"

Python 142 10 Updated Dec 22, 2024

ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale

C++ 333 126 Updated Feb 23, 2025

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Python 430 27 Updated Mar 25, 2025

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

Python 1,221 73 Updated Mar 6, 2025

Official PyTorch implementation for "Large Language Diffusion Models"

Python 1,340 99 Updated Mar 13, 2025

[ICLR 2025] PEARL: Parallel Speculative Decoding with Adaptive Draft Length

Python 62 2 Updated Mar 12, 2025

An AI Hedge Fund Team

Python 19,470 3,546 Updated Mar 25, 2025

[ICML 2024] CLLMs: Consistency Large Language Models

Python 388 17 Updated Nov 16, 2024

Codebase of Truncated Consistency Models (ICLR 2025)

Python 19 1 Updated Jan 24, 2025

Layer- and Timestep-Adaptive Differentiable Token Compression Ratios for Efficient Diffusion Transformers

6 Updated Mar 12, 2025

Best practices for distilling large language models.

Jupyter Notebook 510 37 Updated Feb 1, 2024

s1: Simple test-time scaling

Python 6,076 709 Updated Mar 6, 2025

Development repository for the Triton language and compiler

MLIR 15,005 1,889 Updated Mar 27, 2025

[CVPR 2025] Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone

Python 1,202 59 Updated Mar 26, 2025

ConceptAttention: A method for interpreting multi-modal diffusion transformers.

Jupyter Notebook 188 6 Updated Mar 11, 2025

Helpful tools and examples for working with flex-attention

Python 699 38 Updated Mar 18, 2025

Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling".

Python 227 19 Updated Feb 19, 2025

Accepted paper lists for some conferences (including AI, ML, Robotics)

Python 1,093 76 Updated Jan 23, 2025
Python 4 Updated Jan 28, 2025

This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicitation and Distillation Algorithms, and explore the Skill & V…

949 57 Updated Mar 9, 2025

Examples and guides for using the OpenAI API

MDX 62,505 10,117 Updated Mar 27, 2025

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 8,347 519 Updated May 3, 2024