thomas-yanxin

Organizations

@ColugoMum @X-D-Lab

Starred repositories

Roblox Foundation Model for 3D Intelligence

Jupyter Notebook 278 15 Updated Mar 21, 2025

A generative world for general-purpose robotics & embodied AI learning.

Python 24,471 2,135 Updated Mar 21, 2025
Python 44 9 Updated Feb 19, 2025

[CVPR 2025] Source codes for the paper "3D-Mem: 3D Scene Memory for Embodied Exploration and Reasoning"

Python 87 4 Updated Mar 16, 2025

DexGraspVLA: A Vision-Language-Action Framework Towards General Dexterous Grasping

Python 158 10 Updated Mar 20, 2025
Python 41 1 Updated Jan 13, 2025

VaViM and VaVAM: Autonomous Driving through Video Generative Modeling (official repository).

Jupyter Notebook 74 5 Updated Mar 7, 2025

The Python library for real-time communication

JavaScript 3,172 270 Updated Mar 21, 2025

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,068 77 Updated Mar 2, 2025

Tencent Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation

Python 3,347 255 Updated Jan 21, 2025

Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)

Python 69,168 8,519 Updated Mar 20, 2025

Preprocess Audio for training

Python 318 58 Updated Mar 3, 2025

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python 2,778 230 Updated Dec 5, 2024

High-quality, streaming Speech-to-Speech interactive agent in a single file: a streaming, full-duplex voice interaction prototype agent.

Python 375 36 Updated Mar 4, 2025

Data-Driven Astrology 💫 Kerykeion is a Python library for astrology. It generates SVG charts and extracts detailed data for birth charts, synastry, transits, and composite charts.

Python 373 128 Updated Mar 6, 2025

Quickly produce both human-readable and JSON-formatted astrology chart data based on the Swiss Ephemeris and astro.com.

Python 67 14 Updated Mar 17, 2025

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Python 53,786 8,907 Updated Aug 14, 2024

TEN Agent is a conversational voice AI agent powered by TEN, integrating Deepseek, Gemini, OpenAI, RTC, and hardware like ESP32. It enables realtime AI capabilities like seeing, hearing, and speaki…

Python 5,156 584 Updated Mar 21, 2025

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,860 196 Updated Nov 14, 2024

EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language Models (LLMs).

Jupyter Notebook 207 24 Updated Oct 30, 2024

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

Python 2,310 172 Updated Mar 19, 2025

Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs

Python 225 23 Updated Mar 20, 2025

End-to-end stack for WebRTC. SFU media server and SDKs.

Go 11,996 1,049 Updated Mar 21, 2025

A very comprehensive parallel corpus of Classical Chinese (ancient texts) and modern Chinese

Python 1,294 301 Updated Apr 21, 2024

[ICLR 2025] Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.

Python 1,275 56 Updated Mar 12, 2025
Python 6 Updated Sep 24, 2024

A pipeline for LLM knowledge distillation

Python 98 10 Updated Jan 25, 2025

✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,167 163 Updated Feb 13, 2025