Qwen2.5-Omni is an end-to-end multimodal model by the Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, and video, and of performing real-time speech generation.

Jupyter Notebook 971 41 Updated Mar 27, 2025

[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward

Python 852 60 Updated Feb 16, 2025

SpatialLM: Large Language Model for Spatial Understanding

Python 2,466 171 Updated Mar 27, 2025

Code of LHM: Large Animatable Human Reconstruction Model for Single Image to 3D in Seconds

Python 1,152 77 Updated Mar 27, 2025

A community-driven AI automation framework that builds upon the incredible work of the open source community. Our goal is to combine language models with specialized tools for tasks like web search…

Python 5,295 564 Updated Mar 26, 2025

Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning

Jupyter Notebook 150 8 Updated Sep 26, 2022

[CVPR 2025] VGGT: Visual Geometry Grounded Transformer

Python 3,373 194 Updated Mar 25, 2025
875 25 Updated Mar 12, 2025

A fast MoE impl for PyTorch

Python 1,683 194 Updated Feb 10, 2025

MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.

Jupyter Notebook 7,283 464 Updated Nov 6, 2024

⏰ AI conference deadline countdowns

TypeScript 246 33 Updated Mar 15, 2025

Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.

Python 4,954 482 Updated Mar 27, 2025

Wan: Open and Advanced Large-Scale Video Generative Models

Python 9,199 996 Updated Mar 26, 2025

Solve Visual Understanding with Reinforced VLMs

Python 4,374 267 Updated Mar 24, 2025

[CVPR 2025] A Hierarchical Movie Level Dataset for Long Video Generation

Python 52 2 Updated Mar 16, 2025

The first open autoregressive foundational video AI model.

2,879 487 Updated Oct 14, 2024

[IJCV 2024] LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models

Python 919 61 Updated Nov 13, 2024

SkyReels V1: The first and most advanced open-source human-centric video foundation model

Python 1,903 178 Updated Mar 10, 2025
Python 1,096 88 Updated Jan 8, 2025

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python 2,797 230 Updated Dec 5, 2024

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 19,081 1,374 Updated Mar 3, 2025

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

Python 7,357 566 Updated Mar 20, 2025
Python 4,079 327 Updated Mar 12, 2025

Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.

Python 1,692 192 Updated Jan 16, 2025

An open source deep research clone. AI Agent that reasons large amounts of web data extracted with Firecrawl

TypeScript 5,177 635 Updated Feb 23, 2025

🤗 smolagents: a barebones library for agents that think in python code.

Python 15,861 1,400 Updated Mar 26, 2025

Magic to turn Cursor/Windsurf into 90% of Devin

Python 5,202 689 Updated Mar 22, 2025

Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 11,377 1,439 Updated Mar 10, 2025