Stars
LangBridge: Interpreting Image as a Combination of Language Embeddings
[CVPR 2025] Official inference code for the paper "BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation". Project page: https://bizgen-msra.github.io/
A suite of image and video neural tokenizers
Cosmos is a world model development platform that consists of world foundation models, tokenizers, and a video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
TokenBridge: Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation
[NeurIPS'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
The code for creating the iGSM datasets in the papers "Physics of Language Models Part 2.1, Grade-School Math and the Hidden Reasoning Process" (arXiv 2407.20311) and "Physics of Language Models Part 2…
WideRange4D: Enabling High-Quality 4D Reconstruction with Wide-Range Movements and Scenes
A Token-level Text Image Foundation Model for Document Understanding
This is the first paper to explore how to effectively use RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-start initialization and RL training to incentivize reasoning ca…
Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"
Official code for the paper "Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Models" (ICLR 2025 Oral)
[ICLR'25] Official code for the paper "MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs"
UniToken is an auto-regressive generation model that combines discrete and continuous representations to process visual inputs, making it easy to integrate both visual understanding and image gener…
[ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
🔥CVPR 2025 Multimodal Large Language Models Paper List
An easy-to-use, scalable, and high-performance RLHF framework designed for multimodal models.
Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, and exciting long2short methods for large reasoning models. It contains papers, code, datasets, evaluations, and analyses.
WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation
[NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models"
MM-EUREKA: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning