Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 9,328 645 Updated Mar 27, 2025
Python 20 2 Updated Feb 16, 2025

A generative world for general-purpose robotics & embodied AI learning.

Python 24,591 2,149 Updated Mar 29, 2025

HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

Python 507 33 Updated Oct 16, 2024

PhD Dissertation Template for UNC Computer Science

TeX 5 2 Updated Feb 7, 2023

Implementation of the paper: "Answering Questions by Meta-Reasoning over Multiple Chains of Thought"

Python 94 12 Updated Jan 21, 2024

Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks

Python 2,097 305 Updated Mar 28, 2025

Jupyter notebook server extension to proxy web services.

Python 364 149 Updated Mar 17, 2025

PyTorch code for System-1.x: Learning to Balance Fast and Slow Planning with Language Models

Python 21 2 Updated Jul 22, 2024

Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.

Python 3,346 222 Updated Feb 11, 2025

A reading list of video generation

531 35 Updated Mar 25, 2025

LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR team.

JavaScript 401 33 Updated Feb 11, 2025
Python 3,878 253 Updated Mar 15, 2024

Official implementation of SEED-LLaMA (ICLR 2024).

Python 606 33 Updated Sep 21, 2024

Official implementation of Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model (ICLR 2025 Oral)

Python 429 14 Updated Feb 11, 2025

4M: Massively Multimodal Masked Modeling

Python 1,703 103 Updated Mar 7, 2025

[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…

Jupyter Notebook 7,205 455 Updated Mar 22, 2025

Rethinking Interactive Image Segmentation with Low Latency, High Quality, and Diverse Prompts (CVPR 2024)

Python 76 11 Updated Sep 28, 2024

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Python 3,261 282 Updated May 4, 2024

OCR, layout analysis, reading order, table recognition in 90+ languages

Python 16,990 1,108 Updated Mar 28, 2025

Official Code Repository for EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents (COLM 2024)

Python 29 2 Updated Jul 13, 2024

Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data

Python 33 2 Updated Mar 12, 2024

Code for the paper "pix2gestalt: Amodal Segmentation by Synthesizing Wholes" (CVPR 2024)

Python 162 12 Updated May 3, 2024

MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer

Python 220 12 Updated Apr 3, 2024

Machine Theory of Mind Reading List. Built upon EMNLP Findings 2023 Paper: Towards A Holistic Landscape of Situated Theory of Mind in Large Language Models

123 7 Updated Feb 18, 2025

[CVPR 2024] Official PyTorch implementation of "ECLIPSE: Revisiting the Text-to-Image Prior for Efficient Image Generation"

Python 62 10 Updated May 1, 2024
Python 1,799 60 Updated Jun 28, 2024

[CVPR 24] The repository provides code for running inference and training for "Segment and Caption Anything" (SCA), links for downloading the trained model checkpoints, and example notebooks / gra…

Python 217 10 Updated Sep 30, 2024