[CVPR'25] Official Implementation of MambaIC: State Space Models for High-Performance Learned Image Compression

20 1 Updated Mar 18, 2025

[ICLR2025 Oral] ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding

Jupyter Notebook 49 1 Updated Mar 21, 2025

Multimodal Large Language Models for Code Generation under Multimodal Scenarios

56 2 Updated Mar 20, 2025

Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods for large reasoning models. It contains papers, code, datasets, evaluations, and analyses.

171 4 Updated Mar 26, 2025

The official implementation of SRM-Hair.

26 Updated Mar 11, 2025

[CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents

Python 1,517 100 Updated Mar 28, 2025

Solve Visual Understanding with Reinforced VLMs

Python 4,407 271 Updated Mar 24, 2025

TokenSkip: Controllable Chain-of-Thought Compression in LLMs

Python 103 2 Updated Mar 13, 2025

Latest Advances on System-2 Reasoning

Python 862 33 Updated Mar 27, 2025

Official code repo for the paper "ChemToolAgent: The Impact of Tools on Language Agents for Chemistry Problem Solving" (previously "Tooling or Not Tooling? The Impact of Tools on Language Agents fo…

Python 8 4 Updated Mar 8, 2025

[ICLR 2025] Official Implementation of Local-Prompt: Extensible Local Prompts for Few-Shot Out-of-Distribution Detection

Python 17 2 Updated Mar 20, 2025

This repository collects awesome surveys, resources, and papers on lifelong learning for LLM agents

Python 153 11 Updated Feb 1, 2025
Python 1 Updated Jan 17, 2025

Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models

Python 49 2 Updated Jan 15, 2025

ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation

Python 29 1 Updated Mar 18, 2025

Agent Laboratory is an end-to-end autonomous research workflow meant to assist you as the human researcher toward implementing your research ideas

Python 4,088 589 Updated Mar 27, 2025

MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering

Python 655 84 Updated Jan 14, 2025

Langflow is a powerful tool for building and deploying AI-powered agents and workflows.

Python 53,341 5,844 Updated Mar 30, 2025

Learning to Use Medical Tools with Multi-modal Agent

Python 128 14 Updated Feb 15, 2025

The official implementation of S2TD-Face in ACM-MM 2024.

Python 36 Updated Nov 11, 2024

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 45,631 5,577 Updated Mar 28, 2025

Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.

Python 2,267 233 Updated Mar 29, 2025

[CVPR 2024] Official PyTorch Code for "PromptKD: Unsupervised Prompt Distillation for Vision-Language Models"

Python 284 4 Updated Mar 18, 2025

A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP.

484 21 Updated Mar 25, 2025

A curated list of recent and past chart understanding work based on our IEEE TKDE survey paper: From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Mod…

194 19 Updated Feb 23, 2025

The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model

Python 74 4 Updated Jan 27, 2025

ChartMimic: Evaluating LMM’s Cross-Modal Reasoning Capability via Chart-to-Code Generation

Python 104 1 Updated Jul 15, 2024

Reading list for research topics in multimodal machine learning

6,353 873 Updated Aug 20, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 22,025 2,416 Updated Aug 12, 2024
Python 1,464 110 Updated May 12, 2023