An open-source implementation for fine-tuning the Qwen2-VL and Qwen2.5-VL series from Alibaba Cloud.

Python 523 57 Updated Mar 27, 2025

This repository offers a comprehensive collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-e…

Jupyter Notebook 7,461 1,166 Updated Mar 24, 2025

Fine-tuning Qwen2.5-VL for vision-language tasks | Optimized for vision understanding | LoRA & PEFT support.

Python 31 3 Updated Feb 7, 2025
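The two fine-tuning entries above both rely on Hugging Face PEFT. Below is a minimal sketch of attaching a LoRA adapter to a Qwen2.5-VL checkpoint, assuming a recent transformers release that ships Qwen2_5_VLForConditionalGeneration; the checkpoint name, rank, and target modules are illustrative choices, not taken from either repository.

    import torch
    from transformers import Qwen2_5_VLForConditionalGeneration
    from peft import LoraConfig, get_peft_model

    # Illustrative checkpoint; the listed repos may default to a different size.
    base = Qwen2_5_VLForConditionalGeneration.from_pretrained(
        "Qwen/Qwen2.5-VL-3B-Instruct",
        torch_dtype=torch.bfloat16,
    )

    lora_cfg = LoraConfig(
        r=16,                                 # adapter rank
        lora_alpha=32,                        # scaling factor
        lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],  # assumed attention projections
        task_type="CAUSAL_LM",
    )

    model = get_peft_model(base, lora_cfg)
    model.print_trainable_parameters()  # only the small LoRA matrices are trainable

The wrapped model can then be passed to a standard Trainer loop; only the adapter weights receive gradients, which is what keeps LoRA fine-tuning memory-friendly.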

Frontier Multimodal Foundation Models for Image and Video Understanding

Jupyter Notebook 675 43 Updated Mar 21, 2025

Evaluating and reproducing real-world robot manipulation policies (e.g., RT-1, RT-1-X, Octo) in simulation under common setups (e.g., Google Robot, WidowX+Bridge) (CoRL 2024)

Jupyter Notebook 553 76 Updated Feb 25, 2025

🙃 A delightful community-driven (with 2,400+ contributors) framework for managing your zsh configuration. Includes 300+ optional plugins (rails, git, macOS, hub, docker, homebrew, node, php, python…

Shell 177,083 26,037 Updated Mar 25, 2025

Cosmos-Reason1 models understand physical common sense and generate appropriate embodied decisions in natural language through long chain-of-thought reasoning.

193 2 Updated Mar 28, 2025

NVIDIA Isaac GR00T N1 is the world's first open foundation model for generalized humanoid robot reasoning and skills.

Jupyter Notebook 2,882 344 Updated Mar 24, 2025

Jupyter Notebook 11 1 Updated Mar 11, 2025

Solve Visual Understanding with Reinforced VLMs

Python 4,380 268 Updated Mar 24, 2025

Embodied Reasoning Question Answer (ERQA) Benchmark

Python 102 3 Updated Mar 12, 2025

Standard Open Arm 100

CMake 1,196 88 Updated Mar 25, 2025

RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation

Python 1,035 104 Updated Mar 13, 2025

🔥 SpatialVLA: a spatially enhanced vision-language-action model trained on 1.1 million real robot episodes.

Python 177 9 Updated Mar 21, 2025

GitHub homepage

3 Updated Mar 27, 2025

⚡ Dynamically generated stats for your GitHub READMEs

JavaScript 72,205 24,183 Updated Mar 27, 2025

Scripts for converting OpenX (RLDS) datasets to the LeRobot dataset format.

Python 65 3 Updated Mar 18, 2025

🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning

Python 11,130 1,223 Updated Mar 27, 2025

Python 242 2 Updated Mar 27, 2025

Re-implementation of the pi0 vision-language-action (VLA) model from Physical Intelligence.

Python 779 46 Updated Jan 31, 2025

OpenVLA: An open-source vision-language-action model for robotic manipulation.

Python 2,370 300 Updated Mar 23, 2025
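For orientation, here is a sketch of loading the released OpenVLA checkpoint through transformers. The predict_action helper and the unnorm_key argument follow the project's published quickstart as best recalled; treat them as assumptions and defer to the repository's README. The camera image here is a blank placeholder.

    import torch
    from PIL import Image
    from transformers import AutoModelForVision2Seq, AutoProcessor

    model_id = "openvla/openvla-7b"  # released checkpoint on the Hugging Face Hub
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
    vla = AutoModelForVision2Seq.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,
    ).to("cuda:0")

    # Placeholder image standing in for a real camera frame.
    image = Image.new("RGB", (224, 224))
    prompt = "In: What action should the robot take to pick up the cup?\nOut:"

    inputs = processor(prompt, image).to("cuda:0", dtype=torch.bfloat16)
    # predict_action returns an end-effector action, de-normalized with the
    # statistics of the named training mix (assumed key; see the repo for options).
    action = vla.predict_action(**inputs, unnorm_key="bridge_orig", do_sample=False)
    print(action)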

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 16,903 2,211 Updated Feb 1, 2025

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Python 4,632 1,691 Updated Feb 26, 2025

Qwen2.5-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.

Jupyter Notebook 9,259 641 Updated Mar 27, 2025
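A sketch of single-image chat with Qwen2.5-VL via transformers, following the pattern in the model's public documentation; it assumes a transformers version that includes Qwen2_5_VLForConditionalGeneration plus the companion qwen-vl-utils package, and the image URL is a placeholder.

    import torch
    from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
    from qwen_vl_utils import process_vision_info

    model_id = "Qwen/Qwen2.5-VL-3B-Instruct"
    model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    processor = AutoProcessor.from_pretrained(model_id)

    messages = [{
        "role": "user",
        "content": [
            {"type": "image", "image": "https://example.com/demo.jpg"},  # placeholder URL
            {"type": "text", "text": "Describe this image."},
        ],
    }]

    # Build the chat prompt and collect the vision inputs referenced in it.
    text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    image_inputs, video_inputs = process_vision_info(messages)
    inputs = processor(
        text=[text], images=image_inputs, videos=video_inputs,
        padding=True, return_tensors="pt",
    ).to(model.device)

    output_ids = model.generate(**inputs, max_new_tokens=128)
    # Strip the prompt tokens before decoding the generated answer.
    answer = processor.batch_decode(
        output_ids[:, inputs.input_ids.shape[1]:], skip_special_tokens=True
    )[0]
    print(answer)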

Fully open reproduction of DeepSeek-R1

Python 23,400 2,127 Updated Mar 27, 2025

A generative world for general-purpose robotics & embodied AI learning.

Python 24,571 2,148 Updated Mar 27, 2025