View CVHub520's full-sized avatar
🎯
Focusing

Integrate the DeepSeek API into popular software

29,756 3,223 Updated Mar 21, 2025

Use the Moondream 2 model to detect faces and their gaze directions in videos.

Python 39 2 Updated Jan 13, 2025

[CVPR 2025] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

Python 1,234 154 Updated Mar 15, 2025

Industry leading face manipulation platform

Python 22,084 3,341 Updated Mar 13, 2025

Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.

Python 223 13 Updated Feb 27, 2025

Official implementation of the paper "Watermark Anything with Localized Messages"

Jupyter Notebook 976 39 Updated Mar 18, 2025

🎬 ScreenToGif allows you to record a selected area of your screen, edit and save it as a gif or video.

C# 24,580 2,227 Updated Mar 17, 2025

Python tool for converting files and office documents to Markdown.

Python 41,116 1,942 Updated Mar 21, 2025

[CVPR 2025] DEIM: DETR with Improved Matching for Fast Convergence

Python 533 84 Updated Mar 12, 2025

Segment Anything Model 2 CPP Wrapper for macOS and Ubuntu CPU/GPU

C++ 122 12 Updated Nov 28, 2024

Retrieval and Retrieval-augmented LLMs

Python 9,060 651 Updated Mar 20, 2025

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 59 2 Updated Mar 10, 2025

Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding

Python 168 8 Updated Jan 24, 2025

High-performance C++ headers for real-time object detection and segmentation using YOLO models, leveraging ONNX Runtime and OpenCV for seamless integration. Supports multiple YOLO (v5, v7, v8, v9…

C++ 455 53 Updated Mar 17, 2025

Implementation of the "Learn No to Say Yes Better" paper.

Python 30 2 Updated Mar 3, 2025

Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"

Python 6,626 426 Updated Mar 18, 2025

A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python

Python 18,305 2,527 Updated Mar 14, 2025

JavaScript 3,035 1,158 Updated Jun 21, 2024

:electron: Another Mihomo GUI.

TypeScript 10,682 754 Updated Mar 3, 2025

📄 A curated list of awesome .cursorrules files

17,217 1,199 Updated Mar 20, 2025

Let your Claude be able to think

TypeScript 14,772 1,716 Updated Mar 10, 2025

[TMLR 2025🔥] A survey of autoregressive models in vision.

444 15 Updated Mar 22, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 5,821 565 Updated Mar 22, 2025

ModelScope: bring the notion of Model-as-a-Service to life.

Python 7,588 781 Updated Mar 21, 2025

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,875 129 Updated Oct 30, 2024

A Toolkit to Help Optimize ONNX Models

Python 125 13 Updated Mar 22, 2025

Effortless data labeling with AI support from Segment Anything and other awesome models.

Python 5,097 578 Updated Feb 26, 2025

Official code repo for the O'Reilly Book - "Hands-On Large Language Models"

Jupyter Notebook 5,742 1,264 Updated Feb 15, 2025

The repository provides code for running image, video, and camera inference using SAM 2.

Jupyter Notebook 5 Updated Sep 25, 2024