Pinned

  1. sonic Public

    A blazingly fast JSON serializing & deserializing library

    Go 7.7k 367

  2. monoio Public

    Rust async runtime based on io_uring.

    Rust 4.3k 232

  3. UI-TARS-desktop Public

    A GUI agent application based on UI-TARS (a vision-language model) that lets you control your computer using natural language.

    TypeScript 7.9k 568

  4. LatentSync Public

    Taming Stable Diffusion for Lip Sync!

    Python 3.3k 487

Repositories

Showing 10 of 329 repositories
  • InfiniteYou Public

    🔥 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity

    Python 902 Apache-2.0 69 11 2 Updated Mar 25, 2025
  • videx Public

    Virtual Index for MySQL

    Python 20 1 1 0 Updated Mar 26, 2025
  • flowgram.ai Public
    TypeScript 709 MIT 50 7 2 Updated Mar 25, 2025
  • UI-TARS-desktop Public

    A GUI agent application based on UI-TARS (a vision-language model) that lets you control your computer using natural language.

    TypeScript 7,853 Apache-2.0 568 96 (4 issues need help) 7 Updated Mar 25, 2025
  • volclava Public

    A free and open-source workload scheduler which supports diverse high-performance computing and analytical applications.

    C 13 GPL-2.0 6 1 0 Updated Mar 25, 2025
  • MTVQA Public

    MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. A comprehensive evaluation of multimodal large model multilingual text perception and comprehension capabilities across nine widely-used yet low-resource languages.

    Python 54 2 3 0 Updated Mar 25, 2025
  • g3 Public

    Enterprise-oriented Generic Proxy Solutions

    Rust 553 Apache-2.0 43 19 (3 issues need help) 4 Updated Mar 25, 2025
  • tarsier Public

    Tarsier: a family of large-scale video-language models designed to generate high-quality video descriptions, with strong general video understanding capabilities.

    Python 331 Apache-2.0 19 19 1 Updated Mar 25, 2025
  • coap Public

    COAP is a memory-efficient training method that reduces computational overhead without sacrificing performance.

    Python 4 Apache-2.0 0 0 0 Updated Mar 25, 2025
  • LVLM_Interpretation Public

    The official repo for "Where do Large Vision-Language Models Look at when Answering Questions?"

    Python 26 GPL-3.0 0 2 0 Updated Mar 24, 2025