jefferyZhan/Griffon

Welcome to Griffon

This is the official repo of the Griffon series (v1 & v2). Griffon is the first high-resolution (over 1K) LVLM capable of localizing everything you are interested in and describing the region you specify. In the latest version, Griffon supports visual-language co-referring: you can input an image or some descriptions to refer to a target. Griffon achieves excellent performance in referring expression comprehension (REC), object detection, object counting, visual/phrase grounding and referring expression generation (REG).


Griffon: Spelling out All Object Locations at Any Granularity with Large Language Model

📕Paper 🌀Usage 🤗Model

Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring

📕Paper

News

  • 2024.03.15 🔥Griffon v2's paper has been released on 📕arXiv.
  • 2024.03.11 🔥We are excited to announce the arrival of Griffon v2. Griffon v2 brings fine-grained perception performance to new heights with high-resolution, expert-level detection and counting, and supports visual-language co-referring. Take a look at our demo first. The paper, code, demos and models will be released soon.
  • 2023.12.13 🔥The Language-prompted Localization Dataset is ready for release on 🤗HuggingFace, pending final approval.
  • 2023.12.06 🔥The inference code and model have been released on 🤗HuggingFace.
  • 2023.11.29 🔥The paper has been released on 📕arXiv.

What can Griffon do now?

Griffon v2 can now perform localization with free-form text inputs and with visual target inputs given as locally cropped images, supporting the tasks shown below.

Acknowledgement

  • LLaVA provides the base code and pre-trained models.
  • Shikra provides insight into how to organize datasets, along with some pre-processed base annotations.
  • Llama provides the large language model.
  • volgachen provides the basic environment configuration.

License

The data and checkpoints are licensed for research use only. They are also restricted to uses that follow the license agreements of LLaVA, LLaMA and GPT-4. The dataset is licensed under CC BY-NC 4.0 (allowing only non-commercial use), and models trained using the dataset should not be used outside of research purposes.
