A curated list of general AI methods for Anything: AnyObject, AnyGeneration, AnyModel, AnyTask, etc.
Contributions are welcome!
- Awesome-Anything
- AnyObject - Segmentation, Detection, Classification, etc.
- AnyGeneration - Text-to-Image Generation, Editing, Inpainting.
- AnyTask - LLM Controller + ModelZoo, General Decoding, Multi-Task Learning.
- AnyModel - Network Pruning, Network Quantization, Model Reuse.
- AnyX - Other Topics: Captioning, etc.
- Paper List
Title & Authors | Intro | Useful Links |
---|---|---|
Segment Anything. Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alex Berg, Wan-Yen Lo, Piotr Dollar, Ross Girshick. Preprint'23. [Segment Anything (Project)] | | [Github] [Page] [Demo] |
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection. Shilong Liu, Zhaoyang Zeng, Tianhe Ren, Feng Li, Hao Zhang, Jie Yang, Chunyuan Li, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang. Preprint'23. [Grounded-SAM, GroundingDINO (Project)] | | [Github] [Demo] |
SegGPT: Segmenting Everything In Context. Xinlong Wang, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, Tiejun Huang. Preprint'23. [SegGPT (Project)] | | [Github] |
V3Det: Vast Vocabulary Visual Detection Dataset. Jiaqi Wang, Pan Zhang, Tao Chu, Yuhang Cao, Yujie Zhou, Tong Wu, Bin Wang, Conghui He, Dahua Lin. Preprint'23 | | -- |
segment-anything-video (Project). Kadir Nar | | [Github] |
Towards Segmenting Anything That Moves. Achal Dave, Pavel Tokmakov, Deva Ramanan. ICCV'19 Workshop. [segment-any-moving (Project)] | | [Github] |
Semantic Segment Anything. Jiaqi Chen, Zeyu Yang, Li Zhang. [Semantic-Segment-Anything (Project)] | | [Github] |
Grounded Segment Anything: From Objects to Parts (Project). Peize Sun, Shoufa Chen | | [Github] |
GroundedSAM-zero-shot-anomaly-detection (Project). Yunkang Cao | | [Github] |
Segment Anything Labelling Tool (SALT) (Project). Anurag Ghosh | | [Github] |
Prompt-Segment-Anything (Project). Rockey | | [Github] |
SAM-RBox (Project). Qingyun Li | | [Github] |
VISAM (Project). Feng Yan, Weixin Luo, Yujie Zhong, Yiyang Gan, Lin Ma | | [Github] |
Segment Anything Prompt (Project). MagicSource | | [Github] |
Segment Anything EO tools: Earth observation tools for Meta AI Segment Anything (Project). Aliaksandr Hancharenka, Alexander Chichigin | | [Github] |
napari-segment-anything: Segment Anything Model (SAM) native Qt UI (Project). Jordão Bragantini, Kyle I S Harrington, Ajinkya Kulkarni | | [Github] |
Title & Authors | Intro | Useful Links |
---|---|---|
High-Resolution Image Synthesis with Latent Diffusion Models. Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer. CVPR'22. [Stable-Diffusion (Project)] | | [Github] [Page] [Demo] |
Adding Conditional Control to Text-to-Image Diffusion Models. Lvmin Zhang, Maneesh Agrawala. Preprint'23. [ControlNet (Project)] | | [Github] [Demo] |
GigaGAN: Large-scale GAN for Text-to-Image Synthesis. Minguk Kang, Jun-Yan Zhu, Richard Zhang, Jaesik Park, Eli Shechtman, Sylvain Paris, Taesung Park. CVPR'23 | | [Page] |
Inpaint-Anything: Segment Anything Meets Image Inpainting (Project). Tao Yu | | [Github] |
IEA: Image Editing Anything (Project). Zhengcong Fei | | [Github] |
EditAnything (Project). Shanghua Gao, Pan Zhou | | [Github] |
Segment Anything for Stable Diffusion Webui (Project). Chengsong Zhang | | [Github] |
Segment Anything with Clip (Project). Jinwoo Park | | [Github] |
Title & Authors | Intro | Useful Links |
---|---|---|
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace. Yongliang Shen, Kaitao Song, Xu Tan, Dongsheng Li, Weiming Lu, Yueting Zhuang. Preprint'23. [Jarvis (Project)] | | [Github] [Demo] |
TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs. Yaobo Liang, Chenfei Wu, Ting Song, Wenshan Wu, Yan Xia, Yu Liu, Yang Ou, Shuai Lu, Lei Ji, Shaoguang Mao, Yun Wang, Linjun Shou, Ming Gong, Nan Duan. Preprint'23 | | [Github] |
Generalized Decoding for Pixel, Image and Language. Xueyan Zou, Zi-Yi Dou, Jianwei Yang, Zhe Gan, Linjie Li, Chunyuan Li, Xiyang Dai, Harkirat Behl, Jianfeng Wang, Lu Yuan, Nanyun Peng, Lijuan Wang, Yong Jae Lee, Jianfeng Gao. CVPR'23. [X-Decoder (Project)] | | [Github] [Page] [Demo] |
Pre-Trained Image Processing Transformer. Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao. CVPR'21. [Pretrained-IPT (Project)] | | [Github] |
OpenAGI: When LLM Meets Domain Experts. Yingqiang Ge, Wenyue Hua, Jianchao Ji, Juntao Tan, Shuyuan Xu, Yongfeng Zhang. [OpenAGI (Project)] | | [Github] |
Title & Authors | Intro | Useful Links |
---|---|---|
DepGraph: Towards Any Structural Pruning. Gongfan Fang, Xinyin Ma, Mingli Song, Michael Bi Mi, Xinchao Wang. CVPR'23. [Torch-Pruning (Project)] | | [Github] [Demo] |
MQBench: Towards Reproducible and Deployable Model Quantization Benchmark. Yuhang Li, Mingzhu Shen, Jian Ma, Yan Ren, Mingxin Zhao, Qi Zhang, Ruihao Gong, Fengwei Yu, Junjie Yan. NeurIPS'21. [MQBench (Project)] | | [Github] [Page] |
OTOv2: Automatic, Generic, User-Friendly. Tianyi Chen, Luming Liang, Tianyu Ding, Ilya Zharkov. ICLR'23. [Only Train Once (Project)] | | [Github] |
Deep Model Reassembly. Xingyi Yang, Daquan Zhou, Songhua Liu, Jingwen Ye, Xinchao Wang. NeurIPS'22. [Deep Model Reassembly (Project)] | | [Github] [Page] |
Title & Authors | Intro | Useful Links |
---|---|---|
Caption Anything (Project). Teng Wang, Jinrui Zhang, Junjie Fei, Yunlong Tang, Zhe Li, Mingqi Gao | | [Github] |
... |
A paper list for Anything AI
Paper | First Author | Venue | Topic |
---|---|---|---|
Segment Anything | Alexander Kirillov | Preprint'23 | Segmentation |
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection | Shilong Liu | Preprint'23 | Grounding + Detection |
SegGPT: Segmenting Everything In Context | Xinlong Wang | Preprint'23 | Segmentation |
V3Det: Vast Vocabulary Visual Detection Dataset | Jiaqi Wang | Preprint'23 | Dataset |
Paper | First Author | Venue | Topic |
---|---|---|---|
High-Resolution Image Synthesis with Latent Diffusion Models | Robin Rombach | CVPR'22 | Text-to-Image Generation |
Adding Conditional Control to Text-to-Image Diffusion Models | Lvmin Zhang | Preprint'23 | Controllable Generation |
GigaGAN: Large-scale GAN for Text-to-Image Synthesis | Minguk Kang | CVPR'23 | Large-scale GAN |
Paper | First Author | Venue | Topic |
---|---|---|---|
DepGraph: Towards Any Structural Pruning | Gongfan Fang | CVPR'23 | Network Pruning |
MQBench: Towards Reproducible and Deployable Model Quantization Benchmark | Yuhang Li | NeurIPS'21 | Network Quantization |
OTOv2: Automatic, Generic, User-Friendly | Tianyi Chen | ICLR'23 | Network Pruning |
Deep Model Reassembly | Xingyi Yang | NeurIPS'22 | Model Reuse |
Paper | First Author | Venue | Topic |
---|---|---|---|
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace | Yongliang Shen | Preprint'23 | ModelZoo + LLM |
TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs | Yaobo Liang | Preprint'23 | ModelZoo + LLM |
Generalized Decoding for Pixel, Image and Language | Xueyan Zou | CVPR'23 | Multi Tasking |
Pre-Trained Image Processing Transformer | Hanting Chen | CVPR'21 | Low-level Vision |