Ifty Mohammad Rezwan imr555
- Florida, United States
- https://imr555.github.io/
- @imr165
Highlights
- visual_language_models_ucf
- Explorations into the recently proposed Taylor Series Linear Attention
- [CVPR 2023 & TPAMI 2025] Explicit Visual Prompting for Low-Level Structure Segmentations
- Adapting Meta AI's Segment Anything to Downstream Tasks with Adapters and Prompts
- Implementation of MagViT2 Tokenizer in Pytorch
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
- MERLOT: Multimodal Neural Script Knowledge Models
- Reading list for research topics in multimodal machine learning
- Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
- Awesome-DragGAN: A curated list of papers, tutorials, and repositories related to DragGAN
- [CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
- [ICLR 2024 Spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
- Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
- [ICLR 2022] Code for "How Much Can CLIP Benefit Vision-and-Language Tasks?" https://arxiv.org/abs/2107.06383
- Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
- This repository reproduces the results of the paper "Fixing the train-test resolution discrepancy" https://arxiv.org/abs/1906.06423
- LAVIS - A One-stop Library for Language-Vision Intelligence
- A flexible and efficient codebase for training visually-conditioned language models (VLMs)
- [ECCV'24] Official Implementation for CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios
- [ECCV 2024] Video Foundation Models & Data for Multimodal Understanding
- A curated list of research papers in Referring Expression Comprehension (REC)
- Torch Implementation of Speaker-Listener-Reinforcer for Referring Expression Generation and Comprehension
- [WACV 2025 Oral] Vision-language conversation in 10 languages including English, Chinese, French, Spanish, Russian, Japanese, Arabic, Hindi, Bengali and Urdu.
- A Survey on Data Selection for Language Models
- A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
- Official PyTorch implementation of Which Tokens to Use? Investigating Token Reduction in Vision Transformers, presented at the ICCV 2023 NIVT workshop