vision-transformer

Here are 504 public repositories matching this topic...

akbar2habibullah / Homunculus-Project

A project on a custom AI architecture. It combines cutting-edge machine learning techniques such as Flash-Attention, Group-Query-Attention, ZeRO-Infinity, BitNet, etc.

python machine-learning deep-learning jupyter-notebook pytorch transformer bitnet pytorch-lightning vision-transformer large-language-models low-rank-adaptation flash-attention

Updated Sep 20, 2024
Python

Oneflow-Inc / libai

LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training

nlp deep-learning transformer large-scale data-parallelism model-parallelism distributed-training self-supervised-learning oneflow pipeline-parallelism vision-transformer

Updated Sep 20, 2024
Python

ananthu-aniraj / pdiscoformer

[ECCV 2024 Oral] Official implementation of the paper "PDiscoFormer: Relaxing Part Discovery Constraints with Vision Transformers"

computer-vision deep-learning fine-grained-classification interpretable-machine-learning vision-transformer eccv2024 part-discovery

Updated Sep 20, 2024
Python

joseph-nagel / attention-mechanism

An introduction to attention mechanisms and the vision transformer

deep-neural-networks transformer attention-mechanism transformer-architecture vision-transformer

Updated Sep 20, 2024
Python

marqo-ai / marqo-FashionCLIP

State-of-the-art CLIP-like models finetuned for the fashion domain. +57% increase in evaluation metrics vs FashionCLIP 2.0.

search transformers embeddings clip informationretrieval multimodal recomendations fashion-classifier vision-transformer vectorsearch fashionclip

Updated Sep 20, 2024
Python

mist-medical / MIST

MIST: A simple, scalable, and end-to-end framework for 3D medical imaging segmentation.

deep-learning pytorch medical-imaging convolutional-neural-networks image-segmentation unet vision-transformer attention-unet nnunet 3d-medical-imaging-segmentation unetr

Updated Sep 20, 2024
Python

Wang-ML-Lab / interpretable-foundation-models

[ICML 2024] Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models

graphical-models probabilistic-graphical-models interpretability bayesian-deep-learning vision-transformer foundation-models large-language-models llm multimodal-large-language-models large-multimodal-models

Updated Sep 19, 2024
Python

sun-hailong / LAMDA-PILOT

🎉 PILOT: A Pre-trained Model-Based Continual Learning Toolbox

machine-learning deep-learning toolkit reproducible-research pytorch incremental-learning lifelong-learning continual-learning pre-trained-models vision-transformer vision-language-model

Updated Sep 19, 2024
Python

OpenGVLab / InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

benchmark action-recognition video-understanding video-data self-supervised multimodal video-dataset open-set-recognition video-retrieval video-question-answering masked-autoencoder temporal-action-localization contrastive-learning spatio-temporal-action-localization zero-shot-retrieval video-clip vision-transformer zero-shot-classification foundation-models instruction-tuning

Updated Sep 19, 2024
Python

IDT-ITI / T-TAME

Scripts and trained models from the paper: M. Ntrougkas, N. Gkalelis, V. Mezaris, "T-TAME: Trainable Attention Mechanism for Explaining Convolutional Networks and Vision Transformers", IEEE Access, 2024. DOI:10.1109/ACCESS.2024.3405788.

deep-learning cnn attention-mechanism explainable-ai xai model-interpretability vision-transformer

Updated Sep 18, 2024
Python

GuidoManni / DeepLearningImplementation

This repository contains implementations of prominent computer vision deep learning architectures. The focus is on simplifying these architectures while relying solely on the PyTorch library. The goal is to provide accessible and streamlined versions of key models in the field.

computer-vision deep-learning cnn deep-learning-tutorial pytorch-implementation vision-transformer

Updated Sep 18, 2024
Python

Blaizzy / mlx-vlm

MLX-VLM is a package for running Vision LLMs locally on your Mac using MLX.

mlx vision-framework apple-silicon vision-transformer llm vision-language-model llava local-ai idefics paligemma

Updated Sep 18, 2024
Python

billpsomas / rscir

Official PyTorch implementation and benchmark dataset for the IGARSS 2024 oral paper "Composed Image Retrieval for Remote Sensing"

computer-vision deep-learning satellite remote-sensing satellite-imagery earth-observation vision-language vision-transformer vision-language-model

Updated Sep 17, 2024
Python

mbari-org / fastapi-vss

RESTful API for vector similarity search, built on the Python web framework FastAPI. It accelerates machine learning workflows that require vector similarity search using foundation models.

image-classification fastapi vision-transformer foundation-models

Updated Sep 16, 2024
Python

lironui / GeoSR

deep-learning uav pytorch remote-sensing segmentation super-resolution semantic-segmentation vision-transformer

Updated Sep 16, 2024
Python

junyuchen245 / TransMorph_Transformer_for_Medical_Image_Registration

TransMorph: Transformer for Unsupervised Medical Image Registration (PyTorch)

deep-learning transformer diffeomorphism image-registration bayesian-deep-learning image-alignment vision-transformer swin-transformer

Updated Sep 15, 2024
Python

kyegomez / RT-X

PyTorch implementation of the models RT-1-X and RT-2-X from the paper "Open X-Embodiment: Robotic Learning Datasets and RT-X Models"

computer-vision artificial-intelligence vision attention-model attention-is-all-you-need multimodal vision-transformer gpt4 gpt4all

Updated Sep 15, 2024
Python

alessioborgi / AdaViT

Adaptive Vision Transformer for efficient image classification, implementing dynamic token sparsification to reduce computational costs while maintaining accuracy.

attention adaptive attention-is-all-you-need halting self-at attentio vision-transformer mlp-mixer transforme

Updated Sep 14, 2024
Python

emcf / thepipe

Extract clean markdown from PDFs, URLs, Word docs, slides, videos, and more, ready for any LLM. ⚡

pdf web scrapers multimodal vision-transformer gpt-4 large-language-models gpt-4o

Updated Sep 13, 2024
Python

zer0int / CLIP-XAI-GUI

CLIP GUI - XAI app ~ explainable (and guessable) AI with ViT & ResNet models

game gui attention vit image-to-text clip gradient-ascent xai attention-visualization vision-transformer

Updated Sep 13, 2024
Python
