A series of foundational computer vision projects that anyone diving into the field must tackle.
Updated Nov 1, 2024 - Jupyter Notebook
Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta
Implementation of MiDaS from the paper: "Towards Robust Monocular Depth Estimation" in PyTorch and Zeta
An implementation of gated MLPs in tinygrad, as an alternative to transformers.
Enhance your skills in prompt engineering for vision models. Learn to effectively prompt, fine-tune, and track experiments for models like SAM, OWL-ViT, and Stable Diffusion 2.0 to achieve precise image generation, segmentation, and object detection.
A simple-to-use package to call various model providers such as OpenAI, Anthropic, and others with utmost reliability, security, and performance.
Generates captions for user-supplied images using prompt engineering and generative AI.
A framework to compute threshold sensitivity of deep networks to visual stimuli.
These notes and resources are compiled from the crash course Prompt Engineering for Vision Models offered by DeepLearning.AI.
In this repo I fine-tuned a pretrained ResNet18 model from the PyTorch library.
Testing the Moondream tiny vision model
Vision-based swarms in the presence of occlusions
A comprehensive repository for research, code, and insights on convolutional neural networks and deep vision models
Diffusion models crash course with PyTorch from DeepLearning.AI
Building AVA from Ex Machina: a lightweight multimodal system from scratch, just for learning and experimentation