Stars
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
An open-source RAG-based tool for chatting with your documents.
[CVPR 2024] DART: Implicit Doppler Tomography for Radar Novel View Synthesis
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
[AAAI 2025] DepthFM: Fast Monocular Depth Estimation with Flow Matching
Library to make any existing neural network architecture equivariant
An open-source framework for training large multimodal models.
API libraries, samples, and system images for AIY Projects (Voice Kit and Vision Kit). If you find this helpful, buy me a coffee to help me keep the drivers updated - https://paypal.me/viraniac
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
GeFolki is a coregistration algorithm developed at ONERA in the Medusa project
Documenting reverse engineering of the original Lytro lightfield camera
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
The WeightWatcher tool for predicting the accuracy of Deep Neural Networks
Easy web analytics. No tracking of personal data.
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
Taming Transformers for High-Resolution Image Synthesis
An open source implementation of CLIP.
Google Research
Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code.
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in PyTorch
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
COLMAP - Structure-from-Motion and Multi-View Stereo
Code for "On the Spectral Bias of Neural Networks", to appear in ICML 2019 (Long Beach, CA).
Application for camera and sensor data logging (iOS)