A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Updated Jun 16, 2024 - C++
Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML)
Fast Multimodal LLM on Mobile Devices
LLaVA server (llama.cpp).
[ECCV 2022] Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation
Open-source simulation engine for robotic general intelligence (RGI)
Highway Driving (project 7 of 9 from Udacity Self-Driving Car Engineer Nanodegree)
ROS2 package that integrates L3CAM sensors using L3CAM SDK
ROS2 package for the visualization of the fusion of the L3Cam device sensors
Repository to document and advertise our McGill Capstone Group 22 Project