Implementing OCR with a local visual model run by Ollama.
Updated Nov 27, 2024 - TypeScript
A collection and survey of the latest vision-language model papers and model GitHub repositories. Continuously updated.
Crowd Analyzer is a Python application for analyzing pedestrian and crowd mobility patterns using computer vision and machine learning techniques.
Flutter app using the art collection API from the Metropolitan Museum of Art.
A comprehensive framework for fine-tuning vision-language models using LoRA (Low-Rank Adaptation) with support for streaming datasets, advanced evaluation metrics, and robust training pipelines.
Transcription of books with Ollama.
Next-Gen E-Commerce is a full-stack e-commerce platform that uses React.js for the frontend and Spring Boot (Java) for the backend. The project integrates Llama Vision to enhance product recognition and optimize inventory management. Backed by SQL databases, it provides a scalable and efficient architecture.
This project presents a fully automated delivery vehicle equipped with an intelligent robotic arm, designed to improve operations in modern smart warehousing environments. The system performs real-time object detection with a machine learning model called Llama Vision, which enables the vehicle to identify objects and place them in its basket.
GraderPro is a robust and efficient grading system designed to simplify the process of evaluating exams. It can grade handwritten documents, and it can grade diagrams when a reference diagram is given. This repository is an extension of the AutoGrader repository.