
Object Recognition App

Overview

The exponential growth in the processing capabilities of mobile devices has ushered in a new era of intelligent applications capable of performing machine learning operations directly on user hardware. The need for tools enabling a more complete and immediate interaction with the surrounding world makes these solutions highly relevant. The YOLOv3 "tiny" variant was chosen as the recognition model for its fast inference, robust classification, versatility, and, notably, its compact size: only 8.9 MB. Despite its size, YOLOv3 tiny can recognize 80 different object classes.

With a focus on providing a practical assistance service, special consideration was given to applications dedicated to user categories with specific needs, such as visually impaired individuals. This choice aligns with the application possibilities suggested by the author of YOLO and responds to a growing social and technological demand. Before delving into the heart of the implementation, it's crucial to examine the frameworks and libraries forming the technological foundation of the app.

Technologies Used

  • Core ML: The machine learning library from Apple acts as a bridge between the YOLOv3 model and Swift code, offering an efficient and highly optimized interface for inference on pre-trained models.

  • AVFoundation: Used to handle real-time video acquisition from the device's camera, providing detailed control over properties such as resolution and frame rate.

  • UIKit: The foundational framework for iOS app development, employed here to build the app's user interface (UI). It provides the view controllers, views, and layout tools used throughout the app.

  • Vision Framework: Integral to Apple's development ecosystem, Vision offers high-level APIs optimized for visual data processing. Vision APIs were used for pre- and post-processing tasks, preparing data in a format compatible with the YOLOv3 model.
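The way AVFoundation feeds frames into the rest of the pipeline can be sketched as follows. This is a minimal, illustrative setup, not the repository's actual code; the function name `setupCamera` and the chosen preset are assumptions.

```swift
import AVFoundation

// Hypothetical sketch: configure a capture session that delivers video
// frames to a delegate for on-device inference.
func setupCamera(delegate: AVCaptureVideoDataOutputSampleBufferDelegate) -> AVCaptureSession {
    let session = AVCaptureSession()
    session.sessionPreset = .vga640x480   // a lower resolution keeps per-frame inference fast

    // Attach the default back camera as input, if available.
    if let device = AVCaptureDevice.default(for: .video),
       let input = try? AVCaptureDeviceInput(device: device),
       session.canAddInput(input) {
        session.addInput(input)
    }

    // Deliver frames on a background queue; drop frames the model can't keep up with.
    let output = AVCaptureVideoDataOutput()
    output.alwaysDiscardsLateVideoFrames = true
    output.setSampleBufferDelegate(delegate, queue: DispatchQueue(label: "videoQueue"))
    if session.canAddOutput(output) {
        session.addOutput(output)
    }
    return session
}
```

Discarding late frames is the usual trade-off in real-time recognition: a stale detection is less useful than a slightly lower effective frame rate.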

Model

The YOLOv3 "tiny" model was chosen for its proven effectiveness in real-time object recognition. Its small size, combined with Core ML's efficiency, allows rapid and accurate processing directly on the user's device. The model interface is auto-generated when the pre-trained YOLOv3 model is added to the Xcode project: Core ML produces Swift classes such as "YOLOv3TinyInt8LUT," "YOLOv3TinyInt8LUTInput," and "YOLOv3TinyInt8LUTOutput," enabling real-time inference directly from camera input.
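Using the auto-generated class together with Vision might look like the following sketch. The generated class name `YOLOv3TinyInt8LUT` comes from the model described above; the function name and completion-handler logic are illustrative assumptions.

```swift
import Vision
import CoreML

// Hypothetical sketch: wrap the auto-generated Core ML class in a Vision
// request so each camera frame yields recognized-object observations.
func makeDetectionRequest() throws -> VNCoreMLRequest {
    let coreMLModel = try YOLOv3TinyInt8LUT(configuration: MLModelConfiguration()).model
    let visionModel = try VNCoreMLModel(for: coreMLModel)

    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        guard let results = request.results as? [VNRecognizedObjectObservation] else { return }
        for observation in results {
            // Each observation carries ranked class labels and a normalized bounding box.
            let bestLabel = observation.labels.first?.identifier ?? "unknown"
            print(bestLabel, observation.boundingBox)
        }
    }
    // Match Vision's cropping/scaling to the model's expected square input.
    request.imageCropAndScaleOption = .scaleFill
    return request
}
```

With an object-detection model, Vision returns `VNRecognizedObjectObservation` values directly, so no manual decoding of the YOLO output tensor is needed.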

Implementation

The application follows the Model-View-Controller (MVC) design pattern, a widely-used architecture in iOS applications. This separation of components facilitates code modularity, improves reusability, and eases software maintenance.
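In this MVC split, the controller typically acts as the camera delegate and forwards each frame to Vision. The fragment below is an assumed illustration: `ViewController` and the stored `detectionRequest` property are hypothetical names, not taken from the repository.

```swift
import AVFoundation
import Vision

// Hypothetical controller fragment: the view controller receives frames
// from AVFoundation and hands them to Vision for inference.
extension ViewController: AVCaptureVideoDataOutputSampleBufferDelegate {
    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Extract the pixel buffer backing this video frame.
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

        // Run the (previously created) Core ML request on this frame;
        // results arrive asynchronously in the request's completion handler.
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .up)
        try? handler.perform([detectionRequest])
    }
}
```

Keeping the request object as a long-lived property (rather than rebuilding it per frame) avoids reloading the model on every frame, which matters at camera frame rates.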

Conclusion

This Object Recognition app represents a synergy of cutting-edge technologies, leveraging machine learning, real-time video processing, and user interface design to provide a transformative tool for users. The combination of YOLOv3 "tiny" and Core ML demonstrates the efficiency and effectiveness of on-device object recognition, catering to a variety of applications, including those with real-time processing requirements.

About

A Simple iOS Object Recognition App for my thesis using Yolov3.
