An Android application for on-device text recognition and translation from images, based on the integration of ML Kit SDK. Developed as the final project for my master's degree in Multimedia Technologies.

spanmartina/Text-Recognition-and-Translation-MLKit

Text Recognition and Translation from Images with ML Kit

The Android application is built around two main processes:

  1. The first recognizes and extracts text from an image, either uploaded from the gallery or captured with the camera directly in the app.
  2. The second translates the text recognized in the first process into the target language (English).

The application uses Optical Character Recognition (OCR) to extract text from the input image and then translates that text into the target language (English). Both steps rely on the ML Kit Software Development Kit.
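The recognition step can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: `recognizeText` and its callback parameters are hypothetical names, and the sketch assumes the Latin-script recognizer shipped with the `com.google.mlkit:text-recognition` artifact.

```kotlin
import android.graphics.Bitmap
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.text.TextRecognition
import com.google.mlkit.vision.text.latin.TextRecognizerOptions

// Hypothetical helper (not from the repository): runs on-device OCR on a
// bitmap and reports the recognized text, or the failure, through callbacks.
fun recognizeText(
    bitmap: Bitmap,
    rotationDegrees: Int,
    onResult: (String) -> Unit,
    onError: (Exception) -> Unit
) {
    // Wrap the bitmap; rotationDegrees tells ML Kit how the image is oriented.
    val image = InputImage.fromBitmap(bitmap, rotationDegrees)
    val recognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)

    // process() is asynchronous and returns a Task; results arrive on the main thread.
    recognizer.process(image)
        .addOnSuccessListener { result -> onResult(result.text) }
        .addOnFailureListener { e -> onError(e) }
}
```

The same function would work for both input paths described above, since a gallery image and a camera capture can each be decoded to a `Bitmap` before being passed in.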

ML Kit is a mobile SDK that brings Google's machine learning expertise to Android and iOS applications. Machine learning models run directly on the user's device, allowing on-device processing without the need for a constant internet connection. The models are downloaded via the Google Play Store.
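As an illustration of on-device translation, the sketch below identifies the source language, makes sure the translation model is present on the device, and then translates to English. It is an assumption-laden sketch, not the project's implementation: `translateToEnglish` and its callbacks are hypothetical, and the default download conditions are used only for brevity.

```kotlin
import com.google.mlkit.common.model.DownloadConditions
import com.google.mlkit.nl.languageid.LanguageIdentification
import com.google.mlkit.nl.translate.TranslateLanguage
import com.google.mlkit.nl.translate.Translation
import com.google.mlkit.nl.translate.TranslatorOptions

// Hypothetical helper (not from the repository): detects the language of
// `text` and translates it to English entirely on the device.
fun translateToEnglish(
    text: String,
    onResult: (String) -> Unit,
    onError: (Exception) -> Unit
) {
    LanguageIdentification.getClient().identifyLanguage(text)
        .addOnSuccessListener { tag ->
            // fromLanguageTag returns null for "und" (undetermined) and for
            // languages the translator does not support.
            val source = TranslateLanguage.fromLanguageTag(tag)
            if (source == null) {
                onError(IllegalArgumentException("Unsupported or undetermined language: $tag"))
                return@addOnSuccessListener
            }
            val options = TranslatorOptions.Builder()
                .setSourceLanguage(source)
                .setTargetLanguage(TranslateLanguage.ENGLISH)
                .build()
            val translator = Translation.getClient(options)

            // The model is fetched once; later calls work without a connection.
            val conditions = DownloadConditions.Builder().build()
            translator.downloadModelIfNeeded(conditions)
                .addOnSuccessListener {
                    translator.translate(text)
                        .addOnSuccessListener { translated -> onResult(translated) }
                        .addOnFailureListener { e -> onError(e) }
                        .addOnCompleteListener { translator.close() }
                }
                .addOnFailureListener { e ->
                    translator.close()
                    onError(e)
                }
        }
        .addOnFailureListener { e -> onError(e) }
}
```

An app that wants to avoid mobile-data usage could call `requireWifi()` on the `DownloadConditions.Builder` before building the conditions.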

Dependencies

// Machine Learning dependencies
implementation 'com.google.mlkit:text-recognition:16.0.0'
implementation 'org.jetbrains.kotlinx:kotlinx-coroutines-core:1.5.2'
implementation 'com.google.mlkit:language-id:17.0.4'
implementation 'com.google.mlkit:translate:17.0.1'

Firebase Authentication is used for user authentication and managing user identity within the application. Firebase Realtime Database is employed for storing and synchronizing data in real-time.

// Firebase dependencies
implementation 'com.google.firebase:firebase-database-ktx:20.3.0'
implementation 'com.google.firebase:firebase-database:20.3.0'
implementation 'com.google.firebase:firebase-auth:20.0.1'
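A minimal sketch of how these two Firebase libraries typically work together is shown below. The README does not say which data the app stores, so the `users/<uid>` path, the stored fields, and the `signUpAndSaveProfile` helper are all hypothetical and serve only as an illustration.

```kotlin
import com.google.firebase.auth.FirebaseAuth
import com.google.firebase.database.FirebaseDatabase

// Hypothetical helper (not from the repository): creates an account with
// Firebase Authentication, then stores a small profile record in the
// Realtime Database under an assumed "users/<uid>" path.
fun signUpAndSaveProfile(
    email: String,
    password: String,
    displayName: String,
    onDone: (Boolean) -> Unit
) {
    FirebaseAuth.getInstance()
        .createUserWithEmailAndPassword(email, password)
        .addOnSuccessListener { result ->
            val uid = result.user?.uid
            if (uid == null) {
                onDone(false)
                return@addOnSuccessListener
            }
            // Assumed data layout; the real schema is not documented here.
            val profile = mapOf("email" to email, "displayName" to displayName)
            FirebaseDatabase.getInstance()
                .getReference("users")
                .child(uid)
                .setValue(profile)
                .addOnCompleteListener { task -> onDone(task.isSuccessful) }
        }
        .addOnFailureListener { onDone(false) }
}
```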

The CameraX library simplifies the implementation of camera functionalities in the Android application, abstracting the complexity associated with camera management. More specifically, it is used for camera initialization, image capture, and preview management.

// CameraX dependencies
def camerax_version = "1.2.2"
implementation "androidx.camera:camera-core:${camerax_version}"
implementation "androidx.camera:camera-camera2:${camerax_version}"
implementation "androidx.camera:camera-lifecycle:${camerax_version}"
implementation "androidx.camera:camera-video:${camerax_version}"
implementation "androidx.camera:camera-view:${camerax_version}"
implementation "androidx.camera:camera-extensions:${camerax_version}"
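The camera initialization, preview, and image capture roles mentioned above can be sketched as follows. This is a generic CameraX setup under stated assumptions, not the project's code: `startCamera` and `onReady` are hypothetical names, the back camera is assumed, and the CAMERA permission is assumed to have been granted already.

```kotlin
import androidx.camera.core.CameraSelector
import androidx.camera.core.ImageCapture
import androidx.camera.core.Preview
import androidx.camera.lifecycle.ProcessCameraProvider
import androidx.camera.view.PreviewView
import androidx.core.content.ContextCompat
import androidx.lifecycle.LifecycleOwner

// Hypothetical helper (not from the repository): binds a live preview and an
// image-capture use case to the given lifecycle, then hands back the
// ImageCapture instance so the caller can take pictures.
fun startCamera(
    owner: LifecycleOwner,
    previewView: PreviewView,
    onReady: (ImageCapture) -> Unit
) {
    val context = previewView.context
    val providerFuture = ProcessCameraProvider.getInstance(context)

    providerFuture.addListener({
        val provider = providerFuture.get()

        // Preview use case renders camera frames into the PreviewView.
        val preview = Preview.Builder().build().also {
            it.setSurfaceProvider(previewView.surfaceProvider)
        }
        val imageCapture = ImageCapture.Builder().build()

        // Rebind from scratch so repeated calls do not stack use cases.
        provider.unbindAll()
        provider.bindToLifecycle(
            owner, CameraSelector.DEFAULT_BACK_CAMERA, preview, imageCapture
        )
        onReady(imageCapture)
    }, ContextCompat.getMainExecutor(context))
}
```

Because the use cases are bound to a `LifecycleOwner`, CameraX opens and closes the camera as the activity starts and stops, which is the complexity the library abstracts away.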

Demo

Screenshots: Sign-up Activity, Dashboard Screen, Settings Activity.

The following screen flow takes the user from capturing an image, to previewing it, to reading the translated text.

Screenshots: Camera Activity, Preview Activity, Translator Activity.
