MediScanFlow

MediScanFlow is an application that integrates Optical Character Recognition (OCR) to automate the classification, extraction, transformation, and loading of Department of Health (DOH) medical form data directly into a spreadsheet.

What does the project do?

Integration of Optical Character Recognition (OCR): The application incorporates Azure AI Document Intelligence OCR technology, enabling the recognition and interpretation of text from images or scanned documents.
Automated Classification, Extraction, Transformation, and Loading (ETL): The application performs a series of automated tasks, including the classification of medical forms, extraction of relevant data, transformation of the data into a structured format, and loading it into a spreadsheet.
DOH Medical Forms: The application specifically focuses on handling data from Department of Health (DOH) medical forms. It can classify various forms, including HIV Certificate Forms, Medical Certificate for Land-based Overseas Workers, Medical Certificate for Service at Sea, Medical Examination Report for Land-based Overseas Workers, Medical Examination Report for Seafarers, and Tabulated Psychological Evaluation Form.
Export to a Spreadsheet File: Processing ends with the creation of a spreadsheet file that holds the extracted and transformed data in an organized, tabular format.

Promotional Video

Motivation

Our project was created to compete in an "AI and Machine Learning Challenge," aiming to build a smart program that can quickly gather information from medical forms.

Time-Efficiency: Automation minimizes manual data entry efforts, swiftly processing large volumes of medical forms for heightened operational efficiency.

Enhanced Accessibility: Digitally organizing and storing extracted data facilitates easy analysis, reporting, and integration with other systems, improving overall healthcare information accessibility and usability.

Enhanced Focus: Automating routine tasks allows healthcare professionals to redirect efforts towards value-added activities like patient care, research, and decision-making.

Getting Started

To run this project locally, you'll need to set up a virtual environment and install the required dependencies. Follow the steps below to get started.

Prerequisites

Python (3.10.11 or higher) installed on your system.

Setting up a Virtual Environment

A virtual environment is a way to isolate your project's dependencies. It's a good practice to use one to avoid conflicts with other projects. To set up a virtual environment, follow these steps:

Open a terminal in the root directory of your project.
Run the following command to create a virtual environment:
```
python -m venv venv
```
Activate the virtual environment (the command below is for Windows; on macOS or Linux, run source venv/bin/activate instead):
```
venv\Scripts\activate
```
You'll now be working within the virtual environment, and you can deactivate it by running deactivate in the terminal.

Installing Dependencies

This project uses a requirements.txt file to specify its dependencies. To install these dependencies, follow these steps:

Make sure your virtual environment is activated (as explained in the previous section).
Run the following command to install the dependencies:
```
pip install -r requirements.txt
```

Running the Project

Now that you have set up the virtual environment and installed the dependencies, you can run the project.

Simply run this in the root directory of your project:
```
python main.py
```

How to use

This section demonstrates how to use the application created for Dashlabs. You can watch the promotional video or read through the steps below.

Home Screen

The home screen has four buttons: one for uploading a single file, one for uploading a folder to process multiple files, one for inserting the extracted JSON data into the main CSV, and View CSV.

Upload a Single File

To process a single file, click the Upload a File button.
After you click the button, a file dialog will pop up. Choose the form you want to process. This example uses landbase_cert_3.jpg, a Medical Certificate for Land-based Overseas Workers.

Processing Document

A new window will pop up with an empty canvas and a Process Document button.

This button starts processing the document.
While the document is being processed, diagnostic output is printed.

File Information
- File being processed: landbase_cert_3.jpg - The file currently being processed
- MIME Type: image/jpeg - The content type of the file, here a JPEG image
Document Classifier Status - The status of the document classifier
- Document Classifier Status (1): running
- Document Classifier Status (2): succeeded
Document Information
- Document Type: Landbase Certificate - Indicates that the Landbase Certificate model is used to process the document
- Accuracy: 23.7% - An accuracy of 23.7% suggests that the document classifier identified the document type with relatively low confidence. Further review and validation may be required, depending on the use case.
Processing Steps
1. Processing Document... - The system initiated the processing of the document.
2. Request Successful - The initial request to process the document was successful.
3. Creating JSON file - The system generated a JSON file containing the extracted information.
4. Text extraction successful to landbase_cert_3.jpg - Text extraction from the document (landbase_cert_3.jpg) was successful.
Processing Result
- Processing Successful! - Document processing completed successfully, and the extracted text in the JSON file is ready to be inserted.
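
The low-confidence case in the diagnostic above could be caught automatically. The following is a minimal sketch, not MediScanFlow's actual code: the function name `needs_review` and the 0.50 threshold are hypothetical, and the confidence is assumed to be available as a fraction between 0 and 1.

```python
# Hypothetical helper: the name and the 0.50 threshold are illustrative
# assumptions, not values taken from MediScanFlow.
REVIEW_THRESHOLD = 0.50


def needs_review(confidence: float, threshold: float = REVIEW_THRESHOLD) -> bool:
    """Return True when the classifier confidence falls below the threshold."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be a fraction between 0 and 1")
    return confidence < threshold


if __name__ == "__main__":
    # Values from the example diagnostic: Landbase Certificate at 23.7%.
    doc_type, confidence = "Landbase Certificate", 0.237
    if needs_review(confidence):
        print(f"{doc_type}: {confidence:.1%} confidence, review manually")
```

A threshold like this is a judgment call; a suitable value would depend on how costly a misclassified form is for the workflow.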

Upload Multiple Files by Uploading a Folder

In your file explorer, place the files you want to process in a folder. This example uploads the "for upload forms" folder.

Click the Upload a Folder button.
Choose the "for upload forms" folder. Process the documents as described in Processing Document and watch the printed status. Note that the window will freeze once it starts processing the forms. To see the live status, refrain from interacting with the window.

HIV Certificate Processed

Medical Certificate for Land-based Overseas Workers

Medical Certificate for Service at Sea

Psychological Evaluation Form

Medical Examination Report for Land-based Overseas Workers

Medical Examination Report for Seafarers
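
The folder upload described in this section has to find the forms inside the chosen folder before processing them. The sketch below shows one way that step might look, using only the standard library. The function name `list_forms` and the set of accepted MIME types are assumptions for illustration; the application may accept other file types.

```python
import mimetypes
from pathlib import Path

# Assumed set of accepted MIME types; the real application may differ.
SUPPORTED_TYPES = {"image/jpeg", "image/png", "application/pdf"}


def list_forms(folder: str) -> list[tuple[Path, str]]:
    """Return (path, MIME type) pairs for supported files, sorted by name."""
    forms = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # skip subfolders
        # guess_type infers the MIME type from the file extension only
        mime_type, _ = mimetypes.guess_type(path.name)
        if mime_type in SUPPORTED_TYPES:
            forms.append((path, mime_type))
    return forms
```

Calling `list_forms("for upload forms")` would return each supported file together with the MIME type that the diagnostic output reports, such as image/jpeg for landbase_cert_3.jpg.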

Insert Data to CSV

On the home screen there is an Insert Data button. This button takes the JSON data extracted while processing the documents, converts it to a dataframe, and inserts it into a dedicated CSV file for storage.
After you click the Insert Data button, a window will pop up showing the result of the insertion.
If you click the Insert Data button again, a new message will appear: No dataframes to concatenate. Upload a form. This is expected, because the data was already inserted into the CSV and the JSON files were moved to a new folder.
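
The insert step above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the text describes building a dataframe, whereas this sketch swaps in the standard-library csv module to stay dependency-free. The function name `insert_json_into_csv` and its parameters are hypothetical, and each JSON file is assumed to hold one flat object with the same keys.

```python
import csv
import json
import shutil
from pathlib import Path


def insert_json_into_csv(json_dir: str, csv_path: str, done_dir: str) -> int:
    """Append every JSON record in json_dir to csv_path, then move the
    JSON files to done_dir. Returns the number of rows inserted."""
    json_files = sorted(Path(json_dir).glob("*.json"))
    if not json_files:
        return 0  # nothing to insert, e.g. the button was clicked twice

    out = Path(csv_path)
    header = None
    if out.exists() and out.stat().st_size > 0:
        # Reuse the existing header so new rows line up with old ones
        with out.open(newline="", encoding="utf-8") as handle:
            header = next(csv.reader(handle))

    with out.open("a", newline="", encoding="utf-8") as handle:
        writer = None
        for json_file in json_files:
            record = json.loads(json_file.read_text(encoding="utf-8"))
            if writer is None:
                writer = csv.DictWriter(handle, fieldnames=header or list(record))
                if header is None:
                    writer.writeheader()
            writer.writerow(record)

    # Move processed JSON files away so they are not inserted twice
    Path(done_dir).mkdir(parents=True, exist_ok=True)
    for json_file in json_files:
        shutil.move(str(json_file), str(Path(done_dir) / json_file.name))
    return len(json_files)
```

A return value of 0 on the second call corresponds to the "No dataframes to concatenate" message: the JSON files have already been moved, so there is nothing left to insert.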

Viewing of CSV

To view the stored data, simply click the View CSV button.
After you click the View CSV button, a new window will pop up with a separate button for each form type.
As an example, let us view the Medical Examination Report for Overseas Workers.
The application automatically opens the CSV file holding the extracted data. You can edit the file to correct any wrongly extracted data.

ZIP and Encrypt All CSV Files

Click the ZIP and Encrypt All CSV Files button.
After you click the button, a pop-up will ask you to enter a name for your zip file.
After you enter the name, a pop-up will ask you to set a password for your zip file.
After you set the password, a file dialog will pop up so you can choose where to save the zip file.
You can now access the saved zip file. Use WinRAR to open and extract the files.
