Anjali1751/Extracting-data-of-scanned-images

Data-Extractor

Data Extraction from images

Problem

It is difficult to collect and organise health record files (e.g. lab test reports). Doctors also need time to examine the reports.

Solution

This project addresses both scenarios with an automated, digital solution. The user just uploads lab reports (images) to the application; all the data is extracted and returned as a digital report.

Benefits:

1: No need to organise bundles of paper reports.

2: Secure digital data analysis.

3: Patients can track their health through distinct parameters.

4: Doctors get a simple way to examine the reports of a particular patient.

How to run

python run.py

The detailed installation process for the packages is described below.

About Project

Initial Handlings and Packages

Programming Language: Python

Methods: Image Processing, Tesseract-OCR, Flask (for the UI)

Initial installations:

Python: 3.7

Software: Anaconda, Atom, Spyder, PyCharm (any one)

Packages

Flask : for the user interface

Install Commands:

pip install flask

Other packages used with Flask:

pip install flask-wtf : Form handling with Flask

pip install flask-sqlalchemy : Database connectivity with flask

For Image Processing:

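OpenCV : pip install opencv-python

NumPy : usually already installed (for example with Anaconda); otherwise pip install numpy

Imutils : pip install imutils

Argparse : part of the Python standard library, so no installation is needed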

Skimage : pip install scikit-image

PIL : pip install pillow

Data Extraction from Images and PDFs

Pytesseract: pip install pytesseract (the Tesseract-OCR engine itself must also be installed on the system)
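
Several of the packages above are imported under a different name than the one used with pip (for example opencv-python is imported as cv2). The following is a minimal sketch, not part of the repository, that checks which of the listed packages can be imported. The function name missing_packages and the REQUIRED mapping are illustrative assumptions. It does not check for the Tesseract-OCR engine, which pytesseract needs in addition to the Python package.

```python
import importlib.util

# pip distribution name -> module name used in an import statement.
REQUIRED = {
    "flask": "flask",
    "flask-wtf": "flask_wtf",
    "flask-sqlalchemy": "flask_sqlalchemy",
    "opencv-python": "cv2",
    "numpy": "numpy",
    "imutils": "imutils",
    "scikit-image": "skimage",
    "pillow": "PIL",
    "pytesseract": "pytesseract",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages, install with: pip install " + " ".join(missing))
    else:
        print("All listed packages are importable.")
```

Running this script before python run.py gives a single pip command for whatever is still missing.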

Procedure

Phase 1 - Build a scanner using OpenCV

Building a scanner with OpenCV can be accomplished in just three simple steps:

Step 1: Detect edges.

[Images: original image, edge detection result]

Step 2: Use the edges in the image to find the contour (outline) representing the piece of paper being scanned.

[Image: detected contour]

Step 3: Apply a perspective transform to obtain the top-down view of the document.

[Images: original image, output image]

Phase 2 - Convert the scanned image into a text file

Using Tesseract, all the data is extracted from the processed image and stored in a text file for data mining and analysis.
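
As a rough sketch of this phase (assuming the pytesseract wrapper and Pillow listed under Packages, plus an installed Tesseract-OCR engine), the text can be extracted and written to a file as shown here. The function names and the ocr parameter are illustrative assumptions, not taken from the repository.

```python
from pathlib import Path


def tesseract_ocr(image_path):
    """Run Tesseract on one image file and return the recognised text."""
    # Imported here so the file-writing helper can be used with another OCR backend.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image)


def image_to_text_file(image_path, output_path, ocr=tesseract_ocr):
    """Extract text from image_path with ocr and store it in output_path."""
    text = ocr(image_path)
    Path(output_path).write_text(text, encoding="utf-8")
    return text
```

Passing the OCR step as a parameter is a design choice for this sketch: it keeps the file handling separate from Tesseract, so the function can be exercised with a stand-in callable when the engine is not installed.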

Screenshot (2)

Phase 3 - Designing the User Interface for Better Interaction and Visualization

Screenshot (3)

Screenshot (4)

Screenshot (6)

Screenshot (7)

Phase 4:

Data analysis and visualization are in progress.
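
One possible first step for this phase is to turn the extracted text into named parameters that can be tracked over time. The sketch below is purely hypothetical: it assumes each result appears on its own line as "name value unit" (for example "Hemoglobin 13.5 g/dL"), which real lab reports may not follow, and the names LINE_PATTERN and parse_report are not from the repository.

```python
import re

# Assumed line layout: a name made of letters and spaces, an optional colon,
# a number, and an optional unit that starts with a letter or a percent sign.
LINE_PATTERN = re.compile(
    r"^\s*(?P<name>[A-Za-z][A-Za-z ]*?)\s*:?\s*"
    r"(?P<value>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>[A-Za-z%]\S*)?\s*$"
)


def parse_report(text):
    """Map each recognised parameter name to a (value, unit) tuple."""
    results = {}
    for line in text.splitlines():
        match = LINE_PATTERN.match(line)
        if match:
            value = float(match.group("value"))
            unit = match.group("unit") or ""
            results[match.group("name")] = (value, unit)
    return results
```

Lines that do not fit the assumed layout, such as patient names or dates, are skipped rather than guessed at.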

Key Points:

  1. Simple user interface
  2. 95% or higher accuracy of data extraction from images
  3. The UI is easy for users to access
  4. A database is also implemented
