Lucid - Intelligent Document Processing & Review System ©


Main Features

  • OCR (Optical Character Recognition)
  • Keyword Search
  • Intelligent Search (Contextual Search)

Setting up the App Locally

This guide covers installing the app in your own environment.

Prerequisites

  • Download the Sentence Transformers model Artefacts from here.

  • Download the Layout Parser Model from here.

Set Up a Virtual Environment (Optional)

  • Create a new virtual environment

    $ conda create --name <env-name> python=3.7.5
    $ # replace <env-name> with a convenient name for the new environment
  • Activate the created environment:

    $ conda activate <env-name>
    $ # replace <env-name> with the name you chose above

Installation

Follow the steps below to install the app.

  • Clone the repository:

    $ git clone https://github.com/engenuityai/lucid-engen-server.git

    Alternatively, download the zip file from the repository and extract it.

    Then navigate to the project root folder.

    $ cd lucid-engen-server
  • Copy the downloaded Sentence Transformer model artefacts into the model folder inside the intelligent_search_module directory.

  • Copy the downloaded Layout Parser model into the ocr directory.

  • Install the Requirements:

    $ pip install -r requirements.txt
  • Run the App on localhost:

    $ flask run
    $ # Pass --host and --port to flask run to serve the app on a different host or port
  • The app will be running at: http://127.0.0.1:5000

    Endpoint for Keyword Search: http://127.0.0.1:5000/keyword-search

    Endpoint for Intelligent Search: http://127.0.0.1:5000/intelligent-search
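
If the app is served on a host or port other than the default, the two endpoint URLs change accordingly. The sketch below only builds those URLs from the paths listed above. The helper name `endpoint_urls` is hypothetical and not part of the code base, and the sketch sends no request because this guide does not document the request method or payload each endpoint expects.

```python
from urllib.parse import urlunsplit

# Endpoint paths as listed in this guide.
ENDPOINT_PATHS = {
    "keyword_search": "/keyword-search",
    "intelligent_search": "/intelligent-search",
}


def endpoint_urls(host="127.0.0.1", port=5000):
    """Return the full URL of each search endpoint for the given host and port.

    Hypothetical helper for illustration; the defaults match the address
    that `flask run` serves on when no host or port is specified.
    """
    netloc = f"{host}:{port}"
    return {
        name: urlunsplit(("http", netloc, path, "", ""))
        for name, path in ENDPOINT_PATHS.items()
    }


if __name__ == "__main__":
    for name, url in endpoint_urls().items():
        print(f"{name}: {url}")
```

Calling `endpoint_urls("0.0.0.0", 8080)` would give the URLs for an app started with a custom host and port.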


✨ Code-base structure

The project code base is structured as follows:

< PROJECT ROOT >
   |
   |-- assets/                              # Folder to store input images
   |     
   |-- intelligent_search_module/           # Module for Intelligent Search
   |    |-- model/                          # Folder containing all the Model Artefacts for Sentence Transformer 
   |    |-- __init__.py                     # Module Initialization
   |    |-- bert_process.py                 # Sentence Transformer Operations for Intelligent Search
   |
   |    
   |-- keyword_search_module/               # Module for Keyword Search    
   |    |-- __init__.py                     # Module Initialization
   |    |-- ocr_word_search_complete.py     # Keyword Search Operations
   |
   |
   |-- ocr/                                 # Module for OCR Operations
   |    |-- __init__.py                     # Module initialization
   |    |-- img_gen.py                      # Image generations
   |    |-- Layout_det.py                   # Layout Parsing
   |    |-- pyteseract_para_ocr_bb.py       # Paragraph Level OCR Processors
   |    |-- Pytesseract_table_ocr_bb.py     # Table Level Operations
   |    |-- run_ocr.py                      # Run Tasks Operations
   |    |-- test_07_14.h5                   # Layout Parser Model
   |
   |
   |-- utils/                               # Support Functions
   |    |-- __init__.py                     # Module initialization
   |    |-- cleaning.py                     # Text data preprocessing/postprocessing
   |    |-- generate.py                     # Result generation operations
   |
   |
   |-- intelligent_search_output/           # Output Result Directory for Intelligent Search   
   |    |-- csv                             # Contains resulting CSV files
   |    |-- pics                            # Contains generated images
   |    |-- result_pdfs                     # Contains final pdf outputs
   |
   |
   |-- keyword_search_output/               # Output Result Directory for Keyword Search
   |
   |
   |-- requirements.txt                     # Requirements, all the required dependencies
   |-- .flaskenv                            # Flask environment configurations
   |-- Dockerfile                           # Dockerfile Script for the App
   |
   |-- app.py                               # Setup App Configuration
   |-- main.py                              # Main App Starter - WSGI gateway
   |
   |-- ************************************************************************
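
Before starting the app, it can help to confirm that the downloaded models and key files sit where the layout above expects them. The following is a minimal sketch, assuming it is run from the project root. The function `missing_paths` and the choice of which paths to check are illustrative assumptions, not part of the code base. The paths themselves are taken from the listing above.

```python
from pathlib import Path

# Paths taken from the structure listing above, relative to the project root.
MODEL_DIR = "intelligent_search_module/model"   # Sentence Transformer artefacts
REQUIRED_FILES = [
    "ocr/test_07_14.h5",   # Layout Parser model
    "requirements.txt",
    "app.py",
]


def missing_paths(root="."):
    """Return the expected paths that are absent under `root`.

    Hypothetical helper: the model folder counts as missing when it does
    not exist or is empty, since an empty folder means the artefacts have
    not been copied in yet.
    """
    root = Path(root)
    missing = []
    model_dir = root / MODEL_DIR
    if not model_dir.is_dir() or not any(model_dir.iterdir()):
        missing.append(MODEL_DIR)
    for rel in REQUIRED_FILES:
        if not (root / rel).is_file():
            missing.append(rel)
    return missing


if __name__ == "__main__":
    problems = missing_paths()
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("All expected paths are in place.")
```

The check only looks at presence, not at whether the model files are valid, so a passing result does not guarantee the app will load them.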
