<a href="https://colab.research.google.com/github/Anibrata-Ghatak/Bank_Data_Extraction/blob/main/Bank_document_extraction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
!apt-get install -y tesseract-ocr
!pip install -q pytesseract opencv-python langchain_ollama gradio
!curl -fsSL https://ollama.com/install.sh | sh
!nohup ollama serve &
%env OLLAMA_HOST=0.0.0.0
!ollama pull mistral:7b

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
tesseract-ocr is already the newest version (4.1.1-2.1build1).
0 upgraded, 0 newly installed, 0 to remove and 35 not upgraded.
>>> Installing ollama to /usr/local
>>> Downloading Linux amd64 bundle
######################################################################## 100.0%
>>> Creating ollama user...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.
nohup: appending output to 'nohup.out'
env: OLLAMA_HOST=0.0.0.0
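
The setup cell starts `ollama serve` in the background and runs `ollama pull` immediately afterwards, so the pull can race the server's startup. A minimal sketch of a readiness check follows. It assumes the server answers a plain HTTP GET at the address printed by the installer above (127.0.0.1:11434); the helper names `ollama_is_up` and `wait_until_ready` are hypothetical and not part of Ollama or this notebook.

```python
import time
import urllib.request


def ollama_is_up(base_url="http://127.0.0.1:11434", timeout=2.0):
    """Return True if an HTTP GET on base_url succeeds.

    Assumption: the Ollama server replies with status 200 on its root URL.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers refused connections, timeouts, and urllib's URLError/HTTPError.
        return False


def wait_until_ready(probe, max_wait=30.0, interval=0.5):
    """Call probe() until it returns True or max_wait seconds have elapsed."""
    deadline = time.monotonic() + max_wait
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In a notebook this could run in its own cell between starting the server and pulling the model, for example `assert wait_until_ready(ollama_is_up), "Ollama did not start"`. The probe is passed in as a callable so the waiting logic can be exercised without a running server.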

In [5]:
import gradio as gr
import pytesseract
import cv2
import json
import tempfile
from langchain_ollama.llms import OllamaLLM

# Load the LLM model
model = OllamaLLM(model="mistral:7b", format="json")

# Extraction class
class PassbookInfoExt:
    def __init__(self, job_id, image_path):
        self.job_id = job_id
        self.image_path = image_path

    def image_load(self):
        self.img = cv2.imread(self.image_path)
        if self.img is None:
            raise ValueError("Image not found or unreadable.")

    def image_preprocessing(self):
        self.preprocessed_img = cv2.cvtColor(self.img, cv2.COLOR_BGR2GRAY)

    def ocr(self):
        custom_config = r'--oem 3 --psm 6'
        self.ocr_text = pytesseract.image_to_string(self.preprocessed_img, config=custom_config)

    def extract(self):
        self.image_load()
        self.image_preprocessing()
        self.ocr()
        prompt = f"""
        You are a structured document extraction assistant.

        Your task is to extract the following fields from the OCR-scanned text of a bank passbook:
        - Account Holder Name
        - Account Number
        - IFSC Code

        Extraction Rules:
        - The IFSC Code must be 11 characters: 4 capital letters + '0' + 6 alphanumeric characters (e.g., SBIN0001234)
        - If the IFSC code contains letter 'O' instead of zero '0', correct it (e.g., 'CBINOR40012' becomes 'CBIN0R40012').
        - Account Number should be numeric and 10–16 digits long.
        - Account Holder Name should be the **exact full name** that follows known prefixes like “MISS”, “MR”, “MRS” and should not include address or other details.
        - Limit name to **3 words max** (e.g., "MISS RUNU PARVIN").

        Important Instructions:
        - Only return a valid JSON.
        - Do not explain.
        - Do not wrap in <think> or markdown.
        - Do not include headings or notes.

        Format:
        {{
          "Account Holder Name": "...",
          "Account Number": "...",
          "IFSC Code": "..."
        }}

        OCR text:
        \"\"\"{self.ocr_text}\"\"\"
        """
        response = model.invoke(prompt)
        result = json.loads(response)
        return result

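# Gradio interface logic
def extract_passbook_info(image):
    if image is None:
        return "Please upload an image."

    # A temporary directory is removed on exit, so no image file is left behind.
    with tempfile.TemporaryDirectory() as tmp_dir:
        tmp_path = f"{tmp_dir}/passbook.jpg"
        try:
            # JPEG cannot store an alpha channel, so convert RGBA uploads to RGB first.
            image.convert("RGB").save(tmp_path)
            extractor = PassbookInfoExt(job_id="1", image_path=tmp_path)
            result = extractor.extract()
        except Exception as e:
            return f"Error: {e}"

    # json.loads can also return a list or a string, which has no .get().
    if not isinstance(result, dict):
        return "Error: the model did not return a JSON object."

    # Convert to plain text output
    plain_text = (
        f"Account Holder Name: {result.get('Account Holder Name', 'N/A')}\n"
        f"Account Number: {result.get('Account Number', 'N/A')}\n"
        f"IFSC Code: {result.get('IFSC Code', 'N/A')}"
    )
    return plain_text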

# Gradio app layout
with gr.Blocks() as demo:
    gr.Markdown("## Bank Passbook Info Extractor")
    with gr.Row():
        with gr.Column():
            image_input = gr.Image(type="pil", label="Upload Bank Passbook Image")
            clear_btn = gr.Button("Clear")
        with gr.Column():
            result_output = gr.Textbox(label="Extracted Info", lines=10)
            submit_btn = gr.Button("Extract Info")

    submit_btn.click(fn=extract_passbook_info, inputs=image_input, outputs=result_output)
    clear_btn.click(fn=lambda: (None, ""), inputs=[], outputs=[image_input, result_output])

demo.launch()

It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://569b051e269f78338b.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
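
The prompt in the cell above asks the model to follow the IFSC and account-number rules, but nothing checks the reply afterwards. The sketch below shows one way the same rules could be re-applied in plain Python once the model has answered. The function names (`normalize_ifsc`, `normalize_account_number`, `check_reply`) are hypothetical helpers, not part of the notebook, and the rules are taken directly from the prompt text: 11-character IFSC with '0' in the fifth position, and a 10 to 16 digit account number.

```python
import json
import re

# Rules copied from the prompt: 4 capital letters + '0' + 6 alphanumerics.
IFSC_PATTERN = re.compile(r"[A-Z]{4}0[A-Z0-9]{6}")
# Account number: digits only, 10 to 16 of them.
ACCOUNT_PATTERN = re.compile(r"\d{10,16}")


def normalize_ifsc(raw):
    """Uppercase, drop whitespace, and turn a letter 'O' in position 5 into '0'."""
    code = re.sub(r"\s+", "", str(raw)).upper()
    if len(code) == 11 and code[4] == "O":
        code = code[:4] + "0" + code[5:]
    return code if IFSC_PATTERN.fullmatch(code) else None


def normalize_account_number(raw):
    """Drop spaces and hyphens, then accept only 10 to 16 digits."""
    digits = re.sub(r"[\s-]", "", str(raw))
    return digits if ACCOUNT_PATTERN.fullmatch(digits) else None


def check_reply(reply_text):
    """Parse the model's reply and return (fields, problems).

    problems lists the field names that are missing or fail the rules.
    """
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError as exc:
        return {}, [f"reply is not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return {}, ["reply is not a JSON object"]

    fields = {
        "Account Holder Name": str(data.get("Account Holder Name") or "").strip(),
        "Account Number": normalize_account_number(data.get("Account Number") or ""),
        "IFSC Code": normalize_ifsc(data.get("IFSC Code") or ""),
    }
    problems = [name for name, value in fields.items() if not value]
    return fields, problems
```

With the prompt's own example, `normalize_ifsc("CBINOR40012")` yields `"CBIN0R40012"`. Checking in code rather than relying on the prompt alone means a reply that ignores the rules is reported as a problem instead of being displayed as if it were valid.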


