In [1]:
import pytesseract
from PIL import Image

# Load the image
img = Image.open("image.png")

# Perform OCR
text = pytesseract.image_to_string(img)

print(text)

A dataset is instrumental for Optical Character Recognition (OCR)
tasks because it enables the model to learn and understand various
fonts, sizes, and orientations of text. This in turn leads to improved
OCR accuracy in real-world applications.



In [3]:
import os
from groq import Groq, APIConnectionError, APITimeoutError, APIError
from tenacity import retry, wait_exponential, stop_after_attempt

# Read the Groq API key from the environment; do not hard-code it in the notebook
GROQ_API_KEY = os.getenv("GROQ_API_KEY")

if not GROQ_API_KEY:
    raise ValueError("GROQ_API_KEY is not set. Please set your API key.")

client = Groq(
    api_key=GROQ_API_KEY,
)

# Retry up to 5 times, waiting between 4 and 10 seconds between attempts
@retry(wait=wait_exponential(multiplier=1, min=4, max=10), stop=stop_after_attempt(5))
def get_chat_completion():
    # Send the OCR output (`text` from the previous cell) as the user message
    try:
        chat_completion = client.chat.completions.create(
            messages=[
                {
                    "role": "user",
                    "content": text,
                }
            ],
            model="llama3-8b-8192",
        )
        return chat_completion
    # The Groq SDK raises its own error types rather than the ones from `requests`.
    # APITimeoutError is a subclass of APIConnectionError, so it is checked first.
    except APITimeoutError:
        print("Error: The request timed out. Retrying...")
        raise
    except APIConnectionError:
        print("Error: Failed to connect to the server. Retrying...")
        raise
    except APIError as e:
        print(f"Error: An error occurred. {e}")
        raise

try:
    chat_completion = get_chat_completion()
    print(chat_completion.choices[0].message.content)
except Exception as e:
    print(f"Failed after several retries: {e}")


That's correct! A dataset plays a crucial role in Optical Character Recognition (OCR) tasks. A well-curated dataset allows the model to learn and understand various aspects of text, including:

1. Fonts: Different fonts, such as serif, sans-serif, and script, may require specific features and patterns to be accurately recognized.
2. Sizes: Text of varying sizes, from tiny to large, can have distinct characteristics that the model needs to learn to recognize.
3. Orientations: Text can be written at different angles, such as upright, tilted, or even rotated, which requires the model to generalize to different orientations.

By training on a diverse dataset that covers these variations, an OCR model can improve its accuracy in real-world applications. A dataset with a large variety of fonts, sizes, and orientations can help the model learn robust features that enable it to recognize text in different scenarios.

Some examples of OCR datasets that are commonly used for training models incl
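
The `@retry` decorator in the cell above waits between failed attempts with an exponential backoff clamped to the range 4 to 10 seconds. The sketch below works out the resulting schedule in plain Python. It is an illustration only: the formula (`multiplier * base ** (n - 1)`, then clamped) and the helper name `backoff_schedule` are assumptions made here, and the exact formula used by tenacity may differ between versions.

```python
def backoff_schedule(attempts, multiplier=1.0, min_wait=4.0, max_wait=10.0, base=2.0):
    """Return the waits (in seconds) between consecutive attempts.

    Assumed formula: multiplier * base ** (n - 1) after the n-th failed
    attempt, clamped to the range [min_wait, max_wait].
    """
    waits = []
    for n in range(1, attempts):  # there is no wait after the final attempt
        raw = multiplier * base ** (n - 1)
        waits.append(max(min_wait, min(raw, max_wait)))
    return waits


# Five attempts mean four waits
print(backoff_schedule(5))  # [4.0, 4.0, 4.0, 8.0]
```

Under these assumptions the first three raw delays (1, 2, and 4 seconds) are all raised to the 4-second minimum, so a call that fails five times spends about 20 seconds waiting before the error is reported.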

# Pipeline

In [6]:
import os
import pytesseract
from PIL import Image
from groq import Groq, APIConnectionError, APITimeoutError, APIError
from tenacity import retry, wait_exponential, stop_after_attempt

# Read the Groq API key from the environment; do not hard-code it in the notebook
GROQ_API_KEY = os.getenv("GROQ_API_KEY")

if not GROQ_API_KEY:
    raise ValueError("GROQ_API_KEY is not set. Please set your API key.")

client = Groq(
    api_key=GROQ_API_KEY,
)

# Retry up to 5 times, waiting between 4 and 10 seconds between attempts
@retry(wait=wait_exponential(multiplier=1, min=4, max=10), stop=stop_after_attempt(5))
def get_chat_completion(text):
    # Send the OCR output as the user message
    try:
        chat_completion = client.chat.completions.create(
            messages=[
                {
                    "role": "user",
                    "content": text,
                }
            ],
            model="llama3-8b-8192",
        )
        return chat_completion
    # The Groq SDK raises its own error types rather than the ones from `requests`.
    # APITimeoutError is a subclass of APIConnectionError, so it is checked first.
    except APITimeoutError:
        print("Error: The request timed out. Retrying...")
        raise
    except APIConnectionError:
        print("Error: Failed to connect to the server. Retrying...")
        raise
    except APIError as e:
        print(f"Error: An error occurred. {e}")
        raise

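def extract_text_from_image(image_path):
    # Load the image and run OCR; the context manager closes the file afterwards
    with Image.open(image_path) as img:
        text = pytesseract.image_to_string(img)

    # Strip whitespace so that an image without text yields an empty string
    return text.strip()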

def process_image(image_path):
    text = extract_text_from_image(image_path)
    if text:
        try:
            chat_completion = get_chat_completion(text)
            print(chat_completion.choices[0].message.content)
        except Exception as e:
            print(f"Failed after several retries: {e}")
    else:
        print("No text found in the image.")

if __name__ == "__main__":
    image_path = "image.png"
    process_image(image_path)


That's absolutely correct! A dataset is crucial for Optical Character Recognition (OCR) tasks because it provides the model with a comprehensive range of examples to learn from. By including various fonts, sizes, and orientations of text, the dataset helps the model develop a robust understanding of how to recognize and classify characters from diverse sources.

Having a large and diverse dataset allows the OCR model to learn the following:

1. **Font variability**: The dataset exposes the model to different fonts, which enables it to recognize characters even if they are not presented in the same font as the training data.
2. **Size variability**: The model learns to recognize characters in different sizes, which is important for OCR systems that need to process text from documents or images with varying font sizes.
3. **Orientation variability**: The dataset includes characters in different orientations, such as horizontal, vertical, and diagonal, helping the model understand how to 
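
`process_image` above forwards the OCR output to the model as-is. Raw OCR text can contain runs of spaces, blank lines, and form-feed characters, so a small clean-up step before the API call can make the prompt tidier and makes the "no text found" check more reliable. A minimal sketch, assuming that dropping blank lines is acceptable for the images at hand; `normalize_ocr_text` is a hypothetical helper, not part of pytesseract:

```python
import re


def normalize_ocr_text(raw):
    """Collapse whitespace noise in OCR output; return "" if nothing is left."""
    # Treat form feeds (page separators) as line breaks
    text = raw.replace("\x0c", "\n")
    # Collapse runs of spaces and tabs inside each line and trim the ends
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()]
    # Drop empty lines and rejoin the rest
    return "\n".join(line for line in lines if line)


sample = "A dataset is  instrumental\n\n  for OCR tasks \n\x0c"
print(repr(normalize_ocr_text(sample)))
```

With a helper like this, `process_image` could call `normalize_ocr_text(text)` first and treat an empty result as "No text found in the image."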

In [7]:
import os
import pytesseract
from PIL import Image
from groq import Groq, APIConnectionError, APITimeoutError, APIError
from tenacity import retry, wait_exponential, stop_after_attempt
import pickle

# Read the Groq API key from the environment; do not hard-code it in the notebook
GROQ_API_KEY = os.getenv("GROQ_API_KEY")

if not GROQ_API_KEY:
    raise ValueError("GROQ_API_KEY is not set. Please set your API key.")

class ImageToTextPipeline:
    def __init__(self, api_key):
        self.api_key = api_key
        self.client = Groq(api_key=self.api_key)

    # Retry up to 5 times, waiting between 4 and 10 seconds between attempts
    @retry(wait=wait_exponential(multiplier=1, min=4, max=10), stop=stop_after_attempt(5))
    def get_chat_completion(self, text):
        # Send the OCR output as the user message
        try:
            chat_completion = self.client.chat.completions.create(
                messages=[
                    {
                        "role": "user",
                        "content": text,
                    }
                ],
                model="llama3-8b-8192",
            )
            return chat_completion
        # The Groq SDK raises its own error types rather than the ones from `requests`.
        # APITimeoutError is a subclass of APIConnectionError, so it is checked first.
        except APITimeoutError:
            print("Error: The request timed out. Retrying...")
            raise
        except APIConnectionError:
            print("Error: Failed to connect to the server. Retrying...")
            raise
        except APIError as e:
            print(f"Error: An error occurred. {e}")
            raise

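    def extract_text_from_image(self, image_path):
        # Load the image and run OCR; the context manager closes the file afterwards
        with Image.open(image_path) as img:
            text = pytesseract.image_to_string(img)

        # Strip whitespace so that an image without text yields an empty string
        return text.strip()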

    def process_image(self, image_path):
        text = self.extract_text_from_image(image_path)
        if text:
            try:
                chat_completion = self.get_chat_completion(text)
                print(chat_completion.choices[0].message.content)
            except Exception as e:
                print(f"Failed after several retries: {e}")
        else:
            print("No text found in the image.")

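    def save_pipeline(self, filename):
        # Temporarily remove the client and the API key before pickling,
        # so that the key is not written to disk
        client, api_key = self.client, self.api_key
        self.client = None
        self.api_key = None
        try:
            with open(filename, 'wb') as f:
                pickle.dump(self, f)
        finally:
            # Restore the live object even if pickling fails
            self.client, self.api_key = client, api_key

    @classmethod
    def load_pipeline(cls, filename, api_key):
        # Only unpickle files you created yourself; pickle can execute arbitrary code
        with open(filename, 'rb') as f:
            pipeline = pickle.load(f)
        # Reinitialize the key and the client
        pipeline.api_key = api_key
        pipeline.client = Groq(api_key=api_key)
        return pipeline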

if __name__ == "__main__":
    api_key = GROQ_API_KEY
    pipeline = ImageToTextPipeline(api_key)
    
    # Get the image path from the user
    image_path = input("Please enter the path to the image: ")
    
    # Process the image
    pipeline.process_image(image_path)

    # Save the pipeline
    pipeline.save_pipeline('vatta.pkl')



That's correct! A dataset is crucial for optical character recognition (OCR) tasks as it provides the model with a large amount of labeled data to learn from. This labeled data includes examples of text in various fonts, sizes, and orientations, which helps the model train and become proficient in recognizing and understanding different types of text.

With a well-curated dataset, the OCR model can learn to:

1. Identify and recognize various fonts, including serif, sans-serif, script, and decorative fonts.
2. Handle text in different sizes, ranging from tiny headings to large body text.
3. Recognize text in different orientations, including upright text, rotated text, and text with skewing.
4. Detect and correct common OCR errors, such as character substitutions, insertions, and deletions.

By training the model on a large and diverse dataset, the model becomes more accurate and robust, which leads to improved OCR performance in real-world applications, such as:

1. Document scanning 
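
The pipeline object in the cell above holds little state beyond its API client, so one alternative to pickling the whole instance is to persist only its plain settings and rebuild the client on load. The sketch below illustrates that idea with JSON. It is a different technique from the `save_pipeline`/`load_pipeline` methods in the notebook, and the `PipelineConfig` class and its `lang` field are hypothetical names introduced here for illustration.

```python
import json
import os
import tempfile


class PipelineConfig:
    """Hypothetical container for the pipeline's plain, non-secret settings."""

    def __init__(self, model="llama3-8b-8192", lang="eng"):
        self.model = model  # chat model name used in the notebook
        self.lang = lang    # assumed OCR language code (illustrative field)

    def save(self, filename):
        # Only plain settings are written; no client object and no API key
        with open(filename, "w", encoding="utf-8") as f:
            json.dump({"model": self.model, "lang": self.lang}, f)

    @classmethod
    def load(cls, filename):
        with open(filename, "r", encoding="utf-8") as f:
            return cls(**json.load(f))


path = os.path.join(tempfile.mkdtemp(), "pipeline.json")
PipelineConfig().save(path)
restored = PipelineConfig.load(path)
print(restored.model, restored.lang)
```

A JSON file like this can be inspected and edited by hand, and loading it does not execute code, which is the main trade-off against pickle.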

In [8]:
# Load the pipeline and process another image
loaded_pipeline = ImageToTextPipeline.load_pipeline('vatta.pkl', api_key)
image_path = input("Please enter the path to another image: ")
loaded_pipeline.process_image(image_path)

It appears to be a cancelled cheque issued by the State Bank of India in favor of NAGAMALLI ENCLAVE FLAT OWNERS‘ ASSOCIATION VSP.

The details on the cheque are as follows:

* Account No: 10849853744
* Customer Name: NAGAMALLI ENCLAVE FLAT OWNERS‘ ASSOCIATION VSP
* Address: 14-25-16, F-3RD FLOOR, NAGAMALLI ENCLAVE, MAHARANIPETA
* Phone: 2551284
* Email: SA.00754@SBI.CO
* Branch Code: 754 (Visakhapatnam branch)
* Date of Issue: 20/07/2018
* Amount: ₹87,009.00

The back of the cheque appears to be a registration form for a nomination, but it seems to be incomplete and has some handwritten notes and corrections.
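
The output above repeats an account number and contact details found in the image. If such fields should not be printed or sent to a hosted model, one option is to mask them in the OCR text first. A minimal sketch with regular expressions, under loudly stated assumptions: any run of nine or more digits is treated as an account number, the e-mail pattern is deliberately simple, and `mask_sensitive` is a hypothetical helper. The sample string is made up.

```python
import re

# Assumption: a run of 9 or more digits is an account number
ACCOUNT_RE = re.compile(r"\b\d{9,}\b")
# Simple e-mail pattern; it will miss unusual addresses
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def mask_sensitive(text):
    """Mask long digit runs (keeping the last four digits) and e-mail addresses."""

    def mask_digits(match):
        digits = match.group(0)
        return "*" * (len(digits) - 4) + digits[-4:]

    text = ACCOUNT_RE.sub(mask_digits, text)
    return EMAIL_RE.sub("[email removed]", text)


print(mask_sensitive("Account No: 123456789012, contact: someone@example.com"))
```

Pattern-based masking like this is only a first filter; names and addresses would need a different approach.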
