<a href="https://colab.research.google.com/github/CatFatOw/Hack_Rice_2025/blob/main/Hackathon_Work_Segmentor.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Extract text from images (and turn it into flash cards or a summary)

## PaddleOCR Text Extraction Pipeline

This project provides a Python pipeline for performing OCR (Optical Character Recognition) on images using PaddleOCR. The pipeline allows you to:  

- Run OCR on a folder of images  
- Save predicted images with bounding boxes  
- Save recognized text to JSON files  
- Extract text from JSON results  
- Compare original and predicted images visually  
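
The "extract text from JSON results" step can be sketched for a whole output folder at once. This is a minimal sketch, assuming each result file stores its recognized lines under the same `rec_texts` key that `extract_text` in this notebook reads; `collect_texts` is a hypothetical helper name and is not part of PaddleOCR.

```python
import json
from pathlib import Path


def collect_texts(json_dir):
    """Return {file name: recognized text} for every *.json file in json_dir."""
    texts = {}
    for path in sorted(Path(json_dir).glob("*.json")):
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)
        # "rec_texts" is assumed to hold the recognized lines;
        # files without that key map to an empty string.
        texts[path.name] = "\n".join(data.get("rec_texts", []))
    return texts
```

Calling `collect_texts("/content/preds")` would then give one text blob per result file, which is convenient when a folder of images was processed in one run.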

---

## 📦 Installation

Make sure you have the required packages installed:

```bash
git clone https://github.com/PaddlePaddle/PaddleOCR.git
pip install paddleocr paddlepaddle opencv-python matplotlib tqdm
```

In [32]:
from tqdm import tqdm
from paddleocr import PaddleOCR
import cv2
import matplotlib.pyplot as plt
import os
import json
import requests


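def model_predict(img_dir, output_dir_name):
  """
  Runs PaddleOCR on every image in a directory and saves the results.

  img_dir (string): path to the directory of input images
  output_dir_name (string): directory where annotated images and JSON results are saved
  """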
  ocr = PaddleOCR(
      use_doc_orientation_classify=True,
      use_doc_unwarping=True,
      use_textline_orientation=False)

  # Run OCR on each file in the directory, with one progress bar for the whole batch
  for img_name in tqdm(sorted(os.listdir(img_dir)), desc="Please be patient"):
    img_path = os.path.join(img_dir, img_name)
    result = ocr.predict(input=img_path)
    for res in result:
      res.print()
      # Save the annotated image and the recognized text for this input
      res.save_to_img(output_dir_name)
      res.save_to_json(output_dir_name)



def display_stuff(original, predicted):
  """
  Shows the original image and the predicted (annotated) image side by side.

  original (string): path to original image
  predicted (string): path to predicted image
  """

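  og_img = cv2.imread(original)
  pred_img = cv2.imread(predicted)

  # cv2.imread returns None instead of raising when a file cannot be read
  if og_img is None or pred_img is None:
    raise FileNotFoundError(f"Could not read {original} or {predicted}")

  # OpenCV loads images as BGR; matplotlib expects RGB
  og_img = cv2.cvtColor(og_img, cv2.COLOR_BGR2RGB)
  pred_img = cv2.cvtColor(pred_img, cv2.COLOR_BGR2RGB)

  og_img = cv2.resize(og_img, (500, 500))
  pred_img = cv2.resize(pred_img, (500, 500))

  # Draw both images side by side under one shared title
  plt.figure(figsize=(10, 5))
  plt.subplot(1, 2, 1)
  plt.imshow(og_img)
  plt.axis("off")
  plt.subplot(1, 2, 2)
  plt.imshow(pred_img)
  plt.axis("off")
  plt.suptitle("Original VS Predicted")

  plt.show()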


def extract_text(text_json):
  """
  Reads a PaddleOCR result JSON and returns the recognized text, one entry per line.

  text_json (string): path to the result JSON file
  """
  with open(text_json, "r", encoding="utf-8") as f:
    data = json.load(f)
  # Keep only the recognized text strings
  words = data["rec_texts"]
  return "\n".join(words)

text_json = "/content/preds/xi_test_res.json"
text = extract_text(text_json)


def use_gemini_to_summerize(prompt, text):
  """
  Sends the extracted text and a user prompt to Gemini and returns the reply text.

  prompt (string): instruction describing what to do with the text
  text (string): text extracted from the OCR results
  """

  url = "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent"

  headers = {
    "Content-Type": "application/json",
    # Never hard-code a real key in the notebook. GEMINI_API_KEY is an assumed
    # environment variable name; set it to your own API key before running.
    "X-goog-api-key": os.environ["GEMINI_API_KEY"]
  }

  data = {
    "contents": [
        {
            "parts": [
                {
                    "text": f"Based on this text: {text}\n\n {prompt}."
                }
            ]
        }
    ]
  }

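  response = requests.post(url, headers=headers, json=data, timeout=60)
  # Surface HTTP errors here instead of failing with a KeyError below
  response.raise_for_status()

  return response.json()["candidates"][0]["content"]["parts"][0]["text"]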



prompt = "Please generate a *markdown code* multiple choice quiz and give me the answers on a new line in 中文 and Vietnamese with detailed explanations for a student. Add tables or sections"

return_text = use_gemini_to_summerize(prompt, text)

print(return_text)


# Predicts on a folder of images
#img_dir = "/content/test_images"
#out_dir = "/content/preds"
#model_predict(img_dir, out_dir)

# Displays the images
#original = "/content/test_images/xi_test.png"
#predicted = "/content/model_output/xi_test_ocr_res_img.png"

#display_stuff(original, predicted)


# Extracts the text
#text_json = "/content/preds/xi_test_res.json"
#words = extract_text(text_json)
#print(words)



Okay, here is a multiple-choice quiz based on the provided text, along with the answers, explanations in both Chinese and Vietnamese, and formatted using Markdown.

```markdown
## 人民日报 (Renmin Ribao - People's Daily) 2017年10月 Quiz

**根据以上《人民日报》文章，请选择最准确的答案：**

**1. 2017年10月，习近平与哪位外国领导人进行了通话？**

a) 普京 (Putin)
b) 默克尔 (Merkel)
c) 特朗普 (Trump)
d) 安倍晋三 (Abe Shinzo)

**2. 特朗普在通话中对习近平表达了什么祝贺？**

a) 中国国庆节快乐 (Happy Chinese National Day)
b) 中共十八大胜利闭幕 (Successful conclusion of the 18th CPC National Congress)
c) 中共十九大胜利闭幕以及再次当选中共中央总书记 (Successful conclusion of the 19th CPC National Congress and his re-election as General Secretary of the CPC Central Committee)
d) 在联合国大会上取得成功 (Success at the UN General Assembly)

**3.  根据文章，中国共产党第十九届中央委员会第一次全体会议在哪一天举行？**

a) 2017年10月5日
b) 2017年10月25日
c) 2017年10月2日
d) 2017年10月26日

**4. 谁当选为中共中央总书记？**

a) 李克强 (Li Keqiang)
b) 栗战书 (Li Zhanshu)
c) 习近平 (Xi Jinping)
d) 王沪宁 (Wang Huning)

**5. 以下哪一项不属于中央政治局常务委员会委员？**

a) 习近平
b) 李克强
c) 栗战书
d) 丁薛祥
---
**答案：**

1.  c) 特朗普 (Tru
```
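
The expression `response.json()["candidates"][0]["content"]["parts"][0]["text"]` used in `use_gemini_to_summerize` assumes the request succeeded and returned at least one candidate with one text part. A more defensive way to read the payload might look like the sketch below. `first_candidate_text` is a hypothetical helper, and the `error` / `message` layout is the usual Google API error shape, assumed here rather than taken from this notebook.

```python
def first_candidate_text(payload):
    """Return the text of the first candidate, or raise ValueError with context."""
    # Error responses are assumed to carry an "error" object instead of "candidates".
    if "error" in payload:
        message = payload["error"].get("message", "unknown error")
        raise ValueError(f"Gemini API error: {message}")

    candidates = payload.get("candidates") or []
    if not candidates:
        raise ValueError("Gemini response contains no candidates")

    parts = candidates[0].get("content", {}).get("parts", [])
    # A reply may be split over several parts, so join all text fields.
    return "".join(part.get("text", "") for part in parts)
```

With this helper, the last line of `use_gemini_to_summerize` could become `return first_candidate_text(response.json())`, which turns an invalid key or an empty reply into a readable error message.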

In [None]:
# 1️⃣ Clone the repo (or skip if already cloned)
!git clone https://github.com/CatFatOw/Hack_Rice_2025.git /content/Hack_Rice_2025

# 2️⃣ Change directory into the repo
%cd /content/Hack_Rice_2025

# 3️⃣ Check current branch
!git branch

# 4️⃣ If your branch is 'master' but you want 'main', rename it
!git branch -M main

# 5️⃣ Pull latest changes from GitHub to avoid conflicts
!git pull origin main

# 6️⃣ Copy your test images into the repo (if they are outside)
!mkdir -p test_images  # create folder if it doesn't exist
!cp /content/test_images/* ./test_images/

# 7️⃣ Add new/modified files
!git add test_images

# 8️⃣ Commit changes
!git commit -m "Add test images"

# 9️⃣ Push changes to GitHub
!git push -u origin main


fatal: destination path '/content/Hack_Rice_2025' already exists and is not an empty directory.
/content/Hack_Rice_2025
* main
From https://github.com/CatFatOw/Hack_Rice_2025
 * branch            main       -> FETCH_HEAD
Already up to date.
[main 8032b89] Add test images
 3 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 test_images/INL_Exceptional_Innovation_Wu_Michael.jpg
 create mode 100644 test_images/spot_it_test.png
 create mode 100644 test_images/xi_test.png
fatal: could not read Username for 'https://github.com': No such device or address


In [None]:
!git config --global user.email "michaelwufluffy@gmail.com"
!git config --global user.name "CatFatOw"


In [None]:
# 1️⃣ Clone the repo (or skip if already cloned)
!git clone https://github.com/CatFatOw/Hack_Rice_2025.git /content/Hack_Rice_2025

# 2️⃣ Change directory into the repo
%cd /content/Hack_Rice_2025

# 3️⃣ Check current branch
!git branch

# 4️⃣ If your branch is 'master' but you want 'main', rename it
!git branch -M main

# 5️⃣ Pull latest changes from GitHub to avoid conflicts
!git pull origin main

# 6️⃣ Copy your test images into the repo (if they are outside)
!mkdir -p test_images  # create folder if it doesn't exist
!cp /content/test_images/* ./test_images/

# 7️⃣ Add new/modified files (this fails: the absolute path is outside the repo)
!git add /content/test_images

# 8️⃣ Stage the copied images using the path inside the repo
!git add test_images


# 9️⃣ Push changes to GitHub
!git push -u origin main


fatal: destination path '/content/Hack_Rice_2025' already exists and is not an empty directory.
/content/Hack_Rice_2025
* main
From https://github.com/CatFatOw/Hack_Rice_2025
 * branch            main       -> FETCH_HEAD
Already up to date.
fatal: /content/test_images: '/content/test_images' is outside repository at '/content/Hack_Rice_2025'
fatal: could not read Username for 'https://github.com': No such device or address
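
The `could not read Username` error means Git has no credentials in the Colab session. Setting `user.name` and `user.email` only changes the commit identity and does not authenticate the push. One common workaround is to push to an HTTPS URL that embeds a personal access token. The sketch below only builds that URL; `authenticated_remote` and the `GITHUB_TOKEN` variable name are assumptions for illustration, not something Colab or Git provides.

```python
import os
from urllib.parse import quote


def authenticated_remote(user, owner, repo, token):
    """Build an HTTPS remote URL that carries a personal access token."""
    if not token:
        raise ValueError("a GitHub token is required")
    # Percent-encode the token so special characters cannot break the URL.
    return f"https://{user}:{quote(token, safe='')}@github.com/{owner}/{repo}.git"


# GITHUB_TOKEN is an assumed environment variable that you would set yourself.
token = os.environ.get("GITHUB_TOKEN")
if token:
    url = authenticated_remote("CatFatOw", "CatFatOw", "Hack_Rice_2025", token)
```

In a notebook cell the result could then be used as `!git push {url} main`. Keep the token out of the notebook itself and out of saved cell output, since anyone with the URL can push to the repository.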
