<a href="https://colab.research.google.com/github/michaelwnau/consequential-products/blob/main/mindflayer_v1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Mindflayer: Analyzing Scientific Articles with OCR and Multimodal Models**

### This Colab notebook demonstrates how to:

- Ingest and extract text from a PDF scientific article.
- Extract images, figures, and tables from the PDF.
- Apply OCR to images to extract embedded text.
- Use multimodal models to generate captions for images.
- Display and interpret the extracted content.

## Problem Statement


> "OBJECTIVE:
>The Defense Threat Reduction Agency (DTRA) seeks to develop AI/ML models and pipelines capable of identifying, extracting, and processing elements from scanned technical and scientific documents. This project aims to automate the extraction of tables, plots, photos, and other elements embedded within structured and unstructured text, ensuring high fidelity and accuracy in a production environment."

Source: "DTRA243-003 AI/ML Data Extraction from Scientific Documents." See the [full problem statement](https://www.dodsbirsttr.mil/submissions/api/public/download/solicitationDocuments?solicitation=DOD_SBIR_2024_P1_C3&documentType=INSTRUCTIONS&component=DTRA).




## **Table of Contents**
1. Setup
2. Ingest the article
3. Extract the images from the PDF
4. Apply OCR to Extract Text from Images
5. Interpret Images Using Multimodal Models
6. Extract and Interpret Tables
7. Display Results
8. Advanced Interpretation (Optional)

## **1. Setup**
First, install the necessary libraries by running the cell below.

In [None]:
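# Install libraries
# pytesseract is only a Python wrapper: it calls the Tesseract OCR engine,
# which is a separate system package (assumed not preinstalled on the runtime).
!apt-get -qq install -y tesseract-ocr
!pip install PyPDF2
!pip install PyMuPDF
!pip install pytesseract
!pip install Pillow
!pip install transformers
!pip install torch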

In [None]:
# Import the libraries
import PyPDF2
import fitz  # PyMuPDF
import pytesseract
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration
import torch
import os
from IPython.display import Image as IPythonImage, display
import pandas as pd
import io
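
Before moving on, it can help to confirm that the setup actually succeeded. The following is a minimal sketch, not part of the original notebook: the helper name `missing_dependencies` and the two constant lists are hypothetical, and the check relies only on the standard library. It looks up each module by its import name (PyMuPDF imports as `fitz`, Pillow as `PIL`) and checks that a `tesseract` executable is on the `PATH`, since `pytesseract` depends on it.

```python
import importlib.util
import shutil

# Import names used by this notebook (PyMuPDF imports as "fitz", Pillow as "PIL").
REQUIRED_MODULES = ["PyPDF2", "fitz", "pytesseract", "PIL", "transformers", "torch", "pandas"]
# pytesseract shells out to the Tesseract executable, so it must be on PATH.
REQUIRED_BINARIES = ["tesseract"]


def missing_dependencies(modules=REQUIRED_MODULES, binaries=REQUIRED_BINARIES):
    """Return the names of modules and executables that cannot be found."""
    # find_spec returns None for a top-level module that is not installed.
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    # shutil.which returns None when no executable with that name is on PATH.
    missing += [name for name in binaries if shutil.which(name) is None]
    return missing


missing = missing_dependencies()
if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All dependencies found.")
```

If anything is reported missing, re-running the install cell (or restarting the runtime afterwards) is usually the first thing to try.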