# Datasheet Comparator - MVP

This notebook is part of a project that compares technical specifications from two electronic component datasheets.

Initially, the PDFs are provided as local files, but future versions will allow users to:
- Select datasheets interactively from within the notebook
- Search and retrieve part information from distributor APIs (e.g. Mouser, Digi-Key)
- Use AI to extract, analyze, and summarize key specifications and differences

The goal is to support engineers in identifying part changes, upgrades, or replacements efficiently.

# 📌 Section A: Setup

In [None]:
import os
import requests
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
from openai import OpenAI
import fitz  # PyMuPDF for PDF parsing
import pandas as pd

# Load OpenAI API key from environment variable (recommended)

In [None]:
load_dotenv(override=True)
api_key = os.getenv("OPENAI_API_KEY")

In [None]:
openai = OpenAI()
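
Since `OpenAI()` reads the key from the environment, a missing or malformed key only surfaces on the first API call. The sketch below shows one way to check it up front. `check_api_key` is a hypothetical helper, not part of the OpenAI SDK, and the checks are assumptions about common `.env` mistakes.

```python
import os

def check_api_key(key):
    """Return a short status string for an API key value (hypothetical helper)."""
    if not key:
        return "missing"
    if key != key.strip():
        # Stray spaces or newlines are an easy mistake in a .env file
        return "has surrounding whitespace"
    return "ok"

status = check_api_key(os.getenv("OPENAI_API_KEY"))
print(f"OPENAI_API_KEY status: {status}")
```

Running this right after `load_dotenv` gives a clear message before any request is sent.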

# Define paths to datasheets
💬 **Note:** These example datasheet paths will later be replaced by a user-driven file selection dialog within the Jupyter notebook; optionally, this section could be extended to fetch component data directly from distributor websites.

In [None]:
pdf_path_1 = "./datasheets/part_old.pdf"
pdf_path_2 = "./datasheets/part_new.pdf"
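
Because the paths are hard-coded for now, it may help to confirm that both files exist before parsing. This is a minimal sketch using only the standard library. `missing_files` is a hypothetical helper name, and the two paths simply repeat the example values above.

```python
from pathlib import Path

def missing_files(paths):
    """Return the subset of paths that do not point to an existing file."""
    return [str(p) for p in paths if not Path(p).is_file()]

missing = missing_files(["./datasheets/part_old.pdf", "./datasheets/part_new.pdf"])
if missing:
    print("Missing datasheets:", ", ".join(missing))
else:
    print("Both datasheets found.")
```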

# 📌 Section B: Extract text from datasheets

In [None]:
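def extract_text_from_pdf(path):
    """Return the plain text of all pages in the PDF at `path`."""
    with fitz.open(path) as doc:
        # Join pages with a newline so text from adjacent pages does not fuse
        return "\n".join(page.get_text() for page in doc)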
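
The extracted text is sent to the model in full, and long datasheets may exceed what a single request can handle. One possible mitigation, sketched here under assumptions, is to normalize whitespace and cap the length. `truncate_text` is a hypothetical helper, and the default of 60,000 characters is an arbitrary placeholder rather than a documented model limit.

```python
def truncate_text(text, max_chars=60_000):
    """Collapse runs of whitespace and cap the result at max_chars characters.

    Note: collapsing whitespace also flattens table layout, which may or may
    not matter for a given datasheet.
    """
    cleaned = " ".join(text.split())
    return cleaned[:max_chars]

sample = "Supply voltage:   3.3 V\n\n\nPackage:\tSOIC-8"
print(truncate_text(sample))
```

A cut at a fixed character count can drop specifications that appear late in the document, so a page-based or section-based selection could be a better fit in practice.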

# 📌 Section C: Use ChatGPT to summarize and compare

## Section C.1: Define system_prompt

In [None]:
system_prompt = "You are a technical assistant helping to compare electronic component datasheets."

## Section C.2: Define user_prompt, summarize and compare

In [None]:
def summarize_datasheet(text, part_name, system_prompt):
    user_prompt = f"""
    Summarize the most important technical characteristics of the electronic component '{part_name}' based on this datasheet text:
    ---
    {text}
    ---
    Give a structured list of properties like voltage, current, dimensions, operating temperature, etc.
    """
    response = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ]
    )
    return response.choices[0].message.content
    
def compare_parts(text1, text2, system_prompt):
    user_prompt = f"""
    Compare the following two summaries of electronic components and evaluate whether the second part is a valid replacement for the first one.
    Identify any differences in electrical specs, mechanical dimensions, and compliance with medical device requirements.
    Suggest what changes would be required to use the second part in place of the first (e.g., schematic/layout changes).
    
    Old Part Summary:
    {text1}

    New Part Summary:
    {text2}

    Provide a table of differences and a short final recommendation.
    """
    response = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ]
    )
    return response.choices[0].message.content
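
The indented f-strings above send their leading spaces to the model as part of the prompt. A possible alternative, sketched here, is to dedent a template once and fill it in afterwards. `SUMMARY_TEMPLATE` and `build_summary_prompt` are hypothetical names; the wording is copied from `summarize_datasheet`. Dedenting before substitution matters, because datasheet text with unindented lines would otherwise leave `textwrap.dedent` with no common indentation to remove.

```python
import textwrap

# Dedent the template first, then substitute the datasheet text.
SUMMARY_TEMPLATE = textwrap.dedent("""\
    Summarize the most important technical characteristics of the electronic component '{part_name}' based on this datasheet text:
    ---
    {text}
    ---
    Give a structured list of properties like voltage, current, dimensions, operating temperature, etc.
    """)

def build_summary_prompt(text, part_name):
    """Fill the dedented template; braces inside `text` are left untouched."""
    return SUMMARY_TEMPLATE.format(part_name=part_name, text=text)

print(build_summary_prompt("Vcc: 5 V", "Old Part"))
```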

# 📌 Section D: Put it all together and print it nicely.

In [None]:
def display_summary_and_compare(part1, part2, system_prompt):
    content1 = extract_text_from_pdf(part1)
    content2 = extract_text_from_pdf(part2)
    summary1 = summarize_datasheet(content1, "Old Part", system_prompt)
    summary2 = summarize_datasheet(content2, "New Part", system_prompt)
    compare = compare_parts(summary1, summary2, system_prompt)
    # Separate the three parts so their Markdown does not run together
    report = "\n\n---\n\n".join([summary1, summary2, compare])
    display(Markdown(report))

In [None]:
display_summary_and_compare(pdf_path_1, pdf_path_2, system_prompt)

# 📌 Section E: Next Steps (to be developed)

- Parse key properties into structured tables (e.g., using regex or ChatGPT)
- Automatically download datasheets from distributor websites
- Search for compatible parts via web APIs
- Export results to Excel or Markdown
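
As a starting point for the first and last items, the sketch below parses `Name: value` lines with a regex and renders a side-by-side Markdown table. It assumes the model returns one property per line in that shape, which is not guaranteed. `parse_properties` and `to_markdown_table` are hypothetical helpers, and the sample summaries are made up for illustration.

```python
import re

# Assumed line shape: "- Name: value" or "Name: value"
PROPERTY_RE = re.compile(r"^\s*[-*]?\s*(?P<name>[^:\n]+?)\s*:\s*(?P<value>.+?)\s*$")

def parse_properties(summary):
    """Map property names to values for every line matching the assumed shape."""
    props = {}
    for line in summary.splitlines():
        match = PROPERTY_RE.match(line)
        if match:
            props[match.group("name")] = match.group("value")
    return props

def to_markdown_table(old, new):
    """Render two property dicts as a Markdown table, one row per property."""
    names = list(old) + [name for name in new if name not in old]
    rows = ["| Property | Old Part | New Part |", "| --- | --- | --- |"]
    for name in names:
        rows.append(f"| {name} | {old.get(name, '')} | {new.get(name, '')} |")
    return "\n".join(rows)

# Made-up example summaries
old = parse_properties("- Supply voltage: 3.3 V\n- Package: SOIC-8\nNot a property line")
new = parse_properties("- Supply voltage: 5 V\n- Max current: 2 A")
print(to_markdown_table(old, new))
```

The resulting string can be passed to `display(Markdown(...))` or written to a `.md` file. Summaries that use bold labels or nested lists would need a more tolerant pattern.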