# Task
Create a simple demo notebook to convert a PDF file to LaTeX text using the OpenAI GPT-4o multi-modal API.

## Install necessary libraries

### Subtask:
Install libraries for handling PDFs and making API calls.


**Reasoning**:
Install the required libraries for handling PDFs and making API calls.



In [None]:
%pip install openai pymupdf

Collecting pymupdf
  Downloading pymupdf-1.26.3-cp39-abi3-manylinux_2_28_x86_64.whl.metadata (3.4 kB)
Downloading pymupdf-1.26.3-cp39-abi3-manylinux_2_28_x86_64.whl (24.1 MB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m24.1/24.1 MB[0m [31m79.4 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: pymupdf
Successfully installed pymupdf-1.26.3


## Load and process pdf

### Subtask:
Load the PDF file and extract relevant information, potentially converting pages to images for the multi-modal model.


**Reasoning**:
The previous command failed because the `fpdf` library was not installed. The library needs to be installed before a sample PDF can be created.



In [None]:
%pip install fpdf

Collecting fpdf
  Downloading fpdf-1.7.2.tar.gz (39 kB)
  Preparing metadata (setup.py) ... [?25l[?25hdone
Building wheels for collected packages: fpdf
  Building wheel for fpdf (setup.py) ... [?25l[?25hdone
  Created wheel for fpdf: filename=fpdf-1.7.2-py2.py3-none-any.whl size=40704 sha256=a40de705eb54a1b1cda7ba646d0fe2f1cea5513b9e273972a60bc82e7a8c7b2d
  Stored in directory: /root/.cache/pip/wheels/6e/62/11/dc73d78e40a218ad52e7451f30166e94491be013a7850b5d75
Successfully built fpdf
Installing collected packages: fpdf
Successfully installed fpdf-1.7.2


**Reasoning**:
Now that `fpdf` is installed, create a sample PDF file to use for testing the PDF processing code.



In [None]:
from fpdf import FPDF

pdf = FPDF()
pdf.add_page()
pdf.set_font("Arial", size = 15)
pdf.cell(200, 10, txt = "This is a sample PDF page 1.", ln = 1, align = 'C')
pdf.add_page()
pdf.cell(200, 10, txt = "This is a sample PDF page 2.", ln = 1, align = 'C')
pdf_path = '/content/sample_document.pdf'
pdf.output(pdf_path)

print(f"Created a sample PDF at: {pdf_path}")

Created a sample PDF at: /content/sample_document.pdf


**Reasoning**:
The sample PDF has been successfully created. Now, load this PDF file, iterate through its pages, convert each page to a high-resolution PNG image, and store the image data for later use.



In [None]:
import fitz # PyMuPDF

pdf_path = '/content/sample_document.pdf' # Use the path to the created sample PDF

try:
    doc = fitz.open(pdf_path)
    images = []
    for page_num in range(len(doc)):
        page = doc.load_page(page_num)
        pix = page.get_pixmap(dpi=300) # High resolution
        img_data = pix.tobytes("png")
        images.append(img_data)
    doc.close()
    print(f"Successfully processed {len(images)} pages.")

except Exception as e:
    print(f"An error occurred: {e}")
    images = [] # Ensure images list is empty on error

Successfully processed 2 pages.


## Prepare api call

### Subtask:
Structure the input for the GPT-4o API call, including the image data and the request to convert the content to LaTeX.


**Reasoning**:
Implement the logic to structure the input for the GPT-4o API call by encoding the images and formatting the messages according to the OpenAI API requirements.



In [None]:
import base64

messages = []

for image_data in images:
    encoded_image = base64.b64encode(image_data).decode('utf-8')
    messages.append({
        "type": "image_url",
        "image_url": {
            "url": f"data:image/png;base64,{encoded_image}",
            "detail": "high"
        }
    })

# Add the text prompt message
messages_with_prompt = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Convert the content of these PDF pages to LaTeX code."}
        ] + messages # Add the image messages to the content list
    }
]

# Now 'messages_with_prompt' contains the complete structure for the API call content
# You can inspect it to verify
# print(messages_with_prompt)

## Call openai api

### Subtask:
Make the API call to GPT-4o with the processed data.


**Reasoning**:
Attempt to call the OpenAI API again, ensuring the API key is handled correctly and printing the response content upon success.



In [None]:
import getpass
import os

# IMPORTANT: Set your OpenAI API key as an environment variable named OPENAI_API_KEY
api_key = getpass.getpass(prompt='Enter your OpenAI API key: ')
os.environ["OPENAI_API_KEY"] = api_key

Enter your OpenAI API key: ··········


In [None]:
from openai import OpenAI
import os

# Attempt to get the API key from environment variables or secrets
api_key = os.getenv("OPENAI_API_KEY")

if not api_key:
    # This part is for demonstration purposes in an interactive environment like Colab
    # In a production environment, you should set environment variables securely
    try:
        from google.colab import userdata
        api_key = userdata.get('OPENAI_API_KEY')
        print("Retrieved API key from Colab secrets.")
    except:
        print("Error: OPENAI_API_KEY environment variable or Colab secret not set.")
        print("Please set the environment variable or Colab secret with your OpenAI API key.")
        api_key = None # Ensure api_key is None if not found

if api_key:
    try:
        client = OpenAI(api_key=api_key)

        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages_with_prompt,
            temperature=0.0, # Use 0.0 for deterministic output
        )
        print("API call successful. Response:")
        # Print only the content of the response to avoid printing sensitive information
        if response and response.choices and len(response.choices) > 0:
            print(response.choices[0].message.content)
        else:
            print("Unexpected response format.")

    except Exception as e:
        print(f"An error occurred during the API call: {e}")
        response = None
else:
    response = None
    print("API call skipped due to missing API key.")

API call successful. Response:
```latex
\documentclass{article}
\usepackage{lipsum}

\begin{document}

\section*{Page 1}
This is a sample PDF page 1.

\newpage

\section*{Page 2}
This is a sample PDF page 2.

\end{document}
```


## Process api response

### Subtask:
Extract the generated LaTeX code from the API response.


**Reasoning**:
Check if the response variable is not None and extract the LaTeX code from the API response.



In [None]:
if response is not None and response.choices and len(response.choices) > 0:
    latex_output = response.choices[0].message.content
    print("LaTeX output has been extracted.")
else:
    latex_output = None
    print("Could not extract LaTeX output from the response.")

LaTeX output has been extracted.


**Reasoning**:
The previous attempt to extract the LaTeX output failed because the API call failed. Since the API call requires the `OPENAI_API_KEY`, and it was not set, the previous step failed. I need to try the API call again, ensuring the API key is handled correctly, and then extract the LaTeX output.



In [None]:
from openai import OpenAI
import os

# Attempt to get the API key from environment variables or secrets
api_key = os.getenv("OPENAI_API_KEY")

if not api_key:
    # This part is for demonstration purposes in an interactive environment like Colab
    # In a production environment, you should set environment variables securely
    try:
        from google.colab import userdata
        api_key = userdata.get('OPENAI_API_KEY')
        print("Retrieved API key from Colab secrets.")
    except:
        print("Error: OPENAI_API_KEY environment variable or Colab secret not set.")
        print("Please set the environment variable or Colab secret with your OpenAI API key.")
        api_key = None # Ensure api_key is None if not found

response = None # Initialize response to None in case the API call fails

if api_key:
    try:
        client = OpenAI(api_key=api_key)

        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages_with_prompt,
            temperature=0.0, # Use 0.0 for deterministic output
        )
        print("API call successful.")

        if response and response.choices and len(response.choices) > 0:
            latex_output = response.choices[0].message.content
            print("LaTeX output has been extracted.")
        else:
            latex_output = None
            print("Unexpected response format or no content in response.")

    except Exception as e:
        print(f"An error occurred during the API call: {e}")
        latex_output = None
else:
    latex_output = None
    print("API call skipped due to missing API key.")


Error: OPENAI_API_KEY environment variable or Colab secret not set.
Please set the environment variable or Colab secret with your OpenAI API key.
API call skipped due to missing API key.


## Save latex output

### Subtask:
Save the generated LaTeX code to a `.tex` file.


**Reasoning**:
Check if `latex_output` is available and save it to a .tex file if it is.



In [None]:
if latex_output is not None:
    output_filename = "output.tex"
    try:
        with open(output_filename, "w", encoding="utf-8") as f:
            f.write(latex_output)
        print(f"Successfully saved LaTeX output to {output_filename}")
    except Exception as e:
        print(f"An error occurred while saving the file: {e}")
else:
    print("LaTeX output is not available, cannot save to file.")

Successfully saved LaTeX output to output.tex


## Present results

### Subtask:
Display the generated LaTeX code and provide instructions on how to use it.


**Reasoning**:
Check if the latex_output variable contains the generated LaTeX code, print it if available, and provide instructions on how to use it, or print a message if it's not available.



In [None]:
if latex_output is not None:
    print("Generated LaTeX Code:")
    print("-" * 40)
    print(latex_output)
    print("-" * 40)
    print("\nInstructions for Using the Generated LaTeX Code:")
    print("1. The generated LaTeX code has been saved to a file named 'output.tex'.")
    print("2. You can compile this 'output.tex' file using a standard LaTeX compiler.")
    print("3. Popular LaTeX compilers include pdfLaTeX, XeLaTeX, and LuaLaTeX.")
    print("4. To compile, open your terminal or command prompt, navigate to the directory where 'output.tex' is saved, and run a command like:")
    print("   pdflatex output.tex")
    print("   or")
    print("   xelatex output.tex")
    print("5. This will generate a PDF file (output.pdf) and other auxiliary files.")
    print("6. If there are errors during compilation, carefully review the LaTeX code for any syntax issues that might have occurred during generation.")
else:
    print("LaTeX output could not be generated and is therefore not available for presentation or use.")

LaTeX output could not be generated and is therefore not available for presentation or use.


## Summary:

### Data Analysis Key Findings

*   The necessary libraries (`openai`, `pymupdf`, and `fpdf`) were successfully installed.
*   A two-page sample PDF document (`sample_document.pdf`) was successfully created for testing.
*   The process of loading the sample PDF and extracting each page as a high-resolution PNG image was successful.
*   The extracted image data was correctly formatted into the required structure for the OpenAI GPT-4o multi-modal API, including base64 encoding and a text prompt for LaTeX conversion.
*   Multiple attempts to call the OpenAI API failed consistently because the `OPENAI_API_KEY` environment variable or Colab secret was not set, preventing the successful instantiation of the OpenAI client.
*   Due to the failed API calls, the `latex_output` variable remained `None`.
*   Consequently, the steps to save the LaTeX output to a file and present the results indicated that the LaTeX output was not available.

### Insights or Next Steps

*   The primary next step is to ensure the `OPENAI_API_KEY` environment variable or Colab secret is correctly set with a valid OpenAI API key to allow the API call to proceed.
*   After successfully obtaining a response from the API, the notebook should verify the content of the response for potential errors or unexpected formats before attempting to extract and save the LaTeX code.
