In [7]:
!pip -q install agno pypdf

In [9]:
import os, getpass
os.environ["OPENAI_API_KEY"] = getpass.getpass("Paste OPENAI_API_KEY: ")

Paste OPENAI_API_KEY: ··········


In [17]:
from pathlib import Path

WORKSPACE = Path("/content/workspace")

# Find your PDF
pdfs = list(WORKSPACE.glob("*.pdf"))
if not pdfs and (WORKSPACE/"paper.txt").exists():
    # if paper.txt is actually the PDF, rename back
    (WORKSPACE/"paper.txt").rename(WORKSPACE/"paper.pdf")

print("Files now:")
!ls -la /content/workspace

Files now:
total 20064
drwxr-xr-x 2 root root     4096 Feb 21 22:11 .
drwxr-xr-x 1 root root     4096 Feb 21 22:08 ..
-rw-r--r-- 1 root root 10267274 Feb 21 21:55 Denoising-Diffusion-Probabilistic-Models.pdf
-rw-r--r-- 1 root root 10267274 Feb 21 22:08 paper.txt


This cell checks that the workspace contains a PDF before further processing. It defines the workspace directory with Path, then searches that directory for .pdf files. If no PDF is found but a file named paper.txt exists, it assumes the PDF was accidentally renamed to paper.txt in a previous step and renames it back to paper.pdf. The check goes by file name only; it does not inspect file contents, so it cannot confirm that a file is a valid PDF. Finally, it lists the workspace so you can see which files are present. In the output above, a .pdf file was already present, so the rename was skipped and paper.txt (the same 10267274 bytes as the PDF) was left in place.
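Because the rename logic goes by file name only, a content check can make it safer. The following is a minimal sketch, not part of the original notebook: the helper names `looks_like_pdf` and `find_misnamed_pdfs` are hypothetical, and the check assumes the `%PDF-` signature sits at the very start of the file, which is the common case but not guaranteed for every PDF.

```python
from pathlib import Path


def looks_like_pdf(path: Path) -> bool:
    """Return True if the file starts with the PDF signature b'%PDF-'."""
    with path.open("rb") as fh:
        return fh.read(5) == b"%PDF-"


def find_misnamed_pdfs(workspace: Path) -> list[Path]:
    """List files without a .pdf suffix whose content starts like a PDF."""
    return sorted(
        p
        for p in workspace.iterdir()
        if p.is_file() and p.suffix.lower() != ".pdf" and looks_like_pdf(p)
    )
```

Calling `find_misnamed_pdfs(WORKSPACE)` before renaming would flag a paper.txt that is really a PDF, and leave a genuine text file alone.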

In [19]:
from google.colab import files

uploaded = files.upload()  # choose paper.pdf
for fn in uploaded.keys():
    src = Path(fn)
    dst = WORKSPACE / fn
    dst.write_bytes(src.read_bytes())
    print("Saved:", dst)

Saving Denoising-Diffusion-Probabilistic-Models.pdf to Denoising-Diffusion-Probabilistic-Models (2).pdf
Saved: /content/workspace/Denoising-Diffusion-Probabilistic-Models (2).pdf


This code uploads a PDF file from your local computer into the Colab environment and saves a copy inside the workspace directory. The files.upload() function opens a file picker so you can choose paper.pdf; it writes each selected file to the current working directory and returns a dictionary that maps file names to their contents as bytes. The loop goes through the uploaded files (in case you upload more than one), builds a source path from the file name and a destination path inside the workspace, and copies the file byte-for-byte with write_bytes, so the PDF is unchanged. Finally, it prints the saved path so you can confirm the copy. In the output above, Colab saved the upload as "Denoising-Diffusion-Probabilistic-Models (2).pdf" because a file with the original name already existed, and that adjusted name is the one used in the workspace.
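Since files.upload() already returns the file contents, the copy step can write those bytes directly instead of reading the file back from disk. A minimal sketch under that assumption follows; `save_uploads` is a hypothetical helper name, not something the notebook defines.

```python
from pathlib import Path


def save_uploads(uploaded: dict[str, bytes], workspace: Path) -> list[Path]:
    """Write each uploaded file's bytes into workspace and return the paths."""
    workspace.mkdir(parents=True, exist_ok=True)
    saved = []
    for name, data in uploaded.items():
        # Keep only the file name so nothing is written outside the workspace.
        dst = workspace / Path(name).name
        dst.write_bytes(data)
        saved.append(dst)
    return saved
```

In Colab this would be called as `save_uploads(files.upload(), WORKSPACE)`; outside Colab any dict of name to bytes works, which also makes the helper easy to test.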

In [21]:
!ls -la /content/workspace

total 30092
drwxr-xr-x 2 root root     4096 Feb 21 22:14  .
drwxr-xr-x 1 root root     4096 Feb 21 22:14  ..
-rw-r--r-- 1 root root 10267274 Feb 21 22:14 'Denoising-Diffusion-Probabilistic-Models (2).pdf'
-rw-r--r-- 1 root root 10267274 Feb 21 21:55  Denoising-Diffusion-Probabilistic-Models.pdf
-rw-r--r-- 1 root root 10267274 Feb 21 22:08  paper.txt


In [22]:
!pip -q install pypdf

from pypdf import PdfReader

pdf_path = WORKSPACE / "Denoising-Diffusion-Probabilistic-Models (2).pdf"
reader = PdfReader(str(pdf_path))

chunks = []
for i, page in enumerate(reader.pages):
    t = page.extract_text() or ""
    chunks.append(f"\n\n===== PAGE {i+1} =====\n{t}")

paper_txt = "\n".join(chunks)
(WORKSPACE / "paper.txt").write_text(paper_txt, encoding="utf-8")

print("paper.txt chars:", len(paper_txt))
!head -n 20 /content/workspace/paper.txt

paper.txt chars: 56805


===== PAGE 1 =====
Denoising Diffusion Probabilistic Models
Jonathan Ho
UC Berkeley
jonathanho@berkeley.edu
Ajay Jain
UC Berkeley
ajayj@berkeley.edu
Pieter Abbeel
UC Berkeley
pabbeel@cs.berkeley.edu
Abstract
We present high quality image synthesis results using diffusion probabilistic models,
a class of latent variable models inspired by considerations from nonequilibrium
thermodynamics. Our best results are obtained by training on a weighted variational
bound designed according to a novel connection between diffusion probabilistic
models and denoising score matching with Langevin dynamics, and our models nat-
urally admit a progressive lossy decompression scheme that can be interpreted as a


This code installs the pypdf library (already installed by the first cell, so the install here is a no-op) and uses it to extract text from a PDF file inside the workspace. It loads the specified PDF with PdfReader, then iterates through the pages and extracts the text of each one; the `or ""` fallback substitutes an empty string when a page yields no text. For traceability, each page's text is prefixed with a separator header (e.g., "===== PAGE 1 ====="). The page texts are joined into a single string and written to paper.txt in UTF-8 encoding, overwriting the earlier paper.txt that was a copy of the PDF. Finally, it prints the number of extracted characters and displays the first 20 lines of paper.txt so you can check that the text looks reasonable.
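The page separators also make it possible to spot pages where extraction produced nothing (for example, pages that are only figures). The sketch below is an illustration rather than part of the notebook: `split_pages` and `empty_pages` are hypothetical helpers, and they assume the separator format written above appears nowhere else in the paper's own text.

```python
import re

# Matches the separator lines written by the extraction cell.
PAGE_MARKER = re.compile(r"^===== PAGE (\d+) =====$", re.MULTILINE)


def split_pages(paper_txt: str) -> dict[int, str]:
    """Map page number to extracted text using the '===== PAGE n =====' markers."""
    parts = PAGE_MARKER.split(paper_txt)
    # parts looks like [preamble, "1", text_1, "2", text_2, ...]
    return {int(n): text.strip() for n, text in zip(parts[1::2], parts[2::2])}


def empty_pages(paper_txt: str) -> list[int]:
    """Return the page numbers whose extracted text is empty."""
    return [n for n, text in split_pages(paper_txt).items() if not text]
```

Running `empty_pages(paper_txt)` after extraction gives a quick list of pages the agent will not be able to read from paper.txt.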

In [14]:
from agno.agent import Agent
from agno.models.openai import OpenAIResponses
from agno.tools.coding import CodingTools
agent = Agent(
    name="PaperReviewer",
    model=OpenAIResponses(id="gpt-4o-mini"),
    instructions=(
        "You are a paper reviewing agent.\n"
        "You MUST use tools.\n"
        "Step 1: Open and read paper.txt from the workspace.\n"
        "Step 2: Write the structured review to review.md in the workspace.\n"
        "Step 3: Verify by listing files.\n\n"
        "REVIEW FORMAT:\n"
        "1) Summary (3-5 bullets)\n"
        "2) Contributions (max 3 bullets)\n"
        "3) Strengths (max 4 bullets)\n"
        "4) Weaknesses (max 4 bullets)\n"
        "5) Questions for authors (max 5 bullets)\n"
        "6) Reproducibility checklist (bullets)\n"
        "7) Score (1-10) + 1-line justification\n\n"
        "You are not allowed to ask for the PDF. Use paper.txt."
    ),
    tools=[CodingTools(base_dir=WORKSPACE, all=True)],
    markdown=True,
)

This code creates a Level-1 paper reviewing agent using the Agno framework. It imports the Agent class, the OpenAI model wrapper, and the CodingTools toolkit, which lets the agent read, write, and execute files inside the defined workspace. The agent is configured with the lightweight gpt-4o-mini model for cost efficiency. The instructions tell the agent that it must use tools and lay out a step-by-step workflow: read paper.txt, write a structured review to review.md, and verify the output by listing files. These are prompt-level directions, so the model is expected to follow them but nothing in the code enforces them. A fixed review format with bullet limits per section keeps the output consistent and concise. The agent is also told not to ask for the PDF, so it has to work from the extracted text file. CodingTools with all=True grants full file and shell access within the workspace, and markdown=True asks the agent to format its responses as Markdown.
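Because the review format is only requested in the prompt, it can help to check the written file afterwards. The following is a rough sketch, not part of the notebook: `missing_sections` is a hypothetical helper, and it assumes the agent reproduces the numbering and section titles from the instructions (as in "## 1) Summary"), which a model may not always do.

```python
import re

# Section titles, in order, as listed under REVIEW FORMAT in the instructions.
REQUIRED_SECTIONS = [
    "Summary",
    "Contributions",
    "Strengths",
    "Weaknesses",
    "Questions for authors",
    "Reproducibility checklist",
    "Score",
]


def missing_sections(review_md: str) -> list[str]:
    """Return the required section titles not found as 'N) Title' line starts."""
    missing = []
    for number, title in enumerate(REQUIRED_SECTIONS, start=1):
        # Allow optional Markdown heading marks before the number.
        pattern = rf"^\s*#*\s*{number}\)\s*{re.escape(title)}"
        if not re.search(pattern, review_md, flags=re.IGNORECASE | re.MULTILINE):
            missing.append(title)
    return missing
```

For example, `missing_sections((WORKSPACE / "review.md").read_text(encoding="utf-8"))` returns an empty list when all seven sections are present under these assumptions.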

In [23]:
agent.print_response(
    "Read paper.txt and write review.md. Then list files and show the first 40 lines of review.md.",
    stream=True,
)

Output()

In [24]:
!ls -la /content/workspace
!sed -n '1,40p' /content/workspace/review.md

total 20128
drwxr-xr-x 2 root root     4096 Feb 21 22:18  .
drwxr-xr-x 1 root root     4096 Feb 21 22:14  ..
-rw-r--r-- 1 root root 10267274 Feb 21 22:14 'Denoising-Diffusion-Probabilistic-Models (2).pdf'
-rw-r--r-- 1 root root 10267274 Feb 21 21:55  Denoising-Diffusion-Probabilistic-Models.pdf
-rw-r--r-- 1 root root    58106 Feb 21 22:16  paper.txt
-rw-r--r-- 1 root root     3049 Feb 21 22:18  review.md
# Review of the Paper: Denoising Diffusion Probabilistic Models

## 1) Summary
- This paper presents a novel approach to image synthesis using diffusion probabilistic models, achieving high sample quality.
- The authors establish a connection between diffusion probabilistic models and denoising score matching, leading to innovative training strategies.
- The paper reports superior Inception and FID scores compared to state-of-the-art methods, demonstrating the effectiveness of the proposed models on datasets like CIFAR10 and LSUN.
- The implementation and research are made accessible t