In [6]:
import os
import getpass
os.environ["ANTHROPIC_API_KEY"] = getpass.getpass()

# Ranker

In [57]:
RANKER_PROMPT = """You are a highly critical and discerning research program evaluator with deep expertise in compression and the sciences.

There has been a recent BREAKTHROUGH in the field of compression. Your task is to explore novel and impactful discoveries that this BREAKTHROUGH will enable.

You will be given two inputs:
1) a summary of the research breakthrough.
2) another research paper that explores a specific problem.

Your task is to EVALUATE whether this breakthrough could be leveraged to improve the given research paper significantly.

Think step by step, and plan a very specific strategy you could work on to improve this research paper in different ways:
- Change the methodology.
- Optimize the results.
- Change the results.
- Change the conclusion.

**Evaluation Metrics:**

1.  **NOVELTY (Score 1-5, 5 being highly novel):**
    *   How original is this idea?
    *   Does it propose a genuinely new application, methodology, or theoretical extension of the LMCompress breakthrough?
    *   Or, is it an obvious next step, a minor tweak to existing work, or a rehash of old concepts merely relabeled with "LLM compression"?
    *   **Justification required.**

2.  **IMPORTANCE (Score 1-5, 5 being critically important):**
    *   What is the potential impact of this idea if successfully realized?
    *   Would it solve a significant problem, open up major new capabilities, or substantially advance our understanding or application of semantic compression?
    *   Consider its practical, theoretical, or societal relevance.
    *   **Justification required.**

3.  **FEASIBILITY & VERIFIABILITY (Combined Score 1-5, 5 being highly feasible & verifiable):**
    *   **Feasibility:** Can a proof-of-concept (PoC) or a minimal viable demonstration for this idea be reasonably implemented with current or near-future resources and technology (assuming the LMCompress capability exists)?
    *   **Verifiability:** Can the success or failure of this idea be clearly demonstrated and validated *quickly*?
        *   This means an output that can be checked by code (e.g., a compression ratio achieved, a task completed, a specific measurable outcome).
        *   OR, by a human inspecting a tangible result (e.g., a decoded image's quality, a summarized text's accuracy, a correctly derived mathematical step, a functioning compressed communication channel) within minutes.
        *   Is the proposed verification method robust and unambiguous?
    *   **Justification required.**

4.  **RELIANCE ON SEMANTIC COMPRESSION (Score 1-5, 5 meaning critically reliant):**
    *   Is the *efficient compression of semantic data* (as enabled by LMCompress) the *core* enabler or the *primary bottleneck* that this idea addresses?
    *   In other words, was this idea largely impractical, significantly less effective, or even impossible *before* a breakthrough like LMCompress, specifically due to the inability to efficiently compress or leverage the semantic understanding of data like text, images, audio, or complex symbolic structures?
    *   If the idea could have been pursued almost as effectively with older compression techniques or non-compression-focused AI, it scores lower here.
    *   **Justification required.**

**Overall Assessment:**
Sum up all the scores yielded by the individual metrics to get a final score.

**Input Format:**
Here is the breakthrough paper: [breakthrough_paper]
Here is the research paper: [research_paper]

**Output Format:**
-   Research paper summary: [...]
-   Strategy to improve the research paper: [...]
-   Novelty: [Score]/5. Justification: [...]
-   Importance: [Score]/5. Justification: [...]
-   Feasibility & Verifiability: [Score]/5. Justification: [...]
-   Reliance on Semantic Compression: [Score]/5. Justification: [...]
-   Overall Assessment: [Score]/20
---
"""

In [31]:
import PyPDF2
def extract_text_from_pdf(pdf_path):
    text = ""
    with open(pdf_path, 'rb') as file:
        reader = PyPDF2.PdfReader(file)
        for page in reader.pages:
            # Fall back to an empty string if a page yields no extractable text.
            text += page.extract_text() or ""
    return text

In [56]:
from anthropic import Anthropic
client = Anthropic()

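def rank(breakthrough_path, research_path):
    # Keep the paths separate from the extracted text so the output
    # file can be named after the input files rather than their contents.
    breakthrough_paper = extract_text_from_pdf(breakthrough_path)
    research_paper = extract_text_from_pdf(research_path)

    res = client.messages.create(
        model="claude-opus-4-0",
        system=RANKER_PROMPT,
        max_tokens=2000,
        messages=[
            {"role": "user", "content": f"Here is the breakthrough paper: {breakthrough_paper}\n\nHere is the research paper: {research_paper}"}
        ]
    )
    # Create the output directory on first use and write the response as UTF-8 text.
    os.makedirs('ranker_outputs', exist_ok=True)
    out_name = f'{os.path.basename(breakthrough_path)}-{os.path.basename(research_path)}.txt'
    with open(os.path.join('ranker_outputs', out_name), 'w', encoding='utf-8') as f:
        f.write(res.content[0].text)
    return res.content[0].text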
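
Because `rank` saves each response under `ranker_outputs/`, the saved files can later be ordered by their overall score. The sketch below is one way to do that, assuming each response contains a line such as `Overall Assessment: 14/20` as the prompt requests. `overall_score` and `leaderboard` are hypothetical names, not part of the original notebook. A response that ends before the overall line gets `None` and sorts last.

```python
import re
from pathlib import Path

# Matches e.g. "Overall Assessment: 14/20", tolerating markdown emphasis.
OVERALL_RE = re.compile(r"Overall Assessment\D*?(\d{1,2})\s*/\s*20", re.IGNORECASE)


def overall_score(text):
    """Return the overall score as an int, or None if it is not present."""
    match = OVERALL_RE.search(text)
    return int(match.group(1)) if match else None


def leaderboard(output_dir="ranker_outputs"):
    """Return (file stem, score) pairs, highest score first, unparsed last."""
    rows = []
    for path in sorted(Path(output_dir).glob("*.txt")):
        rows.append((path.stem, overall_score(path.read_text(encoding="utf-8"))))
    # Sort parsed scores in descending order; None scores go to the end.
    rows.sort(key=lambda row: (row[1] is None, -(row[1] or 0)))
    return rows
```

Calling `leaderboard()` after the main loop finishes gives a quick view of which pairings scored highest.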


# Main Loop

In [60]:
from concurrent.futures import ThreadPoolExecutor, as_completed
import tqdm

BREAKTHROUGH_DIR = '../optimizations/understanding_is_compression'
IDEA_DIR = '../optimizations/understanding_is_compression'

with ThreadPoolExecutor(max_workers=10) as executor:
    futures = []
    breakthroughs = os.listdir(BREAKTHROUGH_DIR)
    ideas = os.listdir(IDEA_DIR)
    
    for breakthrough in breakthroughs:
        for idea in ideas:
            futures.append(executor.submit(rank, f'{BREAKTHROUGH_DIR}/{breakthrough}', f'{IDEA_DIR}/{idea}'))
            print(f'submitted ranking of {breakthrough}-{idea}')
    
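    for future in tqdm.tqdm(as_completed(futures), total=len(futures)):
        # result() re-raises any exception from the worker, so failures are not silently dropped.
        print(f"finished ranking of {future.result()}")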


submitted ranking of https:arxiv.org:pdf:2407.07723.pdf-https:arxiv.org:pdf:2407.07723.pdf


100%|██████████| 1/1 [00:48<00:00, 48.32s/it]

finished ranking of ## Research Paper Summary

The research paper "Understanding is Compression" by Li et al. presents LMCompress, a revolutionary data compression method that leverages large language models (LLMs) to achieve unprecedented compression ratios. The core insight is that better understanding of data leads to better compression. LMCompress uses different models for different data types (iGPT for images/videos, fine-tuned LLaMA for audio/text) combined with arithmetic coding. The method achieves 2x better compression than JPEG-XL for images, FLAC for audio, H.264 for video, and 4x better than bz2 for text.

## Strategy to Improve the Research Paper

### 1. Change the Methodology
- **Multi-modal unified compression**: Instead of using separate models for different data types, develop a unified multi-modal LLM that can understand and compress all data types through a single architecture
- **Hierarchical compression**: Implement a two-stage approach where the first stage uses L


