In [1]:
import scipdf
import sys
sys.path.append('../')

from pdf_parser import parse_pdf_content, parse_pdf_abstract, generate_user_input 

This is the system prompt for generating the reviews.

In [8]:
SYSTEM_PROMPT = """
You are a professional machine learning conference reviewer who reviews a given paper and considers 4 criteria: [Significance and novelty], [Potential reasons for acceptance], [Potential reasons for rejection], and [Suggestions for improvement]. Please ensure that for each criterion, you summarize and provide a random number of detailed supporting points from the content of the paper. And for each supporting point within each criterion, use the format: '<title of supporting point>' followed by a detailed explanation. The criteria you need to focus on are:

1. [Significance and novelty]: Assess the importance of the paper in its research field and the innovation of its methods or findings.
2. [Potential reasons for acceptance]: Summarize reasons that may support the acceptance of the paper, based on its quality, research results, experimental design, etc.
3. [Potential reasons for rejection]: Identify and explain flaws or shortcomings that could lead to the paper's rejection.
4. [Suggestions for improvement]: Provide specific suggestions to help the authors improve the paper and increase its chances of acceptance.

After reading the content of the paper provided below, your response should include only your reviews, which means always start with [Significance and novelty], don't repeat the given paper or output anything other than your reviews in the required format, just extract and summarize information related to these criteria from the provided paper. The paper is given as follows:
"""

Parse the PDF file and extract its text. scipdf relies on a GROBID server, so one must be running before this cell is executed.

In [28]:
pdf = scipdf.parse_pdf_to_dict('../demo/Transformers.pdf')

# Model Inference

In [1]:
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

  warn("The installed version of bitsandbytes was compiled without GPU support. "


'NoneType' object has no attribute 'cadam32bit_grad_fp32'


In [3]:
model_id = "travis0103/mistral_7b_paper_review_lora"

model = AutoPeftModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [None]:
abstract = parse_pdf_abstract(pdf) 
abstract_input = generate_user_input(abstract) 
print(abstract_input)

In [49]:
messages = [
    {"role": "user", "content": SYSTEM_PROMPT},
    {"role": "user", "content": abstract_input}
]
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)  # keep the model on the same device as the inputs
encoded_input = tokenizer.apply_chat_template(messages, return_tensors="pt").to(device)
generated_ids = model.generate(encoded_input, max_new_tokens=1024, do_sample=True, pad_token_id=tokenizer.eos_token_id, eos_token_id=tokenizer.eos_token_id)
decoded_output = tokenizer.decode(generated_ids[0], skip_special_tokens=True)

In [4]:
import re

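def extract_sections(text):
    """Extract the four review sections from text and return them as one formatted string."""
    # Each section starts at its bracketed header and runs lazily until the
    # next header on a new line, or until the end of the text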
    pattern = r"\[(Significance and novelty|Potential reasons for acceptance|Potential reasons for rejection|Suggestions for improvement)\]" \
            r"(.*?)(?=\n\[(Significance and novelty|Potential reasons for acceptance|Potential reasons for rejection|Suggestions for improvement)\]|\Z)"
    
    # Use re.findall to get all matches
    sections = re.findall(pattern, text, flags=re.DOTALL)

    # Convert list of tuples into a dictionary for easier access
    section_dict = {section[0]: section[1].strip() for section in sections}
    
    final_output = ""
    for key, value in section_dict.items():
        final_output += f"[{key}]\n{value}\n\n"
        
    return final_output
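When the sections need to be inspected individually rather than printed, a dictionary is easier to work with than the formatted string. The sketch below is a variant of the same regular-expression approach; the names `parse_sections` and `SECTION_NAMES` are introduced here for illustration and are not part of the original notebook.

```python
import re

# Hypothetical helper names; the section titles are the ones used in SYSTEM_PROMPT.
SECTION_NAMES = [
    "Significance and novelty",
    "Potential reasons for acceptance",
    "Potential reasons for rejection",
    "Suggestions for improvement",
]


def parse_sections(text):
    """Return {section name: body} for each review section found in text."""
    names = "|".join(re.escape(name) for name in SECTION_NAMES)
    # A body runs lazily until the next section header on a new line,
    # or until the end of the text.
    pattern = rf"\[({names})\](.*?)(?=\n\[(?:{names})\]|\Z)"
    sections = {}
    for name, body in re.findall(pattern, text, flags=re.DOTALL):
        # If a header appears more than once, the last occurrence wins.
        sections[name] = body.strip()
    return sections


sample = (
    "[Significance and novelty]\n"
    "<Point A> First explanation.\n"
    "[Potential reasons for rejection]\n"
    "<Point B> Second explanation.\n"
)
parsed = parse_sections(sample)
print(parsed["Potential reasons for rejection"])
```

Because later matches overwrite earlier ones, header names that appear earlier in the decoded text (for example, inside an echoed prompt) are replaced by the sections that follow them.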


In [45]:
print(extract_sections(decoded_output))


[Significance and novelty]
<Innovation in model architecture> The paper introduces a new and innovative way of doing sequence-to-sequence translation with the transformer architecture, demonstrating significant advantages over previous methods through extensive experimental evaluation.
<Thorough experimental evaluation> The paper conducts a thorough evaluation and comparison against other attention models, addressing concerns and demonstrating the superiority of the proposed approach through substantial improvements in translation performance.

[Potential reasons for acceptance]
<Impressive experimental results> The proposed transformer architecture achieves strong results and advances the state-of-the-art in sequence-to-sequence translation tasks.
<Addressing concerns> The authors have made effort to address the concerns raised by the reviewers and have committed to improving the paper based on the feedback received.

[Potential reasons for rejection]
<Inadequate experimental evaluati

## Run Inference on review results

In [6]:
import pandas as pd 

df = pd.read_csv('../demo/data/review_results.csv')
df.head()

Unnamed: 0,title,json_file,full_input,abstract_input,human_reviews_summaries,human_reviews_summaries_lengths,gpt_3.5_full_reviews_summaries,gpt_3.5_full_reviews_summaries_lengths,gpt_3.5_full_reviews_summaries_match,gpt_3.5_full_reviews_summaries_match_lengths,...,gpt_4_full_reviews_summaries_match,gpt_4_full_reviews_summaries_match_lengths,gpt_4_abstract_reviews_summaries,gpt_4_abstract_reviews_summaries_lengths,gpt_4_abstract_reviews_summaries_match,gpt_4_abstract_reviews_summaries_match_lengths,gpt_4_abstract_reviews_one_shot_summaries,gpt_4_abstract_reviews_one_shot_summaries_lengths,gpt_4_abstract_reviews_one_shot_summaries_match,gpt_4_abstract_reviews_one_shot_summaries_match_lengths
0,A Walk with SGD: How SGD Explores Regions of D...,B1l6e3RcF7.json,[TITLE]\nA WALK WITH SGD: HOW SGD EXPLORES REG...,[TITLE]\nA WALK WITH SGD: HOW SGD EXPLORES REG...,"{\n ""1"": {\n ""summary"": ""The review ...",3,"{\n ""1"": {\n ""summary"": ""The paper l...",4,"{\n ""A2-B4"": {\n ""rationale"": ""Both ...",2,...,{},0,"{\n ""1"": {\n ""summary"": ""The paper l...",2,"{\n ""A3-B2"": {\n ""rationale"": ""Both ...",1,"{\n ""1"": {\n ""summary"": ""Lack of the...",2,"{\n ""A2-B1"": {\n ""rationale"": ""Both ...",2
1,Neural representation and generation for RNA s...,snOgiCYZgJ7.json,[TITLE]\nNEURAL REPRESENTATION AND GENERATION ...,[TITLE]\nNEURAL REPRESENTATION AND GENERATION ...,"{\n ""1"": {\n ""summary"": ""The paper l...",4,"{\n ""1"": {\n ""summary"": ""Concerns ab...",4,"{\n ""A1-B4"": {\n ""rationale"": ""Both ...",2,...,"{\n ""A1-B1"": {\n ""rationale"": ""Both ...",1,"{\n ""1"": {\n ""summary"": ""Lack of emp...",2,{},0,"{\n ""1"": {\n ""summary"": ""Complexity ...",2,{},0
2,Deep Neural Networks as Gaussian Processes,B1EA-M-0Z.json,[TITLE]\nDEEP NEURAL NETWORKS AS GAUSSIAN PROC...,[TITLE]\nDEEP NEURAL NETWORKS AS GAUSSIAN PROC...,"{\n ""1"": {\n ""summary"": ""Lack of com...",4,"{\n ""1"": {\n ""summary"": ""Computation...",4,"{\n ""A1-B4"": {\n ""rationale"": ""Both ...",1,...,"{\n ""A2-B1"": {\n ""rationale"": ""Both ...",2,"{\n ""1"": {\n ""summary"": ""The paper m...",3,"{\n ""A2-B2"": {\n ""rationale"": ""Both ...",2,"{\n ""1"": {\n ""summary"": ""Computation...",3,"{\n ""A1-B1"": {\n ""rationale"": ""Both ...",3
3,PseudoSeg: Designing Pseudo Labels for Semanti...,-TwO99rbVRu.json,[TITLE]\nPSEUDOSEG: DESIGNING PSEUDO LABELS FO...,[TITLE]\nPSEUDOSEG: DESIGNING PSEUDO LABELS FO...,"{\n ""1"": {\n ""summary"": ""Concerns ab...",3,"{\n ""1"": {\n ""summary"": ""Insufficien...",4,"{\n ""A2-B2"": {\n ""rationale"": ""Both ...",1,...,{},0,"{\n ""1"": {\n ""summary"": ""The paper l...",2,"{\n ""A2-B1"": {""rationale"": ""Review A2 quest...",1,"{\n ""1"": {\n ""summary"": ""Comparison ...",2,"{\n ""A2-B2"": {\n ""rationale"": ""Both ...",2
4,Model-Free Model Reconciliation,Byex4T2XcV.json,[TITLE]\nModel-Free Model Reconciliation\n\n[A...,[TITLE]\nModel-Free Model Reconciliation\n\n[A...,"{\n ""1"": {\n ""summary"": ""Lack of cla...",2,"{\n ""1"": {\n ""summary"": ""Assumption ...",4,"{\n ""A2-B4"": {\n ""rationale"": ""Both ...",1,...,"{\n ""A2-B2"": {\n ""rationale"": ""Both ...",3,"{\n ""1"": {\n ""summary"": ""The paper l...",3,"{\n ""A1-2-B2"": {\n ""rationale"": ""Bot...",2,"{\n ""1"": {\n ""summary"": ""Concern abo...",2,"{\n ""A2-2-B2"": {\n ""rationale"": ""Bot...",1


In [None]:
model_reviews = []

In [None]:
from tqdm import tqdm

for abstract in tqdm(df['abstract_input'], total=len(df), desc="Generating reviews"):
    messages = [
        {"role": "user", "content": SYSTEM_PROMPT},
        {"role": "user", "content": abstract},
    ]
    encoded_input = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
    generated_ids = model.generate(
        encoded_input,
        max_new_tokens=1024,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )
    decoded_output = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
    model_reviews.append(extract_sections(decoded_output))
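Sampling with a fixed `max_new_tokens` budget can cut a review off before all four sections are written, as the truncated demo output earlier suggests. A quick sanity check over `model_reviews` could look like the sketch below. The helper names `count_supporting_points` and `missing_sections` are assumptions made for this example, and the check assumes each review follows the `[section]` / `<title> explanation` layout produced by `extract_sections`.

```python
import re

# Section titles as used in SYSTEM_PROMPT.
EXPECTED_SECTIONS = [
    "Significance and novelty",
    "Potential reasons for acceptance",
    "Potential reasons for rejection",
    "Suggestions for improvement",
]


def count_supporting_points(review):
    """Count lines that start with '<title>' under each section header."""
    counts = {}
    current = None
    for line in review.splitlines():
        line = line.strip()
        header = re.fullmatch(r"\[(.+)\]", line)
        if header and header.group(1) in EXPECTED_SECTIONS:
            current = header.group(1)
            counts[current] = 0
        elif current is not None and re.match(r"<[^>]+>", line):
            counts[current] += 1
    return counts


def missing_sections(review):
    """List the expected sections that never appear in the review."""
    counts = count_supporting_points(review)
    return [name for name in EXPECTED_SECTIONS if name not in counts]


sample_review = (
    "[Significance and novelty]\n"
    "<Innovation> Some explanation.\n"
    "<Evaluation> Some explanation.\n\n"
    "[Potential reasons for acceptance]\n"
    "<Results> Some explanation.\n\n"
)
print(count_supporting_points(sample_review))
print(missing_sections(sample_review))
```

Reviews with a non-empty `missing_sections` result are candidates for regeneration before any comparison against the human or GPT summaries.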

# GPT Inference

In [15]:
import os
import openai

os.environ["OPENAI_API_KEY"] = ""  # fill in your API key; openai.Client() reads it from this variable
client = openai.Client()

In [None]:
full_content = parse_pdf_content(pdf)
full_input = generate_user_input(full_content)

In [16]:
response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
      {"role": "system", "content": SYSTEM_PROMPT},
      {"role": "user", "content": full_input},
    ]
)

In [17]:
print(response.choices[0].message.content)

[Significance and novelty]
The "Transformer" paper introduces a novel architecture that challenges the dominant sequence transduction models which rely on recurrent neural networks (RNNs) and convolutional neural networks (CNNs). The main contributions and innovations of this work include the introduction of an architecture based entirely on attention mechanisms, eliminating the need for recurrence or convolution within the model, which represents a significant shift in the approach to machine translation and potentially other sequence modeling tasks.

- 'Elimination of recurrence and convolution': The proposed Transformer model is novel in its exclusive reliance on attention mechanisms, avoiding traditional RNNs and CNNs which can be computationally expensive and slow due to their inherent sequential nature.
- 'Optimization for Parallelization': The Transformer is designed to maximize parallel computational capabilities, reducing training time significantly compared to existing models