# Tutorial 2: Understanding CycleReviewer

CycleReviewer (WhizReviewer) is a family of generative large language models that have undergone additional supervised training, available in three sizes: 8B, 70B, and 123B. All of them are text-only language models. The 8B and 70B models are derived from the Llama 3.1 pre-trained language models, and the 123B model is derived from Mistral-Large-2. All three use the Transformer architecture.




In [None]:
import re

import torch
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

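model_name = "WestlakeNLP/WhizReviewer-ML-Pro-123B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
llm = LLM(
    model=model_name,
    tensor_parallel_size=8,       # shard the model across 8 GPUs
    max_model_len=18000,          # maximum context length (prompt + generated tokens)
    gpu_memory_utilization=0.95,  # fraction of each GPU's memory vLLM may use
)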

system_prompt = \
"""You are an expert academic reviewer tasked with providing a thorough and balanced evaluation of research papers. For each paper submitted, conduct a comprehensive review addressing the following aspects:

1. Summary: Briefly outline main points and objectives.
2. Soundness: Assess methodology and logical consistency.
3. Presentation: Evaluate clarity, organization, and visual aids.
4. Contribution: Analyze significance and novelty in the field.
5. Strengths: Identify the paper's strongest aspects.
6. Weaknesses: Point out areas for improvement.
7. Questions: Pose questions for the authors.
8. Rating: Score 1-10, justify your rating.
9. Meta Review: Provide overall assessment and recommendation (Accept/Reject).

Maintain objectivity and provide specific examples from the paper to support your evaluation.

You need to fill out **4** review opinions."""


sampling_params = SamplingParams(temperature=0.4, top_p=0.95, max_tokens=4096)
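
The two length settings above interact: `max_model_len=18000` covers the prompt and the generated tokens together, while `max_tokens=4096` caps the generated review. A minimal sketch of the resulting prompt budget follows, assuming the context window is shared in this way; the helper names are hypothetical and not part of vLLM.

```python
MAX_MODEL_LEN = 18000    # max_model_len passed to LLM(...) above
MAX_NEW_TOKENS = 4096    # max_tokens passed to SamplingParams(...) above


def prompt_token_budget(max_model_len=MAX_MODEL_LEN, max_new_tokens=MAX_NEW_TOKENS):
    """Largest prompt (in tokens) that still leaves room for a full-length review."""
    return max_model_len - max_new_tokens


def leaves_room_for_full_review(num_prompt_tokens):
    """True if a prompt of this size can be followed by MAX_NEW_TOKENS generated tokens."""
    return num_prompt_tokens <= prompt_token_budget()


print(prompt_token_budget())  # 13904
```

In practice the prompt length could be measured with `len(tokenizer(prompt).input_ids)` before calling `llm.generate`, so that very long papers can be truncated or skipped.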



In [5]:
import json
def get_paper_context(item):
    context = ""
    context += r'\title{'+item['title']+'}\n'
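    # Keep the LaTeX commands in raw strings so that '\e' is not treated as an escape sequence
    context += r'\begin{abstract}' + '\n' + item['abstract'] + '\n' + r'\end{abstract}' + '\n'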
    
    for section in item['sections']:
        context += r'\section{'+section[0]+'}\n'
        context += section[1]
    return context


with open('./demo_data.json', 'r', encoding='utf-8') as f:
    data = json.load(f)

print(json.dumps(get_paper_context(data[0])))


"\\title{CofiPara: A Coarse-to-fine Paradigm for Multimodal Sarcasm Target Identification with Large Multimodal Models}\n\\begin{abstract}\nSocial media abounds with multimodal sarcasm, and identifying sarcasm targets is particularly challenging due to the implicit incongruity not directly evident in the text and image modalities. Current methods for Multimodal Sarcasm Target Identification (MSTI) predominantly focus on superficial indicators in an end-to-end manner, overlooking the nuanced understanding of multimodal sarcasm conveyed through both the text and image. This paper proposes a versatile MSTI framework with a coarse-to-fine paradigm, by augmenting sarcasm explainability with reasoning and pre-training knowledge. Inspired by the powerful capacity of Large Multimodal Models (LMMs) on multimodal reasoning, we first engage LMMs to generate competing rationales for coarser-grained pre-training of a small language model on multimodal sarcasm detection. We then propose fine-tuning 
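
The helper in this cell implies a schema for each entry of `demo_data.json`: a `title` string, an `abstract` string, and `sections` given as a list of `[heading, body]` pairs. The sketch below illustrates this with a made-up entry; the toy paper is hypothetical and does not come from the demo file, and the helper is repeated only so that the block runs on its own.

```python
def get_paper_context(item):
    """Same logic as the helper above, repeated so this block is self-contained."""
    context = r'\title{' + item['title'] + '}\n'
    context += r'\begin{abstract}' + '\n' + item['abstract'] + '\n' + r'\end{abstract}' + '\n'
    for heading, body in item['sections']:
        context += r'\section{' + heading + '}\n'
        context += body
    return context


# Hypothetical entry that follows the assumed schema
toy_item = {
    "title": "A Toy Paper",
    "abstract": "We study a toy problem.",
    "sections": [
        ["Introduction", "Toy problems are useful.\n"],
        ["Method", "We do X.\n"],
    ],
}

toy_context = get_paper_context(toy_item)
print(toy_context)
```

Section bodies are appended verbatim, so each body should end with a newline if the next `\section{...}` is meant to start on its own line.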

In [None]:
for item in data:
    context = get_paper_context(item)
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": context},
    ]

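    # Prefill the assistant turn with '## Reviewer' so the model continues directly with the first review.
    # With tokenize=False the template returns a string, not token ids.
    # The [:-4] slice drops the last 4 characters, presumably the end-of-turn token
    # that the chat template appends after the prefilled assistant message.
    prompt = tokenizer.apply_chat_template(
        messages + [{'role': 'assistant', 'content': '\n\n## Reviewer\n'}],
        tokenize=False,
        add_generation_prompt=True,
    )[:-4]
    outputs = llm.generate([prompt], sampling_params)
    output = outputs[0]
    generated_text = output.outputs[0].text
    # get_review_context is not shown in this section; it must be defined before this cell runs
    review_context = get_review_context(generated_text)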
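
`get_review_context` is called above without a definition in this part of the tutorial. As a rough sketch of what such a parser might look like, the block below assumes that the generated text separates reviews with `## Reviewer` lines and introduces each field with a `### <Field name>` line. Both the output format and the name `parse_reviews_sketch` are assumptions for illustration, not the tutorial's actual implementation.

```python
import re


def parse_reviews_sketch(generated_text):
    """Split generated text into one dict per review (assumed markdown-like format)."""
    # The prompt already ends with '## Reviewer', so the first review has no header of its own.
    chunks = re.split(r'^## Reviewer\s*$', generated_text, flags=re.MULTILINE)
    reviews = []
    for chunk in chunks:
        # With a capturing group, re.split returns [preamble, name1, body1, name2, body2, ...]
        parts = re.split(r'^### (.+?)\s*$', chunk, flags=re.MULTILINE)
        fields = {name.strip(): body.strip() for name, body in zip(parts[1::2], parts[2::2])}
        if fields:
            reviews.append(fields)
    return reviews


# Made-up model output in the assumed format
sample = (
    "\n### Summary\nA toy summary.\n\n### Rating\n6\n\n"
    "## Reviewer\n\n### Summary\nAnother summary.\n\n### Rating\n5\n"
)
reviews = parse_reviews_sketch(sample)
ratings = [int(review["Rating"]) for review in reviews]
print(ratings)  # [6, 5]
```

Real model output may put free text after the score in the rating field, in which case the `int(...)` conversion would need a more tolerant extraction step.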