<a href="https://colab.research.google.com/github/jonbaer/googlecolab/blob/master/optillm_decoding.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
%%capture
!pip install optillm

In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer
from optillm.cot_decoding import cot_decode
from optillm.entropy_decoding import entropy_decode
import torch

model_name = "Qwen/Qwen2.5-0.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

messages = [
    {"role": "user", "content": "In a dance class of 20 students, 20% enrolled in contemporary dance, 25% of the remaining enrolled in jazz dance, and the rest enrolled in hip-hop dance. What percentage of the entire students enrolled in hip-hop dance?"}
]

# Generate the response using CoT decoding
result, confidence = cot_decode(model, tokenizer, messages, aggregate_paths=True, max_new_tokens=1024)
print(f"CoT Decoding:\n {result}")
# print(f"Confidence: {confidence}")

# Generate the response using Entropy decoding
result = entropy_decode(model, tokenizer, messages, max_new_tokens=1024)
print(f"\nEntropy Decoding:\n {result}")

CoT Decoding:
 Sure, let's determine how many students are enrolled in each type of dance and then calculate the percentage of all students who enroll in hip-hop dance.

1. **Determine the number of students enrolled in contemporary dance:**
   \[
   20\% \text{ of } 20 = 0.20 \times 20 = 4 \text{ students}
   \]

2. **Calculate the number of students remaining after those enrolled in contemporary dance:**
   \[
   20 - 4 = 16 \text{ students}
   \]

3. **Determine the number of students enrolled in jazz dance. Since jazz dance represents 25% of the remaining students, we calculate:**
   \[
   25\% \text{ of } 16 = 0.25 \times 16 = 4 \text{ students}
   \]

4. **Calculate the total number of students enrolled in jazz and hip-hop dance:**
   \[
   4 \text{ (contemporary dance)} + 4 \text{ (jazz dance)} = 8 \text{ students}
   \]

5. **Determine the number of students remaining who are enrolled in hip-hop dance:**
   \[
   20 - 8 = 12 \text{ students}
   \]

6. **Calculate the percentage
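
The intermediate figures in the output above (4 contemporary, 4 jazz, 12 hip-hop) can be checked independently of the model with plain arithmetic. This is a minimal sketch; the helper `enrollment_breakdown` is hypothetical and is not part of optillm or the original notebook. It uses exact fractions so that the percentages do not pick up floating-point error.

```python
from fractions import Fraction


def enrollment_breakdown(total, contemporary_pct, jazz_pct_of_remaining):
    """Hypothetical helper: return (contemporary, jazz, hip_hop, hip_hop_pct)."""
    # Students in contemporary dance: a percentage of the whole class
    contemporary = Fraction(contemporary_pct, 100) * total
    # Students left after contemporary enrollment
    remaining = total - contemporary
    # Students in jazz dance: a percentage of the remaining students only
    jazz = Fraction(jazz_pct_of_remaining, 100) * remaining
    # Everyone else is in hip-hop
    hip_hop = remaining - jazz
    # Hip-hop enrollment as a percentage of the whole class
    hip_hop_pct = hip_hop / total * 100
    return contemporary, jazz, hip_hop, hip_hop_pct


contemporary, jazz, hip_hop, hip_hop_pct = enrollment_breakdown(20, 20, 25)
print(f"contemporary={contemporary}, jazz={jazz}, hip_hop={hip_hop}, hip_hop_pct={hip_hop_pct}%")
# prints: contemporary=4, jazz=4, hip_hop=12, hip_hop_pct=60%
```

A reference value like this makes it easier to compare the final answers produced by the different decoding strategies against the same ground truth.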

In [None]:
# Greedy decoding

def get_device():
    """Return the CUDA device if one is available, otherwise the CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")

device = get_device()
model = model.to(device)
# print(f"Using device: {device}")

# Tokenize the chat prompt; return_dict=True also returns the tokenizer's attention mask
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
)
input_ids = inputs["input_ids"].to(device)
attention_mask = inputs["attention_mask"].to(device)
input_length = input_ids.shape[1]  # Prompt length, used later to slice off the prompt tokens

# Use the tokenizer's pad token id, falling back to the EOS token id if none is defined
pad_token_id = tokenizer.pad_token_id
if pad_token_id is None:
    pad_token_id = tokenizer.eos_token_id

# Configure generation for greedy decoding (no sampling, single beam)
output_ids = model.generate(
    input_ids,
    attention_mask=attention_mask,
    max_new_tokens=1024,
    do_sample=False,   # Greedy decoding: always pick the most likely token
    num_beams=1,       # Single beam, i.e. no beam search
    pad_token_id=pad_token_id,
    temperature=1.0,   # Neutral value; has no effect when do_sample=False
    top_p=1.0,         # Neutral value; has no effect when do_sample=False
    use_cache=True,    # Enable KV caching for faster generation
)

output_ids = output_ids.cpu()
# Decode only the newly generated tokens
response = tokenizer.decode(output_ids[0][input_length:], skip_special_tokens=True)
print(f"Greedy Decoding:\n {response}")

Greedy Decoding:
 To determine the percentage of the entire students enrolled in hip-hop dance, we need to follow these steps:

1. Calculate the number of students enrolled in contemporary dance.
2. Determine how many students remain after subtracting those who are enrolled in contemporary dance.
3. Calculate the number of students enrolled in jazz dance from the remaining students.
4. Find out how many students remain after subtracting those who are enrolled in jazz dance.
5. Determine the number of students enrolled in hip-hop dance from the remaining students.
6. Convert this number into a percentage of the total number of students.

Let's start with step 1:
- The total number of students is 20.
- 20% of the students enroll in contemporary dance.
\[ \text{Number of students in contemporary dance} = 0.20 \times 20 = 4 \]

Next, step 2:
- After subtracting the 4 students in contemporary dance, the number of students left is:
\[ 20 - 4 = 16 \]

Now, step 3:
- 25% of the remaining 16 st