In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import os
os.chdir("..")

In [39]:
# Set before importing transformers/datasets: huggingface_hub reads this flag at import time.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

import torch
import numpy as np
from transformers import AutoModelForCausalLM, AutoTokenizer
from tqdm.auto import tqdm
from datasets import load_dataset
from cluster_intrep_repo.utils import (
    initialize_tokenizer,
    tokenize_blocksworld_generation,
    THINK_TOKEN,
    THINK_START_TOKEN,
)

compute_dtype = torch.bfloat16
device   = 'cuda:2'
model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"

In [33]:
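# Run after the tokenizer is created in In [4]; the last id in the output is the single token for "</think>".
tokenizer.encode("adsad </think>")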

[151646, 7664, 329, 220, 151649]

In [4]:
tokenizer = initialize_tokenizer(model_id)

In [5]:
blocksworld_type = "6-blocks"

dataset = load_dataset(f"dmitriihook/deepseek-r1-qwen-32b-planning-{blocksworld_type}")["train"]

In [6]:
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=compute_dtype,
    attn_implementation="sdpa",
    device_map=device,
)

Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered.



In [8]:
item = dataset[0]
generation = item["generation"]

In [10]:
tokens = tokenize_blocksworld_generation(tokenizer, item)
tokens.shape

torch.Size([1, 3909])

In [55]:
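# Keep only the first 1250 tokens of the original sequence (prompt plus a partial generation).
new_tokens = tokens[0, :1250]
# Append THINK_TOKEN (imported from utils) so the model continues from that marker.
new_tokens = torch.cat([new_tokens, torch.tensor([THINK_TOKEN], dtype=torch.long)])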
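
The truncate-then-append step can be wrapped in a small helper. This is a minimal sketch, not code from the notebook: the function name `truncate_and_append` and the dummy ids are hypothetical and chosen only for illustration; in the notebook the appended id would be `THINK_TOKEN` and the cut point 1250.

```python
import torch

def truncate_and_append(tokens: torch.Tensor, cut: int, token_id: int) -> torch.Tensor:
    """Keep the first `cut` ids of a [1, seq_len] tensor and append `token_id`."""
    prefix = tokens[0, :cut]
    # Match dtype and device of the prefix so torch.cat does not complain.
    suffix = torch.tensor([token_id], dtype=prefix.dtype, device=prefix.device)
    return torch.cat([prefix, suffix])

demo = torch.arange(10).unsqueeze(0)  # stand-in for `tokens`, shape [1, 10]
shortened = truncate_and_append(demo, cut=4, token_id=9)
print(shortened.tolist())
```

The result is a 1-D tensor of length `cut + 1`, which is why the notebook later calls `unsqueeze(0)` before passing it to `generate`.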

In [56]:
print(tokenizer.decode(new_tokens))

<｜begin▁of▁sentence｜><｜User｜>I am playing with a set of blocks where I need to arrange the blocks into stacks. Here are the actions I can do

Pick up a block
Unstack a block from on top of another block
Put down a block
Stack a block on top of another block

I have the following restrictions on my actions:
I can only pick up or unstack one block at a time.
I can only pick up or unstack a block if my hand is empty.
I can only pick up a block if the block is on the table and the block is clear. A block is clear if the block has no other blocks on top of it and if the block is not picked up.
I can only unstack a block from on top of another block if the block I am unstacking was really on top of the other block.
I can only unstack a block from on top of another block if the block I am unstacking is clear.
Once I pick up or unstack a block, I am holding the block.
I can only put down a block that I am holding.
I can only stack a block on top of another block if I am holding the block being

In [57]:

from transformers import LogitsProcessorList, LogitsProcessor

class DisableTokenLogitsProcessor(LogitsProcessor):
    """Logits processor that makes a single token id impossible to generate."""

    def __init__(self, token_id):
        self.token_id = token_id

    def __call__(self, input_ids, scores):
        # scores has shape [batch_size, vocab_size]; a logit of -inf gives the
        # token zero probability under both greedy decoding and sampling.
        scores[:, self.token_id] = float("-inf")
        return scores

# Ban THINK_START_TOKEN at every generation step.
logits_processor = LogitsProcessorList([DisableTokenLogitsProcessor(THINK_START_TOKEN)])
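
To see what the masking does to a single decoding step, the sketch below applies the same idea to a dummy logits tensor. It is an illustration under stated assumptions: `DisableToken` is a hypothetical torch-only stand-in that does not subclass `LogitsProcessor`, the 4-token vocabulary is made up, and unlike the notebook's class it clones the scores so the original tensor can be compared afterwards.

```python
import torch

class DisableToken:
    """Torch-only stand-in for the notebook's DisableTokenLogitsProcessor."""

    def __init__(self, token_id: int):
        self.token_id = token_id

    def __call__(self, input_ids: torch.Tensor, scores: torch.Tensor) -> torch.Tensor:
        scores = scores.clone()                    # keep the caller's tensor intact
        scores[:, self.token_id] = float("-inf")   # banned token gets zero probability
        return scores

scores = torch.tensor([[0.1, 2.0, 5.0, 1.0]])  # dummy logits over a 4-token vocabulary
input_ids = torch.tensor([[0]])                # unused by this processor
masked = DisableToken(token_id=2)(input_ids, scores)

# Greedy choice before and after masking token 2.
print(scores.argmax(dim=-1).item(), masked.argmax(dim=-1).item())
```

With greedy decoding (`do_sample=False`, as used in the notebook), the banned token can never be the argmax, so the model falls back to its next-best continuation.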


In [58]:
new_generation = model.generate(
    new_tokens.unsqueeze(0).to(device),
    do_sample=False,
    max_new_tokens=1000,
    logits_processor=logits_processor,
)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:151643 for open-end generation.
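
The warning above can usually be avoided by passing an explicit attention mask and pad token id. The following is a sketch rather than the notebook's code: `build_generate_inputs` is a hypothetical helper, and the all-ones mask assumes a single unpadded sequence, which is the case here.

```python
import torch

def build_generate_inputs(token_ids: torch.Tensor, device: str = "cpu") -> dict:
    """Turn a 1-D tensor of token ids into keyword arguments for generate()."""
    input_ids = token_ids.unsqueeze(0).to(device)  # add the batch dimension
    # One unpadded sequence: every position is a real token, so the mask is all ones.
    attention_mask = torch.ones_like(input_ids)
    return {"input_ids": input_ids, "attention_mask": attention_mask}

inputs = build_generate_inputs(torch.tensor([5, 6, 7]))
print(inputs["input_ids"].shape, inputs["attention_mask"].tolist())

# Intended use in the notebook (not executed here, needs the loaded model):
# new_generation = model.generate(
#     **build_generate_inputs(new_tokens, device),
#     pad_token_id=tokenizer.eos_token_id,
#     do_sample=False,
#     max_new_tokens=1000,
#     logits_processor=logits_processor,
# )
```

Setting `pad_token_id` to the EOS id matches what the library reports falling back to in the message above, so the generated tokens should not change; only the warning goes away.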


In [61]:
new_generation.shape, tokens.shape

(torch.Size([1, 2251]), torch.Size([1, 3909]))
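
The shapes make it possible to check how many tokens were actually generated. The arithmetic below uses only numbers shown in this notebook (the 1250-token cut, the one appended token, and the output length of 2251); the variable names are illustrative.

```python
prompt_len = 1250 + 1      # truncated prefix plus the appended THINK_TOKEN
total_len = 2251           # new_generation.shape[1]
max_new_tokens = 1000      # limit passed to model.generate

generated = total_len - prompt_len
hit_cap = generated == max_new_tokens
print(generated, hit_cap)
```

Because the number of generated tokens equals `max_new_tokens`, the continuation most likely stopped at the length limit rather than at an end-of-sequence token, so the printed output may end mid-answer. To decode only the continuation, one could slice `new_generation[0, prompt_len:]` before calling `tokenizer.decode`.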

In [64]:
print(generation)
print()
print("============"*3)
print()
print(tokenizer.decode(new_generation[0]))

Okay, so I have this block-stacking problem to solve. Let me try to figure out the steps needed to get from the initial state to the goal state. I'll take it step by step because I'm still getting the hang of these kinds of problems.

First, let me understand the initial conditions. The blocks are A, B, C, D, E, F. The hand is empty. The clear blocks are A, C, and E. That means these blocks have nothing on top of them and are on the table or on another block that's clear.

Looking at the initial setup:

- Block B is on top of F.
- Block C is on top of D.
- Block D is on top of B.
- So, the stack from bottom to top is F, B, D, C.
- Blocks A and E are on the table, clear.
- Block F is also on the table.

So, the initial stacks are:
- F (on table) → B → D → C
- A (on table)
- E (on table)

The goal is to have:
- A on top of E
- B on top of D
- C on top of B
- E on top of C
- F on top of A

Wait, that seems a bit confusing. Let me parse the goal again:

Goal:
- A is on top of E
- B is on t