<div style="display: flex; justify-content: space-between; align-items: flex-start;">
    <div style="text-align: left;">
        <p style="color:#FFD700; font-size: 15px; font-weight: bold; margin-bottom: 1px; text-align: left;">Published on  May 1, 2025</p>
        <h4 style="color:#4B0082; font-weight: bold; text-align: left; margin-top: 6px;">Author: Jocelyn C. Dumlao</h4>
        <a href="https://www.linkedin.com/in/jocelyn-dumlao-168921a8/" target="_blank" style="display: inline-block; background-color: #003f88; color: #fff; text-decoration: none; padding: 5px 10px; border-radius: 10px; margin: 15px;">LinkedIn</a>
        <a href="https://github.com/jcdumlao14" target="_blank" style="display: inline-block; background-color: transparent; color: #059c99; text-decoration: none; padding: 5px 10px; border-radius: 10px; margin: 15px; border: 2px solid #007bff;">GitHub</a>
        <a href="https://www.youtube.com/@CogniCraftedMinds" target="_blank" style="display: inline-block; background-color: #ff0054; color: #fff; text-decoration: none; padding: 5px 10px; border-radius: 10px; margin: 15px;">YouTube</a>
        <a href="https://www.kaggle.com/jocelyndumlao" target="_blank" style="display: inline-block; background-color: #3a86ff; color: #fff; text-decoration: none; padding: 5px 10px; border-radius: 10px; margin: 15px;">Kaggle</a>
    </div>
</div>

# <p style="padding:10px;background-color:#09abf2;font-family:newtimeroman;font-size:100%;text-align:center;border-radius:12px;font-weight:200;border: 6px outset #f2102e;">Import Libraries</p>

In [1]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import json
import os

import warnings
warnings.filterwarnings('ignore')


## Reference: [Qwen 3 qwen-lm/qwen-3](https://www.kaggle.com/models/qwen-lm/qwen-3/)

# <p style="padding:10px;background-color:#09abf2;font-family:newtimeroman;font-size:100%;text-align:center;border-radius:12px;font-weight:200;border: 6px outset #f2102e;">Define paths to the datasets</p>

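- The code first locates the JSON files containing the ARC (Abstraction and Reasoning Corpus) dataset. It uses the Kaggle input directory when the files exist there and otherwise falls back to files of the same name in the working directory.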


In [2]:
# Define paths to the datasets: prefer the Kaggle input directory, else the working directory
KAGGLE_INPUT_DIR = '/kaggle/input/arc-prize-2025'

def resolve_path(filename):
    kaggle_path = os.path.join(KAGGLE_INPUT_DIR, filename)
    return kaggle_path if os.path.exists(kaggle_path) else filename

training_solutions_path = resolve_path('arc-agi_training_solutions.json')
evaluation_solutions_path = resolve_path('arc-agi_evaluation_solutions.json')
evaluation_challenges_path = resolve_path('arc-agi_evaluation_challenges.json')
sample_submission_path = resolve_path('sample_submission.json')
training_challenges_path = resolve_path('arc-agi_training_challenges.json')
test_challenges_path = resolve_path('arc-agi_test_challenges.json')


# <p style="padding:10px;background-color:#09abf2;font-family:newtimeroman;font-size:100%;text-align:center;border-radius:12px;font-weight:200;border: 6px outset #f2102e;">Load the Data</p>

- It then loads the contents of these JSON files into Python dictionaries. This includes training puzzles, evaluation puzzles, solutions, and a sample submission file.
  

In [3]:
# Load the training data
def load_json(path):
    with open(path, 'r') as f:
        return json.load(f)

training_solutions = load_json(training_solutions_path)
evaluation_solutions = load_json(evaluation_solutions_path)
evaluation_challenges = load_json(evaluation_challenges_path)
sample_submission = load_json(sample_submission_path)
training_challenges = load_json(training_challenges_path)
test_challenges = load_json(test_challenges_path)
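
To make the loaded structures concrete, the sketch below shows the shape of one challenge entry and a quick way to summarize its grid sizes. The `example_task` literal is a hand-written stand-in that reuses one train pair and the test input from challenge `00576224` as they appear in the output later in this notebook. It is not the full file entry, and `grid_shape` and `summarize_task` are hypothetical helper names.

```python
def grid_shape(grid):
    """Return (rows, cols) for a grid stored as a list of lists."""
    return (len(grid), len(grid[0]) if grid else 0)


def summarize_task(task):
    """List the (input_shape, output_shape) of each train pair in a task."""
    return [(grid_shape(pair["input"]), grid_shape(pair["output"]))
            for pair in task["train"]]


# Stand-in for one entry of training_challenges (abbreviated to one train pair).
example_task = {
    "train": [
        {
            "input": [[7, 9], [4, 3]],
            "output": [
                [7, 9, 7, 9, 7, 9],
                [4, 3, 4, 3, 4, 3],
                [9, 7, 9, 7, 9, 7],
                [3, 4, 3, 4, 3, 4],
                [7, 9, 7, 9, 7, 9],
                [4, 3, 4, 3, 4, 3],
            ],
        }
    ],
    "test": [{"input": [[3, 2], [7, 8]]}],
}

print(summarize_task(example_task))  # [((2, 2), (6, 6))]
```

The same helper can be applied to a real entry, for example `summarize_task(training_challenges[some_id])`, to check how grid sizes change between input and output before writing a prompt.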

# <p style="padding:10px;background-color:#09abf2;font-family:newtimeroman;font-size:100%;text-align:center;border-radius:12px;font-weight:200;border: 6px outset #f2102e;">Choose a Model</p>

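- Download the pre-trained Qwen 3 model (0.6B parameters) from Kaggle Hub, Kaggle's model repository. If the download is not possible, the code falls back to a Hugging Face model ID instead.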

In [4]:
# Model loading and setup
# Assumed fallback: a Qwen 3 model from the Hugging Face Hub, because the chatbot below
# relies on the Qwen 3 chat template (enable_thinking). Replace it if you use another model.
FALLBACK_MODEL = "Qwen/Qwen3-0.6B"
try:
    import kagglehub
    model_name = kagglehub.model_download("qwen-lm/qwen-3/transformers/0.6b")
    print(f"Model downloaded from Kaggle Hub: {model_name}")
except ImportError:
    print("kagglehub is not available. Make sure it is installed and that you are in a Kaggle environment with internet access.")
    model_name = FALLBACK_MODEL
    print(f"Trying model {model_name}")
except Exception as e:
    print(f"Error downloading from Kaggle Hub: {e}. Using a local or alternative model.")
    model_name = FALLBACK_MODEL
    print(f"Trying model {model_name}")



Model downloaded from Kaggle Hub: /kaggle/input/qwen-3/transformers/0.6b/1


# <p style="padding:10px;background-color:#09abf2;font-family:newtimeroman;font-size:100%;text-align:center;border-radius:12px;font-weight:200;border: 6px outset #f2102e;">Set Up the Hardware</p>

- It determines if a GPU is available (using CUDA) and sets the processing device to either the GPU or the CPU.


In [5]:
# Check for CUDA availability and set device
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")

Using device: cuda


# <p style="padding:10px;background-color:#09abf2;font-family:newtimeroman;font-size:100%;text-align:center;border-radius:12px;font-weight:200;border: 6px outset #f2102e;">Load the Model and Tokenizer</p>

- This is the core step. It loads the Qwen model and its corresponding tokenizer. The tokenizer converts text into token IDs that the model can process, and converts generated IDs back into text. Loading with `device_map="auto"` places the model on the GPU when one is available and on the CPU otherwise.


In [6]:
# Load tokenizer and model
try:
    tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)  # trust_remote_code is needed for some models
    # device_map="auto" already places the weights on the available GPU(s), or on the CPU,
    # so no extra model.to(device) call is needed afterwards.
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)
    model.eval()  # Set model to evaluation mode
except Exception as e:
    print(f"Error loading model: {e}")
    raise  # Re-raise the exception so the program stops if the model cannot be loaded.


2025-05-07 19:02:32.108005: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1746644552.288761      19 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1746644552.341484      19 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered


# <p style="padding:10px;background-color:#09abf2;font-family:newtimeroman;font-size:100%;text-align:center;border-radius:12px;font-weight:200;border: 6px outset #f2102e;">Create a Chatbot Class</p>

- The chatbot keeps the conversation history and sends it along with every new message, so the model can use earlier turns as context for its next answer.
  

In [7]:
class QwenChatbot:
    def __init__(self, model, tokenizer):
        self.tokenizer = tokenizer
        self.model = model
        self.history = []

    def generate_response(self, user_input, enable_thinking=True, max_tokens=512):  # Added max_tokens
        messages = self.history + [{"role": "user", "content": user_input}]

        text = self.tokenizer.apply_chat_template(
            messages,
            tokenize=False,
            add_generation_prompt=True,
            enable_thinking=enable_thinking
        )

        inputs = self.tokenizer(text, return_tensors="pt").to(self.model.device)  # Move input to the same device as the model

        with torch.no_grad():  # Disable gradient calculation for inference
            generated_ids = self.model.generate(
                **inputs,
                max_new_tokens=max_tokens  # Use max_tokens here
            )

        output_ids = generated_ids[0][len(inputs.input_ids[0]):].tolist()

        # Parse the thinking content
        try:
            # rindex: find the last occurrence of token 151668 (</think>)
            index = len(output_ids) - output_ids[::-1].index(151668)
        except ValueError:
            index = 0

        thinking_content = self.tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
        response = self.tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

        # Update history
        self.history.append({"role": "user", "content": user_input})
        self.history.append({"role": "assistant", "content": response})

        return thinking_content, response
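
The parsing step in `generate_response` splits the generated ids after the last `</think>` token (id 151668, as used in the class above). If generation stops before that token is produced, for example because `max_new_tokens` is reached, the index falls back to 0 and the unfinished reasoning is returned as the response. The sketch below isolates that logic on plain lists of ids and adds a flag so the caller can detect this case. `split_at_last` is a hypothetical helper, not part of `transformers`.

```python
END_THINK_ID = 151668  # id of "</think>", the value used in QwenChatbot above


def split_at_last(output_ids, marker=END_THINK_ID):
    """Split a list of token ids just after the last occurrence of marker.

    Returns (thinking_ids, response_ids, found). When the marker is absent,
    thinking_ids is empty and found is False, mirroring the index = 0 fallback.
    """
    try:
        # Searching the reversed list gives the distance of the last marker from the end.
        index = len(output_ids) - output_ids[::-1].index(marker)
        found = True
    except ValueError:
        index, found = 0, False
    return output_ids[:index], output_ids[index:], found


print(split_at_last([1, 2, 151668, 3, 4]))  # ([1, 2, 151668], [3, 4], True)
print(split_at_last([1, 2, 3]))             # ([], [1, 2, 3], False)
```

One possible use, under these assumptions: when thinking is enabled and `found` is `False`, treat the output as truncated reasoning and retry with a larger `max_tokens` instead of storing it in the history as the assistant's answer.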


# <p style="padding:10px;background-color:#09abf2;font-family:newtimeroman;font-size:100%;text-align:center;border-radius:12px;font-weight:200;border: 6px outset #f2102e;">Demonstration and Task Integration</p>

- The demo code sits inside an `if __name__ == "__main__":` block, which runs when the script is executed directly (not imported as a module). In a notebook cell this condition is always true.
- First input: no `/think` or `/no_think` tag, so thinking mode is enabled by default.
- Second input: ends with `/no_think` and is sent with `enable_thinking=False`, so the model should answer without a reasoning trace.
- Third input: ends with `/think` and is sent with `enable_thinking=True`, which turns the reasoning trace back on.
- The next section takes the first challenge from the training set, builds a prompt from its example input/output pairs, and asks the model to predict the output for a new input.

Create a chatbot object and interact with the model.
- Each call returns two parts: the thinking content (the model's reasoning before it answers) and the response (the final answer intended for the user).

In [8]:
# --- Demonstration and ARC Task Integration ---

# Initialize chatbot
chatbot = QwenChatbot(model, tokenizer)

# Example Usage with the provided test cases.
if __name__ == "__main__":

    # First input (without /think or /no_think tags, thinking mode is enabled by default)
    user_input_1 = "How many r's in strawberries?"
    print(f"User: {user_input_1}")
    thinking_content, response_1 = chatbot.generate_response(user_input_1)
    print(f"Thinking: {thinking_content}")
    print(f"Bot: {response_1}")
    print("----------------------")

    # Second input with /no_think
    user_input_2 = "Then, how many r's in blueberries? /no_think"
    print(f"User: {user_input_2}")
    thinking_content, response_2 = chatbot.generate_response(user_input_2, enable_thinking=False) # explicitly disable thinking for the demo
    print(f"Thinking: {thinking_content}") # should be empty
    print(f"Bot: {response_2}")
    print("----------------------")

    # Third input with /think
    user_input_3 = "Really? /think"
    print(f"User: {user_input_3}")
    thinking_content, response_3 = chatbot.generate_response(user_input_3, enable_thinking=True)
    print(f"Thinking: {thinking_content}")
    print(f"Bot: {response_3}")

    print("----------------------")


    

User: How many r's in strawberries?
Thinking: <think>
Okay, the user is asking, "How many r's in strawberries?" Let me break this down. First, I need to count the number of 'r's in the word "strawberries." Let me write it out: S-T-R-A-W-B-E-R-R-I-N-G-S. Now, I'll go through each letter one by one. Starting with the first letter 'S', there are no 'r's. Next, 'T' – no. Then 'R' – that's the first 'r'. Then 'A', 'W', 'B', 'E', 'R', 'R', 'I', 'N', 'G'. So here, there are two 'r's. Wait, let me check again. The word is "strawberries." Let me spell it out: S-T-R-A-W-B-E-R-R-I-N-G-S. Yes, the first 'r' and the second 'r' in the middle. So that's two. I think that's correct. The user might be testing if I can count them properly. Let me make sure there's no other 'r's. No, only two. So the answer should be 2.
</think>
Bot: How many r's in strawberries?  

There are **2** r's in the word "strawberries."  

**Answer:** 2
----------------------
User: Then, how many r's in blueberries? /no_think
T

# <p style="padding:10px;background-color:#09abf2;font-family:newtimeroman;font-size:100%;text-align:center;border-radius:12px;font-weight:200;border: 6px outset #f2102e;">Solve Challenges - ARC dataset</p>

In [9]:
# --- Example of integrating with the ARC dataset ---
# Let's try to answer one of the training challenges.
sample_challenge_id = list(training_challenges.keys())[0]  # Get the first challenge ID
challenge = training_challenges[sample_challenge_id]

# Construct a prompt that describes the task and provides examples
prompt = "Solve the following abstract reasoning challenge.  Here's the challenge:\n\n"
for i, task in enumerate(challenge['train']):  # Use the challenge's train examples to build the prompt
    prompt += f"Input {i+1}:\n"
    prompt += str(task['input']) + "\n"
    prompt += f"Output {i+1}:\n"
    prompt += str(task['output']) + "\n\n"

# Add the test input.  Tell the model to predict this output
prompt += "Now, predict the output for the following test input:\n"
prompt += str(challenge['test'][0]['input']) + "\n"
prompt += "Output:\n"  # Ask the model to generate the output

# Generate the response
print("----------------------")
print("Solving ARC Challenge:")
print(f"Challenge ID: {sample_challenge_id}")
thinking_content, arc_response = chatbot.generate_response(prompt)
print(f"Thinking: {thinking_content}")
print(f"Model's Prediction:\n{arc_response}")
print("----------------------")
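
The model's prediction comes back as free text, so comparing it with a known solution first requires pulling a grid out of the string. The sketch below is one way to do that, assuming the model prints the grid as a Python-style nested list of non-negative integers, the same format used in the prompt. `extract_last_grid` is a hypothetical helper name.

```python
import ast
import re

# An outer bracket wrapping one or more inner lists of digits, e.g. [[1, 2], [3, 4]].
GRID_PATTERN = re.compile(r"\[\s*(?:\[[\d\s,]*\][\s,]*)+\]")


def extract_last_grid(text):
    """Return the last nested list of integers found in text, or None."""
    for candidate in reversed(GRID_PATTERN.findall(text)):
        try:
            grid = ast.literal_eval(candidate)  # safe parsing of list literals
        except (ValueError, SyntaxError):
            continue  # malformed candidate, try the previous match
        if isinstance(grid, list) and all(isinstance(row, list) for row in grid):
            return grid
    return None


print(extract_last_grid("Output:\n[[3, 2], [7, 8]]"))  # [[3, 2], [7, 8]]
print(extract_last_grid("I could not find a pattern."))  # None
```

With a parsed grid, a check such as `extract_last_grid(arc_response) == training_solutions[sample_challenge_id][0]` becomes possible. This assumes the solutions file maps each challenge ID to a list of output grids, one per test input, which should be verified against the loaded data.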

----------------------
Solving ARC Challenge:
Challenge ID: 00576224
Thinking: 
Model's Prediction:
<think>
Okay, let's see. The user provided two examples and then a new input. The task is to predict the output for the test input [[3,2],[7,8]] based on the patterns in the previous examples. 

First, I need to understand the pattern from the first two outputs. Let me look at the first input [[7,9],[4,3]] and its output [[7,9,7,9,7,9],[4,3,4,3,4,3],[9,7,9,7,9,7],[3,4,3,4,3,4],[7,9,7,9,7,9],[4,3,4,3,4,3]]. 

Looking at the output, each line seems to have the same elements as the input but arranged in a different order. Let me check:

Input 1:
[7,9] → Output starts with [7,9,7,9,7,9]
So first two numbers are same as input, then two more, then repeats. Similarly, the next input [[4,3]] gives [4,3,4,3,4,3]. So it's repeating the input elements in each line, but with a different order. 

Now, the test input is [[3,2],[7,8]]. Let's apply the same pattern. 

First line: [3,2] → Output starts w