# Fine-tune Qwen3-1.7B for Chess Move Prediction

## Introduction

This notebook shows how to fine-tune the Qwen3-1.7B model on AWS Trainium using the Hugging Face Optimum Neuron library.
The task is chess move prediction: training the model to analyze chess positions in FEN format and select the best move.

We will fine-tune the model using `optimum.neuron`, save the trained checkpoint, compile it for Neuron, and then deploy it for inference with `optimum-neuron[vllm]`, enabling high-performance, low-latency chess move prediction.

By the end of this notebook, you'll have a fine-tuned, Trainium-optimized Qwen3 model ready for deployment and real-time inference. This workflow demonstrates how to leverage the Optimum Neuron toolchain to efficiently train and serve large language models on AWS Neuron devices.

This module uses the [aicrowd/ChessExplained](https://huggingface.co/datasets/aicrowd/ChessExplained) dataset, which consists of thousands of chess positions with expert analysis and move selections.

## About the Dataset

The dataset contains chess positions in FEN (Forsyth-Edwards Notation) format along with:
- Visual board representations
- List of legal moves
- Expert reasoning (in `<think>` tags)
- Best move selection (in `<uci_move>` tags)
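To make the FEN field concrete, here is a minimal sketch that expands the piece-placement part of a FEN string into an 8x8 grid. The helper name `fen_to_board` is hypothetical and not part of the dataset or of Optimum Neuron; the FEN string is the position used in this notebook's dataset example.

```python
def fen_to_board(fen: str) -> list[list[str]]:
    """Expand the piece-placement field of a FEN string into an 8x8 grid."""
    placement = fen.split()[0]  # first field: ranks 8 down to 1, separated by "/"
    board = []
    for rank in placement.split("/"):
        row = []
        for ch in rank:
            if ch.isdigit():
                row.extend(["."] * int(ch))  # a digit is a run of empty squares
            else:
                row.append(ch)  # uppercase = White piece, lowercase = Black piece
        board.append(row)
    if len(board) != 8 or any(len(row) != 8 for row in board):
        raise ValueError(f"malformed FEN placement: {placement!r}")
    return board


fen = "rnbq1rk1/ppp1bpp1/4pn1p/3p4/2PP4/2N1PN2/PP1B1PPP/R2QKB1R b KQ - 0 7"
board = fen_to_board(fen)
side_to_move = fen.split()[1]  # second field: "w" or "b"

# Print the board from rank 8 (top) to rank 1 (bottom), files a..h left to right.
for rank_number, row in zip(range(8, 0, -1), board):
    print(rank_number, " ".join(row))
print("side to move:", side_to_move)
```

The sketch only reads the first two FEN fields; castling rights, en passant, and move counters are ignored.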

**Dataset example:**

*Position (FEN):* `rnbq1rk1/ppp1bpp1/4pn1p/3p4/2PP4/2N1PN2/PP1B1PPP/R2QKB1R b KQ - 0 7`

*Legal moves:* `['g8h8', 'g8h7', 'f8e8', 'd8e8', 'c7c5', 'b7b6', 'a7a6', ...]`

*Expert analysis:* 
```
<think>
After Pawn moves to c5, this causes Black to attacks the pawn on d4. So c5 is the most logical. Position is drawish.
</think>

<uci_move>c7c5</uci_move>
```

Fine-tuning on several thousand of these chess examples teaches the model to analyze positions and to generate both its reasoning and a best move.
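Because the target output follows a fixed tag format, a response can be split back into reasoning and move with a small parser. The following is a minimal sketch, assuming the response contains at most one `<think>` block and one `<uci_move>` block; `parse_response` is a hypothetical helper, not something provided by the dataset or the training script.

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
# UCI moves are a source square, a target square, and an optional promotion piece.
MOVE_RE = re.compile(r"<uci_move>\s*([a-h][1-8][a-h][1-8][qrbn]?)\s*</uci_move>")


def parse_response(text, legal_moves):
    """Return (reasoning, move); move is None if missing or not in legal_moves."""
    think = THINK_RE.search(text)
    move = MOVE_RE.search(text)
    reasoning = think.group(1).strip() if think else None
    uci = move.group(1) if move else None
    if uci is not None and uci not in legal_moves:
        uci = None  # reject moves that are well formed but not legal here
    return reasoning, uci


sample = """<think>
After Pawn moves to c5, this causes Black to attacks the pawn on d4. So c5 is the most logical. Position is drawish.
</think>

<uci_move>c7c5</uci_move>"""

legal = ["g8h8", "g8h7", "f8e8", "d8e8", "c7c5", "b7b6", "a7a6"]
reasoning, best_move = parse_response(sample, legal)
print(best_move)
```

Checking the move against the legal-move list is a cheap guard: a fine-tuned model can still emit a syntactically valid move that is not playable in the given position.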

This chess move prediction use case was selected because the model can be fine-tuned in a reasonably short time (~25 minutes), which suits this workshop. The same techniques can be applied to more complex reasoning tasks such as strategic game playing, multi-step planning, and expert decision-making.

## Install requirements
This notebook uses [Hugging Face Optimum Neuron](https://github.com/huggingface/optimum-neuron), which acts as an interface between the Hugging Face Transformers library and AWS accelerators, including AWS Trainium and AWS Inferentia. We will also install other libraries such as `peft` and `trl`. You may see some errors from the pip dependency resolver; this is expected.


In [None]:
%cd /home/ubuntu/neuron-workshops/labs/FineTuning/HuggingFaceExample/01_finetuning/assets
%pip install -q -r requirements.txt


# Fine-tuning

In this section, we fine-tune the Qwen3-1.7B model on the chess move prediction task using Hugging Face Optimum Neuron. The training command takes the following parameters:

1. `--nnodes`: Number of nodes (1 = single node).
2. `--nproc_per_node`: Processes per node (usually equals the number of devices).
3. `--model_id`, `--tokenizer_id`: Model and tokenizer identifiers (from Hugging Face or a local path).
4. `--output_dir`: Directory for saving checkpoints and logs.
5. `--bf16`: Enables bfloat16 precision for faster, memory-efficient training.
6. `--gradient_checkpointing`: Saves memory by recomputing activations during backprop.
7. `--gradient_accumulation_steps`: Steps to accumulate gradients before each optimizer update.
8. `--learning_rate`: Initial training learning rate.
9. `--max_steps`: Total number of training steps.
10. `--per_device_train_batch_size`: Batch size per device.
11. `--tensor_parallel_size`: Number of devices used for tensor parallelism.
12. `--lora_r`, `--lora_alpha`, `--lora_dropout`: LoRA hyperparameters (rank, scaling, and dropout rate).
13. `--dataloader_drop_last`: Drops the last incomplete batch.
14. `--disable_tqdm`: Disables the progress bar.
15. `--logging_steps`: Log interval (in steps).
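Several of these parameters interact, so it helps to work out the effective batch size before launching. The sketch below uses the values from this notebook's training command and assumes that the data-parallel degree is the number of processes divided by `tensor_parallel_size`, and that the per-device batch size applies per data-parallel replica. That is the usual convention, but it is an assumption about the training script rather than something this notebook states.

```python
# Values taken from the torchrun command used in this notebook.
nnodes = 1
nproc_per_node = 2
tensor_parallel_size = 2
per_device_train_batch_size = 2
gradient_accumulation_steps = 1
max_steps = 1000

world_size = nnodes * nproc_per_node
if world_size % tensor_parallel_size != 0:
    raise ValueError("world size must be divisible by tensor_parallel_size")

# Assumption: devices in one tensor-parallel group share a single model replica.
data_parallel_size = world_size // tensor_parallel_size

# Examples consumed per optimizer update, across all replicas.
global_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * data_parallel_size
)
samples_seen = global_batch_size * max_steps

print(f"data-parallel replicas: {data_parallel_size}")
print(f"global batch size:      {global_batch_size}")
print(f"examples seen in total: {samples_seen}")
```

Under these assumptions, both devices hold shards of one model replica, so raising `--gradient_accumulation_steps` is the simplest way to increase the effective batch size without using more memory.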


In [None]:
!torchrun \
  --nnodes 1 \
  --nproc_per_node 2 \
  finetune_chess_model.py \
  --model_id Qwen/Qwen3-1.7B \
  --tokenizer_id Qwen/Qwen3-1.7B \
  --output_dir ~/ml/qwen-chess \
  --bf16 True \
  --gradient_checkpointing True \
  --gradient_accumulation_steps 1 \
  --learning_rate 5e-5 \
  --max_steps 1000 \
  --per_device_train_batch_size 2 \
  --tensor_parallel_size 2 \
  --lora_r 16 \
  --lora_alpha 32 \
  --lora_dropout 0.05 \
  --dataloader_drop_last True \
  --disable_tqdm True \
  --logging_steps 10

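To give a feel for what `--lora_r 16` and `--lora_alpha 32` mean, the following sketch computes the LoRA scaling factor and the number of trainable parameters an adapter adds to a single weight matrix. It assumes the standard LoRA formulation, where the update is scaled by `alpha / r`. The 2048 x 2048 matrix shape is purely illustrative, and the helper name is hypothetical.

```python
def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """LoRA adds two low-rank factors: A (r x d_in) and B (d_out x r)."""
    return r * d_in + d_out * r


lora_r = 16
lora_alpha = 32

# Standard LoRA scales the low-rank update B @ A by alpha / r.
scaling = lora_alpha / lora_r

# Illustrative square projection; not read from the real model config.
d_in = d_out = 2048
full_params = d_in * d_out
adapter_params = lora_trainable_params(d_in, d_out, lora_r)
fraction = adapter_params / full_params

print(f"scaling factor:      {scaling}")
print(f"full matrix params:  {full_params}")
print(f"adapter params:      {adapter_params}")
print(f"trainable fraction:  {fraction:.4%}")
```

For this illustrative shape the adapter trains under 2% of the parameters of the matrix it wraps, which is why LoRA fine-tuning fits comfortably in a short workshop run.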

# Compilation

After fine-tuning, the next step is to compile the trained model for inference on AWS Trainium using the Hugging Face Optimum Neuron toolchain.
Neuron compilation optimizes the model graph and converts it into the Neuron Executable File Format (NEFF), enabling efficient execution on NeuronCores.


In [None]:
!optimum-cli export neuron \
  --model ~/ml/qwen-chess/merged_model \
  --task text-generation \
  --sequence_length 2048 \
  --batch_size 4 \
  ~/ml/qwen-chess/compiled_model

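Compiled Neuron graphs use static shapes, so the batch size and sequence length chosen at export time constrain what can be served later. The sketch below is a small sanity check that compares the export values above with the `LLM(...)` settings used in the Inference section. It assumes the serving values should equal the compiled ones; consult the Optimum Neuron documentation for the exact rule. `check_static_shapes` is a hypothetical helper.

```python
# Values from the export command and from the LLM(...) call in this notebook.
export_args = {"batch_size": 4, "sequence_length": 2048}
serve_args = {"max_num_seqs": 4, "max_model_len": 2048}


def check_static_shapes(export_args, serve_args):
    """Return a list of human-readable mismatches (empty if consistent)."""
    problems = []
    if serve_args["max_num_seqs"] != export_args["batch_size"]:
        problems.append(
            f"max_num_seqs={serve_args['max_num_seqs']} != "
            f"batch_size={export_args['batch_size']}"
        )
    if serve_args["max_model_len"] != export_args["sequence_length"]:
        problems.append(
            f"max_model_len={serve_args['max_model_len']} != "
            f"sequence_length={export_args['sequence_length']}"
        )
    return problems


problems = check_static_shapes(export_args, serve_args)
print("consistent" if not problems else "\n".join(problems))
```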

# Inference

Next, install the vLLM extra of Optimum Neuron (`optimum-neuron[vllm]`), then run inference with the compiled model.


In [None]:
%pip install "optimum-neuron[vllm]"


In [None]:
import os
from vllm import LLM, SamplingParams

llm = LLM(
    model=os.path.expanduser("~/ml/qwen-chess/compiled_model"),  # local compiled model; Python does not expand "~" on its own
    max_num_seqs=4,
    max_model_len=2048,
    device="neuron",
    tensor_parallel_size=2,
    override_neuron_config={})

example1="""
<|im_start|>user
You are an expert chess player looking at the following position in FEN format:

rnbq1rk1/ppp1bpp1/4pn1p/3p4/2PP4/2N1PN2/PP1B1PPP/R2QKB1R b KQ - 0 7

Briefly, FEN describes chess pieces by single letters [PNBRKQ] for white and [pnbrkq] for black. The pieces found in each rank are specified, starting at the top of the board (a8..h8) and describing all eight ranks.

Here is an additional visualization of the board (♔♕♖♗♘♙ = White pieces, ♚♛♜♝♞♟ = Black pieces):

a b c d e f g h
+---------------+
8 | ♜ ♞ ♝ ♛ · ♜ ♚ · | 8
7 | ♟ ♟ ♟ · ♝ ♟ ♟ · | 7
6 | · · · · ♟ ♞ · ♟ | 6
5 | · · · ♟ · · · · | 5
4 | · · ♙ ♙ · · · · | 4
3 | · · ♘ · ♙ ♘ · · | 3
2 | ♙ ♙ · ♗ · ♙ ♙ ♙ | 2
1 | ♖ · · ♕ ♔ ♗ · ♖ | 1
+---------------+
a b c d e f g h

The current side to move is black.
The possible legal moves for the side to move are: ['g8h8', 'g8h7', 'f8e8', 'd8e8', 'c7c5', 'b7b6', 'a7a6', 'h6h5', 'e6e5', 'g7g5', 'c7c6', 'b7b5', 'a7a5'].

Your task is to select the best move for the side to move. Output your thinking in <think> tags and the move in <uci_move> tags.<|im_end|>
<|im_start|>assistant
"""

example2="""
<|im_start|>user
You are an expert chess player. Analyze this position in FEN format:

r1bqkbnr/pppp1ppp/2n5/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq - 2 3

Here is the board visualization:

a b c d e f g h
+---------------+
8 | ♜ · ♝ ♛ ♚ ♝ ♞ ♜ | 8
7 | ♟ ♟ ♟ ♟ · ♟ ♟ ♟ | 7
6 | · · ♞ · · · · · | 6
5 | · · · · ♟ · · · | 5
4 | · · · · ♙ · · · | 4
3 | · · · · · ♘ · · | 3
2 | ♙ ♙ ♙ ♙ · ♙ ♙ ♙ | 2
1 | ♖ ♘ ♗ ♕ ♔ ♗ · ♖ | 1
+---------------+
a b c d e f g h

The current side to move is white.
Select the best move from: ['d2d4', 'f1c4', 'f1b5', 'b1c3', 'd2d3']

Output your analysis in <think> tags and your move choice in <uci_move> tags.<|im_end|>
<|im_start|>assistant
"""

example3="""
<|im_start|>user
Analyze this chess position in FEN format:

r2qkb1r/ppp2ppp/2n2n2/3pp1B1/1b1PP3/2N2N2/PPP2PPP/R2QKB1R w KQkq - 0 6

Board visualization:

a b c d e f g h
+---------------+
8 | ♜ · · ♛ ♚ ♝ · ♜ | 8
7 | ♟ ♟ ♟ · · ♟ ♟ ♟ | 7
6 | · · ♞ · · ♞ · · | 6
5 | · · · ♟ ♟ · ♗ · | 5
4 | · ♝ · ♙ ♙ · · · | 4
3 | · · ♘ · · ♘ · · | 3
2 | ♙ ♙ ♙ · · ♙ ♙ ♙ | 2
1 | ♖ · · ♕ ♔ ♗ · ♖ | 1
+---------------+
a b c d e f g h

White to move. Legal moves: ['g5f6', 'g5e7', 'g5h6', 'g5d2', 'c3b5', 'c3d5', 'f3d4', 'f3e5', 'd1d3', 'd1d2', 'e1d2']

Provide your reasoning and best move.<|im_end|>
<|im_start|>assistant
"""

prompts = [
    example1,
    example2,
    example3
]

sampling_params = SamplingParams(max_tokens=2048, temperature=0.8)
outputs = llm.generate(prompts, sampling_params)

print("#########################################################")

for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, \n\n Generated text: {generated_text!r} \n")

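The three prompts above are written by hand in the ChatML-style format (`<|im_start|>` / `<|im_end|>`) that they share. A small helper can generate the same structure for arbitrary positions. This is a minimal sketch: `build_prompt` is a hypothetical helper, and it omits the board visualization and the FEN explanation included in the first example, so its prompts are shorter than the ones above.

```python
def build_prompt(fen: str, legal_moves: list[str]) -> str:
    """Wrap a chess position in the user/assistant turn markers used above."""
    side = "white" if fen.split()[1] == "w" else "black"
    user_message = (
        "You are an expert chess player looking at the following position "
        "in FEN format:\n\n"
        f"{fen}\n\n"
        f"The current side to move is {side}.\n"
        f"The possible legal moves for the side to move are: {legal_moves}.\n\n"
        "Your task is to select the best move for the side to move. "
        "Output your thinking in <think> tags and the move in <uci_move> tags."
    )
    # End with an open assistant turn so the model continues from there.
    return f"<|im_start|>user\n{user_message}<|im_end|>\n<|im_start|>assistant\n"


prompt = build_prompt(
    "r1bqkbnr/pppp1ppp/2n5/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq - 2 3",
    ["d2d4", "f1c4", "f1b5", "b1c3", "d2d3"],
)
print(prompt)
```

The resulting string can be appended to the `prompts` list before calling `llm.generate`. Keeping the wording close to the training prompts is likely to matter, since the model was fine-tuned on that format.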