This is a Google Colab notebook for running the model.

In [1]:
%%capture
!pip install unsloth
!pip install --force-reinstall --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git


In [2]:
SECRETS_DIR = "/content/drive/My Drive/secrets"
DATA_DIR = "/content/drive/My Drive/data"
MODELS_DIR = "/content/drive/My Drive/models"

In [3]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [6]:
!ls

deepseek_model	drive  sample_data


In [5]:
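import os
import shutil

# List the archives stored on Drive (handy for checking the file name).
os.listdir(MODELS_DIR)
# exist_ok=True lets this cell be re-run without raising FileExistsError.
os.makedirs("/content/deepseek_model", exist_ok=True)
shutil.copy(os.path.join(MODELS_DIR, "deepseek_model_blog.zip"), "/content/deepseek_model")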


'/content/deepseek_model/deepseek_model_blog.zip'

In [9]:
os.chdir("/content/deepseek_model")
!unzip -o deepseek_model_blog.zip

Archive:  deepseek_model_blog.zip
  inflating: chat_template.jinja     
  inflating: tokenizer.json          
  inflating: special_tokens_map.json  
  inflating: adapter_config.json     
  inflating: adapter_model.safetensors  
  inflating: README.md               
  inflating: tokenizer_config.json   
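
The same extraction can be done from Python, which also makes it easy to check that the archive contains the adapter files before loading it. This is a minimal sketch: `extract_adapter` and `REQUIRED_FILES` are hypothetical names, and the required file list is an assumption based on the unzip listing above.

```python
import zipfile
from pathlib import Path

# Assumption: these two files (seen in the unzip listing) are what the adapter needs.
REQUIRED_FILES = {"adapter_config.json", "adapter_model.safetensors"}


def extract_adapter(zip_path, target_dir):
    """Extract zip_path into target_dir and return the sorted archive member names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        names = set(archive.namelist())
        missing = REQUIRED_FILES - names
        if missing:
            # Fail early instead of discovering the problem when loading the adapter.
            raise FileNotFoundError(f"archive is missing: {sorted(missing)}")
        archive.extractall(target)
    return sorted(names)
```

In this notebook the call would be `extract_adapter("/content/deepseek_model/deepseek_model_blog.zip", "/content/deepseek_model")`.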


In [11]:
secret_file_path = SECRETS_DIR+'/huggingfacetoken.key'
with open(secret_file_path) as f:
  hf_token=f.read().strip()
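
If the key file may be absent (for example when the notebook runs outside Colab), a small helper can fall back to an environment variable. This is only a sketch: `read_hf_token` is a hypothetical helper, and using `HF_TOKEN` as the fallback variable name is an assumption, not something this notebook sets.

```python
import os
from pathlib import Path


def read_hf_token(key_file, env_var="HF_TOKEN"):
    """Return the token stored in key_file, or in env_var when the file is missing."""
    path = Path(key_file)
    if path.is_file():
        token = path.read_text(encoding="utf-8").strip()
    else:
        # Assumption: the token may have been exported as an environment variable.
        token = os.environ.get(env_var, "").strip()
    if not token:
        raise ValueError(f"no token found in {key_file} or ${env_var}")
    return token
```

With the paths defined earlier, `hf_token = read_hf_token(SECRETS_DIR + "/huggingfacetoken.key")` would replace the cell above.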

In [None]:
from unsloth import FastLanguageModel

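max_seq_length = 2048  # maximum context length in tokens
dtype = None           # None lets Unsloth pick the dtype for the GPU
load_in_4bit = True    # 4-bit quantization to reduce memory use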


model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/DeepSeek-R1-Distill-Llama-8B",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    token = hf_token,
)

==((====))==  Unsloth 2025.6.5: Fast Llama patching. Transformers: 4.52.4.
   \\   /|    Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.7.0+cu126. CUDA: 7.5. CUDA Toolkit: 12.6. Triton: 3.3.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.30. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


model.safetensors.index.json:   0%|          | 0.00/24.2k [00:00<?, ?B/s]

model-00001-of-000002.safetensors:   0%|          | 0.00/8.67G [00:00<?, ?B/s]

model-00002-of-000002.safetensors:   0%|          | 0.00/7.39G [00:00<?, ?B/s]

In [31]:
adapter_path = "/content/deepseek_model"
# Attach the fine-tuned adapter (unzipped above) to the base model.
# Alternative via PEFT:
#   from peft import PeftModel
#   model = PeftModel.from_pretrained(model, adapter_path)
model.load_adapter(adapter_path)

Prompt template for testing how a standard Finnish sentence translates into the dialect.

In [32]:
prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.

### Instruction:
Translate standard Finnish sentence to South Ostrobothnian dialect.

### Question:
{}

### Response:
<think>{}"""

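The template has two `{}` placeholders: the first takes the sentence to translate, the second is left empty at inference time so the model continues right after `<think>`. A minimal sketch of a helper that fills them in; `build_prompt` and `PROMPT_STYLE` are hypothetical names, and the template text is copied from the cell above.

```python
PROMPT_STYLE = """Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.

### Instruction:
Translate standard Finnish sentence to South Ostrobothnian dialect.

### Question:
{}

### Response:
<think>{}"""


def build_prompt(sentence, reasoning=""):
    """Fill the template; reasoning stays empty when prompting for a translation."""
    return PROMPT_STYLE.format(sentence.strip(), reasoning)
```

Calling `build_prompt("Katsoin häntä aitan ovelta.")` yields a prompt that ends with `<think>`, which is where generation starts.
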
Run a test on a single sentence.

In [34]:
standard_finnish = "Hevoskauppias ajoi uudella hevosellaan meidän talon ohi. Katsoin häntä aitan ovelta."


FastLanguageModel.for_inference(model)  # Unsloth has 2x faster inference!
inputs = tokenizer([prompt_style.format(standard_finnish, "")], return_tensors="pt").to("cuda")

outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    use_cache=True,
)
response = tokenizer.batch_decode(outputs)
print("Dialect:")
print(response[0].split("### Response:")[1])


Dialect:

<think>

</think>
Hevooskauppias ajoo uureella hevoosellaan meirän taloon ohi. Kattoon sitä aita ovelta<｜end▁of▁sentence｜>
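
The printed response still contains the empty `<think>` block and the end-of-sentence token. A small post-processing step can strip both and leave only the translation. This is a sketch under assumptions: `extract_dialect` is a hypothetical helper, and the token string is copied from the output above rather than read from the tokenizer.

```python
import re

# Copied from the generation output above; tokenizer.eos_token could be used instead.
EOS_TOKEN = "<｜end▁of▁sentence｜>"


def extract_dialect(decoded):
    """Return only the translated text from a decoded generation."""
    # Keep what follows the response marker (or everything if the marker is absent).
    answer = decoded.split("### Response:", 1)[-1]
    # Drop the reasoning block, which may be empty or span several lines.
    answer = re.sub(r"<think>.*?</think>", "", answer, flags=re.DOTALL)
    return answer.replace(EOS_TOKEN, "").strip()
```

In the cell above this would be used as `print(extract_dialect(response[0]))`.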


In [35]:
!ls DeepSeek-R1-South-Ostrobothnia

adapter_config.json	   README.md		    tokenizer.json
adapter_model.safetensors  special_tokens_map.json
chat_template.jinja	   tokenizer_config.json


Save the model and produce a smaller, quantized version (under 10 GB).

In [44]:
new_model_local = "DeepSeek-R1-South-Ostrobothnia-full"

# Save the model with the adapter merged in, together with the tokenizer.
model.save_pretrained_merged(new_model_local, tokenizer)  # optionally: save_method="merged_16bit"

In [45]:
!du DeepSeek-R1-South-Ostrobothnia-full
!du DeepSeek-R1-South-Ostrobothnia

180784	DeepSeek-R1-South-Ostrobothnia-full
180784	DeepSeek-R1-South-Ostrobothnia
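
GNU `du` prints sizes in 1 KiB blocks by default, so 180784 corresponds to roughly 177 MiB for each directory. To compare a saved model against the 10 GB target from Python, the directory size can be summed directly. A minimal sketch with hypothetical helper names; it sums apparent file sizes, so the result can differ slightly from the disk usage reported by `du`.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def format_gib(num_bytes):
    """Format a byte count in GiB with two decimals."""
    return f"{num_bytes / 1024**3:.2f} GiB"
```

For example, `format_gib(dir_size_bytes("DeepSeek-R1-South-Ostrobothnia-full"))` gives a figure that can be checked against the size limit.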
