In [1]:
%%capture
!pip install unsloth
!pip install trl
!pip install torch
!pip install transformers
!pip install datasets
!pip install huggingface_hub
!pip install wandb

In [2]:
!pip install --upgrade torch transformers



In [4]:
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
hugging_face_token = user_secrets.get_secret(" HUGGINGFACE_TOKEN")
cohere_api_key = user_secrets.get_secret("COHERE_API_KEY")
groq_api_key = user_secrets.get_secret("GROQ_API_KEY")
wandb_api_key = user_secrets.get_secret("WANDB_API_KEY")

In [3]:
# Modules for fine-tuning
# from unsloth import FastLanguageModel
import torch # Import PyTorch
from trl import SFTTrainer # Trainer for supervised fine-tuning (SFT)
from unsloth import is_bfloat16_supported # Checks if the hardware supports bfloat16 precision


from datasets import load_dataset # Lets you load fine-tuning datasets


Please restructure your imports with 'import unsloth' at the top of your file.
  from unsloth import is_bfloat16_supported # Checks if the hardware supports bfloat16 precision


🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!


In [10]:
from huggingface_hub import login

login(hugging_face_token)

In [7]:
import wandb
# Retrieve and login with the stored WANDB API key
wandb.login(key=wandb_api_key)

# Initialize the W&B run
run = wandb.init(
    project="Fine-tune-DeepSeek-R1",
    job_type="training",
    anonymous="allow"
)

[34m[1mwandb[0m: Using wandb-core as the SDK backend.  Please refer to https://wandb.me/wandb-core for more information.
[34m[1mwandb[0m: Currently logged in as: [33mvotuananh-frtk[0m ([33mvotuananh-frtk-self-employed[0m). Use [1m`wandb login --relogin`[0m to force relogin
[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc


In [11]:
# Set parameters
max_seq_length = 2048 # Define the maximum sequence length a model can handle (i.e. how many tokens can be processed at once)
dtype = None # Set to default 
load_in_4bit = True # Enables 4 bit quantization — a memory saving optimization 

In [12]:
# Load the DeepSeek R1 model and tokenizer using unsloth — imported using: from unsloth import FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/DeepSeek-R1-Distill-Llama-8B",  # Load the pre-trained DeepSeek R1 model (8B parameter version)
    max_seq_length=max_seq_length, # Ensure the model can process up to 2048 tokens at once
    dtype=dtype, # Use the default data type (e.g., FP16 or BF16 depending on hardware support)
    load_in_4bit=load_in_4bit, # Load the model in 4-bit quantization to save memory
    token=hugging_face_token, # Use hugging face token
)

==((====))==  Unsloth 2025.3.15: Fast Llama patching. Transformers: 4.49.0.
   \\   /|    Tesla P100-PCIE-16GB. Num GPUs = 1. Max memory: 15.888 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.6.0+cu124. CUDA: 6.0. CUDA Toolkit: 12.4. Triton: 3.2.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.29.post3. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


model.safetensors:   0%|          | 0.00/5.96G [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/236 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/53.0k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/17.2M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/483 [00:00<?, ?B/s]

In [14]:
# Define a system prompt under prompt_style 
prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context. 
Write a response that appropriately completes the request. 
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

### Instruction:
You are an expert in Economics in general and Marxian economics in particular with advanced knowledge on capitalism, mode of productions, theories of values, prics, and advanced economics theories with a critical eye. 
Please answer the following economic question.

### Question:
{}

### Response:
<think>{}"""

In [56]:
# Creating an economic question for inference
question = """ what is Estranged Labour  """ 

In [57]:
# Enable optimized inference mode for Unsloth models (improves speed and efficiency)
FastLanguageModel.for_inference(model)  # Unsloth has 2x faster inference!

# Format the question using the structured prompt (`prompt_style`) and tokenize it
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")  # Convert input to PyTorch tensor & move to GPU

# Generate a response using the model
outputs = model.generate(
    input_ids=inputs.input_ids, # Tokenized input question
    attention_mask=inputs.attention_mask, # Attention mask to handle padding
    max_new_tokens=1200, # Limit response length to 1200 tokens (to prevent excessive output)
    use_cache=True, # Enable caching for faster inference
)

# Decode the generated output tokens into human-readable text
response = tokenizer.batch_decode(outputs)

# Extract and print only the relevant response part (after "### Response:")
print(response[0].split("### Response:")[1])  


<think>
Okay, so I need to explain Estranged Labour in the context of Marxist economics. I'm a bit rusty on this, but let me try to piece it together. I remember that Estranged Labour is a concept from Karl Marx's works, probably Wages of Labour or somewhere in the 18th and 19th century French and British economics criticism. The term "estranged labour" means labour that is not performed in a direct, immediate connection with the producer. So, it's about the separation of the worker from their work.

Marx talks about how under capitalism, the worker doesn't own the means of production, so they have to sell their labour for a wage. This separation means the worker doesn't have control over the product of their labour. The worker's labour is abstract and not tied to the work itself, hence estranged. This abstraction allows the capitalist to extract the maximum amount of surplus value from the worker.

I should break this down into key points. First, the separation of worker from their w

In [25]:
# Updated training prompt style to add </think> tag 
train_prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context. 
Write a response that appropriately completes the request. 
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

### Instruction:
You are an expert in Economics in general and Marxian economics in particular with advanced knowledge on capitalism, mode of productions, theories of values, prices, and advanced economics theories with a critical eye. 
Please answer the following economic question.
 

### Question:
{}

### Response:
<think>
{}
</think>
{}"""

In [18]:
import os

base_path = "/kaggle/input/"
print("Available datasets:", os.listdir(base_path))
dataset_path = "/kaggle/input/das-kapital"
print("Files inside dataset:", os.listdir(dataset_path))

Available datasets: ['das-kapital']
Files inside dataset: ['Economic and Philosophic Manuscripts.txt']


In [19]:
from datasets import Dataset

# Function to read and clean text
def read_and_clean(file_path):
    with open(file_path, "r", encoding="utf-8") as file:
        text = file.read()
    return " ".join(text.split())  # Normalize spaces

# Read all text files
file_paths = [os.path.join(dataset_path, f) for f in os.listdir(dataset_path) if f.endswith(".txt")]
texts = [read_and_clean(f) for f in file_paths]

# Create a Hugging Face Dataset
dataset = Dataset.from_dict({"text": texts})

# Show the first 500 characters
print("Dataset loaded successfully!")
print(dataset[0]["text"][:500])

Dataset loaded successfully!
Karl Marx Works 1844 Economic & Philosophic Manuscripts of 1844 Contents Preface First Manuscript Wages of Labour Profit of Capital 1. Capital 2. The Profit of Capital 3. The Rule of Capital Over Labour and the Motives of the Capitalist 4. The Accumulation of Capitals and the Competition Among the Capitalists Rent of Land Estranged Labour Second Manuscript Antithesis of Capital and Labour. Landed Property and Capital Third Manuscript Private Property and Labour Private Property and Communism Hum


=========================================================================================

In [20]:
# We need to format the dataset to fit our prompt training style 
EOS_TOKEN = tokenizer.eos_token  # Define EOS_TOKEN which the model when to stop generating text during training
EOS_TOKEN

'<｜end▁of▁sentence｜>'

In [28]:
print(dataset.column_names)

['text']


In [31]:
print(dataset["text"][0])

Karl Marx Works 1844 Economic & Philosophic Manuscripts of 1844 Contents Preface First Manuscript Wages of Labour Profit of Capital 1. Capital 2. The Profit of Capital 3. The Rule of Capital Over Labour and the Motives of the Capitalist 4. The Accumulation of Capitals and the Competition Among the Capitalists Rent of Land Estranged Labour Second Manuscript Antithesis of Capital and Labour. Landed Property and Capital Third Manuscript Private Property and Labour Private Property and Communism Human Needs & Division of Labour Under the Rule of Private Property The Power Of Money Critique of the Hegelian Dialectic and Philosophy as a Whole and Hegel’s Construction of The Phenomenology, November 1844 Plan for a Work on The Modern State, November 1844 Preface ||XXXIX| I have already announced in the Deutsch-Französische Jahrbücher the critique of jurisprudence and political science in the form of a critique of the Hegelian philosophy of law. While preparing it for publication, the interming

In [41]:
print(len(dataset))

1


In [None]:
# # Define formatting prompt function
# def formatting_prompts_func(examples):  # Takes a batch of dataset examples as input
#     inputs = examples["text"]       # Extracts the economic question from the dataset
#     cots = examples["Complex_CoT"]      # Extracts the chain-of-thought reasoning (logical step-by-step explanation)
#     outputs = examples["Response"]      # Extracts the final model-generated response (answer)
    
#     texts = []  # Initializes an empty list to store the formatted prompts
    
#     # Iterate over the dataset, formatting each question, reasoning step, and response
#     for input, cot, output in zip(inputs, cots, outputs):  
#         text = train_prompt_style.format(input, cot, output) + EOS_TOKEN  # Insert values into prompt template & append EOS token
#         texts.append(text)  # Add the formatted text to the list

#     return {
#         "text": texts,  # Return the newly formatted dataset with a "text" column containing structured prompts
#     }# Update dataset formatting
# dataset_finetune = dataset.map(formatting_prompts_func, batched = True)
# dataset_finetune["text"][0]


In [None]:
# # Update dataset formatting
# dataset_finetune = dataset.map(formatting_prompts_func, batched = True)
# dataset_finetune["text"][0]

### Step 3 — Setting up the model using LoRA

In [46]:
# Apply LoRA (Low-Rank Adaptation) fine-tuning to the model 
# --- Apply LoRA Fine-Tuning ---
model_lora = FastLanguageModel.get_peft_model(
    model,
    r=16,  # ✅ Increased rank for better adaptation (16 → 32)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    lora_alpha=32,  # ✅ Increased scaling factor (16 → 32) for better gradient updates
    lora_dropout=0.1,  # ✅ Added slight dropout to prevent overfitting
    bias="none",
    use_gradient_checkpointing="unsloth",
    random_state=3407,
    use_rslora=False,
    loftq_config=None,
)

Unsloth: Dropout = 0 is supported for fast patching. You are using dropout = 0.1.
Unsloth will patch all other layers, except LoRA matrices, causing a performance hit.


In [48]:
from transformers import TrainingArguments
from trl import SFTTrainer

# --- Training Arguments ---
args = TrainingArguments(
    per_device_train_batch_size=2,  # ✅ Reduced to avoid memory issues
    gradient_accumulation_steps=4,  # ✅ Keeps total batch size = 8
    num_train_epochs=1,  # ✅ More epochs for better learning
    warmup_ratio=0.03,
    max_steps=100,  # ✅ Increased training steps (was -1)
    learning_rate=2e-4,  # ✅ Slightly lower LR for stable training
    fp16=torch.cuda.is_available(),  
    bf16=torch.cuda.is_bf16_supported(),  
    logging_steps=10,  
    optim="adamw_8bit",  
    weight_decay=0.01,  
    lr_scheduler_type="linear",  
    seed=3407,  
    output_dir="outputs",  
    save_strategy="epoch",  
    save_total_limit=3,  # ✅ Keep last 3 checkpoints
)

# --- Fine-Tuning Trainer ---
trainer = SFTTrainer(
    model=model_lora,
    tokenizer=tokenizer,
    train_dataset=dataset,
    max_seq_length=2048,  # ✅ Increased for long-context understanding
    dataset_num_proc=1,
    args=args,
)

Unsloth: Tokenizing ["text"]:   0%|          | 0/1 [00:00<?, ? examples/s]

## Step 4 — Model training! 

In [49]:
# Start the fine-tuning process
trainer_stats = trainer.train()

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 1 | Num Epochs = 100 | Total steps = 100
O^O/ \_/ \    Batch size per device = 2 | Gradient accumulation steps = 4
\        /    Data Parallel GPUs = 1 | Total batch size (2 x 4 x 1) = 8
 "-____-"     Trainable parameters = 41,943,040/8,000,000,000 (0.52% trained)


Step,Training Loss
10,2.2767
20,0.1072
30,0.0035
40,0.0005
50,0.0001
60,0.0001
70,0.0001
80,0.0001
90,0.0001
100,0.0001


In [50]:
# Save the fine-tuned model
wandb.finish()

0,1
train/epoch,▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇██▁▂▂▂▃▃▄▄▄▅▅
train/global_step,▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇██▁▂▂▂▃▃▄▄▄▅▅
train/grad_norm,▅▅▆▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁
train/learning_rate,▂▂▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃█▇▆▆▅▄▃▃▂▁
train/loss,█▇▅▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆▁▁▁▁▁▁▁▁▁

0,1
total_flos,9273591044505600.0
train/epoch,100.0
train/global_step,100.0
train/grad_norm,0.00061
train/learning_rate,0.0
train/loss,0.0001
train_loss,0.23885
train_runtime,1522.9066
train_samples_per_second,0.525
train_steps_per_second,0.066


## Step 5 — Run model inference after fine-tuning

In [55]:
question = """what is Estranged Labour """

# Load the inference model using FastLanguageModel (Unsloth optimizes for speed)
FastLanguageModel.for_inference(model_lora)  # Unsloth has 2x faster inference!

# Tokenize the input question with a specific prompt format and move it to the GPU
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")

# Generate a response using LoRA fine-tuned model with specific parameters
outputs = model_lora.generate(
    input_ids=inputs.input_ids,          # Tokenized input IDs
    attention_mask=inputs.attention_mask, # Attention mask for padding handling
    max_new_tokens=1200,                  # Maximum length for generated response
    use_cache=True,                        # Enable cache for efficient generation
)

# Decode the generated response from tokenized format to readable text
response = tokenizer.batch_decode(outputs)

# Extract and print only the model's response part after "### Response:"
print(response[0].split("### Response:")[1])


<think>
Okay, so I need to explain Estranged Labour. I remember it's a concept from Marx, but I'm a bit fuzzy on the details. Let me try to piece this together.

First, Estranged Labour is also known as Alienated Labour. I think it's part of Marx's critique of the capitalist system. Labour here refers to work, but not just any work – work under capitalism, where the worker has no control over the production process.

Marx talked about the worker as a separate entity from the work they do. So the worker doesn't feel connected to their labour. This separation is crucial. I think it has to do with how in capitalism, the worker sells their labour as a commodity, which means they're not part of the product itself. This commodity-ification is where the estrangement comes in.

Why is this a problem? Well, because the worker can't control the means of production, they can't direct their work towards meaningful ends. They're just a part of the machinery. This lack of control means the work isn