## Crop & Fertilizer Recommendation with Fine-tuned DeepSeek Model

This notebook loads the fine-tuned `deepseek-llm-7b-base` model (`aryan6637/deepseek-crop-fertilizer-info-v3`) and performs inference to recommend crop types and fertilizers based on input soil and environmental parameters.

### 1. Installs (if not already installed)

In [3]:
!pip install unsloth

Collecting unsloth
  Downloading unsloth-2025.5.9-py3-none-any.whl.metadata (47 kB)
Collecting unsloth_zoo>=2025.5.11 (from unsloth)
  Downloading unsloth_zoo-2025.5.11-py3-none-any.whl.metadata (8.1 kB)
Collecting xformers>=0.0.27.post2 (from unsloth)
  Downloading xformers-0.0.30-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (1.0 kB)
Collecting bitsandbytes (from unsloth)
  Downloading bitsandbytes-0.46.0-py3-none-manylinux_2_24_x86_64.whl.metadata (10 kB)
Collecting tyro (from unsloth)
  Downloading tyro-0.9.24-py3-none-any.whl.metadata (11 kB)
Collecting trl!=0.15.0,!=0.9.0,!=0.9.1,!=0.9.2,!=0.9.3,>=0.7.9 (from unsloth)
  Downloading trl-0.18.1-py3-none-any.whl.metadata (11 kB)
Collecting fsspec<=2025.3.0,>=2023.1.0 (from fsspec[http]<=2025.3.0,>=2023.1.0->datasets>=3.4.1->unsloth)
  Downloading fsspec-2025.3.0-py3-none-any.whl.metadata (11 kB)
Collecting n

### 2. Imports

In [4]:
import unsloth
import torch
from unsloth import FastLanguageModel
import pickle
import os
from huggingface_hub import hf_hub_download, login

print(f"Unsloth version: {unsloth.__version__}")
print(f"PyTorch version: {torch.__version__}")

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.


2025-06-04 08:42:34.661615: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1749026554.868408      35 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1749026554.927128      35 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered


🦥 Unsloth Zoo will now patch everything to make training faster!
Unsloth version: 2025.5.9
PyTorch version: 2.7.0+cu126


### 3. Hugging Face Login (Optional)
Log in only if the model repository is private or you need authenticated access to the Hub (for example, to upload files). Public models can usually be downloaded without a token.

In [None]:
# Replace 'YOUR_HF_TOKEN_HERE' with your actual Hugging Face token if needed
# hf_token = "YOUR_HF_TOKEN_HERE" 
# login(token=hf_token)
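
As an alternative to pasting a token into the notebook, the token can be read from an environment variable. The sketch below is one possible way to do this, not part of the original notebook: `resolve_hf_token` and `login_if_token_available` are hypothetical helper names, and it assumes the token has been exported as `HF_TOKEN` beforehand.

```python
import os


def resolve_hf_token(env_var="HF_TOKEN"):
    """Return the token stored in the given environment variable, or None if unset or empty."""
    token = os.environ.get(env_var, "").strip()
    return token or None


def login_if_token_available(env_var="HF_TOKEN"):
    """Log in to the Hugging Face Hub only when a token is available.

    Returns True if login was attempted, False if no token was found.
    """
    token = resolve_hf_token(env_var)
    if token is None:
        return False
    # Imported lazily so the helper can be defined without huggingface_hub installed
    from huggingface_hub import login
    login(token=token)
    return True
```

Calling `login_if_token_available()` in place of the commented-out lines keeps the token out of the saved notebook and does nothing when no token is set.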

### 4. Model Loading Configuration

In [None]:
model_repo_id = "aryan6637/deepseek-crop-fertilizer-info-v3"
max_seq_length = 1024  # Should match the max_seq_length used during training
dtype = None  # None lets Unsloth choose the dtype automatically for the available GPU
load_in_4bit = True  # Use 4-bit quantization to save memory
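
As a rough illustration of why `load_in_4bit = True` saves memory, the sketch below estimates the storage needed for the weights alone at 16-bit and 4-bit precision. The parameter count is an approximation taken from the "7b" in the model name, `weight_memory_gb` is a hypothetical helper, and the estimate ignores activations, the KV cache, and quantization overhead.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in GB (10^9 bytes), ignoring all overhead."""
    return num_params * bits_per_param / 8 / 1e9


APPROX_PARAMS = 7e9  # assumption: "7b" in the model name, rounded

fp16_gb = weight_memory_gb(APPROX_PARAMS, 16)
int4_gb = weight_memory_gb(APPROX_PARAMS, 4)

print(f"16-bit weights: ~{fp16_gb:.1f} GB")
print(f" 4-bit weights: ~{int4_gb:.1f} GB")
```

Under these assumptions, 16-bit weights would take about 14 GB, while 4-bit weights take about 3.5 GB, which leaves far more headroom for generation on a single GPU.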

### 5. Load Model and Tokenizer
Unsloth's `FastLanguageModel.from_pretrained` downloads the model weights and automatically applies the LoRA adapter stored in the Hugging Face repository.

In [6]:
print(f"Loading model from {model_repo_id}...")
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_repo_id,
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
    # token=hf_token, # Add your token here if the model is private
)
print("Model and tokenizer loaded successfully.")

# Ensure pad token is set (Unsloth usually handles this, but good to double-check)
if tokenizer.pad_token is None:
    print("Tokenizer does not have a pad token. Setting to eos_token.")
    tokenizer.pad_token = tokenizer.eos_token
    # Update model config if necessary, though FastLanguageModel might handle this
    model.config.pad_token_id = tokenizer.pad_token_id

print(f"Pad token: {tokenizer.pad_token}, ID: {tokenizer.pad_token_id}")
print(f"EOS token: {tokenizer.eos_token}, ID: {tokenizer.eos_token_id}")

Loading model from aryan6637/deepseek-crop-fertilizer-info-v3...
==((====))==  Unsloth 2025.5.9: Fast Llama patching. Transformers: 4.51.3.
   \\   /|    Tesla T4. Num GPUs = 2. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.7.0+cu126. CUDA: 7.5. CUDA Toolkit: 12.6. Triton: 3.3.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.30. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


pytorch_model.bin.index.json:   0%|          | 0.00/22.5k [00:00<?, ?B/s]

pytorch_model-00001-of-00002.bin:   0%|          | 0.00/9.97G [00:00<?, ?B/s]

model.safetensors.index.json:   0%|          | 0.00/23.6k [00:00<?, ?B/s]

pytorch_model-00002-of-00002.bin:   0%|          | 0.00/3.85G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/121 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/792 [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/4.61M [00:00<?, ?B/s]

deepseek-ai/deepseek-llm-7b-base does not have a padding token! Will use pad_token = <|PAD_TOKEN|>.


adapter_model.safetensors:   0%|          | 0.00/150M [00:00<?, ?B/s]

Unsloth 2025.5.9 patched 30 layers with 0 QKV layers, 0 O layers and 0 MLP layers.


Model and tokenizer loaded successfully.
Pad token: <|PAD_TOKEN|>, ID: 100015
EOS token: <｜end▁of▁sentence｜>, ID: 100001


In [11]:
def format_inference_prompt(final_parameters):
    """Build the Alpaca-style prompt for one set of soil and environmental parameters."""
    # The code reads the key 'Temparature' (sic); the input dict must use exactly this spelling.
    instruction_text = (
        "Given the following soil and environmental parameters:\n"
        f"- Temperature: {final_parameters['Temparature']}°C\n"
        f"- Humidity: {final_parameters['Humidity']}%\n"
        f"- Moisture: {final_parameters['Moisture']}\n"
        f"- Soil Type: {final_parameters['Soil Type']}\n"
        f"- Nitrogen: {final_parameters['Nitrogen']} ppm\n"
        f"- Potassium: {final_parameters['Potassium']} ppm\n"
        f"- Phosphorous: {final_parameters['Phosphorous']} ppm\n\n"
        "Predict the suitable Crop Type and Fertilizer Name, and provide brief information about how they work or their characteristics."
    )
    # Alpaca format - the model was trained to generate text after "### Response:\n"
    formatted_prompt = f"### Instruction:\n{instruction_text}\n\n### Response:\n"
    return formatted_prompt
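
Because `format_inference_prompt` indexes the dictionary directly, a missing or differently spelled key (for example `Temperature` instead of `Temparature`) raises a `KeyError`. The sketch below shows one way to check the input first. `REQUIRED_KEYS`, `missing_keys`, and `sample_parameters` are hypothetical names introduced here, and the sample values mirror the example output shown later in this notebook.

```python
# Keys read by format_inference_prompt, spelled exactly as in that function
REQUIRED_KEYS = (
    "Temparature",
    "Humidity",
    "Moisture",
    "Soil Type",
    "Nitrogen",
    "Potassium",
    "Phosphorous",
)


def missing_keys(final_parameters):
    """Return the required keys that are absent from the input dict."""
    return [key for key in REQUIRED_KEYS if key not in final_parameters]


# Example input (values taken from the sample output in this notebook)
sample_parameters = {
    "Temparature": 45.0,
    "Humidity": 45.0,
    "Moisture": 23.0,
    "Soil Type": "red",
    "Nitrogen": 45,
    "Potassium": 45,
    "Phosphorous": 34,
}

absent = missing_keys(sample_parameters)
if absent:
    raise KeyError(f"Missing parameters: {absent}")
```

A dict that passes this check can be handed to `format_inference_prompt` without triggering a `KeyError`.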

### 8. Perform Inference

In [None]:
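# Move model to GPU if available
if torch.cuda.is_available():
    model.to("cuda")
    print("Model moved to GPU.")
else:
    print("CUDA not available. Running on CPU. This might be very slow.")

# Switch the model to Unsloth's optimized inference mode before generating
FastLanguageModel.for_inference(model)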

def predict(final_parameters):
    """
    Generate a crop type and fertilizer recommendation for one set of parameters.

    `final_parameters` is a dict with the keys read by format_inference_prompt.
    Returns only the newly generated text, without the prompt.
    """
    inference_prompt = format_inference_prompt(final_parameters)
    inputs = tokenizer(inference_prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=150,  # Adjust based on expected output length
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.pad_token_id if tokenizer.pad_token_id is not None else tokenizer.eos_token_id,
            do_sample=True,      # Sample instead of greedy decoding; outputs vary between runs
            temperature=0.6,     # Lower is more deterministic, higher is more random (0.1-1.0)
            top_p=0.9,           # Nucleus sampling: considers the smallest set of tokens whose cumulative probability exceeds top_p
        )

    # Decode only the tokens generated after the prompt. Slicing the decoded string by
    # len(inference_prompt) can misalign if decoding does not reproduce the prompt exactly.
    prompt_length = inputs["input_ids"].shape[1]
    response_only_text = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip()
    return response_only_text


Model moved to GPU.

Generating response...

--- Full Generated Text (with prompt) ---
### Instruction:
Given the following soil and environmental parameters:
- Temperature: 45.0°C
- Humidity: 45.0%
- Moisture: 23.0
- Soil Type: red
- Nitrogen: 45 ppm
- Potassium: 45 ppm
- Phosphorous: 34 ppm

Predict the suitable Crop Type and Fertilizer Name, and provide brief information about how they work or their characteristics.

### Response:
Recommended Crop Type: Sugarcane
Recommended Fertilizer: 20-20


--- Model's Recommendation ---
Recommended Crop Type: Sugarcane
Recommended Fertilizer: 20-20
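
If the recommendation needs to be consumed by other code, the two labelled lines in the output above can be pulled into a dictionary. The following is a minimal sketch that assumes the model keeps producing the `Recommended Crop Type:` and `Recommended Fertilizer:` labels seen in this sample; `parse_recommendation` is a hypothetical helper, and a field is set to `None` when its label is missing.

```python
import re


def parse_recommendation(response_text):
    """Extract the crop and fertilizer names from text shaped like the sample output."""
    crop = re.search(r"Recommended Crop Type:\s*(.+)", response_text)
    fertilizer = re.search(r"Recommended Fertilizer:\s*(.+)", response_text)
    return {
        "crop": crop.group(1).strip() if crop else None,
        "fertilizer": fertilizer.group(1).strip() if fertilizer else None,
    }


sample_response = "Recommended Crop Type: Sugarcane\nRecommended Fertilizer: 20-20"
print(parse_recommendation(sample_response))
# → {'crop': 'Sugarcane', 'fertilizer': '20-20'}
```

Since sampling is enabled, the model may occasionally deviate from this layout, so callers should handle `None` values.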


In [None]:
# `final_parameters` must be a dict with the keys read by format_inference_prompt
inference_prompt = format_inference_prompt(final_parameters)
# Tokenize the prompt and place the tensors on the same device as the model
inputs = tokenizer(inference_prompt, return_tensors="pt").to(model.device)

# Generate response
print("\nGenerating response...")
with torch.no_grad():  # Ensure no gradients are calculated during inference
    outputs = model.generate(
        **inputs,
        max_new_tokens=150,  # Adjust based on expected output length (crop name + fertilizer name + brief info)
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id if tokenizer.pad_token_id is not None else tokenizer.eos_token_id,
        do_sample=True,      # Sample instead of greedy decoding; outputs vary between runs
        temperature=0.6,     # Lower is more deterministic, higher is more random (0.1-1.0)
        top_p=0.9,           # Nucleus sampling: considers the smallest set of tokens whose cumulative probability exceeds top_p
        # num_beams=1,       # Use 1 for greedy/sampling, >1 for beam search (slower but can be better)
    )

# Decode and print the full generated text (including prompt)
full_response_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("\n--- Full Generated Text (with prompt) ---")
print(full_response_text)

# Extract only the newly generated part by slicing off the prompt tokens.
# Slicing the decoded string by len(inference_prompt) can misalign if decoding
# does not reproduce the prompt exactly.
prompt_length = inputs["input_ids"].shape[1]
response_only_text = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip()
print("\n--- Model's Recommendation ---")
print(response_only_text)
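
The effect of the `temperature` and `top_p` arguments used above can be illustrated with a small pure-Python sketch. It is an illustration of the idea only, not the implementation used by `model.generate`; the function names and the toy logits are made up for this example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative probability
    reaches top_p, then renormalize. Returns {token_index: probability}."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for index in order:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


toy_logits = [2.0, 1.0, 0.1, -1.0]  # hypothetical scores for four candidate tokens

probs_default = softmax_with_temperature(toy_logits, 1.0)
probs_sharp = softmax_with_temperature(toy_logits, 0.6)
nucleus = top_p_filter(probs_sharp, 0.9)

print("temperature=1.0:", [round(p, 3) for p in probs_default])
print("temperature=0.6:", [round(p, 3) for p in probs_sharp])
print("tokens kept with top_p=0.9:", sorted(nucleus))
```

With a temperature below 1.0 the distribution concentrates on the highest-scoring token, and the nucleus filter then discards the unlikely tail before a token is sampled.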

### 9. Interactive Inference Loop (Optional)
You can uncomment and run the cell below to interactively provide input and get recommendations.